Unsupervised Similarity Learning from Textual Data (2012) (About) > Two main components of the model are a semantic interpreter of texts and a similarity function whose properties are derived from data. The first associates particular documents with concepts defined in a knowledge base corresponding to the topics covered by the corpus. It shifts the representation of the meaning of the texts from words, which can be ambiguous, to concepts with predefined semantics. With this new representation, the similarity function is derived from data using a modification of the dynamic rule-based similarity model, which is adjusted to the unsupervised case.
By same author: [Interactive Document Indexing Method Based on Explicit Semantic Analysis](https://link.springer.com/chapter/10.1007/978-3-642-32115-3_18)
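
The two-stage pipeline described in the abstract (interpret texts as concepts, then compare them in concept space) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `KNOWLEDGE_BASE`, the `interpret` and `cosine` helpers, and the example sentences are hypothetical and do not come from the paper. In particular, plain cosine similarity stands in for the paper's modified dynamic rule-based similarity model, which is not reproduced here.

```python
import math
from collections import Counter

# Hypothetical toy knowledge base: concept -> indicative terms.
# Note that "bank" is ambiguous and appears under both concepts.
KNOWLEDGE_BASE = {
    "finance": {"bank", "loan", "interest", "money"},
    "geography": {"river", "bank", "water", "shore"},
}


def interpret(text):
    """Toy semantic interpreter: map a text to concept weights
    by counting how many of its tokens belong to each concept."""
    counts = Counter(text.lower().split())
    return {
        concept: float(sum(counts[term] for term in terms))
        for concept, terms in KNOWLEDGE_BASE.items()
    }


def cosine(u, v):
    """Cosine similarity between two concept vectors with the same keys.
    This is a stand-in, not the rule-based model from the paper."""
    dot = sum(u[c] * v[c] for c in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


doc_a = "the bank raised the interest on my loan"
doc_b = "money borrowed as a loan accrues interest"
doc_c = "the river bank was covered with water"

vec_a, vec_b, vec_c = interpret(doc_a), interpret(doc_b), interpret(doc_c)

# The two finance texts end up closer in concept space than the
# finance text and the river text, even though both pairs share words.
print(cosine(vec_a, vec_b), cosine(vec_a, vec_c))
```

In this sketch the ambiguous word "bank" contributes to both concepts, and the surrounding terms decide which concept dominates the vector. That is the intuition behind moving from a word-level to a concept-level representation, though a real interpreter would use weighted associations learned from a much larger knowledge base.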