Ranking Measures and Loss Functions in Learning to Rank (2009)(About) > While most learning-to-rank methods learn the ranking function by minimizing the loss functions, it is the ranking measures (such as NDCG and MAP) that are used to evaluate the performance of the learned ranking function. In this work, we reveal the relationship between ranking measures and loss functions in learning-to-rank methods, such as Ranking SVM, RankBoost, RankNet, and ListMLE.
> we have proved that many pairwise/listwise losses in learning to rank are actually upper bounds of measure-based ranking errors. As a result, the minimization of these loss functions will lead to the maximization of the ranking measures. The key to obtaining this result is to model ranking as a sequence of classification tasks, and define a so-called essential loss as the weighted sum of the classification errors of individual tasks in the sequence.
> We have also shown a way to improve existing methods by introducing appropriate weights to their loss functions.
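A hedged sketch of the essential-loss construction (my notation, paraphrasing the paper's idea): ranking n documents is modeled as n−1 sequential classification steps, where step t must pick the ground-truth document for position t from the documents not yet placed, and each step's 0-1 error gets a position-dependent weight.

```latex
% Illustrative form of the essential loss (notation mine; the weights w_t are
% chosen so that 1 - NDCG and 1 - MAP can be bounded by this quantity):
%   y(t) = document placed at position t by the ground truth
%   D_t  = {y(t), y(t+1), ..., y(n)}, the documents not yet ranked at step t
%   f    = the learned scoring function
\[
  L_{\mathrm{ess}}(f; x, y) \;=\; \sum_{t=1}^{n-1} w_t \,
    \mathbf{1}\!\left[\, y(t) \neq \arg\max_{d \in D_t} f(x_d) \,\right]
\]
```

As I read the abstract, the bounds then chain: measure-based ranking error ≤ essential loss ≤ pairwise/listwise surrogate loss, which is why minimizing the surrogate losses maximizes the ranking measures.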
[1810.00438] Zero-training Sentence Embedding via Orthogonal Basis (2018)(About) A **training-free approach for building sentence representations**, "Geometric Embedding" (GEM), based on the **geometric structure** of the word embedding space.
> we build an orthogonal basis of the subspace spanned by a word and its surrounding context in a sentence. **We model the semantic meaning of a word in a sentence** based on two aspects. One is its relatedness to the word vector subspace already spanned by its contextual words. The other is the word’s novel semantic meaning which shall be introduced as a new basis vector perpendicular to this existing subspace
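A minimal numeric sketch of the decomposition described above (my toy illustration using a QR factorization, not the paper's full GEM algorithm, which further re-weights these components to build the sentence vector):

```python
import numpy as np

def novelty_decomposition(word_vec, context_vecs):
    """Illustrative sketch: split a word vector into the part lying in the
    subspace spanned by its context words and the novel part orthogonal to it."""
    # Orthonormal basis of the context subspace via (reduced) QR decomposition.
    Q, _ = np.linalg.qr(np.stack(context_vecs, axis=1))  # d x k, orthonormal columns
    in_span = Q @ (Q.T @ word_vec)   # projection onto the context subspace
    novel = word_vec - in_span       # new basis direction, perpendicular to that subspace
    return in_span, novel

# Toy usage with random 50-d vectors standing in for word embeddings.
rng = np.random.default_rng(0)
w = rng.normal(size=50)
ctx = [rng.normal(size=50) for _ in range(5)]
span_part, novel_part = novelty_decomposition(w, ctx)
print(np.allclose(span_part + novel_part, w))   # True: the two parts recompose the word vector
print(abs(novel_part @ span_part) < 1e-8)       # True: the parts are orthogonal
```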
[OpenReview](/doc/?uri=https%3A%2F%2Fopenreview.net%2Fforum%3Fid%3DrJedbn0ctQ) ; [Related to this paper](/doc/?uri=https%3A%2F%2Farxiv.org%2Fabs%2F1704.05358)
Knowledge Graph and Text Jointly Embedding (2014)(About) method of jointly embedding knowledge graphs and a text corpus so that entities and words/phrases are represented in the same vector space.
Promising improvement in the accuracy of predicting facts, compared to separately embedding knowledge graphs and text (in particular, it enables the prediction of facts involving entities that are not in the knowledge graph)
[cited by J. Moreno](/doc/?uri=https%3A%2F%2Fhal.archives-ouvertes.fr%2Fhal-01626196%2Fdocument)
Enriching Word Embeddings Using Knowledge Graph for Semantic Tagging in Conversational Dialog Systems - Microsoft Research (2015)(About) > new simple, yet effective approaches to learn domain specific word embeddings.
> Adapting word embeddings, such as jointly capturing syntactic and semantic information, can further enrich semantic word representations for several tasks, e.g., sentiment analysis (Tang et al. 2014), named entity recognition (Lebret, Legrand, and Collobert 2013), entity-relation extraction (Weston et al. 2013), etc. (Yu and Dredze 2014) introduced a lightly supervised word embedding learning method extending word2vec: they incorporate prior information into the objective function as a regularization term based on synonymy relations between words from WordNet (Fellbaum 1999).
> In this work, we go one step further and investigate if enriching the word2vec word embeddings trained on unstructured/unlabeled text with domain specific semantic relations obtained from knowledge sources (e.g., knowledge graphs, search query logs, etc.) can help to discover relation-aware word embeddings. Unlike earlier work, **we encode the information about the relations between phrases, thereby, entities and relation mentions are all embedded into a low-dimensional space**.
## Related work (Learning Word Embeddings with Priors)
- Relation Constrained Model (RCM) (Yu and Dredze 2014): while CBOW learns lexical word embeddings from the provided text, the RCM learns embeddings of words based on their similarity to other words provided by a knowledge resource (e.g., WordNet)
- Joint model (Yu and Dredze 2014): combines the CBOW and RCM objectives through a linear combination (see the sketch below)
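A hedged sketch of how such a combined objective is commonly written (following my recollection of the Yu and Dredze formulation; the exact form of the relational term varies):

```latex
% Joint objective combining a distributional (CBOW) term and a relational (RCM) term.
% C is a hyperparameter weighting the prior knowledge; R is the set of related
% word pairs taken from the knowledge resource (e.g., WordNet synonyms).
\[
  J(\theta) \;=\; \underbrace{J_{\mathrm{CBOW}}(\theta)}_{\text{text corpus}}
  \;+\; C \cdot \underbrace{\sum_{(w,\, w') \in R} \log p(w \mid w'; \theta)}_{\text{knowledge-based regularizer}}
\]
```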
Vectorland: Brief Notes from Using Text Embeddings for Search(About) > the elegance is in the learning model, but the magic is in the structure of the information we model
> The source-target training pairs dictate **what notion of "relatedness"** will be modeled in the embedding space
> is Eminem more similar to Rihanna or rap?
A Ranking Approach to Keyphrase Extraction - Microsoft Research (2009)(About) Previously, automatic keyphrase extraction was formalized as classification, and learning methods for classification were utilized. This paper points out that it is more essential to cast the keyphrase extraction problem as ranking and employ a learning-to-rank method to perform the task. As an example, it employs Ranking SVM, a state-of-the-art learning-to-rank method, for keyphrase extraction.
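For reference, a standard way of writing the Ranking SVM objective over candidate keyphrases (the generic pairwise formulation, not this paper's specific features): training pairs consist of a phrase that should be ranked above another for the same document.

```latex
% Pairwise hinge-loss formulation of Ranking SVM (generic form):
%   x_i, x_j = feature vectors of two candidate phrases for the same document,
%   where i is labeled as a keyphrase and j is not (so i should rank above j).
\[
  \min_{w} \;\; \tfrac{1}{2}\,\lVert w \rVert^2
  \;+\; C \sum_{(i,j)\,:\, i \succ j} \max\bigl(0,\; 1 - w^\top (x_i - x_j)\bigr)
\]
```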
Challenges of the email domain for text classification(About) JD Brutlag, C Meek - ICML, 2000 - research.microsoft.com
Interactive classification of email into a user-defined hierarchy of folders is a natural domain for application of text classification methods. This domain presents several challenges. First, the user's changing mail-filing habits mandate classification technology ...