About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Arturs Backurs
- sl:arxiv_num : 1910.04126
- sl:arxiv_published : 2019-10-09T17:12:41Z
- sl:arxiv_summary : The Optimal Transport (a.k.a. Wasserstein) distance is an increasingly
popular similarity measure for rich data domains, such as images or text
documents. This raises the necessity for fast nearest neighbor search with
respect to this distance, a problem that poses a substantial computational
bottleneck for various tasks on massive datasets.
In this work, we study fast tree-based approximation algorithms for searching
nearest neighbors w.r.t. the Wasserstein-1 distance. A standard tree-based
technique, known as Quadtree, has been previously shown to obtain good results.
We introduce a variant of this algorithm, called Flowtree, and formally prove
it achieves asymptotically better accuracy. Our extensive experiments, on
real-world text and image datasets, show that Flowtree improves over various
baselines and existing methods in either running time or accuracy. In
particular, its quality of approximation is in line with previous high-accuracy
methods, while its running time is much faster.@en
- sl:arxiv_title : Scalable Nearest Neighbor Search for Optimal Transport@en
- sl:arxiv_updated : 2020-02-14T14:54:37Z
- sl:bookmarkOf : https://arxiv.org/abs/1910.04126
- sl:creationDate : 2020-02-20
- sl:creationTime : 2020-02-20T09:11:40Z
Documents with similar tags (experimental)