corpus-texmex.irisa.fr
Evaluation of Approximate nearest neighbors: large datasets
Datasets for approximate nearest neighbor search. This page provides several evaluation sets for measuring the quality of approximate nearest neighbor search algorithms on different kinds of data and varying database sizes. In particular, we provide a very large set of 1 billion vectors. To our knowledge, this is the largest set provided to evaluate ANN methods. Each evaluation set comprises 3 subsets of vectors:
• base vectors: the vectors in which the search is performed
• query vectors
4 bytes for .fvecs.
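The excerpt above breaks off while describing the .fvecs file format. As a minimal sketch, assuming the commonly used .fvecs layout (each vector stored as a 4-byte little-endian integer giving its dimension, followed by that many 4-byte floats), a reader could look like the following. The names read_fvecs and write_fvecs are hypothetical helpers chosen for illustration, not part of any tool distributed with the datasets.

```python
import struct


def read_fvecs(path):
    """Read every vector from an .fvecs file into a list of float lists.

    Assumed layout per vector: int32 dimension d, then d float32 values,
    all little-endian.
    """
    vectors = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated dimension header")
            (dim,) = struct.unpack("<i", header)
            if dim <= 0:
                raise ValueError(f"invalid dimension: {dim}")
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated vector payload")
            vectors.append(list(struct.unpack(f"<{dim}f", payload)))
    return vectors


def write_fvecs(path, vectors):
    """Write vectors in the same assumed layout (handy for small tests)."""
    with open(path, "wb") as f:
        for vec in vectors:
            f.write(struct.pack("<i", len(vec)))
            f.write(struct.pack(f"<{len(vec)}f", *vec))
```

This version reads one vector at a time with the standard library only, which keeps it simple but slow. For the billion-vector set, a memory-mapped or chunked reader would likely be a better fit.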
http://corpus-texmex.irisa.fr/