Current state: ["Under Discussion"]

...

This API performs a range search on an index. It returns all results, unsorted, whose distance is "better than radius" (for IP: > radius; for other metrics: < radius).

PROTO
virtual DatasetPtr
QueryByRange(const DatasetPtr& dataset, const Config& config, const faiss::BitsetView bitset)

INPUT

Dataset {
    knowhere::meta::TENSOR: -   // query data
    knowhere::meta::ROWS: -      // rows of queries
    knowhere::meta::DIM: -          // dimension
}

Config {
    knowhere::meta::RADIUS: -   // radius for range search
}

OUTPUT

Dataset {
    knowhere::meta::IDS: -                // result IDs with length LIMS[nq]
    knowhere::meta::DISTANCE: -  // result DISTANCES with length LIMS[nq]
    knowhere::meta::LIMS: -            // result offset prefix sum with length nq + 1
}
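
The OUTPUT layout above can be read back as follows. This is a minimal sketch, not Knowhere code: `RangeSearchResult` and `HitsForQuery` are hypothetical names, and the element types (`int64_t` IDs, `float` distances, `size_t` offsets) are assumptions. It only illustrates that the hits of query `q` occupy the half-open slice `[LIMS[q], LIMS[q + 1])` of the flat IDS and DISTANCE arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical flat copy of the OUTPUT dataset; not a Knowhere type.
struct RangeSearchResult {
    std::vector<int64_t> ids;       // length lims[nq]
    std::vector<float> distances;   // length lims[nq]
    std::vector<std::size_t> lims;  // length nq + 1, prefix sum starting at 0
};

// Returns the (id, distance) pairs that belong to query `q`.
std::vector<std::pair<int64_t, float>>
HitsForQuery(const RangeSearchResult& res, std::size_t q) {
    assert(q + 1 < res.lims.size());
    assert(res.ids.size() == res.lims.back());
    assert(res.distances.size() == res.lims.back());

    std::vector<std::pair<int64_t, float>> hits;
    for (std::size_t i = res.lims[q]; i < res.lims[q + 1]; ++i) {
        hits.emplace_back(res.ids[i], res.distances[i]);
    }
    return hits;
}
```

A query with no hits simply has `LIMS[q] == LIMS[q + 1]`, so its slice is empty.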

...

This API performs a range search on a dataset that has no index. It returns all results, unsorted, whose distance is "better than radius" (for IP: > radius; for other metrics: < radius).

PROTO
static DatasetPtr
RangeSearch(const DatasetPtr base_dataset,
            const DatasetPtr query_dataset,
            const Config& config,
            const faiss::BitsetView bitset);

INPUT

Dataset {
    knowhere::meta::TENSOR: -   // base data
    knowhere::meta::ROWS: -      // rows of base data
    knowhere::meta::DIM: -          // dimension
}

Dataset {
    knowhere::meta::TENSOR: -   // query data
    knowhere::meta::ROWS: -      // rows of queries
    knowhere::meta::DIM: -          // dimension
}

Config {
    knowhere::meta::RADIUS: -   // radius for range search
}

OUTPUT

Dataset {
    knowhere::meta::IDS: -                // result IDs with length LIMS[nq]
    knowhere::meta::DISTANCE: -  // result DISTANCES with length LIMS[nq]
    knowhere::meta::LIMS: -            // result offset prefix sum with length nq + 1
}
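
The semantics described above can be sketched as a brute-force scan. This is an illustration under stated assumptions, not the proposed implementation: `Metric`, `FlatRangeResult` and `BruteForceRangeSearch` are hypothetical names, the L2 case compares the squared Euclidean distance against the radius, and the bitset filter is omitted. It shows the "better than radius" rule (IP: > radius; otherwise: < radius) and how the LIMS prefix sum is filled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Metric { L2, IP };  // illustrative only, not a Knowhere type

// Hypothetical flat result mirroring IDS / DISTANCE / LIMS.
struct FlatRangeResult {
    std::vector<int64_t> ids;
    std::vector<float> distances;
    std::vector<std::size_t> lims;  // length nq + 1
};

// `base` and `queries` are row-major arrays of `dim`-dimensional vectors.
FlatRangeResult BruteForceRangeSearch(const std::vector<float>& base,
                                      const std::vector<float>& queries,
                                      std::size_t dim, float radius,
                                      Metric metric) {
    assert(dim > 0 && base.size() % dim == 0 && queries.size() % dim == 0);
    const std::size_t nb = base.size() / dim;
    const std::size_t nq = queries.size() / dim;

    FlatRangeResult res;
    res.lims.assign(nq + 1, 0);
    for (std::size_t q = 0; q < nq; ++q) {
        for (std::size_t b = 0; b < nb; ++b) {
            float score = 0.0f;
            for (std::size_t d = 0; d < dim; ++d) {
                const float x = queries[q * dim + d];
                const float y = base[b * dim + d];
                score += (metric == Metric::IP) ? x * y : (x - y) * (x - y);
            }
            // "Better than radius": larger for IP, smaller otherwise.
            const bool hit = (metric == Metric::IP) ? score > radius
                                                    : score < radius;
            if (hit) {
                res.ids.push_back(static_cast<int64_t>(b));
                res.distances.push_back(score);
            }
        }
        res.lims[q + 1] = res.ids.size();  // running total of hits so far
    }
    return res;
}
```

Results within one query come out in base order here, which is consistent with the API making no ordering guarantee.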

...

Compatibility, Deprecation, and Migration Plan(optional)

This is new functionality, so there is no compatibility issue.

...

  1. Add new unittest
  2. Add benchmark to test range search runtime and recall
  3. Add benchmark to test range search QPS
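
One way the recall measurement could be computed is sketched below. The proposal does not define a recall formula, so this is an assumption: recall is taken as the number of ground-truth hits that also appear in the result, summed over all queries, divided by the total number of ground-truth hits. `RangeSearchRecall` is a hypothetical helper, and both result and ground truth are assumed to use the IDS/LIMS layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Micro-averaged recall of a range search result against ground truth.
double RangeSearchRecall(const std::vector<int64_t>& ids,
                         const std::vector<std::size_t>& lims,
                         const std::vector<int64_t>& gt_ids,
                         const std::vector<std::size_t>& gt_lims) {
    assert(!lims.empty() && lims.size() == gt_lims.size());
    const std::size_t nq = lims.size() - 1;

    std::size_t found = 0;
    for (std::size_t q = 0; q < nq; ++q) {
        // Ground-truth IDs of query q.
        std::unordered_set<int64_t> truth(
            gt_ids.begin() + static_cast<std::ptrdiff_t>(gt_lims[q]),
            gt_ids.begin() + static_cast<std::ptrdiff_t>(gt_lims[q + 1]));
        for (std::size_t i = lims[q]; i < lims[q + 1]; ++i) {
            // erase() returns 1 only the first time an ID matches,
            // so duplicate result IDs are not counted twice.
            found += truth.erase(ids[i]);
        }
    }
    const std::size_t total = gt_lims[nq];
    return total == 0 ? 1.0
                      : static_cast<double>(found) / static_cast<double>(total);
}
```

Because result lengths differ per query in most of the datasets, a micro-average weights queries with many hits more heavily; a per-query average is an equally reasonable choice.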

There is no public dataset for range search, so I have created range search datasets based on sift1M and glove200.

You can find them on the NAS:

  1. test/milvus/ann_hdf5/binary/sift-4096-hamming-range.hdf5
    1. base dataset and query dataset are identical with sift1m
    2. unified radius = 
  2. test/milvus/ann_hdf5/sift-128-euclidean-range.hdf5
    1. base dataset and query dataset are identical with sift1m
    2. unified radius = 186.0
    3. result length for each nq is different
  3. test/milvus/ann_hdf5/sift-128-euclidean-range-multi.hdf5
    1. base dataset and query dataset are identical with sift1m
    2. each nq has different radius setting
    3. result length for each nq is 100
  4. test/milvus/ann_hdf5/glove-200-angular-range.hdf5
    1. base dataset and query dataset are identical with glove200
    2. unified radius = 
    3. result length for each nq is different


Segcore

  1. Add new range search unittest

...