Current state: ["Under Discussion"]
...
- both radius_low_bound and radius_high_bound MUST be set
- a distance falls in the range scope when:
Metric type | Behavior |
---|---|
IP | radius_low_bound < distance <= radius_high_bound |
L2 and other metric types | radius_low_bound <= distance < radius_high_bound |
Motivation(required)
Many users have requested "range search" functionality. With this capability, users can
...
In Knowhere, two new parameters, "radius_low_bound" and "radius_high_bound", are passed in via config. Range search returns all results, unsorted, whose distance falls in this range scope:
Metric type | Behavior |
---|---|
IP | radius_low_bound < distance <= radius_high_bound |
L2 or other metric types | radius_low_bound <= distance < radius_high_bound |
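
The boundary rules in the table can be sketched as a small predicate. This is only an illustration: `InRangeScope` is a hypothetical helper, not a Knowhere function, and identifying the metric type by a plain string such as "IP" or "L2" is an assumption made for the example.

```cpp
#include <string>

// Hypothetical helper (not part of Knowhere): returns true when `distance`
// falls in the range scope described by the table above.
// Assumption: the metric type is passed as a plain string such as "IP" or "L2".
inline bool
InRangeScope(const std::string& metric_type, float distance,
             float radius_low_bound, float radius_high_bound) {
    if (metric_type == "IP") {
        // IP: low bound is exclusive, high bound is inclusive.
        return radius_low_bound < distance && distance <= radius_high_bound;
    }
    // L2 and other metric types: low bound is inclusive, high bound is exclusive.
    return radius_low_bound <= distance && distance < radius_high_bound;
}
```

In both cases the bound on the "closest" side is inclusive: for IP a larger value means a closer match, so the high bound is included, while for L2 a smaller value means a closer match, so the low bound is included.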
We add three new APIs, CheckRangeSearch(), QueryByRange() and BruteForce::RangeSearch(), to support range search.
...
This API returns all unsorted results with distance falling in the specified range scope.
PROTO | virtual DatasetPtr |
INPUT | Dataset { Config { knowhere::meta::RADIUS_LOW_BOUND: - knowhere::meta::RADIUS_HIGH_BOUND: - } |
OUTPUT | Dataset { |
...
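This API performs range search on a dataset that has no index. It returns all unsorted results with distance falling in the specified range scope (for IP: radius_low_bound < distance <= radius_high_bound; for others: radius_low_bound <= distance < radius_high_bound).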
PROTO | static DatasetPtr |
INPUT | Dataset { Dataset { Config { knowhere::meta::RADIUS_LOW_BOUND: - knowhere::meta::RADIUS_HIGH_BOUND: - } |
OUTPUT | Dataset { |
...
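
As a rough illustration of range search over a dataset with no index, the sketch below scans every base vector for every query and keeps the matches that satisfy the L2 rule (radius_low_bound <= distance < radius_high_bound). Everything named here is hypothetical: `RangeSearchResult`, `L2Sqr` and `BruteForceRangeSearchL2` are not Knowhere APIs, the flat `lims`/`ids`/`distances` layout is an assumed way to hold a different number of results per query, and comparing the bounds against squared L2 distance is also an assumption. The sketch expects `dim > 0` and row-major vectors.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result layout (an assumption, not Knowhere's Dataset):
// results for query i occupy [lims[i], lims[i + 1]) in ids and distances.
struct RangeSearchResult {
    std::vector<size_t> lims;
    std::vector<int64_t> ids;
    std::vector<float> distances;
};

// Squared L2 distance between two dim-dimensional vectors.
inline float
L2Sqr(const float* x, const float* y, size_t dim) {
    float sum = 0.0f;
    for (size_t d = 0; d < dim; ++d) {
        const float diff = x[d] - y[d];
        sum += diff * diff;
    }
    return sum;
}

// Exhaustive scan: for each query, collect every base vector whose distance
// satisfies radius_low_bound <= distance < radius_high_bound.
// Results are left in scan order, i.e. not sorted by distance.
inline RangeSearchResult
BruteForceRangeSearchL2(const std::vector<float>& base,
                        const std::vector<float>& queries,
                        size_t dim,
                        float radius_low_bound,
                        float radius_high_bound) {
    const size_t nb = base.size() / dim;
    const size_t nq = queries.size() / dim;
    RangeSearchResult result;
    result.lims.push_back(0);
    for (size_t q = 0; q < nq; ++q) {
        for (size_t b = 0; b < nb; ++b) {
            const float dist = L2Sqr(&queries[q * dim], &base[b * dim], dim);
            if (radius_low_bound <= dist && dist < radius_high_bound) {
                result.ids.push_back(static_cast<int64_t>(b));
                result.distances.push_back(dist);
            }
        }
        // Close the slice for query q; its length differs from query to query.
        result.lims.push_back(result.ids.size());
    }
    return result;
}
```

Because the number of matches depends on the radius and on the data, the per-query slices have different lengths, which is why the sketch records offsets instead of a fixed top-k width.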
- test/milvus/ann_hdf5/binary/sift-4096-hamming-range.hdf5
  - base dataset and query dataset are identical with sift1m
  - unified radius: radius_low_bound = 0.0, radius_high_bound = 291.0
  - result length for each nq is different
  - total result num 1,063,078
- test/milvus/ann_hdf5/sift-128-euclidean-range.hdf5
  - base dataset and query dataset are identical with sift1m
  - unified radius: radius_low_bound = 0.0, radius_high_bound = 186.0
  - result length for each nq is different
  - total result num 1,054,377
- test/milvus/ann_hdf5/sift-128-euclidean-range-multi.hdf5
  - base dataset and query dataset are identical with sift1m
  - ground truth IDs and Distances are identical with sift1m
  - each nq's radius_low_bound is 0.0, radius_high_bound is set to the last ground truth distance
  - result length for each nq is 100
  - total result num 1,000,000
- test/milvus/ann_hdf5/glove-200-angular-range.hdf5
  - base dataset and query dataset are identical with glove200
  - unified radius: radius_low_bound = 0.52, radius_high_bound = 1.01
  - result length for each nq is different
  - total result num 1,060,888
...