...
// range search parameter legality check
virtual bool CheckRangeSearch(Config& cfg, const IndexType type, const IndexMode mode);
// range search
virtual DatasetPtr QueryByRange(const DatasetPtr& dataset, const Config& config, const faiss::BitsetView bitset);
// brute force range search
static DatasetPtr BruteForce::RangeSearch(const DatasetPtr base_dataset, const DatasetPtr query_dataset, const Config& config, const faiss::BitsetView bitset);
Design Details(required)
...
Knowhere
Range search related: we add 3 new APIs, CheckRangeSearch(), QueryByRange() and BruteForce::RangeSearch(), to support range search. These APIs have been available since knowhere-v1.3.0.
- QueryByRange()
This API is used to get all unsorted results with distance "better than radius" (for L2: < radius; for IP: > radius).
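The "better than radius" rule can be sketched as a small predicate. This is only an illustration of the comparison stated above: the `Metric` enum and the `BetterThanRadius` helper are hypothetical names and are not part of the Knowhere API.

```cpp
// Hypothetical metric tag for illustration; not the Knowhere metric type.
enum class Metric { L2, IP };

// Returns true when `distance` is "better than radius" for the given metric:
// strictly smaller for L2, strictly larger for IP.
inline bool BetterThanRadius(Metric metric, float distance, float radius) {
    return metric == Metric::L2 ? distance < radius : distance > radius;
}
```

Both comparisons are strict, so under this reading a result whose distance equals the radius is not returned.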
PROTO | DatasetPtr QueryByRange(const DatasetPtr&, const Config&, const faiss::BitsetView); |
INPUT | Dataset { Config { knowhere::meta::RADIUS: - // radius for range search } |
OUTPUT | Dataset { |
LIMS has length "nq+1"; it is the offset prefix sum for the result IDS and result DISTANCE. IDS and DISTANCE always have the same length, but that length varies with the number of results.
Suppose the nq query vectors are labeled: {0, 1, 2, ..., nq-1}
The result counts for the query vectors are: {r(0), r(1), r(2), ..., r(nq-1)}
Then the data in LIMS will be like this: {0, r(0), r(0)+r(1), r(0)+r(1)+r(2), ..., r(0)+r(1)+r(2)+...+r(nq-1)}
The total number of range search results is: LIMS[nq]
The range search result for query vector i is: IDS[LIMS[i], LIMS[i+1]) and DISTANCE[LIMS[i], LIMS[i+1])
The memory used for IDS, DISTANCE and LIMS is allocated in Knowhere and is freed automatically when the Dataset is destructed.
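The LIMS layout above can be sketched as follows. This is a minimal illustration under stated assumptions, not Knowhere code: the `RangeResult` struct, the `BuildLims` and `ResultsForQuery` helpers, and the element types (`size_t` offsets, `int64_t` ids, `float` distances) are all hypothetical, and the real Dataset stores these arrays behind its own accessors.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical container mirroring the LIMS / IDS / DISTANCE layout.
struct RangeResult {
    std::vector<size_t> lims;      // length nq + 1, prefix sum of per-query counts
    std::vector<int64_t> ids;      // length lims[nq]
    std::vector<float> distances;  // length lims[nq]
};

// Build LIMS from the per-query result counts {r(0), ..., r(nq-1)}.
std::vector<size_t> BuildLims(const std::vector<size_t>& counts) {
    std::vector<size_t> lims(counts.size() + 1, 0);
    for (size_t i = 0; i < counts.size(); ++i) {
        lims[i + 1] = lims[i] + counts[i];
    }
    return lims;
}

// Collect the (id, distance) pairs of query i, i.e. the half-open
// slice [lims[i], lims[i+1]) of IDS and DISTANCE.
std::vector<std::pair<int64_t, float>> ResultsForQuery(const RangeResult& res, size_t i) {
    std::vector<std::pair<int64_t, float>> out;
    for (size_t k = res.lims[i]; k < res.lims[i + 1]; ++k) {
        out.emplace_back(res.ids[k], res.distances[k]);
    }
    return out;
}
```

For example, per-query counts {2, 0, 3} give LIMS = {0, 2, 2, 5}: query 0 owns entries [0, 2), query 1 has no results, and query 2 owns entries [2, 5). A query with no hits is represented by two equal consecutive offsets, so no sentinel value is needed.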
By now, the following index types support QueryByRange():
BinaryIDMAP
BinaryIVF
IDMAP
IVF_FLAT
IVF_PQ
IVF_SQ8
HNSW
Milvus
Range search completely reuses the call stack from SDK to segcore.
...