Current state: ["Under Discussion"]
ISSUE: #17754
PRs:
Keywords: IDMAP, BinaryIDMAP, brute force search, chunk
Released: Milvus-v2.2.0
In this MEP, we propose an enhancement that lets the Knowhere index types IDMAP and BinaryIDMAP reference external vector data instead of copying the vector data into the index.
The enhanced IDMAP/BinaryIDMAP can be used for growing segment search, which improves code reuse and reduces code maintenance effort.
Generally, no one creates an IDMAP/BinaryIDMAP index for a sealed segment, because it brings no search performance improvement yet consumes the same amount of memory and disk space as the raw vectors.
The only reasonable use case for IDMAP/BinaryIDMAP is a growing segment. But creating a vector index is a resource-intensive operation because it involves all Milvus nodes: the index file is created by the index node, saved by the data node, and loaded by the query node, while proxy, rootcoord, indexcoord, datacoord, and querycoord coordinate all these operations.
So Milvus currently uses the following 2 strategies for growing segment search:
The advantage of this solution is resource saving: apart from the query node, no other nodes are involved. The shortcoming is code duplication. See the following "Search Flow" chart: `FloatSearchBruteForce` and `BinarySearchBruteForce` are copied from knowhere::IDMAP and knowhere::BinaryIDMAP. The more code is duplicated, the more effort code maintenance takes. And whenever a new feature such as range search is implemented for IDMAP/BinaryIDMAP in Knowhere, the same work must be redone in Milvus segcore.
This is why the enhanced IDMAP/BinaryIDMAP is proposed. With the enhanced IDMAP/BinaryIDMAP, vector data is not actually added to the index; the index only stores a pointer to external vector data. The caller must guarantee that this memory is contiguous and safe to access.
In this way:
Knowhere needs to add a new interface, `AddExWithoutIds()`, for IDMAP and BinaryIDMAP:
// Set an external data pointer instead of actually adding data.
void AddExWithoutIds(const DatasetPtr&, const Config&);
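To make the difference between copying vectors and borrowing them concrete, here is a minimal sketch. It is not Knowhere's implementation: the class `FlatIndexSketch` and its methods `AddCopy`, `SetExternal`, `SearchNearest`, and `OwnsData` are hypothetical names chosen for illustration, and the search is a plain squared-L2 scan that returns only the single nearest row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for a flat (IDMAP-style) index; not Knowhere's real class.
class FlatIndexSketch {
 public:
    explicit FlatIndexSketch(std::size_t dim) : dim_(dim) {}

    // Conventional add: copies the vectors into index-owned storage.
    void AddCopy(const float* data, std::size_t rows) {
        owned_.assign(data, data + rows * dim_);
        data_ = owned_.data();
        rows_ = rows;
    }

    // External add: stores only the pointer, no copy is made. The caller must
    // keep the buffer contiguous and valid for as long as the index is used.
    void SetExternal(const float* data, std::size_t rows) {
        assert(data != nullptr || rows == 0);
        owned_.clear();
        data_ = data;
        rows_ = rows;
    }

    // Brute-force scan by squared L2 distance.
    // Returns the row offset of the nearest vector, or -1 if the index is empty.
    std::int64_t SearchNearest(const float* query) const {
        std::int64_t best = -1;
        float best_dist = std::numeric_limits<float>::max();
        for (std::size_t i = 0; i < rows_; ++i) {
            float dist = 0.0f;
            for (std::size_t d = 0; d < dim_; ++d) {
                const float diff = data_[i * dim_ + d] - query[d];
                dist += diff * diff;
            }
            if (dist < best_dist) {
                best_dist = dist;
                best = static_cast<std::int64_t>(i);
            }
        }
        return best;
    }

    // True only when the index holds its own copy of the vectors.
    bool OwnsData() const { return !owned_.empty(); }

 private:
    std::size_t dim_;
    std::size_t rows_ = 0;
    const float* data_ = nullptr;   // points at owned_ or at the caller's buffer
    std::vector<float> owned_;      // used only by AddCopy
};
```

In this sketch the search path is the same for both modes; only the ownership of the buffer differs, which is the property the proposal relies on for growing segments.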
Knowhere needs to add unit tests for the new interface `AddExWithoutIds()`.
No extra test cases need to be added in Milvus, because the current growing segment search test cases already cover this change.
Search results and performance will be identical to before.
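A unit test for this property could compare a scan over a copied buffer with a scan over the original buffer reached through a borrowed pointer. The sketch below does this for bit-packed binary vectors with Hamming distance. It checks result equality only, not performance, and the helper names `HammingDistance`, `ScanAll`, and `CopiedAndExternalAgree` are hypothetical, not part of Knowhere or Milvus.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hamming distance between two bit-packed vectors of `bytes` bytes each.
inline int HammingDistance(const std::uint8_t* a, const std::uint8_t* b,
                           std::size_t bytes) {
    int dist = 0;
    for (std::size_t i = 0; i < bytes; ++i) {
        const unsigned diff = static_cast<unsigned>(a[i] ^ b[i]);
        dist += static_cast<int>(std::bitset<8>(diff).count());
    }
    return dist;
}

// Brute-force scan over `rows` vectors stored contiguously at `base`.
// Returns the distance from `query` to every row, in row order.
inline std::vector<int> ScanAll(const std::uint8_t* base, std::size_t rows,
                                std::size_t bytes, const std::uint8_t* query) {
    std::vector<int> distances;
    distances.reserve(rows);
    for (std::size_t r = 0; r < rows; ++r) {
        distances.push_back(HammingDistance(base + r * bytes, query, bytes));
    }
    return distances;
}

// Hypothetical check: a scan over a copy of the segment data and a scan over
// the segment's own buffer must produce the same distances.
inline bool CopiedAndExternalAgree(const std::vector<std::uint8_t>& segment,
                                   std::size_t bytes,
                                   const std::vector<std::uint8_t>& query) {
    assert(bytes > 0 && query.size() == bytes);
    const std::size_t rows = segment.size() / bytes;
    const std::vector<std::uint8_t> copied(segment);  // what a copying add would hold
    const std::uint8_t* external = segment.data();    // what an external add would hold
    return ScanAll(copied.data(), rows, bytes, query.data()) ==
           ScanAll(external, rows, bytes, query.data());
}
```

Because both paths run the same scan over the same bytes, any mismatch would point to a lifetime or layout problem with the borrowed buffer rather than to the search logic.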