...
Briefly list any new interfaces that will be introduced as part of this proposal or any existing interfaces that will be removed or changed.
Faiss needs a new field "xb_ex" and a new interface "add_ex()" for the structure IndexBinaryFlat, and a new field "codes_ex" and a new interface "add_ex()" for the structure IndexFlat (declared in its base structure IndexFlatCodes).
In IndexBinaryFlat, "xb" and "xb_ex" are mutually exclusive: the user cannot set both at the same time. The same holds in IndexFlat, where "codes" and "codes_ex" are mutually exclusive and cannot both be set.
```cpp
//============================================================================
struct IndexBinaryFlat : IndexBinary {
    /// database vectors, size ntotal * d / 8
    std::vector<uint8_t> xb;

    /// external database vectors, size ntotal * d / 8
    uint8_t* xb_ex = nullptr; // <==== newly added
    ... ...
};

void IndexBinaryFlat::add_ex(idx_t n, const uint8_t* x) {
    xb_ex = (uint8_t*)x;
    ntotal = n;
}

//============================================================================
struct IndexFlatCodes : Index {
    ... ...
    /// encoded dataset, size ntotal * code_size
    std::vector<uint8_t> codes;

    /// external encoded dataset, size ntotal * code_size
    uint8_t* codes_ex = nullptr; // <==== newly added
    ... ...
};

void IndexFlatCodes::add_ex(idx_t n, const float* x) {
    FAISS_THROW_IF_NOT(is_trained);
    FAISS_THROW_IF_NOT(codes.empty());
    codes_ex = (uint8_t*)x;
    ntotal = n;
}
```
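The mutual-exclusion rule between the owned buffer and the external pointer can be illustrated with a small standalone sketch. `ToyBinaryFlat`, its `add()` guard, and the `data()` accessor are hypothetical names invented for this illustration; they are not part of Faiss, and the proposal itself does not specify how the check is enforced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for IndexBinaryFlat; NOT the Faiss structure.
struct ToyBinaryFlat {
    size_t code_size;                // bytes per vector (d / 8)
    size_t ntotal = 0;               // number of indexed vectors
    std::vector<uint8_t> xb;         // owned copy of the vectors
    const uint8_t* xb_ex = nullptr;  // borrowed external vectors

    explicit ToyBinaryFlat(size_t code_size) : code_size(code_size) {}

    // Copying path: rejected once an external buffer is attached.
    void add(size_t n, const uint8_t* x) {
        if (xb_ex != nullptr) {
            throw std::logic_error("xb_ex is set; add() is not allowed");
        }
        xb.insert(xb.end(), x, x + n * code_size);
        ntotal += n;
    }

    // Zero-copy path: rejected once owned data exists.
    void add_ex(size_t n, const uint8_t* x) {
        assert(x != nullptr);
        if (!xb.empty()) {
            throw std::logic_error("xb is set; add_ex() is not allowed");
        }
        xb_ex = x;
        ntotal = n;
    }

    // Search code can read through this accessor without knowing
    // which of the two storages is active.
    const uint8_t* data() const {
        return xb_ex != nullptr ? xb_ex : xb.data();
    }
};
```

Routing all reads through one accessor is only one possible design; it keeps the search loops unchanged regardless of which field holds the vectors.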
...
Knowhere needs to add a new interface `AddExWithoutIds()` for both IDMAP and BinaryIDMAP.
```cpp
// Set the external data pointer instead of actually adding (copying) the data.
void AddExWithoutIds(const DatasetPtr&, const Config&);
```
When Knowhere detects that "codes_ex" is used in the current IDMAP index, or that "xb_ex" is used in the current BinaryIDMAP index, serialization is disallowed.
For Milvus, the implementations of the APIs "FloatSearchBruteForce()" and "BinarySearchBruteForce()" will be rewritten, but their interfaces do not need to change.
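A minimal sketch of such a serialization guard is shown below. `ToyFlatStorage` and `serialize_or_throw()` are hypothetical names used only for illustration; the actual Knowhere serialization code and its error handling are not described in this proposal.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical, simplified view of a flat index's storage.
struct ToyFlatStorage {
    std::vector<uint8_t> codes;         // owned data
    const uint8_t* codes_ex = nullptr;  // borrowed external data
    size_t ntotal = 0;
    size_t code_size = 0;
};

// Returns the bytes to persist, or throws when the index only borrows
// external data and therefore owns nothing that could be written out.
std::vector<uint8_t> serialize_or_throw(const ToyFlatStorage& s) {
    if (s.codes_ex != nullptr) {
        throw std::runtime_error(
            "index uses external data; serialization is not supported");
    }
    return s.codes;
}
```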
...
Describe the new thing you want to do in appropriate detail. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences. Use judgment based on the scope of the change.
In Milvus, when a growing segment needs to create an enhanced IDMAP index, it can do so as follows:
```cpp
auto idmap_index = std::make_shared<knowhere::IDMAP>();
idmap_index->Train(train_dataset, conf);
idmap_index->AddExWithoutIds(train_dataset, conf); // <==== call "AddExWithoutIds"
auto result = idmap_index->Query(query_dataset, conf, bitset);
```
This enhanced IDMAP index cannot be serialized. Because it only holds a pointer to the segment's data and never copies it, it can be destroyed automatically at essentially no cost.
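The ownership model behind this can be sketched as follows, assuming the growing segment owns the vectors and outlives the index. `ToyBorrowedFlat` and `search_nearest()` are hypothetical names for this illustration, not the Knowhere IDMAP class or its `Query()` method, and the squared-L2 brute-force loop is a stand-in for whatever metric the real index uses.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical borrowed-data flat index; NOT the Knowhere IDMAP class.
struct ToyBorrowedFlat {
    size_t d;                         // vector dimension
    size_t ntotal = 0;                // number of vectors visible to the index
    const float* codes_ex = nullptr;  // borrowed pointer, never freed here

    explicit ToyBorrowedFlat(size_t d) : d(d) {}

    // Attach the caller's buffer without copying it.
    void add_ex(size_t n, const float* x) {
        assert(x != nullptr);
        codes_ex = x;
        ntotal = n;
    }

    // Brute-force nearest neighbour by squared L2 distance.
    // Returns the row of the closest vector, or -1 if the index is empty.
    long search_nearest(const float* q) const {
        long best = -1;
        float best_dist = std::numeric_limits<float>::max();
        for (size_t i = 0; i < ntotal; ++i) {
            float dist = 0.0f;
            for (size_t j = 0; j < d; ++j) {
                const float diff = codes_ex[i * d + j] - q[j];
                dist += diff * diff;
            }
            if (dist < best_dist) {
                best_dist = dist;
                best = static_cast<long>(i);
            }
        }
        return best;
    }
    // No destructor is declared: destroying the index only drops the pointer,
    // so the caller's buffer is neither freed nor copied.
};
```

In this sketch the caller must keep the buffer alive and unmoved for as long as the index is in use; the index itself offers no protection against a dangling pointer.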
Compatibility, Deprecation, and Migration Plan (optional)
...