...
Typically, it takes several hours to insert one billion entities with 128-dimensional vectors. Much of that time is spent in two major areas: RPC transfer and Pulsar management.
We need a new interface for bulk load that avoids wasting network bandwidth and skips Pulsar management. A brief list of requirements for the new interface:
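To give a rough sense of the data volume behind that figure, the following sketch estimates the raw vector payload for one billion 128-dimensional entities. It assumes 4-byte float32 components and ignores IDs, scalar fields, and RPC framing. The function names and the 1 Gbit/s link speed are hypothetical, chosen only for illustration, and are not part of Milvus.

```python
def raw_vector_bytes(num_entities: int, dim: int, bytes_per_component: int = 4) -> int:
    """Raw size of the vector data alone (assumes float32 components by default)."""
    return num_entities * dim * bytes_per_component


def transfer_hours(total_bytes: int, bandwidth_bytes_per_sec: float) -> float:
    """Idealized time to push total_bytes over a link with no protocol overhead."""
    return total_bytes / bandwidth_bytes_per_sec / 3600


if __name__ == "__main__":
    total = raw_vector_bytes(1_000_000_000, 128)
    print(f"raw vector payload: {total / 1e9:.0f} GB")

    # Hypothetical 1 Gbit/s link, i.e. 125e6 bytes per second.
    hours = transfer_hours(total, 125e6)
    print(f"idealized transfer time: {hours:.2f} hours")
```

Even under these idealized assumptions the vectors alone amount to hundreds of gigabytes, so any path that sends the data through RPC and then again through the message queue multiplies an already large transfer.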
- import data from JSON format files. (first stage)
- import data from Numpy format files. (first stage)
- copy a collection within one Milvus 2.0 service. (second stage)
- copy a collection from one Milvus 2.0 service to another. (second stage)
- import data from Milvus 1.x to Milvus 2.0 (third stage)
- import data from parquet/faiss files (TBD)
...