Current state: Accepted
...
To reduce network transmission and bypass Pulsar, the new interface will allow users to specify the paths of data files (JSON, NumPy, etc.) on MinIO/S3 storage, and let the data nodes read these files directly and parse them into segments. The internal logic of the process becomes:
1. the client calls import() to pass the file paths to the Milvus proxy node
2. the proxy node passes the file paths to the data coordinator node
3. the data coordinator node picks one or more data nodes (according to the sharding number) to parse the files; each file can be parsed into one or more segments
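As a rough illustration of step 3, the sketch below shows one way a coordinator could spread file paths across data nodes. It is not the Milvus implementation: the function name assign_files, the round-robin policy, and the rule of picking at most shard_num nodes are all assumptions made for this example.

```python
from itertools import cycle


def assign_files(files, data_nodes, shard_num):
    """Hypothetical helper: map each file path to one of the picked data nodes.

    Assumption: the coordinator picks at most `shard_num` data nodes and
    hands out files round-robin. The real scheduling policy may differ.
    """
    if not files:
        raise ValueError("files must not be empty")
    if not data_nodes:
        raise ValueError("no data nodes available")
    if shard_num < 1:
        raise ValueError("shard_num must be at least 1")

    # Pick as many nodes as the sharding number allows.
    picked = data_nodes[:min(shard_num, len(data_nodes))]
    tasks = {node: [] for node in picked}

    # Distribute the file paths over the picked nodes in turn.
    for path, node in zip(files, cycle(picked)):
        tasks[node].append(path)
    return tasks


tasks = assign_files(["a.json", "b.json", "c.npy"], ["dn1", "dn2", "dn3"], shard_num=2)
```

Each data node would then read its assigned files from MinIO/S3 and parse them into one or more segments.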
SDK Interfaces
The Python API declaration:
def import(collection_name, files, partition_name=None, bucket=None, default_fields=None)
- collection_name: the target collection name (required)
- partition_name: target partition name (optional)
- files: a list of files with pre-defined format (required)
- bucket: the MinIO/S3 bucket where the files are stored; defaults to the bucket used by the Milvus server (optional)
- default_fields: a dict to set the default value for some fields (optional)
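A minimal sketch of how the parameters above could be validated and packed on the client side. Because import is a reserved keyword in Python, the sketch uses the hypothetical name import_files. The request dictionary layout, the example file path, and the example field name are likewise assumptions for illustration, not the actual SDK or RPC format.

```python
def import_files(collection_name, files, partition_name=None, bucket=None,
                 default_fields=None):
    """Hypothetical client-side stub mirroring the declared parameters."""
    if not collection_name:
        raise ValueError("collection_name is required")
    if not files:
        raise ValueError("files must be a non-empty list of file paths")

    # Required parameters are always sent.
    request = {"collection_name": collection_name, "files": list(files)}

    # Optional parameters are sent only when set. Leaving out `bucket`
    # stands for "use the Milvus server's own bucket" in this sketch.
    if partition_name is not None:
        request["partition_name"] = partition_name
    if bucket is not None:
        request["bucket"] = bucket
    if default_fields is not None:
        request["default_fields"] = dict(default_fields)
    return request


# Example call: import one NumPy file and default the "age" field to 0.
request = import_files("demo", ["data/vectors.npy"], default_fields={"age": 0})
```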
Proxy RPC Interfaces
The declaration of import API in proxy RPC:
...