Current state: Accepted
ISSUE: https://github.com/milvus-io/milvus/issues/7713
...
Released: with Milvus 2.1
Authors:
Summary
...
Deliver a C++ SDK with full functionality for Milvus 2.0, provided to users as both a static library and a dynamic library.
Motivation
...
Many users have asked for a C++ SDK. It is probably the most useful SDK for use in distributed systems.
Public Interfaces
...
Design Details(required)
TODO lists
- API design, with the Java 1.1 SDK and the PyMilvus SDK as references
- Initial version of the code: finish the overall framework, including how the SDK communicates with the Milvus cluster and how unit tests are run.
- Finish the DDL, DML, and control API implementations
- Unit Test
- Usage examples
- API Document
- CI Test
Which C++ versions are we going to support?
...
Client interfaces declaration:
Code Block |
---|
class MilvusClient {
public:
/**
* Create a MilvusClient instance.
*
* @return std::shared_ptr<MilvusClient>
*/
static std::shared_ptr<MilvusClient>
Create();
/**
* Connect to Milvus server.
*
* @param [in] connect_param server address and port
* @return Status operation successfully or not
*/
virtual Status
Connect(const ConnectParam& connect_param) = 0;
/**
* Break connections between client and server.
*
* @return Status operation successfully or not
*/
virtual Status
Disconnect() = 0;
/**
* Create a collection with schema.
*
* @param [in] schema schema of the collection
* @return Status operation successfully or not
*/
virtual Status
CreateCollection(const CollectionSchema& schema) = 0;
/**
* Check existence of a collection.
*
* @param [in] collection_name name of the collection
* @param [out] has true: collection exists, false: collection doesn't exist
* @return Status operation successfully or not
*/
virtual Status
HasCollection(const std::string& collection_name, bool& has) = 0;
/**
* Drop a collection, with all its partitions, index and segments.
*
* @param [in] collection_name name of the collection
* @return Status operation successfully or not
*/
virtual Status
DropCollection(const std::string& collection_name) = 0;
/**
* Load collection data into CPU memory of query node.
* If a timeout is specified, this API calls ShowCollections() to check the collection's loading state
* and waits until the collection is completely loaded into the query node.
*
* @param [in] collection_name name of the collection
* @param [in] progress_monitor set timeout to wait loading progress complete, set to ProgressMonitor::NoWait() to
* return instantly
* @return Status operation successfully or not
*/
virtual Status
LoadCollection(const std::string& collection_name, const ProgressMonitor& progress_monitor) = 0;
/**
* Release collection data from query node.
*
* @param [in] collection_name name of the collection
* @return Status operation successfully or not
*/
virtual Status
ReleaseCollection(const std::string& collection_name) = 0;
/**
* Get collection description, including its schema.
*
* @param [in] collection_name name of the collection
* @param [out] collection_desc collection's description
* @return Status operation successfully or not
*/
virtual Status
DescribeCollection(const std::string& collection_name, CollectionDesc& collection_desc) = 0;
/**
* Get collection statistics; currently only the row count is returned.
* If a timeout is specified, this API calls Flush() and waits until all segments are persisted into storage.
*
* @param [in] collection_name name of the collection
* @param [in] progress_monitor set timeout to wait flush progress complete, set to ProgressMonitor::NoWait() to
* return instantly
* @param [out] collection_stat statistics of the collection
* @return Status operation successfully or not
*/
virtual Status
GetCollectionStatistics(const std::string& collection_name, const ProgressMonitor& progress_monitor,
CollectionStat& collection_stat) = 0;
/**
* If collection_names is empty, list brief information about all collections.
* If collection_names is specified, return the loading state of the specified collections.
*
* @param [in] collection_names name array of collections
* @param [out] collections_info brief information about the collections
* @return Status operation successfully or not
*/
virtual Status
ShowCollections(const std::vector<std::string>& collection_names, CollectionsInfo& collections_info) = 0;
/**
* Create a partition in a collection.
*
* @param [in] collection_name name of the collection
* @param [in] partition_name name of the partition
* @return Status operation successfully or not
*/
virtual Status
CreatePartition(const std::string& collection_name, const std::string& partition_name) = 0;
/**
* Drop a partition, with its index and segments.
*
* @param [in] collection_name name of the collection
* @param [in] partition_name name of the partition
* @return Status operation successfully or not
*/
virtual Status
DropPartition(const std::string& collection_name, const std::string& partition_name) = 0;
/**
* Check existence of a partition.
*
* @param [in] collection_name name of the collection
* @param [in] partition_name name of the partition
* @param [out] has true: partition exists, false: partition doesn't exist
* @return Status operation successfully or not
*/
virtual Status
HasPartition(const std::string& collection_name, const std::string& partition_name, bool& has) = 0;
/**
* Load the data of specific partitions of one collection into query nodes.
* If a timeout is specified, this API calls ShowPartitions() to check the partitions' loading state
* and waits until the partitions are completely loaded into the query nodes.
*
* @param [in] collection_name name of the collection
* @param [in] partition_names name array of the partitions
* @param [in] progress_monitor set timeout to wait loading progress complete, set to
* ProgressMonitor::NoWait() to return instantly
* @return Status operation successfully or not
*/
virtual Status
LoadPartitions(const std::string& collection_name, const std::vector<std::string>& partition_names,
const ProgressMonitor& progress_monitor) = 0;
/**
* Release the data of specific partitions of one collection from query nodes.
*
* @param [in] collection_name name of the collection
* @param [in] partition_names name array of the partitions
* @return Status operation successfully or not
*/
virtual Status
ReleasePartitions(const std::string& collection_name, const std::vector<std::string>& partition_names) = 0;
/**
* Get partition statistics; currently only the row count is returned.
* If a timeout is specified, this API calls Flush() and waits until all segments are persisted into storage.
*
* @param [in] collection_name name of the collection
* @param [in] partition_name name of the partition
* @param [in] progress_monitor set timeout to wait flush progress complete, set to ProgressMonitor::NoWait() to
* return instantly
* @param [out] partition_stat statistics of the partition
* @return Status operation successfully or not
*/
virtual Status
GetPartitionStatistics(const std::string& collection_name, const std::string& partition_name,
const ProgressMonitor& progress_monitor, PartitionStat& partition_stat) = 0;
/**
* If partition_names is empty, list brief information about all partitions.
* If partition_names is specified, return the loading state of the specified partitions.
*
* @param [in] collection_name name of the collection
* @param [in] partition_names name array of the partitions
* @param [out] partitions_info brief information about the partitions
* @return Status operation successfully or not
*/
virtual Status
ShowPartitions(const std::string& collection_name, const std::vector<std::string>& partition_names,
PartitionsInfo& partitions_info) = 0;
/**
* Create an alias for a collection. Alias can be used in search or query to replace the collection name.
* For more information: https://wiki.lfaidata.foundation/display/MIL/MEP+10+--+Support+Collection+Alias
*
* @param [in] collection_name name of the collection
* @param [in] alias alias of the collection
* @return Status operation successfully or not
*/
virtual Status
CreateAlias(const std::string& collection_name, const std::string& alias) = 0;
/**
* Drop an alias.
*
* @param [in] alias alias of the collection
* @return Status operation successfully or not
*/
virtual Status
DropAlias(const std::string& alias) = 0;
/**
* Change an alias from one collection to another.
*
* @param [in] collection_name name of the collection
* @param [in] alias alias of the collection
* @return Status operation successfully or not
*/
virtual Status
AlterAlias(const std::string& collection_name, const std::string& alias) = 0;
/**
* Create an index on a field. Currently only indexes on vector fields are supported.
*
* @param [in] collection_name name of the collection
* @param [in] index_desc the index descriptions and parameters
* @param [in] progress_monitor set timeout to wait index progress complete, set to ProgressMonitor::NoWait() to
* return instantly
* @return Status operation successfully or not
*/
virtual Status
CreateIndex(const std::string& collection_name, const IndexDesc& index_desc,
const ProgressMonitor& progress_monitor) = 0;
/**
* Get index descriptions and parameters.
*
* @param [in] collection_name name of the collection
* @param [in] field_name name of the field
* @param [out] index_desc index descriptions and parameters
* @return Status operation successfully or not
*/
virtual Status
DescribeIndex(const std::string& collection_name, const std::string& field_name, IndexDesc& index_desc) = 0;
/**
* Get the state of an index. From the state, the client can tell whether index building has finished or is still in progress.
*
* @param [in] collection_name name of the collection
* @param [in] field_name name of the field
* @param [out] state index state of field
* @return Status operation successfully or not
*/
virtual Status
GetIndexState(const std::string& collection_name, const std::string& field_name, IndexState& state) = 0;
/**
* Get the progress of an index. From the progress, the client can tell how many rows have been indexed.
*
* @param [in] collection_name name of the collection
* @param [in] field_name name of the field
* @param [out] progress progress array of field, currently only return one index progress
* @return Status operation successfully or not
*/
virtual Status
GetIndexBuildProgress(const std::string& collection_name, const std::string& field_name,
IndexProgress& progress) = 0;
/**
* Drop index of a field.
*
* @param [in] collection_name name of the collection
* @param [in] field_name name of the field
* @return Status operation successfully or not
*/
virtual Status
DropIndex(const std::string& collection_name, const std::string& field_name) = 0;
}; |
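Several methods above accept a ProgressMonitor and either wait for completion or return instantly when ProgressMonitor::NoWait() is passed. The proposal does not show the ProgressMonitor declaration, so the following is only a sketch of how such a wait could work. The constructor arguments, the accessors TimeoutMs() and IntervalMs(), and the helper WaitUntilDone() are assumptions made for illustration and are not part of the declared interface. The is_done callback stands in for a state check such as a call to ShowCollections().

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

// Hypothetical stand-in for the SDK's ProgressMonitor; the fields are assumptions.
class ProgressMonitor {
 public:
    explicit ProgressMonitor(std::uint32_t timeout_ms, std::uint32_t interval_ms = 10)
        : timeout_ms_(timeout_ms), interval_ms_(interval_ms) {}

    // A zero timeout is used here to mean "do not wait at all".
    static ProgressMonitor NoWait() { return ProgressMonitor(0); }

    std::uint32_t TimeoutMs() const { return timeout_ms_; }
    std::uint32_t IntervalMs() const { return interval_ms_; }

 private:
    std::uint32_t timeout_ms_;
    std::uint32_t interval_ms_;
};

// Hypothetical helper: polls is_done until it returns true or the timeout elapses.
// Returns true only if completion was observed. With NoWait() it returns false
// immediately without calling is_done.
bool WaitUntilDone(const std::function<bool()>& is_done, const ProgressMonitor& monitor) {
    if (monitor.TimeoutMs() == 0) {
        return false;
    }
    const auto deadline =
        std::chrono::steady_clock::now() + std::chrono::milliseconds(monitor.TimeoutMs());
    while (true) {
        if (is_done()) {
            return true;
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(monitor.IntervalMs()));
    }
}
```

In this sketch a method such as LoadCollection() would first send the load request and then call WaitUntilDone() with a callback that checks the loading state. Whether a timeout should surface as an error Status is left open here.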
Design Details
Project framework
The C++ SDK can be designed in two levels:
- the ORM classes: ConnectionInstance/Collection/Partition/Index/Schema/Parameters, and maybe ConnectionPool
- the client implementation: a class to maintain the gRPC channel, and a class to translate parameters into RPC calls
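A minimal sketch of the two levels, under stated assumptions: the Status class, the Client interface, the InMemoryClient stand-in, and the Collection wrapper below are all hypothetical and simplified. In particular, CreateCollection() takes only a name here instead of a CollectionSchema, and InMemoryClient keeps state in memory where the real client level would issue gRPC calls.

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical minimal Status; the real SDK type may carry error codes as well.
class Status {
 public:
    static Status OK() { return Status(true, ""); }
    static Status Error(const std::string& message) { return Status(false, message); }
    bool IsOk() const { return ok_; }
    const std::string& Message() const { return message_; }

 private:
    Status(bool ok, std::string message) : ok_(ok), message_(std::move(message)) {}
    bool ok_;
    std::string message_;
};

// Client level: the only layer that would talk to the server.
class Client {
 public:
    virtual ~Client() = default;
    virtual Status CreateCollection(const std::string& collection_name) = 0;
    virtual Status HasCollection(const std::string& collection_name, bool& has) = 0;
    virtual Status DropCollection(const std::string& collection_name) = 0;
};

// Stand-in for the gRPC-backed client: remembers collection names in memory.
class InMemoryClient : public Client {
 public:
    Status CreateCollection(const std::string& collection_name) override {
        if (!names_.insert(collection_name).second) {
            return Status::Error("collection already exists: " + collection_name);
        }
        return Status::OK();
    }
    Status HasCollection(const std::string& collection_name, bool& has) override {
        has = names_.count(collection_name) > 0;
        return Status::OK();
    }
    Status DropCollection(const std::string& collection_name) override {
        if (names_.erase(collection_name) == 0) {
            return Status::Error("collection not found: " + collection_name);
        }
        return Status::OK();
    }

 private:
    std::set<std::string> names_;
};

// ORM level: binds a collection name to a client so callers work with an object.
class Collection {
 public:
    Collection(std::shared_ptr<Client> client, std::string name)
        : client_(std::move(client)), name_(std::move(name)) {}

    Status Create() { return client_->CreateCollection(name_); }
    Status Drop() { return client_->DropCollection(name_); }
    bool Exists() const {
        bool has = false;
        return client_->HasCollection(name_, has).IsOk() && has;
    }

 private:
    std::shared_ptr<Client> client_;
    std::string name_;
};
```

Because the ORM level depends only on the abstract client, the same Collection code can run against a real server connection or an in-memory stand-in during tests.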
CI Workflow
Use the GitHub CI process to run code lint and the clang-format check, compile the project, and run the unit tests.
Use Mergify to automatically add the ci-passed label.
Use a code coverage tool to generate a report and upload it to codecov.io.
API Document
Add a description for each class, method, and constant, following the Doxygen comment style.
ORM
To be determined.
C++ versions to support
Support C++11 and above. The reasons are:
- Wider user range, as many organizations and devices support C++11.
- Easier to maintain, as a large group of developers can support it.
OS platform to support
Supported platforms need to be tested on mainstream distributions (e.g. Ubuntu 18.04+, CentOS 7+) using Google Test.
The Milvus C++ SDK 1.1 uses CMake and builds only a shared library. In the new SDK for Milvus 2, users can choose which library type (static or shared) to build by setting CMake options. Quality gates such as clang-format, clang-tidy, and cpplint are needed.
Code Style
Basic rules of C++ code style:
- Namespaces should use lower_case
- Class names should use CamelCase
- Class member names should use lower_case_ (with a trailing underscore)
- Enum member names should use UPPER_CASE
- Static/public function names should use CamelCase, and private/protected member function names should use camelBack
For more details, follow the Google C++ Style Guide.
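The naming rules above can be illustrated with a small sketch. The namespace milvus_example, the class IndexTracker, and its members are hypothetical names chosen only to show each rule. They are not part of the SDK.

```cpp
#include <cstdint>
#include <string>
#include <utility>

namespace milvus_example {  // namespace: lower_case

// Enum members: UPPER_CASE.
enum class IndexState { NONE, IN_PROGRESS, FINISHED };

class IndexTracker {  // class name: CamelCase
 public:
    explicit IndexTracker(std::string field_name) : field_name_(std::move(field_name)) {}

    // Public functions: CamelCase.
    void ReportRows(std::int64_t indexed_rows, std::int64_t total_rows) {
        indexed_rows_ = indexed_rows;
        total_rows_ = total_rows;
        updateState();
    }
    IndexState State() const { return state_; }
    const std::string& FieldName() const { return field_name_; }

 private:
    // Private member functions: camelBack.
    void updateState() {
        if (total_rows_ > 0 && indexed_rows_ >= total_rows_) {
            state_ = IndexState::FINISHED;
        } else if (indexed_rows_ > 0) {
            state_ = IndexState::IN_PROGRESS;
        } else {
            state_ = IndexState::NONE;
        }
    }

    // Class members: lower_case_ with a trailing underscore.
    std::string field_name_;
    std::int64_t indexed_rows_ = 0;
    std::int64_t total_rows_ = 0;
    IndexState state_ = IndexState::NONE;
};

}  // namespace milvus_example
```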
Test Plan
...
- Unit test
  - The C++ SDK will implement a mock Milvus for basic testing; it needs to be tested on mainstream distributions (e.g. Ubuntu 18.04+, CentOS 7+).
  - Start a standalone Milvus for more complicated tests.
- CI test
  - Do we need to set up a basic CI test for further improvement?
- Examples
  - Finish all the examples in the user guide and make sure they work, like https://milvus.io/docs/v2.0.0/example_code.md
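The mock Milvus mentioned in the unit-test item could look roughly like the sketch below. It uses a hand-written mock rather than a mocking framework so that the example stays self-contained. The Status struct, the CollectionApi interface, the EnsureCollection() helper under test, and the MockMilvus struct are hypothetical and cover only two calls.

```cpp
#include <string>
#include <vector>

// Hypothetical minimal Status used only by this sketch.
struct Status {
    bool ok;
    std::string message;
};

// Hypothetical slice of the client interface that the code under test depends on.
class CollectionApi {
 public:
    virtual ~CollectionApi() = default;
    virtual Status HasCollection(const std::string& collection_name, bool& has) = 0;
    virtual Status CreateCollection(const std::string& collection_name) = 0;
};

// Code under test: create the collection only when it does not exist yet.
Status EnsureCollection(CollectionApi& api, const std::string& collection_name) {
    bool has = false;
    Status status = api.HasCollection(collection_name, has);
    if (!status.ok || has) {
        return status;
    }
    return api.CreateCollection(collection_name);
}

// Hand-written mock: records every call and lets a test inject a failure.
struct MockMilvus : public CollectionApi {
    bool exists = false;
    bool fail_has = false;
    std::vector<std::string> calls;

    Status HasCollection(const std::string& collection_name, bool& has) override {
        calls.push_back("HasCollection:" + collection_name);
        if (fail_has) {
            return Status{false, "server unavailable"};
        }
        has = exists;
        return Status{true, ""};
    }
    Status CreateCollection(const std::string& collection_name) override {
        calls.push_back("CreateCollection:" + collection_name);
        exists = true;
        return Status{true, ""};
    }
};
```

A test can then check both the returned Status and the recorded call sequence, for example that a second EnsureCollection() call on the same name does not issue another create request.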
Rejected Alternatives(optional)
...
References
PRs:
Keywords: C++ SDK
Authors: @matrixji @ArkaprabhaChakraborty @yhmo