...
- Machine learning has two key steps: model training and result inference. The first release of this protocol focuses only on inference. Training is described here, but it is subject to further discussion.
Overall Flow
REST APIs
All of the REST API calls presented below use bearer tokens for authorization. The {prefix} of each API is configurable on the hosting server. This protocol is inspired by Delta Sharing.
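As a minimal sketch of what a bearer-token call under a configurable prefix could look like, the snippet below only builds the request and does not send it. The host, the `/obaic/v1` prefix, the `models` path, and the token value are all placeholders assumed for illustration; none of them is defined by this protocol text.

```python
from urllib.request import Request


def build_request(host: str, prefix: str, path: str, token: str) -> Request:
    """Build a GET request for an endpoint under a configurable prefix."""
    # Join host, prefix, and path without producing doubled slashes.
    url = f"{host.rstrip('/')}/{prefix.strip('/')}/{path.lstrip('/')}"
    # The bearer token travels in the standard Authorization header.
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# Hypothetical values, chosen only to show the shape of the call.
req = build_request("https://ai.example.com", "/obaic/v1", "models", "example-token")
print(req.full_url)
print(req.get_header("Authorization"))
```

Keeping the prefix as a parameter means the same client code can target servers that host the API under different paths.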
...
High Level Protocol - Training
- The BI tool has data on which predictive analytics would be valuable.
- The BI tool asks the AI platform, through OBAIC, to train or prepare a model that accepts features of certain types (numeric, categorical, text, etc.). The request provides a token that grants access to the training data, together with a SQL statement to run against the datastore.
- The BI tool polls for the status and result of the training. When training completes, the results and model performance are returned.
- The AI vendor provides predictions on data shared by the BI vendor, again using an access token.
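The polling step above can be sketched as a small loop. This is an illustration under stated assumptions, not the protocol's definition: the `state` field, the `COMPLETED` and `FAILED` values, and the `performance` key are hypothetical names, and `get_status` stands in for whatever HTTP call a real client would make.

```python
import time
from typing import Callable


def wait_for_training(
    get_status: Callable[[], dict],
    interval_s: float = 1.0,
    max_polls: int = 60,
) -> dict:
    """Poll until training reaches a terminal state, then return the status."""
    for _ in range(max_polls):
        status = get_status()
        # "state" and its values are assumed names, not part of the source text.
        if status.get("state") in ("COMPLETED", "FAILED"):
            return status
        time.sleep(interval_s)
    raise TimeoutError("training did not finish within the polling budget")


# Stand-in for the AI platform: two in-progress replies, then a final one.
# The performance figure is made-up sample data.
responses = iter([
    {"state": "RUNNING"},
    {"state": "RUNNING"},
    {"state": "COMPLETED", "performance": {"accuracy": 0.9}},
])
result = wait_for_training(lambda: next(responses), interval_s=0.0)
print(result["state"])
```

Passing the status call in as a function keeps the loop independent of any particular HTTP client and makes it easy to exercise without a server.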
High Level Protocol - Inference
List Models - Step (1)
...
Potential Future Enhancement
...
...
Next Step
- Finalize Logo
- Determine which other AI frameworks OBAIC can support besides ONNX and PMML
Potential Future Enhancement
- Formally define the JSON structures in JSON Schema (http://json-schema.org/) so that future development can validate them
- Define a data pipeline to transform data before running
- Define a containerized model so that prediction can run in the BI tool instead of on the AI platform
- Define the format of nextPageToken
- Define the different types of errorCode and message for each API call
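Because the format of nextPageToken is still undefined, a client would most safely treat it as an opaque string and simply hand it back on the next call. The sketch below assumes this, and also assumes a response shape with an `items` list, which is a hypothetical field name; `fetch_page` stands in for the real list call.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield items from every page, following nextPageToken until it is absent."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        # "items" is an assumed field name for the page contents.
        yield from page.get("items", [])
        # The token is never parsed, only passed back as-is.
        token = page.get("nextPageToken")
        if not token:
            return


# Stand-in for a paginated listing, keyed by the token that requests each page.
pages = {
    None: {"items": [{"name": "m1"}, {"name": "m2"}], "nextPageToken": "t1"},
    "t1": {"items": [{"name": "m3"}]},
}
names = [item["name"] for item in iter_items(lambda token: pages[token])]
print(names)
```

Treating the token as opaque lets the server change its encoding later without breaking clients.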
References
- Tableau version of OBAIC: https://tableau.github.io/analytics-extensions-api/docs/ae_example_tabpy.html
- Qlik version of OBAIC: https://github.com/qlik-oss/server-side-extension
- Delta Sharing: https://github.com/delta-io/delta-sharing/blob/main/PROTOCOL.md#delta-sharing-protocol
Decision to be made
- Data file type: which types of data are we supporting? For example, Delta requires Parquet; what about RDBMS? Jeffrey's initial cut below can be modified to support multiple data types, depending on the use case.
- Inference: pass by value should be good enough if it is only used for prediction
- Train: not immediate, maybe later in Phase 2
- Metadata structure: what kind of JSON schema do we need?
- Do we support only a specific model type (ONNX) or an arbitrary number of frameworks?
- Decouple model (asking the model to predict and train) and data (listing, upload, download)
- Finalize Logo
FAQ
Why should I share our model with you?
Ownership? Model and Data?
Security?
How can the data be accessed mechanically, for training?
Original content from Jeffrey. To be integrated with the main content
This is a short doc illustrating a sample skeleton OBAIC protocol. This proposal envisions a data-centric workflow:
...
FAQ
- Why should the AI platform share its model with the BI tool?
- OBAIC assumes that a single organization owns both the BI tool(s) and the AI platform(s). However, these are two (or more) discrete systems that may lack a good way to integrate. OBAIC connects them.
- Who owns the model and data?
- The AI platform owns the model but shares it with BI tools through OBAIC. The data is owned by the business, but the BI tool has been authorized to use it and to re-share it with the AI platform for training and inference.
- How is security handled?
- Calls are made over HTTPS and authorized with standard bearer tokens.
Authors
Name | Affiliation |
---|---|
Cupid Chan | Pistevo Decision |
Xiangxiang Meng | Redfin |
Deepak Karuppiah | MicroStrategy |
Nancy Rausch | SAS |
Dalton Ruer | Qlik |
Sachin Sinha | Microsoft |
Yi Shao | IBM |
Jeffrey Tang | Predibase |
Lingyan Yin | Salesforce |
...
Train a New Model
function TrainModel(inputs, outputs, modelOptions, dataConfig) -> UUID |
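To make the signature above concrete, here is a minimal in-memory sketch in Python. It only records the request and returns a fresh UUID; no training happens. The `TrainingJob` record, the `JOBS` registry, the `QUEUED` status, and the shapes of the four arguments are all assumptions for illustration and are not specified by the signature.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TrainingJob:
    """Hypothetical record of one training request."""
    inputs: List[dict]
    outputs: List[dict]
    model_options: dict
    data_config: dict
    status: str = "QUEUED"  # assumed initial state


# Hypothetical in-memory registry standing in for the AI platform's job store.
JOBS: Dict[uuid.UUID, TrainingJob] = {}


def train_model(inputs, outputs, model_options, data_config) -> uuid.UUID:
    """Register a training request and return the identifier used to poll it."""
    model_id = uuid.uuid4()
    JOBS[model_id] = TrainingJob(inputs, outputs, model_options, data_config)
    return model_id


# Example call; every field name and value here is made up.
model_id = train_model(
    inputs=[{"name": "age", "type": "numeric"}],
    outputs=[{"name": "churn", "type": "categorical"}],
    model_options={},
    data_config={"query": "SELECT age, churn FROM customers"},
)
print(JOBS[model_id].status)
```

Returning the identifier immediately, rather than blocking until training ends, matches the polling flow described for training.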
...