Open Business and Artificial Intelligence Connectivity (OBAIC)
Overview
Open Business and Artificial Intelligence Connectivity (OBAIC) borrows its concept from Open Database Connectivity (ODBC), an interface that makes it possible for applications to access data from a variety of database management systems (DBMSs). The aim of OBAIC is to define an interface that allows BI tools to access machine learning models and to run inference against those models on a variety of AI platforms: "AI ODBC for BI".
Through OBAIC, BI vendors can connect to any AI platform freely without concerning themselves with the underlying implementation, or with how the AI platform trains the model or infers results. This mirrors how databases are used via ODBC: the caller does not need to be concerned with how the database stores the data or executes queries.
The committee has decided that this standard will only define the REST API protocol for how AI and BI communicate. The design and actual implementation of OBAIC, such as whether it should be server-based, serverless, or Docker-based, is left up to the implementing vendors. If this protocol grows into another open-source project, that team may provide implementation guidance and examples.
There are 3 key aspects designed into this standard:
BI - What specific call do I need this standard to provide so that I can better leverage the AI/ML platform counterpart?
AI - What should the common denominator be for an AI platform that supports this standard?
Data - Shall data be moved around in the communication between AI and BI (passed by value), or will the data remain in its source location (passed by reference)?
All of the REST API calls presented below use bearer tokens for authorization. The {prefix} of each API is configurable on the hosting server. This protocol is inspired by Delta Sharing.
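As a minimal sketch of what such a call could look like from the BI side, the Python helper below builds (but does not send) a request carrying a bearer token under a configurable prefix. The host, the `obaic/v1` prefix, the `/models` path, the URL layout, and the helper name `build_request` are illustrative assumptions, not part of the standard.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_request(
    base_url: str,
    prefix: str,
    path: str,
    token: str,
    method: str = "GET",
    body: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) an authorized request.

    Hypothetical helper: the layout {base_url}/{prefix}/{path} is an assumption.
    """
    # Join the pieces so stray slashes in the configured prefix are tolerated.
    url = "/".join([base_url.rstrip("/"), prefix.strip("/"), path.lstrip("/")])
    # Every call carries the bearer token in the Authorization header.
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    data = None
    if body is not None:
        # Request bodies are sent as JSON.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Example with placeholder values; nothing is sent over the network.
request = build_request("https://ai.example.com", "obaic/v1", "/models", "my-token")
print(request.full_url)
print(request.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; keeping construction separate makes the header and prefix handling easy to check in isolation.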
SwaggerHub API (this document will be ported to SwaggerHub after the design is confirmed)
https://app.swaggerhub.com/apis-docs/cupidchan/OBAIC/1.0.0-oas3
Protocol - Training
* Blue text below, either in the diagram or in the description, is out of scope for OBAIC and is up to the BI tool, AI platform, or data source vendor to implement. OBAIC is the connective tissue that coordinates communication among them to extend the capabilities of these 3 major components.
An end user analyzing data in their BI tool determines that predictive analytics would be valuable and wishes to train a model with the data for that purpose. This is the traditional step in which a user interacts with BI.
(a) Obtain a token with the permissions associated with the user making the request. This token is passed to the AI platform, allowing it to access the training data with a SQL statement run against the datastore. (b) The BI tool, on behalf of the user, requests through OBAIC that the AI platform train/prepare a model that accepts features of certain types (numeric, categorical, text, etc.).
The AI platform provides the implementation to fulfill the request by connecting to the data source with the provided token and reading the set of training data specified in SQL. How the AI platform interacts with the data source to perform the training is up to the platform.
The BI tool polls for the status or retrieves the training result. If training is still in progress, the status is returned. When training is complete, the results and performance of the model are returned.
The BI tool presents the result to the user in its own way, which is the "secret sauce" unique to each tool.
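The polling step above could be sketched roughly as follows. This is a hypothetical illustration: the `status` field, its `IN_PROGRESS` and `COMPLETED` values, the `performance` field, and the function name `wait_for_training` are assumptions; only `modelID` is named in this document. The HTTP call itself is injected as `fetch_status` so the loop can be exercised without a server.

```python
import time
from typing import Any, Callable, Dict


def wait_for_training(
    fetch_status: Callable[[], Dict[str, Any]],
    poll_interval: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the training job leaves the (assumed) IN_PROGRESS state."""
    for attempt in range(max_attempts):
        payload = fetch_status()
        # Anything other than "still in progress" is treated as a final answer.
        if payload.get("status") != "IN_PROGRESS":
            return payload
        # Wait before the next poll, but not after the last one.
        if attempt < max_attempts - 1:
            sleep(poll_interval)
    raise TimeoutError(f"training still in progress after {max_attempts} polls")


# Stand-in for the real HTTP call: two "in progress" answers, then a result.
responses = iter([
    {"status": "IN_PROGRESS"},
    {"status": "IN_PROGRESS"},
    {"status": "COMPLETED", "modelID": "m-123", "performance": {"accuracy": 0.9}},
])
result = wait_for_training(lambda: next(responses), sleep=lambda seconds: None)
print(result["modelID"])
```

Bounding the number of attempts keeps a stalled training job from blocking the BI tool indefinitely; a real client might prefer exponential backoff over a fixed interval.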
Protocol - Inference
1. When a BI user wants to extend their capability with AI, the BI tool reaches out to the AI platform and requests a list of the available models that the credential of the provided token is authorized to see.
Example:
2. After the list of models is returned, the BI user can selectively retrieve the details of the model(s). This step can also be called right after a newly trained model is completed, as described in the previous section, since the modelID is returned as a result of the training request.
Example:
3. The BI tool uses the information retrieved from the AI platform to show the user which types of models are available and how they perform. It can optionally match the data and suggest which models may be a good fit based on the data the user has.
4. The user interacts with the result the BI tool presents and decides which model is a good fit for making predictions on a certain set of data. Note that the model can also be returned as the result of the training step described in the previous section. In that case, the user may bypass these 2 steps and go directly to the result.
5. Once the BI user/developer decides which model to run for predictions, they take the appropriate actions in the BI tool to prepare the data, then call OBAIC to request that it run that model with the data.
The query should be in the request body.
Explanation
6. The AI platform connects to the underlying data source and runs the prediction using the information provided by the BI tool.
7. If the result cannot be returned immediately because of the prediction volume, the BI tool can poll for the result.
In case of pass by reference:
In case of pass by value:
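The difference between the two cases can also be made explicit in the request body the BI tool sends in step 5. The sketch below is a hypothetical illustration only: `modelID` is the one field named in this document, while `mode`, `query`, `dataToken`, `rows`, and the function name `build_inference_body` are assumptions chosen for readability.

```python
from typing import Any, Dict, List, Optional


def build_inference_body(
    model_id: str,
    *,
    query: Optional[str] = None,
    data_token: Optional[str] = None,
    rows: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Build a request body for exactly one of the two data-passing modes."""
    by_reference = query is not None
    by_value = rows is not None
    # Exactly one mode must be chosen: a SQL query or inline rows.
    if by_reference == by_value:
        raise ValueError("provide exactly one of `query` or `rows`")
    if by_reference:
        # Pass by reference: the data stays in its source location and the
        # AI platform reads it with the token and the SQL statement.
        if data_token is None:
            raise ValueError("pass by reference needs a token for the data source")
        return {
            "modelID": model_id,
            "mode": "reference",
            "query": query,
            "dataToken": data_token,
        }
    # Pass by value: the rows travel inside the request itself.
    return {"modelID": model_id, "mode": "value", "rows": rows}


by_ref = build_inference_body(
    "m-123", query="SELECT age, income FROM customers", data_token="ds-token"
)
by_val = build_inference_body("m-123", rows=[{"age": 41, "income": 52000}])
print(by_ref["mode"], by_val["mode"])
```

Rejecting ambiguous input up front keeps the AI platform from having to guess which data source the caller intended.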
Error - Applies to all API calls above
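Because the error format is shared by every call, a client could handle it in one place. The sketch below is hypothetical: `errorCode` and `message` are the fields named in this document, while the exception class `ObaicError`, the `UNKNOWN` fallback, and the assumption that errors arrive as a JSON object with an HTTP status of 400 or above are illustrative only.

```python
import json


class ObaicError(Exception):
    """Hypothetical exception carrying the error fields of a failed call."""

    def __init__(self, status_code: int, error_code: str, message: str) -> None:
        super().__init__(f"{status_code} {error_code}: {message}")
        self.status_code = status_code
        self.error_code = error_code
        self.message = message


def raise_for_error(status_code: int, body_text: str) -> None:
    """Raise ObaicError when the response looks like an error; otherwise return."""
    if status_code < 400:
        return
    try:
        payload = json.loads(body_text)
    except ValueError:
        # The body was not JSON at all; fall back to the raw text below.
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    raise ObaicError(
        status_code,
        str(payload.get("errorCode", "UNKNOWN")),
        str(payload.get("message", body_text)),
    )


try:
    raise_for_error(404, '{"errorCode": "MODEL_NOT_FOUND", "message": "no such model"}')
except ObaicError as error:
    print(error.error_code, error.message)
```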
Next Step
Finalize Logo
Determine what other AI model formats can be supported by OBAIC besides ONNX, Neuropod and PMML (per https://landscape.lfai.foundation/card-mode?category=format-interface&grouping=category)
Future Enhancement
Define potential values for each parameter in the API calls
Formally define the JSON with JSON Schema (http://json-schema.org/) so that future development can validate the JSON structure
Define data pipeline to transform data before running
Define containerized model so that prediction can run in BI instead of in AI
Define format of nextPageToken
Define different types of `errorCode` and `message` for each API call
FAQ
Why should AI vendors care to participate in the OBAIC standard?
Most AI model training and execution is done by a very small set of data scientists. The OBAIC protocol extends the reach and capability of AI vendors. They will no longer need to work with BI partners to create one-off implementations and continually maintain them.
Why should BI vendors care to participate in the OBAIC standard?
Providing end users with access to predictions/prescriptions is the desired goal; writing one-off drivers to support every AI flavor of the week is not. The OBAIC standard gives BI vendors the opportunity to build and support 1 driver while enabling all customers and prospects to bring their own AI implementation(s) to the table, so that deals are not lost because no custom driver is in place for a customer to expand or a prospect to purchase.
Who owns the model and data?
The AI platform owns the models but shares them with BI tools through OBAIC. The data is owned by the business, but BI has been authorized to use it and re-share it with AI for training and inference.
How do you deal with security?
Calls are made over HTTPS and authorized with standard bearer tokens.
References
Tableau version of OBAIC: https://tableau.github.io/analytics-extensions-api/docs/ae_example_tabpy.html
Qlik version of OBAIC: https://github.com/qlik-oss/server-side-extension
Delta Sharing: https://github.com/delta-io/delta-sharing/blob/main/PROTOCOL.md#delta-sharing-protocol
Authors
| Name | Affiliation |
|---|---|
| Cupid Chan | Pistevo Decision |
| Xiangxiang Meng | Upstart |
| Deepak Karuppiah | MicroStrategy |
| Dalton Ruer | Qlik |
| Sachin Sinha | Microsoft |
| Yi Shao | IBM |
| Stu Sztukowski | SAS |
| Jeffrey Tang | |