The Cognitio API is a simple REST-ish interface over HTTP(S) that provides access to the HeuroLabs computational intelligence platform. The goal of the API is to enable customers to easily extract knowledge from content and act upon it in their mobile, web or desktop applications. Customers can sign up for an account and receive the credentials to access the API. Once the account is set up, the API revolves around the concept of a pipeline. A pipeline is a logical construct through which data is submitted to the HeuroLabs platform for processing, including entity identification and knowledge extraction. A customer can have one or more pipelines. Pipelines are envisioned to support both active and transparent ingestion; in both cases ingestion is asynchronous, and more details can be found below. Results can likewise be retrieved actively or transparently. Transparent retrieval may require configuring a destination (message queue, database server, etc.) and credentials with write permission so that the results can be persisted. In the following sections we cover the main operations and concepts of the platform and how they are exposed by the API.
One attribute of the HeuroLabs platform is that we try to minimize the burden on our users as much as possible, as long as it makes sense. For example, we will not force our users to deal with streamed results until we have clarified the requirements with them. When dealing with content, users of the HeuroLabs platform do not need to specify what the content is or which modules to apply to their pipelines, let alone build models or upload training data. The HeuroLabs platform is built to understand the content, has a growing set of models, and is continuously learning and improving. The platform is offered as a service in the cloud, or on private premises where a public cloud solution is inappropriate.
Customers can sign up for the API by providing their email address and setting a password. They can then use these credentials to create a token, either programmatically through the API or using the HeuroLabs Management Console. Customers can create or delete tokens as needed. We plan to offer more granular access control in the future for customers who need it. The signup operation is supported both through the API and through the Management Console.
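To make the token step concrete, here is a minimal sketch of what a token request could look like from Python. The host name, the `/tokens` path and the use of HTTP Basic authentication are all assumptions made for illustration; this post does not specify them, so check the API reference for the actual values. The function only builds the request and does not send it.

```python
import base64
import urllib.request

# Hypothetical base URL: the real API host is not given in this post.
BASE_URL = "https://api.example.invalid"


def build_token_request(email: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a request that exchanges credentials for a token."""
    # Assumption: credentials are passed via HTTP Basic auth to a /tokens endpoint.
    raw = f"{email}:{password}".encode("utf-8")
    credentials = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(
        url=f"{BASE_URL}/tokens",
        method="POST",
        headers={"Authorization": f"Basic {credentials}"},
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate from sending makes the request easy to inspect before any credentials leave the machine.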
As explained in the introduction, a pipeline is a logical concept that covers the ingestion, processing and results of some data input through the HeuroLabs platform. The input can be discrete in itself or can be discretized in the process. An example of a discrete input is an image file or a text article. An example of a stream that will be discretized for processing is a video stream from a live webcam or an audio stream from a radio broadcast. A user of the platform can create as many pipelines as they need and submit inputs in different modalities, including text, audio and video, with more modalities planned for public release soon. A pipeline must have a name, which should be descriptive.
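As an illustration of pipeline creation, the sketch below builds a request that creates a named pipeline. The `/pipelines` path, the JSON body with a `name` field and the bearer-token header are hypothetical; the only thing taken from the text above is that a pipeline must have a name.

```python
import json
import urllib.request

# Hypothetical base URL: the real API host is not given in this post.
BASE_URL = "https://api.example.invalid"


def build_create_pipeline_request(token: str, name: str) -> urllib.request.Request:
    """Build (but do not send) a request that creates a named pipeline."""
    # A pipeline must have a name, so reject empty or whitespace-only names early.
    if not name.strip():
        raise ValueError("a pipeline must have a name")
    # Assumption: the endpoint accepts a JSON object with a "name" field.
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/pipelines",
        data=body,
        method="POST",
        headers={
            # Assumption: the access token is sent as a bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```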
Once a pipeline is created, the owner of this pipeline can ingest data into it using the pipeline id and their access token. Active ingestion supports either submitting a URL where the content to be analyzed can be found, or uploading files with their content. On a successful submission the user will receive a success status code and a key. This key can be used to poll the system for processing status and to retrieve the results.
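Active ingestion of a URL could be sketched as follows. The endpoint path, the `url` field in the request body and the `key` field in the response are assumptions for illustration only; what the text above does establish is that a submission needs the pipeline id and the access token, and that a successful submission returns a key.

```python
import json
import urllib.request

# Hypothetical base URL: the real API host is not given in this post.
BASE_URL = "https://api.example.invalid"


def build_ingest_url_request(
    token: str, pipeline_id: str, content_url: str
) -> urllib.request.Request:
    """Build (but do not send) a request that submits a URL to a pipeline."""
    # Assumption: inputs are posted under the pipeline as a JSON object.
    body = json.dumps({"url": content_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/pipelines/{pipeline_id}/inputs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def extract_key(response_body: bytes) -> str:
    """Pull the polling key out of a successful submission response."""
    # Assumption: the response is a JSON object carrying the key in a "key" field.
    return json.loads(response_body)["key"]
```

Because ingestion is asynchronous, the key returned here is only a handle for later polling and says nothing yet about whether processing has finished.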
We plan to support transparent ingestion, where the user provides a source from which data is fetched transparently to the user. Example sources include RSS feed URLs, message queues, Twitter streams or live cams. The challenge, however, is that for continuous sources the user must be prepared to act upon the results, or discard them, in a similarly continuous manner. We believe that asynchronous processing of streams can unlock great value and will be increasingly necessary, and we want to prepare our users for this environment.
Upon successful submission of an input URL or file, the user will receive a key as part of the response. This key can be used to poll the system for the results of the processing step. The results are a JSON serialization of the knowledge extracted from the input. A few examples can be found later in this post. The results include the input source as a secondary key and are retained for 7 calendar days before the system discards them.
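Since results are retrieved by polling with the key, a client will typically want a small retry loop. The sketch below is one way to write it, not part of the API itself: `fetch` is a hypothetical caller-supplied function that returns the parsed JSON results once they are ready and `None` while processing is still in progress, and the interval and attempt limits are arbitrary defaults.

```python
import time
from typing import Callable, Optional


def poll_for_results(
    fetch: Callable[[str], Optional[dict]],
    key: str,
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(key) until it returns results or the attempts run out."""
    for attempt in range(max_attempts):
        result = fetch(key)
        if result is not None:
            return result
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"no results for key {key!r} after {max_attempts} attempts")
```

Given the 7-day retention window, it is worth persisting the results on the client side once a poll succeeds rather than relying on fetching them again later.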
A live demo of what can be built using the API can be found on our Cognitio website.