Voysis API Overview

The Voysis API provides per-provider, REST-based access to your custom Voysis conversational service.

Flexible Deployment

Run your Voysis-powered natural language voice system on embedded hardware, on premises, or as a Voysis-hosted service. Each hosted service receives a customer-specific URL, such as yourcompany.voysis.com, for easy access.

Easy-to-use integrations

You can integrate Voysis with your applications through an easy-to-use REST API or, as we recommend, through one of our SDKs. Voysis currently provides SDKs for iOS and Android.

Note: The Voysis API is customised for each customer integration. As such, the specific structure of the ‘entities’ and ‘context’ parameters used below is left generic, as it depends on the customer's use cases and the scope of the integration.



Authentication

Bearer token authentication is used to authenticate each client API request. Each request must include an Authorization field in the HTTP header.

Authorization: Bearer APP_TOKEN

The API also supports API-key-based authentication. In this case, a header specifying your API key must be added to each request.

x-api-key: 32d5318c-a616-4e6e-9348-570ce77e314f
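
As a minimal sketch of the two schemes above, the following Python helper returns the header for whichever credential is supplied. The function name auth_headers and its parameters are illustrative assumptions, not part of the Voysis API or SDKs.

```python
def auth_headers(app_token=None, api_key=None):
    """Return the authentication header for one of the two documented schemes.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if app_token is not None:
        # Token scheme: "Authorization: Bearer APP_TOKEN"
        return {"Authorization": f"Bearer {app_token}"}
    if api_key is not None:
        # API key scheme: "x-api-key: <key>"
        return {"x-api-key": api_key}
    raise ValueError("either app_token or api_key is required")
```

The returned dictionary can be merged into the other request headers before a call is made.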

Resources

The Voysis API exposes the following endpoints.

/conversations: Provides support for running voice-based searches

/tts: Provides text-to-speech functionality


HTTP Headers

Content Type

The Content-Type header must be: Content-Type: application/json

User-Agent

Each request should contain the User-Agent HTTP header to identify the specific application, and its version, making the request.

Accept

A client should specify the API version and schema version it is using in each call. If they are omitted, the API defaults to the latest versions.

Accept Header

Accept: application/json; version=1, schemaVersion=beerApp-123456
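
The headers described in this section could be assembled as in the sketch below. The helper name common_headers is an assumption made for illustration; the header values follow the samples in this document.

```python
def common_headers(user_agent, version=None, schema_version=None):
    """Build the Content-Type, Accept and User-Agent headers for a request.

    Hypothetical helper. When version or schema_version is omitted, the
    parameter is left out of the Accept header, in which case the API is
    documented to fall back to the latest version.
    """
    accept = "application/json"
    params = []
    if version is not None:
        params.append(f"version={version}")
    if schema_version is not None:
        params.append(f"schemaVersion={schema_version}")
    if params:
        # Matches the documented form: "application/json; version=1, schemaVersion=..."
        accept += "; " + ", ".join(params)
    return {
        "Content-Type": "application/json",
        "Accept": accept,
        "User-Agent": user_agent,
    }
```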

Rate Limits

Rate limits can be applied per user or per application. When application-level authentication is used, the limits apply globally to all users of the application: the application is allowed a certain number of API calls per endpoint in every 15-minute window.

When user-level authentication is used, rate limits can also be applied to each individual user. As with application-level limits, each user is limited to a certain number of API requests per 15-minute window.

These limits are applied per endpoint, so the rate limit in force depends on the endpoint to which the request was sent. The headers of each API response describe the current rate-limit state for that endpoint:

  • X-Rate-Limit-Limit: the rate limit for the endpoint to which the request was sent
  • X-Rate-Limit-Remaining: the number of remaining requests allowed in the current window
  • X-Rate-Limit-Reset: the time remaining in the current window before the rate limit resets, in UTC epoch seconds

When the limit is reached the server will return a 429 (Too Many Requests) status code. The response body will contain information on the allowed rate limits.
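
A client might use these headers to decide how long to pause before retrying. The sketch below is one possible approach; the function name is hypothetical, and it assumes that X-Rate-Limit-Reset holds an absolute UTC epoch timestamp (the description above could also be read as a duration, so check this against your integration).

```python
def seconds_until_reset(headers, now):
    """Return how long to wait before sending the next request.

    Hypothetical helper. Assumes X-Rate-Limit-Reset is an absolute UTC
    epoch timestamp in seconds; `now` is the current epoch time.
    """
    # If the header is missing, assume the request was not rate limited.
    remaining = int(headers.get("X-Rate-Limit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers["X-Rate-Limit-Reset"])
    # Never return a negative wait if the window has already reset.
    return max(0.0, reset_at - now)
```

A caller would typically pass time.time() as now and sleep for the returned number of seconds after receiving a 429 response.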


Conversations

An API client executes a conversation with Voysis using the following high-level flow:

  1. Initiate a new conversation.
  2. Create one or more queries within that conversation.
    • A query may be either text-based or audio-based.

Initiate a New Conversation

To initiate a new conversation, the API client must POST a conversation entity to the /conversations endpoint. The structure of a conversation entity is defined by:

Name | Required | Type | Description
lang | Yes | string | Language code of the conversation, in BCP 47 format

An example call to initiate a new conversation:

Initiate Conversation

        POST /conversations HTTP/1.1
        Authorization: Bearer APP_TOKEN
        Content-Type: application/json
        Accept: application/json; version=1, schemaVersion=beerApp-123456
        User-Agent: myBeerApp

        {
          "lang": "en-US"
        }
      

A sample response to creating a new conversation:

Initiate Conversation Response

        Content-Type: application/json; version=1, schemaVersion=beerApp-123456

        {
          "id": 671,
          "lang": "en-US",
          "_links": {
            "_self": "https://beercity.voysis.com/conversations/671",
            "queries": "https://beercity.voysis.com/conversations/671/queries"
          }
        }
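
For illustration, the request above could be prepared in Python with the standard library as follows. This is a sketch, not an official client: the function name is hypothetical, and the base URL, token, schema version and User-Agent are the placeholder values used in this document's samples.

```python
import json
import urllib.request


def build_conversation_request(base_url, app_token, lang="en-US"):
    """Prepare (but do not send) a POST /conversations request.

    Hypothetical helper; header values mirror the sample request.
    """
    body = json.dumps({"lang": lang}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/conversations",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {app_token}",
            "Content-Type": "application/json",
            "Accept": "application/json; version=1, schemaVersion=beerApp-123456",
            "User-Agent": "myBeerApp",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; the JSON body of the response can then be read to obtain the conversation id and the queries link.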
      

Create a Query

Once a conversation is created, you can create one or more queries in that conversation. Queries can be text-based or audio-based, and a conversation may contain a mixture of both types of query.

The structure of a query entity is defined by:

Name | Required | Type | Description
queryType | Yes | string | The type of the query: either "text" or "audio".
query | Yes* | object | Defines the text query to execute. *Required only if the queryType is "text".
context | No | object | A collection of key-value pairs that represents the application state. This is specific to each application.
intent | No | string | A value indicating the intent of the query. Examples are "search" and "addToCart". The caller should not populate this parameter when creating the query; Voysis returns it.
reply | No | object | Entity containing optional reply audio to be played to the API user, or text that may be displayed. The caller should not populate this parameter when creating the query; Voysis returns it.
entities | No | any | An integration-specific return value. The caller should not populate this parameter when creating the query; Voysis returns it.

The structure of the "query" entity is defined by:

Name | Required | Type | Description
text | Yes | string | Text of the user query
userExpressions | No | array of strings | Array of user expressions representing the previous user queries

The structure of the "reply" entity is defined by:

Name | Required | Type | Description
text | No | string | Natural language string that may be displayed to the end user
audio | No | object | Object containing audio data that may be played to the end user

The structure of the "audio" entity is defined by:

Name | Required | Type | Description
audioData | Yes, if audioDataUrl is not present | string | Base64-encoded binary audio data
audioDataUrl | Yes, if audioData is not present | string | A URL from which the binary audio data may be retrieved
mimeType | Yes | string | The MIME type of the audio data
chunkNumber | No | integer | The number of this chunk of audio data. Only relevant when audioData is present and the audio has been chunked. If audioData is present and chunkNumber is not, audioData represents an unchunked piece of audio.
totalChunks | No | integer | The total number of chunks the audio data has been split into
chunksReceived | No | integer | The total number of chunks of audio data Voysis has received

Sample Text Query

        POST /conversations/671/queries HTTP/1.1
        Authorization: Bearer APP_TOKEN
        Content-Type: application/json; charset=utf-8
        Accept: application/json; version=1, schemaVersion=beerApp-123456
        User-Agent: myBeerApp

        {
          "queryType": "text",
          "query": {
            "text": "What are the top selling beers?",
            "userExpressions": []
          }
        }
      

Sample Response

        Content-Type: application/json; charset=utf-8

        {
          "id": 1845,
          "_links": {
            "_self": "https://beercity.voysis.com/conversations/671/queries/1845",
            "conversation": "https://beercity.voysis.com/conversations/671"
          },
          "queryType": "text",
          "query": {
            "text": "What are the top selling beers?",
            "userExpressions": [
              "top", "selling", "beers"
            ]
          },
          "reply": {
            "text": "Here are the top 5 selling beers",
            "audio": {
              "audioDataUrl": "https://beercity.voysis.com/audio/489fu3948fu9834uf983.wav",
              "mimeType": "audio/wav"
            }
          },
          "entities": {
            "1": {
              "Name": "guinness_can",
              "Value": "296244"
            },
            "2": {
              "Name": "snow_bottle",
              "Value": "2164386"
            },
            "3": {
              "Name": "tsingtao_bottle",
              "Value": "292343"
            },
            "4": {
              "Name": "budlight_bottle",
              "Value": "296244"
            },
            "5": {
              "Name": "skol_bottle",
              "Value": "789105"
            }
          }
        }
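
A client could read such a response along the following lines. This is only a sketch: the function name is hypothetical, and the handling of "entities" assumes the numbered Name/Value shape of this sample, which is integration-specific and may differ in your deployment.

```python
import json


def summarize_query_response(raw_json):
    """Return the reply text and the entity names from a query response.

    Hypothetical helper. Assumes "entities" maps numeric string keys to
    objects with a "Name" field, as in the sample response.
    """
    response = json.loads(raw_json)
    reply_text = response.get("reply", {}).get("text")
    entities = response.get("entities", {})
    # Sort numerically so that "10" comes after "2".
    names = [entities[key]["Name"] for key in sorted(entities, key=int)]
    return reply_text, names
```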
    

Create an Audio Query

An audio query is created in a similar manner to the text query. The query is created, and then chunked audio data are posted to the API. Once all audio chunks are received and processed, the query is executed and the response may be retrieved by a subsequent GET request to the query entity's URL.

The query response will contain the transcribed text representation of the user's query as well as any integration-specific results.

API clients transport the audio data to Voysis in chunks. A client may use chunked HTTP/1.1 transfer encoding, or split the audio data into chunks itself and provide them in multiple calls to Voysis.

Sample Request to Initialise an Audio Query

      POST /conversations/671/queries HTTP/1.1
      Authorization: Bearer APP_TOKEN
      Content-Type: application/json; charset=utf-8
      Accept: application/json; version=1, schemaVersion=beerApp-123456
      User-Agent: myBeerApp

      {
        "queryType": "audio",
        "audio": {
          "mimeType": "audio/wav"
        }
      }
    

Sample Response

      Content-Type: application/json; charset=utf-8

      {
        "id": 1846,
        "_links": {
          "_self": "https://beercity.voysis.com/conversations/671/queries/1846",
          "audio": "https://beercity.voysis.com/conversations/671/queries/1846/audio",
          "conversation": "https://beercity.voysis.com/conversations/671"
        },
        "queryType": "audio",
        "audio": {
          "mimeType": "audio/wav",
          "chunksReceived": 0
        }
      }
    

Post the Audio Data

      POST /conversations/671/queries/1846/audio HTTP/1.1
      Authorization: Bearer APP_TOKEN
      Content-Type: application/json; charset=utf-8
      Accept: application/json; version=1, schemaVersion=beerApp-123456
      User-Agent: myBeerApp

      {
        "audioData": "...",
        "chunkNumber": 3
      }

      POST /conversations/671/queries/1846/audio HTTP/1.1
      Authorization: Bearer APP_TOKEN
      Content-Type: application/json; charset=utf-8
      Accept: application/json; version=1, schemaVersion=beerApp-123456
      User-Agent: myBeerApp

      {
        "audioData": "...",
        "chunkNumber": 5,
        "totalChunks": 5
      }
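
When the client splits the audio itself, the request bodies could be produced as in this sketch. The function name and the chunk size are illustrative assumptions; the sketch also assumes that chunks are numbered from 1 and that totalChunks is sent only with the final chunk, as the sample requests suggest.

```python
import base64


def audio_chunk_bodies(audio_bytes, chunk_size):
    """Split raw audio into JSON-ready bodies for the /audio endpoint.

    Hypothetical helper. Assumes 1-based chunk numbering and that
    totalChunks accompanies only the last chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    pieces = [
        audio_bytes[start:start + chunk_size]
        for start in range(0, len(audio_bytes), chunk_size)
    ]
    bodies = []
    for number, piece in enumerate(pieces, start=1):
        body = {
            # audioData is documented as Base64-encoded binary audio.
            "audioData": base64.b64encode(piece).decode("ascii"),
            "chunkNumber": number,
        }
        if number == len(pieces):
            body["totalChunks"] = len(pieces)
        bodies.append(body)
    return bodies
```

Each returned dictionary can be serialised with json.dumps and posted to the query's audio URL in turn.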
    

Audio Data Response

      Content-Type: application/json; charset=utf-8

      {
        "id": 7829,
        "_links": {
          "_self": "https://beercity.voysis.com/conversations/671/queries/1846/audio/7829",
          "query": "https://beercity.voysis.com/conversations/671/queries/1846",
          "conversation": "https://beercity.voysis.com/conversations/671"
        },
        "chunkNumber": 3,
        "chunksReceived": 4
      }
    
Retrieving the Query's Results

Once all audio data has been supplied, the API client should GET the query resource. This will return the results of the query once all processing is complete. The result object is identical to the object returned by a text query.

Retrieve Query Results

      GET /conversations/671/queries/1846 HTTP/1.1
      Authorization: Bearer APP_TOKEN
      Content-Type: application/json; charset=utf-8
      Accept: application/json; version=1, schemaVersion=beerApp-123456
      User-Agent: myBeerApp
    

Pagination

In many cases, running a query returns a large set of results, and the API paginates them. Whether or not the results are paginated, an HTTP header called X-Total-Count always contains the total number of results available.

The per_page URL parameter allows the client to specify the number of items that should be returned in each page. In our case, this is the number of items returned in the entities field of the response. If the per_page parameter is not provided in a request, the API defaults to 50 items per page. A maximum of 100 items per page is allowed.

The page parameter specifies the page number to be returned. If page is not specified, the first page is returned by default.

The Voysis API allows discoverability for REST pagination. The Link header is used in line with RFC 5988, and uses the link relation types next, prev, first and last. This means that for paginated results an HTTP Link header will contain absolute URLs.

For example, a paginated response would be of the form:

        Content-Type: application/json; charset=utf-8
        X-Total-Count: 3829

        {
          "id": 1845,
          "_links": {
            "_self": "https://beercity.voysis.com/conversations/671/queries/1845",
            "pageNext": "https://beercity.voysis.com/conversations/671/queries/1845?per_page=50&page=2",
            "pageFirst": "https://beercity.voysis.com/conversations/671/queries/1845?per_page=50&page=1",
            "pageLast": "https://beercity.voysis.com/conversations/671/queries/1845?per_page=50&page=77",
            "conversation": "https://beercity.voysis.com/conversations/671"
          },
          "queryType": "audio",
          "query": {
            ...
          },
          "reply": {
            ...
          },
          "entities": {
            ...
          }
        }
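
The relationship between X-Total-Count, per_page and page can be sketched as follows. Both function names are hypothetical; the default of 50 and the maximum of 100 items per page come from the description above, and pages are assumed to be numbered from 1.

```python
import math
from urllib.parse import urlencode


def page_count(total_count, per_page=50):
    """Number of pages needed for total_count items (X-Total-Count)."""
    # The API documents a maximum of 100 items per page.
    per_page = max(1, min(per_page, 100))
    return math.ceil(total_count / per_page)


def page_url(query_url, page, per_page=50):
    """Build the URL of a given page of a query's results."""
    return f"{query_url}?{urlencode({'per_page': per_page, 'page': page})}"
```

With 3829 results at 50 items per page, page_count gives 77, since the last page holds the remaining 29 items.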
      

Text to Speech

The TTS (Text to Speech) resource converts text to audio.

The endpoint accepts a GET request with the parameters shown in the table below.

GET request /tts parameters

Name | Required | Description
text | Yes | Text to convert to speech (type string)
lang | No | Language code in BCP 47 format (defaults to en-us if not specified)
voice | No | Voice type used to generate the audio (type string) (defaults to Cathy if not specified)
format | No | Format of audio to produce (type string) (defaults to wav if not specified)

The GET request takes the form:

https://api.voysis.com/tts?lang=en_us_lf&voice=cathy&format=wav&text=this+is+voysis
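
A URL of this form can be built with Python's standard library, which also takes care of encoding the text. The function name tts_url is an assumption for illustration; optional parameters that are left out are simply omitted so that the documented defaults apply.

```python
from urllib.parse import urlencode


def tts_url(text, lang=None, voice=None, audio_format=None,
            base_url="https://api.voysis.com"):
    """Build a GET URL for the /tts endpoint (hypothetical helper)."""
    params = {}
    if lang is not None:
        params["lang"] = lang
    if voice is not None:
        params["voice"] = voice
    if audio_format is not None:
        params["format"] = audio_format
    # text is the only required parameter; urlencode escapes spaces as "+".
    params["text"] = text
    return f"{base_url}/tts?{urlencode(params)}"
```

For example, tts_url("this is voysis", lang="en_us_lf", voice="cathy", audio_format="wav") reproduces the URL shown above.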

Response /tts parameters

Name | Description
audio | Contains audio and parameters (type object containing ‘content’, ‘format’ and ‘isLast’)
content | Base64-encoded representation of the spoken audio binary file (type string)
format | Format of audio (type string) (defaults to wav if not specified)
isLast | Indicates whether the audio chunk is the last one (type bool)
text | Text which was converted to speech (type string)
lang | Language code in BCP 47 format (defaults to en-us if not specified)

Each audio response uses chunked transfer encoding, which is indicated by the presence of the Transfer-Encoding: chunked HTTP header in the response.

Audio chunk boundaries are indicated with the delimiter \r\n, and the isLast field indicates the last chunk.

Sample response

Headers:
      X-Voysis-Media-Type: v1
      Content-Type: application/json; charset=utf-8
      Transfer-Encoding: chunked

      {
         "audio":{
            "content":"audioData",
            "format":"wav",
            "isLast":true
         },
         "text":"Here are the top five beers",
         "lang":"en-us",
         "schemaVersion":"beerApp-123456"
      }
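
One way a client might reassemble the audio from such a response is sketched below. The function name is hypothetical, and the sketch assumes that each delimited piece is a complete JSON object of the shape shown in the sample; it tolerates either capitalisation of the audio key and a boolean or string isLast value.

```python
import base64
import json


def decode_tts_stream(raw):
    """Concatenate the Base64 audio from a delimited TTS response body.

    Hypothetical helper. Assumes every piece between the documented
    delimiters is one JSON object with an audio object inside it.
    """
    audio = bytearray()
    for part in raw.split(b"\r\n"):
        if not part.strip():
            continue  # skip empty pieces, e.g. after a trailing delimiter
        chunk = json.loads(part)
        payload = chunk.get("audio") or chunk.get("Audio")
        audio += base64.b64decode(payload["content"])
        if payload["isLast"] in (True, "True", "true"):
            break
    return bytes(audio)
```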

Status and Error Codes

The table below shows the status and error codes that you can expect to receive from the API. Each response contains a status code and an error type.

Status Code | Error | Description
200 | Success | Successful request
400 | Bad Request | Invalid request
401 | Unauthorized | Invalid credentials provided
404 | Not Found | Resource does not exist or invalid URL provided
405 | Not Allowed | HTTP verb not allowed on endpoint
429 | Too Many Requests | Rate limit reached
500 | Internal Server Error | Execution of request failed for some reason

Error responses contain a JSON error object which contains details on the type of error. Here is an example error response.

      Headers:
      Content-Type: application/json; charset=utf-8

      {
          "errors": [
              {
                  "status": "400",
                  "title": "Bad Request",
                  "detail": "JSON request object could not be parsed"
              }
          ]
      }
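
A client could turn such an error body into readable messages as in the sketch below. The function name is hypothetical, and the field names are taken from the example response only.

```python
import json


def error_messages(body):
    """Format each entry of the "errors" array as a one-line message.

    Hypothetical helper; relies on the status, title and detail fields
    shown in the example error response.
    """
    errors = json.loads(body).get("errors", [])
    return [
        f'{error.get("status")} {error.get("title")}: {error.get("detail")}'
        for error in errors
    ]
```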
      

API Integration Options

The Voysis API is customised for each customer integration. As such, the specific structure of the ‘entities’ and ‘context’ parameters used below is left generic, as it depends on the customer's use cases and the scope of the integration.

Voysis Search

There are two primary options for integrating Voysis-enabled search. They differ in the structure of the entities field in each response.

Note that these options apply only to searches, where the ‘intent’ field in the response is ‘search’ or ‘heuristicSearch’. The Voysis Transact feature is described later.

Option 1: Search parameters

In this case, the Voysis API returns a set of search parameters that correspond to the user's natural language query. These parameters can then be used to perform a search on your existing API.

Option 2: Matching entities

This option returns a list of identifiers. In an e-commerce use case, for example, this would be the list of product identifiers that match the user's natural language query. Your existing API can then be used to retrieve the detailed product information for each.

Voysis Transact

With Voysis Transact, in which specific products can be added directly to your cart, the ‘intent’ will be ‘addToCart’ and the ‘entities’ object will contain the information required to populate the user’s cart. This is customised for each customer, but could, for example, contain the list of products and quantities, the delivery address and the requested delivery time.
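
Because the meaning of ‘entities’ depends on the ‘intent’, a client will usually branch on the intent before interpreting the entities. The sketch below shows one way to do that; the function name and the two callbacks are hypothetical, and what each callback does with the entities is left to the integration.

```python
def handle_response(response, on_search, on_add_to_cart):
    """Route a query response to a handler based on its intent.

    Hypothetical helper: on_search and on_add_to_cart are caller-supplied
    functions that receive the integration-specific entities value.
    """
    intent = response.get("intent")
    entities = response.get("entities")
    if intent in ("search", "heuristicSearch"):
        return on_search(entities)
    if intent == "addToCart":
        return on_add_to_cart(entities)
    raise ValueError(f"unhandled intent: {intent!r}")
```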