Introduction

Welcome to the reinfer API. We strive to make the API predictable, easy to use and painless to integrate. If there is anything you feel we can do to improve it or if you encounter any bugs or unexpected behaviour, please email support@reinfer.io and we will get back to you as soon as possible.

All API requests are sent to reinfer as JSON objects to an endpoint over HTTPS.

https://reinfer.io/api/...

Authentication

All API requests require authentication to identify the user making the request. Authentication is provided through an access token. The developer access token can be obtained from your account page.

You need to include the following HTTP header for every API call you make:

Authorization: Bearer $REINFER_TOKEN

where $REINFER_TOKEN is your reinfer API token.

To make it easy to run the bash examples on this page, you should save your token in an environment variable.

export REINFER_TOKEN=...

# Using curl, pass the authentication header with each request
curl "https://reinfer.io/api/..." \
  -H "Authorization: Bearer $REINFER_TOKEN"

For the Python and Node examples, it will be assumed that the token has been stored in a local variable REINFER_TOKEN via your chosen config solution.
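
For Python, one way to attach the header is sketched below using only the standard library. The helper name authorized_request and the choice to read the token from the REINFER_TOKEN environment variable are illustrative assumptions, not part of the API. The request is built but not sent.

```python
import os
import urllib.request

API_BASE = "https://reinfer.io/api/v1"  # base URL taken from the examples on this page


def authorized_request(path, token, method="GET", data=None):
    """Build (but do not send) a request carrying the bearer token header."""
    url = f"{API_BASE}/{path.lstrip('/')}"
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


# Hypothetical config solution: fall back to a placeholder if the variable is unset.
REINFER_TOKEN = os.environ.get("REINFER_TOKEN", "<your-token>")
request = authorized_request("sources", REINFER_TOKEN)
# Passing `request` to urllib.request.urlopen would actually send it.
```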

Errors

We use conventional HTTP response codes to indicate success or failure of an API request. In general, codes in the 2xx range indicate success, codes in the 4xx range indicate an error that resulted from the provided request and codes in the 5xx range indicate a problem with the reinfer platform.

curl -X GET 'https://reinfer.io/api/v1/nonexistent_page' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "error",
  "message": "404 Not Found"
}

Requests that error will also return a body with a status value of error instead of ok.
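
A client can combine the HTTP status code with the status field of the body. The sketch below assumes the body is always JSON, which this page only shows for the 404 case; ReinferApiError and check_response are hypothetical names introduced for illustration.

```python
import json


class ReinferApiError(Exception):
    """Hypothetical exception type for failed API calls."""


def check_response(http_status, body_text):
    """Return the parsed body if the call succeeded, otherwise raise."""
    body = json.loads(body_text)  # assumption: error bodies are JSON as well
    if 200 <= http_status < 300 and body.get("status") == "ok":
        return body
    if 400 <= http_status < 500:
        origin = "request"    # 4xx: the provided request caused the error
    elif 500 <= http_status < 600:
        origin = "platform"   # 5xx: a problem on the reinfer side
    else:
        origin = "unexpected"
    raise ReinferApiError(f"{origin} error {http_status}: {body.get('message')}")
```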

Getting started

This is a tutorial-style introduction to the API; jump straight to the reference if you're feeling lucky.

All data, individual pieces of which are called verbatims, are grouped into sources. A source should correspond to the origin of the data, like a single mailbox, or a particular feedback channel. These can be combined for the purposes of a single inference model, so it's better to err on the side of multiple different sources than a single monolith if you're in any doubt.

A dataset is a combination of sources together with the associated label categories. For instance one dataset may be built on a website feedback source, with labels like Ease of Use or Available Information, while a different dataset could base itself on various post-purchase survey response sources and apply completely different labels about Packaging or Speed of Delivery.

So before adding any comments, you need to create a source to put them in.

Create a source example

curl -X PUT 'https://reinfer.io/api/v1/sources/<organisation>/example' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"source": {
          "title": "An Example Source",
          "description": "An optional long form description."}}'

Response

{
  "status": "ok",
  "source": {
    "id": "22f0f76e82fd8867",
    "owner": "<organisation>",
    "name": "example",
    "language": "en",
    "title": "An Example Source",
    "description": "An optional long form description.",
    "should_translate": false,
    "sensitive_properties": [],
    "created_at": "2018-10-16T10:43:56.463000Z",
    "updated_at": "2018-10-16T10:43:56.463000Z",
    "last_modified": "2018-10-16T10:43:56.463000Z"
  }
}

To create a source you need four things:

  1. An organisation. This is an existing organisation you are a part of.
  2. A name. Alphanumeric characters, hyphens and underscores are all OK (e.g. 'post-purchase').
  3. A title. A nice, short human-readable title for your source to display in the UI (e.g. 'Post Purchase Survey Responses').
  4. A description. Optionally, a longer form description of the source to show on the sources overview page.

The first two form the 'fully qualified' name of your source, which is used to refer to it programmatically. The latter two are meant for human consumption in the UI.
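
The four inputs map onto the request as follows. This is a minimal sketch: build_create_source is a hypothetical helper, and the name check only mirrors the character rule stated above rather than any server-side validation. Nothing is sent.

```python
import json
import re

# Alphanumeric characters, hyphens and underscores, as described above.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def build_create_source(organisation, name, title, description=None):
    """Return (method, url, body) for creating a source."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid source name: {name!r}")
    source = {"title": title}
    if description is not None:  # the description is optional
        source["description"] = description
    # Organisation and name together form the fully qualified name in the URL.
    url = f"https://reinfer.io/api/v1/sources/{organisation}/{name}"
    return "PUT", url, json.dumps({"source": source})
```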

Go ahead and create an example source.

You should now be the proud owner of a source! Check out your sources page, then come back.

List sources example

Let's programmatically retrieve the same information available on the sources page with all metadata for all sources. You should see your source.

curl -X GET 'https://reinfer.io/api/v1/sources' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "sources": [
    {
      "id": "22f0f76e82fd8867",
      "owner": "<organisation>",
      "name": "example",
      "language": "en",
      "title": "An Example Source",
      "description": "An optional long form description.",
      "should_translate": false,
      "sensitive_properties": [],
      "created_at": "2018-10-16T10:43:56.463000Z",
      "updated_at": "2018-10-16T10:43:56.463000Z",
      "last_modified": "2018-10-16T10:43:56.463000Z"
    }
  ]
}

curl -X GET 'https://reinfer.io/api/v1/sources/<organisation>' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "sources": [
    {
      "id": "22f0f76e82fd8867",
      "owner": "<organisation>",
      "name": "example",
      "language": "en",
      "title": "An Example Source",
      "description": "An optional long form description.",
      "should_translate": false,
      "sensitive_properties": [],
      "created_at": "2018-10-16T10:43:56.463000Z",
      "updated_at": "2018-10-16T10:43:56.463000Z",
      "last_modified": "2018-10-16T10:43:56.463000Z"
    }
  ]
}

If you only want the sources belonging to a specific organisation you can add its name to the endpoint.

Delete a source example

Deleting a source irretrievably destroys all verbatims and any other information associated with it. Any datasets which use this source will also lose the training data supplied by any labels which have been added to verbatims in this source, so this endpoint should be used with caution. That said, it should be safe to delete the source we created for your organisation in the previous section.

curl -X DELETE 'https://reinfer.io/api/v1/sources/id:22f0f76e82fd8867' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json"

Response

{"status": "ok"}

curl -X GET 'https://reinfer.io/api/v1/sources' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "sources": []
}

The response should be {"status": "ok"}. To be sure it's gone, you can request all sources again.

Add comments example

Sources would be useless without the comments that go in them. A comment in reinfer is either an individual piece of text, or multiple text items that are combined into a conversation. Examples of the former include survey responses, support tickets, and customer reviews, while examples of the latter include email chains.

We will go ahead and add a couple of comments to the 'example' source created in the previous section.

Adding emails

curl -X POST 'https://reinfer.io/api/v1/sources/<organisation>/example/sync' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"comments": [
          {
            "id": "0123456789abcdef",
            "timestamp": "2011-12-11T01:02:03.000000+00:00",
            "messages": [
              {
                "from": "alice@company.com",
                "to": ["bob@organisation.org"],
                "sent_at": "2011-12-11T11:02:03.000000+00:00",
                "body": {
                  "text": "Hi Bob,\n\nCould you send me today'"'"'s figures?\n\nThanks,\nAlice"
                }
              },
              {
                "from": "bob@organisation.org",
                "to": ["alice@company.com"],
                "sent_at": "2011-12-11T11:05:10.000000+00:00",
                "body": {
                  "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob"
                }
              },
              {
                "from": "alice@company.com",
                "to": ["bob@organisation.org"],
                "sent_at": "2011-12-11T11:18:43.000000+00:00",
                "body": {
                  "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice"
                }
              }
            ]
          },
          {
            "id": "abcdef0123456789",
            "timestamp": "2011-12-11T02:03:04.000000+00:00",
            "messages": [
              {
                "from": "bob@organisation.org",
                "to": ["alice@company.com", "carol@company.com"],
                "sent_at": "2011-12-12T10:04:30.000000+00:00",
                "body": {
                  "text": "All,\n\nJust to let you know that processing is running late today.\n\nRegards,\nBob"
                }
              },
              {
                "from": "carol@company.com",
                "to": ["alice@company.com", "bob@organisation.org"],
                "sent_at": "2011-12-12T10:06:22.000000+00:00",
                "body": {
                  "text": "Hi Bob,\n\nCould you estimate when you'"'"'ll be finished?\n\nThanks,\nCarol"
                }
              },
              {
                "from": "bob@organisation.org",
                "to": ["alice@company.com", "carol@company.com"],
                "sent_at": "2011-12-12T10:09:40.000000+00:00",
                "body": {
                  "text": "Carol,\n\nWe should be done by 12pm. Sorry about the delay.\n\nBest,\nBob"
                }
              }
            ],
            "user_properties": {
                "string:Sender Domain": "organisation.org",
                "string:Recipient Domain": "company.com",
                "number:severity": 3
            }
          }
        ]}'

Response

{
  "status": "ok",
  "new": 2,
  "unchanged": 0,
  "updated": 0
}

This example shows how to add a comment that consists of multiple messages. This is most commonly used for adding emails.

The fields used in the requests in the accompanying code should be self-explanatory. The only required fields are id, timestamp, and messages.body.text. You can learn more about available fields in the documentation.

The ID field should be a hexadecimal number, unique amongst comments, of at most 256 digits. It is otherwise left to the user of the API to choose, allowing easier integration with other systems. If your IDs are not hexadecimal, you can convert them. If you want to additionally retain the original IDs, you can put them into the user_properties field that holds arbitrary user-defined metadata.

The timestamp should be in UTC and refer to the time when the comment was recorded (e.g. the survey was responded to), not the current time.
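
One reversible way to convert a non-hexadecimal ID is to hex-encode its UTF-8 bytes, sketched below. The helper names and the "string:Original ID" property name are illustrative assumptions. Note that the encoded ID is twice as long as the UTF-8 original, so it should still be checked against the length limit.

```python
from datetime import datetime, timezone


def to_hex_id(original_id):
    """Encode an arbitrary string ID as lowercase hexadecimal (reversible)."""
    return original_id.encode("utf-8").hex()


def make_comment(original_id, recorded_at, text):
    """Build a single-message comment; recorded_at must be timezone-aware."""
    return {
        "id": to_hex_id(original_id),
        # Normalise to UTC: when the comment was recorded, not the current time.
        "timestamp": recorded_at.astimezone(timezone.utc).isoformat(),
        "messages": [{"body": {"text": text}}],
        # Keep the original ID as user-defined metadata (hypothetical property name).
        "user_properties": {"string:Original ID": original_id},
    }


comment = make_comment(
    "ticket-42",
    datetime(2011, 12, 11, 1, 2, 3, tzinfo=timezone.utc),
    "The delivery arrived two days late.",
)
```

The original ID can be recovered with bytes.fromhex(comment["id"]).decode("utf-8").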

The response should confirm that two new comments have been created.

Adding single-message comments

curl -X POST 'https://reinfer.io/api/v1/sources/<organisation>/example/sync' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"comments": [
          {
            "id": "fedcba098765",
            "timestamp": "2011-12-12T20:00:00.000000+00:00",
            "messages": [
              {
                "language": "fr",
                "body": {
                  "text": "I was impressed with the speed of your service, but the price is quite high.",
                  "translated_from": "J'"'"'ai été impressionné par la rapidité de votre service, mais le prix est assez élevé."
                }
              }
            ]
          }
        ]}'

Response

{
  "status": "ok",
  "new": 1,
  "unchanged": 0,
  "updated": 0
}

This example shows how to add a comment that contains a single message. This format can suit data such as survey responses, customer reviews, etc.

The required and available fields are the same as in the emails example, the only difference being that the messages field should contain a single entry. You can skip email-specific fields that don't fit your data, as they are not required.

The response should confirm that one new comment has been created.

Retrieve comments example

Once added, a comment may be retrieved by its ID. You should see the comment added in the previous section.

curl -X GET 'https://reinfer.io/api/v1/sources/<organisation>/example/comments/0123456789abcdef' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "comment": {
    "uid": "22f0f76e82fd8867.0123456789abcdef",
    "id": "0123456789abcdef",
    "source_id": "22f0f76e82fd8867",
    "timestamp": "2011-12-11T01:02:03Z",
    "user_properties": {},
    "messages": [
      {
        "from": "alice@company.com",
        "to": ["bob@organisation.org"],
        "sent_at": "2011-12-11T11:02:03.000000+00:00",
        "body": {
          "text": "Hi Bob,\n\nCould you send me today's figures?\n\nThanks,\nAlice"
        }
      },
      {
        "from": "bob@organisation.org",
        "to": ["alice@company.com"],
        "sent_at": "2011-12-11T11:05:10.000000+00:00",
        "body": {
          "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob"
        }
      },
      {
        "from": "alice@company.com",
        "to": ["bob@organisation.org"],
        "sent_at": "2011-12-11T11:18:43.000000+00:00",
        "body": {
          "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice"
        }
      }
    ],
    "last_modified": "2018-10-16T10:51:46.247000Z",
    "context": "0"
  }
}

Create a dataset example

Having successfully added some raw data to reinfer, we can now start to add datasets. A dataset corresponds to a taxonomy of labels along with the training data supplied by applying those labels to the verbatims in a series of selected sources. You can create many datasets which refer to the same source(s) without the act of labelling verbatims using the taxonomy of one dataset having any impact on the other datasets (or the underlying sources), allowing different teams to use reinfer to gather insights independently.

curl -X PUT 'https://reinfer.io/api/v1/datasets/<organisation>/my-dataset' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"dataset": {
         "title": "An Example Dataset",
         "description": "An optional long form description.",
         "source_ids": ["22f0f76e82fd8867"]}}'

Response

{
  "status": "ok",
  "dataset": {
    "id": "b2ad67f9dfd2e76b",
    "owner": "<organisation>",
    "name": "my-dataset",
    "title": "An Example Dataset",
    "description": "An optional long form description.",
    "created": "2018-10-16T10:57:44.667000Z",
    "last_modified": "2018-10-16T10:57:44.667000Z",
    "model_family": "english",
    "source_ids": [
      "22f0f76e82fd8867"
    ],
    "has_sentiment": true,
    "limited_access": false
  }
}

Once sources have been created, appropriately-permissioned users can also create datasets in the UI, which may be more convenient.

List datasets example

curl -X GET 'https://reinfer.io/api/v1/datasets/<organisation>/my-dataset' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "dataset": {
    "id": "b2ad67f9dfd2e76b",
    "owner": "<organisation>",
    "name": "my-dataset",
    "title": "An Example Dataset",
    "description": "An optional long form description.",
    "created": "2018-10-16T10:57:44.667000Z",
    "last_modified": "2018-10-16T10:57:44.667000Z",
    "model_family": "random",
    "source_ids": [
      "22f0f76e82fd8867"
    ],
    "has_sentiment": true,
    "limited_access": false
  }
}

Like sources, datasets have several GET routes corresponding to:

  • all the datasets the user has access to;
  • datasets belonging to the specified organisation;
  • a single dataset specified by organisation and name.

We supply an example of the last of these in action.

Update a dataset example

All of the permissible fields used to create a dataset can be updated, with the exception of has_sentiment, which is fixed for a given dataset.

curl -X POST 'https://reinfer.io/api/v1/datasets/<organisation>/my-dataset' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"dataset": {"description": "An updated description."}}'

Response

{
  "status": "ok",
  "dataset": {
    "id": "b2ad67f9dfd2e76b",
    "owner": "<organisation>",
    "name": "my-dataset",
    "title": "An Example Dataset",
    "description": "An updated description.",
    "created": "2018-10-16T10:57:44.667000Z",
    "last_modified": "2018-10-16T10:57:44.667000Z",
    "model_family": "random",
    "source_ids": [
      "22f0f76e82fd8867"
    ],
    "has_sentiment": true,
    "limited_access": false
  }
}

Delete a dataset example

Deleting a dataset will completely remove the associated taxonomy as well as all of the labels which have been applied to its sources. You will no longer be able to get predictions based on this taxonomy and would have to start the training process of labelling verbatims from the beginning in order to reverse this operation, so use it with care.

curl -X DELETE 'https://reinfer.io/api/v1/datasets/<organisation>/my-dataset' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json"

Response

{
  "status": "ok"
}

Get predictions from a pinned model example

curl -X POST 'https://reinfer.io/api/v1/datasets/<organisation>/<dataset>/labellers/<model_version>/predict' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
  "documents": [
    {
      "messages": [
        {
          "body": {
            "text": "Hi Bob, has my trade settled yet? Thanks, Alice"
          },
          "subject": {
            "text": "Trade Ref: 8726387 Settlement"
          },
          "from": "alice@company.com",
          "sent_at": "2011-12-11T11:02:03.000000+00:00",
          "to": [
            "bob@organisation.org"
          ]
        }
      ],
      "user_properties": {
        "number:Deal Value": 12000,
        "string:City": "London"
      }
    },
    {
      "messages": [
        {
          "body": {
            "text": "All, just to let you know that processing is running late today. Regards, Bob"
          },
          "subject": {
            "text": "Trade Processing Delay"
          },
          "from": "bob@organisation.org",
          "sent_at": "2011-12-12T10:04:30.000000+00:00",
          "to": [
            "alice@company.com",
            "carol@company.com"
          ]
        }
      ],
      "user_properties": {
        "number:Deal Value": 4.9,
        "string:City": "Luton"
      }
    }
  ],
  "labels": [
    {
      "name": ["Trade", "Settlement"],
      "threshold": 0.8
    },
    {
      "name": ["Delay"],
      "threshold": 0.75
    }
  ],
  "threshold": 0
}'

Response

{
  "entities": [
    [
      {
        "formatted_value": "2019-01-01 00:00 UTC",
        "kind": "date",
        "span": {
          "content_part": "body",
          "message_index": 0,
          "utf16_byte_end": 120,
          "utf16_byte_start": 94
        }
      },
      {
        "formatted_value": "Bob",
        "kind": "person",
        "span": {
          "content_part": "body",
          "message_index": 0,
          "utf16_byte_end": 12,
          "utf16_byte_start": 6
        }
      }
    ],
    []
  ],
  "model": { "time": "2018-12-20T15:05:43.906000Z", "version": "1" },
  "predictions": [
    [
      {
        "name": ["Trade", "Settlement"],
        "probability": 0.86687008142471313
      }
    ],
    [
      {
        "name": ["Delay"],
        "probability": 0.26687008142471313
      }
    ]
  ],
  "status": "ok"
}

Once you have a trained model, you can now use this model to predict labels against other pieces of data. To do this you simply need to provide the following:

  1. Documents: This is an array of message data that the model will predict labels for. Each message object can only contain one verbatim along with any optional properties. For optimal model performance, the data provided needs to be consistent with the data and format that was labelled on the platform, as the model takes all available data and metadata into consideration. For example, emails should include subject, from/bcc/cc fields, etc. (if these were present in the training data). Additionally, user properties in the training dataset should also be included in the API request body.
  2. Labels: This is an array of the model trained labels that you want the model to predict in the data provided. Additionally, for each label a confidence threshold to filter labels by should be provided. The optimal threshold can be decided based on your precision vs recall trade off. Further information regarding how to choose a threshold can be found in the user guide, under the "Using Validation" section.
  3. Default threshold (optional): This is a default threshold value that will be applied across all labels provided. Please note, if default and per-label thresholds are provided together in a request, the per-label thresholds will override the default threshold. As best practice, default thresholds can be used for testing or exploring data. For optimal results when using predictions for automated decision making, it is highly recommended to use per-label thresholds.

Note: A hierarchical label will be formatted as a list of labels. For instance, the label "Trade > Settlement" will have the format ["Trade", "Settlement"] in the request.

Within the API URL it is important to pass in the following arguments:

  1. Organisation name: This is an existing organisation you are a part of.
  2. Dataset name: This is a dataset the model has been trained on.
  3. Model version: The model version is a number that can be found on the "Models" page for your chosen dataset.
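
Putting the request body and URL arguments together might look like the sketch below. The helper names are hypothetical, and label_path assumes that label names themselves never contain a ">" character. The request is only assembled, not sent.

```python
import json


def label_path(label):
    """Split a hierarchical label such as "Trade > Settlement" into its parts."""
    return [part.strip() for part in label.split(">")]


def build_predict_request(organisation, dataset, model_version, documents,
                          thresholds, default_threshold=None):
    """Return (url, body); thresholds maps a label name to its threshold."""
    url = (
        "https://reinfer.io/api/v1/datasets/"
        f"{organisation}/{dataset}/labellers/{model_version}/predict"
    )
    body = {
        "documents": documents,
        "labels": [
            {"name": label_path(label), "threshold": threshold}
            for label, threshold in thresholds.items()
        ],
    }
    if default_threshold is not None:  # optional default applied across labels
        body["threshold"] = default_threshold
    return url, json.dumps(body)
```

For example, build_predict_request("<organisation>", "<dataset>", 1, documents, {"Trade > Settlement": 0.8, "Delay": 0.75}) produces the same labels array as the curl example.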

Understanding the Response

Because a specific model version is being used, the response to the same request will always return the same results even if the model is being trained further. Once you have validated the results of the new model and would like to submit a request against the new model, you should update the model version in your request. Additionally, you should also update the label thresholds to fit the new model. For every new model you will have to iterate through the steps again.

By default, the response will always provide a list of predicted labels for each verbatim with a confidence greater than the threshold levels provided.

However, the response of a request can vary if entity recognition and sentiments are enabled for your model:

  1. Entities Enabled. The response will also provide a list of entities that have been identified for each verbatim (first response example).
  2. Sentiments Enabled. The response will also provide a sentiment score between -1 (perfectly negative) and 1 (perfectly positive) to every label object classified above the confidence threshold. (second response example)

Response

{
  "model": { "time": "2018-12-20T15:05:43.906000Z", "version": "1" },
  "predictions": [
    [
      {
        "name": ["Trade", "Settlement"],
        "probability": 0.86687008142471313,
        "sentiment": 0.8762539502232571
      }
    ],
    [
      {
        "name": ["Delay"],
        "probability": 0.26687008142471313,
        "sentiment": 0.8762539502232571
      }
    ]
  ],
  "status": "ok"
}
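
Because predictions are returned in the same order as the submitted documents, a response can be unpacked by index. The following sketch uses a hypothetical helper name and treats the sentiment field as optional, since it is only present when sentiments are enabled.

```python
def labels_per_document(response):
    """Return, per document, a list of (label, probability, sentiment) tuples."""
    results = []
    for predictions in response["predictions"]:
        results.append([
            # Join the hierarchical name back into "Parent > Child" form.
            (" > ".join(p["name"]), p["probability"], p.get("sentiment"))
            for p in predictions
        ])
    return results


example = {
    "status": "ok",
    "predictions": [
        [{"name": ["Trade", "Settlement"], "probability": 0.87, "sentiment": 0.88}],
        [],  # no label passed its threshold for the second document
    ],
}
per_document = labels_per_document(example)
```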

API Reference

Summary

All the available API routes are listed below:

Managing sources:

Method  URL                                           Description
GET     /api/v1/sources                               Retrieve metadata about all accessible sources
GET     /api/v1/sources/<organisation>                Retrieve metadata about all sources in an organisation
GET     /api/v1/sources/<organisation>/<source_name>  Retrieve metadata about a source by name
GET     /api/v1/sources/id:<source_id>                Retrieve metadata about a source by id
PUT     /api/v1/sources/<organisation>/<source_name>  Create a source
POST    /api/v1/sources/<organisation>/<source_name>  Update a source
DELETE  /api/v1/sources/id:<source_id>                Delete a source by id

Managing datasets:

Method  URL                                             Description
GET     /api/v1/datasets                                Retrieve metadata about all accessible datasets
GET     /api/v1/datasets/<organisation>                 Retrieve metadata about all datasets in an organisation
GET     /api/v1/datasets/<organisation>/<dataset_name>  Retrieve metadata about a dataset by name
PUT     /api/v1/datasets/<organisation>/<dataset_name>  Create a dataset
POST    /api/v1/datasets/<organisation>/<dataset_name>  Update a dataset
DELETE  /api/v1/datasets/<organisation>/<dataset_name>  Delete a dataset by name

Uploading comments:

Method  URL                                                           Description
POST    /api/v1/sources/<organisation>/<source_name>/sync             Create or update comments
POST    /api/v1/sources/<organisation>/<source_name>/sync-raw-emails  Create or update comments from raw emails

Managing comments:

Method  URL                                                                                          Description
GET     /api/v1/sources/<organisation>/<source_name>/comments/<comment_id>                           Retrieve comment by ID
DELETE  /api/v1/sources/<organisation>/<source_name>/comments?ids=<comment_id0>[,<comment_id1>,...]  Delete comments by ID

Fetching predictions:

Method  URL                                                                                    Description
POST    /api/v1/datasets/<organisation>/<dataset_name>/labellers/<version>/predict             Get predictions
POST    /api/v1/datasets/<organisation>/<dataset_name>/labellers/<version>/predict-comments    Get predictions by comment ID
POST    /api/v1/datasets/<organisation>/<dataset_name>/labellers/<version>/predict-raw-emails  Get predictions for raw emails

Comment structure

A comment in reinfer represents a text item (or sometimes multiple text items that are combined into a conversation), such as an email, a survey response, a support ticket, or a customer review.

A comment consists of an ID, a timestamp, a message (or multiple messages joined into a conversation), and metadata. The API routes accepting or returning comments use the following format:

Name             Type                          Required                           Description
id               string                        see individual route descriptions  Identifies a comment uniquely. Any hexadecimal string of up to 512 characters is valid (conforms to /[0-9a-f]{1,512}/).
timestamp        string                        yes                                An ISO-8601 timestamp indicating when the comment was created. If the timestamp does not specify a timezone, UTC will be assumed. The timestamp must be in the range 01/01/1950 to 31/12/2049 inclusive.
messages         array<Message>                yes                                An array of zero or more messages. Conversations are represented as a chronological series of messages, whilst a single piece of text should be a single-element array.
user_properties  map<string, string | number>  no                                 Any user-defined metadata that applies to the comment. The key of a user property has the format "type:name", e.g. "string:email" or "number:age". The value must be a string or a number depending on the type of the user property.

Where Message has the following format:

Name       Type           Required  Description
body       Content        yes       An object containing the main body text of the message.
subject    Content        no        An object containing the message's subject.
signature  Content        no        An object containing the message's signature.
from       string         no        The message sender.
to         array<string>  no        An array of primary recipients.
cc         array<string>  no        An array of carbon-copy recipients.
bcc        array<string>  no        An array of blind carbon-copy recipients.
sent_at    string         no        An ISO-8601 timestamp indicating when the message was created. If the timestamp does not specify a timezone, UTC will be assumed.
language   string         no        The original language of the message. If this is supplied, both text and translated_from should be supplied for the Content fields.

Where Content has the following format:

Name             Type    Required  Description
text             string  yes       If language (other than the source's language) has been supplied, this should be the translated text of the content. Otherwise, it should be in the original language in which it was collected; it will be translated if not in the source's language and the source has should_translate set to true.
translated_from  string  no        If language (other than the source's language) has been supplied, this should be the original text of the content. Supplying this field without having supplied a language will result in an error.
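
The constraints in the tables above can be checked on the client before uploading. The sketch below is an illustrative pre-flight check under stated assumptions, not the server's validation logic: validate_comment is a hypothetical helper, the date range is evaluated in UTC, and datetime.fromisoformat accepts only a subset of ISO-8601 on older Python versions.

```python
import re
from datetime import datetime, timezone

ID_PATTERN = re.compile(r"[0-9a-f]{1,512}")
PROPERTY_TYPES = {"string": str, "number": (int, float)}


def validate_comment(comment):
    """Return a list of problems found in a comment dict (empty if none)."""
    problems = []
    if not ID_PATTERN.fullmatch(comment.get("id", "")):
        problems.append("id: expected 1 to 512 lowercase hexadecimal characters")
    try:
        stamp = datetime.fromisoformat(comment.get("timestamp", ""))
        if stamp.tzinfo is None:  # no timezone given: UTC is assumed
            stamp = stamp.replace(tzinfo=timezone.utc)
        if not 1950 <= stamp.astimezone(timezone.utc).year <= 2049:
            problems.append("timestamp: outside the range 1950 to 2049")
    except ValueError:
        problems.append("timestamp: not an ISO-8601 string this parser accepts")
    for index, message in enumerate(comment.get("messages", [])):
        if not isinstance(message.get("body", {}).get("text"), str):
            problems.append(f"messages[{index}]: body.text is required")
    for key, value in comment.get("user_properties", {}).items():
        kind, _, name = key.partition(":")  # keys look like "type:name"
        expected = PROPERTY_TYPES.get(kind)
        if expected is None or not name:
            problems.append(f"user_properties[{key!r}]: key must look like type:name")
        elif not isinstance(value, expected):
            problems.append(f"user_properties[{key!r}]: value must be a {kind}")
    return problems
```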

Prediction structure

The API returns predictions in the following format:

Name         Type                      Description
model        Model                     An object containing information about the model version used to make the prediction.
predictions  array<array<Prediction>>  An array containing an array of predicted labels for each comment. Each comment may have zero, one, or multiple predicted labels. The index of each array of predictions will correspond to the index of the comment in the request.
entities     array<array<Entity>>      An array containing an array of extracted entities for each comment. Each comment may have zero, one, or multiple extracted entities. The index of each array of entities will correspond to the index of the comment in the request. Only returned if entities are enabled in the dataset.

Where Model has the following format:

Name     Type       Description
time     timestamp  When the model version was pinned.
version  number     Model version.

Where Prediction has the following format:

Name         Type           Description
name         array<string>  The name of the predicted label, formatted as a list of hierarchical labels. For instance, the label "Parent Label > Child Label" will have the format ["Parent Label", "Child Label"].
probability  number         Confidence score. A number between 0.0 and 1.0.
sentiment    number         Sentiment score. A number between -1.0 and 1.0. Only returned if sentiments are enabled in the dataset.

Where Entity has the following format:

Name             Type    Description
formatted_value  string  Extracted entity value.
kind             string  Extracted entity kind.
span             Span    An object containing location of the entity in the comment.
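
The Span fields are not documented in this section. Going by the field names in the example response (utf16_byte_start and utf16_byte_end), a plausible reading is that they are byte offsets into the UTF-16 encoding of the text, without a byte-order mark. Under that assumption, which should be verified against real responses, the original text of an entity can be recovered like this (entity_text is a hypothetical helper):

```python
def entity_text(text, span):
    """Slice `text` using the span's UTF-16 byte offsets (assumed semantics)."""
    encoded = text.encode("utf-16-le")  # two bytes per code unit, no byte-order mark
    start, end = span["utf16_byte_start"], span["utf16_byte_end"]
    return encoded[start:end].decode("utf-16-le")


body = "Hi Bob, has my trade settled yet? Thanks, Alice"
span = {"content_part": "body", "message_index": 0,
        "utf16_byte_start": 6, "utf16_byte_end": 12}
name = entity_text(body, span)  # "Hi " occupies bytes 0 to 5, "Bob" bytes 6 to 11
```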

Using model predictions

Reinfer provides a number of ways of using a trained model to get predictions. You can iterate through existing comments in a dataset, get predictions for specific comments by ID, or get predictions for arbitrary emails or other pieces of text.

In addition to choosing the method you want to use to get predictions, you will also need to determine the trained model you want to use, the labels you want to predict, and the appropriate confidence thresholds for those labels. The table below provides more detail.

Please consult our knowledge base to learn how to verify that your model is performing well.

Item                                           Description
Dataset name                                   The dataset contains the training data used to create a model, and the label taxonomy describing available labels. A dataset together with a model version uniquely identifies the model you want to query for predictions.
Model version                                  The model version is an integer that points to a particular snapshot of the model. This provides deterministic results for your application even as the model is being continuously improved. In order to use a model version in the API, you need to "pin" it to ensure it will not be deleted when new models are trained for the same dataset. You can do so in the "Models" page of your dataset, where you can also see all available pinned model versions.
List of label names, list of label thresholds  The model will predict 0, 1, or multiple labels for each comment. Each predicted label will have an associated level of confidence. To convert this prediction into a "Yes/No" answer, the confidence level needs to be checked against a threshold. This threshold needs to be picked according to the requirements of your specific use case and will be different for each label. Selecting a threshold should be done by inspecting the validation page and choosing a value that strikes the appropriate balance between false positives and false negatives. If a label's validation scores are not reliable (this is flagged on the Validation page) then the predictions can't be reliably thresholded. In such a case, for the purposes of testing the threshold can be set to an arbitrary value (e.g. 0.1), however such a label should be excluded entirely in a production scenario.

Model Change Management

As the model improves, the model version and confidence thresholds will need to be periodically updated in your application. We recommend that you make them configurable.

The configuration should contain the model version, list of labels in use by your application, and a confidence threshold for each label.
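A minimal sketch of such a configuration and how it might be applied is shown below. It assumes predictions in the shape used by the example responses in this document (objects with name and probability). The names MODEL_CONFIG and apply_thresholds, the label names, and the threshold values are hypothetical and only serve as illustration.

```python
# Hypothetical application configuration: a pinned model version plus
# one confidence threshold per label in use. All values are illustrative.
MODEL_CONFIG = {
    "model_version": 1,
    "label_thresholds": {
        "Some Label": 0.5,
        "Parent Label > Child Label": 0.25,
    },
}


def apply_thresholds(predictions, label_thresholds):
    """Return the names of configured labels whose confidence meets their threshold."""
    matched = []
    for prediction in predictions:
        # Label names are hierarchical lists, e.g. ["Parent Label", "Child Label"].
        name = " > ".join(prediction["name"])
        threshold = label_thresholds.get(name)
        # Labels that are not part of the configuration are ignored entirely.
        if threshold is not None and prediction["probability"] >= threshold:
            matched.append(name)
    return matched


example_predictions = [
    {"name": ["Some Label"], "probability": 0.72},
    {"name": ["Parent Label", "Child Label"], "probability": 0.2},
    {"name": ["Other Label"], "probability": 0.99},
]
print(apply_thresholds(example_predictions, MODEL_CONFIG["label_thresholds"]))
```

Keeping the version and thresholds in one place means a model update only requires a configuration change, not a code change.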

Entity kinds

Most entity kinds are self-explanatory. This section describes some of the more complicated entity kinds.

Monetary Quantity

The Monetary Quantity entity will extract a wide variety of monetary amounts and apply a common formatting. For example, "1M USD", "USD 1000000", and "1,000,000 usd" will all be extracted as 1,000,000.00 USD. Since the extracted value is formatted in a consistent way, you can easily get the currency and the amount by splitting on whitespace.

However, if the currency is ambiguous, the extracted value will retain the ambiguous currency. For example, "$1M" and "$1,000,000" will be extracted as $1,000,000.00 rather than 1,000,000.00 USD, since a "$" sign could refer to a Canadian or Australian dollar as well as a US dollar.
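The following sketch shows one way to split an extracted value into amount and currency. It assumes the two shapes described above: an amount followed by a currency code, or an ambiguous symbol directly followed by the amount. The function name split_monetary_quantity is hypothetical.

```python
import re
from decimal import Decimal


def split_monetary_quantity(formatted_value):
    """Split a formatted Monetary Quantity into (amount, currency).

    For ambiguous values such as "$1,000,000.00" the currency returned
    is the symbol itself ("$"), since no currency code is available.
    """
    parts = formatted_value.split()
    if len(parts) == 2:
        # Unambiguous case: "<amount> <currency code>".
        amount_text, currency = parts
    else:
        # Assumed shape for ambiguous values: a symbol directly followed by the amount.
        match = re.fullmatch(r"(\D+)([\d,.]+)", formatted_value.strip())
        if match is None:
            raise ValueError(f"unrecognised monetary quantity: {formatted_value!r}")
        currency, amount_text = match.groups()
    # Remove thousands separators before converting to an exact decimal.
    return Decimal(amount_text.replace(",", "")), currency


print(split_monetary_quantity("1,000,000.00 USD"))
print(split_monetary_quantity("$1,000,000.00"))
```

Decimal is used instead of float so that monetary amounts are not subject to binary rounding.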

Get all sources

GET /api/v1/sources

Permissions required: View sources
curl -X GET 'https://reinfer.io/api/v1/sources' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "sources": [
    {
      "id": "18ba5ce699f8da1f",
      "owner": "<organisation>",
      "name": "example",
      "title": "An Example Source",
      "description": "An optional long form description.",
      "should_translate": false,
      "sensitive_properties": [],
      "created_at": "2016-02-10T23:13:28.340295+00:00",
      "updated_at": "2016-02-10T23:13:28.340295+00:00",
      "last_modified": "2016-02-10T23:13:28.340295+00:00"
    },
    ...
  ]
}

Get sources by organisation

GET /api/v1/sources/<organisation>

Permissions required: View sources
curl -X GET 'https://reinfer.io/api/v1/sources/<organisation>' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "sources": [
    {
      "id": "18ba5ce699f8da1f",
      "owner": "<organisation>",
      "name": "example",
      "title": "An Example Source",
      "description": "An optional long form description.",
      "should_translate": false,
      "sensitive_properties": [],
      "created_at": "2016-02-10T23:13:28.340295+00:00",
      "updated_at": "2016-02-10T23:13:28.340295+00:00",
      "last_modified": "2016-02-10T23:13:28.340295+00:00"
    },
    ...
  ]
}

Get a source by organisation and name

GET /api/v1/sources/<organisation>/<source_name>

Permissions required: View sources
curl -X GET 'https://reinfer.io/api/v1/sources/<organisation>/example' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "source": {
    "id": "18ba5ce699f8da1f",
    "owner": "<organisation>",
    "name": "example",
    "title": "An Example Source",
    "description": "An optional long form description.",
    "should_translate": false,
    "sensitive_properties": [],
    "created_at": "2016-02-10T23:13:28.340295+00:00",
    "updated_at": "2016-02-10T23:13:28.340295+00:00",
    "last_modified": "2016-02-10T23:13:28.340295+00:00"
  }
}

Get a source by id

GET /api/v1/sources/id:<source_id>

Permissions required: View sources
curl -X GET 'https://reinfer.io/api/v1/sources/id:18ba5ce699f8da1f' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "source": {
    "id": "18ba5ce699f8da1f",
    "owner": "<organisation>",
    "name": "example",
    "language": "en",
    "title": "An Example Source",
    "description": "An optional long form description.",
    "should_translate": false,
    "sensitive_properties": [],
    "created_at": "2016-02-10T23:13:28.340295+00:00",
    "updated_at": "2016-02-10T23:13:28.340295+00:00",
    "last_modified": "2016-02-10T23:13:28.340295+00:00"
  }
}

Create a source

PUT /api/v1/sources/<organisation>/<source_name>

Permissions required: Sources admin
curl -X PUT 'https://reinfer.io/api/v1/sources/<organisation>/example' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"source": {
         "title": "An Example Source",
         "description": "An optional long form description."}}'

Response

{
  "status": "ok",
  "source": {
    "id": "18ba5ce699f8da1f",
    "owner": "<organisation>",
    "name": "example",
    "language": "en",
    "title": "An Example Source",
    "description": "An optional long form description.",
    "should_translate": false,
    "sensitive_properties": [],
    "created_at": "2016-02-10T23:13:28.340295+00:00",
    "updated_at": "2016-02-10T23:13:28.340295+00:00",
    "last_modified": "2016-02-10T23:13:28.340295+00:00"
  }
}

Name                  Type           Required  Description
language              string         no        The primary language of the source. Supported values are en (English) and de (German). Defaults to en.
title                 string         no        One-line human-readable title for the source.
description           string         no        A longer description of the source.
should_translate      boolean        no        Whether verbatims uploaded to this source should be translated into the language where required. Defaults to false.
sensitive_properties  array<string>  no        An array of properties which should be marked as sensitive and hidden from non-privileged users.

Update a source

POST /api/v1/sources/<organisation>/<source_name>

Permissions required: Sources admin
curl -X POST 'https://reinfer.io/api/v1/sources/<organisation>/example' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"source": {"description": "An alternative description."}}'

Response

{
  "status": "ok",
  "source": {
    "id": "18ba5ce699f8da1f",
    "owner": "<organisation>",
    "name": "example",
    "language": "en",
    "title": "An Example Source",
    "description": "An alternative description.",
    "should_translate": false,
    "sensitive_properties": [],
    "created_at": "2016-02-10T23:13:28.340295+00:00",
    "updated_at": "2016-02-11T08:06:14.944290+00:00",
    "last_modified": "2016-02-11T08:06:14.944290+00:00"
  }
}

Name                  Type           Required  Description
title                 string         no        One-line human-readable title for the source.
description           string         no        A longer description of the source.
should_translate      boolean        no        Whether verbatims uploaded to this source should be translated into English where required. Defaults to false.
sensitive_properties  array<string>  no        An array of properties which should be marked as sensitive and hidden from non-privileged users.

Delete a source

DELETE /api/v1/sources/id:<source_id>

Permissions required: Sources admin
curl -X DELETE 'https://reinfer.io/api/v1/sources/id:18ba5ce699f8da1f' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json"

Response

{
  "status": "ok"
}

Add or update comments

POST /api/v1/sources/<organisation>/<source_name>/sync

Permissions required: Edit verbatims
curl -X POST 'https://reinfer.io/api/v1/sources/<organisation>/example/sync' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
        "comments": [
          {
            "id": "0123456789abcdef",
            "timestamp": "2011-12-11T01:02:03.000000+00:00",
            "messages": [
              {
                "from": "alice@company.com",
                "to": ["bob@organisation.org"],
                "sent_at": "2011-12-11T11:02:03.000000+00:00",
                "body": {
                  "text": "Hi Bob,\n\nCould you send me today'"'"'s figures?\n\nThanks,\nAlice"
                }
              },
              {
                "from": "bob@organisation.org",
                "to": ["alice@company.com"],
                "sent_at": "2011-12-11T11:05:10.000000+00:00",
                "body": {
                  "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob"
                }
              },
              {
                "from": "alice@company.com",
                "to": ["bob@organisation.org"],
                "sent_at": "2011-12-11T11:18:43.000000+00:00",
                "body": {
                  "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice"
                }
              }
            ]
          },
          {
            "id": "abcdef0123456789",
            "timestamp": "2011-12-11T02:03:04.000000+00:00",
            "messages": [
              {
                "from": "bob@organisation.org",
                "to": ["alice@company.com", "carol@company.com"],
                "sent_at": "2011-12-12T10:04:30.000000+00:00",
                "body": {
                  "text": "All,\n\nJust to let you know that processing is running late today.\n\nRegards,\nBob"
                }
              },
              {
                "from": "carol@company.com",
                "to": ["alice@company.com", "bob@organisation.org"],
                "sent_at": "2011-12-12T10:06:22.000000+00:00",
                "body": {
                  "text": "Hi Bob,\n\nCould you estimate when you'"'"'ll be finished?\n\nThanks,\nCarol"
                }
              },
              {
                "from": "bob@organisation.org",
                "to": ["alice@company.com", "carol@company.com"],
                "sent_at": "2011-12-12T10:09:40.000000+00:00",
                "body": {
                  "text": "Carol,\n\nWe should be done by 12pm. Sorry about the delay.\n\nBest,\nBob"
                }
              }
            ],
            "user_properties": {
                "string:Sender Domain": "organisation.org",
                "string:Recipient Domain": "company.com",
                "number:severity": 3
            }
          }
        ]
    }'

Response

{
  "status": "ok",
  "new": 2,
  "unchanged": 0,
  "updated": 0
}

A comment's ID uniquely identifies it within a source. If the provided comment ID does not exist in the source, a new comment will be created; otherwise, the existing comment will be updated. When a comment is updated, any assigned labels will still apply, while any assigned entities will be discarded.

Name      Type            Required  Description
comments  array<Comment>  yes       A batch of at most 1024 comments in the format described below. Larger batches are faster (per comment) than smaller ones.

Where Comment has the format described in Comment structure:

Name             Type                          Required
id               string                        yes
timestamp        string                        yes
messages         array<Message>                yes
user_properties  map<string, string | number>  no

Note: For large requests, this endpoint may take longer to respond. You should increase your client timeout.
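To stay within the 1024-comment limit, a larger collection can be split into batches before upload. The sketch below only builds the request bodies; sending them (with a suitably increased client timeout) is left to your HTTP client of choice. The function name batch_comments is hypothetical.

```python
MAX_COMMENTS_PER_REQUEST = 1024  # limit stated for the sync endpoint


def batch_comments(comments, batch_size=MAX_COMMENTS_PER_REQUEST):
    """Yield request bodies, each holding at most batch_size comments."""
    if not 1 <= batch_size <= MAX_COMMENTS_PER_REQUEST:
        raise ValueError("batch_size must be between 1 and 1024")
    for start in range(0, len(comments), batch_size):
        # Each yielded dict is the JSON body for one request to the sync endpoint.
        yield {"comments": comments[start:start + batch_size]}


# Illustrative data only: 2500 minimal placeholder comments.
comments = [{"id": f"{index:016x}"} for index in range(2500)]
sizes = [len(body["comments"]) for body in batch_comments(comments)]
print(sizes)
```

Since larger batches are faster per comment, the default uses the maximum batch size.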

Add or update comments from raw emails

POST /api/v1/sources/<organisation>/<source_name>/sync-raw-emails

Permissions required: Edit verbatims
curl -X POST 'https://reinfer.io/api/v1/sources/org1/collateral/sync-raw-emails' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "documents": [
    {
      "raw_email": {
        "body": {
          "plain": "Hi Bob,\n\nCould you send me today'"'"'s figures?\n\nThanks,\nAlice"
        },
        "headers": {
          "parsed": {
            "Date": "Thu, 09 Jan 2020 16:34:45 +0000",
            "From": "alice@company.com",
            "Message-ID": "abcdef@company.com",
            "Subject": "Figures Request",
            "To": "bob@organisation.org"
          }
        }
      },
      "user_properties": {
        "number:Deal Value": 12000,
        "string:City": "London"
      }
    }
  ],
  "include_comments": true,
  "transform_tag": "name.0.ABCD1234"
}'

Response

{
  "comments": [
    {
      "id": "61626364656640636f6d70616e792e636f6d",
      "messages": [
        {
          "body": {
            "text": "Hi Bob,\n\nCould you send me today's figures?"
          },
          "from": "alice@company.com",
          "sent_at": "2020-01-09T16:34:45Z",
          "signature": {
            "text": "\n\rThanks,\nAlice"
          },
          "subject": {
            "text": "Figures Request"
          },
          "to": [
            "bob@organisation.org"
          ]
        }
      ],
      "source_id": "d8d3e7f7a2e9ac16",
      "timestamp": "2020-01-09T16:34:45Z",
      "uid": "d8d3e7f7a2e9ac16.61626364656640636f6d70616e792e636f6d",
      "user_properties": {
        "number:Deal Value": 12000,
        "number:Participants": 2,
        "number:Position in Thread": 1,
        "number:Recipients": 1,
        "string:City": "London",
        "string:Message ID": "abcdef@company.com",
        "string:Sender": "alice@company.com",
        "string:Sender Domain": "company.com",
        "string:Thread": "abcdef@company.com"
      }
    }
  ],
  "new": 1,
  "status": "ok",
  "unchanged": 0,
  "updated": 0
}

You can upload raw emails directly to the Re:infer API. Re:infer will automatically convert the emails to comments and assign each of them a unique ID.

You can set include_comments to true to include the generated comments in the response.

Name                      Type             Required  Description
transform_tag             string           yes       A tag identifying the email integration sending the data. You should have received this tag during integration configuration setup.
documents                 array<Document>  yes       A batch of at most 4096 documents in the format described below. Larger batches are faster (per document) than smaller ones.
include_comments          boolean          no        If set to true, the comments parsed from the emails will be returned in the response body.
override_user_properties  array<string>    no        User properties supplied in documents will override auto-generated user properties if their names appear in this list. The user property names in this list should be specified without the type prefix, e.g. as My Property and not string:My Property.

Where Document has the following format:

Name             Type                          Required  Description
raw_email        RawEmail                      yes       Email data, in the format described below.
user_properties  map<string, string | number>  no        Any user-defined metadata that applies to the comment. The key of a user property has the format "type:name", e.g. "string:email" or "number:age". The value must be a string or a number depending on the type of the user property.

Note: Some user properties are generated based on the email content. If these conflict with uploaded user properties, the request will fail with 422 Unprocessable Entity.
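A client-side check of the "type:name" key format can catch malformed user properties before a request is sent. The sketch below is an illustration only; the function name validate_user_properties is hypothetical, and the server remains the authority on what is accepted (for example, it cannot detect conflicts with auto-generated properties).

```python
def validate_user_properties(user_properties):
    """Raise ValueError if a key or value does not follow the "type:name" format."""
    expected_types = {"string": str, "number": (int, float)}
    for key, value in user_properties.items():
        type_name, separator, name = key.partition(":")
        if not separator or not name or type_name not in expected_types:
            raise ValueError(f"invalid user property key: {key!r}")
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_types[type_name]):
            raise ValueError(f"value for {key!r} must be a {type_name}")


# Passes silently: keys and values agree with their declared types.
validate_user_properties({"number:Deal Value": 12000, "string:City": "London"})
```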

Where RawEmail has the following format:

Name     Type     Required  Description
headers  Headers  yes       An object containing the headers of the email.
body     Body     yes       An object containing the main body of the email.

Where Headers has the following format:

Name    Type                                 Required  Description
raw     string                               no        One of raw and parsed is required. The raw email headers, given as a single string, with each header on its own line.
parsed  map<string, string | array<string>>  no        One of raw and parsed is required. The parsed email headers, given as an object with string keys and string or array<string> values. Each key represents one email header. Lists of values will be concatenated with , before being set as a single header value.

If you require duplicate header keys, please use raw instead.

Where Body has the following format:

Name   Type    Required  Description
plain  string  no        At least one of plain and html is required. The plaintext content of the email.
html   string  no        At least one of plain and html is required. The HTML content of the email.

Note: For large requests, this endpoint may take longer to respond. You should increase your client timeout.
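The Document, RawEmail, Headers and Body formats above can be assembled with a small helper. This is a sketch under stated assumptions: the function name make_document is hypothetical, and "one of raw and parsed" is read strictly as exactly one, which the API itself may not enforce.

```python
def make_document(*, raw_headers=None, parsed_headers=None,
                  plain=None, html=None, user_properties=None):
    """Build one entry for the "documents" array of a raw-email request."""
    # Assumption: "one of raw and parsed is required" means exactly one.
    if (raw_headers is None) == (parsed_headers is None):
        raise ValueError("provide exactly one of raw_headers and parsed_headers")
    if plain is None and html is None:
        raise ValueError("provide at least one of plain and html")

    if raw_headers is not None:
        headers = {"raw": raw_headers}
    else:
        headers = {"parsed": parsed_headers}

    body = {}
    if plain is not None:
        body["plain"] = plain
    if html is not None:
        body["html"] = html

    document = {"raw_email": {"headers": headers, "body": body}}
    # user_properties is optional, so it is only included when supplied.
    if user_properties:
        document["user_properties"] = user_properties
    return document


document = make_document(
    parsed_headers={"From": "alice@company.com", "To": "bob@organisation.org"},
    plain="Hi Bob,\n\nCould you send me today's figures?\n\nThanks,\nAlice",
    user_properties={"string:City": "London"},
)
print(sorted(document))
```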

Delete comments

DELETE /api/v1/sources/<organisation>/<source_name>/comments?ids=<comment_id0>[,<comment_id1>,...]

Permissions required: Edit verbatims
curl -X DELETE 'https://reinfer.io/api/v1/sources/<organisation>/<source_name>/comments?ids=abcdef0123456789' \
    -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok"
}

Individual comments can be deleted from a source, using the ID provided when the comment was added.

All data associated with this comment will be permanently deleted.
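When deleting several comments at once, the ids are passed as a comma-separated list in the ids query parameter. A sketch of building that URL follows; the function name delete_comments_url is hypothetical, and percent-encoding each id is an assumption made for safety with ids that contain reserved characters.

```python
from urllib.parse import quote

API_BASE = "https://reinfer.io/api/v1"


def delete_comments_url(organisation, source_name, comment_ids):
    """Build the URL for deleting one or more comments from a source."""
    if not comment_ids:
        raise ValueError("at least one comment id is required")
    # Encode each id separately so that the separating commas stay literal.
    ids = ",".join(quote(comment_id, safe="") for comment_id in comment_ids)
    path = "/".join(quote(part, safe="") for part in (organisation, source_name))
    return f"{API_BASE}/sources/{path}/comments?ids={ids}"


url = delete_comments_url("org1", "example", ["abcdef0123456789", "0123456789abcdef"])
print(url)
```

The resulting URL is then used with a DELETE request and the usual Authorization header.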

Get a comment by id

GET /api/v1/sources/<organisation>/<source_name>/comments/<comment_id>

Permissions required: View sources
curl -X GET 'https://reinfer.io/api/v1/sources/<organisation>/example/comments/0123456789abcdef' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "comment": {
    "uid": "18ba5ce699f8da1f.0123456789abcdef",
    "id": "0123456789abcdef",
    "source_id": "18ba5ce699f8da1f",
    "timestamp": "2011-12-11T01:02:03.000000+00:00",
    "messages": [
      {
        "from": "alice@company.com",
        "to": ["bob@organisation.org"],
        "sent_at": "2011-12-11T11:02:03.000000+00:00",
        "body": {
          "text": "Hi Bob,\n\nCould you send me today's figures?\n\nThanks,\nAlice"
        }
      },
      {
        "from": "bob@organisation.org",
        "to": ["alice@company.com"],
        "sent_at": "2011-12-11T11:05:10.000000+00:00",
        "body": {
          "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob"
        }
      },
      {
        "from": "alice@company.com",
        "to": ["bob@organisation.org"],
        "sent_at": "2011-12-11T11:18:43.000000+00:00",
        "body": {
          "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice"
        }
      }
    ],
    "last_modified": "2018-10-15T15:39:51.815000Z",
    "context": "1"
  }
}

Get all datasets

GET /api/v1/datasets

Permissions required: View labels, View sources
curl -X GET 'https://reinfer.io/api/v1/datasets' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "datasets": [
    {
      "id": "18ba5ce699f8da1f",
      "owner": "<organisation>",
      "name": "example",
      "title": "An Example Dataset",
      "description": "An optional long form description.",
      "created": "2018-10-15T15:48:49.603000Z",
      "last_modified": "2018-10-15T15:48:49.603000Z",
      "model_family": "english",
      "source_ids": ["18ba5ce699f8da1f"],
      "has_sentiment": true
    },
    ...
  ]
}

Get datasets by organisation

GET /api/v1/datasets/<organisation>

Permissions required: View labels, View sources
curl -X GET 'https://reinfer.io/api/v1/datasets/<organisation>' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "datasets": [
    {
      "id": "18ba5ce699f8da1f",
      "owner": "<organisation>",
      "name": "example",
      "title": "An Example Dataset",
      "description": "An optional long form description.",
      "created": "2018-10-15T15:48:49.603000Z",
      "last_modified": "2018-10-15T15:48:49.603000Z",
      "model_family": "english",
      "source_ids": ["18ba5ce699f8da1f"],
      "has_sentiment": true
    },
    ...
  ]
}

Get a dataset by name

GET /api/v1/datasets/<organisation>/<dataset_name>

Permissions required: View labels, View sources
curl -X GET 'https://reinfer.io/api/v1/datasets/<organisation>/example' \
     -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "dataset": {
    "id": "18ba5ce699f8da1f",
    "owner": "<organisation>",
    "name": "example",
    "title": "An Example Dataset",
    "description": "An optional long form description.",
    "created": "2018-10-15T15:48:49.603000Z",
    "last_modified": "2018-10-15T15:48:49.603000Z",
    "model_family": "english",
    "source_ids": ["18ba5ce699f8da1f"],
    "has_sentiment": true
  }
}

Create a dataset

PUT /api/v1/datasets/<organisation>/<dataset>

Permissions required: Datasets admin, View sources
curl -X PUT 'https://reinfer.io/api/v1/datasets/<organisation>/example' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"dataset": {
         "title": "An Example Dataset",
         "description": "An optional long form description.",
         "source_ids": ["18ba5ce699f8da1f"],
         "model_family": "english"}}'

Response

{
  "status": "ok",
  "dataset": {
    "id": "b9a1fd75f6133bce",
    "owner": "<organisation>",
    "name": "example",
    "title": "An Example Dataset",
    "description": "An optional long form description.",
    "created": "2018-10-15T15:48:49.603000Z",
    "last_modified": "2018-10-15T15:48:49.603000Z",
    "model_family": "english",
    "source_ids": ["18ba5ce699f8da1f"],
    "has_sentiment": true
  }
}

Name              Type           Required  Description
title             string         no        One-line human-readable title for the dataset.
description       string         no        A longer description of the dataset.
source_ids        array<string>  no        An array of source ids to be included in this dataset.
entity_kinds      array<string>  no        An array of entity kinds to be extracted in this dataset.
model_family      string         no        Dataset model family, can be english or german. Defaults to english.
has_sentiment     boolean        no        Whether labels in the dataset should be applied with sentiment. Defaults to true.
copy_labels_from  string         no        The id of a dataset to copy labels from. Only labels for comments in common sources will be copied.

Update a dataset

POST /api/v1/datasets/<organisation>/<dataset>

Permissions required: Datasets admin, View sources
curl -X POST 'https://reinfer.io/api/v1/datasets/<organisation>/example' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{"dataset": {"title": "An Alternative Title"}}'

Response

{
  "status": "ok",
  "dataset": {
    "id": "b9a1fd75f6133bce",
    "owner": "<organisation>",
    "name": "example",
    "title": "An Alternative Title",
    "description": "An optional long form description.",
    "created": "2018-10-15T15:48:49.603000Z",
    "last_modified": "2018-10-15T15:53:08.479000Z",
    "model_family": "english",
    "source_ids": ["18ba5ce699f8da1f"],
    "has_sentiment": true
  }
}

Name          Type           Required  Description
title         string         no        One-line human-readable title for the dataset.
description   string         no        A longer description of the dataset.
source_ids    array<string>  no        An array of source ids to be included in this dataset.
entity_kinds  array<string>  no        An array of entity kinds to be extracted from this dataset.

Delete a dataset

DELETE /api/v1/datasets/<organisation>/<dataset_name>

Permissions required: Datasets admin, View sources
curl -X DELETE 'https://reinfer.io/api/v1/datasets/<organisation>/example' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json"

Response

{
  "status": "ok"
}

Get predictions for a pinned model

POST /api/v1/datasets/<organisation>/<dataset_name>/labellers/<version>/predict

Permissions required: View labels, View sources
curl -X POST 'https://reinfer.io/api/v1/datasets/org1/collateral/labellers/1/predict' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
  "documents": [
    {
      "messages": [
        {
          "body": {
            "text": "Hi Bob,\n\nCould you send me today'"'"'s figures?\n\nThanks,\nAlice"
          },
          "from": "alice@company.com",
          "sent_at": "2011-12-11T11:02:03.000000+00:00",
          "to": [
            "bob@organisation.org"
          ]
        },
        {
          "body": {
            "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob"
          },
          "from": "bob@organisation.org",
          "sent_at": "2011-12-11T11:05:10.000000+00:00",
          "to": [
            "alice@company.com"
          ]
        },
        {
          "body": {
            "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice"
          },
          "from": "alice@company.com",
          "sent_at": "2011-12-11T11:18:43.000000+00:00",
          "to": [
            "bob@organisation.org"
          ]
        }
      ],
      "timestamp": "2013-09-12T20:01:20.000000+00:00",
      "user_properties": {
        "number:Deal Value": 12000,
        "string:City": "London"
      }
    },
    {
      "messages": [
        {
          "body": {
            "text": "All,\n\nJust to let you know that processing is running late today.\n\nRegards,\nBob"
          },
          "from": "bob@organisation.org",
          "sent_at": "2011-12-12T10:04:30.000000+00:00",
          "to": [
            "alice@company.com",
            "carol@company.com"
          ]
        },
        {
          "body": {
            "text": "Hi Bob,\n\nCould you estimate when you'"'"'ll be finished?\n\nThanks,\nCarol"
          },
          "from": "carol@company.com",
          "sent_at": "2011-12-12T10:06:22.000000+00:00",
          "to": [
            "alice@company.com",
            "bob@organisation.org"
          ]
        },
        {
          "body": {
            "text": "Carol,\n\nWe should be done by 12pm. Sorry about the delay.\n\nBest,\nBob"
          },
          "from": "bob@organisation.org",
          "sent_at": "2011-12-12T10:09:40.000000+00:00",
          "to": [
            "alice@company.com",
            "carol@company.com"
          ]
        }
      ],
      "timestamp": "2013-09-13T18:03:56.000000+00:00",
      "user_properties": {
        "number:Deal Value": 4.9,
        "string:City": "Luton"
      }
    }
  ],
  "threshold": 0.25
}'

Response

{
  "entities": [
    [
      {
        "formatted_value": "2019-01-01 00:00 UTC",
        "kind": "date",
        "span": {
          "content_part": "body",
          "message_index": 0,
          "utf16_byte_end": 120,
          "utf16_byte_start": 94
        }
      },
      {
        "formatted_value": "Bob",
        "kind": "person",
        "span": {
          "content_part": "body",
          "message_index": 0,
          "utf16_byte_end": 12,
          "utf16_byte_start": 6
        }
      }
    ],
    []
  ],
  "model": { "time": "2018-12-20T15:05:43.906000Z", "version": "1" },
  "predictions": [
    [
      {
        "name": ["Some Label"],
        "probability": 0.7195301055908203,
        "sentiment": 0.9676201351246866
      },
      {
        "name": ["Parent Label", "Child Label"],
        "probability": 0.26687008142471313,
        "sentiment": 0.8762539502232571
      }
    ],
    [
      {
        "name": ["Some Label"],
        "probability": 0.26687008142471313,
        "sentiment": 0.8762539502232571
      }
    ]
  ],
  "status": "ok"
}

Name       Type             Required  Description
documents  array<Document>  yes       A batch of at most 4096 documents in the format described below. Larger batches are faster (per document) than smaller ones.
threshold  number           no        The confidence threshold to filter the label results by. A number between 0.0 and 1.0. A value of 0.0 will include all results.
labels     array<Label>     no        A list of requested labels to be returned, optionally with label-specific thresholds.

Where Document has the format described in Comment structure:

Name             Type                          Required
timestamp        string                        yes
messages         array<Message>                yes
user_properties  map<string, string | number>  no

Where Label has the following format:

Name       Type           Required  Description
name       array<string>  yes       The name of the label to be returned, formatted as a list of hierarchical labels. For instance, the label "Parent Label > Child Label" will have the format ["Parent Label", "Child Label"].
threshold  number         no        The confidence threshold to use for the label. If not specified, will default to the threshold specified at the top-level.

The response will have the format described in Prediction structure.
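As an illustration of the request format, the sketch below builds a predict request body from a mapping of label names to optional thresholds. The function name build_predict_request is hypothetical, and splitting on ">" assumes that no individual label name contains that character.

```python
def build_predict_request(documents, label_thresholds, default_threshold=None):
    """Build the JSON body for a predict request with per-label thresholds."""
    labels = []
    for label_name, threshold in label_thresholds.items():
        # "Parent Label > Child Label" becomes ["Parent Label", "Child Label"].
        label = {"name": [part.strip() for part in label_name.split(">")]}
        # Labels without their own threshold fall back to the top-level one.
        if threshold is not None:
            label["threshold"] = threshold
        labels.append(label)

    body = {"documents": documents, "labels": labels}
    if default_threshold is not None:
        body["threshold"] = default_threshold
    return body


request_body = build_predict_request(
    documents=[],  # fill with Document objects as described above
    label_thresholds={"Parent Label > Child Label": 0.8, "Some Label": None},
    default_threshold=0.25,
)
print(request_body["labels"])
```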

Get predictions for a pinned model by comment id

POST /api/v1/datasets/<organisation>/<dataset_name>/labellers/<version>/predict-comments

Permissions required: View labels, View sources
curl -X POST 'https://reinfer.io/api/v1/datasets/<organisation>/example/labellers/0/predict-comments' \
     -H "Authorization: Bearer $REINFER_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
        "uids": ["18ba5ce699f8da1f.0001", "18ba5ce699f8da1f.0002", "b84d8e2641f36bf5.abc001"],
        "threshold": 0.25
     }'

Response

{
  "status": "ok",
  "model": {
    "version": "0",
    "time": "2018-12-20T15:05:43.906000Z"
  },
  "predictions": [
    {
      "uid": "18ba5ce699f8da1f.0001",
      "labels": [
        {
          "name": ["Some Label"],
          "probability": 0.7195301055908203,
          "sentiment": 0.9676201351246866
        },
        {
          "name": ["Parent Label", "Child Label"],
          "probability": 0.26687008142471313,
          "sentiment": 0.8762539502232571
        }
      ],
      "entities": [
        {
          "kind": "date",
          "span": {
            "content_part": "body",
            "message_index": 0,
            "utf16_byte_end": 120,
            "utf16_byte_start": 94
          },
          "formatted_value": "2019-01-01 00:00 UTC"
        },
        {
          "kind": "person",
          "span": {
            "content_part": "subject",
            "message_index": 0,
            "utf16_byte_end": 10,
            "utf16_byte_start": 0
          },
          "formatted_value": "Kevin"
        }
      ]
    },
    {
      "uid": "18ba5ce699f8da1f.0002",
      "labels": [
        {
          "name": ["Some Label"],
          "probability": 0.26687008142471313,
          "sentiment": 0.8762539502232571
        }
      ],
      "entities": []
    },
    {
      "uid": "b84d8e2641f36bf5.abc001",
      "labels": [
        {
          "name": ["Other Label"],
          "probability": 0.26687008142471313,
          "sentiment": 0.8762539502232571
        }
      ],
      "entities": [],
      "translations": [
        {
          "body": "Sometimes open web speed is too slow",
          "subject": "Could improve web speed",
          "language": "zh-CH"
        }
      ]
    }
  ]
}

Name       Type           Required  Description
uids       array<string>  yes       A list of at most 4096 uids, each combining a source id and a comment id in the format source_id.comment_id. Sources don't need to belong to the current dataset, so you can request predictions for comments from a source in a different (or no) dataset. Larger lists are faster (per comment) than smaller ones.
threshold  number         no        The confidence threshold to filter the label results by. A number between 0.0 and 1.0. A value of 0.0 will include all results.
labels     array<Label>   no        A list of requested labels to be returned, optionally with label-specific thresholds.

Where Label has the following format:

Name       Type           Required  Description
name       array<string>  yes       The name of the label to be returned, formatted as a list of hierarchical labels. For instance, the label "Parent Label > Child Label" will have the format ["Parent Label", "Child Label"].
threshold  number         no        The confidence threshold to use for the label. If not specified, will default to the threshold specified at the top-level.

The response will have the format described in Prediction structure.

Note: For large requests, this endpoint may take longer to respond. You should increase your client timeout.
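The uids for this endpoint can be assembled from source and comment ids as sketched below. The helper names make_uid and build_predict_comments_request are hypothetical; the limit of 4096 is the one stated for this endpoint.

```python
MAX_UIDS_PER_REQUEST = 4096  # limit stated for predict-comments


def make_uid(source_id, comment_id):
    """Combine a source id and a comment id into the source_id.comment_id format."""
    return f"{source_id}.{comment_id}"


def build_predict_comments_request(uids, threshold=None):
    """Build the JSON body for a predict-comments request."""
    uids = list(uids)
    if len(uids) > MAX_UIDS_PER_REQUEST:
        raise ValueError(f"at most {MAX_UIDS_PER_REQUEST} uids are allowed per request")
    body = {"uids": uids}
    # The threshold is optional, so it is only included when given.
    if threshold is not None:
        body["threshold"] = threshold
    return body


uids = [make_uid("18ba5ce699f8da1f", comment_id) for comment_id in ("0001", "0002")]
uids.append(make_uid("b84d8e2641f36bf5", "abc001"))
print(build_predict_comments_request(uids, threshold=0.25))
```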

Get predictions for a pinned model for raw emails

POST /api/v1/datasets/<organisation>/<dataset_name>/labellers/<version>/predict-raw-emails

Permissions required: View labels, View sources
curl -X POST 'https://reinfer.io/api/v1/datasets/org1/collateral/labellers/1/predict-raw-emails' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "documents": [
    {
      "raw_email": {
        "body": {
          "plain": "Hi Bob,\n\nCould you send me today'"'"'s figures?\n\nThanks,\nAlice"
        },
        "headers": {
          "parsed": {
            "Date": "Thu, 09 Jan 2020 16:34:45 +0000",
            "From": "alice@company.com",
            "Message-ID": "abcdef@company.com",
            "Subject": "Figures Request",
            "To": "bob@organisation.org"
          }
        }
      },
      "user_properties": {
        "number:Deal Value": 12000,
        "string:City": "London"
      }
    },
    {
      "raw_email": {
        "body": {
          "html": "<p>Alice,</p><p>Here are the figures for today.</p><p>Regards,<br/>Bob</p>"
        },
        "headers": {
          "raw": "Message-ID: 012345@company.com\nDate: Thu, 09 Jan 2020 16:44:45 +0000\nSubject: Re: Figures Request\nFrom: bob@organisation.org\nTo: alice@company.com"
        }
      }
    }
  ],
  "include_comments": true,
  "threshold": 0.25,
  "transform_tag": "name.0.ABCD1234"
}'

Response

{
  "comments": [
    {
      "messages": [
        {
          "body": {
            "text": "Hi Bob,\n\nCould you send me today's figures?"
          },
          "from": "alice@company.com",
          "sent_at": "2020-01-09T16:34:45Z",
          "signature": {
            "text": "Thanks,\nAlice"
          },
          "subject": {
            "text": "Figures Request"
          },
          "to": [
            "bob@organisation.org"
          ]
        }
      ],
      "timestamp": "2020-01-09T16:34:45Z",
      "user_properties": {
        "number:Deal Value": 12000.0,
        "string:City": "London",
        "string:Message ID": "abcdef@company.com",
        "string:Sender": "alice@company.com"
      }
    },
    {
      "messages": [
        {
          "body": {
            "text": "Alice,\n\nHere are the figures for today."
          },
          "from": "bob@organisation.org",
          "sent_at": "2020-01-09T16:44:45Z",
          "signature": {
            "text": "Regards,\nBob"
          },
          "subject": {
            "text": "Re: Figures Request"
          },
          "to": [
            "alice@company.com"
          ]
        }
      ],
      "timestamp": "2020-01-09T16:44:45Z",
      "user_properties": {
        "string:Message ID": "012345@company.com",
        "string:Sender": "bob@organisation.org"
      }
    }
  ],
  "model": {
    "time": "2020-02-06T20:42:58.047000Z",
    "version": 0
  },
  "predictions": [
    [
      {
        "name": [
          "Some Label"
        ],
        "probability": 0.8896465003490448,
        "sentiment": 0.21210604394557794
      }
    ],
    [
      {
        "name": [
          "Other Label"
        ],
        "probability": 0.6406207121908665,
        "sentiment": 0.9126404295573658
      }
    ]
  ],
  "status": "ok"
}
Name | Type | Required | Description
transform_tag | string | yes | A tag identifying the email integration sending the data. You should have received this tag during integration configuration setup.
documents | array<Document> | yes | A batch of at most 4096 documents in the format described below. Larger batches are faster (per document) than smaller ones.
threshold | number | no | The confidence threshold to filter the label results by. A number between 0.0 and 1.0. 0.0 will include all results.
labels | array<Label> | no | A list of requested labels to be returned, optionally with label-specific thresholds.
include_comments | boolean | no | If set to true, the comments parsed from the emails will be returned in the response body.

Where Document has the following format:

Name | Type | Required | Description
raw_email | RawEmail | yes | Email data, in the format described below.
user_properties | map<string, string | number> | no | Any user-defined metadata that applies to the comment. The key of a user property has the format "type:name", e.g. "string:email" or "number:age". The value must be a string or a number depending on the type of the user property.

Note: Some user properties are generated based on the email content. If these conflict with uploaded user properties, the request will fail with 422 Unprocessable Entity.
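
As a rough illustration of the "type:name" key format described above, the following sketch checks a user_properties map on the client before it is sent. The function name validate_user_properties is hypothetical and not part of the reinfer API. It only covers the "string" and "number" types mentioned in the table, and it does not detect conflicts with the properties generated from the email content.

```python
from numbers import Real


def validate_user_properties(props):
    """Check that each key is "type:name" and each value matches its declared type.

    Hypothetical client-side helper; raises ValueError on the first problem found.
    """
    for key, value in props.items():
        prop_type, sep, name = key.partition(":")
        if not sep or not name:
            raise ValueError(f"user property key {key!r} is not of the form 'type:name'")
        if prop_type == "string":
            ok = isinstance(value, str)
        elif prop_type == "number":
            # bool is a subclass of int in Python, so exclude it explicitly.
            ok = isinstance(value, Real) and not isinstance(value, bool)
        else:
            raise ValueError(f"unknown user property type {prop_type!r} in key {key!r}")
        if not ok:
            raise ValueError(f"value for {key!r} must be a {prop_type}")
    return props
```

For example, validate_user_properties({"number:Deal Value": 12000, "string:City": "London"}) returns the map unchanged, while {"number:age": "42"} raises a ValueError.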

Where RawEmail has the following format:

Name | Type | Required | Description
headers | Headers | yes | An object containing the headers of the email.
body | Body | yes | An object containing the main body of the email.

Where Headers has the following format:

Name | Type | Required | Description
raw | string | no | One of raw and parsed is required. The raw email headers, given as a single string, with each header on its own line.
parsed | map<string, string | array<string>> | no | One of raw and parsed is required. The parsed email headers, given as an object with string keys and string or array<string> values. Each key represents one email header. Lists of values will be concatenated with commas before being set as a single header value.

If you require duplicate header keys, please use raw instead.

Where Body has the following format:

Name | Type | Required | Description
plain | string | no | At least one of plain and html is required. The plaintext content of the email.
html | string | no | At least one of plain and html is required. The HTML content of the email.

Where Label has the following format:

Name | Type | Required | Description
name | array<string> | yes | The name of the label to be returned, formatted as a list of hierarchical labels. For instance, the label "Parent Label > Child Label" will have the format ["Parent Label", "Child Label"].
threshold | number | no | The confidence threshold to use for the label. If not specified, will default to the threshold specified at the top-level.

The response will have the format described in Prediction structure.

Note: For large requests, this endpoint may take longer to respond. You should increase your client timeout.
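
Since a request accepts at most 4096 documents, a larger collection has to be split across several requests. The sketch below shows one way to build the request bodies. The helper name build_predict_requests is an assumption for illustration only, and sending each body (for example with an HTTP client of your choice and a suitably long timeout) is left to the caller.

```python
MAX_BATCH_SIZE = 4096  # documented maximum number of documents per request


def build_predict_requests(documents, transform_tag, threshold=None, include_comments=False):
    """Yield one request body per batch of at most MAX_BATCH_SIZE documents.

    Hypothetical helper: it only builds the JSON bodies, it does not send them.
    """
    for start in range(0, len(documents), MAX_BATCH_SIZE):
        body = {
            "transform_tag": transform_tag,
            "documents": documents[start:start + MAX_BATCH_SIZE],
            "include_comments": include_comments,
        }
        if threshold is not None:
            if not 0.0 <= threshold <= 1.0:
                raise ValueError("threshold must be between 0.0 and 1.0")
            body["threshold"] = threshold
        yield body
```

Each yielded body can be serialised with json.dumps and posted to the predict-raw-emails endpoint in turn.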

Create a trigger

PUT /api/v1/datasets/<organisation>/<dataset_name>/triggers

Permissions required: Triggers admin, View labels, View sources
curl -X PUT 'https://reinfer.io/api/v1/datasets/org1/collateral/triggers' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "trigger": {
    "comment_filter": {
      "user_properties": {
        "number:Spend": {
          "maximum": 100000,
          "minimum": 100
        },
        "number:Transactions": {
          "one_of": [
            1
          ]
        },
        "string:Country": {
          "one_of": [
            "uk",
            "de"
          ]
        }
      }
    },
    "description": "Used by ACME RPA to create tickets for disputes.",
    "model": {
      "version": 8
    },
    "name": "dispute",
    "title": "Collateral Disputes"
  }
}'

Response

{
  "status": "ok",
  "trigger": {
    "created_at": "2019-08-03T12:30:00.123456Z",
    "dataset_id": "abcdef0123456789",
    "description": "Used by ACME RPA to create tickets for disputes.",
    "id": "0123456789abcdef",
    "model": {
      "version": 8
    },
    "name": "dispute",
    "title": "Collateral Disputes",
    "updated_at": "2019-08-03T12:30:00.123456Z"
  }
}

A trigger defines a stream of comments from a dataset that can be used to "trigger" downstream actions in an automated process. At its core, it enables persistent, stateful iteration through comments in a dataset, with predicted labels and entities computed using a pinned model.

Once created, the fetch and advance methods can be used to iterate through comments. If a comment_filter is defined, only those comments that match the comment filter will be included.

Name | Type | Required | Description
name | string | yes | API name for the trigger, used in URLs. Must be unique within a dataset and must match [A-Za-z0-9-_]{1,256}.
title | string | no | One-line human-readable title for the trigger.
description | string | no | A longer description of the trigger.
model | Model | no | If specified, comments fetched from this trigger will contain predictions from a pinned model. The only supported field currently is version, corresponding to a pinned model version e.g. "model": {"version": 2}.
comment_filter | CommentFilter | no | If specified, comments not matching the filter will not be returned.
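
The following sketch assembles a create-trigger request body from the fields above and checks the name against the documented pattern before anything is sent. The function build_trigger_body is a hypothetical helper, not part of the API, and the server remains the authority on what is accepted.

```python
import re

# Equivalent to the documented pattern [A-Za-z0-9-_]{1,256}; the hyphen is
# placed last so that it is unambiguously a literal character.
TRIGGER_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,256}")


def build_trigger_body(name, title=None, description=None, model_version=None, comment_filter=None):
    """Return a request body for creating a trigger (hypothetical helper)."""
    if not TRIGGER_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid trigger name: {name!r}")
    trigger = {"name": name}
    # Optional fields are only included when they were provided.
    if title is not None:
        trigger["title"] = title
    if description is not None:
        trigger["description"] = description
    if model_version is not None:
        trigger["model"] = {"version": model_version}
    if comment_filter is not None:
        trigger["comment_filter"] = comment_filter
    return {"trigger": trigger}
```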

Where Model has the following format:

Name | Type | Required | Description
version | integer | yes | A model version that has been pinned via the Models page.
label_thresholds | array<LabelThreshold> | no | Filter predictions, so that only values matching the given label_thresholds are returned. If not set, all labels and all prediction values will be returned.

Where LabelThreshold has the following format:

Name | Type | Required | Description
name | array<string> | yes | The name of the label to be returned, formatted as a list of hierarchical labels. For instance, the label "Parent Label > Child Label" will have the format ["Parent Label", "Child Label"].
threshold | number | yes | The confidence threshold to use for the label. The label will only be returned for a comment if its prediction is above this threshold.

Where CommentFilter has the following format:

Name | Type | Required | Description
user_properties | UserPropertyFilter | no | A filter that applies to the user properties of a comment. For more on user properties, see Comment structure.

The UserPropertyFilter is a map of user property name to filter. String properties may be filtered to values in a set ({"one_of": ["val_1", "val_2"]}). Number properties may be filtered either to values in a set ({"one_of": [123, 456]}) or to a range ({"minimum": 123, "maximum": 456}).
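
To make the filter shapes concrete, here is a client-side sketch of how such a filter could be evaluated against a comment's user properties. It is an illustration only: the filtering is performed by the reinfer platform, and the choices made here (range bounds are inclusive, a comment lacking a filtered property does not match) are assumptions rather than documented behaviour. The function name matches_filter is hypothetical.

```python
def matches_filter(user_properties, property_filter):
    """Return True if every filtered property satisfies its rule.

    Assumptions (not confirmed by the API reference): a missing property
    never matches, and minimum/maximum bounds are inclusive.
    """
    for key, rule in property_filter.items():
        if key not in user_properties:
            return False
        value = user_properties[key]
        if "one_of" in rule and value not in rule["one_of"]:
            return False
        if "minimum" in rule and value < rule["minimum"]:
            return False
        if "maximum" in rule and value > rule["maximum"]:
            return False
    return True
```

With the filter from the create request above, a comment with Spend 500, Transactions 1 and Country "uk" matches, while the same comment with Country "fr" does not.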

Get a trigger

GET /api/v1/datasets/<organisation>/<dataset_name>/triggers/<trigger_name>

Permissions required: View labels, View sources, View triggers
curl -X GET 'https://reinfer.io/api/v1/datasets/org1/collateral/triggers/dispute' \
    -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok",
  "trigger": {
    "created_at": "2019-08-03T12:30:00.123456Z",
    "dataset_id": "abcdef0123456789",
    "description": "Used by ACME RPA to create tickets for disputes.",
    "id": "0123456789abcdef",
    "model": {
      "version": 8
    },
    "name": "dispute",
    "title": "Collateral Disputes",
    "updated_at": "2019-08-03T12:30:00.123456Z"
  }
}

Delete a trigger

DELETE /api/v1/datasets/<organisation>/<dataset_name>/triggers/<trigger_name>

Permissions required: Triggers admin, View labels, View sources
curl -X DELETE 'https://reinfer.io/api/v1/datasets/org1/collateral/triggers/dispute' \
    -H "Authorization: Bearer $REINFER_TOKEN"

Response

{
  "status": "ok"
}

Fetch comments from a trigger

POST /api/v1/datasets/<organisation>/<dataset_name>/triggers/<trigger_name>/fetch

Permissions required: Consume triggers, View labels, View sources
curl -X POST 'https://reinfer.io/api/v1/datasets/org1/collateral/triggers/dispute/fetch' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "size": 8
}'

Response

{
  "filtered": 6,
  "results": [
    {
      "comment": {
        "context": "1",
        "created_at": "2018-10-15T15:39:51.815000Z",
        "id": "0123456789abcdef",
        "last_modified": "2018-10-15T15:39:51.815000Z",
        "messages": [
          {
            "body": {
              "text": "Hi Bob,\n\nCould you send me today's figures?"
            },
            "from": "alice@company.com",
            "sent_at": "2011-12-11T11:02:03.000000+00:00",
            "signature": {
              "text": "Thanks,\nAlice"
            },
            "subject": {
              "text": "Today's figures"
            },
            "to": [
              "bob@organisation.org"
            ]
          }
        ],
        "source_id": "18ba5ce699f8da1f",
        "thread_id": "3c314542414538353242393446393",
        "timestamp": "2011-12-11T01:02:03.000000+00:00",
        "uid": "18ba5ce699f8da1f.0123456789abcdef",
        "user_properties": {
          "number:Participants": 2,
          "number:Position in Thread": 1,
          "number:Recipients": 1,
          "string:Folder": "Sent (/ Sent)",
          "string:Has Signature": "Yes",
          "string:Message ID": "<abcdef@abc.company.com>",
          "string:Sender": "alice@company.com",
          "string:Sender Domain": "company.com",
          "string:Thread": "<abcdef@abc.company.com>"
        }
      },
      "entities": [],
      "labels": [],
      "sequence_id": "qs8QcHIBAACuYzDeit-pwQdWGYGQImdy"
    },
    {
      "comment": {
        "context": "1",
        "created_at": "2018-10-15T18:39:51.815000Z",
        "id": "abcdef0123456789",
        "last_modified": "2018-10-15T18:39:51.815000Z",
        "messages": [
          {
            "body": {
              "text": "Alice,\n\nHere are the figures for today."
            },
            "from": "bob@organisation.org",
            "sent_at": "2011-12-11T11:02:03.000000+00:00",
            "signature": {
              "text": "Regards,\nBob"
            },
            "subject": {
              "text": "RE: Today's figures"
            },
            "to": [
              "alice@company.com"
            ]
          }
        ],
        "source_id": "18ba5ce699f8da1f",
        "thread_id": "3c314542414538353242393446393",
        "timestamp": "2011-12-11T02:02:03.000000+00:00",
        "uid": "18ba5ce699f8da1f.abcdef0123456789",
        "user_properties": {
          "number:Participants": 3,
          "number:Position in Thread": 2,
          "number:Recipients": 2,
          "string:Folder": "Inbox (/ Inbox)",
          "string:Has Signature": "No",
          "string:Message ID": "def@xyz.organisation.com",
          "string:Sender": "bob@organisation.org",
          "string:Sender Domain": "organisation.org",
          "string:Thread": "<abcdef@abc.company.com>"
        }
      },
      "entities": [],
      "labels": [
        {
          "name": [
            "Some Top-Level Label"
          ],
          "probability": 0.8374786376953125
        },
        {
          "name": [
            "Another Top-Level Label",
            "Child Label"
          ],
          "probability": 0.6164003014564514
        }
      ],
      "sequence_id": "qs8QcHIBAADJ1p3W2FtmBB3QiOJsCJlR"
    }
  ],
  "sequence_id": "qs8QcHIBAADJ1p3W2FtmBB3QiOJsCJlR",
  "status": "ok"
}

Request

Once a trigger is created and set up, it can be queried to fetch comments and their predicted labels and entities. The comments are returned in first-in-first-out order, i.e. older comments are returned first. After processing a batch of comments, you need to advance the trigger before fetching the next batch, otherwise you will receive the same batch of comments.

Because advancing the trigger is an explicit step, the API guarantees at-least-once processing of all comments: if a client fails while processing a batch, it will pick up the same batch again on restart.

Name | Type | Required | Description
size | number | yes | The number of comments to fetch for this trigger. Fewer comments are returned if the end of the triggered batch is reached or if comments are filtered out by the trigger's comment filter. Max value is 1024.
max_filtered | number | no | Convenience parameter for triggers with a comment filter. When provided, up to max_filtered filtered comments will not count towards the requested size. This is useful if you expect a large number of comments to not match the filter. Has no effect on triggers without a comment filter. Max value is 1024.

Response

The response contains a batch of comments (of at most size comments). If the trigger was configured with a pinned model version, the response also contains predicted labels and entities for each comment. Refer to the corresponding sections to learn more about the format of the comment and predicted labels and entities.

Additionally, the response contains the sequence_id for the batch, and a sequence_id for each individual comment. They are used when advancing the trigger to the next batch or next comment.

If you have defined a comment filter on your trigger and that comment filter excludes a comment, it will get counted in filtered, but not returned in results.

Name | Type | Description
status | string | ok if the request is successful, or error in case of an error. See the Introduction to learn more about error responses.
filtered | number | Number of comments that were filtered out according to the trigger filter. If the trigger was created without a filter, this number will always be 0.
sequence_id | string | The batch sequence ID. Used to acknowledge processing of this batch and advance the trigger to the next batch.
results | array<Result> | An array containing result objects.

Where Result has the following format:

Name | Type | Description
comment | Comment | Comment data. For a detailed explanation, see the Comment structure section.
sequence_id | string | The comment's sequence ID. Used to acknowledge processing of this comment and advance the trigger to the next comment.
labels | array<Label> | An array containing predictions for this comment. For a detailed explanation, see the Prediction structure section.
entities | array<Entity> | An array containing extracted entities for this comment. For a detailed explanation, see the Prediction structure section.
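
As an illustration of the response layout, the sketch below walks a parsed fetch response and collects, for each result, the comment's uid, its sequence_id and the names of the labels at or above a chosen probability. The function name summarise_fetch_response and the min_probability parameter are assumptions made for this example. It reads only fields that appear in the sample response above.

```python
def summarise_fetch_response(response, min_probability=0.5):
    """Return (batch_sequence_id, filtered_count, rows) for a parsed fetch response.

    Each row is (comment uid, comment sequence_id, [label names]), where label
    names are joined with " > " in the same way hierarchical labels are written.
    """
    if response.get("status") != "ok":
        raise RuntimeError(f"fetch failed: {response.get('message')}")
    rows = []
    for result in response["results"]:
        labels = [
            " > ".join(label["name"])
            for label in result.get("labels", [])
            if label["probability"] >= min_probability
        ]
        rows.append((result["comment"]["uid"], result["sequence_id"], labels))
    return response["sequence_id"], response["filtered"], rows
```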

Advance a trigger

POST /api/v1/datasets/<organisation>/<dataset_name>/triggers/<trigger_name>/advance

Permissions required: Consume triggers, View labels, View sources
curl -X POST 'https://reinfer.io/api/v1/datasets/org1/collateral/triggers/dispute/advance' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "sequence_id": "qs8QcHIBAADJ1p3W2FtmBB3QiOJsCJlR"
}'

Response

{
  "sequence_id": "qs8QcHIBAADJ1p3W2FtmBB3QiOJsCJlR",
  "status": "ok"
}

Each fetch request returns a sequence_id which represents the position it has fetched up to. Passing that same sequence_id to the advance API will make sure that the next time a fetch is performed on the trigger, it will start from this position. You can advance to the next batch by using the current batch's sequence_id. Alternatively, you can advance to the next comment by using the current comment's sequence_id.

Since an application can successfully process a comment but fail at the advance step, it is important for the client application to handle seeing a comment multiple times.

Name | Type | Required | Description
sequence_id | string | yes | The sequence ID to advance the trigger to.
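
The fetch and advance steps can be combined into a simple consumer loop. The sketch below is one possible shape, not a reference implementation: fetch, advance and handle are callables supplied by the caller (for instance thin wrappers around the two HTTP endpoints), and the name consume_trigger is hypothetical. It skips comments whose uid has already been handled, which is one way to cope with at-least-once delivery. It also assumes that an empty results list with filtered equal to 0 means there is nothing left to fetch for now.

```python
def consume_trigger(fetch, advance, handle, size=8, seen_uids=None):
    """Fetch, process and advance until a fetch returns nothing new.

    fetch(size) returns a parsed fetch response, advance(sequence_id)
    acknowledges a batch, and handle(result) processes one result.
    Returns the set of comment uids handled so far.
    """
    # In a real client this set would need to be persisted, so that
    # duplicates are still recognised after a restart.
    seen_uids = set() if seen_uids is None else seen_uids
    while True:
        batch = fetch(size)
        results = batch["results"]
        # Assumption: no results and nothing filtered means the end was reached.
        if not results and batch.get("filtered", 0) == 0:
            return seen_uids
        for result in results:
            uid = result["comment"]["uid"]
            if uid in seen_uids:
                continue  # already handled before an earlier advance failed
            handle(result)
            seen_uids.add(uid)
        # Advance only after the whole batch has been handled.
        advance(batch["sequence_id"])
```

Advancing once per batch keeps the number of requests low. Advancing after each comment with that comment's sequence_id would instead reduce the amount of repeated work after a failure.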

Reset a trigger

POST /api/v1/datasets/<organisation>/<dataset_name>/triggers/<trigger_name>/reset

Permissions required: Consume triggers, View labels
curl -X POST 'https://reinfer.io/api/v1/datasets/org1/collateral/triggers/dispute/reset' \
    -H "Authorization: Bearer $REINFER_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{
  "to_comment_created_at": "2020-06-03T16:05:00"
}'

Response

{
  "status": "ok",
  "trigger": {
    "created_at": "2019-08-03T12:30:00.123456Z",
    "dataset_id": "abcdef0123456789",
    "description": "Used by ACME RPA to create tickets for disputes.",
    "id": "0123456789abcdef",
    "model": {
      "version": 8
    },
    "name": "dispute",
    "title": "Collateral Disputes",
    "updated_at": "2019-08-03T12:30:00.123456Z"
  },
  "sequence_id": "4LvtenIBAAA="
}

A trigger can be reset to move its position backwards or forwards in time, either to repeat previously returned comments or to skip comments. The timestamp used to reset a trigger refers to the time the comments were uploaded (i.e. the comment's created_at property, rather than its timestamp property).

Name | Type | Required | Description
to_comment_created_at | string | yes | An ISO-8601 timestamp.

The response will contain the sequence_id corresponding to the new Trigger position.
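
As a small sketch, the reset request body can be built from a Python datetime. The helper name build_reset_body is hypothetical. The example mirrors the request above by using a timestamp without a time zone; how the API interprets such a timestamp is not stated here, so passing a time-zone-aware datetime may be the safer choice.

```python
from datetime import datetime


def build_reset_body(created_at):
    """Return the request body for resetting a trigger to the given upload time."""
    # datetime.isoformat() produces an ISO-8601 string.
    return {"to_comment_created_at": created_at.isoformat()}


body = build_reset_body(datetime(2020, 6, 3, 16, 5))
# body == {"to_comment_created_at": "2020-06-03T16:05:00"}
```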