The Kasabi Jobs API offers a simple protocol for carrying out data management tasks relating to a dataset as a whole.

Currently this service supports only resetting a dataset. Bulk download capabilities will be added later.

Basic API Reference

Endpoint URL

The base URL for the Jobs API associated with a given dataset is:

http://api.kasabi.com/dataset/[short-code]/jobs

Where short-code is the short name for the dataset. For example, the NASA dataset available from http://beta.kasabi.com/dataset/nasa has a Jobs API available at:

http://api.kasabi.com/dataset/nasa/jobs

Authentication

Only the owner of a dataset has permission to interact with the Jobs API. Accessing the API requires your API key. For more information on Kasabi authentication options, read the authentication documentation.

Parameters

The Jobs API accepts two parameters:

  • jobType -- (required) the name of the job to be carried out (see below), e.g. reset
  • startTime -- (optional) the time at which the job should begin, e.g. 2011-05-27T14:30:00Z

These values can be provided as parameters to a POST request to the Jobs API endpoint. If startTime is omitted, the job will run as soon as possible. Alternatively, the parameters can be provided as a JSON object in the body of the request, e.g.:


{
  "jobType": "reset",
  "startTime": "2011-05-27T14:30:00Z"
}
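As an illustration, the request described above could be assembled in Python roughly as follows. This is a minimal sketch, not an official client: the helper names are hypothetical, and both the apikey query parameter and the application/json content type are assumptions (check the authentication documentation for the supported ways of supplying your key). The function only builds the request; nothing is sent until it is passed to urllib.request.urlopen.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

API_BASE = "http://api.kasabi.com/dataset"


def jobs_endpoint(short_code: str) -> str:
    """Return the Jobs API base URL for a dataset short code."""
    return f"{API_BASE}/{short_code}/jobs"


def build_job_request(
    short_code: str,
    api_key: str,
    job_type: str = "reset",
    start_time: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that queues a job."""
    payload = {"jobType": job_type}
    if start_time is not None:
        # Optional: when omitted, the job runs as soon as possible.
        payload["startTime"] = start_time

    # Assumption: the API key is passed as an `apikey` query parameter.
    query = urllib.parse.urlencode({"apikey": api_key})
    url = f"{jobs_endpoint(short_code)}?{query}"

    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: a JSON body is labelled with this content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_job_request("nasa", "YOUR-API-KEY")
    print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before a destructive job such as reset is actually queued.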

HTTP Response Codes

Clients should be prepared to receive any valid HTTP response code. The following table lists the most frequently used codes.

Code                Meaning
202 Accepted        Request to process the job has been accepted
400 Bad Request     Invalid data, e.g. missing parameters
401 Not Authorized  API key is not authorized to access the data

Please also review our additional notes on response codes and error reporting.

Response Formats

The Jobs API typically returns plain text messages. For a successfully queued job request, an HTTP Location header is returned in the response. This indicates a URL that can be monitored to check on the status of a job.
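A client might turn the response codes and the Location header described above into a single result along these lines. This is a sketch under stated assumptions: JobRequestError and job_url_from_response are hypothetical names, and the function takes the status code, headers, and plain text body as arguments so that it does not depend on any particular HTTP library.

```python
from typing import Mapping


class JobRequestError(Exception):
    """Raised when a job request was not accepted (hypothetical helper)."""


def job_url_from_response(
    status: int, headers: Mapping[str, str], body: str = ""
) -> str:
    """Return the job status URL for an accepted request, else raise."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}

    if status == 202:
        location = normalised.get("location")
        if not location:
            raise JobRequestError("202 Accepted but no Location header returned")
        return location
    if status == 400:
        raise JobRequestError(f"Bad request (e.g. missing parameters): {body}")
    if status == 401:
        raise JobRequestError(f"API key not authorized: {body}")
    # Clients should be prepared for any other valid HTTP response code.
    raise JobRequestError(f"Unexpected HTTP status {status}: {body}")


if __name__ == "__main__":
    job_url = job_url_from_response(
        202, {"Location": "http://data.kasabi.com/data/nasa/jobs/12345"}
    )
    print(job_url)
```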

Resetting a Dataset

A reset job (jobType of reset) can be used to delete all data from a dataset. During a reset job the storage for the dataset will be put into a read-only mode. The status of the storage can be monitored via the Status API. Once the dataset has been cleared, the dataset will be returned to a read-write status.

Reset jobs are provided to support developer workflows, e.g. testing and populating a draft dataset (read more about the lifecycle of Kasabi datasets). Once a dataset has been published, data owners are encouraged to avoid unnecessarily resetting their store as this will impact potential users of the data.

Monitoring Job Status

An HTTP request that successfully queues a job returns an HTTP Location header containing the URL of a resource that can be monitored to track the progress of the job. A GET request on that resource returns a JSON response that describes the current state of the job.

For example, a request to reset data in the NASA dataset might return the following URL for a job resource:

http://data.kasabi.com/data/nasa/jobs/12345

A GET request to that resource will return a JSON response with the following format:


{
  "created": "2011-05-27T14:30:00Z",
  "startTime": "2011-05-27T14:31:00Z",
  "endTime": "2011-05-27T14:33:00Z",
  "status": "completed"
}

The fields in the response are defined as follows:

  • created -- date-time when the job request was created in the system
  • startTime -- scheduled start time for the job
  • endTime -- date-time when the job finished running
  • status -- status of the job. This will be one of the following values: scheduled, running, completed, failed
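The job resource described above could be interpreted with a small helper such as the following. This is a sketch, not a definitive implementation: summarize_job is a hypothetical name, the timestamp format is inferred from the examples in this document, and treating every status other than scheduled and running as finished is an assumption. A polling client would call it on the body of each GET response and stop once finished is true.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict

# Statuses that mean the job has not finished yet.
IN_PROGRESS = {"scheduled", "running"}


def parse_time(value: str) -> datetime:
    """Parse a UTC timestamp such as 2011-05-27T14:30:00Z."""
    # Assumption: timestamps always use this exact format, as in the examples.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def summarize_job(body: str) -> Dict[str, Any]:
    """Summarise the JSON body returned by a GET on a job resource."""
    job = json.loads(body)
    status = job["status"]
    # Assumption: any status outside IN_PROGRESS is treated as terminal.
    finished = status not in IN_PROGRESS

    duration = None
    start, end = job.get("startTime"), job.get("endTime")
    if finished and start and end:
        # Seconds between the scheduled start and the end of the job.
        duration = (parse_time(end) - parse_time(start)).total_seconds()

    return {"status": status, "finished": finished, "duration_seconds": duration}


if __name__ == "__main__":
    sample = json.dumps({
        "created": "2011-05-27T14:30:00Z",
        "startTime": "2011-05-27T14:31:00Z",
        "endTime": "2011-05-27T14:33:00Z",
        "status": "completed",
    })
    print(summarize_job(sample))
```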