Kasabi supports exposing the RDF stored in a dataset as Linked Data. This allows URIs in a dataset to be directly accessed from an application or web browser in order to fetch a description of the resource.
Linked Data makes it easier to connect datasets to the growing web of data, and to use the data in Linked Data browsers and applications.
Data publishers wishing to take advantage of this feature must construct the URIs for their data based on some basic guidelines. All the URIs in a dataset must be given a consistent base URI based on http://data.kasabi.com. The following section indicates how to determine the correct base URI for a specific dataset.
In future releases the Linked Data publishing feature will be expanded to support more options, including support for domain hosting. This will allow data owners to use their own domains for assigning identifiers to resources in their datasets, ensuring portability.
Publishing Linked Data
The Linked Data views for Kasabi are available from a predictable location based on the URL of the dataset. The basic pattern is as follows:
http://data.kasabi.com/dataset/[dataset-id]/[user-defined-path]
where dataset-id is the short name for the dataset, and user-defined-path is a relative URL path defined by the dataset owner.
For example, the NASA dataset, available from http://kasabi.com/dataset/nasa, has a short name of nasa. Resources in that dataset all share the following base URI:
http://data.kasabi.com/dataset/nasa
An example of a resource identifier in that dataset is:
http://data.kasabi.com/dataset/nasa/spacecraft/1983-010A
In this case the user-defined part of the identifier is /spacecraft/1983-010A.
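The URI pattern above can be sketched as a small helper. This is only an illustration: the function name resource_uri and its validation rules are assumptions and not part of any Kasabi client library.

```python
# Base location for Linked Data views, as given in the pattern above.
BASE = "http://data.kasabi.com/dataset"


def resource_uri(dataset_id: str, user_path: str) -> str:
    """Join a dataset short name and an owner-defined path into a resource URI.

    Hypothetical helper: the check on dataset_id is an assumption made for
    this sketch, not a documented Kasabi rule.
    """
    if not dataset_id or "/" in dataset_id:
        raise ValueError("dataset_id must be a non-empty short name without slashes")
    # Strip any leading slash so exactly one slash separates the two parts.
    return f"{BASE}/{dataset_id}/{user_path.lstrip('/')}"


print(resource_uri("nasa", "/spacecraft/1983-010A"))
# prints http://data.kasabi.com/dataset/nasa/spacecraft/1983-010A
```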
Beyond these guidelines, which are required to use the built-in Linked Data hosting facility, dataset owners are free to define URIs as they see fit. Future releases will support domain hosting, removing the need to use a specific base URI.
However, some basic limitations will remain in force:
- No support for hash URIs -- do not use hash-based identifiers, e.g. /spacecraft/1983-010A#spacecraft. These will be stored and managed correctly within the graph store, but will not resolve as Linked Data. Kasabi encourages the use of "slash"-based identifiers for resources.
- No support for URNs -- Linked Data is based on using HTTP URIs to allow data to be integrated directly into the web. Whilst the graph store within Kasabi supports use of URN-based identifiers, the Linked Data hosting facility will not support use of URNs.
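A publisher might check identifiers against these two limitations before loading data. The sketch below is one possible pre-flight check using only the Python standard library; the function name linked_data_problems and the message wording are assumptions made for illustration.

```python
from typing import List
from urllib.parse import urlsplit


def linked_data_problems(uri: str) -> List[str]:
    """Return reasons why a URI would not resolve as Linked Data.

    Hypothetical helper covering only the two limitations listed above:
    hash (fragment) identifiers and URNs.
    """
    parts = urlsplit(uri)
    problems = []
    if parts.scheme.lower() == "urn":
        problems.append("URN identifiers are not served as Linked Data")
    if parts.fragment:
        problems.append("hash-based identifiers do not resolve as Linked Data")
    return problems


for candidate in (
    "http://data.kasabi.com/dataset/nasa/spacecraft/1983-010A",
    "http://data.kasabi.com/dataset/nasa/spacecraft/1983-010A#spacecraft",
    "urn:isbn:0451450523",  # illustrative URN, not taken from any dataset
):
    print(candidate, linked_data_problems(candidate))
```

An empty list means neither limitation applies; it does not confirm that the URI uses the correct base URI for the dataset.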
Accessing Linked Data
To access Linked Data in the browser or an application, simply perform a GET request on a resource identifier. Application code should be configured to expect HTTP redirects as user agents will be directed to a document containing the requested data.
Linked Data is available in a range of standard RDF serializations, including RDF/XML and Turtle. JSON is also supported via the RDF/JSON format. A simple HTML view is also provided to facilitate basic navigation through a dataset.
Content negotiation is supported using the standard Kasabi content negotiation methods, i.e. the HTTP Accept header and output parameter.
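As a rough sketch, a client could prepare such a request with Python's standard urllib. The helper name linked_data_request is hypothetical, and the sketch assumes the query parameter is literally named output and accepts a value such as json; check the Kasabi content negotiation documentation for the supported values.

```python
import urllib.request
from typing import Optional
from urllib.parse import urlencode


def linked_data_request(
    uri: str, accept: Optional[str] = None, output: Optional[str] = None
) -> urllib.request.Request:
    """Build a GET request for a resource identifier.

    accept sets the HTTP Accept header; output appends an `output` query
    parameter (the parameter name and its values are assumptions here).
    """
    if output is not None:
        separator = "&" if "?" in uri else "?"
        uri = f"{uri}{separator}{urlencode({'output': output})}"
    headers = {"Accept": accept} if accept else {}
    return urllib.request.Request(uri, headers=headers, method="GET")


# Ask for Turtle via the Accept header (text/turtle is the Turtle media type).
req = linked_data_request(
    "http://data.kasabi.com/dataset/nasa/spacecraft/1983-010A",
    accept="text/turtle",
)
print(req.full_url, req.get_header("Accept"))

# Passing the request to urllib.request.urlopen(req) would perform the GET;
# urlopen follows HTTP redirects by default, so the response body would be
# the document the user agent was redirected to.
```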
In addition, it is possible to add a suffix to the URI, such as .rdf or .json, to explicitly request a specific data format. For example:
http://data.kasabi.com/dataset/nasa/spacecraft/1983-010A.rdf
Linked Data for Datasets
Every dataset has a machine-readable description available as Linked Data. This description is automatically populated by metadata provided in the Kasabi application and includes title, description, links to example resources, and pointers to APIs deployed against the dataset.
Dataset metadata is published using the VoID vocabulary for describing linked datasets.
The Kasabi identifier for a dataset is simply the base URI of the dataset itself. For example, the description of the NASA dataset is available from:
http://data.kasabi.com/dataset/nasa
An RDF/JSON view of that data is available from:
http://data.kasabi.com/dataset/nasa.json
A directory of all Kasabi datasets is available from:
http://data.kasabi.com/datasets
This URL provides an RDF view that lists all datasets in the system.
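The dataset-level URLs above follow the same pattern as resource URIs, so they can be derived from the dataset short name. The following is a minimal sketch; dataset_description_url and DIRECTORY_URL are names invented for this example.

```python
from typing import Optional

BASE = "http://data.kasabi.com"

# Directory listing all datasets, as given above.
DIRECTORY_URL = f"{BASE}/datasets"


def dataset_description_url(dataset_id: str, suffix: Optional[str] = None) -> str:
    """Return the description URL for a dataset, optionally with a format suffix.

    Hypothetical helper. The suffixes named in this documentation are
    rdf and json; other values are passed through unchecked.
    """
    url = f"{BASE}/dataset/{dataset_id}"
    return f"{url}.{suffix}" if suffix else url


print(dataset_description_url("nasa"))
# prints http://data.kasabi.com/dataset/nasa
print(dataset_description_url("nasa", "json"))
# prints http://data.kasabi.com/dataset/nasa.json
```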