RDF and SPARQL

RDF

The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource. However, by generalizing the concept of a "Web resource", RDF can also be used to represent information about things that can be identified on the Web, even when they cannot be directly retrieved on the Web. Examples include information about items available from on-line shopping facilities (e.g., information about specifications, prices, and availability), or the description of a Web user's preferences for information delivery.

RDF is intended for situations in which this information needs to be processed by applications, rather than being only displayed to people. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. The ability to exchange information between different applications means that the information may be made available to applications other than those for which it was originally created.

RDF is based on the idea of identifying things using Web identifiers (called Uniform Resource Identifiers, or URIs), and describing resources in terms of simple properties and property values. This enables RDF to represent simple statements about resources as a graph of nodes and arcs representing the resources, and their properties and values.

RDF Primer

RDF allows graphs to be built from typed vertices. The vertices have attributes and reference Web-based resources. An example given in the primer is the statement that http://www.foo.org/someWebPage.html has a creator whose name is Foo Bar. RDF is an attempt to supply a semantic framework of metadata that describes objects on the Web. The underlying data model is described by a specification cited as RDF-CONCEPTS (RDF Concepts and Abstract Syntax), which supports the specification of vertices and edges. The RDF-CONCEPTS specification can also describe attributes associated with vertex types. Data in this graph is described by a triple of the form {vertex, field name, value}. Links between vertices are defined by a triple of the form {source vertex, link, destination vertex}. An example is shown below, where the first line defines a link between two vertices and the second and third lines define attribute values on a vertex.

<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/creator> <http://www.example.org/staffid/85740> .

<http://www.example.org/index.html> <http://www.example.org/terms/creation-date> "August 16, 1999" .

<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/language> "en" .
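
The distinction between link triples and attribute triples can be made mechanically by checking whether the third element is a URI or a quoted literal. The following is a minimal Python sketch of that idea, not a real RDF parser. It assumes only the simplified line shape used above (no blank nodes, language tags, datatypes, or escape sequences), and the names TRIPLE_RE and parse_triple are illustrative and do not come from any RDF library.

```python
import re

# Matches one simplified N-Triples line:
#   <subject> <predicate> (<object URI> | "literal") .
# Assumption: no blank nodes, language tags, datatypes, or escapes.
TRIPLE_RE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)")\s*\.$')


def parse_triple(line):
    """Return (subject, predicate, object, object_is_uri) for one line."""
    match = TRIPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a simple N-Triples line: {line!r}")
    subject, predicate, uri, literal = match.groups()
    if uri is not None:
        return (subject, predicate, uri, True)
    return (subject, predicate, literal, False)


LINES = [
    '<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/creator> <http://www.example.org/staffid/85740> .',
    '<http://www.example.org/index.html> <http://www.example.org/terms/creation-date> "August 16, 1999" .',
    '<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/language> "en" .',
]

triples = [parse_triple(line) for line in LINES]

# A triple whose object is a URI links two vertices.
links = [t for t in triples if t[3]]
# A triple whose object is a literal sets an attribute value on a vertex.
attributes = [t for t in triples if not t[3]]
```

Under these assumptions, links holds the single creator triple and attributes holds the creation-date and language triples.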

SPARQL

SPARQL is a query language for graphs (vertices and edges) constructed via RDF.

A SPARQL query refers to vertices and fields using the same URI notation as the triples above. In the example below, the query is formulated against the vertex http://example.org/book/book1, referencing the field http://purl.org/dc/elements/1.1/title and whatever value is in that field (?title is a variable to which that value is bound). The title value is returned as the query result.

SELECT ?title
WHERE  { <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title }
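
One way to picture what this query does is as a filter over a list of triples: keep the triples whose vertex and field match, and return the remaining element. The Python sketch below illustrates that reading only; it is not how a SPARQL engine is implemented. The sample graph, including both title values, and the helper name select_object are assumptions made for illustration.

```python
TITLE = "http://purl.org/dc/elements/1.1/title"
BOOK1 = "http://example.org/book/book1"

# Hypothetical sample data: (vertex, field name, value) triples.
graph = [
    (BOOK1, TITLE, "SPARQL Tutorial"),
    ("http://example.org/book/book2", TITLE, "Another Book"),
]


def select_object(graph, subject, predicate):
    """Return the value of every triple with the given vertex and field."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]


# Plays the role of ?title in the query above.
titles = select_object(graph, BOOK1, TITLE)
```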

A SPARQL SELECT can match on multiple fields, as shown below (here the . separating the two patterns acts as a logical AND).

PREFIX foaf:   <http://xmlns.com/foaf/0.1/> 
SELECT ?mbox
WHERE
  { ?x foaf:name "Johnny Lee Outlaw" .
    ?x foaf:mbox ?mbox }
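
The logical AND means that both patterns must hold for the same binding of ?x. A rough Python sketch of that conjunction is given below, assuming a tiny in-memory graph. The sample data (vertex labels and mailbox addresses) and the helper names is_var, match, and query are hypothetical, and treating any string that starts with "?" as a variable is a simplification.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical sample data: two people, each with a name and a mailbox.
graph = [
    ("_:a", FOAF + "name", "Johnny Lee Outlaw"),
    ("_:a", FOAF + "mbox", "mailto:jlow@example.com"),
    ("_:b", FOAF + "name", "Peter Goodguy"),
    ("_:b", FOAF + "mbox", "mailto:peter@example.org"),
]


def is_var(term):
    # Simplification: any term starting with "?" is treated as a variable.
    return term.startswith("?")


def match(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple, or return None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if is_var(term):
            # Bind the variable, or reject if it is already bound differently.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(graph, patterns):
    """Return every set of bindings that satisfies all patterns (logical AND)."""
    solutions = [{}]
    for pattern in patterns:
        extended_solutions = []
        for bindings in solutions:
            for triple in graph:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    extended_solutions.append(extended)
        solutions = extended_solutions
    return solutions


patterns = [
    ("?x", FOAF + "name", "Johnny Lee Outlaw"),
    ("?x", FOAF + "mbox", "?mbox"),
]

# Project only ?mbox, as the SELECT clause does.
mboxes = [solution["?mbox"] for solution in query(graph, patterns)]
```

Because ?x is already bound by the first pattern, the second pattern can only match mailbox triples on the same vertex.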

This query selects on the basis of the name (Johnny Lee Outlaw) and returns only mbox (much like a relational SQL query).

As the last query suggests, the SPARQL query language mirrors relational SQL in many ways, including the ability to nest queries.

SPARQL can return subgraphs (through its CONSTRUCT and DESCRIBE query forms), but much of the language is designed to return what are, in effect, tables (in the same way that relational SQL does).

Graph Query Language Research at the Lawrence Livermore National Laboratory

The Distributed Semantic Graph Engine (DSGE) research project at the Lawrence Livermore National Laboratory has developed a graph query language for databases structured as semantic graphs. This query language is described in the technical report "A Semantic Graph Query Language" by Ian Kaplan, Lawrence Livermore National Laboratory, October 17, 2006, UCRL-TR-255447 (PDF format). A revised version of this technical report should be available by the end of 2007.


Last modified: Thu Dec 8 22:23:09 PST 2005