Unifier is an Operator that makes it possible to execute complex, declarative, conjunctive queries against any Context implementation that supports TripleMatcher.

Conjunctive queries in practice

Declarative, conjunctive queries are useful for extracting tabular information from complex RDF graphs. For instance, suppose I have the following RDF data (in this example, "ns:" is shorthand for a namespace prefix, say "http://ns#"):

<ns:person1> <ns:hasDog> <ns:dog1> .
<ns:person2> <ns:hasDog> <ns:dog2> .
<ns:person3> <ns:hasDog> <ns:dog3> .
<ns:person4> <ns:hasDog> <ns:dog3> .
<ns:person5> <ns:hasDog> <ns:dog4> .
<ns:person1> <ns:hasPhoneNumber> "123-4567" .
<ns:person2> <ns:hasPhoneNumber> "234-5678" .
<ns:person3> <ns:hasPhoneNumber> "345-6789" .
<ns:person4> <ns:hasPhoneNumber> "456-7890" .
<ns:person5> <ns:hasPhoneNumber> "567-8901" .
<ns:dog1> <ns:hasName> "fido" .
<ns:dog2> <ns:hasName> "rover" .
<ns:dog3> <ns:hasName> "spot" .
<ns:dog4> <ns:hasName> "spot" .

This data describes people and their dogs. Note that person3 and person4 share dog3, and that dog3 and dog4 are both named "spot".

Suppose I want to find out who has a dog named "spot", and what their phone numbers are. This is not possible with Tupelo 2's TripleMatcher operator, but it is possible to express in a declarative query language such as SPARQL:

SELECT ?person ?phoneNumber
WHERE {
  { ?person <ns:hasDog> ?dog . }
  { ?dog <ns:hasName> "spot" . }
  { ?person <ns:hasPhoneNumber> ?phoneNumber . }
}
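To make the meaning of this query concrete, here is a minimal plain-Java sketch that evaluates the same three patterns by hand over the sample data. It does not use any Tupelo classes: the class name SpotOwnersSketch, the method spotOwners, and the representation of triples as string arrays with shortened names are all assumptions made for illustration. It shows why one pattern at a time is not enough: the three patterns have to be joined on the shared variables ?dog and ?person.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a hand-written join, not how Tupelo evaluates queries.
class SpotOwnersSketch {
    // Each triple is {subject, predicate, object}; names are shortened for readability.
    static final List<String[]> DATA = Arrays.asList(
        new String[] {"person1", "hasDog", "dog1"},
        new String[] {"person2", "hasDog", "dog2"},
        new String[] {"person3", "hasDog", "dog3"},
        new String[] {"person4", "hasDog", "dog3"},
        new String[] {"person5", "hasDog", "dog4"},
        new String[] {"person1", "hasPhoneNumber", "123-4567"},
        new String[] {"person2", "hasPhoneNumber", "234-5678"},
        new String[] {"person3", "hasPhoneNumber", "345-6789"},
        new String[] {"person4", "hasPhoneNumber", "456-7890"},
        new String[] {"person5", "hasPhoneNumber", "567-8901"},
        new String[] {"dog1", "hasName", "fido"},
        new String[] {"dog2", "hasName", "rover"},
        new String[] {"dog3", "hasName", "spot"},
        new String[] {"dog4", "hasName", "spot"});

    // Returns one [person, phoneNumber] row per solution of the three patterns.
    static List<List<String>> spotOwners(List<String[]> data) {
        List<List<String>> rows = new ArrayList<>();
        for (String[] owns : data) {                  // ?person hasDog ?dog
            if (!owns[1].equals("hasDog")) continue;
            for (String[] name : data) {              // ?dog hasName "spot"
                boolean dogIsSpot = name[1].equals("hasName")
                        && name[0].equals(owns[2])    // join on ?dog
                        && name[2].equals("spot");
                if (!dogIsSpot) continue;
                for (String[] phone : data) {         // ?person hasPhoneNumber ?phoneNumber
                    if (phone[1].equals("hasPhoneNumber")
                            && phone[0].equals(owns[0])) {  // join on ?person
                        rows.add(Arrays.asList(owns[0], phone[2]));
                    }
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (List<String> row : spotOwners(DATA)) {
            System.out.println(row);
        }
    }
}
```

Running main prints one row each for person3, person4, and person5, the three owners of a dog named "spot". The nested loops are deliberately naive; a real query engine is free to choose a different evaluation order.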

Constructing a unifier query

Unifier provides an API to construct and execute conjunctive queries. For instance, the query shown above would be constructed in the following manner. First, for each node we want to match against a known value or identifier, we construct the appropriate Resource (here, a UriRef for each identifier and a Literal for the value "spot"):

UriRef hasDog = Resource.uriRef("http://ns#hasDog");
UriRef hasName = Resource.uriRef("http://ns#hasName");
UriRef hasPhoneNumber = Resource.uriRef("http://ns#hasPhoneNumber");
Literal spot = Resource.literal("spot");

Now we create a Unifier. We then configure the Unifier with a list of column names, which corresponds to the SELECT clause in SPARQL. Because these correspond to variable names in the query patterns, we can call them whatever we want:

Unifier unifier = new Unifier();
unifier.setColumnNames("person", "phoneNumber");

To construct the rest of the query, we add patterns to the Unifier with the addPattern method:

unifier.addPattern("person", hasDog, "dog");
unifier.addPattern("dog", hasName, spot);
unifier.addPattern("person", hasPhoneNumber, "phoneNumber");

Each of the three terms passed to the addPattern method is either a String, which is taken to be the name of a variable, or a Resource, which is taken as a value to match against.
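The variable-versus-value distinction can be sketched in plain Java as follows. This is not Tupelo's implementation, only an illustration of the idea under stated assumptions: the class PatternSketch, the Const wrapper (standing in for a Resource), and the methods matchTerm and solve are hypothetical names, and triples are plain string arrays. A String term is treated as a variable that is bound on first use and must agree on later uses; a Const term must equal the triple's term exactly.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: hypothetical names, not part of the Tupelo API.
class PatternSketch {
    // Stand-in for a Resource: a known value to match against.
    static final class Const {
        final String value;
        Const(String value) { this.value = value; }
    }

    // Matches one pattern term against one triple term.
    // A Const must be equal; a String variable is bound or checked against its binding.
    static boolean matchTerm(Object term, String actual, Map<String, String> bindings) {
        if (term instanceof Const) {
            return ((Const) term).value.equals(actual);
        }
        String variable = (String) term;
        String bound = bindings.get(variable);
        if (bound == null) {
            bindings.put(variable, actual);
            return true;
        }
        return bound.equals(actual);
    }

    // Collects every set of bindings that satisfies all patterns from index onward.
    static void solve(List<Object[]> patterns, int index, List<String[]> triples,
                      Map<String, String> bindings, List<Map<String, String>> out) {
        if (index == patterns.size()) {
            out.add(new HashMap<>(bindings));
            return;
        }
        Object[] pattern = patterns.get(index);
        for (String[] triple : triples) {
            // Work on a copy so a failed match leaves the caller's bindings untouched.
            Map<String, String> extended = new HashMap<>(bindings);
            if (matchTerm(pattern[0], triple[0], extended)
                    && matchTerm(pattern[1], triple[1], extended)
                    && matchTerm(pattern[2], triple[2], extended)) {
                solve(patterns, index + 1, triples, extended, out);
            }
        }
    }
}
```

Because a variable keeps its binding from one pattern to the next, using the same name in two patterns (such as "dog" above) is what joins them.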

Processing query results

The execution of a Unifier query produces a set of results; each result is a set of variable bindings that, taken together, match all the patterns. Unifier returns a Table of Resource in which each row represents a result and each column holds the value bound to the corresponding column name for that result.

Context someContext = ...
someContext.perform(unifier);
Table<Resource> results = unifier.getResult();

Now we can process the results. The following code

for(Tuple<Resource> row : results) {
    System.out.println(row);
}

will produce the following output:

[http://ns#person4,456-7890]
[http://ns#person5,567-8901]
[http://ns#person3,345-6789]

Note that, depending on the Context implementation, the results might be in a different order. A Table can be sorted by putting its rows into a SortedSet. Tuples sort by terms, in order. For instance, we could print the rows from the previous example in order like this:

TreeSet<Tuple<Resource>> sortedResults = new TreeSet<Tuple<Resource>>();
for(Tuple<Resource> row : results) {
    sortedResults.add(row);
}
for(Tuple<Resource> row : sortedResults) {
    System.out.println(row);
}
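Sorting "by terms, in order" can be pictured with the following plain-Java sketch, which uses lists of strings as a stand-in for Tuple<Resource>. The class TupleOrderSketch, the comparator BY_TERMS, and the method sorted are hypothetical names; the actual comparison is whatever Tuple implements. The sketch compares rows on the first term and moves to the next term only on a tie.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Illustration only: lists of strings stand in for Tuple<Resource>.
class TupleOrderSketch {
    // Compares rows term by term; the first differing term decides.
    // If one row is a prefix of the other, the shorter row sorts first.
    static final Comparator<List<String>> BY_TERMS = (a, b) -> {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.size(), b.size());
    };

    // Returns the rows in term-by-term order, using a TreeSet as in the example above.
    static List<List<String>> sorted(Collection<List<String>> rows) {
        TreeSet<List<String>> set = new TreeSet<>(BY_TERMS);
        set.addAll(rows);
        return new ArrayList<>(set);
    }
}
```

One consequence of using a TreeSet is that rows comparing as equal are kept only once, so identical result rows collapse into a single row.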