...

TripleMatcher tm = new TripleMatcher();
tm.setObject(Resource.literal("comment*"));
luceneContext.perform(tm);
//... etc.




This would find every triple in the context whose object is a string that starts with "comment". Specifying a subject and/or predicate filters the results further. Leaving the object unbound simply passes the matcher off to the backing context. Lucene does allow initial wildcard searches, such as "*fnord", but these are a performance concern: they are implemented (in Lucene) using various nested looping constructs and are therefore likely to be slow.
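To illustrate why a trailing wildcard is cheap and a leading one is not, here is a minimal sketch in plain Java. It is not Lucene's implementation; it assumes only that indexed terms are kept in sorted order, and the class and method names (WildcardSketch, prefixMatch, suffixMatch) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Illustration only: a sorted set of strings stands in for an index's term list.
class WildcardSketch {

    // Trailing wildcard ("comment*"): in sorted order all matches are contiguous,
    // so we can seek to the prefix and stop at the first term that does not match.
    static List<String> prefixMatch(SortedSet<String> terms, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            hits.add(term);
        }
        return hits;
    }

    // Leading wildcard ("*fnord"): sort order does not help, so every term
    // has to be examined.
    static List<String> suffixMatch(SortedSet<String> terms, String suffix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.endsWith(suffix)) {
                hits.add(term);
            }
        }
        return hits;
    }
}
```

The prefix search touches only the matching terms plus one, while the suffix search always walks the whole term list, which is the kind of cost to expect from "*fnord"-style queries.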

Matching and unification

A warning about matching vs. unification: when matching, both Lucene and the backing context are queried for possible matches. In unification, only the backing context is used.

Searching for dates

Dates are indexed using Lucene's DateTools, so every date ends up in the format

yyyyMMddHHmmss

This might cause some problems. For one thing, storing the full date (including time) means that date-only range queries can't work: they must include times. Normally one would issue something like [19970122 TO 19980215] to search for all documents between Jan. 22, 1997 and Feb. 15, 1998, but instead this should be done as [19970122000000 TO 19980215000000]. Lucene will not do a range query on something like [19970122* TO 19980215*].
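One way to avoid hand-writing the padded bounds is to build them from plain dates. The sketch below uses java.time rather than Lucene's DateTools, pads each date to midnight, and ignores time zones; DateRangeSketch, toIndexForm and rangeQuery are hypothetical names, not part of this library.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Illustration only: builds range bounds in the yyyyMMddHHmmss form described above.
class DateRangeSketch {

    private static final DateTimeFormatter INDEX_FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Pads a plain date with a midnight time so that it matches the indexed format.
    static String toIndexForm(LocalDate date) {
        return date.atStartOfDay().format(INDEX_FORMAT);
    }

    // Builds the bracketed range expression from two plain dates.
    static String rangeQuery(LocalDate from, LocalDate to) {
        return "[" + toIndexForm(from) + " TO " + toIndexForm(to) + "]";
    }

    public static void main(String[] args) {
        // Prints [19970122000000 TO 19980215000000]
        System.out.println(rangeQuery(LocalDate.of(1997, 1, 22), LocalDate.of(1998, 2, 15)));
    }
}
```

Because the upper bound is also padded to midnight, entries dated later on the final day fall outside the range; use the following day, or a 235959 time, if the whole last day should be included.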

Performance notes

Lucene, if not properly tweaked, is very, very slow at updating its indices. Straight out of the box it is optimized for simple testing rather than heavy use. Performance considerations also mean that it is not practical to use LuceneContext and JournalingContext together.