Introduction
The classes provided by org.apache.clerezza.rdf.enrichment allow the generation of new triples based on a TripleCollection. These triples are created dynamically when the filter() method is invoked. The returned Iterator<Triple> contains additional triples contributed by implementations of org.apache.clerezza.rdf.enrichment.Enricher, but these additional triples are not stored in the underlying TripleCollection.
Example use cases are:
- resources with a common eg:parent should be related by eg:sibling
- for every foaf:Person there shall be a foaf:PersonalProfileDocument
- every non-literal resource shall be the subject of an rdf:type rdfs:Resource statement
- every eg:City shall have an eg:currentWeather property pointing to a bnode with temperature and humidity
Enriching a TripleCollection
To enrich a TripleCollection, wrap it in org.apache.clerezza.rdf.enrichment.EnrichmentTriples. The EnrichmentTriples constructor also takes a Collection of org.apache.clerezza.rdf.enrichment.Enricher as an argument. These Enrichers are invoked when EnrichmentTriples.filter() is called and generate the additional triples, which are combined into the returned Iterator<Triple>.
    // instantiate the graph that will be enriched
    MGraph underlyingGraph = new SimpleMGraph();

    // prepare the enrichers which will be given as argument to the
    // EnrichmentTriples constructor
    Set<Enricher> enrichers = new HashSet<Enricher>();
    enrichers.add(new ExampleEnricher());

    // enrich the underlyingGraph with the enrichers
    MGraph enrichedGraph = new EnrichmentTriples(underlyingGraph, enrichers);
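To illustrate the idea that enrichment triples appear in filter() results without being stored, here is a minimal stand-alone sketch. It does not use the Clerezza classes: triples are modelled as plain string lists, and the names EnrichmentSketch, triple, siblingTriples and the eg: identifiers are assumptions made for illustration only. The derived triples follow the eg:sibling use case from the introduction.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Stand-alone model; none of these types are Clerezza classes.
class EnrichmentSketch {

    // A triple is modelled as an immutable [subject, predicate, object] list.
    static List<String> triple(String s, String p, String o) {
        return List.of(s, p, o);
    }

    // Hypothetical enricher: derives eg:sibling triples for resources
    // that share the same eg:parent.
    static Set<List<String>> siblingTriples(Set<List<String>> base) {
        Set<List<String>> derived = new LinkedHashSet<>();
        for (List<String> a : base) {
            for (List<String> b : base) {
                boolean bothParent = a.get(1).equals("eg:parent")
                        && b.get(1).equals("eg:parent");
                boolean sameParent = a.get(2).equals(b.get(2));
                boolean different = !a.get(0).equals(b.get(0));
                if (bothParent && sameParent && different) {
                    derived.add(triple(a.get(0), "eg:sibling", b.get(0)));
                }
            }
        }
        return derived;
    }

    // Models filter(): stored plus derived triples, matched against a pattern
    // in which null acts as a wildcard. The base set is never modified.
    static List<List<String>> filter(Set<List<String>> base,
                                     String s, String p, String o) {
        Set<List<String>> all = new LinkedHashSet<>(base);
        all.addAll(siblingTriples(base));
        List<List<String>> result = new ArrayList<>();
        for (List<String> t : all) {
            if ((s == null || s.equals(t.get(0)))
                    && (p == null || p.equals(t.get(1)))
                    && (o == null || o.equals(t.get(2)))) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<List<String>> base = new LinkedHashSet<>();
        base.add(triple("eg:alice", "eg:parent", "eg:carol"));
        base.add(triple("eg:bob", "eg:parent", "eg:carol"));

        // prints [[eg:alice, eg:sibling, eg:bob]]
        System.out.println(filter(base, "eg:alice", "eg:sibling", null));
        // prints 2: the derived triples were not added to the base set
        System.out.println(base.size());
    }
}
```

The sketch recomputes the derived triples on every call, which mirrors the "dynamically created" behaviour described above; whether a real Enricher caches its results is up to the implementation.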
org.apache.clerezza.rdf.enrichment.Enricher
Enricher is an abstract class which has two abstract methods:
| Method | Description |
|---|---|
| filter() | Returns the iterator containing the additional triples. As arguments it takes the TripleCollection which has to be enriched and the filter arguments subject, predicate and object used for the EnrichmentTriples.filter() call. |
| providedTriplesCount() | Takes the to-be-enriched TripleCollection as argument and returns the number of enrichment triples for this specified TripleCollection. This is the total number of distinct triples which would be returned by the Enricher.filter()-method with the specified TripleCollection as one argument and all possible filter arguments as the others. |
To prevent unnecessary Enricher.filter() calls, each filter argument is tested by a ResourceFilter returned by one of the corresponding methods:
- Enricher.getSubjectFilter()
- Enricher.getPredicateFilter()
- Enricher.getObjectFilter()
Each of these methods returns an implementation of org.apache.clerezza.rdf.enrichment.Enricher.ResourceFilter. A ResourceFilter has an accept()-method which takes the filter argument and the to-be-enriched TripleCollection as arguments. The accept()-method returns true if the filter argument passed, false otherwise. The Enricher.filter()-method is only called if all three filters returned by the above methods pass. Filter arguments that are null always pass, independently of the returned ResourceFilter. The ResourceFilters returned by the default implementation accept all resources, therefore these methods have to be overridden to return ResourceFilters with other behaviors, if pre-filtering should be done.
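The pre-filtering rule described above can be sketched as follows. This is a simplified stand-alone model, not the Clerezza API: resources are plain strings, java.util.function.Predicate stands in for ResourceFilter (whose real accept() method also receives the TripleCollection), and the names PreFilterSketch, passes and shouldInvoke are hypothetical.

```java
import java.util.function.Predicate;

// Stand-alone model of the pre-filtering rule; not the Clerezza API.
class PreFilterSketch {

    // A null filter argument always passes; otherwise the filter decides.
    static boolean passes(Predicate<String> filter, String argument) {
        return argument == null || filter.test(argument);
    }

    // The enricher's filter() would only be invoked if all three arguments pass.
    static boolean shouldInvoke(Predicate<String> subjectFilter,
                                Predicate<String> predicateFilter,
                                Predicate<String> objectFilter,
                                String s, String p, String o) {
        return passes(subjectFilter, s)
                && passes(predicateFilter, p)
                && passes(objectFilter, o);
    }

    public static void main(String[] args) {
        // Mirrors the default filters, which accept every resource.
        Predicate<String> acceptAll = r -> true;
        // Hypothetical predicate filter that only lets eg:sibling through.
        Predicate<String> onlySibling = "eg:sibling"::equals;

        // prints true: the predicate is accepted, the null object always passes
        System.out.println(shouldInvoke(acceptAll, onlySibling, acceptAll,
                "eg:alice", "eg:sibling", null));
        // prints false: the predicate filter rejects eg:parent
        System.out.println(shouldInvoke(acceptAll, onlySibling, acceptAll,
                "eg:alice", "eg:parent", null));
        // prints true: all arguments are null
        System.out.println(shouldInvoke(acceptAll, onlySibling, acceptAll,
                null, null, null));
    }
}
```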
Enricher has several static methods that return ResourceFilters performing specific kinds of filtering. These methods are:
| Method | Description of the returned ResourceFilter |
|---|---|
| getDataTypeFilter(UriRef dataType) | A filter accepting only typed literals of the specified dataType |
| getExtensionalFilter(Resource[] resources) | A filter that matches only if the resource is one of the specified resources |
| getFilterForSubjectsWith(UriRef predicate, Resource object) | A filter accepting all resources that are the subject of a statement with the specified predicate and object |
| getFilterForSubjectsWithProperty(UriRef predicate) | A filter accepting all resources that are the subject of a statement with the specified predicate |
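As a rough illustration of what the two subject-based filters in the table decide, the following stand-alone sketch models them over triples represented as string lists. It is not the Clerezza implementation: the class SubjectFilterSketch, the methods subjectsWith and subjectsWithProperty, and the use of BiPredicate in place of ResourceFilter are assumptions made for this example.

```java
import java.util.List;
import java.util.Set;
import java.util.function.BiPredicate;

// Stand-alone model; triples are [subject, predicate, object] string lists.
class SubjectFilterSketch {

    // Accepts a resource if the collection holds the triple
    // (resource, predicate, object). Expects a non-null resource.
    static BiPredicate<String, Set<List<String>>> subjectsWith(
            String predicate, String object) {
        return (resource, triples) ->
                triples.contains(List.of(resource, predicate, object));
    }

    // Accepts a resource if it is the subject of any triple with the predicate.
    static BiPredicate<String, Set<List<String>>> subjectsWithProperty(
            String predicate) {
        return (resource, triples) -> triples.stream()
                .anyMatch(t -> t.get(0).equals(resource)
                        && t.get(1).equals(predicate));
    }

    public static void main(String[] args) {
        Set<List<String>> triples = Set.of(
                List.of("eg:alice", "rdf:type", "foaf:Person"),
                List.of("eg:berlin", "rdf:type", "eg:City"));

        BiPredicate<String, Set<List<String>>> isCity =
                subjectsWith("rdf:type", "eg:City");
        System.out.println(isCity.test("eg:berlin", triples)); // prints true
        System.out.println(isCity.test("eg:alice", triples));  // prints false

        BiPredicate<String, Set<List<String>>> hasType =
                subjectsWithProperty("rdf:type");
        System.out.println(hasType.test("eg:alice", triples)); // prints true
    }
}
```

Both filters need the to-be-enriched collection to make their decision, which is presumably why accept() receives the TripleCollection alongside the filter argument.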
The org.apache.clerezza.rdf.enrichment.OrConnector also extends ResourceFilter. It takes an array of ResourceFilters as its argument and accepts a Resource if and only if at least one of its underlying base filters accepts it.
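The "accepts if any base filter accepts" behaviour can be sketched in a few lines. This is a stand-alone model under stated assumptions, not the OrConnector source: Predicate<String> stands in for ResourceFilter, a List replaces the array argument, and the names OrConnectorSketch and or are hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;

// Stand-alone model of an OR combination of filters; not the Clerezza class.
class OrConnectorSketch {

    // Returns a filter that accepts a resource if at least one base filter does.
    static Predicate<String> or(List<Predicate<String>> baseFilters) {
        return resource -> baseFilters.stream().anyMatch(f -> f.test(resource));
    }

    public static void main(String[] args) {
        Predicate<String> isAlice = "eg:alice"::equals;
        Predicate<String> isBob = "eg:bob"::equals;
        Predicate<String> either = or(List.of(isAlice, isBob));

        System.out.println(either.test("eg:alice")); // prints true
        System.out.println(either.test("eg:bob"));   // prints true
        System.out.println(either.test("eg:carol")); // prints false
    }
}
```

In this model an empty list of base filters accepts nothing, since no filter can match; how the real OrConnector treats an empty array is not stated here.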
org.clerezza.utils.rdf.enrichment.VGraph
VGraph is a subclass of Enricher. Its purpose is to make implementing enrichments easier. The VGraph constructor receives two ResourceFilters as arguments: the first is a filter for subjects, the second a filter for objects. VGraph is itself an abstract class with a single abstract method, resourceAccessEvent(). This method is called when either the subject or the object argument used for EnrichmentTriples.filter() was accepted by the respective ResourceFilter specified in the VGraph constructor.
The resourceAccessEvent()-method receives the accepted resource used in the filter()-call as an argument. An implementor then can build a graph around this resource. This is done by adding triples to the VGraph (which also implements the MGraph-interface). The graph built around this resource is then used for the current and future filter calls for enrichment.
In the following example an RDF list is maintained for each resource of rdf:type observedClass. The current date is appended to the list whenever the resource is the subject in the filter statement. The list is emptied as soon as the resource has been accessed more than 10 times.
At the moment only UriRef resources are supported for enrichment by VGraph.
    public class VGraphExample extends VGraph {

        public static final UriRef observedClass =
                new UriRef("http://localhost:8080/example#Observed");

        public VGraphExample() {
            // The VGraph constructor takes two ResourceFilters as arguments.
            // The first filters subjects and the second objects.
            super(getFilterForSubjectsWith(RDF.type, observedClass), null);
        }

        @Override
        public void resourceAccessEvent(NonLiteral resource, TripleCollection tc) {
            addDateToList(resource);
        }

        private void addDateToList(NonLiteral resource) {
            // Instantiate with the VGraph as the TripleCollection
            RdfList list = new RdfList(resource, this);
            if (list.size() > 10) {
                list.clear();
            }
            list.add(LiteralFactory.getInstance().createTypedLiteral(new Date()));
        }
    }
org.apache.clerezza.platform.rdf.enrichment.EnrichmentTcProvider
The EnrichmentTcProvider is an implementation of WeightedTcProvider. It provides a read-only MGraph with the name http://zz.localhost/enrichment.graph containing the enrichments provided by all available OSGi services which implement the Enricher interface (i.e. @Service(Enricher.class)). The enrichment is done on the http://tpf.localhost/content.graph and the enrichment graph http://zz.localhost/enrichment.graph is also a content graph addition.
Note that, for performance reasons, EnrichmentTcProvider gets the http://tpf.localhost/content.graph from TcManager and does not perform an access control check on every request. To prevent unauthorized users from deducing the content of the content graph, the permissions required on the enrichment graph are set to those required on the content graph.
Remarks: There are some examples of usage of Enricher and VGraph available in the bundle org.clerezza.examples.