The costs and benefits of implementing transactions with the ACID properties, in the light of a semantic store:
Atomicity
Atomicity guarantees that a whole transaction is either fully done or not done at all. In addition, all additions and removals take effect at a single point in time. This property requires logging all actions and, when the system starts, rolling back every transaction that is not marked as completed. In a REST scenario this requires giving the client a ticket URI that can be used to query the state of a transaction. Just returning the transaction result in the response doesn't work: the system may be interrupted between the completion of the transaction and the sending of the response, leaving the client with no way to find out the status of the transaction. Sending the response before the transaction is marked as completed is not an option either, as the system might be interrupted between sending the response and logging the transaction as completed, which would cause the transaction to be rolled back when the system resumes.
Once the platform supports time-versioning, the ability to execute additions and deletions at a single point in time will be essential.
For now, atomicity must be provided by the storage implementations at the triple level: a triple is either in the store or not. For applications designed against the semantic web paradigm of the open world assumption and monotonicity this is sufficient: every added triple describes a (true) fact about the world, and more triples may further reduce the possible worlds in which the graph is true, but in no case must multiple triples be added as a whole for the resulting graph not to be false. Applications accessing the graph must be aware that the absence of a triple says nothing about its truth value; they must provide their functionality within the limits of the known facts without being confused by missing data.
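The log, the start-up rollback and the ticket lookup described above could be sketched roughly as follows. This is a minimal single-threaded, in-memory illustration, not a proposal for the actual store: the class name TicketedTransactionLog, the ticket path /transactions/N and the representation of triples as plain strings are all assumptions made for the example, and a real implementation would write each log entry to durable storage before applying it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory sketch; a real store would persist the log. */
final class TicketedTransactionLog {
    enum Status { PENDING, COMPLETED, ROLLED_BACK }

    /** One logged action: the addition or removal of a triple (here just a string). */
    static final class Action {
        final boolean add;
        final String triple;
        Action(boolean add, String triple) { this.add = add; this.triple = triple; }
    }

    private final Set<String> triples = new HashSet<>();
    private final Map<String, Status> statusByTicket = new HashMap<>();
    private final Map<String, List<Action>> appliedByTicket = new HashMap<>();
    private int nextId = 1;

    /** Starts a transaction and returns the ticket the client can poll later. */
    String begin() {
        String ticket = "/transactions/" + nextId++; // assumed URI layout
        statusByTicket.put(ticket, Status.PENDING);
        appliedByTicket.put(ticket, new ArrayList<>());
        return ticket;
    }

    void add(String ticket, String triple) {
        // Log only actions that actually change the store, so the undo is exact.
        if (triples.add(triple)) appliedByTicket.get(ticket).add(new Action(true, triple));
    }

    void remove(String ticket, String triple) {
        if (triples.remove(triple)) appliedByTicket.get(ticket).add(new Action(false, triple));
    }

    /** Marks the transaction as completed; only then is it safe from rollback. */
    void complete(String ticket) { statusByTicket.put(ticket, Status.COMPLETED); }

    /** Run at start-up: undo, in reverse order, every transaction not marked as completed. */
    void recover() {
        for (Map.Entry<String, Status> e : statusByTicket.entrySet()) {
            if (e.getValue() != Status.PENDING) continue;
            List<Action> actions = appliedByTicket.get(e.getKey());
            for (int i = actions.size() - 1; i >= 0; i--) {
                Action a = actions.get(i);
                if (a.add) triples.remove(a.triple); else triples.add(a.triple);
            }
            e.setValue(Status.ROLLED_BACK);
        }
    }

    /** What a GET on the ticket URI would report. */
    Status status(String ticket) { return statusByTicket.get(ticket); }

    boolean contains(String triple) { return triples.contains(triple); }
}
```

A client that lost the response would query status(ticket) after the system is back up: COMPLETED means the changes are in the store, ROLLED_BACK means the transaction has to be submitted again.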
Consistency
A transaction is required to leave the database in a consistent state. RDBMSs have defined states of inconsistency; this is not the case for RDF/OWL. In some cases a graph might be contradictory (i.e. it evaluates to false in every possible world), but there's nothing illegal or inconsistent about such a graph.
Isolation
A transaction is executed as if in single-user mode, that is, a transaction must be protected from interference from other transactions. This can be trivially achieved by executing all transactions sequentially. To allow better performance, the system might however support different types of locking, allowing the locking to be less restrictive while maintaining this property by rolling back the transaction in the event of a concurrent change that affects the results of the queries of the transaction being isolated (and thus possibly affects the modifications the transaction makes). Within a transaction, queries should see the results of previous modifications made within the transaction, but no modifications made outside the transaction. Implementing this in an efficient fashion (i.e. without exclusive sequential execution) requires the store to at least temporarily expose multiple versions of the graph.
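One way to picture the "roll back on a concurrent change that affects the queries" approach is the sketch below. It is only an illustration under simplifying assumptions: triples are plain strings, the only query is a containment check, each transaction works on a full copy of the graph (the "multiple versions"), and the names SnapshotGraph and Tx are invented for this example. A transaction that fails validation is simply discarded, which plays the role of the rollback.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of snapshot reads with validation of the read set at commit. */
final class SnapshotGraph {
    private final Set<String> committed = new HashSet<>();

    final class Tx {
        private final Set<String> snapshot;  // graph version seen by this transaction
        private final Map<String, Boolean> reads = new HashMap<>(); // queried triple -> answer in the snapshot
        private final Set<String> added = new HashSet<>();
        private final Set<String> removed = new HashSet<>();

        private Tx(Set<String> snapshot) { this.snapshot = snapshot; }

        /** Sees the transaction's own modifications, but nothing committed after begin(). */
        boolean contains(String triple) {
            reads.put(triple, snapshot.contains(triple));
            if (added.contains(triple)) return true;
            if (removed.contains(triple)) return false;
            return snapshot.contains(triple);
        }

        void add(String triple) { removed.remove(triple); added.add(triple); }

        void remove(String triple) { added.remove(triple); removed.add(triple); }

        /** Returns false if the transaction had to be abandoned. */
        boolean commit() { return SnapshotGraph.this.commit(this); }
    }

    synchronized Tx begin() { return new Tx(new HashSet<>(committed)); }

    private synchronized boolean commit(Tx tx) {
        // Abandon the transaction if any query it made would now give a different answer.
        for (Map.Entry<String, Boolean> read : tx.reads.entrySet()) {
            if (committed.contains(read.getKey()) != read.getValue()) return false;
        }
        committed.removeAll(tx.removed);
        committed.addAll(tx.added);
        return true;
    }

    synchronized boolean contains(String triple) { return committed.contains(triple); }
}
```

If two transactions both check that a triple is absent and then add it, the first commit succeeds and the second returns false, so the caller can retry against the new state. Copying the whole graph per transaction is obviously not efficient; it only stands in for whatever versioning the store would really provide.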
Without providing full transaction/isolation support, SCB could provide a lock for its mutable graphs as well as one for the TcManager. It would have to be defined whether read and write operations automatically request such a lock on invocation of the respective methods or whether this is left to the application. In either case the application could hold a lock for a scope bigger than a single method call on the TcManager, allowing things like "check if a user with that username already exists, if not create one" to be executed within a write-locked section.
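The "check if a user exists, if not create one" case could look roughly like this with a standard java.util.concurrent.locks.ReentrantReadWriteLock. The classes LockableGraph and UserRegistry, the getLock() accessor and the string form of the triple are assumptions for the sake of the example, not existing SCB API. The sketch takes the variant in which the graph's own methods acquire the lock automatically and the application additionally holds the write lock around the larger section.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical mutable graph that exposes its lock to applications. */
final class LockableGraph {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Set<String> triples = new HashSet<>();

    ReadWriteLock getLock() { return lock; }

    /** Single read operation; acquires the read lock by itself. */
    boolean contains(String triple) {
        lock.readLock().lock();
        try { return triples.contains(triple); } finally { lock.readLock().unlock(); }
    }

    /** Single write operation; acquires the write lock by itself. */
    boolean add(String triple) {
        lock.writeLock().lock();
        try { return triples.add(triple); } finally { lock.writeLock().unlock(); }
    }
}

/** Application code holding the write lock for more than one method call. */
final class UserRegistry {
    private final LockableGraph graph;

    UserRegistry(LockableGraph graph) { this.graph = graph; }

    /** Returns true if the user was created, false if the username was already taken. */
    boolean createUserIfAbsent(String username) {
        String triple = "<user/" + username + "> rdf:type :User"; // illustrative triple
        graph.getLock().writeLock().lock();
        try {
            // The lock is reentrant, so the nested read and write acquisitions succeed.
            if (graph.contains(triple)) return false;
            graph.add(triple);
            return true;
        } finally {
            graph.getLock().writeLock().unlock();
        }
    }
}
```

Because no other thread can write between the check and the add, two concurrent requests for the same username cannot both succeed. Note that this only serializes access; it does not undo a half-finished section if the application fails in the middle.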
Durability
The result of a transaction is persistent. This is guaranteed with persistent stores, but obviously not with in-memory stores.
JTA
Java Transaction API (JTA) specifies standard Java interfaces between a transaction manager and the parties involved in a distributed transaction system.
While JTA seems to be the most relevant standard for transactions on the Java platform, to access the specification linked at http://java.sun.com/javaee/technologies/jta/index.jsp one must agree to a software license agreement. This agreement says in the first section: "[...] developing applications intended to run on an implementation of the Specification, provided that such applications do not themselves implement any portion(s) of the Specification". This scared me away at first, but on re-reading I see that the license goes on with a second section saying: "Sun also grants you a [...] license [...] to create and/or distribute an Independent Implementation of the Specification that: (a) fully implements the Specification including all its required interfaces and functionality; (b) [...] (c) passes the Technology Compatibility Kit [...]". If we can get the TCK under an open source license, using JTA might be an option.
Comments (2)
Jun 12, 2009
Tobias Hofer says:
Atomicity and consistency
A transaction is a transformation of state which has the properties of atomicity (all or nothing), durability (effects survive failures) and consistency (a correct transformation). The transaction concept is key to the structuring of data management applications.
A system state consists of records with changeable values and includes assertions about the values of records and about the allowed transformations of the values. These assertions are called the system consistency constraints.
The system provides actions which read and transform the values of records. A collection of actions which comprise a consistent transformation of the state may be grouped to form a transaction. Transactions preserve the system consistency constraints: they obey the laws by transforming consistent states into new consistent states.
The open world assumption and the rules of monotonicity are based upon the constraint that every known statement is true. One cannot trust information of a source that is not able to follow that constraint.
I strongly disagree with your conclusion that atomicity at the statement level is sufficient to fulfil the above requirement.
Imagine a graph containing the following statements:
:Person_1 foaf:name "Tobias S. Hofer"
    foaf:firstName "Tobias"
    foaf:surname "Hofer"
Next, a user wants to change it to:
:Person_1 foaf:name "Reto Bachmann-Gmür"
    foaf:firstName "Reto"
    foaf:surname "Bachmann-Gmür"
The graph will evidently be false if the change fails (for whatever reason) after the first updated statement and the application cannot guarantee an atomic transformation.
Durability
Durability also refers to the ability of the system to recover committed transaction updates if either the system or the storage media fails. Some features to consider for durability are:
A system must allow its data to be backed up while it is running, without a noticeable reduction in its operability.
Aug 18, 2009
Reto Bachmann says:
Atomicity and consistency
The example seems to be based on the concept of URI recycling. But URIs have a fixed transtemporal identity. Unless a person has two names, the two graphs cannot be true at the same time, so if I change my identity and the previous one becomes false, all 3 statements are false. The graph will only be true once all 3 statements have been changed; before that the graph will always be false, independently of whether the change happened atomically or not.
Durability
The triple-store implementation should guarantee this at the triple level. I agree that there are some cases in which this is not enough, i.e. currently some integrity checking might be needed when the triple store comes up again.