Here is the content I'd like to suggest for addition to the AQL spec. The spec becomes a bit repetitive if this replaces 3.10.2, but I need to explain various things to expand the argument, so I can live with it. Comments are welcome.
3.10.2 FROM
The FROM clause defines the scope of the query in terms of the reference model (RM) types of the data to be retrieved, along with additional constraints that further narrow down the matching data instances. These can be constraints on attributes of the RM type as well as structural constraints on data elements. A structural constraint can be either a logical relationship or direct containment, where one data instance contains another as a nested attribute of an object. Both types of structural constraint are expressed with the same CONTAINS keyword.
All other clauses in the AQL query reference the data instances defined in the FROM clause by their aliases, and either express further constraints on them or define new data instances using relative paths based on these aliases.
The simplest example of a FROM clause is one in which only a single RM type is declared:
SELECT .... FROM EHR e ....
The e alias refers to all data instances of reference model type EHR, and it is required in order to refer to this set of data from other clauses, namely SELECT and WHERE. This alias can be used directly, such as:
SELECT e FROM EHR e... (select all EHR instances)
or as the root of a relative path, which allows the query to express constraints or select data items accessible from the root of the relative path, as in:
SELECT e FROM EHR e WHERE e/ehr_id/value='some_ehr_id' (select all EHR instances that have an ehr_id with value 'some_ehr_id')
In accordance with the XPath-like constraint syntax of AQL, the FROM clause can introduce attribute constraints on the data instances it defines, as in:
SELECT ... FROM EHR e[ehr_id/value='some_ehr_id']
Note that the example above is semantically equivalent to expressing the same constraint in the WHERE clause. The attribute constraint in the FROM clause is usually preferred.
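The equivalence noted above can be illustrated outside AQL. The following is a minimal Python sketch, not an AQL engine: EHRs are modelled as plain dictionaries purely for illustration, and the helper names `get_path`, `from_with_predicate` and `from_then_where` are hypothetical stand-ins for the two query forms.

```python
# Toy model: each EHR is a dict. This is an illustrative assumption,
# not the openEHR reference model.
EHRS = [
    {"ehr_id": {"value": "some_ehr_id"}},
    {"ehr_id": {"value": "another_ehr_id"}},
]


def get_path(obj, path):
    """Resolve a slash-separated relative path such as 'ehr_id/value'."""
    for part in path.split("/"):
        obj = obj[part]
    return obj


def from_with_predicate(ehrs, path, value):
    """Mimics FROM EHR e[path=value]: the constraint is applied while binding e."""
    return [e for e in ehrs if get_path(e, path) == value]


def from_then_where(ehrs, path, value):
    """Mimics FROM EHR e WHERE e/path=value: bind every EHR, then filter."""
    bound = list(ehrs)
    return [e for e in bound if get_path(e, path) == value]


a = from_with_predicate(EHRS, "ehr_id/value", "some_ehr_id")
b = from_then_where(EHRS, "ehr_id/value", "some_ehr_id")
print(a == b)
```

Both helpers return the same set of instances; they differ only in where the constraint is stated, which mirrors the choice between the FROM predicate and the WHERE clause.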
As stated above, the FROM clause also allows AQL to define structural constraints. This feature is supported via the CONTAINS keyword, which expresses a containment relationship that can be either logical or direct data instance containment. The semantics of the CONTAINS keyword are therefore overloaded for multiple types of relationship. An example FROM clause that shows both types of relationship would be:
SELECT ... FROM EHR e[ehr_id/value='some_ehr_id'] CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.report.v1] CONTAINS CLUSTER cls[at0018]
From a reference model point of view, an EHR instance does not directly contain COMPOSITION instances. Instead, it holds references to them in its compositions attribute. The CONTAINS keyword that establishes a structural constraint between EHR instances and COMPOSITION instances therefore refers to the COMPOSITION instances reachable from an EHR instance by resolving the values of that attribute, which makes this a logical structural constraint.
The second use of the CONTAINS keyword in the query above establishes a structural constraint in which an instance of the COMPOSITION reference model type actually contains an instance of the CLUSTER reference model type. Both instances must also individually satisfy the attribute constraints on their archetype node id, as defined on their aliases, c and cls. This structural constraint demonstrates the direct data containment semantics of the CONTAINS keyword, similar to the concept of composition in object-oriented languages. Composition differs from aggregation, which is what the EHR type employs to refer to COMPOSITIONs. AQL implementations handle these different semantics internally, so that the CONTAINS keyword can be used seamlessly to express constraints on the data to be retrieved.
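The two meanings of CONTAINS described above can be sketched in Python. This is a toy illustration under stated assumptions, not how any AQL implementation works: the dictionary layout, the `COMPOSITION_STORE` lookup table, the intermediate SECTION node, and the helpers `children_of` and `contains` are all hypothetical, and the sketch assumes that CONTAINS matches descendants at any depth.

```python
# Toy data, assumed for illustration only; not the openEHR reference model.
COMPOSITION_STORE = {
    "comp-1": {
        "type": "COMPOSITION",
        "node_id": "openEHR-EHR-COMPOSITION.report.v1",
        "children": [
            {"type": "SECTION", "node_id": "at0001", "children": [
                {"type": "CLUSTER", "node_id": "at0018", "children": []},
            ]},
        ],
    },
}

EHR = {
    "type": "EHR",
    "ehr_id": "some_ehr_id",
    # The EHR holds references (ids), not the COMPOSITION objects themselves.
    "compositions": ["comp-1"],
}


def children_of(node):
    """Return contained nodes, hiding the reference vs. direct distinction."""
    if node["type"] == "EHR":
        # Logical containment: resolve each reference through the store.
        return [COMPOSITION_STORE[ref] for ref in node["compositions"]]
    # Direct containment: nested objects are held in place.
    return node.get("children", [])


def contains(node, rm_type, node_id):
    """Yield every descendant of node with the given RM type and node id."""
    for child in children_of(node):
        if child["type"] == rm_type and child["node_id"] == node_id:
            yield child
        yield from contains(child, rm_type, node_id)


compositions = list(
    contains(EHR, "COMPOSITION", "openEHR-EHR-COMPOSITION.report.v1")
)
clusters = [cl for c in compositions for cl in contains(c, "CLUSTER", "at0018")]
print(len(compositions), len(clusters))
```

The caller of `contains` never sees whether a step resolved a reference or walked a nested object; only `children_of` knows the difference, which is the kind of internal handling the text attributes to AQL implementations.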
The CONTAINS keyword is complemented by logical operators such as AND and OR to express structural constraints that go beyond a single 'path' in the EHR. The details of these are provided below.
An important point regarding the use of complex structural constraints in the FROM clause is that the FROM clause always has a single root declaration, and all CONTAINS keywords and logical operators describe constraints relative to this single root.
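To make the single-root idea concrete, here is a minimal Python sketch of a rooted pattern with AND and OR evaluated against a small data tree. It is an illustration only: the classes `Node`, `Pattern`, `And` and `Or` and the functions `satisfies` and `matches` are hypothetical names, attribute constraints are omitted, and CONTAINS is assumed to match descendants at any depth.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A data instance identified only by its RM type (toy model)."""
    rm_type: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Pattern:
    """An RM type plus an optional CONTAINS constraint (Pattern, And, or Or)."""
    rm_type: str
    contains: object = None


@dataclass
class And:
    operands: list


@dataclass
class Or:
    operands: list


def descendants(node):
    """Yield all nodes below node, at any depth."""
    for child in node.children:
        yield child
        yield from descendants(child)


def satisfies(node, constraint):
    """True if the descendants of node satisfy the CONTAINS constraint."""
    if constraint is None:
        return True
    if isinstance(constraint, And):
        return all(satisfies(node, c) for c in constraint.operands)
    if isinstance(constraint, Or):
        return any(satisfies(node, c) for c in constraint.operands)
    # A plain Pattern: some descendant must match it.
    return any(matches(d, constraint) for d in descendants(node))


def matches(node, pattern):
    """True if node has the pattern's type and satisfies its constraint."""
    return node.rm_type == pattern.rm_type and satisfies(node, pattern.contains)


ehr = Node("EHR", [
    Node("COMPOSITION", [Node("OBSERVATION"), Node("EVALUATION")]),
    Node("COMPOSITION", [Node("OBSERVATION")]),
])

# EHR CONTAINS COMPOSITION CONTAINS (OBSERVATION AND EVALUATION)
both = Pattern("EHR", Pattern("COMPOSITION",
               And([Pattern("OBSERVATION"), Pattern("EVALUATION")])))
# EHR CONTAINS COMPOSITION CONTAINS (OBSERVATION AND INSTRUCTION)
and_missing = Pattern("EHR", Pattern("COMPOSITION",
                      And([Pattern("OBSERVATION"), Pattern("INSTRUCTION")])))
# EHR CONTAINS COMPOSITION CONTAINS (OBSERVATION OR INSTRUCTION)
or_missing = Pattern("EHR", Pattern("COMPOSITION",
                     Or([Pattern("OBSERVATION"), Pattern("INSTRUCTION")])))

print(matches(ehr, both), matches(ehr, and_missing), matches(ehr, or_missing))
```

Every pattern here starts from one `Pattern("EHR", ...)` root, and the logical operators only ever branch beneath it, which is the shape the paragraph above describes.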
The data items defined in the FROM clause can be of any RM type; however, implementations of AQL usually support only a subset of reference model types. This is usually done to ensure query performance and to reflect real-life data access patterns.
To clarify, a query such as
SELECT d FROM EHR e CONTAINS DV_QUANTITY d
is perfectly valid from a syntactic point of view, but its result, namely every data instance of quantity type across all data contained in all EHRs, is of little practical use in real life.
Based on the syntax and intended functionality defined above, more formal semantics of the FROM clause can potentially be represented with various existing formalisms, probably via extensions. One such formalism, Tree Pattern Queries, is discussed in detail with regard to its use as a formalism for AQL in (Arikan 2016).
This particular formalism, presumably one of many that could be used, defines the semantics of the FROM clause based on a single rooted tree pattern, as introduced by the Tree Pattern Query (TPQ). This representation, with potential extensions borrowed from labelled property graphs (Green et al. 2018), can encode the constraints defined in the FROM clause. Other relevant formalisms are discussed in the original work that led to the creation of AQL (Ma et al. 2007).
REFERENCES:
Arikan, S. S. 2016. ‘An Experimental Study and Evaluation of a New Architecture for Clinical Decision Support - Integrating the OpenEHR Specifications for the Electronic Health Record with Bayesian Networks’. Doctoral, UCL (University College London).
Green, Alastair, Martin Junghanns, Max Kießling, Tobias Lindaaker, Stefan Plantikow, and Petra Selmer. 2018. ‘OpenCypher: New Directions in Property Graph Querying.’ In EDBT, 520–23.
Ma, Chunlan, Heath Frankel, Thomas Beale, and Sam Heard. 2007. ‘EHR Query Language (EQL)-a Query Language for Archetype-Based Health Records’. Medinfo 129: 397–401.