Specifying and implementing AQL CONTAINS

In the interests of clarifying what became a far more contentious issue than I would have ever imagined, let me describe briefly why I thought we should retain AQL’s independence at the spec level from any particular model.

Let’s say we have two models, openEHR RM and Acme RM, a model of some company structures. To let an AQL processor know where the logical CONTAINment relations are, the processor needs some model information. It could interrogate a meta-model, but let’s say we don’t want to provide a whole meta-model (although, we already have it, implemented and in use…), so we provide a simple graph of CONTAINS relations, like so:

// openEHR RM CONTAINS relations
source         target
---------      --------

// ACME IM CONTAINS relations
source         target
---------      --------
ASSET          ITEM

Now, to check an AQL query, the processor just needs to look up the above table for whichever info model is at hand. Clearly, further simple info can be added, e.g. cardinality, ref type etc.:

// openEHR RM CONTAINS relations
source         target       cardinality       ref_type
---------      --------     -----------       --------
EHR            COMPOSITION  *                 IND
COMPOSITION    SECTION      *                 DIR

// ACME IM CONTAINS relations
source         target       cardinality       ref_type
---------      --------     -----------       --------
COMPANY        ORG_UNIT     1                 IND
ORG_UNIT       ORG_UNIT     *                 IND
ORG_UNIT       ASSET        *                 DIR
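
As a minimal sketch of how a processor might consult such a table, the Python below loads the rows above into a small graph and checks whether one type can contain another. The names `CONTAINS`, `build_graph` and `can_contain` are made up for illustration, and it assumes the checker treats CONTAINS as transitive (a match at any depth), which is my reading rather than something the table itself states.

```python
from collections import defaultdict

# Rows copied from the tables above: (source, target, cardinality, ref_type).
CONTAINS = {
    "openEHR RM": [
        ("EHR", "COMPOSITION", "*", "IND"),
        ("COMPOSITION", "SECTION", "*", "DIR"),
    ],
    "ACME IM": [
        ("COMPANY", "ORG_UNIT", "1", "IND"),
        ("ORG_UNIT", "ORG_UNIT", "*", "IND"),
        ("ORG_UNIT", "ASSET", "*", "DIR"),
    ],
}


def build_graph(rows):
    """Map each source type to the set of types it directly contains."""
    graph = defaultdict(set)
    for source, target, _cardinality, _ref_type in rows:
        graph[source].add(target)
    return graph


def can_contain(model, source, target):
    """True if target is reachable from source via one or more CONTAINS edges."""
    graph = build_graph(CONTAINS[model])
    seen = set()
    stack = list(graph.get(source, ()))
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return False
```

Nothing here is specific to openEHR: swapping in another model only means supplying a different list of rows.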

Now, in terms of implementation, let’s say AQL processor AQL-A runs over a certain RDBMS with a particular schema for openEHR RM, and another for Acme IM. Let’s say it converts AQL queries to SQL queries. It’s going to need to know what SQL to use to get a COMPOSITION for an EHR, i.e. to traverse the EHR indirect ref; same for the indirect refs in the ACME model.

Assuming a 3NF schema for the moment, it will need to know something like the following SQL for the AQL EHR[id=$id] CONTAINS c, where c is a COMPOSITION:

SELECT id     -- assume we get the id, not the object
FROM Composition
WHERE ehr_id = $id

or similar. A super-efficient schema might be non-3NF, and the above could be quite different, but you get the idea. So the AQL back-end will need a whole bunch of things like this (many of which could actually be inferred from the schema…).

AQL-B implementation or binding will have a whole lot of different mappings.

The AQL spec needs to know nothing of this; there just needs to be a way of converting each logical relation to its concrete queries. If we were using a graph DB, then it would be some kind of API calls.
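
To make the idea of a per-back-end binding concrete, here is a rough Python sketch using the standard `sqlite3` module as a stand-in RDBMS. The `AQL_A_BINDINGS` dictionary, the `contained_ids` helper and the `Composition(id, ehr_id)` table are assumptions based on the 3NF example above, not any real implementation's schema.

```python
import sqlite3

# Hypothetical binding for back-end "AQL-A": each logical CONTAINS relation
# maps to the SQL that traverses it. Table and column names are assumed.
AQL_A_BINDINGS = {
    ("EHR", "COMPOSITION"): "SELECT id FROM Composition WHERE ehr_id = ?",
}


def contained_ids(conn, source, target, source_id):
    """Run the concrete query bound to the logical relation source -> target."""
    sql = AQL_A_BINDINGS[(source, target)]
    return [row[0] for row in conn.execute(sql, (source_id,))]


def demo():
    """Build a tiny in-memory database and resolve EHR CONTAINS COMPOSITION."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Composition (id TEXT PRIMARY KEY, ehr_id TEXT)")
    conn.executemany(
        "INSERT INTO Composition VALUES (?, ?)",
        [("c1", "ehr-1"), ("c2", "ehr-1"), ("c3", "ehr-2")],
    )
    return sorted(contained_ids(conn, "EHR", "COMPOSITION", "ehr-1"))
```

A different back-end would keep the same logical keys and supply different values, whether SQL for another schema or calls into a graph DB API.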

So as far as I can see, in principle the ability to query Containment graph meta-model info is all that needs to be mentioned in the AQL spec. Being able to query a full meta-model would provide a bit more power, but might not be that useful.

The correct graph Containment table for openEHR and any other RM would of course in reality be generated from a true meta-model representation (assuming such was available), such as we have in BMM. I could write the generator in ADL workbench or in Archie in an hour or two. It could also be added to the code in the UML extractor. But it would also be easy to write by hand.
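
For a feel of how small such a generator could be, here is a sketch in Python over a toy meta-model. The `TOY_META_MODEL` structure and `containment_rows` function are purely illustrative: this is not the BMM format, and the property names are only there to make the example readable.

```python
# Hypothetical, heavily simplified meta-model: class name -> list of
# (property name, target class, is_multiple, is_indirect_reference).
# This is NOT the BMM format; it only shows the shape of a generator.
TOY_META_MODEL = {
    "EHR": [("compositions", "COMPOSITION", True, True)],
    "COMPOSITION": [("content", "SECTION", True, False)],
    "SECTION": [],
}


def containment_rows(meta_model):
    """Derive (source, target, cardinality, ref_type) rows from the meta-model."""
    rows = []
    for source, properties in meta_model.items():
        for _name, target, is_multiple, is_indirect in properties:
            cardinality = "*" if is_multiple else "1"
            ref_type = "IND" if is_indirect else "DIR"
            rows.append((source, target, cardinality, ref_type))
    return rows
```

A real generator would read the actual meta-model representation instead of a hand-written dictionary, but the output would have the same four columns.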

Doing something like the above seems simple and practical to me, which is why I don’t see any argument for creating a direct dependency from AQL to openEHR RM. I’m sorry if I made that point too forcefully on the other thread.

If someone does have an argument as to why this or an equivalent simple approach will not work, I’d be very happy to hear it.

I agree and I don’t think this was ever controversial. Now, due to implementation details and the fact that we only implement CDR over openEHR RM, and no other RMs, we (ab)use this fact:

As a consequence, it is in our interest to give priority (and your and our limited time) to other pressing issues around AQL, rather than to making sure the specification is “clean”. (Actually, the specification should be general, but I see benefit in it containing non-formal parts (examples and explanations) that relate to openEHR RM and possibly Demographics RM.)


This is true. The above is IMHO not problematic. It’s about the functional expectations of the result set, given an AQL query and a defined dataset.

I’ve tried to provide an example here: AQL - the simplest possible question?