Improve AQL to simplifying the querying of terms

This is a spinoff from How to make mappings easier to query and to document in openEHR and [EHRbase] Storing and querying data without an standalone archetype focusing on issue 4 as described below:

2 Likes

One thing that would help here is an AQL function that would treat the defining_code and any mappings as a single flat list (like FHIR CodeableConcept), plus allow token matches ( terminologyId and term combined)
e.g hasTerm{"SNOMEDCT::“123456”)

1 Like
SELECT
    sys/value as Systolic,
    dia/value as Diastolic
FROM COMPOSITION c
CONTAINS (ELEMENT sys[code='SNOMED-CT::271649006'] AND ELEMENT dia[code='SNOMED-CT::271650006'])

Would be nice to hear from @matijap about design considerations. I know that we can introduce conventions into AQL but introducing a predicate that is like an alias and another syntax for the codes and values does not feel right.

Edit: after fiddling around a bit I must confess that this is ugly when sticking to XPath conventions so I start understanding where this approach is coming from :zipper_mouth_face:

1 Like

This approach is simple, straightforward and well-defined. The last point is important so that we can actually deliver an implementation. :slight_smile: There were many things we considered, but I don’t think I can reconstructs all the arguments that went into it…

(One of the things was whether this would be the first functionality to break our approach of independence between AQL and the model. We were considering finding all the code mappings in the archetypes and converting them to at codes before executing the query, which would mean you could add new codes to old archetypes and have your old data automatically be mapped to those new codes. At the end we opted against it…)

1 Like

Why not define a function like this :

ELEMENT sys[CONTAINS_CODE('SNOMED-CT', '271649006')]

Yes, but we abandoned that idea.

Fair point…

Sorry, I messed up when I tried to edit. With a function, we could also introduce another parameter if the “match” field needs to be considered.

1 Like

Perhaps we can get the best of both worlds.

I like the use of a token to shortcut the terminologyId and codestring. These appear in FHIR and GDL

Wr also need something to blur the difference between a term carried in defining_code and one carried in mappings.

So introduce a function like

CONTAINS_TERM(‘SNOMED-CT’::271649006’,'=" )

which means … find that SNOMED code either as a defining_code or as a mapping, where the mapping.matches = ‘=’

Better still if the first parameter was a list of terms that could be returned from a terminology expand operation.

This could be used as both

WHERE ELEMENT.name/value/CONTAINS_TERM(‘SNOMED-CT’::271649006’,'=" )

or

WHERE ELEMENT.value/CONTAINS_TERM(‘SNOMED-CT’::271649006’,'=" )

and then

CONTAINS (ELEMENT sys[code=‘SNOMED-CT::271649006’]

is essentially a shortcut for

WHERE sys.name/value/CONTAINS_TERM(‘SNOMED-CT’::271649006’,'=" )

@matijap - does your current code construct include something in ELEMENT.name/value/defining_code?

No, only mappings are considered, not the defining code that could be there if the name is a DV_CODED_TEXT.

What about DV_CODED_TEXT as answer for e.g. Problem/Diagnose name ?
That would be the only case i still see that term mappings should not cover.
If no, how did you imagine the queries then ?
Searching Element for both with an OR ?

I had assumed that [code=…] only applied to LOCATABLE.name but perhaps I’m wrong

That why I was suggesting separate function that could be applied to both name and value, and could include defining_code.

It is possible to use current AQL to do this, of course, but it is really convoluted.

1 Like

In our case not even to all LOCATABLE.name but only to ELEMENT.name, and within that only to ELEMENT.name.mappings. (We might consider including ELEMENT.name.defining_code in the future though.)

1 Like

This is what I have on my mind, too. Would only write as separate parameters

CONTAINS_TERM('SNOMED-CT','271649006', '=' )or

CONTAINS_TERM('SNOMED-CT::271649006', '=' )

and I think there would be merit to add ELEMENT.name.defining_code.

@matijap would this work for you as well?

1 Like

Nice discussion, focused on problem solving.
Could you try to address this problem in a more formal way? I’m thinking that you should make a proposal for SEC or and AQL WG to extend the list of functions with this one. Because this AQL might be implemented now mainly at EHRBase & Better, but in fact there are more parties involved or impacted. Also, the specs will have to be updated, with Antlr included - and the change should be consistent with previous spec and styles. Of course, you can still implement this outside of AQL standard, but that’s not that nice I guess.

1 Like

Hi @sebastian.iancu,

this would be the next step for me. Once we got a rough outline, we can create tickets, discuss in the SEC meeting etc.

2 Likes

If I am not mistaken, the current grammar does not support function-calls on the CONTAINS expression, but it does allow the above [code='...'] as predicate even though code does not exist.
Is a WHERE expression always a solution to avoid CONTAINS expressions with function calls?

1 Like

Based on current grammar and spec-style I would suggest
CONTAINS_TERM(sys.name/value, 'SNOMED-CT', '271649006', '=' )
instead of
sys.name/value/CONTAINS_TERM('SNOMED-CT', '271649006', '=' )
unless we make this CONTAINS_TERM a function on the RM level.

Name of the function ‘CONTAINS_TERM’ is also debatable from my point of view if we make it an AQL function.

Also, consider the complex functionality supported by TERMINOLOGY function, which is not only matching codes, but can also expand, validate, map, subsume. How would that relate to CONTAINS_TERM working?

2 Likes

I agree we need to support TERMINOLOGY expansions so the term codes/terminologyIds need to be an array, and I think pushes us back to an array of tokens and we may well have a mixed terminology expansion

CONTAINS_TERM(sys.name/value, [‘SNOMED-CT::271649006’, LOINC::2345-3], ‘=’ )

or

CONTAINS_TERM(sys.name/value, TERMINOLOGY(…), ‘=’ )

Following this, I would be very careful when adding syntactic sugar to AQL. Not only because you need to have a very clear definition of it to avoid different interpretations and implementations, but more importantly, if we add syntactic sugar that it is only applicable to an openEHR reference model we would be deviating from the purpose of AQL (ARCHETYPE Query Language) and end with an OpenEHR Query Language. This would mean that the specification could not be valid for other reference models or even with the openEHR RM itself if it evolves in the following 10, 20, 30 years (future-proof, you know).

4 Likes

@sebastian.iancu the function based approach (also without providing the sub-path to my understanding) is compliant with XPath which I think is a good point of reference how we should deal with predicates. In contrast, defining “virtual” predicates like “code” which are not present in the instance. As a user, I personally would find this confusing.