# Improve AQL to simplifying the querying of terms **Category:** [AQL](https://discourse.openehr.org/c/aql/43) **Created:** 2024-04-17 09:21 UTC **Views:** 410 **Replies:** 21 **URL:** https://discourse.openehr.org/t/improve-aql-to-simplifying-the-querying-of-terms/5120 --- ## Post #1 by @sebastian.iancu This is a spinoff from https://discourse.openehr.org/t/how-to-make-mappings-easier-to-query-and-to-document-in-openehr/5063 and https://discourse.openehr.org/t/ehrbase-storing-and-querying-data-without-an-standalone-archetype/5058 focusing on **issue 4** as described below: [quote="ian.mcnicoll, post:14, topic:5063"] scope which is really about how/where to record/query additional term codes, alongside internal codes like atcodes. This is (IMO) about how to be able to define and record additional multiple termCodes, to support querying via those additional terms In scope 1. Where to add additional terms e.g LOINC and/or SNOMED to * a Node name (LOCATABLE.name/value), alongside the archetypeNodeId * a Node value, usually DV_CODED_TEXT, in addition to any defining_code e.g an internal atCode is used as the value defining_code but we also want to carry the SNOMED equivalent code in the patient record 2. How do we constrain archetypes/templates to ‘force’ any additional codes to be carried? 3. Can and should we constrain these additional codes as mandatory, or impose validation? 4. Can we improve AQL to simplifying the querying of terms? [/quote] [quote="ian.mcnicoll, post:37, topic:5058"] What is new in Better CDR 4.0 is the ability to query an ELEMENT for one of these terms , regardless of the parent archetype context. This uses a `code` predicate. Not how I might have done it but I guess it aligns with the FHIR approach. ``` SELECT sys/value as Systolic, dia/value as Diastolic FROM COMPOSITION c CONTAINS (ELEMENT sys[code='SNOMED-CT::271649006'] AND ELEMENT dia[code='SNOMED-CT::271650006']) ``` [/quote] --- ## Post #2 by @ian.mcnicoll One thing that would help here is an AQL function that would treat the defining_code and any mappings as a single flat list (like FHIR CodeableConcept), plus allow token matches ( terminologyId and term combined) e.g hasTerm{"SNOMEDCT::"123456") --- ## Post #3 by @birger.haarbrandt ``` SELECT sys/value as Systolic, dia/value as Diastolic FROM COMPOSITION c CONTAINS (ELEMENT sys[code='SNOMED-CT::271649006'] AND ELEMENT dia[code='SNOMED-CT::271650006']) ``` Would be nice to hear from @matijap about design considerations. I know that we can introduce conventions into AQL but introducing a predicate that is like an alias and another syntax for the codes and values does not feel right. Edit: after fiddling around a bit I must confess that this is ugly when sticking to XPath conventions so I start understanding where this approach is coming from :zipper_mouth_face: --- ## Post #4 by @matijap This approach is simple, straightforward and well-defined. The last point is important so that we can actually deliver an implementation. :slight_smile: There were many things we considered, but I don't think I can reconstructs all the arguments that went into it... (One of the things was whether this would be the first functionality to break our approach of independence between AQL and the model. We were considering finding all the code mappings in the archetypes and converting them to at codes before executing the query, which would mean you could add new codes to old archetypes and have your old data automatically be mapped to those new codes. At the end we opted against it...) --- ## Post #5 by @birger.haarbrandt Why not define a function like this : `ELEMENT sys[CONTAINS_CODE('SNOMED-CT', '271649006')]` --- ## Post #6 by @matijap Yes, but we abandoned that idea. Fair point... --- ## Post #7 by @birger.haarbrandt Sorry, I messed up when I tried to edit. With a function, we could also introduce another parameter if the "match" field needs to be considered. --- ## Post #8 by @ian.mcnicoll Perhaps we can get the best of both worlds. I like the use of a token to shortcut the terminologyId and codestring. These appear in FHIR and GDL Wr also need something to blur the difference between a term carried in defining_code and one carried in mappings. So introduce a function like CONTAINS_TERM('SNOMED-CT'::271649006','=" ) which means ... find that SNOMED code either as a defining_code or as a mapping, where the mapping.matches = '=' Better still if the first parameter was a list of terms that could be returned from a terminology expand operation. This could be used as both WHERE ELEMENT.name/value/CONTAINS_TERM('SNOMED-CT'::271649006','=" ) or WHERE ELEMENT.value/CONTAINS_TERM('SNOMED-CT'::271649006','=" ) and then CONTAINS (ELEMENT sys[code='SNOMED-CT::271649006'] is essentially a shortcut for WHERE sys.name/value/CONTAINS_TERM('SNOMED-CT'::271649006','=" ) @matijap - does your current `code` construct include something in `ELEMENT.name/value/defining_code`? --- ## Post #9 by @matijap No, only mappings are considered, not the defining code that could be there if the name is a DV_CODED_TEXT. --- ## Post #10 by @SevKohler What about DV_CODED_TEXT as answer for e.g. Problem/Diagnose name ? That would be the only case i still see that term mappings should not cover. If no, how did you imagine the queries then ? Searching Element for both with an OR ? --- ## Post #11 by @ian.mcnicoll I had assumed that [code=...] only applied to LOCATABLE.name but perhaps I'm wrong That why I was suggesting separate function that could be applied to both name and value, and could include defining_code. It is possible to use current AQL to do this, of course, but it is really convoluted. --- ## Post #12 by @matijap In our case not even to all LOCATABLE.name but only to ELEMENT.name, and within that only to ELEMENT.name.mappings. (We might consider including ELEMENT.name.defining_code in the future though.) --- ## Post #13 by @birger.haarbrandt [quote="ian.mcnicoll, post:8, topic:5120"] So introduce a function like CONTAINS_TERM(‘SNOMED-CT’::271649006’,'=" ) [/quote] This is what I have on my mind, too. Would only write as separate parameters `CONTAINS_TERM('SNOMED-CT','271649006', '=' )`or `CONTAINS_TERM('SNOMED-CT::271649006', '=' )` and I think there would be merit to add ELEMENT.name.defining_code. @matijap would this work for you as well? --- ## Post #14 by @sebastian.iancu Nice discussion, focused on problem solving. Could you try to address this problem in a more formal way? I'm thinking that you should make a proposal for SEC or and AQL WG to extend the list of functions with this one. Because this AQL might be implemented now mainly at EHRBase & Better, but in fact there are more parties involved or impacted. Also, the specs will have to be updated, with Antlr included - and the change should be consistent with previous spec and styles. Of course, you can still implement this outside of AQL standard, but that's not that nice I guess. --- ## Post #15 by @birger.haarbrandt Hi @sebastian.iancu, this would be the next step for me. Once we got a rough outline, we can create tickets, discuss in the SEC meeting etc. --- ## Post #16 by @sebastian.iancu [quote="ian.mcnicoll, post:8, topic:5120"] CONTAINS (ELEMENT sys[code=‘SNOMED-CT::271649006’] is essentially a shortcut for WHERE sys.name/value/CONTAINS_TERM(‘SNOMED-CT’::271649006’,'=" ) [/quote] If I am not mistaken, the current grammar does not support function-calls on the CONTAINS expression, but it does allow the above `[code='...']` as predicate even though code does not exist. Is a WHERE expression always a solution to avoid CONTAINS expressions with function calls? --- ## Post #17 by @sebastian.iancu [quote="birger.haarbrandt, post:13, topic:5120"] CONTAINS_TERM('SNOMED-CT::271649006', '=' ) [/quote] Based on current grammar and spec-style I would suggest `CONTAINS_TERM(sys.name/value, 'SNOMED-CT', '271649006', '=' )` instead of `sys.name/value/CONTAINS_TERM('SNOMED-CT', '271649006', '=' )` unless we make this CONTAINS_TERM a function on the RM level. Name of the function 'CONTAINS_TERM' is also debatable from my point of view if we make it an AQL function. Also, consider the complex functionality supported by TERMINOLOGY function, which is not only matching codes, but can also expand, validate, map, subsume. How would that relate to CONTAINS_TERM working? --- ## Post #18 by @ian.mcnicoll [quote="sebastian.iancu, post:17, topic:5120"] Also, consider the complex functionality supported by TERMINOLOGY function, which is not only matching codes, but can also expand, validate, map, subsume. How would that relate to CONTAINS_TERM working? [/quote] I agree we need to support TERMINOLOGY expansions so the term codes/terminologyIds need to be an array, and I think pushes us back to an array of tokens and we may well have a mixed terminology expansion CONTAINS_TERM(sys.name/value, ['SNOMED-CT::271649006', LOINC::2345-3], '=' ) or CONTAINS_TERM(sys.name/value, TERMINOLOGY(...), '=' ) --- ## Post #19 by @damoca [quote="matijap, post:12, topic:5120, full:true"] In our case not even to all LOCATABLE.name but only to ELEMENT.name, and within that only to ELEMENT.name.mappings. (We might consider including ELEMENT.name.defining_code in the future though.) [/quote] Following this, I would be very careful when adding syntactic sugar to AQL. Not only because you need to have a very clear definition of it to avoid different interpretations and implementations, but more importantly, if we add syntactic sugar that it is only applicable to an openEHR reference model we would be deviating from the purpose of AQL (ARCHETYPE Query Language) and end with an OpenEHR Query Language. This would mean that the specification could not be valid for other reference models or even with the openEHR RM itself if it evolves in the following 10, 20, 30 years (future-proof, you know). --- ## Post #20 by @birger.haarbrandt @sebastian.iancu the function based approach (also without providing the sub-path to my understanding) is compliant with XPath which I think is a good point of reference how we should deal with predicates. In contrast, defining "virtual" predicates like "code" which are not present in the instance. As a user, I personally would find this confusing. --- ## Post #21 by @ian.mcnicoll From my limited understanding, AQL functions do not normally include the path - see [AQL spec]( https://specifications.openehr.org/releases/QUERY/latest/AQL.html) ``` WHERE e/value/defining_code/code_string matches TERMINOLOGY('expand', 'hl7.org/fhir/4.0', 'http://snomed.info/sct?fhir_vs=isa/50697003') ``` if we also want to search for terms in mappings then currently we have to do. ``` WHERE e/value/defining_code/code_string matches TERMINOLOGY('expand', 'hl7.org/fhir/4.0', 'http://snomed.info/sct?fhir_vs=isa/50697003') OR e/value/mappings/target/code_string matches TERMINOLOGY('expand', 'hl7.org/fhir/4.0', 'http://snomed.info/sct?fhir_vs=isa/50697003') ``` and it is even worse if looking for an array of terms as we have to query on the term and the terminologyId separately. So I still think this is closer to current AQL function approaches WHERE sys.name/value/CONTAINS_TERM(TERMINOLOGY('expand', 'hl7.org/fhir/4.0', 'http://snomed.info/sct?fhir_vs=isa/50697003'), '=' ) Happy to be proven wrong (again!) ``` --- ## Post #22 by @ian.mcnicoll Haha I can already see where I am wrong! So I agree AQL function would be something like CONTAINS_TERM(sys.name/value, TERMINOLOGY(…), ‘=’ ) an RM function which flattens the current CODED_TEXT construct into something closer to a FHIR CodeableConcept might have extra utility however. --- **Canonical:** https://discourse.openehr.org/t/improve-aql-to-simplifying-the-querying-of-terms/5120 **Original content:** https://discourse.openehr.org/t/improve-aql-to-simplifying-the-querying-of-terms/5120