AQL - New feature suggestion: descendant paths

@ian.mcnicoll @pablo can you please clarify the use case you have in mind for descendant paths and the suggested changes to current syntax? (i.e. under which clauses this should be available)

Jira issue: [SPECQUERY-18] Add support for descendant paths - openEHR JIRA

This is based on item 8 in the last section, “Items for SEC discussion:”, of my AQL review document: https://docs.google.com/document/d/1g8zOh06LhSNi1yFZWKuBzUX0bJN88r7mKpAFqDNi2JI/edit?usp=sharing

The whole idea started from the current example in AQL v1.0 https://specifications.openehr.org/releases/QUERY/latest/AQL.html#_exists_operator which contains the path “o//*/state…”. That path syntax isn’t defined anywhere in the spec.

Then I discussed with Thomas the use of “//*” or “/*” to shorten paths, meaning “any descendant node”, similar to XPath. From that discussion, I noticed that this kind of path matching is similar to the CONTAINS operator. In fact, if we allow that on paths, together with archetype matching predicates like “/o[openEHR-EHR-OBSERVATION.blood_pressure.v1]//*/value”, it could be used like CONTAINS but would also be able to point at data, which would be pretty powerful.
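For readers less familiar with the XPath behaviour referred to here, this is a minimal sketch in Python using the standard library's limited XPath support. The XML is a made-up toy document, not openEHR canonical XML; the element names and the `archetype_node_id` attribute are assumptions chosen only to mirror the path above.

```python
import xml.etree.ElementTree as ET

# Toy document (hypothetical structure, for illustration only).
DOC = """
<COMPOSITION>
  <SECTION>
    <OBSERVATION archetype_node_id="openEHR-EHR-OBSERVATION.blood_pressure.v1">
      <data><events><ELEMENT><value>152</value></ELEMENT></events></data>
    </OBSERVATION>
  </SECTION>
  <OBSERVATION archetype_node_id="openEHR-EHR-OBSERVATION.pulse.v1">
    <data><ELEMENT><value>70</value></ELEMENT></data>
  </OBSERVATION>
</COMPOSITION>
"""

root = ET.fromstring(DOC)

# "//" matches at any depth, so the SECTION level does not need to be spelled out.
all_values = [v.text for v in root.findall(".//value")]

# Archetype-style predicate plus "//*/value": any descendant of the matching
# OBSERVATION that has a direct child called "value".
bp_path = (
    ".//OBSERVATION[@archetype_node_id='openEHR-EHR-OBSERVATION.blood_pressure.v1']"
    "//*/value"
)
bp_values = [v.text for v in root.findall(bp_path)]
```

The second path both checks containment (only the blood pressure OBSERVATION matches) and points at the data, which is the combination described above.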

All this is related to point ax. in my AQL review document https://docs.google.com/document/d/1g8zOh06LhSNi1yFZWKuBzUX0bJN88r7mKpAFqDNi2JI/edit?usp=sharing

Either way, if this is currently not supported by AQL, we need to remove the example with the //* from the next version of AQL spec.

Hope that gives a little context. I think the discussions are saved in Slack, and I might have forgotten something.

Thanks Pablo. I’d suggest simply removing the use of // from the spec.

CONTAINS is a very well-established clause in AQL and it does a specific thing (as I discussed in the topic related to FROM). Descendant paths are powerful indeed, but I’d say we should ideally see a concrete requirement before giving that power to users. In almost all languages where some form of descendant paths is supported, liberal use is also discouraged, because it almost always has performance downsides. This is also why most vendors don’t support all types in the CONTAINS clause.

@ian.mcnicoll you said you had some examples when we met last week. Still thinking that’s the case? Can you share them? :wink:

I guess I had two main uses, one of which may be invalid!!

  1. When working with complex queries I have sometimes found it easier to get consistent results, or at least the best results, by using full paths in SELECT rather than the alias that CONTAINS allows. The downside is that one cannot skip parts of the upper path, e.g. SECTIONs that one wants to ignore for cross-template queries at ENTRY level. I guess that might be down to improper or irregular implementations of CONTAINS, or indeed a feature of descendant paths and CONTAINS. If so, it is probably not a good argument, since if done properly // and CONTAINS should return the same resultset.

  2. The other, possibly better, argument is ease of understanding and of conversion to/from other formalisms. The // idea is pretty well established in XQuery and its JSON equivalents, so it is easier to explain to newbies. It is also possibly easier to map between AQL and other higher-level constructs such as GraphQL. I think you might have told me that thinking about AQL resultsets in terms of what XQuery might do is helpful :wink: I guess it is part of the challenge of keeping our stuff familiar wherever possible. I’m not suggesting dropping CONTAINS, as it is very elegant once you get familiar with the rules.

I would expect // to be supported only insofar as CONTAINS is supported, in terms of types. Though I guess it might be possible to have any processing that is not supported by CONTAINS done on the resultset (with an expected performance hit).

Having said all that, I think there is a good argument for dropping it from the current spec. AFAIK it is not supported by current implementations.

I think we can drop the example from the spec and discuss the implications of adding support for “//”, “/*” and “//*” on the wiki. Of course that will bring some implementation complexity and may cause some issues, so I guess we need to analyze this from scratch and evaluate the pros and cons of the feature.

Raised a ticket here: https://openehr.atlassian.net/browse/SPECPR-359

Moved the example from the spec to the wiki: https://openehr.atlassian.net/wiki/spaces/spec/pages/532611073/AQL+descendant+paths+feature+discussion

And sent a PR: https://github.com/openEHR/specifications-QUERY/pull/6

Great Pablo
I agree on leaving this stuff out since I don’t understand the use case for such a feature.

There are different ways to achieve the logic here depending on what the intention is.

This is just a shortcut, and like any shortcut it has its uses, mainly when you are lazy :slight_smile:

For instance, when working with XML processing, we write a lot of XPaths, and for some we use // and //* in the paths to select certain fields. I guess it is the same here, but for selecting data from the RM.

I guess this is just a feature we can offer or not, depending on what we agree. Either way, it is worth the analysis, since someone at some point thought of it when editing the AQL spec, or maybe they were thinking about XPath and got confused. I don’t really know.

Would an AQL like this cover some of the needs?

SELECT
    el
FROM
    COMPOSITION e
        CONTAINS ELEMENT el[name/value = 'Systolic']
WHERE
    el/value/magnitude > 150
LIMIT 10

Yes, in fact check section ax. from the review doc: https://docs.google.com/document/d/1g8zOh06LhSNi1yFZWKuBzUX0bJN88r7mKpAFqDNi2JI/edit?usp=sharing

ax. Section 5.4. Exists operator

I raised a comment/question on Slack about whether, for some specific cases, the CONTAINS clause could be simulated with an EXISTS clause, namely when the path is known. For instance:

SELECT … FROM EHR e CONTAINS COMPOSITION c CONTAINS OBSERVATION o[arch1]

The same could also be checked with this query, if the path to the OBSERVATION arch1 within the COMPOSITION is known:

SELECT … FROM EHR e CONTAINS COMPOSITION c
WHERE EXISTS c/content[arch1]

The big difference is that for EXISTS the paths are fixed and must be known beforehand, whereas CONTAINS checks for descendants in the hierarchy without knowing the exact paths, which is like using a/*/c or a//c in XPath.

Then I found this example in section 5.4 that is actually using a wildcard path that does exactly that:

… WHERE exists {“o//*/state[at0007]/items[at0008]”}

So if wildcard paths are supported, any CONTAINS could be written with EXISTS.
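To make the difference concrete, here is a rough sketch in Python. The tree is a toy stand-in, not the openEHR RM, and the helpers `node`, `exists_fixed` and `contains` are hypothetical names invented for this illustration. `exists_fixed` mimics EXISTS c/content[arch1] by looking only at direct children, while `contains` mimics CONTAINS by searching at any depth.

```python
# Toy tree: each node is a dict with an RM type, an optional archetype id
# and a list of children. Illustration only, not the real RM structure.
def node(rm_type, archetype=None, children=()):
    return {"type": rm_type, "archetype": archetype, "children": list(children)}


def descendants(n):
    """Yield every node below n, at any depth."""
    for child in n["children"]:
        yield child
        yield from descendants(child)


def exists_fixed(comp, archetype):
    """Fixed path, like EXISTS c/content[arch]: direct children only."""
    return any(c["archetype"] == archetype for c in comp["children"])


def contains(comp, rm_type, archetype):
    """Containment, like CONTAINS: matches at any depth below comp."""
    return any(
        d["type"] == rm_type and d["archetype"] == archetype
        for d in descendants(comp)
    )


# OBSERVATION directly under the COMPOSITION.
flat = node("COMPOSITION", children=[node("OBSERVATION", "arch1")])

# Same OBSERVATION, but wrapped in a SECTION.
nested = node(
    "COMPOSITION",
    children=[node("SECTION", children=[node("OBSERVATION", "arch1")])],
)

results = {
    "flat_exists": exists_fixed(flat, "arch1"),
    "nested_exists": exists_fixed(nested, "arch1"),
    "flat_contains": contains(flat, "OBSERVATION", "arch1"),
    "nested_contains": contains(nested, "OBSERVATION", "arch1"),
}
```

In this sketch the fixed-path check misses the OBSERVATION once a SECTION sits in between, while the containment check finds it in both compositions. A wildcard path would behave like the containment check.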

There are two ideas here:

  1. We need to specify whether wildcard paths are supported by the spec or not.
  2. It would help people understand how these operators/clauses work if we mentioned some cases where the same query can be written in different ways. That would help implementers and people who need to write queries.

Practically speaking CONTAINS and SELECT // will (should!) give us the same answers.

Your example opens up a whole other issue (partly covered by the other discussion on which classes CONTAINS should apply to).

For now I don’t think we should allow this kind of query but … I can see value in the future of doing something like

SELECT
    el
FROM
    COMPOSITION e
        CONTAINS OBSERVATION o -- to ensure sensible context, i.e. not an Evaluation goal or somesuch
        CONTAINS ELEMENT el[name/value/mappings/target = 'SNOMED-CT::123456|Systolic blood pressure']
WHERE
    el/value/magnitude > 150
LIMIT 10