@ian.mcnicoll @pablo can you please clarify the use case you have in mind for descendant paths and the suggested changes to the current syntax (i.e. under which clauses this should be available)?
Then I discussed with Thomas about the use of “//*” or “/*” to shorten paths that mean “any descendant node”, similar to XPath. From that discussion, I found a similarity of that kind of path matching with the CONTAINS operator. In fact if we allow that on paths, and also the archetype matching predicates like “/o[openEHR-EHR-OBSERVATION.blood_pressure.v1]//*/value”, this could be used as CONTAINS but also has the ability of pointing to data, which would be pretty powerful.
Thanks Pablo. I’d suggest simply removing the use of // from the spec.
CONTAINS is a very well established clause in AQL and it does a specific thing (as I discussed in the topic related to FROM). Descendant paths are powerful indeed but I’d say we should ideally see the requirement to give that power to users. In almost all languages where some form of descendant paths is supported, its liberal use is also discouraged, because it’ll almost always have performance downsides. This is also why most vendors don’t support all types in the CONTAINS clause.
@ian.mcnicoll you said you had some examples when we met last week. Still thinking that’s the case? Can you share them?
I guess I had two main uses, one of which may be invalid!!
When working with complex queries, I have sometimes found it easier to get consistent results, or at least the best results, by using full paths in SELECT rather than the alias that CONTAINS allows. The downside is that one cannot skip parts of the upper path, e.g. SECTIONs that are irrelevant for cross-template queries at ENTRY level. Now, I guess that might be down to improper/irregular implementation of CONTAINS, or indeed a feature of descendant paths and CONTAINS, and if so it is probably not a good argument, since if done properly // and CONTAINS should return the same resultset.
The other, possibly better, argument is ease of understanding and of conversion to/from other formalisms. The // idea is pretty well established in XQuery and its JSON equivalents, so it is easier to explain to newbies. It is also possibly easier to map between AQL and other higher-level constructs such as GraphQL. I think you might have told me that thinking about AQL resultsets in terms of what XQuery might do is helpful. I guess it is part of the challenge of keeping our stuff familiar wherever possible. I’m not suggesting dropping CONTAINS, as it is very elegant once you get familiar with the rules.
I would expect // to be supported only in so far as CONTAINS is supported, in terms of types. Though I guess it might be possible to have any processing that is not supported by CONTAINS done on the resultset (with an expected performance hit).
I think we can drop the example from the spec and discuss the implications of adding support for “//”, “/*” and “//*” on the wiki. Of course that will bring some implementation complexity and may cause some issues, so I guess we need to start analyzing this from scratch and evaluate the pros and cons of this feature.
This is just a shortcut, and like any shortcut it has its uses, mainly when you are lazy.
For instance, when working with XML processing we write a lot of XPaths, and for some of them we use // and //* in the paths to select certain fields. I guess it is the same here, but for selecting data from the RM.
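To make the XPath comparison concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which supports a subset of XPath including `//`. The XML document and its element names are made up for illustration; they are not RM XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one observation sits inside a section,
# the other directly under the composition.
doc = ET.fromstring("""
<composition>
  <section>
    <observation><value>120</value></observation>
  </section>
  <observation><value>80</value></observation>
</composition>
""")

# Fixed path: only matches observations that are direct children.
fixed = [v.text for v in doc.findall("./observation/value")]

# Descendant path: matches observations at any depth.
descendant = [v.text for v in doc.findall(".//observation/value")]

print(fixed)       # ['80']
print(descendant)  # ['120', '80']
```

The fixed path misses the observation nested under the section, while the `//` path picks up both without the author knowing the intermediate structure.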
I guess this is just a feature we can offer or not, depends on what we agree. Either way, it is worth the analysis since someone at some time thought of that when editing the AQL spec, or maybe he/she was thinking about XPath and got confused, I don’t really know.
I raised a comment/question on Slack: in some specific cases, namely when the path is known, the CONTAINS clause could be simulated with an EXISTS clause. For instance:
SELECT … FROM EHR e CONTAINS COMPOSITION c CONTAINS OBSERVATION o [arch1]
Could be checked also with this query if the path to the OBSERVATION arch1 is known in the COMPOSITION:
SELECT … FROM EHR e CONTAINS COMPOSITION c
WHERE EXISTS c/content[arch1]
The big difference is that for EXISTS the paths are fixed and must be known beforehand, whereas CONTAINS checks for descendants in the hierarchy without knowing the exact paths; it is like using a/*/c or a//c in XPath.
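As a rough illustration of that difference, here is a toy Python model. It is only a sketch under assumptions: the nested-dict layout, the function names `exists_fixed` and `contains`, and the ids `composition1` and `section1` are all hypothetical, and this is not how any AQL engine actually evaluates these clauses.

```python
def exists_fixed(node, path):
    """EXISTS-like check: follow a fixed list of (attribute, archetype_id) steps."""
    current = [node]
    for attr, arch in path:
        current = [
            child
            for n in current
            for child in n.get(attr, [])
            if child["archetype_id"] == arch
        ]
        if not current:
            return False
    return True


def contains(node, arch):
    """CONTAINS-like check: look for arch in any descendant, at any depth."""
    for children in node.values():
        if isinstance(children, list):
            for child in children:
                if child["archetype_id"] == arch or contains(child, arch):
                    return True
    return False


# The observation arch1 sits directly under content ...
flat = {"archetype_id": "composition1",
        "content": [{"archetype_id": "arch1"}]}

# ... or one level deeper, inside a section.
nested = {"archetype_id": "composition1",
          "content": [{"archetype_id": "section1",
                       "items": [{"archetype_id": "arch1"}]}]}

print(exists_fixed(flat, [("content", "arch1")]))    # True
print(exists_fixed(nested, [("content", "arch1")]))  # False
print(contains(flat, "arch1"), contains(nested, "arch1"))  # True True
```

The fixed-path check only succeeds on the nested composition if the full path, including the section step, is spelled out, which is exactly the knowledge CONTAINS does not require.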
Then I found this example in section 5.4 that is actually using a wildcard path that does exactly that:
… WHERE exists {“o//*/state[at0007]/items[at0008]”}
So if wildcard paths are supported, any CONTAINS could be written with EXISTS.
There are two ideas here:
we need to specify whether wildcard paths are supported by the spec or not
it would help people understand how these operators/clauses work if we mentioned some cases where the same query can be written in different ways; that would help implementers and people who need to write queries
Practically speaking CONTAINS and SELECT // will (should!) give us the same answers.
Your example opens up a whole other issue (partly covered by the other discussion on which classes CONTAINS should apply to).
For now I don’t think we should allow this kind of query but … I can see value in the future of doing something like
SELECT
el
FROM
COMPOSITION e
CONTAINS OBSERVATION o -- to ensure sensible context, i.e. not an Evaluation goal or somesuch
CONTAINS ELEMENT el[name/value/mappings/target = 'SNOMED-CT::123456|Systolic blood pressure']
WHERE
el/value/magnitude > 150
LIMIT 10