We already have the main RM, Demographics, and TP meta-models published and in use, in BMM, JSON-schema, and XSD formats. The class relations can just be looked up, as described in the above-referenced thread.
Not doing so makes an AQL engine fragile, whereas if the relations are looked up, the engine will work on any data. Not only that, but without generic lookup, every addition we make to the RM will require tinkering with the AQL semantics spec, and with everyone's AQL engines, to write in special rules for the relationships of each new change (example: the addition of EHR.folders). If anyone can explain why we wouldn't want to do this generically, I would be very appreciative.
NB: I don't mean to say that today's implementations, which are probably hard-wired, should immediately (or maybe ever) change - I'm just talking about how we specify the semantics of AQL. I really don't see any reason why the very nice work @Seref has done would not be generic to any model - openEHR RM is not special in any way.
Generic rules will just be of the form:
if is_composition (C1, C2) then action_aaa
if is_reachable (C1, C2) then action_aaa
where action_aaa is some action to do in the query processor. Since is_composition() and the other lookups can very easily be done against the meta-models we already have, I'm not sure why we would document anything else. It doesn't matter whether C1 happens to be EHR or any other class.
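To make the lookup idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `META_MODEL` is a toy, hand-written stand-in for a real BMM schema, its relation names are simplified and not meant to mirror the published RM exactly, and the two functions are just one possible reading of `is_composition` and `is_reachable`.

```python
from collections import deque

# Toy stand-in for a meta-model: class name -> {attribute name: target class}.
# Hypothetical and heavily simplified; a real engine would load BMM instead.
META_MODEL = {
    "EHR": {"compositions": "COMPOSITION", "folders": "FOLDER"},
    "COMPOSITION": {"content": "OBSERVATION"},
    "OBSERVATION": {"data": "ELEMENT"},
    "FOLDER": {},
    "ELEMENT": {},
}


def is_composition(model, c1, c2):
    """True if class c1 directly contains class c2 via some attribute."""
    return c2 in model.get(c1, {}).values()


def is_reachable(model, c1, c2):
    """True if c2 can be reached from c1 in one or more containment steps."""
    seen = set()
    queue = deque([c1])
    while queue:
        current = queue.popleft()
        for target in model.get(current, {}).values():
            if target == c2:
                return True
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return False


# Adding a class only changes the data, not the rules
# (EXTRACT is an illustrative attribute target here).
extended = {**META_MODEL, "FOLDER": {"items": "EXTRACT"}, "EXTRACT": {}}
print(is_composition(META_MODEL, "EHR", "FOLDER"))   # True
print(is_reachable(META_MODEL, "EHR", "ELEMENT"))    # True
print(is_reachable(extended, "EHR", "EXTRACT"))      # True
```

The point of the sketch is that neither function mentions EHR, FOLDER, or any other class by name; the class names only appear in the model data.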
I think the key point is deciding how and when two instances of the same class are different, and so cannot be removed from the permutations without losing results.
You have to make each class provide some kind of "identity" function, making each row a set of "identity" results. If two rows share all their identifiers, then you can ignore one of them as a duplicate. This can be easy to calculate/decide for higher-level RM classes (composition, observation...), but it will need to be decided for lower-level RM classes (do two ELEMENTs with the same meaning and value represent the same measurement?). The rules become obvious when using this kind of "identity" function.
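As a rough sketch of what such an "identity" function could look like, assuming result rows are tuples of matched nodes represented as plain dicts: the `type`, `uid`, and `path` keys are hypothetical, and falling back to the path when a node has no uid is just one possible rule, not something already decided here.

```python
def identity(node):
    """Hypothetical identity: use the uid if there is one, else the path."""
    if node.get("uid") is not None:
        return (node["type"], "uid", node["uid"])
    return (node["type"], "path", node["path"])


def deduplicate(rows):
    """Keep the first row for each distinct tuple of identities."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(identity(node) for node in row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


comp = {"type": "COMPOSITION", "uid": "c1", "path": "/"}
obs_a = {"type": "OBSERVATION", "uid": None, "path": "c1/content[1]"}
obs_b = {"type": "OBSERVATION", "uid": None, "path": "c1/content[2]"}

rows = [(comp, obs_a), (comp, obs_b), (comp, obs_a)]
print(len(deduplicate(rows)))  # 2: the repeated (comp, obs_a) row is dropped
```

The open question from the text is exactly what `identity` should return for the lower-level classes; the rest of the deduplication does not change.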
This is not an openEHR-specific problem. As @Seref said, it's present in all hierarchical data. The good thing is that we can decide which set of data would allow us to tell two instances of a class apart.
I don't really think we can make this truly generic, as you would need additional rules if you introduce new classes (e.g. if is_folder then action_aaa).
I.e. two distinct instances with identical content? It could certainly happen with non-identified value instances, e.g. DV_TEXT or similar. However, it's not just a question of the instances themselves, but of which relation they are at the end of, e.g. LOCATABLE.name or COMPOSITION.category. That is equivalent to saying that their paths have to be distinct, even if their values are the same.
I'm not sure if this is the main problem in the generation of permutations though - I would have said that if the same container is matched once based on what it contains (e.g. Comp containing Obs[BP] and Action[medication admin]) then you don't try to match it again. For the OR case, I think it's the same - you match every container (EHR, or COMPOSITION) if a hit on any of the sub-parts is encountered, but only once. Those containers are your "FROM" data set. Then you do the SELECT extract on that.
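A minimal sketch of the "match each container only once" idea, under stated assumptions: containers are plain dicts with a hypothetical `id` and a list of content labels, and `match_containers` is an illustrative name rather than anything from an existing AQL engine.

```python
def match_containers(containers, wanted, mode="all"):
    """Return the ids of containers whose content satisfies the criteria.

    mode="all" models the AND case, mode="any" the OR case. Each container
    is visited exactly once, so it can appear at most once in the result.
    """
    test = all if mode == "all" else any
    matched = []
    for container in containers:
        present = set(container["content"])
        if test(label in present for label in wanted):
            matched.append(container["id"])
    return matched


compositions = [
    {"id": "c1", "content": ["BP", "medication admin"]},
    {"id": "c2", "content": ["BP", "BP"]},
    {"id": "c3", "content": ["lab result"]},
]
wanted = ["BP", "medication admin"]

print(match_containers(compositions, wanted, mode="all"))  # ['c1']
print(match_containers(compositions, wanted, mode="any"))  # ['c1', 'c2']
```

Note that c2 shows up once in the OR result even though it holds two BP entries; the matched ids play the role of the "FROM" data set, and the SELECT extraction would then run over those containers.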
Every time we use LOCATABLE.name for identification purposes, god kills a kitten. That's exactly the kind of path we should keep out of the identification function (if we want to do queries that support more than one language, that is).