# AQL: Formal definition of FROM clause

**Category:** [AQL](https://discourse.openehr.org/c/aql/43)
**Created:** 2020-02-07 12:32 UTC
**Views:** 1911
**Replies:** 66
**URL:** https://discourse.openehr.org/t/aql-formal-definition-of-from-clause/322

---

## Post #1 by @Seref

Here is the content I'd like to suggest for addition to the AQL spec. The spec becomes a bit repetitive if this replaces 3.10.2, but I need to explain various things to expand the argument, so I can live with it. Comments are welcome.

**3.10.2 FROM**

The `FROM` clause defines the scope of the query in terms of the reference model (RM) types of the data to be retrieved, along with additional constraints that further narrow down the matching instances of data. These constraints can be constraints on attributes of the RM type, in addition to structural constraints for data elements. Structural constraints can take the form of logical relationships, or of instances of data directly containing other instances as nested attributes of an object. Both types of structural constraint are expressed with the same `CONTAINS` keyword.

All other clauses in the AQL query reference data instances defined in the `FROM` clause using their aliases, and either express further constraints on these or define new data instances using relative paths based on these aliases.

The simplest example of a `FROM` query would be one in which only a single RM type is declared.

`SELECT .... FROM EHR e ....`

The `e` alias, which refers to all instances of data that have the reference model type EHR, is required in order to refer to this set of data from other clauses, namely `SELECT` and `WHERE`. This alias can be used directly, such as:

`SELECT e FROM EHR e...` (select all EHR instances)

or as the root of a relative path, which allows the query to express constraints on, or select, data items accessible from the root of the relative path, as in:

`SELECT e FROM EHR e WHERE e/ehr_id/value='some_ehr_id'` (select all EHR instances that have an ehr_id with value 'some_ehr_id')

In accordance with the XPath-like constraint syntax of AQL, the `FROM` clause can introduce attribute constraints on the data instances it defines, as in:

`SELECT ... FROM EHR e[ehr_id/value='some_ehr_id']`

Note that the example above is semantically equivalent to the one using the `WHERE` clause. The attribute constraint in the `FROM` clause is usually preferred.

As stated above, the `FROM` clause also allows AQL to define structural constraints. This feature is supported via the `CONTAINS` keyword, which expresses a containment relationship that can be either logical or direct data instance containment. The semantics of the `CONTAINS` keyword are therefore overloaded for multiple types of relationship. An example `FROM` clause that depicts both types of relationship would be:

`SELECT ... FROM EHR e[ehr_id/value='some_ehr_id'] CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.report.v1] CONTAINS CLUSTER cls[at0018]`

From a reference model point of view, an EHR instance does not directly contain COMPOSITION instances. Instead, it has references to them, expressed via its compositions attribute. The `CONTAINS` keyword that establishes a structural constraint between EHR instances and COMPOSITION instances therefore implies COMPOSITION instances accessible from an EHR instance through resolving the values of its attribute, which in turn implies that this is a logical structural constraint.
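
To make the two meanings of `CONTAINS` in the example query concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Ehr`, `Composition` and `Cluster` classes, the `store` dictionary and the `match` function are hypothetical names, not part of any openEHR API, and the nesting is flattened to a single level (real AQL containment can match at any depth). The EHR holds only identifiers that must be resolved (logical containment), while a composition physically nests its clusters (direct containment).

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    node_id: str


@dataclass
class Composition:
    uid: str
    archetype_id: str
    # Direct containment: clusters are nested inside the composition object.
    clusters: list = field(default_factory=list)


@dataclass
class Ehr:
    ehr_id: str
    # Logical containment: the EHR only holds references (here, plain uids).
    compositions: list = field(default_factory=list)


def contained_compositions(ehr, store):
    """Resolve the EHR's references against a (hypothetical) composition store."""
    return [store[uid] for uid in ehr.compositions if uid in store]


def match(ehr, store, archetype_id, node_id):
    """Yield (composition, cluster) pairs for EHR CONTAINS COMPOSITION CONTAINS CLUSTER."""
    for comp in contained_compositions(ehr, store):  # CONTAINS by reference
        if comp.archetype_id != archetype_id:
            continue
        for cluster in comp.clusters:  # CONTAINS by nesting
            if cluster.node_id == node_id:
                yield comp, cluster


store = {
    "c1": Composition("c1", "openEHR-EHR-COMPOSITION.report.v1",
                      [Cluster("at0018"), Cluster("at0020")]),
    "c2": Composition("c2", "openEHR-EHR-COMPOSITION.encounter.v1",
                      [Cluster("at0018")]),
}
ehr = Ehr("some_ehr_id", ["c1", "c2"])
results = [(c.uid, k.node_id)
           for c, k in match(ehr, store, "openEHR-EHR-COMPOSITION.report.v1", "at0018")]
```

From the query author's point of view both steps look the same; only the implementation knows that the first one goes through reference resolution.
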
The second use of the `CONTAINS` keyword in the same query above establishes a structural constraint based on an instance of the COMPOSITION reference model type actually containing an instance of the CLUSTER reference model type, where both instances individually must also satisfy the attribute constraints for their archetype node id, as defined on their aliases, `c` and `cls`. This structural constraint demonstrates the direct data containment semantics of the `CONTAINS` keyword, similar to the concept of composition in object-oriented languages, which is different from aggregation, as employed by the EHR type to refer to COMPOSITIONs. AQL implementations deal with these different semantics internally so that the `CONTAINS` keyword can be used seamlessly to express constraints on data to be retrieved.

The `CONTAINS` keyword is complemented with logical operators such as AND and OR to express structural constraints that go beyond a single 'path' in the EHR. The details of these are provided below. An important point regarding the use of complex structural constraints in the `FROM` clause is that the `FROM` clause always has a single root declaration, and all `CONTAINS` keywords and logical operators describe constraints relative to this single root.

The data items defined in the `FROM` clause can be of any RM type; however, implementations of AQL usually support a subset of reference model types. This is usually done in order to ensure query performance and to reflect real-life access patterns to data. To clarify, a query such as `SELECT d FROM EHR CONTAINS DV_QUANTITY d` is perfectly valid from a syntactic point of view, but its result, all instances of data with quantity type across all data contained in all EHRs, is completely useless in real life.

Based on the syntax and intended functionality defined above, more formal semantics of the `FROM` clause can potentially be represented with various existing formalisms, probably via extensions.
One such formalism, Tree Pattern Queries, is discussed in detail with regard to its use as a formalism for AQL in (Arikan 2016). This particular formalism, presumably one of the many that could be used, defines the semantics of the `FROM` clause based on a single-rooted tree pattern as introduced by the Tree Pattern Query (TPQ). This representation, with potential extensions borrowed from labelled property graphs (Green et al. 2018), can encode constraints defined in the FROM node. Other relevant formalisms are discussed in the original work that led to the creation of AQL (Ma et al. 2007).

REFERENCES:

Arikan, S. S. 2016. ‘An Experimental Study and Evaluation of a New Architecture for Clinical Decision Support - Integrating the OpenEHR Specifications for the Electronic Health Record with Bayesian Networks’. Doctoral, UCL (University College London). https://discovery.ucl.ac.uk/id/eprint/1500996/.

Green, Alastair, Martin Junghanns, Max Kießling, Tobias Lindaaker, Stefan Plantikow, and Petra Selmer. 2018. ‘OpenCypher: New Directions in Property Graph Querying.’ In EDBT, 520–23.

Ma, Chunlan, Heath Frankel, Thomas Beale, and Sam Heard. 2007. ‘EHR Query Language (EQL)-a Query Language for Archetype-Based Health Records’. Medinfo 129: 397–401.

---

## Post #2 by @sebastian.iancu

On the first read the text looks good to me. Is it true that you have now "made peace" with the use of CONTAINS by reference vs real contained structures? I see also that it is now open for non-EHR RM types, but it is still not clear how it would look for demographic RM types - perhaps an example will come later.

I'm a bit concerned about the last part, where you refer to TPQ. I think the openEHR specification should state very clearly what should be supported, so that we can do conformance testing.
So whatever is not specified does not exist from these docs' perspective - therefore I think we should think carefully about everything we really want all implementations to support (so ...what is the common set that all of us should support).

---

## Post #3 by @Seref

Thanks. Regarding CONTAINS, I'm merely making the point that it is used to imply different types of associations. As you can see, I have an explanation for the semantics of EHR CONTAINS COMPOSITION and COMPOSITION CONTAINS CLUSTER even though the underlying relationships are different. I'd be glad to discuss how another overload of it can be used to access demographics information, but I'd like to have an explanation for that too. Tom made some points about links to demographics information in the form of references from EHR subject etc., which would allow us to keep the conceptual integrity of the query semantics. Shall we say I'm eager to have "peace negotiations"? ;)

Re the TPQ: it is just a formalism that can be used to describe queries. As I said in the text, there can be others, but this is what I have to offer. It sits between the specs and the technology, so even in the thesis I concluded by saying it may become a recommended approach and not necessarily the specification. So in the text above I'm not implying it is part of the AQL spec. I thought I wrote that part carefully, as in "look at these things if you want to more formally interpret this thing...", but I may have missed the mark.

I insist that we must have an explanation for retrieving a result given a query and a CDR which shows exactly why we're getting the results we're getting. As I said in the SEC Skype meeting, SQL has relational algebra with its operations (left/right joins etc.) and extensions (order by, window functions etc.). Every XQuery engine returns the same results given the same text because W3C specified the matching semantics, Gremlin has graph walk... So it goes. What do we have for AQL? That's my overarching concern.
I'd be very, very happy to hear suggestions regarding this. Or maybe I'm concerned about something that does not matter that much to everybody else, and if someone can explain that to me, I'll leave everybody in peace regarding this matter :)

---

## Post #4 by @sebastian.iancu

[quote="Seref, post:3, topic:322"]
What do we have for AQL? That’s my overarching concern.
[/quote]

I think I get your concern; it would be better to have this explicitly described and stated, just to avoid unexpected behavior from AQL implementations. Although I'm not the right person to write such text, I will take it for a review. My concern, on the other hand, is that such text might make assumptions about, or require something particular of, the underlying AQL implementation and related storage engine (e.g. relational DB, NoSQL, XQuery, etc.) - so we should watch for and avoid such 'traps'. But this should not block us from having a spec like you asked.

---

## Post #5 by @Seref

Regarding your concern:

[quote="Seref, post:3, topic:322"]
It sits between the specs and the technology
[/quote]

So sure, not suggesting any particular technology is a core principle. The related chapter in my thesis is titled "persistence abstraction" for that very reason. Given the same behaviour, each implementer would resort to their own intellectual comfort zone based on their expertise and know-how, which is already the case for all vendors anyway. What I'm trying to say is, health data in general and openEHR in particular push certain design patterns at the implementation level anyway; we would not worsen what is already dictated by the nature of this domain, but would have a better spec.
I've been looking at CQL and they have a lot to answer in this department with what they're trying to do :)

---

## Post #6 by @pablo

For the record, I have sent a PR with some improvements for the FROM definition: https://github.com/openEHR/specifications-QUERY/pull/5/files

Old text:

The `FROM` clause utilises class expressions and a set of containment criteria to specify the data source from which the query required data is to be retrieved. Its function is similar as the `FROM` clause of an SQL expression.

New text:

The `FROM` clause is used to specify the domain of the query, that is a subset of the universe of data that could be retrieve from a CDR. That universe is anything defined in the openEHR RM (that is a set of classes that models clinical records), and archetypes (that define specific data sets based on the openEHR RM). Because of that, the `FROM` clause uses `class expressions` to specify the subsets of data, giving context to the query.

And added a summary section with this:

`FROM`: Defines the subset of data in which the query will be executed.

That definition might be a little broad (on purpose), since it allows stuff like FOLDER, LINK or ACTOR to be part of the AQL expression in the FROM clause. I think we propose a spec to query any RM, but even using openEHR we are just focusing on querying inside the EHR and not in the demographic model, and we don't mention querying LINKed structures, which might be really powerful.

Checking your definition @Seref, this seems confusing to me:

[quote="Seref, post:1, topic:322"]
The `FROM` clause defines the scope of the query in terms of reference types of data to be retrieved along with constraints that narrow down the instances of data that have the reference types used in the `FROM` clause.
[/quote]

Maybe "reference types of data" should be defined. Should that be "RM type" or "RM class"? ("class" would be correct in an OO environment).
[quote="Seref, post:1, topic:322"] These constraints can be constraints on attributes of the RM... [/quote] Rephrased to avoid using "constraints" twice: "these constraints can be applied to ..." [quote="Seref, post:1, topic:322"] ...data directly containing other instances.... [/quote] I understand "directly containing" as something that is parent -> child, but CONTAINS allows to constraint containment at any level, I would remove "directly" to avoid a wrong interpretation. [quote="Seref, post:1, topic:322"] ...or instances of data directly containing other instances... [/quote] This is difficult to follow, lots of "instances" :) [quote="Seref, post:1, topic:322"] All other clauses in the AQL query reference data instances defined in the FROM... [/quote] Maybe add a comma after "query". [quote="Seref, post:1, topic:322"] ..instances based on relative paths based on the data... [/quote] Can that be rewritten to avoid "based" twice? [quote="Seref, post:1, topic:322"] This is why `FROM` clause is described as the section of the AQL query that defines the query scope. [/quote] IMO that should be the first sentence of the definition for FROM. Another thing is to mention the scope "over what", like mentioning the "universe" of all queryable data is the RM, then what is defined in the FROM is a subset of that (please check my definition above). [quote="Seref, post:1, topic:322"] `SELECT .... FROM EHR e ....` The `e` alias, used to refer to all instances of data that has the reference model type EHR.. [/quote] The example and description could be before mentioning types/classes and instances, since the example is more descriptive and gives context to the formal definition. Something like: 1. FROM defines the scope of the query bla bla bla 2. basic examples, mention classes and aliases that refer to instances 3. heavy definition 4. more examples An idea, can we ask each implementer to come up with a definition for the main clauses in AQL? 
Then compare and improve based on good definitions. Would love to hear what others think since this is the core of all queries.

---

## Post #7 by @thomas.beale

[quote="Seref, post:3, topic:322"]
Regarding CONTAINS, I’m merely making the point that is it used to imply different types of associations.
[/quote]

On the question of CONTAINS...

Ideally, in AQL we would use CONTAINS wherever *logical* containment is understood in the relevant RM. Logical containment means *deletion semantics*, i.e. cascaded-delete in RDBMS thinking. Now in the openEHR RM, if we do a DELETE on an EHR, we would logically cascade that DELETE through to all referenced FOLDERs, COMPOSITIONs, EHR_STATUS and so on. To specify that formally would require a kind of reference type that can be marked as 'composition' or 'association', in the same way UML does for direct association refs between objects (i.e. black diamond versus no diamond). But for concrete types that are intended to be physical references to sub-parts, for reasons of computational convenience or whatever, there is no way in UML or BMM to directly mark them as being composition or association.

I have implemented 'smart references' in the past (probably we all did at some point) that have this knowledge in them, but of course it's just a specific class, it's not built in to the language. We could take that path in openEHR RM: add a data element to the XXX_REF types that marks them as composition or association, or subtype them.

Doing it properly means putting it in the BMM, where you could look up EHR.compositions and discover that the logical relationship between the EHR and *the target objects of the references* (`COMPOSITIONs` etc) was indeed `composition` (or not). With that info, `CONTAINS` in an AQL query could be correctly interpreted for both `EHR[x] CONTAINS COMPOSITION[arch-id]` and `COMPOSITION CONTAINS CLUSTER`.
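
The idea of looking up a composition/association marker in the model definition can be sketched roughly as follows. This is a toy illustration only: the `MODEL` dictionary and the `can_contain` function are hypothetical, the structure is not the real BMM file format or the Archie API, and the property graph is a deliberately simplified excerpt that borrows RM type names without being a faithful copy of the RM.

```python
# Hypothetical, simplified model definition: for each type, a list of
# (property name, target type, is_composition) entries. The boolean plays the
# role of the UML black diamond; real BMM files are structured differently.
MODEL = {
    "EHR": [
        ("compositions", "COMPOSITION", True),  # a reference, but logically a part
        ("ehr_status", "EHR_STATUS", True),
    ],
    "EHR_STATUS": [
        ("subject", "PERSON", False),  # link out to demographics, not a part
    ],
    "COMPOSITION": [("content", "SECTION", True)],
    "SECTION": [("items", "CLUSTER", True)],
    "CLUSTER": [("items", "ELEMENT", True)],
}


def can_contain(parent, child, model=MODEL):
    """Return True if `child` is reachable from `parent` via composition properties only."""
    seen, stack = set(), [parent]
    while stack:
        current = stack.pop()
        for _name, target, is_composition in model.get(current, []):
            if not is_composition or target in seen:
                continue  # skip plain associations and already visited types
            if target == child:
                return True
            seen.add(target)
            stack.append(target)
    return False
```

With such a lookup, a processor could accept `EHR CONTAINS COMPOSITION` and `COMPOSITION CONTAINS CLUSTER` through the same check, and reject a containment that only exists as a plain association, without hard-wiring any type names into the query engine.
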
This is something I have been thinking about, and indeed, it would be easy to specify and add to the current BMM schemas. If AQL processors were to use the Archie BMM lib to read the BMMs, then the info would be right there, and everything falls into place. --- ## Post #8 by @sebastian.iancu [quote="thomas.beale, post:7, topic:322"] We could take that path in openEHR RM: add a data element to the XXX_REF types that mark them as composition or association, or subtype them. [/quote] is that implicitly not the 'type' attribute of the OBJECT_REF? About CONTAINS: [quote="thomas.beale, post:7, topic:322"] in AQL we would use CONTAINS wherever *logical* containment is understood in the relevant RM. Logical containment means *deletion semantics* , i.e. cascaded-delete in RBMS thinking. Now in the openEHR RM, if we do a DELETE on an EHR, we would logically cascade that DELETE through to all referenced FOLDERs, COMPOSITIONs, EHR_STATUS and so on. [/quote] This is a key aspect that has to be clear from AQL spec about CONTAINS. Perhaps it deserves a small chapter. If we all accept this as "design pattern", then perhaps we don't need (now) to further engineer this (with the BMM thing above)?! But, on the other hand it might be useful to have it, if AQL processor would use BMMs. --- ## Post #9 by @sebastian.iancu [quote="pablo, post:6, topic:322"] The `FROM` clause is used to specify the domain of the query, that is a subset of the universe of data that could be retrieve from a CDR. [/quote] I don't feel this explanations makes it better. The "universe" term is (in my opinion) not very appropriate here, and also putting there the CDR would restrain in the scope to only EHR, whereas as I mentioned earlier, I really would like to consider also DEMOGRAPHIC domain for AQL. As inspiration, I kind of like the simplicity of how wikipedia defines FROM (see https://en.wikipedia.org/wiki/From_(SQL) ), "... will provide the rowset to be exposed through a [Select] statement". 
Perhaps @Seref will find a way of simplifying his original text there, using less (openEHR) words, and keeping CONTAINS explanation for a separate chapter? --- ## Post #10 by @thomas.beale [quote="sebastian.iancu, post:9, topic:322"] I don’t feel this explanations makes it better. The “universe” term is (in my opinion) not very appropriate here, and also putting there the CDR would restrain in the scope to only EHR, whereas as I mentioned earlier, I really would like to consider also DEMOGRAPHIC domain for AQL. [/quote] Agree on that - AQL itself doesn't know anything about 'CDRs', 'EHRs' or any other specific kind of data. [quote="sebastian.iancu, post:9, topic:322"] As inspiration, I kind of like the simplicity of how wikipedia defines FROM (see https://en.wikipedia.org/wiki/From_(SQL) ), “… will provide the rowset to be exposed through a [Select] statement”. Perhaps @Seref will find a way of simplifying his original text there, using less (openEHR) words, and keeping CONTAINS explanation for a separate chapter? [/quote] I suppose concretely FROM specifies a 'row-set', but really it specifies the 'database', in the abstract sense, which is essentially the 'universe' of data to which the query applies. This is also sometimes called a 'schema', which is a DB word meaning 'model'. --- ## Post #11 by @Seref [quote="pablo, post:6, topic:322"] Should that be “RM type” or “RM class”? [/quote] Good catch, fixed that bit. [quote="pablo, post:6, topic:322"] avoid using “constraints” twice [/quote] I'd rather keep it as it is, happy to hear suggestions that give the same meaning. [quote="pablo, post:6, topic:322"] I understand “directly containing” as [/quote] Thx, reworded that part [quote="pablo, post:6, topic:322"] Maybe add a comma after “query” [/quote] Nope :) [quote="pablo, post:6, topic:322"] Can that be rewritten to avoid “based” twice? [/quote] Yep, done. 
[quote="pablo, post:6, topic:322"] IMO that should be the first sentence of the definition for FROM [/quote] removed it, because it actually does not define the scope. It defines the source, and scope is defined by SELECT and WHERE potentially extending and narrowing based on the source/root. [quote="pablo, post:6, topic:322"] The example and description could be before mentioning types/classes and instances [/quote] I don't think so. A formal definition is what was requested and IMHO a formal definition should not start with an example. The bits you expand on are the rest of the section for FROM. What I've written above is just the introduction based on the definition. --- ## Post #12 by @Seref [quote="thomas.beale, post:10, topic:322"] Agree on that - AQL itself doesn’t know anything about ‘CDRs’, ‘EHRs’ or any other specific kind of data. [/quote] I respectfully, but strongly disagree :) I've seen this point made before and I meant to respond then. AQL has the potential to be generic query language to query any underlying model but I'm in favour of defining it strictly based on openEHR terms, based on openEHR EHRs, data types, structures etc. With my implementer hat on, I would like a query language spec to focus on the language and terms of data that I'm processing. This is why my suggestions for a formal definition of FROM above refer to RM types, their attributes, containment in EHR etc. I am the one who raised the overloaded semantics of CONTAINS and I'm happy to further specify it but I would rather do that based on more openEHR words, not less. I made a second pass to simplify my definition but I'm keen to address potential adapters using an openEHR specific terminology and language. That's my 2 pennies of course, happy to hear what @bna and @matijap would think. --- ## Post #13 by @bna I agree on this. We need a query language which fits the RM as good as possible. My experience so far is that the match could have been better. 
Data defined by our RM is hierarchical like trees, and with the possibility of making references it becomes a multi-hierarchical graph. This is when you get challenges with today's AQL. It, kind of, assumes a flat database schema and a flat, tabular, row-based result set.

I think current AQL is really good for lots of use cases. And I will not be surprised if we some day make a new specification which covers the hierarchical data better. This could happen through revolution or evolution. Anyway, it has to be a domain-specific language for the EHR.

---

## Post #14 by @thomas.beale

[quote="Seref, post:12, topic:322"]
With my implementer hat on, I would like a query language spec to focus on the language and terms of data that I’m processing. This is why my suggestions for a formal definition of FROM above refer to RM types, their attributes, containment in EHR etc.
[/quote]

Well, query language semantics should not differ across data models the language processes. The optimisations that might be possible are another thing. If I were implementing an AQL engine, I would expect to have some bags of heuristic rules for processing queries against particular RMs in particular usages, e.g. openEHR RM in EHRs; openEHR RM in HighMed research; openEHR demographics in an MPI; openEHR Task Planning data. But I can't see how the formal language specification can have anything in it that is specific to any particular model. Indeed, I am not aware of anything in the current grammar that is specific to the openEHR RM.

There is also the question of 'clinical safety', as Ian has raised in the past. Whether some other layer(s) of semantics are needed over the top of AQL in particular contexts is something to explore as well. But again, if such layers can't rely on general query language semantics, you'd never be able to write those other layers.
The CONTAINS semantics can be quite easily specified in the BMM (or other representation) of any model; right now they are not, and so, AQL engines/services don't know that logically, an openEHR EHR object 'contains' (= has sub-part) COMPOSITION, FOLDER, EHR_STATUS etc. We need to fix that. But building it in to the language itself is not the correct approach - it has to be stated in the model definition semantics, and we can do that, indeed, it would not be hard to add it to today's BMMs with a small amount of work. Tools like the Better ADL-designer already read BMM files; in future AQL processors can as well, and all will be well with CONTAINS. --- ## Post #15 by @Seref [quote="thomas.beale, post:14, topic:322"] Well, query language semantics should not differ across data models the language processes. [/quote] I agree, but I also cannot see how I may have suggested that. We have one data model as far I'm concerned: openEHR RM and data based on instances of RM. [quote="thomas.beale, post:14, topic:322"] If I were implementing an AQL engine, I would expect to have some bags of heuristic rules for processing queries against particular RMs in particular usages, e.g. openEHR RM in EHRs; openEHR RM in HighMed research; openEHR demographics in an MPI; openEHR Task Planning data. [/quote] Maybe I'm missing something here but these are all RM implementations, based on the single RM specification. I cannot see why they would be called 'particular RMs'. [quote="thomas.beale, post:14, topic:322"] But I can’t see how the formal language specification can have anything in it that is specific to any particular model. Indeed, I am not aware of anything in the current grammar that is specific to the openEHR RM. [/quote] There is more than one way to skin a cat when it comes to formalising something. I'm in favour of formalising AQL on top of RM. It could also be formalised based on BMM, Tree Pattern Queries as I mentioned above, or with some other, well... 
formalism :) My understanding of formalising is 'specifying its behaviour' and I suggest we do that based on references to data defined by the RM, which consequently implies using the concepts and terms of the RM, as in, "FROM clause defines data elements based on RM types and constraints on RM type attributes ..." etc. I am concerned about having to resort to other and especially more generic formalisms to define/formalise AQL unless there is no way to do this without using the RM subset of openEHR specifications. The execution semantics is one example of RM not being sufficient, where I suggested the use of TPQ or alternatives, but as I said to @sebastian.iancu above, I'd still try to see that more in the ITS space and not in the AQL specification. [quote="thomas.beale, post:14, topic:322"] I am not aware of anything in the current grammar that is specific to the openEHR RM. [/quote] Well, grammar is at syntax/lexical level and even there you'd have things specific to openEHR if we wanted to help implementers, for example, you cannot have an archetype id token in an AQL query that would not be valid archetype id identifier according to RM, as in `... COMPOSITION c[myLovelyComp]...` should not even be syntactically valid because [we define valid archetype ID syntax in the RM](https://specifications.openehr.org/releases/AM/latest/Identification.html#_need_for_rm_class_name_in_identifier) My points above re the semantics of CONTAINS are explained in terms of RM as you can see, I don't need to break the self-containment of RM spec to explain CONTAINS can mean both resolving an aggregation relationship and a composition one. I'm merely suggesting we follow that approach. [quote="thomas.beale, post:14, topic:322"] There is also the question of ‘clinical safety’ as Ian as raised in the past. Whether some other layer(s) of semantics are needed over the top of AQL in particular contexts is something to explore as well. 
But again, if such layers can’t rely on general query language semantics, you’d never be able to write those other layers. [/quote] completely agree, but your comment seems to imply you don't think we can specify query language semantics without using another formalism. I'd say query semantics can be specified within RM, but execution is different and even than that's ITS. [quote="thomas.beale, post:14, topic:322"] right now they are not, and so, AQL engines/services don’t know that logically, an openEHR EHR object ‘contains’ (= has sub-part) COMPOSITION, FOLDER, EHR_STATUS etc. [/quote] They do. The fact that we have > 1 working implementations of AQL proves that they do :) BMM is another way for them to know it, but then again, we're in the ITS space. [quote="thomas.beale, post:14, topic:322"] But building it in to the language itself is not the correct approach - it has to be stated in the model definition semantics, and we can do that, indeed, it would not be hard to add it to today’s BMMs with a small amount of work [/quote] Are you suggesting we state query related aspects in the RM? Isn't that what you and I consistently argued against so far, especially in case of GUI aspects, and most recently in Birger's EHR subject concern? I'm advocating we define **what AQL does** based on the RM specification, and **how it may do it** in the ITS, whether BMM or some other mechanism. My attempt to follow the approach I'm suggesting is above. Maybe I'm failing to understand your suggestion and I'd be delighted to be corrected or shown the error of my ways because this stuff is bloody complicated! --- ## Post #16 by @thomas.beale Maybe I was not clear by what I meant when I said 'BMM' - I don't mean the BMM formalism, I mean actual BMM instances, i.e. model definitions. We already have [BMMs for the whole of openEHR, right here](https://github.com/openEHR/specifications-ITS-BMM). These are the files that are consumed by tools that require a model definition. 
I also have BMMs for FHIR and can make one for any model in the world. We can do the same thing with some other meta-formalism, like XMI or (maybe) JSON-schema, or whatever; we just use BMM because it works and fixes a whole lot of problems of XMI. So what I am advocating re: specifying logical whole/part relationships, is that this semantic be defined in the BMM. (It is [already in the latest BMM spec](https://specifications.openehr.org/releases/LANG/latest/bmm.html#_properties), just not in the implementations.) If we specify this kind of thing properly in the BMM for any concrete model, then an AQL processor always processes the CONTAINS statement correctly. Currently, AQL implementations (quite reasonably) are hard-wired to the openEHR RM, in the same way CKM is, and ADL workbench once was. We need to move AQL (and CKM) to being model-driven, and define the model-specific semantics in the model definition (BMM files, or XMI, or whatever else takes your fancy), and define the query specific semantics in the query language. Hopefully this is clearer! --- ## Post #17 by @Seref Thank you, this is indeed helpful. Allow me to allow you to help me further :) [quote="thomas.beale, post:16, topic:322"] Currently, AQL implementations (quite reasonably) are hard-wired to the openEHR RM, in the same way CKM is, and ADL workbench once was [/quote] a) I just cannot see what is wrong here. b) How can anything be hard wired to RM when RM itself is technology agnostic? [quote="thomas.beale, post:16, topic:322"] We need to move AQL (and CKM) to being model-driven, and define the model-specific semantics in the model definition (BMM files, or XMI, or whatever else takes your fancy), and define the query specific semantics in the query language [/quote] I have no objections to validity of this approach, but I'm concerned about its consequences, because unless I'm missing something, this makes BMM implementation a precondition for AQL implementation. 
The downsides of which to me would be: - The learning curve for potential openEHR implementers, who now also have to understand BMM to understand AQL - The increase in implementation costs for potential and existing implementers. I guess you can help me a lot more if you could tell me why defining AQL based on RM is bad (in the way you describe as hard wired) --- ## Post #18 by @thomas.beale [quote="Seref, post:17, topic:322"] a) I just cannot see what is wrong here. b) How can anything be hard wired to RM when RM itself is technology agnostic? [/quote] Well, the openEHR RM, at the end of the day, is just a model of data. Naturally some of us think its quite good, but that's subjective ;) The openEHR Demographics part of the RM is separate in the sense of not being part of the EHR, but really querying should work with it as well. The point for a query language isn't to be technology agnostic, the point is to be **model-agnostic**. If we make it specific to some model, we have to specify something different / new just to say how AQL works for openEHR Demographics, Task Planning, or indeed, any archetyped data - including in other domains. The one thing AQL does need to know about that is 'openEHR-ish' is of course Archetypes, archetype ids etc. But that's part of the formalism layer of openEHR, not any of the models. Hence the most recent arrangement of the components into groups that follow this idea: ![image|630x335](upload://fxfG7t7uGOIaGNg2pKxUqrYgeks.png) Re learning curve: * well we are talking about a small number of people who are all engineers and/or scientists, so I don't think BMM will be much of a challenge. Mainly they will experience it just by using Archie, which will make it easy to use. * model-driven is the future. If it's not BMM, it will be Ecore, son-of-UML (SysML2 maybe) or something else. 
We just don't use those things today because they are out of date (no functional stuff), broken (generics, property/association semantics) and impossible to read, in the case of XMI. Better's ADL-designer already uses BMM to know about models; LinkEHR also reads them. Nedap's nascent ADL tool is BMM-driven. HL7 CIMI is (or at least was until recently) using BMM. CKM will go there at some point... --- ## Post #19 by @pablo [quote="Seref, post:11, topic:322"] removed it, because it actually does not define the scope. It defines the source, and scope is defined by SELECT and WHERE potentially extending and narrowing based on the source/root. [/quote] I guess that depends on the definition of "scope" and "source". As I understand it, "source" would be "all your data" (the thing I called universe because of the mathematical set theory term, which is the "given situation" or "given state", that could also be "domain"). Then "scope" would be the subset of the universe that you want to focus on (still thinking as set theory here). The SELECT is to map a projection, I like these definitions: "In [relational algebra](https://en.wikipedia.org/wiki/Relational_algebra), a **projection** is a [unary operation](https://en.wikipedia.org/wiki/Unary_operation) written as $\Pi_{a_1,\ldots,a_n}(R)$ where $a_1,\ldots,a_n$ is a set of attribute names. The result of such projection is defined as the [set](https://en.wikipedia.org/wiki/Set_(mathematics)) obtained when the components of the [tuple](https://en.wikipedia.org/wiki/Tuple) $R$ are restricted to the set $\{a_1,\ldots,a_n\}$ - it *discards* (or *excludes*) the other attributes.[[1]](https://en.wikipedia.org/wiki/Projection_(relational_algebra)#cite_note-rochester-1)" https://en.wikipedia.org/wiki/Projection_(relational_algebra)
"Projection is one of the basic operations of Relational Algebra. It takes a relation and a (possibly empty) list of attributes of that relation as input. It outputs a relation containing only the specified list of attributes *with duplicate tuples removed*. In other words the output must also be a relation." https://stackoverflow.com/a/3462222/1644320 And the WHERE is for filtering data from the scope, only the data that passes the filters will appear in the projection. I know everyone here might have their own definition or idea of things. Maybe we need to go down to the basic definitions that we will agree on, because we might be talking about different things. Of course, it depends on how strict or "mathematically correct" we want to be in the spec. It's also valid to define our own terms in the context of openEHR, but we need to have good definitions to avoid interpretation issues. --- ## Post #20 by @pablo [quote="thomas.beale, post:10, topic:322"] Agree on that - AQL itself doesn’t know anything about ‘CDRs’, ‘EHRs’ or any other specific kind of data. [/quote] I agree, I shouldn't mention CDR, I was thinking of data storage. [quote="sebastian.iancu, post:9, topic:322"] Whereas as I mentioned earlier, I really would like to consider also DEMOGRAPHIC domain for AQL. [/quote] And I agree we should explicitly say AQL expressions could be used to query any archetype RM, including openEHR EHR and DEMOGRAPHIC, but could be used for other RMs. Also that should be extended to the examples, which are all focused on EHR. --- ## Post #21 by @thomas.beale I'm not sure about using the word 'scope' w.r.t. SQL or AQL.
In simple terms, the various bits are as follows:

* SELECT: projection (= subset of columns of a Table or View, or properties of a class/type)
* FROM: domain / universe (= tables or classes/types from which columns/properties projection is defined)
* WHERE: criteria (= row selection, by filtering on values)

--- ## Post #22 by @pablo That seems reasonable @thomas.beale, but in terms of: [quote="thomas.beale, post:21, topic:322"] FROM domain / universe [/quote] If we think in terms of functions, FROM could be a function. The source data set for that function, which could be EHR/DEMOGRAPHIC/xxx, is the domain (of that function), and the result or co-domain of the FROM applied to that domain is then the domain for the query as a whole, since the query could also be considered a function. But you can also consider the query as a whole to be applied to EHR/DEMOGRAPHIC/xxx, so that would be the domain for the query, not the result of the FROM, since the query would be a combination of functions, each applied to the result of the previous one: `QUERY_RESULT = SELECT(WHERE(FROM(domain)))`. The difference is subtle, and really depends on what you are focusing on: the FROM clause or the complete AQL. Moreover, SELECT and WHERE are also functions: WHERE is a boolean function and SELECT is a mapping function. I would say FROM is a subset definition function (it could be called a "selection" function, but that gets weird given the SELECT clause). This is how I understand it; I'm not saying this is the most correct way of understanding or defining things. --- ## Post #23 by @bna Regarding the _FROM_ as a filter into the domain, DIPS found a need to expand the query model to be able to run the same functional AQL with different constraints. This was suggested for the openEHR REST API v1.0. Since the SEC group wanted to keep the first version minimal, this feature was postponed to later versions. We use the following request model. The tagScope and partitionBy are used a lot in production. The use-case is ward lists to query i.e.
the latest (partitionBy = EpisodeOfCareId) body temperature for each episode of care (tag = EpisodeOfCareId). > { "aql": "string", "compositionUids": [ "string" ], "ehrIds": [ "string" ], "tagScope": { "tags": [ { "values": [ "string" ], "tag": "string" } ] }, "partitionBy": { "tag": "string", "limit": 0 }, "correlationId": "string" } --- ## Post #24 by @sebastian.iancu Quite a lot of things were said here that, at least in my opinion, I think are important: [quote="Seref, post:15, topic:322"] I’m advocating we define **what AQL does** based on the RM specification, and **how it may do it** in the ITS, whether BMM or some other mechanism. [/quote] [quote="thomas.beale, post:16, topic:322"] If we specify this kind of thing properly in the BMM for any concrete model, then an AQL processor always processes the CONTAINS statement correctly. [/quote] [quote="Seref, post:17, topic:322"] this makes BMM implementation a precondition for AQL implementation. The downsides of which to me would be: * The learning curve for potential openEHR implementers, who now also have to understand BMM to understand AQL * The increase in implementation costs for potential and existing implementers. [/quote] [quote="thomas.beale, post:18, topic:322"] The point for a query language isn’t to be technology agnostic, the point is to be **model-agnostic** . If we make it specific to some model, we have to specify something different / new just to say how AQL works for openEHR Demographics, Task Planning, or indeed, any archetyped data - including in other domains. [/quote] [quote="thomas.beale, post:14, topic:322"] The CONTAINS semantics can be quite easily specified in the BMM (or other representation) of any model; right now they are not, and so, AQL engines/services don’t know that logically, an openEHR EHR object ‘contains’ (= has sub-part) COMPOSITION, FOLDER, EHR_STATUS etc. We need to fix that. 
But building it in to the language itself is not the correct approach - it has to be stated in the model definition semantics, and we can do that, indeed, it would not be hard to add it to today’s BMMs with a small amount of work. [/quote] Well, I don't know how others are fully understanding and deeply seeing and feeling all the aspect of above quote, but for me: - I get @thomas.beale advocating for formal description in a BMM, it is perhaps the right place - but I also agree @Seref about extra burden on depending on BMM - the whole discussion is around AQL processor and AQL formalism specification, to make it model-agnostic, but the data-storage is not formally specified (neither db-type, neither data-definition or structure), which (I guess) means that the AQL-execution itself is implementation-specific - I wonder how much (if any) the BMM can be used at that level, I have impression that is hard-wired (as opposed to ADL parsing which takes directly benefit of BMM). I suggest adding an extra chapter or few paragraphs in the beginning of AQL specs, that will capture these conceptual design aspects in a dialog above between @thomas.beale and @Seref . It might be useful for implementors to better understand the necessity of BMM in relation with AQL. 
--- ## Post #25 by @sebastian.iancu This is a nice simple one: [quote="thomas.beale, post:21, topic:322"] SELECT projection (= subset of columns of a Table or View, or properties of a class/type) FROM domain / universe (= tables or classes/types from which columns/properties projection is defined) WHERE criteria (= row selection, by filtering on values) [/quote] but if we would like to use it in specs, then I would change it a bit: > SELECT > projection (= subset of columns or properties of the selected rowset) > FROM > domain (= rowset source, usually tables or classes/types from which columns/properties projection is defined) > WHERE > criteria (= rowset retrieval criteria, by filtering on their values) --- ## Post #26 by @thomas.beale Yep, this is also good, probably better. I wasn't trying to provide a proper text BTW, just to state a sort of common sense understanding of these things, in the interests of not getting too complicated or academic. I leave it to the rest here to get the text right for the users of the AQL specification and tools. --- ## Post #27 by @thomas.beale [quote="sebastian.iancu, post:24, topic:322"] Well, I don’t know how others are fully understanding and deeply seeing and feeling all the aspect of above quote, but for me: * I get @thomas.beale advocating for formal description in a BMM, it is perhaps the right place * but I also agree @Seref about extra burden on depending on BMM * the whole discussion is around AQL processor and AQL formalism specification, to make it model-agnostic, but the data-storage is not formally specified (neither db-type, neither data-definition or structure), which (I guess) means that the AQL-execution itself is implementation-specific - I wonder how much (if any) the BMM can be used at that level, I have impression that is hard-wired (as opposed to ADL parsing which takes directly benefit of BMM). 
I suggest adding an extra chapter or few paragraphs in the beginning of AQL specs, that will capture these conceptual design aspects in a dialog above between @thomas.beale and @Seref . It might be useful for implementors to better understand the necessity of BMM in relation with AQL. [/quote] We don't have to create the full BMM approach for AQL right now, it will take some time. But we do need to simulate it in the sense that the knowledge of logical whole/part containment not be directly part of the AQL spec or implementation, but instead be in e.g. some other file that is read to discover the semantics of such relationships. In the long run, the semantics of a model should be fully stated in the model definition. The short term question is just where this model definition comes from. --- ## Post #28 by @Seref Thanks Sebastian, I think some clarification is needed, at least re what I suggested. [quote="sebastian.iancu, post:24, topic:322"] the whole discussion is around AQL processor and AQL formalism specification, to make it model-agnostic [/quote] I for one am not discussing AQL processor(s), mainly because that's an implementation topic. I do have consideration for implementations when I make my suggestions, but I won't mention impl. unless I think that something I'm writing may be problematic from that perspective. Regarding implementation, ITS is where we may have recommendations, but I'll repeat my point that the base spec (on the left in Tom's diagram) should not have references to ITS, because it then becomes something that cannot be implemented without using and something that used to be in the ITS box now exists in the BASE. This would be crossing the Rubicon for openEHR, on the way to defining a virtual machine, which will make it a very unpopular option compared to FHIR. I appreciate no one else may see it that way, nor worry about the implications as much as I do, but this is my opinion. 
I attempted to define AQL behaviour in the context of openEHR data and not go beyond that in terms of formalising it. In my humble opinion, you cannot half formalise it, it just confuses readers/implementers and a full formalisation is a mighty challenge, which I've done once as part of research. Talking about projections and universes requires the reader to apply those concepts to the task at hand to implement the spec and you have to make sure that there is not a too large gap between the formalism you use and the actual behaviour. In other words, if you're going to use another formalism, it has to be at the right level of abstraction in terms of its proximity to semantics of AQL, but again, this is subjective and my view of using formalisms. Re being model agnostic: no one clearly wrote this but I my understanding is demographics is now implicitly considered as a goal for AQL, which I assume is what the points about being model agnostic are referring to. Even if we're now talking about RM + demographics, my view of AQL would be a query language that works on RM + demographics data which is a limited scope which can be defined in terms of an object model which can be represented with UML. I think we had a conversation in the last online SEC call to introduce a SYSTEM concept that sits at the root of EHR and Demographics together to address this concern. At this point, I am not sure I have more to offer in terms of the way forward and I made all the points I'd like to make. So I'll let the rest of the SEC to solidify how we'll define AQL. I appreciate you taking the time to follow discussion and points made. --- ## Post #29 by @thomas.beale [quote="Seref, post:28, topic:322"] Re being model agnostic: no one clearly wrote this but I my understanding is demographics is now implicitly considered as a goal for AQL, which I assume is what the points about being model agnostic are referring to. 
Even if we’re now talking about RM + demographics, my view of AQL would be a query language that works on RM + demographics data which is a limited scope which can be defined in terms of an object model which can be represented with UML. I think we had a conversation in the last online SEC call to introduce a SYSTEM concept that sits at the root of EHR and Demographics together to address this concern. [/quote] Actually, in my view, AQL should work for any model. I am not clear on what semantics would make it specific to openEHR RM, or even just openEHR RM + Demographics. It should work the same way for archetyped data based on any model. So the question just becomes: how does AQL know about the model underlying the queries written? Either the model definition is hard-wired in to an implementation, which is often a quick and reasonable way to get going - but it means that for every model, you have to do more hard-wiring - or it is defined in a more generic way. As everyone who has ever tried this knows, you always get to a point where you say ok, enough, let's do this generically, and you go to a generic model-representation approach. So when you do that, you move all the hard-wired definition semantics out to the generic representation - which for us is BMM, or it could be straight UML/XMI, or even JSON-schema, if you can get those things to behave properly. But the representation is in a way just a detail; the point is to have the model semantics in a place that can be interrogated by an AQL processor, so that your AQL semantics are clearly separated from your model semantics. And with a smart model representation (like BMM is aiming to be), you can include all the meta-data you need, even for tricky things like logical containment represented by reference relationships (to be fair, we can even do that in UML, with stereotypes, I've just never put it in the model).
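
To make the separation concrete, here is a minimal Python sketch under loud assumptions: `Relation`, `MODELS` and `can_contain` are hypothetical names invented for illustration, not part of BMM, UML or any AQL implementation, and the model excerpts are heavily simplified (intermediate classes are skipped). The only point it tries to show is that the query-side check knows nothing about any particular model; it only asks a model description whether a (possibly transitive) containment path exists, including across by-reference links.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One relationship between two classes in some model (hypothetical shape)."""
    source: str
    target: str
    is_containment: bool  # assumed marker, in the spirit of the UML black diamond
    by_reference: bool    # True when the model stores a reference/id, not the object


# Hypothetical, heavily simplified model excerpts; NOT real BMM content.
MODELS = {
    "openEHR": [
        Relation("EHR", "COMPOSITION", is_containment=True, by_reference=True),
        Relation("COMPOSITION", "OBSERVATION", is_containment=True, by_reference=False),
        Relation("OBSERVATION", "CLUSTER", is_containment=True, by_reference=False),
    ],
    "buildings": [
        Relation("HOUSE", "ROOM", is_containment=True, by_reference=True),
        Relation("HOUSE", "OWNER", is_containment=False, by_reference=True),
    ],
}


def can_contain(model: str, outer: str, inner: str) -> bool:
    """Return True if `inner` is reachable from `outer` via containment relations only."""
    # Build an adjacency map from the containment relations of the chosen model.
    edges = {}
    for rel in MODELS[model]:
        if rel.is_containment:
            edges.setdefault(rel.source, set()).add(rel.target)

    # Plain graph traversal; `seen` guards against cycles in the model.
    seen, frontier = set(), [outer]
    while frontier:
        current = frontier.pop()
        for nxt in edges.get(current, ()):
            if nxt == inner:
                return True
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False
```

With this shape, a query-side validator would call something like `can_contain("openEHR", "EHR", "CLUSTER")` before accepting `EHR ... CONTAINS CLUSTER`, and the same function serves the made-up `buildings` model unchanged. Whether the relation table comes from BMM files, XMI, a rules file or hard-wired code is then a separate choice.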
Now, since most AQL implementations to date didn't try to deal with anything more than openEHR RM data, the above issue was not as apparent as it would have been had we been more model-agnostic. But as soon as we try to solve issues like the CONTAINS semantics going over reference relationships, it becomes clear that the internal hard-wired representation of the model is deficient. Now, implementers could just go and hard-wire that further info (which I am not against, BTW), but regardless of where it is concretely expressed, it is not logically anything to do with AQL, it is to do with the model of the data that some particular AQL queries are targeted at. So logically, it is not part of the AQL spec. I am not convinced I am saying anything different to anyone else here ;) --- ## Post #30 by @sebastian.iancu [quote="thomas.beale, post:29, topic:322"] So the question just becomes: how does AQL know about the model underlying the queries written? Either the model definition is hard-wired in to an implementation, which is often a quick and reasonable way to get going - but it means that for every model, you have to do more hard-wiring - or it is defined in a more generic way. [/quote] I'm not sure if, at this time, we need a generic formal definition, other than for the sake of doing this in a nice, proper way. I think we just need to describe AQL so that we know how to apply it for EHR + Demographics, plus perhaps TP ?! Should this be made in a generic, model-agnostic way? Sure, why not (do it right)...? But I think this (BMM) will only describe the AQL semantics, leaving out the architectural aspects of how EHR+Demographic should (or should not) work together, or how TP might be involved, etc.; see also the discussion we had about the System-concept. Therefore I wonder if your effort @thomas.beale, to have such BMM definition, would be valued accordingly (I have no idea how much time would that take for you).
But, as an AQL implementor, I couldn't just use BMM and have all things automatically done (like an AQL processor, or a AQL runner, etc...), I will still have to rely on hard-wires in my implementation; and this different than an ADL-app which can use BMM, isn't it? But perhaps I'm not the right person to comment on this, as I'm not implementing AQL neither BMM at this point, so I could by biased, or just plain wrong... --- ## Post #31 by @bna [quote="sebastian.iancu, post:30, topic:322"] I will still have to rely on hard-wires in my implementation; and this different than an ADL-app which can use BMM, isn’t it? [/quote] Yes. This is IMHO true. --- ## Post #32 by @thomas.beale [quote="sebastian.iancu, post:30, topic:322"] Therefore I wonder if your effort @thomas.beale, to have such BMM definition, would be valued accordingly (I have no idea how much time would that take for you). But, as an AQL implementor, I couldn’t just use BMM and have all things automatically done (like an AQL processor, or a AQL runner, etc…), I will still have to rely on hard-wires in my implementation; and this different than an ADL-app which can use BMM, isn’t it? [/quote] Well it will go into BMM soon anyway, because BMM is undergoing a major revamp, which is close to finished, to make it do expressions, and full model representation (you can look in the [working version](https://specifications.openehr.org/releases/LANG/latest/bmm.html) if you like). But the logical containment semantics will also go into the UML models of openEHR - it's something I should have done long ago, but it's not hard to do. Those changes will take time to filter through to libraries and tools, i.e. to be directly usable. My point is, regardless of when the implementation of the change to BMM gets done, or even if you don't use BMM, the semantics of model relationships of *particular models* should never be part of the AQL *specification*, they should always be stated in the model specification. 
So even if right now, implementers do just hard-wire the semantics in, this should always be understood as an pragmatic implementation choice, not a specification-level thing. Hopefully this is a bit clearer. I'll show how this can be specified in the UML shortly. --- ## Post #33 by @sebastian.iancu yes, it gets clear :slight_smile: --- ## Post #34 by @Seref [quote="thomas.beale, post:32, topic:322"] the semantics of model relationships of *particular models* should never be part of the AQL *specification* , they should always be stated in the model specification [/quote] this is a statement I'd like to understand better. My question to you is: **what is the model specification?** Let me give you an example that seems to run counter to what you said above: From a query perspective/in the context of query semantics, the composition references in the EHR are interpreted as "contained" and this interpretation is what an implementation of CONTAINS keyword is supposed to implement. See, I just defined some AQL behaviour and the model relationship, if I get you correctly, is specified in the AQL **because it is a relationship that is meaningful in the query use case**. You cannot express this relationship in the model specification, if the model specification means RM, because that'd only be possible if you said something in the lines of > ...The references to compositions in the EHR are interpreted as containment in the context of querying... at which point: you'd end up pushing a higher level aspect, querying, into a lower level, which is something we have opposed to together for years. You can extend BMM to express this containment relationship and that being in the ITS, it is OK if this is what you mean by **model** in that statement above. however, you're still left with the task of defining the relationship between EHR and compositions in the AQL specification to explain and therefore formalise AQL. How can you do this if you don't refer to RM? 
if you say something in the lines of > CONTAINS interprets the composition references in EHR as per BMM's... the BASE part of openEHR in the left hand side of the diagram you pasted is no longer self contained and just like the assumed types of platforms, BMM is now a prerequisite to implement openEHR. Am I correct to assume you're considering moving on from UML to BMM completely at some point because BMM being a metamodel is already more computable than UML and that fits into your current train of thought as far as I can see. I'd love to understand where I'm wrong in the above picture I drew. I cannot seem to get my head around your statement re the model relationships to begin with. --- ## Post #35 by @thomas.beale [quote="Seref, post:34, topic:322"] My question to you is: **what is the model specification?** Let me give you an example that seems to run counter to what you said above: From a query perspective/in the context of query semantics, the composition references in the EHR are interpreted as “contained” and this interpretation is what an implementation of CONTAINS keyword is supposed to implement. [/quote] Right - but the EHR -> COMPOSITION relationships could be in a model containing the definitions HOUSE and ROOM, with the same kind of reference relationship between the two, and also the same logical compositional relationship. There is nothing special about the EHR / COMPOSITION relationship - indeed, there is nothing in the entire openEHR RM that isn't in other models. So the specification of AQL should only know about *kinds of relationships* in general, not something about a model called 'openEHR RM'. An AQL query processor has to lookup some model description to find out these kind of things. For example, PARTY.relationships is *not* a containment relationship; the target PARTYs aren't deleted if you delete a PARTY whose relationships point to them. How does an AQL query processor know if the keyword 'CONTAINS' is even allowed between two classes? 
It has to discover that the two classes are in a transitive containment relationship. Determining that properly means being able to query a Model object (e.g. a UML model, a BMM model etc) and make a call like:

```
has_containment_relation (classA, classB: String): Boolean
```

e.g. you might want to ask if `COMPOSITION` can 'contain' `CLUSTER`, which it can, but not directly. The fact of containment is not itself anything to do with querying, it's just a definitional fact of the model semantics. There are no doubt other tools, unrelated to querying, that could use this information. E.g. imagine a tool that can process a JSON representation of an EHR as a (probably giant) in-line hierarchy. A consumer of that data could determine a) that it was logically valid, and b) how to instantiate it properly into EHR and COMPOSITION objects. Also, to be clear, 'CONTAINS' is nothing to do with EHR, COMPOSITION or any other specific types. CONTAINS is just an operator that asserts a (possibly transitive) compositional containment relationship between instances of two types. In a model-driven implementation, the AQL processor sees the class name 'EHR', the keyword 'CONTAINS' and the class (say) 'CLUSTER', and (somehow) it knows this is openEHR, so it looks up the openEHR BMM and uses a function like the one above to ask if CLUSTERs can indeed really be logically contained inside EHRs. The BMM will have some marker on the various relationships (like the UML black diamond) that says 'is_composition', and that function will assess those relationships, and discover that yes, there is a transitive compositional containment path from EHR to CLUSTER. AQL thus has to know nothing at all about the openEHR RM to process this instance of 'CONTAINS'. Everything I said would be exactly the same if the RM and archetypes and queries in use were FHIR or NIEM or who knows what. Am I getting anywhere?
;) --- ## Post #36 by @bna What about the assumptions made in the following topic, how can you interpret such logic without knowing the RM? https://discourse.openehr.org/t/aql-same-logical-aql-with-different-syntax/366/2 --- ## Post #37 by @thomas.beale Same as for any RM: you look up the RM definition and then validate or execute the queries. --- ## Post #38 by @bna Are the assumptions made in the example correct? (BTW: This is an honest question since it took us some time to figure out that it had to be interpreted this way) --- ## Post #39 by @thomas.beale If you mean on that other post, yes, that all looks correct. But normally you would not mention the VERSION classes in a query. --- ## Post #40 by @bna To be sure: You agree that they will all produce exactly the same result since they are identical operators on the data? --- ## Post #41 by @thomas.beale They have to because there are no Observations in an openEHR EHR not contained by those things, in each case. But, we should not really be creating queries that don't mention the top-level container, i.e. the EHR. We discussed a little while ago that we should have a notional top-top-level openEHR query space, of the following logical shape (with all the VERSIONing objects hidden):

```
openEHR {
    ehrs: EHR[*]
    demographics {
        parties: PARTY[*]
        party_relationships: PARTY_RELATIONSHIP [*]
    }
    plans: WORK_PLAN[*]
    etc
}
```

--- ## Post #42 by @sebastian.iancu As said before, I see where you are aiming with BMM Thomas, but the following: [quote="thomas.beale, post:35, topic:322"] In a model-driven implementation, the AQL processor sees the class name ‘EHR’, the keyword ‘CONTAINS’ and the class (say) ‘CLUSTER’, and (somehow) it knows this is openEHR, so it looks up the openEHR BMM and uses a function like the one above to ask if CLUSTERs can indeed really be logically contained inside EHRs.
The BMM will have some marker on the various relationships (like the UML black diamond) that says ‘is_composition’, and that function will assess those relationships, and discover that yes, there is a transitive compositional containment path from EHR to CLUSTER. [/quote] is where I see it a bit differently, because besides an AQL processor (or something similar that needs BMM to construct/translate the logical intent of the query to a particular implementation 'taste'), I don't think a query-optimizer will use such BMM-based information to build up its internal query-plan. At least from my experience and technological layers, the data-store implementation (the persistence architecture) is usually made with 'hard-wires'; it is a design choice. --- ## Post #43 by @thomas.beale You can do that. But that's implementation, it's not specification. There is nothing you can put in the AQL specification that is specific to the openEHR EHR or any other model. How those models are used in practice will give you your optimisation settings. Which would probably be different for HighMed compared to Code24 and the other patient-facing EHR systems based on the same model. Note also, that correctly processing 'CONTAINS' isn't an optimisation, it's either correct or it's not. Whether it is processed fast at runtime is a whole other question :) --- ## Post #44 by @Seref [quote="thomas.beale, post:35, topic:322"] Am I getting anywhere? [/quote] You are, but where you're getting is not what my questions are about :) Almost your entire response is the answer to "how should I implement AQL", which is not what I'm asking, really. It is entirely possible that I'm missing the answers to my questions, and this would not be the first time, so personally I'm not sure where to go from here. Nonetheless, I appreciate the patience and light-hearted tone, which is hard to keep when having a discussion on the internets.
---

## Post #45 by @thomas.beale

Actually I'm not saying anything about how to implement AQL; just about what can go in the spec versus what cannot...

* AQL needs to know what 'CONTAINS' means, generally
* An AQL processor needs to know where it can be used with respect to some particular model
* It also needs to know how to navigate reference-by-id containment relationships (like EHR.compositions) and direct-reference relationships (e.g. COMPOSITION.content)
* => therefore there needs to be a way of an AQL processor interrogating a model representation that provides this information.

---

## Post #46 by @ian.mcnicoll

@thomas.beale Let me try to re-phrase this ...

> * AQL needs to know what ‘CONTAINS’ means, generally
> * An openEHR AQL implementer needs to know where it can be used with respect to the openEHR RM(s)
> * An implementer needs to know that CONTAINS (for the openEHR RM) will have to navigate reference-by-id containment relationships (like EHR.compositions) and direct-reference relationships (e.g. COMPOSITION.content) and this needs to be made explicit in documentation
>
> ~~=> therefore there needs to be a way of an AQL processor interrogating a model representation that provides this information.~~

This is an implementation decision - IMO we should be specifying the correct behaviour in a way that does not require direct interrogation of the model, which implies a dependency on BMM (or something else) for which I detect little current appetite amongst CDR implementers.

At the top of this thread, Seref made a start on a documentation approach that I think meets the current need to document the expected behaviour of an openEHR RM 'profile' of AQL, without needing to interrogate the underlying model UML or BMM. Both of the latter options feel much harder to grapple with and understand.
So my vote goes strongly with Seref's proposed documentation approach - see top of this topic, and leave any discussion about a more abstract 'model-driven approach' for the future - that will require a lot more work from Thomas that we (openEHR) cannot really afford, and which I don't sense is really being asked for from those here with implementation experience but please do tell me if I am wrong! --- ## Post #47 by @thomas.beale [quote="ian.mcnicoll, post:46, topic:322"] This is an implementation decision - IMO we should be specifying the correct behaviour in a way that does not require direct interrogation of the model which implies a dependency on BMM (or something else) for which I detect little current appetite amongst CDR implementers. At the top of this thread, Seref made a start on a documentation approach that I think meets the current need to document the expected behaviour of an openEHR RM ‘profile’ of AQL, without needing to interrogate the underlying model UML or BMM. Both of the latter options feel much harder to grapple with and understand. [/quote] It's not really an option. If there is anything about 'openEHR RM' or any other model, in the AQL specification, we have **done the wrong thing**. How the model is represented in any particular implementation at this point in time - BMM, rules file, hard-wired something-or-other - is an implementation question. But at runtime, an AQL processor has to be able to discover where the CONTAINS relationships are, not to mention the typing etc, of any model underlying types etc. There is nothing we should be putting in the *AQL spec* that mentions anything about particular models or particular relationships, typing or anything else specific. If someone wants to create some extra spec of 'AQL profiles' and how to write them, and also an 'openEHR profile', then fine I guess, but I don't see the point. I also apparently have not been clear enough about where the BMM work is or is going. 
It's pretty much done, and it's being completed to drive Expressions, Task Planning and probably GDL3, because you can't do any of these things without proper model representation. Being able to process expressions, queries, decision logic etc - all requires model representation. This is just standard mainstream IT, nothing special.

---

## Post #48 by @thomas.beale

BTW I don't disagree with anything in @Seref's original post, which is nice and clear, apart from possibly this:

[quote="Seref, post:1, topic:322"]
The `CONTAINS` keyword that establishes a structural constraint between EHR instances and COMPOSITION instances therefore implies COMPOSITION instances accessible from and EHR instance through resolving the values of its attribute, which in turn implies this is a logical structural constraint.
[/quote]

If I read this literally, it implies that the use of 'CONTAINS' in an AQL query 'establishes' something about the model. But that isn't right - a query based on a model can't state truths about the model, only the model can do that. If it were the case, you could never validate a query's use of 'CONTAINS', you would just have to trust it. So you'd have no way of knowing if a query was correct w.r.t. its underlying RM.

I am not sure if that is what Seref intended here, I may be reading it wrong...

---

## Post #49 by @ian.mcnicoll

We are not talking about the AQL specification, we are talking about how that AQL specification applies to a specific RM, in our case the openEHR RM - as you can see from Seref and Bjorn's examples there are legitimate varying interpretations of how it should be applied - is EHR e optional, what exactly does CONTAINS mean, how deep should CONTAINS go without a parent object e.g. FROM EHR e CONTAINS ELEMENT.

That level of detail needs to be worked out, agreed and then documented. I think that is understood. What we are wrangling here is the best way to document the outcome of those discussions.
> Being able to process expressions, queries, decision logic etc - all requires model representation. This is just standard mainstream IT, nothing special.

My reading of the discussion is that while openEHR has done a remarkably good job in model-driven representation in terms of data, this is far from standard mainstream IT in terms of implementation, especially in terms of profiled AQL. The pushback I am hearing from Seref and Sebastian suggests that as implementers they are not comfortable, at least right now, with having this kind of RM-specific behaviour documented in a model-driven formalism. Seref is telling us he has been down this road already and saw the limitations.

I am going to push very clearly that we do not adopt BMM for this purpose but go with Seref's suggested approach - if nothing else that will allow us to make positive progress on addressing the kind of reasonable questions that Seref, Bjorn and Sebastian are asking. I would like a much clearer answer from implementers on where/how BMM has value before committing much more resource in that direction.

> Expressions, Task Planning and probably GDL3, because you can’t do any of these things without proper model representation.

I'd like to test that statement with experienced implementers - the success of openEHR to date has been because of the great work that you have done on model representation but I know there has already been significant pushback in a similar way around Expressions. I feel we are in real danger of seeing *everything* in terms of abstract models. @thomas - you have a great handle on this but every time we push further down this road, I feel we are losing understanding and support, certainly from people outside our community but increasingly from those working within.
But I am probably the least qualified person to make such a judgement, except from the position of perhaps representing something approaching the 'great coding unwashed' :slight_smile:

Ian

---

## Post #50 by @thomas.beale

[quote="ian.mcnicoll, post:49, topic:322"]
I’d like to test that statement with experienced implementers - the success of openEHR to date has been because of the great work that you have done on model representation but I know there has already been significant pushback in a similar way around Expressions. I feel we are in real danger of seeing *everything* in terms of abstract models. @thomas - you have a great handle on this but every time we push further down this road, I feel we are losing understanding and support, certainly from people outside our community but increasingly from those working within.
[/quote]

If we are losing understanding, I am unaware of it. Formal model representation really is a mainstay of mainstream computing of all kinds. It comes in many concrete forms:

* so-called 'reflection' classes in most programming languages these days
* UML / XMI
* Eclipse [Ecore / Xcore](https://wiki.eclipse.org/Xcore)
* [OMG IDL](https://www.omg.org/spec/IDL/About-IDL/) (around for 30y)
* etc

I would not expect people doing clinical modelling, or EHR users, to know or care about these things of course; they are for development. Only a small number of people in the overall openEHR community need to worry about them.

Without model representation, you do the same work but in a non-reusable way, in terms of hard-wired models all over the place. We have those as well, i.e. concrete RM implementations in Java, PHP, C# etc, for concrete EHR system implementations - that is as it should be. But for generic tooling and languages, the normal approach is formal model representation. The whole industry operates like this.

Once again, I am not saying anyone should start using BMM today in AQL (i.e. beyond where it is already in use, e.g.
in ADL-designer, LinkEHR etc), what I am saying is that we should clearly understand the path forward to implementing certain things properly.

However I'll just ask one simple question: how do implementers intend to *validate* AQL queries?

---

## Post #51 by @sebastian.iancu

[quote="ian.mcnicoll, post:49, topic:322"]
I would like a much clearer answer from implementers on where/how BMM has value before committing much more resource in that direction.
[/quote]

Well, I said it before, I understand both @Seref's and @thomas.beale's perspectives, and I don't think that they are clashing. @Seref's text above has good human-readable info, while @thomas.beale's BMM semantic description of AQL 'contains' provides machine-processable info. Together they should cover several things that are not covered now.

I was just invoking the (lack of) use of such BMM for implementations of the stage of running AQL (translating the AQL to an 'internal' query adapted to implementation, things elsewhere named query-optimizer) ...at least in my opinion - but that's also fine. I found it important to express my perspective, to give feedback.

Furthermore, I can imagine that at the stage of processing an AQL query (meaning parsing AQL, all these things that should happen before you even run the query), BMM could be useful from our openEHR-SEC perspective (meaning that some implementers might choose this machine-way rather than having hard-wires). This also relates to my response to:

[quote="thomas.beale, post:50, topic:322"]
how do implementers intend to *validate* AQL queries?
[/quote]

It could be hard-wired, or BMM, depends on implementation choices... but let it be a choice (don't force BMM).

So once again, keep both: the text above and the BMM each have their potential usage and audience; just make sure they are aligned.

PS: :innocent: hopefully, you'll not find my feedback rubbish...
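
To make the hard-wired option concrete, here is a minimal sketch in Python of validating a `CONTAINS` chain against a hand-written containment table. The table is a small, simplified and hypothetical excerpt (intermediate classes are omitted), not the openEHR RM definition, and `can_contain` and `validate_contains_chain` are illustrative names, not part of any specification or product.

```python
from collections import deque

# Hypothetical, simplified excerpt of direct containment relationships.
# Intermediate RM classes are deliberately omitted; a real implementation
# would take this from the RM itself (hard-wired or via a model such as BMM).
DIRECT_CONTAINMENT = {
    "EHR": {"COMPOSITION"},                      # by reference (EHR.compositions)
    "COMPOSITION": {"SECTION", "OBSERVATION"},   # direct (COMPOSITION.content)
    "SECTION": {"SECTION", "OBSERVATION"},
    "OBSERVATION": {"CLUSTER", "ELEMENT"},
    "CLUSTER": {"CLUSTER", "ELEMENT"},
}


def can_contain(parent, child):
    """Return True if `child` is reachable from `parent` in one or more containment steps."""
    seen = set()
    queue = deque(DIRECT_CONTAINMENT.get(parent, ()))
    while queue:
        current = queue.popleft()
        if current == child:
            return True
        if current in seen:
            continue
        seen.add(current)
        queue.extend(DIRECT_CONTAINMENT.get(current, ()))
    return False


def validate_contains_chain(class_names):
    """Check each adjacent pair of a chain such as EHR CONTAINS COMPOSITION CONTAINS CLUSTER."""
    return all(can_contain(a, b) for a, b in zip(class_names, class_names[1:]))
```

Whether the table is typed in by hand or generated from a model does not change the check itself, which is why this can stay an implementation choice.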
---

## Post #52 by @Seref

[quote="thomas.beale, post:48, topic:322"]
If I read this literally, it implies that the use of ‘CONTAINS’ in an AQL query ‘establishes’ something about the model
[/quote]

Leaving a typo aside (an EHR instance, not and EHR instance...), that sentence is not about the model. Please note the repeated use of **instances** in that sentence. Not types, but instances, since the structural constraint established (use/assume defined instead) by CONTAINS is a constraint on data, which in this context is what I refer to with **instances**.

You seem to be reading it with BMM in your mind and probably thinking I'm verbally defining metadata which you'd normally put into a meta model. That's not what I intend to do. I'm assuming RM (and potentially + demographics) and describing what CONTAINS means in the context of a query when that query is run on data (instances) that is based on openEHR types (RM+demographics). That assumption allows me to define query semantics without referring to any metadata.

I don't know if writing the AQL spec this way will deliver the balance between pragmatism and clarity but there is a certain motivation and thinking to it.

---

## Post #53 by @bna

We've had openEHR in production for 6 years now and AQL for 5 years without using, or for many years even knowing about, BMM. BMM is certainly not critical for us to understand how to execute AQL on openEHR data.

In recent years we've implemented a BMM parser and the possibility to validate our models (C# classes) with the BMM definitions. That's useful for verification against versions of the RM.

For AQL everything is more or less hand-coded. We've so far not faced serious issues related to the syntactical problems with AQL. What we have been struggling most with is how to interpret the semantic logic defined by the query editor and apply that to the underlying datasources. As you all know we use a combination of an RDBMS and a Lucene-based index in Apache Solr.
The real technical innovation is related to how we design the Lucene index and how we apply the AQL logic to the data. This is, of course, not represented in any openEHR specification :-)

We are currently rewriting the backend and query pipeline. This is part of the ordinary maintenance of such a critical component. As part of that we are re-visiting some of our previous assumptions on the understanding of how a query in AQL should be interpreted. Most interesting so far:

* How to handle order by for a few specific data values
* The understanding of implicit and explicit contains
* How to handle the permutation problem

I have shared our description of the problems and also proposed our understanding. IMHO we need to work together at such a practical level. Most of the problems are raised by real-life issues. We need to solve these first. Then, secondarily, we may put our shared consensus in the formal models.

I got a task from the management board to look into why developers are not involved in openEHR and modelling. I think many or most developers are problem solvers. They need to find simple answers to complex problems. To help and engage them we need to be concrete in our discussions, without losing our heritage as the best modelled specification ever (period).

---

## Post #54 by @thomas.beale

[quote="Seref, post:52, topic:322"]
I’m assuming RM ( and potentially + demographics) and describing what CONTAINS means in the context of a query when that query is run on data (instances) that is based on openEHR types (RM+demographics) That assumption allows me to define query semantics without referring to any metadata.
[/quote]

That seems to imply that AQL is a language specific to the openEHR RM. But it should be a query language applicable to any archetyped data based on any reference model - the semantics are the same.
So I'm still not clear how, in your explanatory text, the AQL processor could *know* that (say) COMPOSITION contains CLUSTER (and not the other way around), or EHR contains (by reference) COMPOSITION. --- ## Post #55 by @Seref [quote="thomas.beale, post:54, topic:322"] That seems to imply that AQL is a language specific to the openEHR RM. But it should be a query language applicable to any archetyped data based on any reference model - the semantics are the same. [/quote] I think it is quite clear that we disagree on this one :) At this point in time, with my Ocean hat on, I see zero business value in using AQL on anything other than RM (and maybe demographics along with that). Any extra work in specs, in implementation and even discussions is spending Ocean money with no commercial returns in the foreseeable future. I'm OK with making it to openEHR history as the man who misunderstood AQL most , but I'd make this point nonetheless. [quote="thomas.beale, post:54, topic:322"] So I’m still not clear how, in your explanatory text, the AQL processor could *know* that (say) COMPOSITION contains CLUSTER (and not the other way around), or EHR contains (by reference) COMPOSITION [/quote] That is absolutely outside the scope and intention of my text because I see how AQL processor works as implementation detail and leave it to reader of AQL spec to decide on how to make that work, which is what DIPS, Better, Ocean, Ethercis and EhrBase have done without anybody putting this into any spec in the last 10(?) years. --- ## Post #56 by @bna [quote="Seref, post:55, topic:322"] which is what DIPS, Better, Ocean, Ethercis and EhrBase have done without anybody putting this into any spec in the last 10(?) years. [/quote] That's a very good point Seref. You took the word out of my mouth. --- ## Post #57 by @ian.mcnicoll That implies that people are implementing some kind of generic rm-neutral aql processors which is certainly not true for ethercis and I assume ehrbase. 
I think Bjørn and Seref are both saying that this is not how their CDRs work.

---

## Post #58 by @thomas.beale

[quote="Seref, post:55, topic:322"]
I think it is quite clear that we disagree on this one :slight_smile: At this point in time, with my Ocean hat on, I see zero business value in using AQL on anything other than RM (and maybe demographics along with that). Any extra work in specs, in implementation and even discussions is spending Ocean money with no commercial returns in the foreseeable future.
[/quote]

Well I can't speak for Ocean, but there are two things I would say:

* it is very likely that we will want to use AQL to do querying over other archetyped data, for the simple reason that we are already creating other archetyped data, namely Demographics and Task Plans.
* there still isn't any AQL semantic that I can identify that should cause AQL to be limited to just one particular model (the openEHR RM). So I am unclear why we would specify it as if it were.

---

## Post #59 by @matijap

We at Better do, and will continue to do, parsing and syntactic validation using off-the-shelf lexers and parsers, with an ANTLR grammar that has some class names from the openEHR RM hard-coded. (Edit: well, as far as I'm aware, only the top-level EHR is really hard-coded for now.) If a need arises to add the few classes that occur in demographics, we will do that with fairly little effort.

Whether we have machine-readable (i.e. BMM) information on the fact that EHR contains COMPOSITIONs (in a different way than COMPOSITIONs contain SECTIONs and such) or not, does not make any difference to us: we treat EHRs in a totally different way than COMPOSITIONs, etc., for a technical reason that will not be affected by any amount of formal modelling.
(Whether OBSERVATION can contain SECTION or whether ACTION can contain COMPOSITION or some class from demographics in the model, we do not really care when querying: if it cannot, there will simply be no data and I see no use in validating such relationships.) Adding any kind of SYSTEM above EHR, or PARTY and whatnot below that, will require manual labor which we are gladly willing to do if we see value in it for our customers. We will not rewrite our stack to be more generic -- not because it is hard or not worth it, but because we believe it can not be as fast and as flexible, from the customer's point of view, as the market needs for its (non-generic) use cases. We need a good (i.e. understandable and strict) human-readable specification so that most questions like the ones that @bna provides a constant stream (and now he revealed why :slight_smile: ) can be answered simply by pointing out a sentence or paragraph in the specification (that can hopefully be interpreted only in one way). Now two questions arise. 1. Is a machine-readable formal model (BMM) a necessary step towards such a human-readable specification? I think not, because while it might define behaviour strictly, what will happen is that we will find a vague sentence in specification, inquire the formal model to resolve the dilemma, and find out that we do not like the answer, and then we'll have two problems instead of one. Maybe I'm wrong. 2. What amount of openEHR RM may appear in AQL specification: (a) not at all, (b) only in code examples, (c ) also in clarifications in text, (d) also in the specification of the language features. I think (c ) is the correct answer, others may prove me wrong. 
We've had some discussions in other threads lately whether AQL engines may contain some exceptions when querying EHR data (like not showing incomplete compositions, or even not showing data not belonging to PARTY_SELF, unless explicitly instructed to); if we went this way and would explain this in the AQL spec, that would be the (d) way, which is arguably wrong (but putting that information elsewhere is dangerous as well, so that's another dilemma). What I'm trying to say is that no matter how generic and loosely-coupled the specification will be, implementations will not be, so effort should be put into all this machinery only if it will aid understanding (and I'm pretty sure it will not) or if it will enable some kind of automated validation (which I do not see how). --- ## Post #60 by @pablo Hi @bna I guess that request is then mapped into an AQL expression, for instance ehrIds and values, the rest I don't think have support in an AQL expression. Not sure if that is related to the FROM definition or maybe a proposal for the REST API. --- ## Post #61 by @pablo [quote="sebastian.iancu, post:25, topic:322"] SELECT projection (= subset of columns or properties of the selected rowset) FROM domain (= rowset source, usually tables or classes/types from which columns/properties projection is defined) WHERE criteria (= rowset retrieval criteria, by filtering on their values) [/quote] I really like the idea of adding something like that as a summary at the beginning of spec, it's short and straightforward. I would suggest not to use the term "row" or "rowset" because that could indicate or suggest an implementation technology. If we are not referring to the "row" in the sense of Relational Databases, and we want to keep using the terms, we should define our own "row" and "rowset" semantics in the AQL spec, which is also related to giving an idea of how an AQL processor /evaluator/execution should work. 
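
One way to picture the quoted SELECT / FROM / WHERE reading without leaning on relational "row" terminology is as three functions applied to matched data instances. The following is only a minimal Python sketch: the in-memory `instances`, their field names and the `evaluate` helper are all hypothetical and say nothing about how a real AQL engine stores or evaluates data.

```python
# Hypothetical in-memory data instances; the field names are illustrative only.
instances = [
    {"type": "EHR", "ehr_id": "ehr-1"},
    {"type": "EHR", "ehr_id": "ehr-2"},
    {"type": "COMPOSITION", "ehr_id": "ehr-1"},
]


def evaluate(instances, domain, criteria, projection):
    """Apply FROM (domain), then WHERE (criteria), then SELECT (projection)."""
    in_scope = [i for i in instances if domain(i)]      # FROM: which instances are in scope
    matching = [i for i in in_scope if criteria(i)]     # WHERE: filter on their values
    return [projection(i) for i in matching]            # SELECT: what to return for each match


# Roughly: SELECT e/ehr_id/value FROM EHR e WHERE e/ehr_id/value = 'ehr-1'
result = evaluate(
    instances,
    domain=lambda i: i["type"] == "EHR",
    criteria=lambda i: i["ehr_id"] == "ehr-1",
    projection=lambda i: i["ehr_id"],
)
```

Here the result is a plain list of projected values, so whatever name the spec chooses for that shape would need its own definition.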
BTW I like the idea of giving implementation hints in the spec, but don't know if we need to separate things or create a more complete spec. With this I mean, to have a complete query spec we need to define:

1. syntax
2. query processing/evaluation/execution
3. result set

We are very close to having a good "syntax spec", but we are lacking on the rest.

---

## Post #62 by @pablo

@all I have committed an improvement of the FROM definition to my PR: https://github.com/openEHR/specifications-QUERY/pull/5/commits/9517f7b2dedf3dc083b23c656065a97d5114d14b

I tried to remove any reference to a CDR or a specific RM, this is still WIP.

Rewriting that I realized we need to mention that AQL is for any RM, but the RM should comply with a couple of things:

1. should be an OO RM (since the FROM uses class names)
2. the RM should be used in a dual-model environment (without this we don't have archetype IDs or paths)

In the current v1.0 spec we have "Archetype Query Language (AQL) is a declarative query language developed specifically for expressing queries used for searching and retrieving the clinical data found in archetype-based EHRs."

We might be covered for point 2. with "...data found in archetype-based EHRs...", but it is not so explicit. Also, there are still many references to the openEHR RM and to EHRs in general, constraining the scope of AQL to CDRs only, a constraint we need to remove.

Still I think both conditions should be explicitly mentioned in the AQL introduction (OOM + dual-model), and also mention AQL works on any RM that complies with those conditions.

What do you think?

---

## Post #63 by @bna

This is a great answer @matijap and you truly show why you are Better :-)

Response to your two questions:

[quote="matijap, post:59, topic:322"]
Is a machine-readable formal model (BMM) a necessary step towards such a human-readable specification?
[/quote]

As I wrote earlier

[quote="bna, post:53, topic:322"]
What we have been struggling most with is how to interpret the semantic logic defined by the query editor and apply that to the underlying datasources.
[/quote]

What I meant by this is:

1. To teach the (internal) developers using our backend to use it properly and explain what the expected output will be based on the query they propose. Most developers have very high competence and experience using SQL. They think AQL is the same - but it's not. That is confusing and for many disappointing.
2. For the core team, who have been working with openEHR since 2010, to find out how a given AQL is expected to map into complex hierarchical datastructures and produce the resultset that both a "semi-clinical-tech" and the "high-competent-openEHR-expert" think is correct. Often we find a discrepancy here. I.e. @ian.mcnicoll had some issues accepting the Glasgow Coma Scale example [given here](https://github.com/bjornna/openehr-conformance/blob/master/aql/case1.1-permutation_gcs/index.adoc). And we, as a community and SEC group, have not yet found a shared solution for the permutation problem as [explained here](https://github.com/bjornna/openehr-conformance/blob/master/aql/case1.1-permutation_bp/index.adoc). For both of the latter examples we, DIPS, are working on some assumptions and query logic which seem to solve it. I will share it as soon as I am able to understand what the developers are doing currently (it's AFAIK heavy stuff, but I think/have heard rumours that the Better guys already have some solution to this).
3. And of course the ORDER BY issue, like "should there be a default order if no order is given", and how to order data types. And similar to this, how to handle NULL when ordering? Do we need some operator to explicitly give the AQL engine hints about [i.e. NULL FIRST](http://www.dba-oracle.com/t_order_by_nulls_first.htm)

All the examples above are more informal and descriptive than a formal modelling definition.
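
For instance, the NULL question in point 3 comes down to a choice like the one in this minimal Python sketch. The `order_by` helper and its `nulls_first` flag are hypothetical names used for illustration; AQL does not define either, and no particular default is implied.

```python
def order_by(results, key, nulls_first=False):
    """Sort result entries by `key`, placing entries with a missing or None value first or last."""
    present = sorted(
        (r for r in results if r.get(key) is not None),
        key=lambda r: r[key],
    )
    missing = [r for r in results if r.get(key) is None]
    return missing + present if nulls_first else present + missing


# Hypothetical result entries; the third one has no value at all for "systolic".
results = [{"systolic": 140}, {"systolic": None}, {}, {"systolic": 120}]

nulls_last = order_by(results, "systolic")
nulls_up_front = order_by(results, "systolic", nulls_first=True)
```

The two calls return the same entries in different positions, which is exactly the kind of behaviour that has to be agreed on and written down, however it is later implemented.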
Others might have a different view on this. But I must say for us, DIPS, what is important in the short term is to define the expected rules for the problems raised above. And I cannot see how to work with these kinds of problems without discussing them. So far my questions related to ORDER BY have been answered with "this can be fixed in BMM by some infix operators". That's fine I think, but for AQL we simply don't care because the AQL pipeline is extremely handcrafted and optimized for our specific implementation. All we need is an informal description of what we agree on as a SEC group.

[quote="matijap, post:59, topic:322"]
What amount of openEHR RM may appear in AQL specification
[/quote]

Current use of AQL is limited to querying EHR RM based data. I agree with @matijap that we need some clarifications in text which cover the use-cases that customers or clients will face. If at some time in the future we do more work on DEMOGRAPHICS or TASKPLANNING then we may add text to clarify such use-cases. I think @sebastian.iancu will provide some good use-cases for DEMOGRAPHICS and we will be eager to learn about their experiences.

And as a final note to self: @Seref - I am sorry for not responding to your initial post in this topic. I think you made a really good start for the discussion. And it was so good that I didn't have any specific comments on it. It made sense to me :-)

---

## Post #64 by @bna

[quote="pablo, post:60, topic:322, full:true"]
Hi @bna I guess that request is then mapped into an AQL expression, for instance ehrIds and values, the rest I don’t think have support in an AQL expression. Not sure if that is related to the FROM definition or maybe a proposal for the REST API.
[/quote]

@pablo - sorry, I lost the context for this reply. Which of my posts are you referring to?
--- ## Post #65 by @sebastian.iancu [quote="pablo, post:61, topic:322"] I would suggest not to use the term “row” or “rowset” because that could indicate or suggest an implementation technology. [/quote] I was referring (hinting) to our (openEHR) row/result-set or whatever that type-name we will have in our specification (because we should have those types specified, including json-schema, xsd). --- ## Post #66 by @bna [quote="sebastian.iancu, post:65, topic:322"] I was referring (hinting) to our (openEHR) row/result-set or whatever that type-name we will have in our specification (because we should have those types specified, including json-schema, xsd) [/quote] I agree. Row and columns in this context is a description of the format of the result from the executed AQL. It's not implementation specific. The terms are AQL specific definitions of the models/types/classes used in AQL. --- ## Post #67 by @pablo That's ok, my point is: if we use those terms, we need to define them in the spec or reference an external definition. In fact, that is the point of this discussion and also the other thread about defining the operators for the simple types, because we build complex concepts without defining the basic semantics of their internal components. --- **Canonical:** https://discourse.openehr.org/t/aql-formal-definition-of-from-clause/322 **Original content:** https://discourse.openehr.org/t/aql-formal-definition-of-from-clause/322