AQL (and related) documentation

thomas.beale · 5 March 2020 13:15

well, things like this we have discussed before, but they belong in the realm of semantic interpretations of the data, not the mechanical act of getting it out of the persistent store. So further filtering or other smart behaviours based on the content / values has to be in a layer above, which today no-one has, and for which we have not even a draft spec. From @ian.mcnicoll’s point of view, these kinds of things are burning clinical safety priorities (I agree); it’s just that you can’t achieve all levels of semantics in the one layer of functionality.

Now, you could achieve some (all?) of these kinds of things with rules, e.g. a rule that states something about draft VERSIONs. What AQL would then need is a generalised notion of such rules, so that any such rule could be written. These rules can be simple to start with. But it would be wrong (again) to embed in the spec statements like ‘if the data model of the query is openEHR RM, and the data item is inside an instance of VERSION class that has draft=true, then don’t return the object’ or whatever.

Instead, the AQL spec should have a general capability to post process (or maybe inline process) some rules that might knock out certain query results if a general pattern is matched (e.g. the content was inside some container considered ‘incomplete’ or ‘draft’). The notion of draft or incomplete is a general one; how openEHR RM or any other model marks its content as draft or incomplete is a model-specific thing. So somewhere in the query processing process, a ‘remove draft data’ filter is activated, and it will have to look up the appropriate rule that does this for the data at hand.

These sorts of things are somewhat sophisticated, but never have to be known by query authors, only by query engine implementers (probably < 10 people in the whole world). For authors, an engine that knows how to remove or include ‘draft’ info on a switch just does so magically.
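As a rough illustration of the idea above (a generic ‘remove draft data’ switch that looks up a model-specific rule), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `ResultRow` type, the `DRAFT_RULES` registry, the model names, and the way a container exposes its state as a dictionary. It is not taken from the AQL spec or from any engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ResultRow:
    """Hypothetical query result: a value plus metadata of its enclosing container."""
    value: object
    container: dict


# A draft rule answers one generic question: is this container draft/incomplete?
DraftRule = Callable[[dict], bool]

# Hypothetical registry: each data model supplies its own way of marking drafts.
# The string encodings below are assumptions, not the RM's actual representation.
DRAFT_RULES: Dict[str, DraftRule] = {
    "openEHR-RM": lambda c: c.get("lifecycle_state") == "incomplete",
    "other-model": lambda c: bool(c.get("draft", False)),
}


def filter_results(rows: List[ResultRow], model: str,
                   include_draft: bool = False) -> List[ResultRow]:
    """Generic post-processing step; only the rule lookup is model-specific."""
    if include_draft:
        return list(rows)
    is_draft = DRAFT_RULES[model]
    return [row for row in rows if not is_draft(row.container)]
```

The query author only ever sees the `include_draft` switch; the knowledge of how a given model marks draft content lives in the registry, which is the engine implementer's concern.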

Right. As we infer the general rules / semantics, we can put them into the AQL spec. While we don’t know what the general case is, we don’t put anything (certainly not specific references to particular models or classes).

In general, I think any shared specification should only be the general case ‘rules’ (structures, whatever) that have been inferred by study of specific systems and contemplation of known theory. In other words, normal scientific method. A specification (i.e. a language, model etc) is just today’s version of a general theory about some topic.

So the bottom up approach here is that the AQL spec is anaemic on a lot of semantics, but that with the experiences to date, we can slowly start determining the general formulation of various features (CONTAINS, projections, draft content inclusion …) and when we agree on them, we add them in. Before we have them, the community is still experimenting, and we don’t know what the general formulation that can be put into the spec is. Again, normal experimental science approach.

matijap · 5 March 2020 14:05

Like @sebastian.iancu said, the two approaches are not exclusive. However, as (a representative of) an implementor, I see much more value in a document that is a slight mix of query language and RM: definitely I’d like to see openEHR RM classes in the examples in the AQL spec, and most probably also in clarification sentences, for example while CONTAINS would be defined in abstract terms, there should also be a clarification that an example of indirect reference is between EHR and COMPOSITION, and an example of direct reference is between COMPOSITION and SECTION. This requires just a note on top stating that most (all?) examples and clarifications refer to the openEHR RM.

As for the special processing cases (like the one with version lifecycle) I’m open for discussion, but for simplicity’s sake I’d just put them into another section in the same document, and also put a note in the lifecycle_state attribute description about the querying exception (yes, AQL would be referenced informally in the model specification).

thomas.beale · 5 March 2020 14:10

Right. generic definition; concrete examples (that are familiar).

Agree here, after all, that is part of the semantics we want for the ‘incomplete’ lifecycle state - that it somehow affects querying. To be completely correct though it should not mention AQL, but just querying methods in general. That way, it correctly defines not just what happens in openEHR systems that use AQL, but also ones that do it a different way (e.g. Code24, EHRserver).

matijap · 5 March 2020 14:11

The exact notion about CONTAINS was not; I made it up as a radical example. The fact about specially handling some attributes (like by default hiding incomplete versions, or hiding non-PARTY_SELF content) was discussed and if any of those are agreed upon, there’s a question of where to describe them.

Seref · 5 March 2020 14:57

I apologise, I misunderstood the nature of your disagreement with me.

To further clarify things, I do not object to AQL’s generic nature. As I said under the FROM semantics discussion:

I merely attempted to follow an approach that considers various realities of software development, but it failed for reasons which I personally find extremely disappointing.

I’m OK as long as I can clarify my position with you and other vendors, that’s all.

rong.chen · 5 March 2020 19:36

Sorry to jump in at a late stage, and I won’t comment on specifics since we don’t implement AQL/CDR ourselves.

However some of the discussions here do echo what’s happened/happening in the decision/process support space, namely the TP and EL specifications, and now the brand new specs for DS/M. I feel we as industry members/implementers of given areas (CDR/CDS etc) are not fully involved/consulted enough before new specifications take place. As an open standards based CDS product supplier, we are very sensitive to any significant changes to GDL or related/emerging CDS design specifications in the openEHR space. These changes could raise lots of questions from our existing/potential customers, especially from large national bodies/regional authorities. And it’s hard to exaggerate the burden (and business risks) potentially caused by rapidly changing specs.

I think other vendors share the same concern, which was noted on 8th of Jan, 2020 at our first industry members meeting:

we need to improve our efforts of governance of specifications.

Where to go from here, I really think it’s important to consult key implementers/suppliers of a given technology area before any major changes of specs in that relevant space. We need to be very careful about introducing new specs, especially if there is considerable overlap with existing ones. The SEC’s decisions to take certain directions (change/introduce new design) must be motivated by clear business cases.

bna · 5 March 2020 20:36

There is IMHO only one way to make the needed progress related to AQL now. We need, as a group, to work out the semantic understanding of what we want to achieve with AQL queries. This is absolutely needed to make AQL portable.

I have been the product owner for our openEHR solutions in DIPS AS since we started our openEHR journey in 2010. Our most important asset when developing our solutions is the narrative descriptions in the specifications. The narratives define the intention and context for the implementation. This is the most important message to the developers. For every developer I have met I find they have no problem doing the technical implementation. We have not used the BMM formalism when developing the solutions. It has, so far, not been requested by any developer. The hardest part of implementing openEHR is to understand e-health and how to apply a specification to the underlying persistence layer and development language used by the vendor. This is, as far as we can see, work that has to be done carefully by the development team and cannot be done “automagically” based on abstract models.

I assume the story might be different for modelling tools. For those you might have to adjust to different reference models and need an abstract model as the basis for your implementation. In such use-cases the BMM might be a perfect match. We’ve also seen its usage as a way to test our implementations with the different RM versions.

Handling data has to be dealt with carefully with a deep understanding of your persistence model. As you all know we use Apache SOLR as our core engine for indexing the data. The last weeks we’ve re-implemented the core usage of SOLR. We’ve used lots of hours working with the details in the underlying Lucene index and the core parts of SOLR. We are cooperating with one of the committers in the Apache SOLR group on this work. We have a developer who holds a doctorate in computer science leading the development. He is one of the brightest minds I have met. My intention with this writing is to explain how much effort we have to put into the details of the persistence layer to implement a CDR. I don’t know how we could use any more formalism in the specification to guide or help us with this.

Back to AQL:
It’s great to have such a portable query language. Kudos to the people who invented it!
Right now we have a few vendors with CDRs supporting AQL. I know the queries are not directly portable. This is related to some of the semantics in the openEHR data (and archetypes). Most important is to work out a shared understanding of how to map hierarchical data into AQL, and then we have the datatype and equal/order. Similar to this we’ve seen the need for functions within AQL. Most needed were the INSTRUCTION/ACTIVITY/ACTION related operators which DIPS and Better both support.

We as a SEC group have to work on these topics and discuss the semantics. The outcome of these discussions might be written into some formal definition. But the semantic agreement has to come first!

Seref · 6 March 2020 10:41

I thought long about leaving this without a response. I decided that you need to hear a particular point of view, mine, in this case, since I think I earned the right to express that.

I find your overconfidence in your engineering skills troubling Tom. You seem to be disregarding concerns from implementers too easily, mine to be specific in this case, from a perfectionist, idealist point of view. I really did make an effort to emphasise that there is a trade off here and we may gradually improve things, but you’re displaying a repeating pattern of simply dismissing implementation and adoption related concerns, in addition to trivialising all implementation work, despite not always necessarily having actual experience or expertise in particulars of it.

Members of this group, including myself, are aware of many of the principles you come across as lecturing, despite not every one of them being engineers. We are not always in a position to adopt those principles fully and openEHR SEC needs to accommodate that fact.

I’m not sure how SEC can function as a committee without you, given your position, developing a habit of considering pragmatic concerns and making an effort to strike some balance in specification work.

I am truly sorry having to express all the above, but regardless of your opinions of me, I deserve the right to hear some convincing responses to my questions and concerns, both as a SEC member and as a member of openEHR community, without condescension and with a level of courtesy that you’ve always been receiving but failing to give at times. I hope we’ll see some indication of you considering what I’m saying here in your future correspondences.

I do not expect or wish to receive a response, this is all for your consideration, not written with a wish to discuss.

thomas.beale · 6 March 2020 12:01

Well I won’t respond to the main criticism, but I want to make it clear that I am not at all dismissing implementation concerns or working from a theoretical or perfectionist point of view. The only thing I have said in this entire thread is that we should observe a basic, orthodox dictum in IT, which is not to create dependency or close coupling where it can be avoided and doesn’t appear to be needed. The cost of doing that in AQL compared to the cost of making it openEHR RM specific and dependent I would guess to be slightly more in the short term, but greatly lower in the long term.

For me the specifications and archetypes aren’t really the important thing, they are interim artefacts that establish explicit agreements on shared semantics of something or other that will then have some useful results in real systems that function in places of care. I’m really only interested in that final result.

If I had had a theoretical point of view on all of this, I would have gone and turned it into a PhD. Instead I have just worked on it in the industry context, to try and make an effect in the real world. As part of that, I have worked as closely as possible with vendor companies (including Ocean, which I started in 2000 with MD colleagues and left only a few years ago).

Anyway, I’m certainly sorry if any of my posts here seemed discourteous, they weren’t meant to be, and I apologise if anyone found them otherwise. I always assume this group is up for technical discussions and debate, and for me, it’s never, ever personal. I’m sorry if anyone else thought it was.

Anyway, if people want to tell me it is better for me to go, no problem, I won’t take offence at that either. Maybe it is.

sebastian.iancu · 6 March 2020 12:15

Pff, … what a thread!!

Come on guys, we are here to resolve problems, not to create new ones. How many hours have we already burned on this topic?

I’m reiterating one of my initial posts, where I tried to say that, from my point of view, both Thomas and Seref are right, both solutions are needed and are complementary, it is just a matter of prioritizing and investing in the right resources.

Instead of focusing on our differences, on what makes us apart, let’s focus on what we have in common, how can we use all our multi-diverse experience and competency to create better specs.

And certainly we need all of us on board!

ian.mcnicoll · 6 March 2020 16:16

Thomas, I can assure you that this is absolutely not the case. Every single member of this community, me included, holds you in the highest regard, for the seminal contribution you made to openEHR, for the ongoing drive and energy that you continue to show, and as a true personal friend.

@Sebastian I understand your frustration. We can all get passionate and forget ourselves at times but there are some very important points of principle in here (strongly felt) about how we proceed.

I actually think most people agree with you that there is room for two types of documentation, but with a clear preference for putting almost all of our collective energy into some kind of openEHR RM focussed AQL documentation:

1. an ongoing tidy-up of the ‘core AQL’ specification which should remain model-neutral, other than with some openEHR examples.
2. a more practical ‘openEHR AQL’ document, perhaps phrased as guidance, and in ITS if that helps, that very practically works through some of the issues that Seref, Matija and Bjorn have highlighted, and will allow us to make rapid progress on AQL openEHR conformance discussion, and can inform (1) over time.

I’d like to make a start on this with Bjorn to chew over his datatype comparison/ordering questions (which from my POV have a significant clinical safety/sense aspect) and perhaps then fold in Seref’s original ‘FROM’ issues comments at the top of this thread.

I’m open to further suggestions about how and where this is done but my preference would be to carry on as before with Topics in discourse, at least for a bit.

How does that sound for moving forward?

Ian

sebastian.iancu · 6 March 2020 20:31

@ian.mcnicoll, I’m fine with your proposals, so far. I have nothing against it.
One more remark (still have to make sure people are not going to omit it) is that AQL should (“must” IMO) also work with demographics, so please consider this in the specs, and in the examples for 1.1 release.

ian.mcnicoll · 6 March 2020 21:25

Absolutely, though I suspect most CDR vendors may want to use that as a facade to support population queries, e.g. on a few simple aspects like Dob (age), sex, gender, locality, without actually having a real openEHR demographics engine sitting behind that, as you do.

bna · 6 March 2020 21:31

Please don’t. Since 2010 I have worked with openEHR. We have been a team of a few core members reading the specifications over and over again. For each year implementing improved solutions. What we found most appealing was all the details explained so well about how to build a structured clinical data repository. So please forgive me if I told you this was not a huge gift for e-health.

At the same time. In our company there have only been a few who are able to read the spec and apply it to implementations. That’s ok. I think we also need an abstract formalism to make sustainable platforms.

I am so proud to be part of the openEHR community. All the great people who will use their time to read this post and to think about the same problems that I face in my day-to-day life. We, as a community, are unique. We have expertise in different areas and we need to embrace this.

Coming in to openEHR in 2010 I thought the community was bigger. I thought there were lots of vendors delivering real systems based on openEHR. I was wrong. There was only one; Oceans. And there was this “new kid on the block”, Marand. We met Thomas, Ian and a few guys from Marand at the MIE conference in Oslo. It was a high fame factor for us to meet you guys. And we were so impressed by the way you talked, the specifications you had and the platform Marand had implemented.

Since then we’ve implemented three generations of openEHR software without the need to convert data. That’s fantastic. And it is a lot of kudos for all the good you all have done with the specifications.

We need your bright mind. That’s for sure.
We also need good guidelines for developers to cover all the dirty details. And this is an area where I am not sure if we can handle by formalism. This is the area where we have to talk about dirty details which might never be formalized.

I intend to work with e-health and openEHR for the rest of my career. I hope you all can bear with me, and I hope to be part of a community where we can argue loudly and still be friends. I want to be part of a vibrant community with a shared vision. I think openEHR is the only option in the world. I hope and think we can continue the work we are doing and in the future get even more people involved in the work.

I wish you all a great weekend.

PS! I am using the weekend to work on the COVID-19 application based on openEHR. It is a great example and opportunity to demonstrate for the whole world what we have made together based on the heritage from some really great minds.

sebastian.iancu · 6 March 2020 21:37

Well, I’m not pushing anybody towards supporting demographics if their system actually does not implement it (or support it) already. It is just that once it is going to be used by more than us, it should be clear what the query should look like.
It is perhaps also a marketing/sales statement, that openEHR has also Demographics and it could be used in AQL without any problem, and that is more powerful (and complex) than PERSON/PRACTITIONER from FHIR… Therefore it would be nice not to have anymore in text any statement like “AQL is to query medical data” or “to query CDR (clinical)”, etc. (shortly saying), because that implies that its purpose is to query EHRs.

sebastian.iancu · 6 March 2020 21:42

owww, such a warm message … let me hug you!

bna · 6 March 2020 21:47

You might when the COVID-19 period is over… haha

But thanks Sebastian. I appreciate your feedback!

ian.mcnicoll · 6 March 2020 21:50

I completely agree - we need to show AQL does demographics in a vendor-neutral manner and it makes sense to use the DEMOGRAPHICS RM as a facade for that, especially as the key queryable content will be archetyped in CLUSTERS and not directly tied to RM attributes/classes. So for instance, we can use the FHIR-aligned demographic archetypes as the nominal ‘target’ for population style queries, again even if what is actually under the hood is neither FHIR nor openEHR DEMOGRAPHICS.

You will want to, and are able to, dig deeper into supporting DEMOGRAPHICS but this will probably not be necessary or doable for others. That’s fine. Mixed economy always wins.

thomas.beale · 7 March 2020 16:25

I can’t remember whether I published my FHIR demographics analysis here, but just in case, it’s here.

yampeku · 7 March 2020 19:03

Wow, somehow I missed this thread! I for one, have been using every part of the specifications for other RM since forever… except AQL. AQL has always been a little out of our scope because, well, we used other query languages for querying and validating data. I have to say that archetypes work with whatever model you can think of. I think we have made lots of proposals in the past to ensure that more or less everything works regardless of the model (e.g. CDA having uppercase attributes and ADL not really supporting that), and I’m happy to say that in that regard every part of the model is quite robust. I’m sure more things will be needed to make it completely model agnostic, but we have, as a community, always found the best solution (or a good compromise) to get things working.

For AQL, I think the 2 most explicit issues we always have had are actually the ones being discussed these days:

What does AQL return?
The things that AQL assumes as known by the engine (standard dependent). This includes quite a lot of things that BMM de facto solves (what is my node id attribute for this model? which are the valid parent classes? etc.)
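To make the second point concrete, here is a small Python sketch of the kind of model knowledge an engine would have to look up rather than hard-code. The `ModelInfo` class and `MODELS` table are hypothetical and hand-written for illustration; they are not the BMM format or its API, and the openEHR entry is only a tiny excerpt (node id attribute `archetype_node_id`, SECTION allowed under COMPOSITION or SECTION).

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ModelInfo:
    """Hypothetical per-model metadata a query engine could consult."""
    node_id_attribute: str
    # For each class, the set of classes that may directly contain it.
    parents: Dict[str, Set[str]] = field(default_factory=dict)

    def can_contain(self, parent: str, child: str) -> bool:
        """True if `parent` is a valid direct container of `child` in this model."""
        return parent in self.parents.get(child, set())


# Hand-written excerpt for illustration only; a real engine would load this
# from a model description instead of listing it in code.
MODELS: Dict[str, ModelInfo] = {
    "openEHR-RM": ModelInfo(
        node_id_attribute="archetype_node_id",
        parents={
            "COMPOSITION": set(),
            "SECTION": {"COMPOSITION", "SECTION"},
        },
    ),
}
```

With something like this, the engine can answer “what is my node id attribute?” and “is this a valid parent class?” per model, instead of assuming the openEHR RM everywhere.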

After all these years of being the “weird nitpicky Spanish guys” we have only received love from this community. So this is to say that we love you all as much