AQL through GraphQL

No openEHR application is built in isolation. There are many services running alongside it and the client usually needs to make multiple requests to obtain different data points like demographics, clinical content, terminology etc. Some views may also require executing multiple AQL queries on the client to get the necessary data.

The Problem: The client application needs to make more HTTP requests than necessary. This becomes especially cumbersome when different services have different authentication mechanisms. On slow networks, the difference is actually noticeable (Authentication requests for each service + the request for the actual data. Multiplied by 2 if it’s on a different origin due to CORS).
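To make the request arithmetic above concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the worst case described: one authentication request plus one data request per service, doubled when the browser sends a CORS preflight. The function name and parameters are made up for illustration. In practice browsers skip the preflight for "simple" requests and can cache preflight responses, so real numbers are often lower.

```python
def round_trips(services: int,
                auth_per_service: int = 1,
                data_per_service: int = 1,
                cross_origin: bool = False) -> int:
    """Worst-case number of HTTP round trips made by the client.

    Assumption: every request to a different origin is preceded by
    an OPTIONS preflight, which doubles the count.
    """
    per_service = auth_per_service + data_per_service
    preflight_factor = 2 if cross_origin else 1
    return services * per_service * preflight_factor


# Three separate services called directly from the browser, all cross-origin.
direct = round_trips(services=3, cross_origin=True)

# A single aggregating endpoint, still cross-origin.
aggregated = round_trips(services=1, cross_origin=True)

print(direct, aggregated)
```

Under these assumptions the client goes from 12 round trips to 4; whether that matters depends on the network the client is on.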

Proposed Solution: Write GraphQL resolvers to join all these microservices and provide a unified API for the client to consume.
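As a minimal sketch of what such a resolver does (not tied to any real GraphQL library), the function below joins demographics and clinical content on the server side so the client only makes one call. Everything here is hypothetical: the function names, the `ehr_id` and `name` keys, and the demo data are placeholders, and the two fetchers are injected so they can be swapped for real service calls.

```python
from typing import Any, Callable, Dict, List


def resolve_patient_overview(
    fetch_demographics: Callable[[], List[Dict[str, Any]]],
    fetch_clinical: Callable[[List[str]], Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Join demographics and clinical content into one response."""
    patients = fetch_demographics()
    ehr_ids = [p["ehr_id"] for p in patients]
    # One call for all EHR IDs, instead of one call per patient.
    clinical = fetch_clinical(ehr_ids)
    return [
        {
            "ehr_id": p["ehr_id"],
            "name": p["name"],
            # None when the clinical service has nothing for this patient.
            "clinical": clinical.get(p["ehr_id"]),
        }
        for p in patients
    ]


# Hypothetical stand-ins for the real services (made-up demo data).
def demo_demographics() -> List[Dict[str, Any]]:
    return [{"ehr_id": "ehr-1", "name": "Alice"},
            {"ehr_id": "ehr-2", "name": "Bob"}]


def demo_clinical(ehr_ids: List[str]) -> Dict[str, Any]:
    return {"ehr-1": {"systolic": 120}}


overview = resolve_patient_overview(demo_demographics, demo_clinical)
```

In a real GraphQL server the same join would sit behind a field resolver; the point is only that the fan-out to the individual services happens server-side.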

The Problem: How does one represent AQL through a GraphQL API? Has anyone worked on this? Written schemas etc?

Thought this might be a problem worth exploring as a community.

I know @ian.mcnicoll has mentioned thoughts regarding GraphQL for openEHR.

Regarding data retrieval, I believe several openEHR CDR implementations have ways to combine (e.g. nest) AQL queries and/or merge responses on the server side, thus reducing the number of HTTP calls (@bna can probably link to some of the DIPS stuff).

In the paper “Applying representational state transfer (REST) architecture to archetype-based electronic health record systems” (BMC Medical Informatics and Decision Making; a long time ago now) we showed an example where we embedded AQL inside XQuery inside HTML…

Warning: Below I go off topic from the “AQL” part of the topic…

I think there is definitely room for GraphQL in the openEHR space, especially for gradually constructing input data (possibly several related COMPOSITIONs) to be submitted later as one big CONTRIBUTION object to a CDR/backend. I’d suggest making something like the “contribution builder” described in the BMC REST paper linked above, and making sure that, in addition to the “raw” (verbose) canonical openEHR JSON (and/or XML) format, it can also use something like the simplified template-specific “structured” openEHR format (the IDs of the specific templates would then of course need to be provided).

If we can make a multi-user contribution builder, it would likely make it easier for developers to create multi-user, multi-device input UIs. We need this, for example, in emergency wards, where a team is simultaneously documenting different aspects of the same patient. The best user experience would of course be something based on “Operational Transformation” (the same algorithm that Google Docs uses), but multi-device sync via GraphQL could go a long way too, if implemented efficiently.


Not exactly the same, but as soon as EHRBase supports getting data out in XML I want to embed AQL queries directly on the XQuery just for science :rofl:

Interesting proposal!

GraphQL was mentioned several times in the SEC over the last year, but we did not have any concrete plans/implementations/requirements on this subject (other than perhaps @ian.mcnicoll being very eager to dive in). So if you have something in this field, I’m pretty sure there will be interest on the SEC side.

:+1:

I’m 98% sure GraphQL is just one possible alternative ITS for AQL. It handles sparse structure querying via paths, i.e. just what AQL and archetypes provide. So I think that if people want to map AQL and archetypes into GraphQL tech, it should be relatively easy.

I’m not sure if adding a layer on top would resolve latency issues in a general case. This should be measured per case, since it depends on the network topology, and based on that, plan a solution.

If there is an aggregator somewhere, the aggregator will also have its latencies for executing different services and waiting for results to join stuff.

Another point is how the whole system architecture is laid down. Where is the client, where are the servers/services, does the network bandwidth support this kind of architecture, if not, what’s the alternative, etc etc.

I’m just curious to know whether we understand the problem before jumping into a technical solution, whether this is really an issue, and whether there are any alternatives or better solutions.

A partly related workaround-question regarding user experience during (possibly slow) data retrieval:

@Sidharth_Ramesh since you seem to be experienced in Svelte, perhaps you can figure out whether it would be easy to make a Svelte store of some kind for multiple (possibly paged) AQL responses (retrieved from the server via the already available openEHR AQL REST API).

A use case would be to simplify coding of dynamically/incrementally loaded visualizations (timelines etc.) based on a set of different AQL queries (and/or paged results from a single AQL query). This could give the clinician something useful to start working with while the rest of the data is being fetched. It could also help developers write reactive, possibly a bit more declarative, visualizations that just listen to events from the store being populated with AQL response data.
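To illustrate the idea in a framework-neutral way, here is a small Python sketch of such a store. It mirrors the Svelte store contract (a subscriber is called immediately with the current value and gets back an unsubscribe function), but the class and method names are hypothetical, and fetching the pages from the AQL REST API is left out.

```python
from typing import Any, Callable, List

Row = List[Any]
Subscriber = Callable[[List[Row]], None]


class AqlResultStore:
    """Accumulates rows from AQL responses and notifies listeners."""

    def __init__(self) -> None:
        self._rows: List[Row] = []
        self._subscribers: List[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> Callable[[], None]:
        self._subscribers.append(callback)
        # Call immediately with whatever has been loaded so far.
        callback(list(self._rows))

        def unsubscribe() -> None:
            self._subscribers.remove(callback)

        return unsubscribe

    def add_page(self, rows: List[Row]) -> None:
        """Append one page of rows and notify every subscriber."""
        self._rows.extend(rows)
        for callback in list(self._subscribers):
            # Each subscriber gets its own copy of the accumulated rows.
            callback(list(self._rows))


# Usage: a "visualization" that just records how many rows it has seen.
store = AqlResultStore()
seen_counts: List[int] = []
unsubscribe = store.subscribe(lambda rows: seen_counts.append(len(rows)))

store.add_page([["ehr-1", 120], ["ehr-2", 135]])  # first page arrives
store.add_page([["ehr-3", 110]])                  # second page arrives
unsubscribe()
store.add_page([["ehr-4", 128]])                  # no longer observed
```

The listener can render after the first page and simply re-render as later pages (or results from other queries) arrive.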

Yes I agree - these are all important questions to understand before jumping to a technology solution.

On an orthogonal note, exploring this kind of technology would be a good thesis project for students. Maybe people interested in this who work at a university could propose it as a topic to explore.


This is exactly why I was looking at GraphQL in the first place. Apollo Client has already implemented this in most frameworks (including Svelte, React, Angular, Vue) for GraphQL interfaces. And it’s not an easy task to redo all that work. And I believe Apollo Client will also start supporting REST in the future.

As for this,

I completely agree. The network topology does matter in most cases. Take these scenarios:

Scenario 1:
Multiple services are hosted across different data centres and cloud providers. The client has to make a request to each of these providers to get the data it needs. Having a GraphQL aggregator server in the cloud may not guarantee that it can aggregate the results faster than the client. However, consider these factors:

  • The bandwidth and latency of the cloud GraphQL instance are probably always going to be better than those of the client device, which will probably be on WiFi or a 4G data connection.
  • The client device has to make twice as many requests owing to the browser’s CORS policy: it first hits the OPTIONS endpoint and makes the actual request only AFTER that response arrives.

I see GraphQL as the winner here, unless the client is using the application on a very stable broadband connection.

And even then, consider this example:

I was working on this dashboard that has to show the latest covid result for every patient. The covid results are in an openEHR server. The patient information is in a FHIR server.
I want to make the minimum number of requests and render this UI for 50 patients.

FHIR GET Patient List - 1 request.

AQL for the Covid results for all patients. I don’t know of any way to get only the latest covid result for each patient, and most patients have multiple test results. Currently, I’m doing this; however, I just want to get the latest result for all patients.
So best case - 1 request (get all results and filter on the client), worst case - 50 requests (1 AQL for each patient)
And I can’t help but think that, for the server, doing 50 such queries is not a big deal; sending that data across the wire is where the problem lies. If we have GraphQL, I don’t have to be bound by this limitation. I can just as easily generate 50 queries, send them over as a single request, and get the response back in an instant.
This is where I think GraphQL’s biggest benefit will be: getting exactly the data needed, in the shape requested. Take FHIR’s Bundle requests, for example: multiple requests can be bundled into 1 request, and I don’t know what I’d do without them.
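As a sketch of the "generate 50 queries and send them as a single request" idea, the Python below builds one GraphQL document that uses aliases (`p0`, `p1`, ...) so the same field can be requested once per patient, with the EHR IDs passed as variables. The `latestResult` field and its `time`/`value` sub-fields are hypothetical; they assume a schema that nobody in this thread has actually defined.

```python
from typing import Any, Dict, List


def build_batched_query(ehr_ids: List[str]) -> Dict[str, Any]:
    """Build a single GraphQL request body covering all given EHR IDs."""
    if not ehr_ids:
        raise ValueError("at least one EHR ID is required")

    indexes = range(len(ehr_ids))
    # One variable definition per patient: $id0: String!, $id1: String!, ...
    var_defs = ", ".join(f"$id{i}: String!" for i in indexes)
    # One aliased field per patient (hypothetical `latestResult` field).
    fields = "\n".join(
        f"  p{i}: latestResult(ehrId: $id{i}) {{ time value }}"
        for i in indexes
    )
    query = f"query LatestResults({var_defs}) {{\n{fields}\n}}"
    variables = {f"id{i}": ehr_id for i, ehr_id in enumerate(ehr_ids)}
    # The usual JSON body for GraphQL over HTTP: query plus variables.
    return {"query": query, "variables": variables}


payload = build_batched_query([
    "e241715b-3ca7-435e-a474-718579aadaa2",
    "0e7ac1d6-2dc6-40ec-8259-ac9f9a9727d1",
])
print(payload["query"])
```

Passing the IDs as variables rather than splicing them into the query text avoids quoting problems; the server still has to run one lookup per alias, so this saves round trips, not server work.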

With tools like Apollo Client (as mentioned above) GraphQL just makes more sense when building real-time reactive applications for mobile devices. (Think Firebase, but open-source)

Scenario 2:
The services are on the same data center/cloud provider in the same region.
The GraphQL aggregator will definitely be faster in this scenario.

@Sidharth_Ramesh I actually think it is possible in many openEHR implementations to write a single (nested?) AQL query (e.g. with a list of EHR IDs as an input parameter) to get the latest COVID result for each patient, aggregated in a single network call.

  • @bna don’t you have something called VAQM (or some similar Norwegian vacuum-word-pun almost as good/bad as your “EHRcraft” product air-word-pun…)?
  • @matijap, @borut.fabjan or somebody else from Better could perhaps point to their version of queries to fit that use case.
  • What about EHRBase @jake.smolka, @birger.haarbrandt or somebody else?

You have a good memory Erik!

Yes we have a system for stored structured queries. We named it VAQM which is an acronym for Virtual Archetype Query Model. To be honest we chose the name to make it easier to remember…

VAQM is a two-step modelling framework for reusable queries.

  • AQL definitions
  • Data element definitions

AQL definitions define the paths, the AND/OR expressions and the ordering, and these are combined into predicates. One predicate is analogous to a view in SQL.

Based on the predicates we define display transformations for the data. These transformations may be used in patient lists or graphs/charts. Some examples below, and sorry for the Norwegian text.

[Two example images with Norwegian text, not reproduced here]

A few years ago we prototyped a GraphQL layer above VAQM. Here we combined demographics with VAQM from the openEHR CDR. We found it useful and powerful. But we also had similar and existing services. This is why we never made the prototype into production services. Still I think it is an interesting idea which might be implemented in the future.


Maybe I am missing something, but this is just simple consolidation of queries and results into packages; it isn’t related to any specific semantics of graphQL as far as I can see. There are all kinds of ways we could package up requests and replies so that they are asynchronously batch processed.

We could even add such bundling to AQL itself, but normally it is a generic part of a lower level protocol.

It is consolidation of queries! And there are definitely other ways to do it. You could even extend the current REST API spec to include it. However, I feel GraphQL was made for this exact kind of problem. Why reinvent another API on top of the existing openEHR service spec, when just exposing the current API via GraphQL solves this problem? And don’t forget the rich tooling and state management libraries that come with using GraphQL.


Indeed - was just trying to get a clearer idea. Should we worry about creating dependencies like this? E.g. if we specify query consolidation to be via GraphQL, then we are following GraphQL into the future and are dependent on its direction. Might be perfectly ok, but we need to think about dependencies like that, just in case.


We don’t have support for nested queries, but do have other interesting mechanisms. :slight_smile:

For instance, we enable limiting the number of results per ehr, so yes, you can get for instance “last such-and-such Observation for each patient”.

For more complex situations we also offer views (pre-stored queries) that are not necessarily only AQL, but can be scripted in JavaScript to execute multiple AQLs as required. This reduces round-trip time significantly, but if you have an “n+1” query, it’s still an “n+1” query, so your mileage may vary. :slight_smile:


Can you give an example, Matija? I needed that pattern just a few days ago. Worth adding to the formal spec?

SELECT
    e/ehr_id/value,
    a/data[at0001]/events[at0006]/time,
    a/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value,  -- systolic
    a/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value   -- diastolic
FROM EHR e
CONTAINS TOP 1 OBSERVATION a[openEHR-EHR-OBSERVATION.blood_pressure.v1]
WHERE e/ehr_id/value MATCHES {
    'e241715b-3ca7-435e-a474-718579aadaa2',
    '0e7ac1d6-2dc6-40ec-8259-ac9f9a9727d1',
    'ed74f788-4bd6-4e47-ab98-211643cc4b0c'}
ORDER BY a/data[at0001]/events[at0006]/time DESC

Note the TOP 1 within the containment.
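For the dashboard use case, the list of EHR IDs would normally come from the patient list rather than being typed in by hand. Below is a hedged Python sketch that fills the IDs into the query above and wraps it in a JSON body with a `q` field, which is my understanding of what the openEHR REST API expects for ad-hoc AQL. The function name is made up, and the `TOP 1` inside the containment is the vendor feature shown in this post, so other CDRs may reject the query.

```python
import uuid
from typing import Dict, List

# Same query as above; {ehr_ids} is filled in by build_latest_bp_query().
AQL_TEMPLATE = """SELECT
    e/ehr_id/value,
    a/data[at0001]/events[at0006]/time,
    a/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value,
    a/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value
FROM EHR e
CONTAINS TOP 1 OBSERVATION a[openEHR-EHR-OBSERVATION.blood_pressure.v1]
WHERE e/ehr_id/value MATCHES {{{ehr_ids}}}
ORDER BY a/data[at0001]/events[at0006]/time DESC"""


def build_latest_bp_query(ehr_ids: List[str]) -> Dict[str, str]:
    """Return a request body for the latest blood pressure per EHR."""
    if not ehr_ids:
        raise ValueError("at least one EHR ID is required")
    # uuid.UUID() raises ValueError on malformed input, so nothing that
    # is not a UUID ends up inside the query text.
    quoted = ", ".join(f"'{uuid.UUID(ehr_id)}'" for ehr_id in ehr_ids)
    return {"q": AQL_TEMPLATE.format(ehr_ids=quoted)}


body = build_latest_bp_query([
    "e241715b-3ca7-435e-a474-718579aadaa2",
    "0e7ac1d6-2dc6-40ec-8259-ac9f9a9727d1",
])
print(body["q"])
```

Validating the IDs before building the string is a deliberate choice here: the IDs are spliced into the query text, so anything that is not a well-formed UUID is rejected up front.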

I think we should first standardise the rules for AQL execution and then proceed with such complex additions that influence structure of the results heavily…
