OpenEHR queries

Hi Greg,
o/items[at0004]/value is actually an invalid path as the identifier o
represents an OBSERVATION class and an OBSERVATION does not have an
attribute of items, it has a data attribute that is of type HISTORY, which
has an events attribute that is of type LIST<EVENT>, an EVENT has an
attribute of data of type ITEM_STRUCTURE which for a ITEM_LIST has an
attribute of items. Just like in XPath you need to provide the full path
unless you provide a wild card as indicated by Thomas. AQL and the Ocean
Query Process does support this wild card and a path such as
o//*/items[at0004] can be used.

However, we don't really see that these paths will be used directly and the
paths will be generated using tools such as the Ocean Template Designer and
Archetype Query Builder. Therefore the length of the paths are not a
concern and having the complete path actually allows the query processor to
perform its job more efficiently as it does not need to search for the leaf
node by traversing all possible paths within the data graph.

Heath

From: openehr-technical-bounces@openehr.org [mailto:openehr-technical-
bounces@openehr.org] On Behalf Of Greg Caulton
Sent: Monday, 5 November 2007 9:49 PM
To: For openEHR technical discussions
Subject: Re: OpenEHR queries

Question on the query below, if at0004 occurs *only* once in the
archetype model, in the position
data[at0001]/events[at0002]/data[at0003]/items[at0004]/value, could
one theoretically have the shorter query

SELECT o/items[at0004]/value
FROM EHR [uid = $ehrUid] CONTAINS OBSERVATION o
[openEHR-EHR-OBSERVATION.respiration.v1]
WHERE o/items[at0004]/value/magnitude > $n

The reason I ask is it would be tempting to store (in my data model)
the single discrete value at0004 to map to my equivalent item.

Of course that would break if a new data element was added in a
position (fabricated)
data[at0001]/events[at0099]/data[at00100]/items[at0004]/value but the
simplicity is tempting.

thanks

Greg
http;//www.patientos.org

> Hi Greg,
> The Archetype Query Language (AQL, formerly known as EHR Query language

or

> EQL) was developed by Ocean Informatics and a specification is being
> prepared to be offered to the openEHR foundation as a candidate openEHR
> specification. For now the paper referred to by Rong is the main

reference

> but we hope to provide something on the openEHR WIKI soon.
>
> The Ocean Template Designer provides these openEHR (XPath-like) paths as

a

> property of each node but Ocean is also developing an Archetype Query
> Builder tool that will actually generate the complete query for you.

Here

> is the query generated by the tool as per your use case (it is slightly
> simpler than the example provided by my colleague Chunlan).
>
> SELECT o/data[at0001]/events[at0002]/data[at0003]/items[at0004]/value
> FROM EHR [uid = $ehrUid] CONTAINS OBSERVATION o
> [openEHR-EHR-OBSERVATION.respiration.v1]
> WHERE o/data[at0001]/events[at0002 and name/value='Any
> event']/data[at0003]/items[at0004]/value/magnitude > $n
>
> The units can be included as an additional criteria as indicated by

Chunlan

> but it is unnecessary as the archetype only allows one kind of unit for
> rate.
>
> Let me know if you would like further details regarding the Ocean tools.
>
> Regards
>
> Heath
>
> Heath Frankel
> Product Development Manager
> Ocean Informatics
>
> Ground Floor, 64 Hindmarsh Square
> Adelaide, SA, 5000
> Australia
>
> ph:+61 (0)8 8223 3075
> mb: +61 (0)412 030 741
> email:heath.frankel@oceaninformatics.com
>
>
> > From: openehr-technical-bounces@openehr.org [mailto:openehr-technical-
> > bounces@openehr.org] On Behalf Of Greg Caulton
> > Sent: Monday, 5 November 2007 9:30 AM
> > To: For openEHR technical discussions
> > Subject: Re: OpenEHR queries
> >
> > Thanks Rong,
> >
> > Just the thought for someone but it would be handy to have the XPath
> > (such as
> o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value)
> > for a data value somewhere accessible in the editor or in the html
> > generated content such as
> > http://svn.openehr.org/knowledge/archetypes/dev/html/en/openEHR-EHR-
> > OBSERVATION.body_weight.v1.html
> >
> > Just easier for adhoc testing so not a big deal.
> >
> >
> > > Hi Greg,
> > >
> > > There was a paper published at Medinfo2007 on this topic. The paper

is

> > > available at:
> > >
> >
>

http://www.openehr.org/downloads/publications/archetypes/MedInfo_2007_EQL_MA

> .p
> > df
> > >
> > > Cheers,
> > > Rong
> > >
> > >
> > > >
> > > > Hi,
> > > >
> > > > Somewhere I recall reading that there was an OpenEHR query that
> > > > theoretically an OpenEHR compliant system could execute a return
> > > > results for.
> > > >
> > > > Is there a spec somewhere, preferably with a simple example.
> > > >
> > > > So if someone knew my patient and queried for all instances of
> > > > Respiratory Rate greater than n?
> > > >
> > > > openEHR-EHR-OBSERVATION.respiration.v1.adl
> > > >
> > > > Rate at0004 > n
> > > > Units /min (is that a default or are the units passed in the

query)

Randolph,

You could consider AQL to be a logical query language, this allows semantic queries to be shared amongst heterogeneous (and potentially, non-openEHR) systems without the need to know the systems underlying data model. Therefore, AQL logically runs against the Information Model (using Tim’s words) gaining its semantics from Archetype Models. The Ocean Implementation of the query processor physically uses both the data model and information model to execute a query. Another implementation might purely use the data model or information model. Non-openEHR or non-Archetype-based systems may support only a small subset of AQL queries where they have provided an AQL to native query mapping. The point is, AQL is independent of the implementation and are sharable, semantic queries based on shared archetypes.

Regards

Heath

Heath Frankel
Product Development Manager

Ocean Informatics

Ground Floor, 64 Hindmarsh Square

Adelaide, SA, 5000

Australia

ph: +61 (0)8 8223 3075

mb: +61 (0)412 030 741
email: heath.frankel@oceaninformatics.com

Greg,
In this case at0004 does represent a unique concept but if you take another
archetype that uses archetype internal references such as
OBSERVATION.blood_pressure.v1 at0004 is not a unique concept as a systolic
BP within an "Any Event" event is different to a systolic BP within a "5
Minute reading" event. Therefore you should consider the full paths as your
unique concept identifiers, not the node IDs.

BTW, the node ID predicates on singular attributes such as data, protocol,
state etc are superfluous and if you want to shorten paths, this would be a
valid way to do so without changing the semantics of the path (although it
is not a significant shortening). For example, the following path is
equivalent to the path previously provided in the query.

o/data/events[at0002]/data/items[at0004]/value

Regards

Heath

Thanks. Of course, as you say, the Sql parser will vary depending on the structure of the underlying store, and underlying store designs can vary, one of the strengths of openEHR. It would still seem, however, that whole chunks of the AQL would have to end up in the Sql, and that, in turn, would have implications–at least some implications–how the underlying store would have to be designed. Only certain schemas would even work with openEHR. Does the openEHR community offer any suggestions? At Ocean you evidently felt that a certain design was best, not just any design, which you imply when you refer to the need to be “conformant.” What does it take for a DB schema to be conformant?

The persistence model that Ocean uses is a trade off between completely atomising objects and storing them as blobs.

Have you disclosed any of the details regarding this tradeoff?

Randolph

Greg,
The AQL query parser or query processor is not intended to be implemented in
Java by Ocean but it may be implemented by others in the openEHR Java
reference implementation at some point.

BTW, the openEHR Java reference implementation already has an interface to
the Ocean EhrBank EHR Server using Web Services.

Regards

Heath

Heath Frankel
Product Development Manager
Ocean Informatics

Ground Floor, 64 Hindmarsh Square
Adelaide, SA, 5000
Australia

ph: +61 (0)8 8223 3075
mb: +61 (0)412 030 741
email: heath.frankel@oceaninformatics.com

From: openehr-technical-bounces@openehr.org [mailto:openehr-technical-
bounces@openehr.org] On Behalf Of Greg Caulton
Sent: Monday, 5 November 2007 9:48 PM
To: For openEHR technical discussions
Subject: Re: OpenEHR queries

I appreciate the information. Writing new queries wouldn't be too
hard, it is parsing the queries and then executing the corresponding
queries or service calls against the implemented system that is the
tricky part.

Is Ocean Informatics planning to provide a open source java (or
similar language) implementation of the query parsing engine (I am not
implying you should, just a question in case you were)?

If you were it would be useful to look at how I could plug in my
integration, the early I look at these things in the design the easier
it gets.

thanks!

Greg

http://www.patientos.org

> Hi Greg,
> The Archetype Query Language (AQL, formerly known as EHR Query language

or

> EQL) was developed by Ocean Informatics and a specification is being
> prepared to be offered to the openEHR foundation as a candidate openEHR
> specification. For now the paper referred to by Rong is the main

reference

> but we hope to provide something on the openEHR WIKI soon.
>
> The Ocean Template Designer provides these openEHR (XPath-like) paths as

a

> property of each node but Ocean is also developing an Archetype Query
> Builder tool that will actually generate the complete query for you.

Here

> is the query generated by the tool as per your use case (it is slightly
> simpler than the example provided by my colleague Chunlan).
>
> SELECT o/data[at0001]/events[at0002]/data[at0003]/items[at0004]/value
> FROM EHR [uid = $ehrUid] CONTAINS OBSERVATION o
> [openEHR-EHR-OBSERVATION.respiration.v1]
> WHERE o/data[at0001]/events[at0002 and name/value='Any
> event']/data[at0003]/items[at0004]/value/magnitude > $n
>
> The units can be included as an additional criteria as indicated by

Chunlan

> but it is unnecessary as the archetype only allows one kind of unit for
> rate.
>
> Let me know if you would like further details regarding the Ocean tools.
>
> Regards
>
> Heath
>
> Heath Frankel
> Product Development Manager
> Ocean Informatics
>
> Ground Floor, 64 Hindmarsh Square
> Adelaide, SA, 5000
> Australia
>
> ph:+61 (0)8 8223 3075
> mb: +61 (0)412 030 741
> email:heath.frankel@oceaninformatics.com
>
>
> > From: openehr-technical-bounces@openehr.org [mailto:openehr-technical-
> > bounces@openehr.org] On Behalf Of Greg Caulton
> > Sent: Monday, 5 November 2007 9:30 AM
> > To: For openEHR technical discussions
> > Subject: Re: OpenEHR queries
> >
> > Thanks Rong,
> >
> > Just the thought for someone but it would be handy to have the XPath
> > (such as
> o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value)
> > for a data value somewhere accessible in the editor or in the html
> > generated content such as
> > http://svn.openehr.org/knowledge/archetypes/dev/html/en/openEHR-EHR-
> > OBSERVATION.body_weight.v1.html
> >
> > Just easier for adhoc testing so not a big deal.
> >
> >
> > > Hi Greg,
> > >
> > > There was a paper published at Medinfo2007 on this topic. The paper

is

> > > available at:
> > >
> >
>

http://www.openehr.org/downloads/publications/archetypes/MedInfo_2007_EQL_MA

> .p
> > df
> > >
> > > Cheers,
> > > Rong
> > >
> > >
> > > >
> > > > Hi,
> > > >
> > > > Somewhere I recall reading that there was an OpenEHR query that
> > > > theoretically an OpenEHR compliant system could execute a return
> > > > results for.
> > > >
> > > > Is there a spec somewhere, preferably with a simple example.
> > > >
> > > > So if someone knew my patient and queried for all instances of
> > > > Respiratory Rate greater than n?
> > > >
> > > > openEHR-EHR-OBSERVATION.respiration.v1.adl
> > > >
> > > > Rate at0004 > n
> > > > Units /min (is that a default or are the units passed in the

query)

Randolph,

As openEHR has no specification for a persistence model, there is no such thing as a conformant DB schema. At Ocean we have developed a DB schema that is still evolving but this is transparent to any application as the API is based on the openEHR Information Model. We may explore alternate DB schema’s and even alternate data store technology, but again this will be transparent to the application.

The main article available on this topic was located at http://www.openehr.org/FAQs/t_persistence_notes.htm but has not yet been moved to the new web site. This is really just some suggestions about how a persistence layer could be implemented, it is by no means a specification for conformance.

Regards

Heath

Hi Randolph,

What data query language you want to use for the underlying data model is depend on what data store structure you want to use. If you want to use relational db, then you need to derive sql queries from AQL. If you want to use XML, you may need to generate XQuery statements or Xpath based on the AQL to be able to get the appropriate data. Currently, SQL2005 also provides the SQLXML features which is even handy to retrieve data from relational table fields as well as XML blobs.

Path-value persistence data model mentioned by Tom is one of the option for openEHR based data instances. Relational data model with xml blobs is another option as well. Object oriented persistence layer is another option. It is absolutely vendors design decision on what sort of persistence layer they prefer.

You just asked What does it take for a DB schema to be conformant?

I think what Hug means is that the DB schema can be used to search and retrieve the data that is required by the AQL.

Let’s use the AQL statement that I did for Greg:

SELECT o
FROM EHR e[ehr_id/value=’1234’] CONTAINS COMPOSITION CONTAINS OBSERVATION
o[openEHR-EHR-OBSERVATION.respiration.v1.adl]
WHERE o/data/events[at0002]/data[at0003]/items[at0004]/value/magnitude>n

Imagine you are using a SQL server and you need to use sql query statement to query your database. In the database, you have a table called EHR and you have an id field which matches with the ehr_id/value.

The AQL FROM clause requires all data must be from the EHR with id=’1234’, so in your sql statements, there must be a criteria somewhere EHR.id=’1234’

The AQL statement (in the SELECT clause) also requires all respiration observation instances, which have respiration rate is greater than n, been returned. You have to somehow generate one or a set of sql queries to get the qualified observation data.

If your DB schema can support all these, then I see it is conformant.

Regards,

Chunlan

Hi Randolph

By conformant, I just meant conformant to the openEHR (and AQL) specification. The only thing that can really currently be said about the Ocean implementation of this is that it works and is fast and efficient - not that it is the ‘best’ design. The underlying DB schema is not part of the openEHR or AQL specification and will depend on the application. For instance, there is likely to be a version that uses an XML file based database that can easily reside on a memory stick. This won’t change the AQL query, just the implementation of it in the data access layers.

I think that there are many possible schemas with openEHR and many possible ways of persisting the data. Using an OODBMS is another possible solution that we haven’t looked at and there are many ways of writing a RDBMS schema with pros and cons for the different ways including using application layers like Hibernate to provide an object relational mapping. It will really be up to the implementer to decide.

As Heath has said, when you use our software, you access the software layers and run AQL queries. At some stage we may decide that there are better strategies for persisting the data or may do something to the underlying schema to make it more efficient etc. These kind of changes would be completely transparent to any software accessing the AQL software layers.

regards Hugh

Randolph Neall wrote:

I am not sure I sent this email successfully or not. If you have got one, please ignore this.

Hi Randolph,

What data query language you want to use for the underlying data model is depend on what data store structure you want to use. If you want to use relational db, then you need to derive sql queries from AQL. If you want to use XML, you may need to generate XQuery statements or Xpath based on the AQL to be able to get the appropriate data. Currently, SQL2005 also provides the SQLXML features which is even handy to retrieve data from relational table fields as well as XML blobs.

Path-value persistence data model mentioned by Tom is one of the option for openEHR based data instances. Relational data model with xml blobs is another option as well. Object oriented persistence layer is another option. It is absolutely vendors design decision on what sort of persistence layer they prefer.

You just asked What does it take for a DB schema to be conformant?

In my understanding it is that the DB schema can be used to search and retrieve the data that is required by the AQL.

Let’s use the AQL statement that I did for Greg:

SELECT o
FROM EHR e[ehr_id/value=’1234’] CONTAINS COMPOSITION CONTAINS OBSERVATION
o[openEHR-EHR-OBSERVATION.respiration.v1.adl]
WHERE o/data/events[at0002]/data[at0003]/items[at0004]/value/magnitude>n

Imagine you are using a SQL server and you need to use sql query statement to query your database. In the database, you have a table called EHR and you have an id field which matches with the ehr_id/value.

The AQL FROM clause requires all data must be from the EHR with id=’1234’, so in your sql statements, there must be a criteria somewhere EHR.id=’1234’

The AQL statement (in the SELECT clause) also requires all respiration observation instances, which have respiration rate is greater than n, been returned. You have to somehow generate one or a set of sql queries to get the qualified observation data.

If your DB schema can support all these, then I see it is conformant.

Regards,

Chunlan

Grahame Grieve wrote:

oh? what's a real thing? I don't think these things are any less
real than other things, just that the business environment dictated
that IT requirements came first a thousand years ago.

I am of course being somewhat loose here, but I mean anything that
exists in the natural world that isn't a pure construct of the human
mind - biological and physical structures, disease processes for
example. And many humanly constructed informational entities whose form
follows the complexity of the things they report, e.g. clinical
information, hotel booking information (in its modern form - consider a
site like Expedia). Accounting data in its classical form is a pure
mathematical abstraction, and doesn't have much of a connection to the
human world the way clinical information does.

All I am saying here is that attempts to model anything from this real
world where things are complex and change without control directly as an
ER model is likely to doom the system to unmaintainability in a few years.

- thomas

Thank you, Hugh. Are you free to tell me what kind of persistence layer Ocean uses, OODBMS, XML, or RDBMS? You have claified some issues for me. It would just be interesting to know what basic persistance type you have.

Randolph

Thank you, Chunlan. Your examples are clear. There are basically three choices for persistence types, as you and others have pointed out, XML, OODBMS, RDBMS, and XML. What is proving to be the most popular, most preferred, by the openEHR community? I am trying to find out, for instance, which of these three types Ocean adopted.

Also, one more thing: could you give me an example (web URL) of your biggest, baddest, nastiest archetype, or set of nested archetypes, that a system would should be capable of accommodating?

Randolph

Heath, thanks much! I’d very much like to see the t_persistence_notes.htm, but that one cannot be reached at the moment. I hope to find it soon. I asked Chunlan for an example of your baddest archetype, one that involves some hierarchy and that would challenge a persistence layer. If one comes to mind, let me know.

Thanks again,

Randolph

Hi Randolph

Yes - the Ocean implementation is a pure openEHR 1.0.1 repository. Currently it is running on an SQL Server back-end, however because of the level of abstraction in the persistence layers, it would be fairly trivial to move it to another RDBMS.

regards Hugh

Randolph Neall wrote:

Randy,

I have no control over this t_persistence_notes.htm link but I have sent an email to the webmaster about it.

There are quite a number of complex archetypes, even OBSERVATION.blood_pressure.v1 is more complex than you would expect as archetypes are a maximal data set. The complexity comes when you aggregate archetypes together to construct a complete composition. Some of the NHS composition templates have dozens of archetypes aggregated together. The real killer for a persistence layer is the number of archetypes that could be used, their specialisations and the potential revisions over time.

I would suggest that a good “openEHR” persistence model should be able to store data for any archetype and any of its specialisations and revisions without manual modifications to the software. The Ocean EhrBank EHR Server is able to do this and in fact does this without knowing about the archetype, just the openEHR RM. Another openEHR implementation may do this by requiring archetypes to be registered allowing it to support a new data structure.

Hope this helps.

Heath

Hi Randolph

If you want to look at some large templates (which contain many nested archetypes) you can have a look at the work that is being done in the UK. The Connecting for Health program is building clinical models based on openEHR for use across the program.

The URL is http://www.ehr.chime.ucl.ac.uk/display/nhsmodels/Home

These templates are whole documents like “Antenatal Checkup” and are made up of many nested archetypes. openEHR allows for any arbitrary level of complexity in the models and the persistence mechanism needs to be able to handle this. Certainly the Ocean one does.

One of the nice things about using the openEHR reference model for the persistence of data is that you can keep throwing new archetypes at the data store and persist the data without having to change the data schema ever. So you may have a completely new archetyped concept that your system has never seen before - with openEHR, you can persist it and view it, modify it and compute on it without making changes to the software.

regards Hugh

Randolph Neall wrote:

Hugh, you and Thomas Beale are apparently colleagues at Ocean, and this is what Thomas said in the past day or two:

well, yes and no. If you try to make the relational model have anything
to do with the clinical information model, you will usually hit a wall.
Instead, relational databases can be used very effectively as a
low-level store of blobs keyed by path. I wrote a web page on this
aproach which we still have not transferred to the new site, but will in
the next week or so - it may shed some light on the matter.

The argument of Object databases versus Relational databases is an old
one that I expect most people have already chosen their camp based
upon their personal career experiences.

I used to think that (seriously) until I realised how bad relational
databases are at storing real-world models of anything but how great
they are as a method of storing blobs, paths, indexable values and so
on, in a totally generic way (i.e. where the schema will not change
regardless of changes in the content of the data, or even its domain
level information model). I believe that the textbook theory of using
either E-R models or object models to represent any but the most generic
things in the real world is relatively useless, for anything but a
demonstration database or small, unchanging application (does such
exist?). For anything real, it doesn’t work because a) real-world things
are almost always hierarchical compositions (due to our human way of
describing them) and b) real-world things keep changing (modiyfing the
schema of a database is a pain in the neck when you have 200m records
and 50 tables). Using relational in the classical way works for things
like tax and bank databases because the data are not ‘real’ things, but
tabular accounting constructs.


Can I assume that what Thomas here advocates, (“relational databases can be used very effectively as a low-level store of blobs keyed by path”) is what how the ocean persistence layer actually works? Beyond this, Thomas apparently has little use for the capacities of Sql-type RDBMS systems to handle clinical information. Does the Ocean system ultimately amount to blobs keyed by paths (presumably string paths)? If so, what kind of blobs, XML blobs, or some other structured text system?

Thanks for your efforts to enlighten me. In that you are making progress :).

  • Show quoted text -

Yes - we are apparently colleagues :slight_smile:

I saw Tom’s post - he has had many years experience working with this stuff and in particular using object data stores, object to relational mapping systems and relational systems.

Happy to keep discussing these things

Randolph Neall wrote:

A guess from someone outside Ocean:
If Tom has been involved I'd guess it's stored as blobs of DADL or
something similar :wink: not XML...

In a master thesis project here some time ago the students used db4o
(http://www.db4o.com/) to store openEHR RM objects, but in a
rudimetary way mostly as a simple datastore for Java objects. Query
was not the focus of that project so they did not test proper advanced
querying or scalability using db4o.

// Erik

Erik, as I ask my questions about the Ocean system, I’m getting the growing sense, perhaps mistaken, that I’m pushing against trade secrets. Have they told me as much as they’re going to? If they’re storing the real guts of their clinical information as blobs (“low-level” blobs, as Tom implies, which could be mere binary serializations of their objects) then it would seem querying would be an interesting issue. I assume Ocean is the premier implementation of openEHR, what with its ultimate champion in Tom Beale, and if they’ve had to go this route, this would be significant. Tom works there, and I’m assuming that what he says reflects how they do it. If they had all their data in MS Sql tables and columns, not blobs, and it worked well that way, I doubt Tom would be inveighing against relational databases for clinical data. Maybe they’ve got some serious magic in play.

Randolph