OpenEHR queries

Hi,

Somewhere I recall reading that there was an OpenEHR query that
theoretically an OpenEHR compliant system could execute a return
results for.

Is there a spec somewhere, preferably with a simple example.

So if someone knew my patient and queried for all instances of
Respiratory Rate greater than n?

openEHR-EHR-OBSERVATION.respiration.v1.adl

Rate at0004 > n
Units /min (is that a default or are the units passed in the query)

Or is this future functionality?

thanks

Greg

http://www.patientos.org

Hi Greg,

There was a paper published at Medinfo2007 on this topic. The paper is available at:
http://www.openehr.org/downloads/publications/archetypes/MedInfo_2007_EQL_MA.pdf

Cheers,
Rong

Thanks Rong,

Just the thought for someone but it would be handy to have the XPath
(such as o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value)
for a data value somewhere accessible in the editor or in the html
generated content such as
http://svn.openehr.org/knowledge/archetypes/dev/html/en/openEHR-EHR-OBSERVATION.body_weight.v1.html

Just easier for adhoc testing so not a big deal.

Hi Greg,

Rong has indicated there is a paper about archetype query language. Thanks
Rong. That paper introduced basic query syntax. It was written at the
beginning of this year. The query syntax has been enriched recently in order
to support more complicated queries. I've already started to write the
specifications, but need to resolve some known issues before release.

Anyway, I handcrafted the following queries for you (I cannot build my query
builder at the moment because of some integration issues).

The query statement below shows that all observation instances with
respiratory rate greater than n will be returned.

SELECT o
FROM EHR e[ehr_id/value=$ehrId] CONTAINS COMPOSITION CONTAINS OBSERVATION
o[openEHR-EHR-OBSERVATION.respiration.v1.adl]
WHERE o/data/events[at0002]/data[at0003]/items[at0004]/value/magnitude>n AND
o/data/events[at0002]/data[at0003]/items[at0004]/value/units = '/min'

If you want the respiratory quantity object been returned, the query would
look like:

SELECT o/data/events[at0002]/data[at0003]/items[at0004]/value
FROM EHR e[ehr_id/value=$ehrId] CONTAINS COMPOSITION CONTAINS OBSERVATION
o[openEHR-EHR-OBSERVATION.respiration.v1.adl]
WHERE o/data/events[at0002]/data[at0003]/items[at0004]/value/magnitude>n AND
o/data/events[at0002]/data[at0003]/items[at0004]/value/units = '/min'

Just for your information, the single letter 'o' is the observation class
variable name, "/data/events[at0002]/data[at0003]/items[at0004]/value" is
the archetype path to respiratory quantity node. If you have the archetype
workbench running, you can identify this path there. '$ehrId' is the
parameter name which can be substituted with real EHR ehr_id value at run
time. The query language supports parameterization.

Some archetype query statements would be very long if the query criteria are
complicated. In fact, we don't need to write the above queries by hand.
Ocean Informatics has implemented a tool - Archetype Query Builder, which
can be used to create/edit queries easily. Additionally, Ocean has also
implemented a query parser and query engine as well.

The above query statements are consistent to the query syntax introduced by
the MedInfo paper. The current query tools also support this query syntax.
However, as I have said that we have enriched the query syntax and all the
enhancements can be found from the query specifications.

Hope this helps.

Regards,
Chunlan

Hi Greg,
The Archetype Query Language (AQL, formerly known as EHR Query language or
EQL) was developed by Ocean Informatics and a specification is being
prepared to be offered to the openEHR foundation as a candidate openEHR
specification. For now the paper referred to by Rong is the main reference
but we hope to provide something on the openEHR WIKI soon.

The Ocean Template Designer provides these openEHR (XPath-like) paths as a
property of each node but Ocean is also developing an Archetype Query
Builder tool that will actually generate the complete query for you. Here
is the query generated by the tool as per your use case (it is slightly
simpler than the example provided by my colleague Chunlan).

SELECT o/data[at0001]/events[at0002]/data[at0003]/items[at0004]/value
FROM EHR [uid = $ehrUid] CONTAINS OBSERVATION o
[openEHR-EHR-OBSERVATION.respiration.v1]
WHERE o/data[at0001]/events[at0002 and name/value='Any
event']/data[at0003]/items[at0004]/value/magnitude > $n

The units can be included as an additional criteria as indicated by Chunlan
but it is unnecessary as the archetype only allows one kind of unit for
rate.

Let me know if you would like further details regarding the Ocean tools.

Regards

Heath

Heath Frankel
Product Development Manager
Ocean Informatics

Ground Floor, 64 Hindmarsh Square
Adelaide, SA, 5000
Australia

ph: +61 (0)8 8223 3075
mb: +61 (0)412 030 741
email: heath.frankel@oceaninformatics.com

From: openehr-technical-bounces@openehr.org [mailto:openehr-technical-
bounces@openehr.org] On Behalf Of Greg Caulton
Sent: Monday, 5 November 2007 9:30 AM
To: For openEHR technical discussions
Subject: Re: OpenEHR queries

Thanks Rong,

Just the thought for someone but it would be handy to have the XPath
(such as

o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value)

for a data value somewhere accessible in the editor or in the html
generated content such as
http://svn.openehr.org/knowledge/archetypes/dev/html/en/openEHR-EHR-
OBSERVATION.body_weight.v1.html

Just easier for adhoc testing so not a big deal.

> Hi Greg,
>
> There was a paper published at Medinfo2007 on this topic. The paper is
> available at:
>

http://www.openehr.org/downloads/publications/archetypes/MedInfo_2007_EQL_MA
.p

I appreciate the information. Writing new queries wouldn't be too
hard, it is parsing the queries and then executing the corresponding
queries or service calls against the implemented system that is the
tricky part.

Is Ocean Informatics planning to provide a open source java (or
similar language) implementation of the query parsing engine (I am not
implying you should, just a question in case you were)?

If you were it would be useful to look at how I could plug in my
integration, the early I look at these things in the design the easier
it gets.

thanks!

Greg

http://www.patientos.org

Question on the query below, if at0004 occurs *only* once in the
archetype model, in the position
data[at0001]/events[at0002]/data[at0003]/items[at0004]/value, could
one theoretically have the shorter query

SELECT o/items[at0004]/value
FROM EHR [uid = $ehrUid] CONTAINS OBSERVATION o
[openEHR-EHR-OBSERVATION.respiration.v1]
WHERE o/items[at0004]/value/magnitude > $n

The reason I ask is it would be tempting to store (in my data model)
the single discrete value at0004 to map to my equivalent item.

Of course that would break if a new data element was added in a
position (fabricated)
data[at0001]/events[at0099]/data[at00100]/items[at0004]/value but the
simplicity is tempting.

thanks

Greg
http;//www.patientos.org

Greg Caulton wrote:

Thanks Rong,

Just the thought for someone but it would be handy to have the XPath
(such as o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value)
for a data value somewhere accessible in the editor or in the html
generated content such as
http://svn.openehr.org/knowledge/archetypes/dev/html/en/openEHR-EHR-OBSERVATION.body_weight.v1.html

Just easier for adhoc testing so not a big deal.

Actually, a complete drag and drop query editor is in the pipeline.

- thomas beale

Greg Caulton wrote:

Question on the query below, if at0004 occurs *only* once in the
archetype model, in the position
data[at0001]/events[at0002]/data[at0003]/items[at0004]/value, could
one theoretically have the shorter query
  

Greg,
while it is sometimes the case that an at-code is unique within the
archetype, there are many archetypes where this is not so. The general
case is that a leaf-level at-code will re-appear on numerous leaf nodes,
due to multiple Events, or multiple panel items in a pathology test
battery or somesuch.

However, using the Xpath // construct, one could achieve what you are
after, which would make the query slightly shorter.

- thomas beale

This is of course why you should (IMHO) change your focus (it takes an
"Ah Ha moment") from "data model" to "information models".

Using an object database (ZODB, POET, Gemstone, Versant, Objectivity/DB,
etc.) in your chosen implementation language is usually transparent at
that point.

If your heart can't handle that (OODB) approach for some reason and you
insist on PostgreSQL or Oracle (please do NOT use MySQL for healthcare
information) you should still look at using the custom data type
capabilities of them and follow the information model as defined in the
specifications. Again, you end up with an information model approach
and you do not adhere to (necessarily) to a relational model but you
still maintain data integrity and the relationships defined in the
information model UML.

DISCLAIMER: I understand that Ocean Informatics uses MS-SQL but I do not
know what their "data model"/"information model" looks like at the
persistence level. The really cool thing about truly supporting the
openEHR Information Models is that it doesn't matter as long as you can
support and EHR Extract in context of the information requested.

My 1 cent (the USD is in trouble).

Cheers,
Tim

One reason for the question was that it wasn't clear whether the
atxxxx uniquely identifies a concept within the ADL. I think it still
does, but it can have different context depending on where it occurs.

Implementing a hierarchy of information (information model) using
entity relationships (data model) is common place.

The argument of Object databases versus Relational databases is an old
one that I expect most people have already chosen their camp based
upon their personal career experiences.

I will agree with you that MySQL is not well suited to terabyte
databases with 1000's of concurrent users, with many people attempting
to update the same patient record :slight_smile: My own hospitals largest table
is growing at a rate of 500,000 rows per day, MySQL would choke with
the number of queries and updates hitting it regardless of hardware
IMHO.

Greg
http://www.patientos.org

As a developer from the US who sometimes tries to follow discussions here, I have a question probably well answered if I took more time myself to find the answer. Against what do your archetype queries run? Against the DB itself or some representation of the data in memory? I ask because a few months ago, someone from openEHR said in one of the discussions that a DB schema is not part of openEHR, that some private participant in openEHR had one for sale, and Ocean, maybe, but that was it. So, again against what do these queries (see example in Chunlan Ma’s message below) run?

Thanks,

Randy Neall
Veriqant, L.L.C.

Why?

(Not that I intend to do that)

Gerard

– –
Gerard Freriks, MD
Huigsloterdijk 378
2158 LR Buitenkaag
The Netherlands

T: +31 252544896
M: +31 620347088
E: gfrer@luna.nl

Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety. Benjamin Franklin 11 Nov 1755

I will "assume" that your WHY? question was in response to this
statement I made?

In general (and for the speed results MySQL claims) it trades off ACID
qualities to attain them. There are **MANY** documents available on the
Internet to adjudge the differences between the needs of a high speed
blog service etc, and the needs of a real DBMS with ACID integrity and
MVCC.

There are many benchmarks ( I have no interest in either system ) that
show that large scale SMP systems perform better with PostgreSQL vs.
MySQl. IMHO, you should use an OODBMS for most installations anyway. I

Cheers,
Tim

Greg Caulton wrote:

One reason for the question was that it wasn't clear whether the
atxxxx uniquely identifies a concept within the ADL. I think it still
does, but it can have different context depending on where it occurs.
  

it doesn't in general. In certain archetypes it happens to, but you
should not use that in any way to write software.

Implementing a hierarchy of information (information model) using
entity relationships (data model) is common place.
  

well, yes and no. If you try to make the relational model have anything
to do with the clinical information model, you will usually hit a wall.
Instead, relational databases can be used very effectively as a
low-level store of blobs keyed by path. I wrote a web page on this
aproach which we still have not transferred to the new site, but will in
the next week or so - it may shed some light on the matter.

The argument of Object databases versus Relational databases is an old
one that I expect most people have already chosen their camp based
upon their personal career experiences.
  

I used to think that (seriously) until I realised how bad relational
databases are at storing real-world models of anything but how great
they are as a method of storing blobs, paths, indexable values and so
on, in a totally generic way (i.e. where the schema will not change
regardless of changes in the content of the data, or even its domain
level information model). I believe that the textbook theory of using
either E-R models or object models to represent any but the most generic
things in the real world is relatively useless, for anything but a
demonstration database or small, unchanging application (does such
exist?). For anything real, it doesn't work because a) real-world things
are almost always hierarchical compositions (due to our human way of
describing them) and b) real-world things keep changing (modiyfing the
schema of a database is a pain in the neck when you have 200m records
and 50 tables). Using relational in the classical way works for things
like tax and bank databases because the data are not 'real' things, but
tabular accounting constructs.

- thomas beale

a) real-world things
are almost always hierarchical compositions (due to our human way of
describing them)

you can store those in a relational schema. It seems to me that the
problem is that sql doesn't really make it efficient to do heirarchical
type things with them. I know some old data analysts who still bemoan
the fact that sql won the day, given how poor it was at so many things,
this being one of them.

> b) real-world things keep changing (modiyfing the

schema of a database is a pain in the neck when you have 200m records
and 50 tables).

so, you can move the deckchairs on the titanic; changing the schema of
any system is painful. The actual conversion time is usually the least
painful part of it.

> Using relational in the classical way works for things

like tax and bank databases because the data are not 'real' things, but
tabular accounting constructs.

oh? what's a real thing? I don't think these things are any less
real than other things, just that the business environment dictated
that IT requirements came first a thousand years ago.

Grahame

Tim Cook schreef:

  

Why?

(Not that I intend to do that)

(please do NOT use MySQL for healthcare
information)
      
I will "assume" that your WHY? question was in response to this
statement I made?

In general (and for the speed results MySQL claims) it trades off ACID
qualities to attain them. There are **MANY** documents available on the
Internet to adjudge the differences between the needs of a high speed
blog service etc, and the needs of a real DBMS with ACID integrity and
MVCC.
  

This old news, it depends on many parameters how MySQL behaves. You can
use the InnoDB table format which acts 100% according to ACID-specs.
Not that I would advise MySQL, but I would not advise against it for the
reason you gave, because that is not valid anymore, since version 5.x.
InnoDB is supported by MySQL since 2001.

It is a good DBMS, in fact, it uses SAP-technology (read this, from
2003: http://www.infoworld.com/article/03/05/27/HNmysql_1.html)
Oracle has taken a license on the InnoDB format because it is very good,
yes, it is the same format which you can use in MySQL (just one line in
the configuration-file)
http://sql-info.de/mysql/oracle-acquires-innodb.html

There are many benchmarks ( I have no interest in either system ) that
show that large scale SMP systems perform better with PostgreSQL vs.
MySQl. IMHO, you should use an OODBMS for most installations anyway. I
  

It is a good idea, but there are also downsides on this.
- it is not really necessary, because there are very good ways to map
from objects to relational databases. Read the papers from Scot W.
Ambler on this.
- it limits your choice of vendors
- it limits your choice of programming languages, OS-platforms
- it makes your software less transparant to vendor idnependency
- OORDMS's are much more expensive
- they are less mature

And to map the RM-objects to a relational database it is really
suffcient to use ANSI 1992 SQL, which means that your SQL-code is not
vendor-specific.

regards
Bert Verhees

As a developer from the US who sometimes tries to follow discussions here, I have a question probably well answered if I took more time myself to find the answer. Against what do your archetype queries run? Against the DB itself or some representation of the data in memory? I ask because a few months ago, someone from openEHR said in one of the discussions that a DB schema is not part of openEHR, that some private participant in openEHR had one for sale, and Ocean, maybe, but that was it. So, again against what do these queries (see example in Chunlan Ma’s message below) run?

That’s good question. You’ve noticed that I didn’t mention anything about the data store here. In general, query languages are designed specifically for a type of data store. For instance, SQL is run against relational databases. XQuery is run against XML structured data. Object Oriented Query Language need to be run against Object oriented database management systems etc. These types of query language are data query languages, i.e. they query at the data level. You have to know DB schema when you write a SQL query statement.

Archetype Query Language is different from the general query languages. It’s a semantic query language, i.e. it queries data at semantic level. It’s neutral to persistence layer and system design. We only need to use archetype path and openEHR reference model to construct AQL statement. You can have whatever persistence layer as long as it can get expected results back based on the AQL statement. That’s why AQL queries can be shared across systems and enterprise boundaries. Sharing AQL is one of the key solution to achieve semantic interoperability.

Cheers,

Chunlan

Thanks,

Randy Neall

Veriqant, L.L.C.

I think I understand. Thanks. What actually gets persisted, I suspect, are the paths–and values pointed to by those paths–implicit in your archetype object graph, correct? And to convert AQL query into an SQL query you somehow extract that path from AQL and convert it into some sort of SQL, right? Is there anything on your web site about this, about deriving a DB query from an archtype query?

You can have whatever persistence layer as long as it can get expected results back based on the AQL statement.

–That’s the question. How do you "get expected results back based on the AQL Statement?

Thanks,

Randy Neall

Hi Randolph,

Currently, the only AQL query parser that I know of is one that is part of the Ocean Informatics suite of products and runs against the Ocean EhrBank openEHR repository.

Converting AQL to SQL will depend entirely on what your underlying persistence model is and also to some extent what relational database flavour you are using. openEHR doesn’t mandate any particular persistence model and as has been already stated, the really nice thing about AQL is that queries are independent of any underlying relational (or object) data model. So an AQL query that is run against two separate and completely independently developed openEHR repositories that probably use a completely separate persistence model should return exactly the same data (as long as they are both conformant).

The persistence model that Ocean uses is a trade off between completely atomising objects and storing them as blobs. This has been a process of optimisation and we are really happy with the current performance of the system. This is only one of many possible methods of openEHR persistence.

regards Hugh

Randolph Neall wrote: