Thomas Beale wrote:
Tim,
Although we will probably have to wait some time (let's say 6-12 months for
early ones) for 3rd party evaluations to start happening, there may well be
mileage in discussing exactly what kinds of evaluations you think people would
like to see.
Agree.
From my engineering point of view, performance, volumetrics and
availability in heavy multi-user situations are basics.
Agree. As a public health epidemiologist, I am personally interested in
how openEHR might perform in the setting of population-based data
collections, where millions (or tens or hundreds of millions) of records
and large aggregate queries, such as cross-tabulation queries, are the
norm. We know that traditional normalised relational databases don't cut
the mustard for such purposes, so I am keen to see whether openEHR does.
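For anyone unfamiliar with the term, a cross-tabulation query counts records for every combination of two (or more) variables. A minimal sketch in Python follows; the records, field names and the cross_tabulate helper are all made up for illustration and say nothing about how openEHR stores or queries data:

```python
from collections import Counter

# Made-up records; the field names are illustrative assumptions only.
records = [
    {"age_group": "0-39", "diagnosis": "asthma"},
    {"age_group": "0-39", "diagnosis": "diabetes"},
    {"age_group": "40+", "diagnosis": "diabetes"},
    {"age_group": "40+", "diagnosis": "diabetes"},
]

def cross_tabulate(rows, row_key, col_key):
    """Count records for every (row value, column value) combination."""
    return Counter((r[row_key], r[col_key]) for r in rows)

table = cross_tabulate(records, "age_group", "diagnosis")
for (age_group, diagnosis), count in sorted(table.items()):
    print(age_group, diagnosis, count)
```

The point of interest is how this kind of whole-collection scan behaves when the four records become tens of millions.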
We can also try to show
that the maintenance cost and semantic reliability are greatly improved
(although the former might take a 5 year study). But there must be many other
interesting measures to consider.
How to measure "semantic reliability" is the challenge, but if openEHR
claims to offer improvements in semantic interoperability, then these
claims or aims need to be evaluated as rigorously as possible - ideally
quantitatively, but at least qualitatively in a systematic and
principled fashion rather than just anecdotally (which is what we've
seen so far: assertions like "openEHR works well for us in our
lab/product").
Here is a first stab - I am sure much better and more elegant evaluation
ideas can be developed:
1) Assemble descriptions of several clinical situations. By
descriptions, I mean natural language descriptions in the form of
full-text, unencoded clinical histories and progress notes etc, abetted
by investigation results, diagnostic images and their reports, and
perhaps even photos of the patient and so on. A dictated, textual report
which runs to 4 or 5 or more pages from a fastidious and diligent
specialist physician to a GP might be good raw material.
2) Train at least two independent groups of clinicians and
informaticians (but not people with personal investments of time or
other interests in the development of openEHR i.e. not anyone on this
list) in the use of openEHR, including the creation of archetype
definitions and templates.
3) Give these groups access to the same repository of openEHR archetype
definitions and templates, and access to the same set of openEHR
software tools, and ask each group to independently a) select and/or
construct a set of archetype and template definitions which they feel
are required to capture the information in the clinical material (as
described above) provided to them, and b) capture the clinical
scenario information in the openEHR structures that they create.
4) Have a third party compare in some pre-defined way the two openEHR
versions of the original clinical scenario, using a predefined scoring
system (which would need to be developed, or do such things exist?).
Alternatively, the two groups might exchange their openEHR
representations and rate each other.
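As one very rough idea of what a predefined scoring system could look like, here is a minimal sketch in Python. It assumes (and this is purely an assumption for illustration) that each group's openEHR record can be flattened into a set of (path, value) pairs, and it scores agreement as the Jaccard overlap of the two sets. The paths and values below are made up and are not real archetype paths, and a real scoring system would need to handle partial and semantic matches rather than exact equality.

```python
from typing import Set, Tuple

# Hypothetical representation: each captured fact is a (path, value) pair.
Fact = Tuple[str, str]

def agreement_score(group_a: Set[Fact], group_b: Set[Fact]) -> float:
    """Jaccard overlap: facts recorded by both groups / facts recorded by either."""
    union = group_a | group_b
    if not union:
        return 1.0  # two empty encodings agree trivially
    return len(group_a & group_b) / len(union)

def disagreements(group_a: Set[Fact], group_b: Set[Fact]) -> Set[Fact]:
    """Facts recorded by exactly one group, for a third-party reviewer to inspect."""
    return group_a ^ group_b

# Made-up encodings of the same clinical scenario by the two groups.
group_a = {
    ("blood_pressure/systolic", "140"),
    ("blood_pressure/diastolic", "90"),
    ("diagnosis/primary", "hypertension"),
}
group_b = {
    ("blood_pressure/systolic", "140"),
    ("blood_pressure/diastolic", "90"),
    ("diagnosis/primary", "essential hypertension"),
}

score = agreement_score(group_a, group_b)
differing = disagreements(group_a, group_b)
print(f"agreement: {score:.2f}, differing facts: {len(differing)}")
```

Note that exact matching treats "hypertension" and "essential hypertension" as a complete disagreement, which is exactly the sort of judgement call the scoring rules would have to settle in advance.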
There are lots of variations on this theme, but this sort of evaluation
would seem to test the "impedance" and information loss at: a) the
human->openEHR interface; b) the openEHR<->openEHR data exchange
interface; and c) the openEHR->human interface. All of these interfaces
matter: even if openEHR is a great technical solution, what matters is
how it works in the real world, and that is what an evaluation like the
one outlined above aims to test.
Of course, all of the foregoing depends just as much on the quality,
completeness and scope of things like the openEHR archetype and
template repository (or repositories), and of end-user openEHR training
materials etc, as on the technical correctness of the openEHR
constraints or query languages and so on. A good technical basis is a
good place to start, but it is just the start if widespread adoption of
openEHR is the goal.
Tim C