Status of Java RI and openEHR in general

Mihai_Tarce · 24 September 2011 19:39

Hello,

I heard about the openEHR initiative some time ago and have been following it with interest, from a developer perspective. The recent transition announcement sounds encouraging, however there are some questions which I feel are left unanswered, after doing my best to read through the openEHR documentation on the website:

1. It would be great to have a more detailed and up-to-date status of projects. Looking at the Java reference implementation, I see it is incomplete, lagging behind the latest version of the specification (at least on the project page), and the latest commits are from about a year ago (with the exception of 2 commits on the xml-serializer and adl-parser – a bug fix from what I saw). Documentation on the kernel is even more scarce. Please do not interpret this as a critique of the developers working on it; I just think it would be very helpful to know what parts need the most work, so that other people can contribute. Or is the information available and I just overlooked it?
2. The concern above extends to the specification itself, although this is a bit off topic on this mailing list. While news that various governments are adopting the openEHR standard for their health care systems is encouraging, seeing that version 1.0.2 was published almost 3 years ago and that not much information is available about the future of the standard is not. I understand that this is a standard, and therefore should not be subject to a lot of change, but, again, it would be very comforting to have a more detailed roadmap and an up-to-date status available.
3. I’ve thought about starting a clean room implementation, and used the XML schema (via JAXB) to generate the objects. While functional, the result was sub-optimal (generics are not used, Java naming conventions are ignored, etc.). Since Eiffel is used for the original specification, one reason being that it supports OCL-like constraints, I also thought about generating an Ecore (Eclipse EMF) meta-model, which can be generated from UML diagrams (although that would also result in the problems mentioned above) and supports OCL constraints. Manually creating the model might be a better choice in my opinion. Is there any similar initiative currently in progress? I wouldn’t want to reinvent the wheel.
4. Especially following the transition announcement, I’m considering working on a tool for easier archetype modeling, which will be essential before any large scale clinical implementation can take place. The recent discussion on the technical mailing list regarding web-based vs. rich client solutions has merit, but for me is still a long way off, at least until the issues above are dealt with. Are there any existing projects that cover this?

I am confident about the benefits an open implementation of openEHR can provide, and would like to contribute to it. Thank you for reading, I’ve tried to summarize my points as best I could. I am looking forward to hearing your opinions.

Regards,

Mihai

system · 25 September 2011 19:36

Hi and welcome!

Nice to see new people in the community!

Regarding #3 below: you may want to pick up where this experiment left off:
http://www.openehr.org/wiki/display/dev/Experimental+generation+of+code+and+documentation+from+UML

And read the message at:
http://www.openehr.org/mailarchives/openehr-technical/msg05665.html

Best regards,
Erik Sundvall
erik.sundvall@liu.se http://www.imt.liu.se/~erisu/ Tel: +46-13-286733


thomas.beale · 25 September 2011 22:55

Hello,

I heard about the openEHR initiative some time ago and have been following it with interest, from a developer perspective. The recent transition announcement sounds encouraging, however there are some questions which I feel are left unanswered, after doing my best to read through the openEHR documentation on the website:

It would be great to have a more detailed and up-to-date status of projects. Looking at the Java reference implementation, I see it is incomplete, lagging behind the latest version of the specification (at least on the project page), and the latest commits are from about a year ago (with the exception of 2 commits on the xml-serializer and adl-parser – a bug fix from what I saw). Documentation on the kernel is even more scarce. Please do not interpret this a critique against the developers working on it, I just think it would be very helpful to know what parts need the most work, so that other people can contribute. Or is the information available and I just overlooked it?

You can get all the SVN RSS feeds from this page.

The concern above extends to the specification itself, although this is a bit off topic on this mailing list. While news that various governments are adopting the openEHR standard for their health care systems is encouraging, seeing that version 1.0.2 was published almost 3 years ago and that not much information is available about the future of the standard is not. I understand that this is a standard, and therefore should not be subject to a lot of change, but, again, it would be very comforting to have a more detailed roadmap and an up-to-date status available.

Actually, the main reason no changes have been made for some time is that over 18 months were spent working on an organisational merger with IHTSDO, which was backed by the IHTSDO board but fell through in the end.

However, the community is pretty active in recording issues, and you can see around 40 issues logged over the last 3 years on the SPECPR tracker. As you have seen, the foundation is about to move to a new governance model, so we will start processing these issues once this happens. During this time, most of the proposed solutions (in no way guaranteed to be mutually consistent!) have been tried out in various implementations in the community, so that when these PRs get processed officially there will be good evidence for deciding quickly on them.

I’ve thought about starting a clean room implementation, and used the XML schema (via JAXB) to generate the objects. While functional, the result was sub-optimal (generics are not used, Java naming conventions are ignored, etc.). Since Eiffel is used for the original specification, one reason being that it supports OCL-like constraints, I also thought about generating an Ecore (Eclipse EMF) meta-model, which can be generated from UML diagrams (although that would also result in the problems mentioned above) and supports OCL constraints. Manually creating the model might be a better choice in my opinion. Is there any similar initiative currently in progress? I wouldn’t want to reinvent the wheel.

There are a few things going on, and probably there always will be, because the modelling problem has to be attacked from different angles:

‘Faithful’ source models - models that accurately reflect the published semantics of the specifications:
For software development: Erik Sundvall (Linköping, Sweden) and Eric Browne (Australia) have separately started a port of the model to various UML tools. I see Erik has replied to this post, so you have the details there.
This kind of approach is useful for two things: generating online UML documentation & diagrams, and potentially helping to build and/or reason about software (although many people I know involved in building serious systems don’t appear to use UML for much more than team discussions of models).
For Archetype validation and modelling: I have continued to develop the reference ADL Workbench, which is model-driven. In this case, the goal was the same: represent the original specifications faithfully. In theory this might have been possible with XMI, but it is hard work, and UML is still lacking in some respects. EMF did not have generic types when I started on this, so I created a ‘Basic Meta-model’ (BMM) specification and schemas, which drive the ADL Workbench. Detailed reasons for this here. The BMM schemas are far easier to understand than XMI, but nevertheless express proper object-model semantics (data view). At some point, I will try to morph these schemas to a form where they can be exported in an EMF format.
Note: currently the BMM schemas express class invariants in a way more familiar to UML users, e.g. optionality is a boolean flag on a class called ‘is_mandatory’. Not all invariants from the specifications are expressible in this way, but most of the ones relating to the information (i.e. data) view of the model are there.
XSD-based models - the openEHR XSDs are - like any XSD - not faithful renditions of the models of the source specifications - due to the different inheritance semantics, lack of generic types and so on. However, they do faithfully express the basic data semantics. Starting from the XSD is clearly useful for generating classes that can deal with openEHR XML data and process it at the XML level. Seref Arikan at CHIME, UCL is working on a bridge between XSD-based Java classes and the BMM models exposed by the ADL Workbench above.

In some imaginary perfect world, all these models would be one and the same thing, but unfortunately they are not - we are stuck with the differences and limitations. Ecore/EMF may be one place to eventually connect and integrate them.
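To make the "lack of generic types" point concrete, here is a minimal Java sketch contrasting a hand-written generic class with the kind of flattened class a schema-to-code generator tends to produce. All class names (`Interval`, `FlatInterval`, `IntervalDemo`) are hypothetical and only loosely inspired by the interval types in the specifications; this is not the output of any actual openEHR XSD or tool.

```java
import java.util.Objects;

// Hand-written, "faithful" style: the bound type is a generic parameter,
// so the compiler rejects an Interval<Integer> being tested against a String.
final class Interval<T extends Comparable<T>> {
    private final T lower;
    private final T upper;

    Interval(T lower, T upper) {
        this.lower = Objects.requireNonNull(lower);
        this.upper = Objects.requireNonNull(upper);
        if (lower.compareTo(upper) > 0) {
            throw new IllegalArgumentException("lower must not exceed upper");
        }
    }

    boolean contains(T value) {
        return lower.compareTo(value) <= 0 && value.compareTo(upper) <= 0;
    }
}

// Schema-derived style (hypothetical): no type parameter survives,
// so both bounds are plain Object and type errors move to run time.
final class FlatInterval {
    Object lower;
    Object upper;

    @SuppressWarnings({"unchecked", "rawtypes"})
    boolean contains(Object value) {
        // Throws ClassCastException if the operand types do not match.
        return ((Comparable) lower).compareTo(value) <= 0
            && ((Comparable) value).compareTo(upper) <= 0;
    }
}

final class IntervalDemo {
    public static void main(String[] args) {
        Interval<Integer> typed = new Interval<>(1, 10);
        System.out.println(typed.contains(5));
        // typed.contains("5") would not compile.

        FlatInterval flat = new FlatInterval();
        flat.lower = 1;
        flat.upper = 10;
        try {
            flat.contains("5"); // compiles, but fails when executed
        } catch (ClassCastException e) {
            System.out.println("type mismatch only detected at run time");
        }
    }
}
```

The data carried by both classes is the same, which is why the flattened form is adequate for moving XML around; what is lost is the compile-time guarantee that the source model expresses.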

Especially following the transition announcement, I’m considering working on a tool for easier archetype modeling, which will be essential before any large scale clinical implementation can take place. The recent discussion on the technical mailing list regarding web-based vs. rich client solutions has merit, but for me is still a long way off, at least until the issues above are dealt with. Are there any existing projects that cover this?

The ADL workbench will become a ‘technical’ editor in the near future, i.e. allowing editing of archetypes with the full technical aspects of both the AOM and the RM visible. These are generally hidden as much as possible in clinical modelling tools like the Archetype Editor. This is a short-term effort (I hope;-)

A medium-term effort now getting underway, led by Seref and called ‘Bosphorus’, is to connect a Java front-end to the ADL Workbench back-end (i.e. executed as a server) using more modern techniques (ZeroMQ, Google Protocol Buffers) than JNI or other naive wrapping methods.

The reason for doing this is that we think implementing the front-end and the message-based bridge is less work than re-implementing the entire ADL compiler right now. Re-implementation is unattractive at present because some final aspects of ADL 1.5 are still under development, so creating a complete ground-up ADL 1.5 Java toolstack seems premature (I suspect it is worth waiting another year before doing that). Seref aims to connect XSD-based Java classes to this framework.

I would therefore think that if you want to work in Java, Bosphorus may be the most interesting project.

I am confident about the benefits an open implementation of openEHR can provide, and would like to contribute to it. Thank you for reading, I’ve tried to summarize my points as best I could. I am looking forward to hearing your opinions.

Regards,

hope this helps,

thomas beale

Seref · 26 September 2011 09:23

Greetings Mihai,
You’ve mentioned a few things, but they are all actually complicated topics. I can comment on some of them.

I’ve been working on Ecore and EMF for some time now, to see if I can pull the specs into a fully computable space. The definition of computable is a separate discussion of course, but my personal view is that I’d expect to have an object oriented api for modelling related tasks from a formal representation. EMF is a large ecosystem, with lots of good frameworks supporting the core EMF functionality. It is the most promising thing out there for my purposes.

However, it is not free of issues. If anyone is to use EMF for modelling, EMF should not leak into the software development space. Neither the XSD nor the code you generate from EMF should have dependencies on EMF. When you have this kind of dependency, you lose the opportunity to support non-Java platforms via your modelling technology, which is not acceptable. Even when you’re working entirely in Java, a clean separation should exist.

I’m trying to set XSD as the common layer below EMF, but this is not as easy as it looks. In order to support EMF-based models across platforms via XSD, we need to start with the existing openEHR XSDs. This means that we need XSD → Ecore → XSD round-trip capability. I need to perform some experiments, but the last time I checked this was broken. Ecore models generated from XSDs can be exported to XSD again, but EMF puts some EMF-specific information into the generated schema, which must be manually cleaned up.
Therefore, if you start with XSD to use Ecore, the round trip is broken. I think I may find a way of fixing this, so I have not given up on it, but at the moment it is still a problem.

When it comes to JAXB and XSD-based representation of parts of the openEHR specification, I think it is a good target for improving adoption in the industry. That is why I’m doing the work Thomas has mentioned. But XSD is not lossless when it comes to representing the semantics of object-oriented concepts, as you’ve also written. Beyond that, XSD is about data, not behavior, unless you develop another layer such as SOAP-based web services, ports and methods in WSDL etc. So using XSD to generate classes gives you some capability to move data around (with issues), but you still need to implement the methods from the openEHR specification on top of the generated classes. My current approach is to wrap the XSD-generated classes for storing data, and to use adapters/facades to manage this.
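A minimal sketch of what such wrapping could look like in Java, under stated assumptions: `GeneratedQuantity` stands in for a data-only class produced from an XSD (it is hand-written here, not real JAXB output), and `Quantity`, `QuantityAdapter`, `isComparableTo` and `add` are hypothetical names chosen for illustration rather than anything taken from the openEHR specifications or from the actual project.

```java
import java.util.Objects;

// Stand-in for a class generated from the XSD: it carries data only.
// Class and field names are hypothetical.
final class GeneratedQuantity {
    double magnitude;
    String units;
}

// Behaviour is declared separately from the data carrier.
interface Quantity {
    double magnitude();
    String units();
    boolean isComparableTo(Quantity other);
    Quantity add(Quantity other);
}

// Adapter: wraps the generated object and implements behaviour on top of it,
// so the generated class can be regenerated without touching this logic.
final class QuantityAdapter implements Quantity {
    private final GeneratedQuantity data;

    QuantityAdapter(GeneratedQuantity data) {
        this.data = Objects.requireNonNull(data);
    }

    static QuantityAdapter of(double magnitude, String units) {
        GeneratedQuantity q = new GeneratedQuantity();
        q.magnitude = magnitude;
        q.units = units;
        return new QuantityAdapter(q);
    }

    @Override public double magnitude() { return data.magnitude; }
    @Override public String units() { return data.units; }

    // Illustrative rule only: quantities are comparable when units match exactly.
    @Override public boolean isComparableTo(Quantity other) {
        return Objects.equals(units(), other.units());
    }

    @Override public Quantity add(Quantity other) {
        if (!isComparableTo(other)) {
            throw new IllegalArgumentException(
                "cannot add " + units() + " and " + other.units());
        }
        return of(magnitude() + other.magnitude(), units());
    }

    // The underlying generated object remains available for XML serialisation.
    GeneratedQuantity unwrap() { return data; }
}
```

For example, `QuantityAdapter.of(2.0, "kg").add(QuantityAdapter.of(3.0, "kg"))` yields a quantity of 5.0 kg, while adding kilograms to millimetres is rejected. The trade-off is an extra layer of objects in exchange for keeping behaviour out of the generated code.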

In my humble opinion, the most important contribution from anyone is feedback for solving these modelling framework problems.

Regards
Seref


Mihai_Tarce · 26 September 2011 20:36

Thank you for the prompt reply. I had a look at the links you posted, and it does indeed seem almost exactly what I was thinking of doing.
The svn.openehr.org site seems to be down, but I managed to get some of the files in the repository using the web-view. I’ll have a closer look and post back here.

Mihai_Tarce · 26 September 2011 21:23

Hello,

I heard about the openEHR initiative some time ago and have been following it with interest, from a developer perspective. The recent transition announcement sounds encouraging, however there are some questions which I feel are left unanswered, after doing my best to read through the openEHR documentation on the website:

It would be great to have a more detailed and up-to-date status of projects. Looking at the Java reference implementation, I see it is incomplete, lagging behind the latest version of the specification (at least on the project page), and the latest commits are from about a year ago (with the exception of 2 commits on the xml-serializer and adl-parser – a bug fix from what I saw). Documentation on the kernel is even more scarce. Please do not interpret this a critique against the developers working on it, I just think it would be very helpful to know what parts need the most work, so that other people can contribute. Or is the information available and I just overlooked it?

You can get all the SVN RSS feeds from this page.

Ah, perfect, the feeds are a great help for keeping up-to-date.

The concern above extends to the specification itself, although this is a bit off topic on this mailing list. While news that various governments are adopting the openEHR standard for their health care systems is encouraging, seeing that version 1.0.2 was published almost 3 years ago and that not much information is available about the future of the standard is not. I understand that this is a standard, and therefore should not be subject to a lot of change, but, again, it would be very comforting to have a more detailed roadmap and an up-to-date status available.

actually, the main reason no changes have been made for some time is that over 18 months was spent working on an organisational merger with IHTSDO, which was backed by the IHTSDO board, but fell through in the end.

Sorry to hear that, I’ve only been following this industry for a short time, but the number of competing standards is staggering. I really hope things will get better in the future.

However, the community is pretty active in recording issues, and you can see around 40 issues logged over the last 3 years on the SPECPR tracker. As you have seen, the foundation is about to move to a new governance model, so we will start processing these issues once this happens. During this time, most of the proposed solutions (in no way guaranteed to be mutually consistent!) have been tried out in various implementations in the community, so that when these PRs get processed officially there will be good evidence for deciding quickly on them.

Sounds encouraging. I had looked at the JIRA tracker before, but without knowing about the background of the last 18 months, the 3 year old unresolved issues looked quite worrying.

I’ve thought about starting a clean room implementation, and used the XML schema (via JAXB) to generate the objects. While functional, the result was sub-optimal (generics are not used, Java naming conventions are ignored, etc.). Since Eiffel is used for the original specification, one reason being that it supports OCL-like constraints, I also thought about generating an Ecore (Eclipse EMF) meta-model, which can be generated from UML diagrams (although that would also result in the problems mentioned above) and supports OCL constraints. Manually creating the model might be a better choice in my opinion. Is there any similar initiative currently in progress? I wouldn’t want to reinvent the wheel.

there are a few things going on, and probably there will always be, because the modelling problem has to be attacked from different angles:

‘Faithful’ source models - models that accurately reflect the published semantics of the specifications:

For software development: Erik Sundvall (Linköping, Sweden) and Eric Browne (Australia) have separately started a port of the model to various UML tools. I see Erik has replied to this post, so you have the details there.

this kind of approach is useful for two things: generating online UML documentation & diagrams, and potentially useful for helping to build and/or reason about software (although many people I know involved in building serious systems don’t appear to use UML for much more than team discussions of models).

For Archetype validation and modelling: I have continued to develop the reference ADL workbench, which is model-driven. In this case, the goal was the same: represent the original specifications faithfully. In theory this might have been possible with XMI, but it is hard work, and UML is still lacking in some respects. EMF did not have generic types when I started on this, so I created a ‘Basic Meta-model’ (BMM)specification and schemas, which drive the ADL Workbench. Detailed reasons for this here. The BMM schemas are far easier to understand than XMI, but nevertheless express proper object-model semantics (data view). At some point, I will try to morph these schemas to a form where they can be exported in an EMF format.

Note: currently the BMM schemas express class invariants in a way more familiar to UML users, e.g. optionality is a boolean flag on a class called ‘is_mandatory’. Not all invariants from the specifications are expressible in this way, but most of the ones relating to the information (i.e. data) view of the model are there.

XSD-based models - the openEHR XSDs are - like any XSD - not faithful renditions of the models of the source specifications - due to the different inheritance semantics, lack of generic types and so on. However, they do faithfully express the basic data semantics. Starting from the XSD is clearly useful for generating classes that can deal with openEHR XML data and process it at the XML level. Seref Arikan at CHIME, UCL is working on a bridge between XSD-based Java classes and the BMM models exposed by the ADL Workbench above.

in some imaginary perfect world, all these models would be one and the same thing, but unfortunately, they are not - we are stuck with the differences and limitations. Ecore/EMF may be one place to eventually connect them and integrate them.

Thank you very much for taking the time to explain all of this. Regarding your BMM specification, I can see why you would want to develop your own (at the time, there was no viable alternative), and I have to say I am impressed by the thorough approach you have chosen. Great work!

Even with my limited knowledge of the subject, transforming these models to different languages seems difficult, if not impossible, even before considering lossless round-trip transformations. That’s why I’m hesitant to start an EMF ‘port’: I don’t want to end up with just another semi-compatible model. I’ll read up some more on the alternatives before making a decision.

Especially following the transition announcement, I’m considering working on a tool for easier archetype modeling, which will be essential before any large scale clinical implementation can take place. The recent discussion on the technical mailing list regarding web-based vs. rich client solutions has merit, but for me is still a long way off, at least until the issues above are dealt with. Are there any existing projects that cover this?

The ADL workbench will become a ‘technical’ editor in the near future, i.e. allowing editing of archetypes with the full technical aspects of both the AOM and the RM visible. These are generally hidden as much as possible in clinical modelling tools like the Archetype Editor. This is a short-term effort (I hope;-)

A medium term effort getting underway is called ‘Bosphorus’, by Seref is to connect a Java front-end to the ADL Workbench back-end (i.e. executed as a server) via some more modern technical tricks (ZeroMQ, Google Protocol Buffers) than JNI or other naive wrapping methods.

The reason for doing this is that we think that the work of implementing the front-end and message-based bridge to be less than re-implementing the entire ADL compiler right now. The reason to do that is that some final aspects of ADL 1.5 are still under development, and creating a complete ground up ADL 1.5 Java toolstack right now seems premature (I suspect it is worth waiting for another year before doing that). Seref aims to connect XSD-based Java classes to this framework.

I would therefore think that if you want to work in Java, Bosphorus may be the most interesting project.

I have to admit I only looked at some screenshots of both archetype editors, the Archetype Editor because it’s for Windows (well, .NET), and I’m using Linux (I guess I could try Mono…), and yours because I couldn’t get it to run (there was a page-long dump when starting it from the terminal – I can try again if you’d like more information, or maybe give a hand in trying to work out the cause). Also, to answer your question in the release notes – yes, at least one person is trying to use the Linux build.

I would be more than happy to avoid reimplementing anything I can, especially a compiler, so I’ll continue doing some research – especially on the Bosphorus project you mentioned.

The openEHR website also mentions that ADL and AOM are set to become ISO standards… I assume you’re referring to a specific version – if so, which one? Wouldn’t that version be likely to become the most ‘popular’? Also, I’ve noticed version 2.0 of the ADL is on the official specifications page – is there a roadmap for the ADL 1.x and 2.x branches?

I am confident about the benefits an open implementation of openEHR can provide, and would like to contribute to it. Thank you for reading, I’ve tried to summarize my points as best I could. I am looking forward to hearing your opinions.

Regards,

hope this helps,

More than you know. Thanks!

Mihai_Tarce · 26 September 2011 21:44

Hello,

Sorry for the shotgun approach, but I've been reading up for some time,
and I tried to pack most of my unanswered questions in that post.

However, it is not free of issues. If anyone is to use EMF for
modelling, EMF should not leak into the software development space.
Neither the XSD nor the code you generate from EMF should not have
dependencies to EMF. When you have these kind of dependencies, you
lose the opportunity to support non Java platforms via your modelling
technology, which is not acceptable. Even when you're working in Java
completely, a clean separation should exist.

From what I've seen EMF generates Java code which does implement some
extra functionality (reflection, listener support, etc.), but I don't
imagine generating some POJOs from Ecore models would be very demanding.
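As a rough illustration of that idea, here is a minimal Java sketch that emits the source of a plain class from a simple description. Everything in it is an assumption made for the example: `PojoGenerator` is a hypothetical name, and the attribute map is a stand-in for a real model. It does not use or imitate the Ecore API, and a real generator would also have to handle references, inheritance and generics.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical generator: turns a class name plus an ordered map of
// attribute name -> Java type into source for a dependency-free POJO.
final class PojoGenerator {

    static String generate(String className, Map<String, String> attributes) {
        StringBuilder src = new StringBuilder();
        src.append("public class ").append(className).append(" {\n");

        // One private field per attribute.
        for (Map.Entry<String, String> a : attributes.entrySet()) {
            src.append("    private ").append(a.getValue())
               .append(' ').append(a.getKey()).append(";\n");
        }

        // One getter and one setter per attribute.
        for (Map.Entry<String, String> a : attributes.entrySet()) {
            String name = a.getKey();
            String type = a.getValue();
            String cap = Character.toUpperCase(name.charAt(0)) + name.substring(1);
            src.append("    public ").append(type).append(" get").append(cap)
               .append("() { return ").append(name).append("; }\n");
            src.append("    public void set").append(cap).append('(')
               .append(type).append(' ').append(name)
               .append(") { this.").append(name).append(" = ")
               .append(name).append("; }\n");
        }

        src.append("}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("magnitude", "double");
        attrs.put("units", "String");
        System.out.print(generate("Quantity", attrs));
    }
}
```

The emitted class has no imports at all, which is the property that matters here: nothing from the modelling framework leaks into the generated code.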

I'm trying to set XSD as the common layer below EMF, but this is not
as easy as it looks. In order to support EMF based models across
platforms via XSD, we need to start with the existing openEHR XSDs.
This means that we need XSD -> ECore -> XSD roundtrip capability. I
need to perform some experiments, but last time I've checked this was
broken. ECore models generated from XSDs can be exported to XSD again,
but EMF puts some EMF specific information into generated schema,
which must be manually cleaned up.

The thought of transforming XSD -> Ecore -> XSD (roundtrip) is
frightening, just after trying some of my own (basic) samples. However,
it's also clear to me that having a clearly defined and functional
transformation is fundamental to increase adoption. Personally, after
reading the specifications, I'm still more tempted to go with a
cleanroom approach in Ecore (+OCL), which could then (hopefully) be
transformed to other models as needed (possibly even the original
openEHR XSDs). But I still feel I need to read a lot more before even
considering it.

When it comes to JAXB and XSD based representation of parts of openEHR
specification, I think it is a good target for improving adoption in
the industry. That is why I'm doing the work Thomas has mentioned. But
XSD is not lossless when it comes to representing semantics of Object
Oriented concepts, as you've also written. Beyond that, XSD is about
data, not behavior, unless you develop another layer such as the SOAP
based web services, ports and methods in WSDL etc. So having XSD to
generate classes gives you some capability to move data around (with
issues) but you still need to implement methods from the openEHR
specification on top of the generated classses. My current approach
is to wrap XSD generated classes for storing data, and using
adapters/facades to manage this.

Like you said, behavior is an integral part of openEHR models.
Personally (and, again, based on very little experience), I only see the
XSDs serving for transferring (or maybe temporarily storing) the data,
and delaying the processing until the model is transformed (either to
Ecore, or something else).

Thanks!

thomas.beale · 26 September 2011 22:18

I have to admit I only looked at some screenshots of both archetype editors, the Archetype Editor because it’s for Windows (well, .NET), and I’m using Linux (I guess I could try Mono…), and yours because I couldn’t get it to run (there was a page-long dump when starting it from the terminal – I can try again if you’d like more information, or maybe give a hand in trying to work out the cause). Also, to answer your question in the release notes – Yes, at least one person is trying to use the Linux build

We did not publish the latest release in Linux form - if you have a Windows machine available, you should look at the Windows build to get an idea. In the meantime, Peter Gummer, who has taken care of the Linux builds in the past, can probably create another one (actually, you have all the source, so of course you are welcome to try). The Linux build is pretty faithful - it’s based on Gtk, so there can be minor GUI quirks (Peter usually finds a way to neutralise them). This is one of the attractions of Eiffel - single code-base => multiple platforms, including GUI.

I would be more than happy to avoid reimplementing anything I can, especially a compiler, so I’ll continue doing some research – especially on the Bosphorus project you mentioned.

The openEHR website also mentions that ADL and AOM are set to become ISO standards… I assume you’re refering to a specific version – if so, which one? Wouldn’t that version be likely to become the most ‘popular’? Also, I’ve noticed version 2.0 of the ADL is on the official specifications page – is there a roadmap for the ADL 1.x and 2.x branches?

Wow - that’s a page in need of an update;-) Snapshots of ADL/AOM 1.4 became ISO 13606-2. In this standard (I think I have this right - I don’t use the standard directly myself), the AOM is what is normative, with ADL being one possible serialisation. Dipak Kalra (main editor of 13606) recently said that ADL 1.5 could be automatically considered valid even by the existing standard.

ADL 1.5 fixes the two serious problems in ADL 1.4, and a lot of omissions - it is a major advance, and all the openEHR tools will move to ADL/AOM 1.5 over time. See here for the details.

thomas

system · 27 September 2011 08:38

Hi Mihai,

I will try to respond to your questions regarding the Java reference
implementation project.

I agree that the status on the Java implementation page should be more
up-to-date. Probably the scope of the project should be adjusted as
well. In fact, the current code base provides a Java implementation of
the latest _stable_ openEHR specifications, in particular the
reference models and archetype models. The Java library from the
project is being used by a number of Java-based archetype editors,
application frameworks and also the openEHR Clinical Knowledge
Manager.

The project is lagging behind in implementing the draft specifications,
in particular ADL 1.5, which is a major advance in openEHR's design.
Seref's project is probably the best option if you need it now. My
guess is that eventually we would like to have a Java-based ADL parser
and template compiler in the library.

Regarding the documentation of the Java code, the generated javadoc is
a good place to start. The original openEHR design specifications are
of course the best documentation, since the Java implementation tries to
stay as close to the openEHR design intentions as possible. Last but
not least, there are a large number of unit test cases exercising the
code in various ways, and they can serve as a form of documentation.

As always, you are welcome to contribute to the java implementation.
If you have any further questions, please feel free to post them on
the mailing list dedicated to the java implementation.
http://www.chime.ucl.ac.uk/mailman/listinfo/ref_impl_java

Cheers,
Rong
