Wish list updated and current status of the Java Archetype Editor

Hi all,

Status:
Currently our editor supports Composition and Section archetypes to the extent that the kernel allows.
It also supports event series in Observation archetypes and some bugs have been fixed.

The most severe issues are:

  1. The kernel and parser don’t fully support archetype slots, which means the included or excluded archetypes aren’t parsed, stored, or output.
  2. The new archetype format with [] isn’t supported by the kernel and parser.
  3. Instruction and Action archetypes aren’t supported by the kernel and parser.
  4. Embedded dADL inside cADL, e.g. "C_QUANTITY = < … > ", isn’t supported by the kernel and parser.
  5. Assumed value isn’t supported by the kernel and parser.

Solutions to the issues listed above will accordingly result in the following for our archetype editor:

  1. The editor will fully support Section and Composition archetypes.
  2. & 4. & 5. The editor will support many of the archetypes in the openEHR knowledge repository and the Ocean Informatics archetype repository.
  3. The editor will be able to support archetypes for the whole entry package.

We would really appreciate some approximate estimates of when the issues listed above will be solved. It would be nice if the easiest were dealt with first, so that progress on our archetype editor isn’t halted.

Regards,

Johan & Mattias

Hi Johan, Mattias,

Thanks for the update on the editor.

I have been working on the parser recently. Here is my update of the
parser development in relation to your wish list.

Issue No. 2 is caused by the object key syntax included in dADL, which is
now used to specify most parts of an archetype, e.g. description and
ontology. The current parser doesn't really recognize dADL properly;
instead it uses more specific syntax to parse the different parts of the
archetype. This means support for [] can be added to each part of the
parser by updating the grammar for that part. This is of course not the
ideal approach, but it works as a quick fix to the existing parser
without major changes.

The support for [] in the archetype description is probably OK now, and I
hope the update to the ontology part can be done within a couple of days.

No. 5 (assumed values) will be dealt with after No. 2, so perhaps it
will be ready by the end of this week.

No. 4 is related to [] support as well.

A complete fix for No. 1 (archetype slots) requires implementing the
assertion package of the AOM and its link to the parser. This should
happen within 2-3 weeks, hopefully.

No. 3 requires some updates to the RM, particularly the entry package
from ehr_im, which is quite different from the current kernel. It's hard
to estimate since it depends on how many developers can contribute. But
let's say the UCL group or your group can start work on the RM upgrade
while I fix the parser/AOM; then it should be possible to have these
done within 3-4 weeks.

Regards,
Rong

2006/3/27, Rong Chen <rong@acode.se>:

Hi Johan, Mattias,

Thanks for the update on the editor.

I have been working on the parser recently. Here is my update of the parser development in relation to your wish list.

No.2 issue is caused by the object key syntax included in dADL, which is now used to specify most parts of archetype, e.g. description, ontology. The current parser doesn’t really recognize dADL properly, instead it uses more specific syntax to parse different parts of the archetype. It means the support of [] can be added into different parts of the parser by updating the grammar for the different parts respectively. This is of course not the ideal way to do it, but it works as a quick fix to the existing parser without major changes.

Hi everyone,

Thanks Rong for the time estimates. We have seen that the parser uses specific syntax for each case, but we assume it is hard to implement a general parser with JavaCC. As long as the description and ontology parts are changed, there shouldn’t be many problems parsing some of the Ocean archetypes.

The support for [] in archetype description is probably OK now and I hope the update on the ontology part could be done within a couple of days.

No.5 (assumed values) will be dealt with after the No.2, so perhaps it will be ready end of this week.

No.4 is related to [] support as well.

Great to hear that you have made progress; we’re looking forward to seeing some commits soon :)

As we might have told you before, No. 4 often relates to the C_QUANTITY data type, and object instantiation in the kernel can be completely ignored for this data type as long as the AOM for it hasn’t been updated.

Complete fix for No.1 (archetype slot) requires implementing assertion package of AOM and its link to the parser. This would happen with 2-3 weeks hopefully.

We looked at some specifications and we’re wondering whether the assertion package needs to be implemented now, since it relates to an optional section in ADL. At this stage we would be happy just to see the archetype slots parsed into the AOM with real includes and excludes.

What we would like (if possible) is for the assertion package to be implemented after the archetype slot object supports includes and excludes. Maybe the UCL group can implement the assertion package?

No. 3 issues require some update of the RM, particularly the entry package from ehr_im which is quite different from the current kernel. It’s hard to estimate since it depends how many developer could contribute. But let say if the UCL group or your group can start to work on the RM upgrade while I am fix the parser/AOM, then it should be possible for these to be done within 3-4 weeks.

We will focus on our archetype editor as long as we don’t get completely stuck waiting on updates to the kernel/parser.

Cheers,

Mattias & Johan

Johan Hjalmarsson wrote:

Hi all,

Status:
Currently our editor supports Composition and Section archetypes to
the extent that the kernel allows.
It also supports event series in Observation archetypes and some bugs
have been fixed.

The most severe issues are:
1. Kernel and parser don't support archetype slots fully which means
the included or excluded archetypes aren't parsed, stored nor outputted.

you need this for Instruction and Action archetypes as well as for
Section archetypes. In general, any kind of archetype can have a slot.

2. The new archetype format with [] isn't supported by the kernel and
parser.

this should be very easy for Rong to change

3. Instruction and Action archetypes aren't supported by the kernel
and parser.

I suggest leaving this till we have it completely sorted in the ADL
workbench and Archetype Editor - which I think we achieved last night,
but we need a few days for testing etc. We will put all the latest
archetypes onto openEHR.org within a week - these will become the
reference archetypes.

4. Embedded dADL inside cADL e.g. "C_QUANTIY = < ... > " isn't
supported by the kernel and parser.

this is easy to handle - as long as you can detect sections of dADL
embedded in cADL - the easiest way to do this is in the archetype
lexical analyser - just read in the whole section and pass it to a dADL
parser (you need to have separate cADL and dADL parsers).
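
A rough sketch of the idea above, in Java since that is what the kernel
and parser use: find the domain type name, then read everything between
the following '<' and its matching '>' so it can be handed to a dADL
parser. The names EmbeddedDadlScanner and extractBlock are made up for
illustration and are not part of the existing kernel or parser. Allowing
an optional '=' between the type name and '<' is an assumption based on
the example quoted above.

```java
final class EmbeddedDadlScanner {

    /**
     * Finds typeName followed (after optional whitespace or '=') by '<' and
     * returns the text between that '<' and its matching '>'.
     * Returns null if no complete block is found.
     */
    static String extractBlock(String cadl, String typeName) {
        if (typeName.isEmpty()) {
            throw new IllegalArgumentException("typeName must not be empty");
        }
        int from = 0;
        while (true) {
            int nameAt = cadl.indexOf(typeName, from);
            if (nameAt < 0) {
                return null; // type name not present (or no '<' after it)
            }
            int i = nameAt + typeName.length();
            while (i < cadl.length()
                    && (Character.isWhitespace(cadl.charAt(i)) || cadl.charAt(i) == '=')) {
                i++;
            }
            if (i < cadl.length() && cadl.charAt(i) == '<') {
                return readBalanced(cadl, i);
            }
            from = nameAt + typeName.length(); // try the next occurrence
        }
    }

    /** Reads from the '<' at index open to its matching '>', ignoring brackets inside quoted strings. */
    private static String readBalanced(String text, int open) {
        int depth = 0;
        boolean inString = false;
        for (int i = open; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"' && i > 0 && text.charAt(i - 1) != '\\') {
                inString = !inString;
            } else if (!inString && c == '<') {
                depth++;
            } else if (!inString && c == '>') {
                depth--;
                if (depth == 0) {
                    return text.substring(open + 1, i);
                }
            }
        }
        return null; // unbalanced brackets
    }
}
```

The returned string would then be passed to the separate dADL parser,
while the cADL parser only needs to know that a domain type block was
present at that position.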

5. Assumed value isn't supported by the kernel and parser.

this should be easy. There is a test archetype on openEHR.org to test this.

- thomas

Rong Chen wrote:

Complete fix for No.1 (archetype slot) requires implementing assertion
package of AOM and its link to the parser. This would happen with 2-3
weeks hopefully.
  

I suggest you limit this to just supporting slots which are defined by
regular expressions on archetype ids, and nothing more. I have rewritten
the assertion grammar today and am testing it. This should help.
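
To make the "regular expressions on archetype ids" idea concrete, here
is a minimal sketch of a slot that only holds include and exclude
patterns. ArchetypeSlotSketch and allows are hypothetical names, not the
AOM classes. The matching policy (an exclude always wins, and an empty
include list allows anything not excluded) is an assumption made for the
sketch and should be checked against the ADL/AOM specifications.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class ArchetypeSlotSketch {
    private final List<Pattern> includes;
    private final List<Pattern> excludes;

    ArchetypeSlotSketch(List<String> includeRegexes, List<String> excludeRegexes) {
        this.includes = compileAll(includeRegexes);
        this.excludes = compileAll(excludeRegexes);
    }

    private static List<Pattern> compileAll(List<String> regexes) {
        List<Pattern> patterns = new ArrayList<>();
        for (String regex : regexes) {
            patterns.add(Pattern.compile(regex));
        }
        return patterns;
    }

    /** True if the archetype id may fill this slot under the assumed policy. */
    boolean allows(String archetypeId) {
        for (Pattern p : excludes) {
            if (p.matcher(archetypeId).matches()) {
                return false; // assumed: an exclude always wins
            }
        }
        if (includes.isEmpty()) {
            return true; // assumed: no includes means "anything not excluded"
        }
        for (Pattern p : includes) {
            if (p.matcher(archetypeId).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

With this shape the parser only has to store the regex strings it finds
in the include and exclude parts of a slot; no general assertion
evaluation is needed.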

No. 3 issues require some update of the RM, particularly the entry
package from ehr_im which is quite different from the current kernel.
It's hard to estimate since it depends how many developer could
contribute. But let say if the UCL group or your group can start to work
on the RM upgrade while I am fix the parser/AOM, then it should be
possible for these to be done within 3-4 weeks.
  

my group at UCL are working on this - they have done data types and I
think data structures by this week. I guess 2 - 3 weeks.

- thomas

Mattias Forss wrote:

2006/3/27, Rong Chen <rong@acode.se <mailto:rong@acode.se>>:

Johan Hjalmarsson wrote:

Hi everyone,

Thanks Rong for the time estimates. We have seen that the parser uses
specific syntax for each case, but we assume it is hard to implement a
general parser with JavaCC. As long as the description and ontology
parts are changed, then it wouldn't be much problems in parsing some
Ocean archetypes.

I don't know how JavaCC works, but I have done a lot of parsing. The
general design has to be 4 separate parsers:
- dADL parser
- cADL parser
- ADL archetype parser (very simple; just extracts the sections and gives
them to the other parsers)
- assertion language parser

with ADL2, the ADL parser will go away (ADL2 means the archetype is
completely in dADL, with embedded cADL and assertion sections).
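
As a sketch of how simple the top-level ADL parser can be, the Java
below splits an archetype into sections at the section keywords and says
which sub-parser each section would go to. AdlSectionSplitter, split and
parserFor are hypothetical names. It assumes each keyword stands alone on
its own line, which ignores details such as a version clause after
"archetype" or a specialisation section.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class AdlSectionSplitter {
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "archetype", "concept", "language", "description",
            "definition", "invariant", "ontology"));

    /** Returns section keyword -> raw section text, in document order. */
    static Map<String, String> split(String adl) {
        Map<String, String> sections = new LinkedHashMap<>();
        String current = null;
        StringBuilder body = new StringBuilder();
        for (String line : adl.split("\\r?\\n")) {
            String word = line.trim();
            if (KEYWORDS.contains(word)) {
                if (current != null) {
                    sections.put(current, body.toString());
                }
                current = word;
                body.setLength(0);
            } else if (current != null) {
                body.append(line).append('\n');
            }
        }
        if (current != null) {
            sections.put(current, body.toString());
        }
        return sections;
    }

    /** Names the sub-parser a section would be handed to. */
    static String parserFor(String section) {
        switch (section) {
            case "definition":
                return "cADL";
            case "invariant":
                return "assertion";
            case "language":
            case "description":
            case "ontology":
                return "dADL";
            default:
                return "none"; // archetype id and concept code are read directly
        }
    }
}
```

The sub-parsers never see each other's syntax; the splitter is the only
place that knows about the overall archetype layout.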

Currently, I just use the cADL parser to parse assertions, since (due to
archetype slots) cADL can have embedded assertions. So I pass the
invariant section of an archetype to the cADL parser and it can produce
the result. BUT.... I suggest that you leave this to last, and you could
reasonably assume that there will be no invariant section at all - I
think we only have one archetype so far where it is the case.

We looked at some specifications and we're wondering if it is needed
to implement the assertion package now, since it adheres to an
optional section in ADL. At this stage we would be happy just to see
that the archetype slots are parsed into the AOM with real includes
and excludes.

exactly.....just what I suggest

What we would like to happen (if possible) is that the assertion
package is implemented after the archetype slot object is supported
with includes and excludes. Maybe the UCL group can implement the
assertion package?

at the moment it is not a priority, because assertions (apart from very
simple ones in slot expressions) are not a priority for archetypes users.

- thomas

We have come across some issues that concern the parser and AOM.

For example, the current implementation treats the standard ADL COUNT
data type as if it were an openEHR profile C_COUNT data type. This means
that whenever the parser finds a COUNT, it checks for mandatory
attributes belonging to C_COUNT (openEHR profile data type) and tries to
instantiate the CCount class in the kernel, but this is not correct.
Instead it should only instantiate a CComplexObject, so that a data type
can have no attributes at all, e.g. COUNT matches {*}, which means that
the CComplexObject has the any allowed flag set to true. So for all
constraints that are domain type constraints, the parser needs to check
for the correct syntax, which is C_COUNT, C_QUANTITY etc.

Another thing is that RM object creation sometimes relies on the
constraint syntax being DV_TEXT etc., but we think the DV check in the
function createRMObject should be skipped.

Regards,

Johan & Mattias

Johan Hjalmarsson wrote:

We have come up some issues that concerns the parser and AOM.

For example the current implementation support standard ADL COUNT data
type as it would be an openEHR profile C_COUNT data type. This means
that whenever the parser finds a COUNT it will check for mandatory
attributes that adhere to the C_COUNT (openEHR profile data type) and
try to instantiate the CCount class in the kernel, but this is not
correct. Instead it should only instantitate a CComplexObject so that
it could be possible to have a data type that don't have any
attributes, e.g. COUNT matches {*} which means that the CComplexObject
has the any allowed flag set to true. So this means, that for all the
constraints that are domain type constraints you need to check for the
correct syntax in the parser which is C_COUNT, C_QUANTITY etc.

there is something wrong here; there is no such thing as C_COUNT; only
C_QUANTITY....see
http://svn.openehr.org/specification/TRUNK/publishing/architecture/am/openehr_archetype_profile.pdf

Another thing is that RM object creation sometimes rely on that the
syntax for constraints is DV_TEXT etc, but we think that the DV check
in the function createRMObject should be skipped.

Do you mean to do with the names, i.e. whether "DV_" is present? This is
a small problem which is more or less political. If we want archetypes
to be shared beyond the openEHR community, they see "DV_" as an
openEHR-ism, so we often allow it to be removed in archetypes. The
proper solution is that a fixed set of aliases of class names is
allowed; at the moment, this includes:

- all DV_XXX can be just XXX in archetypes
- all ITEM_XXX can be just XXX in archetypes

The Ocean editor implements this mapping. Any tool should probably allow
both kinds of class names, i.e. the alias should not replace the
original set of valid names, just add to it.
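
A small Java sketch of such an alias lookup, where the full name is
always accepted and the short form is tried with each prefix in turn.
RmClassAliases and resolve are hypothetical names, and the set of known
class names is supplied by the caller rather than taken from any real RM
implementation.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

final class RmClassAliases {
    // Prefixes that may be dropped in archetypes, per the two rules above.
    private static final String[] PREFIXES = {"DV_", "ITEM_"};

    private final Set<String> knownClasses;

    RmClassAliases(Collection<String> knownClasses) {
        this.knownClasses = new HashSet<>(knownClasses);
    }

    /**
     * Returns the full RM class name for a name as written in an archetype,
     * or null if neither the name nor a prefixed form is known.
     */
    String resolve(String nameInArchetype) {
        if (knownClasses.contains(nameInArchetype)) {
            return nameInArchetype; // original names stay valid
        }
        for (String prefix : PREFIXES) {
            String candidate = prefix + nameInArchetype;
            if (knownClasses.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }
}
```

Something like createRMObject could call resolve first and then work
only with full class names, instead of testing for the "DV_" prefix
itself.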

- thomas

2006/4/3, Thomas Beale <Thomas.Beale@oceaninformatics.biz>:

Johan Hjalmarsson wrote:

We have come up some issues that concerns the parser and AOM.

For example the current implementation support standard ADL COUNT data
type as it would be an openEHR profile C_COUNT data type. This means
that whenever the parser finds a COUNT it will check for mandatory
attributes that adhere to the C_COUNT (openEHR profile data type) and
try to instantiate the CCount class in the kernel, but this is not
correct. Instead it should only instantitate a CComplexObject so that
it could be possible to have a data type that don’t have any
attributes, e.g. COUNT matches {*} which means that the CComplexObject
has the any allowed flag set to true. So this means, that for all the
constraints that are domain type constraints you need to check for the
correct syntax in the parser which is C_COUNT, C_QUANTITY etc

there is something wrong here; there is no such thing as C_COUNT; only
C_QUANTITY…see
http://svn.openehr.org/specification/TRUNK/publishing/architecture/am/openehr_archetype_profile.pdf

Yes this is most definitely wrong. We got confused when we saw the class CCount in the Java kernel. It inherits from CDomainType, but according to the specifications this class shouldn’t even exist. Maybe it exists as a helper so that one doesn’t need to use CComplexObject and do extra verification, for example checking for a possible attribute “magnitude”. The problem remains that the parser won’t allow COUNT with no attributes, e.g. no magnitude. However, it is possible to create COUNTs with empty attributes with Ocean’s editor.

Another thing is that RM object creation sometimes rely on that the
syntax for constraints is DV_TEXT etc, but we think that the DV check
in the function createRMObject should be skipped.

Do you mean to do with the names, i.e. whether “DV_” is present? This is
a small problem which is more or less political. If we want archetypes
to be shared beyond the openEHR community, they see “DV_” as an
openEHR-ism, so we often allow it to be removed in archetypes. The
proper solution is that a fixed set of aliases of class names is
allowed; at the moment, this includes:

  • all DV_XXX can be just XXX in archetypes
  • all ITEM_XXX can be just XXX in archetypes

The Ocean editor implements this mapping. Any tool should probably allow
both kinds of class names, i.e. the alias should not replace the
original set of valid names, just add to it.

This sounds okay, as long as the naming is consistent. We don’t think it would be a good idea to allow things to be all mixed up. One question, however: why is the name C_QUANTITY often used in Ocean’s archetypes, but not C_ORDINAL (only ORDINAL with a “value” attribute, not the specified “value” and “symbol”) or C_CODED_TEXT? It is very hard to remember the naming conventions, and it would be good if the relationships between the names were explained in some FAQ. The question is probably easy to answer for whoever wrote the specifications, but we tend to mix these up all the time.

Regards,

Mattias & Johan

Mattias Forss wrote:

Yes this is most definitely wrong. We got confused when we saw the
class CCount in the Java kernel. It's inheriting CDomainType but
according to the specifications, this class shouldn't even
exist. Maybe it exists as help so that one doesn't need to use the
CComplexObject and do extra verification, for example checking for a
possible attribute "magnitude". The problem is still that the parser
won't allow COUNT with no attributes, e.g. no magnitude. However it is
possible to create COUNTs with empty attributes with Oceans editor.

All CDomainType classes will be updated according to the latest
openehr_archetype_profile in the AOM and parser soon.

Rong

Mattias Forss wrote:

Yes this is most definitely wrong. We got confused when we saw the
class CCount in the Java kernel. It's inheriting CDomainType but
according to the specifications, this class shouldn't even
exist. Maybe it exists as help so that one doesn't need to use the
CComplexObject and do extra verification, for example checking for a
possible attribute "magnitude". The problem is still that the parser
won't allow COUNT with no attributes, e.g. no magnitude. However it is
possible to create COUNTs with empty attributes with Oceans editor.

some_attr matches {
    COUNT matches {*}
} -- this means: some_attr has to be a COUNT, but we don't care what the
  -- values are (any value is valid)

This kind of pattern has to be allowed for any type.
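
For illustration only, here is how a parser might treat "TYPE matches
{*}" uniformly for any type: it records the type name and an any-allowed
flag, and does not look for mandatory attributes such as magnitude.
ComplexObjectSketch is a hypothetical stand-in, not the kernel's
CComplexObject, and the regular expression handles only this one-level
shape rather than full cADL.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ComplexObjectSketch {
    private static final Pattern CONSTRAINT =
            Pattern.compile("\\s*(\\w+)\\s+matches\\s*\\{(.*)\\}\\s*", Pattern.DOTALL);

    final String rmTypeName;
    final boolean anyAllowed;
    final String body; // unparsed attribute constraints; empty when anyAllowed

    private ComplexObjectSketch(String rmTypeName, boolean anyAllowed, String body) {
        this.rmTypeName = rmTypeName;
        this.anyAllowed = anyAllowed;
        this.body = body;
    }

    /** Parses "TYPE matches {...}"; a body of "*" means any value is valid. */
    static ComplexObjectSketch parse(String constraint) {
        Matcher m = CONSTRAINT.matcher(constraint);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                    "not a 'TYPE matches {...}' constraint: " + constraint);
        }
        String inner = m.group(2).trim();
        boolean any = inner.equals("*");
        return new ComplexObjectSketch(m.group(1), any, any ? "" : inner);
    }
}
```

Under this sketch, attribute checks specific to a type would run only
when anyAllowed is false.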

- thomas