I’ve been thinking about whether we might want to update or replace ODIN, the JSON-like format used for representing the meta-data and terminology in archetypes. For those who don’t know, ODIN was invented 20 or so years ago, when there was no JSON to speak of, and we’ve kept using it because it’s regular and includes a) a lot more leaf types than JSON, particularly Intervals and Date/time types, and b) type-markers. However, it’s not visually very close to JSON, or even YAML, which is also in relatively wide use.
We have a number of situations in which we want to be able to use literal data structures including at least the following:
- in the openEHR REST APIs
- in the descriptive parts and terminology section of archetypes, templates, OPTs
- in the terminology and constants (‘reference’) section in Decision Logic Modules (example)
- in the Serial / Flat Template Data Format (early spec here), exemplified by EhrScape web templates
Only the first of these is completely regular JSON; everything else has requirements that JSON on its own is too weak to satisfy. However, it is almost standard practice for technology platforms to define JSON-like and/or YAML variants that suit their purposes and then guarantee a (possibly lossy) down-transform to standard JSON.
The nice ‘bar’ trick invented by Better for EhrScape JSON is an example of a non-conforming JSON-like syntax:
{
"|code": "238",
"|value": "other care",
"|terminology": "openehr"
}
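As a rough illustration of what a down-transform to standard JSON could look like for this case, here is a minimal Python sketch. It assumes the bar only marks a key as an attribute of the enclosing value and that simply dropping it is acceptable for the consumer; the function name strip_bar_keys is hypothetical and not part of EhrScape or any openEHR tooling.

```python
import json

def strip_bar_keys(value):
    """Recursively remove a leading '|' from object keys (assumed convention)."""
    if isinstance(value, dict):
        return {
            (key[1:] if key.startswith("|") else key): strip_bar_keys(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [strip_bar_keys(item) for item in value]
    return value

source = '{"|code": "238", "|value": "other care", "|terminology": "openehr"}'
plain = strip_bar_keys(json.loads(source))
print(json.dumps(plain, indent=2))
```

The transform is lossy in the sense that the attribute marker cannot be recovered from the output.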
We would also like to be able to use space-efficient alternatives, e.g. [icd10AM::F60.1], that are not supported in JSON.
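To make the idea of down-converting such a compact literal concrete, here is a small Python sketch that expands a term reference into a plain JSON object. The regular expression and the output key names ("terminology", "code") are assumptions made for illustration only; no grammar for this literal is fixed here.

```python
import json
import re

# Assumed shape of a compact term reference: [terminology_id::code]
TERM_REF = re.compile(r"^\[(?P<terminology>[^:\[\]]+)::(?P<code>[^\[\]]+)\]$")

def expand_term_ref(literal: str) -> dict:
    """Expand a compact term reference into a plain dict (hypothetical key names)."""
    match = TERM_REF.match(literal.strip())
    if match is None:
        raise ValueError(f"not a term reference: {literal!r}")
    return {"terminology": match.group("terminology"), "code": match.group("code")}

print(json.dumps(expand_term_ref("[icd10AM::F60.1]")))
```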
I’ve thought about this question from the direction of defining literal expressions in the Expression Language. This gives a more programming-like syntax that is still designed for down-conversion to JSON, and it enables smart structures like the following (see here):
|
| qRisk3 Risk table
|
Risk_factor_scales = {
[female]: {
[has_atrial_fibrillation]: 1.59233549692696630,
[atypical_antipsychotic_medication]: 0.252376420701155570,
[on_corticosteroids]: 0.595207253046018510,
[has_impotence]: 0,
[has_migraines]: 0.3012672608703450,
[has_rheumatoid_arthritis]: 0.213648034351819420,
[has_chronic_kidney_disease]: 0.651945694938458330,
[has_severe_mental_illness]: 0.125553080588201780,
[has_systemic_lupus]: 0.758809386542676930,
[on_hypertension_treatment]: 0.509315936834230040,
[has_family_history_CV_disease]: 0.454453190208962130
},
[male]: {
[has_atrial_fibrillation]: 0.882092369280546570,
[atypical_antipsychotic_medication]: 0.130468798551735130,
[on_corticosteroids]: 0.454853997504455430,
[has_impotence]: 0.222518590867053830,
[has_migraines]: 0.255841780741599130,
[has_rheumatoid_arthritis]: 0.209706580139565670,
[has_chronic_kidney_disease]: 0.718532612882743840,
[has_severe_mental_illness]: 0.121330398820471640,
[has_systemic_lupus]: 0.440157217445752200,
[on_hypertension_treatment]: 0.516598710826954740,
[has_family_history_CV_disease]: 0.540554690093901560
}
}
;
|
| a Map<Term, Interval<Quantity>> structure
|
ranges = {
------------------------------------
[mild_low_risk]: |<= 99 /min|,
[mild_at_risk]: |100 .. 120 /min|,
[moderate_risk]: |>= 121 /min|
------------------------------------
}
;
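For readers wondering how the interval literals in the ranges example might be down-converted, here is a minimal Python sketch covering only the three forms shown (<=, >= and the .. range). The output key names, the treatment of .. as inclusive at both ends, and the handling of units as an opaque trailing token are all assumptions for illustration, not a definition of the syntax.

```python
import re

NUM = r"-?\d+(?:\.\d+)?"
INTERVAL = re.compile(
    r"^\|\s*(?:(?P<op><=|>=|<|>)\s*(?P<bound>" + NUM + r")"
    r"|(?P<lower>" + NUM + r")\s*\.\.\s*(?P<upper>" + NUM + r"))"
    r"\s*(?P<units>[^|\s]*)\s*\|$"
)

def parse_interval(literal: str) -> dict:
    """Convert an interval literal into a JSON-ready dict (hypothetical key names)."""
    m = INTERVAL.match(literal.strip())
    if m is None:
        raise ValueError(f"not an interval literal: {literal!r}")
    result = {"lower": None, "upper": None,
              "lower_included": False, "upper_included": False,
              "units": m.group("units") or None}
    op = m.group("op")
    if op is None:
        # Two-sided form, assumed inclusive at both ends
        result.update(lower=float(m.group("lower")), upper=float(m.group("upper")),
                      lower_included=True, upper_included=True)
    elif op in ("<=", "<"):
        result.update(upper=float(m.group("bound")), upper_included=(op == "<="))
    else:
        result.update(lower=float(m.group("bound")), lower_included=(op == ">="))
    return result

for text in ("|<= 99 /min|", "|100 .. 120 /min|", "|>= 121 /min|"):
    print(text, "->", parse_interval(text))
```

The resulting dicts can be passed straight to json.dumps, which is the kind of down-conversion to JSON described earlier.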
We could potentially use this syntax to replace some uses of ODIN today, e.g. like this:
term_definitions = {
"en": {
"date_of_birth": {
text = "Date of birth",
provenance = {"GDL2": ["gt0009"]}
},
"age_in_years": {
text = "Age (years)",
provenance = {"GDL2": ["gt0010"]}
},
"age_category": {
text = "Age category",
provenance = {"GDL2": ["gt0017"]}
},
"gender": {
text = "Gender",
provenance = {"GDL2": ["gt0009", "gt0016"]}
}
}
}
Note that the syntax uses {} and [] for containers in the same way as JSON, so it reads in a way close to JSON, and it is easy to down-convert. The = syntax makes it easier to distinguish objects from Map / Array structures.
I am interested in thoughts from the community.