I’ve been cogitating on how rules could be written properly in archetypes, a topic I never spent sufficient time on in the past. As usual, @pieterbos has pushed the envelope and forced me to think about this more carefully. @borut.fabjan, @yampeku and @sebastian.garde might want to take a look, since they are close to tools. If @ian.mcnicoll could take a look from the authoring point of view it would also be good. (Obviously I would like everyone to have a look; I’m just pinging this group since they probably have concrete opinions.)
The challenge in getting rules (i.e. expressions in the form of assertions, assignments etc) in archetypes right is to solve the semantics of how they map to runtime data instances. Rules are instance-level entities whose symbols attach to values at runtime. However, as defined in an archetype, they can only refer indirectly to data, via the archetype paths. So the question is what happens at runtime exactly? Given that data will usually have multiple occurrences of particular structures (e.g. Event structures, multiple Elements in a Cluster, etc), how do those multiple instances get mapped to the symbols in rules? Couldn’t there be an explosion of permutations? Generally speaking, it is intuitively obvious how rules are intended to be applied, but specifying them formally so as to be implemented is not quite so obvious.
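To make the permutation worry concrete, here is a rough Python sketch. It is not openEHR code: the event data, the short field names and both helper functions are invented purely for illustration. It compares resolving each symbol independently across all events with resolving all symbols together inside one event.

```python
from itertools import product

# Hypothetical runtime data: three Apgar events, each with its own six values.
events = [
    {"total": 7, "heartrate": 2, "respiratory": 1, "reflex": 1, "muscle_tone": 2, "skin_colour": 1},
    {"total": 9, "heartrate": 2, "respiratory": 2, "reflex": 2, "muscle_tone": 2, "skin_colour": 1},
    {"total": 10, "heartrate": 2, "respiratory": 2, "reflex": 2, "muscle_tone": 2, "skin_colour": 2},
]

SYMBOLS = ["total", "heartrate", "respiratory", "reflex", "muscle_tone", "skin_colour"]


def naive_bindings(events):
    """Resolve every symbol on its own: each symbol matches one value per event,
    and the rule would see every combination of those matches."""
    candidates = [[evt[symbol] for evt in events] for symbol in SYMBOLS]
    return list(product(*candidates))


def per_event_bindings(events):
    """Resolve all symbols together within the same event (the intended reading)."""
    return [tuple(evt[symbol] for symbol in SYMBOLS) for evt in events]


print(len(naive_bindings(events)))      # 3 ** 6 combinations
print(len(per_event_bindings(events)))  # one binding per event
```

With three events and six symbols, the independent reading produces 3^6 = 729 candidate bindings, most of which mix values from different events, while the per-event reading produces exactly three.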
Below I propose a couple of variants on how we should understand rules in archetypes that we could use to make sense of these questions.
The first is the default approach, using the Apgar sum as a simple example of a rule. In the listing below, you will see the rule and, near the bottom, the bindings to paths. The intention of the rule is that it is applied to each Apgar Event, i.e. potentially up to 5 times (typically 2 or 3).
archetype (adl_version=2.0.6; rm_release=1.0.3; generated; uid=f27fac48-3acb-4061-9619-c783fd8346ab)
    openEHR-EHR-OBSERVATION.apgar.v1.0.1

description
    lifecycle_state = <"unmanaged">
    original_author = <...>
    details = <...>

definition
    OBSERVATION[id1] ∈ { -- Apgar score
        data ∈ {
            HISTORY[id3] ∈ {
                events ∈ {
                    POINT_EVENT[id4] occurrences ∈ {0..1} ∈ { -- 1 minute
                        offset ∈ {
                            DV_DURATION[id42] ∈ {
                                value ∈ {PT1M}
                            }
                        }
                        data ∈ {
                            ITEM_LIST[id2] ∈ {
                                items ∈ {
                                    ELEMENT[id10] occurrences ∈ {0..1} ∈ {
                                        value ∈ {
                                            DV_ORDINAL[id43] ∈ {
                                                [value, symbol] ∈ { -- apgar_respiratory_value
                                                    [{0}, {[at11]}],
                                                    [{1}, {[at12]}],
                                                    [{2}, {[at13]}]
                                                }
                                            }
                                        }
                                    }
                                    ELEMENT[id6] occurrences ∈ {0..1} ∈ {...}
                                    ELEMENT[id14] occurrences ∈ {0..1} ∈ {...}
                                    ELEMENT[id18] occurrences ∈ {0..1} ∈ {...}
                                    ELEMENT[id22] occurrences ∈ {0..1} ∈ {...}
                                    ELEMENT[id26] occurrences ∈ {0..1} ∈ {
                                        value ∈ {
                                            DV_COUNT[id48] ∈ {
                                                magnitude ∈ {|0..10|} -- apgar_total_value
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                    POINT_EVENT[id27] occurrences ∈ {0..1} ∈ { -- 2 minutes
                        offset ∈ {
                            DV_DURATION[id49] ∈ {
                                value ∈ {PT2M}
                            }
                        }
                        data ∈ {
                            use_node ITEM_LIST[id50] /data[id3]/events[id4]/data[id2]
                        }
                    }
                    POINT_EVENT[id28] occurrences ∈ {0..1} ∈ {...} -- 3 minutes
                    POINT_EVENT[id29] occurrences ∈ {0..1} ∈ {...} -- 5 minutes
                    POINT_EVENT[id32] occurrences ∈ {0..1} ∈ {...} -- 10 minutes
                }
            }
        }
    }

rules
    check apgar_total_value = apgar_heartrate_value + apgar_respiratory_value +
        apgar_reflex_value + apgar_muscle_tone_value + apgar_skin_colour_value

symbols
    symbol_definitions = <
        ["en"] = <
            ["apgar_respiratory_value"] = <
                text = <"Apgar score respiratory value">
            >
            ["apgar_heartrate_value"] = <
                text = <"Apgar score heartrate value">
            >
            ["apgar_muscle_tone_value"] = <
                text = <"Apgar score muscle tone value">
            >
            ["apgar_reflex_value"] = <
                text = <"Apgar score reflex value">
            >
            ["apgar_skin_colour_value"] = <
                text = <"Apgar score skin_colour value">
            >
            ["apgar_total_value"] = <
                text = <"Apgar score total value">
            >
        >
    >
    symbol_bindings = <
        ["apgar_respiratory_value"] = <"/data[id3]/events/data[id2]/items[id10]/value[id43]/value">
        ["apgar_heartrate_value"] = <"/data[id3]/events/data[id2]/items[id6]/value[id44]/value">
        ["apgar_muscle_tone_value"] = <"/data[id3]/events/data[id2]/items[id14]/value[id45]/value">
        ["apgar_reflex_value"] = <"/data[id3]/events/data[id2]/items[id18]/value[id46]/value">
        ["apgar_skin_colour_value"] = <"/data[id3]/events/data[id2]/items[id22]/value[id47]/value">
        ["apgar_total_value"] = <"/data[id3]/events/data[id2]/items[id26]/value[id48]/magnitude">
    >
These bindings are to absolute paths, i.e. from the root of the archetype. But if we think about it more carefully, the rule is actually logically embedded within the enclosing ITEM_LIST object, and it could even be written like that (an idea I proposed 10 years ago):
ITEM_LIST[id2] ∈ {
    items ∈ {
        ELEMENT[id10] occurrences ∈ {0..1} ∈ {
            value ∈ {
                DV_ORDINAL[id43] ∈ {
                    [value, symbol] ∈ { -- apgar_respiratory_value
                        [{0}, {[at11]}],
                        [{1}, {[at12]}],
                        [{2}, {[at13]}]
                    }
                }
            }
        }
        ELEMENT[id6] occurrences ∈ {0..1} ∈ {...}
        ELEMENT[id14] occurrences ∈ {0..1} ∈ {...}
        ELEMENT[id18] occurrences ∈ {0..1} ∈ {...}
        ELEMENT[id22] occurrences ∈ {0..1} ∈ {...}
        ELEMENT[id26] occurrences ∈ {0..1} ∈ {
            value ∈ {
                DV_COUNT[id48] ∈ {
                    magnitude ∈ {|0..10|} -- apgar_total_value
                }
            }
        }
    }
    rules
        check apgar_total_value = apgar_heartrate_value +
            apgar_respiratory_value +
            apgar_reflex_value +
            apgar_muscle_tone_value +
            apgar_skin_colour_value
}
For this to make sense, the bindings would need to be something like this, i.e. relative to the path of the enclosing object:
symbol_bindings = <
    ["apgar_respiratory_value"] = <"data[id2]/items[id10]/value[id43]/value">
    ["apgar_heartrate_value"] = <"data[id2]/items[id6]/value[id44]/value">
    ["apgar_muscle_tone_value"] = <"data[id2]/items[id14]/value[id45]/value">
    ["apgar_reflex_value"] = <"data[id2]/items[id18]/value[id46]/value">
    ["apgar_skin_colour_value"] = <"data[id2]/items[id22]/value[id47]/value">
    ["apgar_total_value"] = <"data[id2]/items[id26]/value[id48]/magnitude">
>
Now if we want to stick to rules expressed only at the top-level of an archetype, we could define the path bindings like this:
symbol_bindings = <
    ["apgar_events"] = <
        root = <"/data[id3]/events">
        children = <
            ["apgar_respiratory_value"] = <"data[id2]/items[id10]/value[id43]/value">
            ["apgar_heartrate_value"] = <"data[id2]/items[id6]/value[id44]/value">
            ["apgar_muscle_tone_value"] = <"data[id2]/items[id14]/value[id45]/value">
            ["apgar_reflex_value"] = <"data[id2]/items[id18]/value[id46]/value">
            ["apgar_skin_colour_value"] = <"data[id2]/items[id22]/value[id47]/value">
            ["apgar_total_value"] = <"data[id2]/items[id26]/value[id48]/magnitude">
        >
    >
>
In the above, there are now ‘root points’ and ‘child paths’. The rule could then be expressed like this:
rules
    check
        for_all evt: apgar_events |
            evt/apgar_total_value =
                evt/apgar_heartrate_value +
                evt/apgar_respiratory_value +
                evt/apgar_reflex_value +
                evt/apgar_muscle_tone_value +
                evt/apgar_skin_colour_value
This form of the rule can easily be executed at runtime, since the bindings of the instance data items to the symbols are now unambiguous. However, both the rule and the bindings are more complex. If we just used the original form from the first example, the semantics made explicit in this more complex rule and bindings would have to be inferred.
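As a rough illustration of why the root/children form is easy to execute, here is a minimal Python sketch of a runtime evaluating the for_all rule. Everything apart from the path strings is an assumption made for the example: the data is modelled as nested dicts keyed by path segment, and `resolve`, `make_event` and `check_apgar` are hypothetical helpers, not part of any openEHR library.

```python
# Root and child paths copied from the bindings above.
ROOT = "/data[id3]/events"
CHILDREN = {
    "apgar_respiratory_value": "data[id2]/items[id10]/value[id43]/value",
    "apgar_heartrate_value": "data[id2]/items[id6]/value[id44]/value",
    "apgar_muscle_tone_value": "data[id2]/items[id14]/value[id45]/value",
    "apgar_reflex_value": "data[id2]/items[id18]/value[id46]/value",
    "apgar_skin_colour_value": "data[id2]/items[id22]/value[id47]/value",
    "apgar_total_value": "data[id2]/items[id26]/value[id48]/magnitude",
}


def resolve(node, path):
    """Walk a path one segment at a time through nested dicts (toy resolver)."""
    for segment in path.strip("/").split("/"):
        node = node[segment]
    return node


def make_event(respiratory, heartrate, muscle_tone, reflex, skin_colour, total):
    """Build one hypothetical event instance in the toy nested-dict layout."""
    return {
        "data[id2]": {
            "items[id10]": {"value[id43]": {"value": respiratory}},
            "items[id6]": {"value[id44]": {"value": heartrate}},
            "items[id14]": {"value[id45]": {"value": muscle_tone}},
            "items[id18]": {"value[id46]": {"value": reflex}},
            "items[id22]": {"value[id47]": {"value": skin_colour}},
            "items[id26]": {"value[id48]": {"magnitude": total}},
        }
    }


def check_apgar(observation):
    """Evaluate the rule once per event found at the root point."""
    results = []
    for evt in resolve(observation, ROOT):
        # Child paths are resolved relative to the current event only.
        v = {name: resolve(evt, path) for name, path in CHILDREN.items()}
        results.append(
            v["apgar_total_value"]
            == v["apgar_heartrate_value"]
            + v["apgar_respiratory_value"]
            + v["apgar_reflex_value"]
            + v["apgar_muscle_tone_value"]
            + v["apgar_skin_colour_value"]
        )
    return results


observation = {
    "data[id3]": {
        "events": [
            make_event(2, 2, 1, 2, 1, total=8),   # components sum to 8
            make_event(2, 2, 2, 2, 1, total=10),  # components sum to 9
        ]
    }
}
print(check_apgar(observation))
```

The point of the sketch is that the loop variable plays the role of `evt`: no symbol is ever resolved outside the event it belongs to, so there is nothing for the runtime to guess.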
We could potentially allow the following in order to simplify the rule a bit:
rules
    check
        apgar_events.for_all (
            apgar_total_value =
                apgar_heartrate_value +
                apgar_respiratory_value +
                apgar_reflex_value +
                apgar_muscle_tone_value +
                apgar_skin_colour_value
        )
This requires the runtime to infer that the expression inside the parentheses is really something like the following lambda:
agent (evt: EVENT) {
    evt.child(apgar_total_value) = evt.child(apgar_heartrate_value) + ...
}
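One way to picture that inference step is the Python sketch below. It is only an analogy under stated assumptions: `child`, `lift` and `for_all` are hypothetical names, and each event is simplified to a flat dict keyed by symbol name rather than real path-resolved data. The rule body is written over bare symbols, and the runtime wraps it into a function of one event.

```python
import inspect


def child(evt, symbol):
    """Hypothetical stand-in for resolving a symbol's child path within one event."""
    return evt[symbol]


def lift(body):
    """Turn a rule body over bare symbols into a function of a single event."""
    names = list(inspect.signature(body).parameters)
    return lambda evt: body(**{name: child(evt, name) for name in names})


def for_all(events, body):
    """True if the lifted rule body holds for every event."""
    per_event = lift(body)
    return all(per_event(evt) for evt in events)


# The rule body as the author would write it: bare symbols, no mention of evt.
def apgar_sum(apgar_total_value, apgar_heartrate_value, apgar_respiratory_value,
              apgar_reflex_value, apgar_muscle_tone_value, apgar_skin_colour_value):
    return apgar_total_value == (
        apgar_heartrate_value + apgar_respiratory_value + apgar_reflex_value
        + apgar_muscle_tone_value + apgar_skin_colour_value
    )


consistent_events = [
    {"apgar_total_value": 8, "apgar_heartrate_value": 2, "apgar_respiratory_value": 2,
     "apgar_reflex_value": 1, "apgar_muscle_tone_value": 2, "apgar_skin_colour_value": 1},
    {"apgar_total_value": 5, "apgar_heartrate_value": 1, "apgar_respiratory_value": 1,
     "apgar_reflex_value": 1, "apgar_muscle_tone_value": 1, "apgar_skin_colour_value": 1},
]
inconsistent_events = consistent_events + [
    {"apgar_total_value": 9, "apgar_heartrate_value": 2, "apgar_respiratory_value": 2,
     "apgar_reflex_value": 1, "apgar_muscle_tone_value": 2, "apgar_skin_colour_value": 1},
]

print(for_all(consistent_events, apgar_sum))
print(for_all(inconsistent_events, apgar_sum))
```

The cost of the shorter rule syntax is visible in `lift`: the runtime has to work out which symbols the body uses and that they all belong to the current event, which is exactly the semantics the explicit form states outright.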
There are more challenges when the rule(s) include paths from elsewhere in the archetype (e.g. /protocol or /state in an openEHR archetype) because then you have some paths inside a multiply valued node (EVENT[*]) and some outside. I won’t complicate the story with that for now, since I think the above gives us enough to think about.
In summary: the question is about striking the right balance between explicitly stated and inferred semantics for rules within archetypes.
Reality check: it should be remembered that there are other places where rules could be applied to data. For example, a more flexible approach would be to associate a DLM (Decision Logic Module) with a template, or just an application. This allows rules to be written that can be bound to:
- variables anywhere in a template, not just in an archetype
- data returned from AQL queries ranging across completely separate Compositions etc from the EHR
- data not even in an openEHR system, e.g. MPI demographic data.
The general case is the reason for the work on DLMs and the Subject Proxy Service.
I mention this because I think that there are limits on what we should try to do with rules inside archetypes, and the complexity of the binding problem may not justify making them too powerful. Others may disagree.
My personally preferred solution would be the more explicit (i.e. more complex) representation, because:
- it makes implementation of rule execution much easier
- we can assume that rule-writing will be tool assisted, e.g. AD, LinkEHR etc would add some smarts to make it easy for the author to select paths etc, and the tool will generate the correct rule expression, bindings etc.
- rules probably won’t be used that much in archetypes anyway, since more realistic rules will usually range over variables from multiple archetypes within a template or an application retrieval data set.
(It is also simpler to specify in the ADL specification.)
All feedback welcome.