Approximate dates

heather.leslie · 3 August 2021 01:11

Reading the specs at 7.1.1.2. Partial Date/Times:
“If not even the year is known, then the date is obviously extremely approximate and it would probably be unsafe to represent it computationally. However, if computatable representation was needed in this case, a date interval can be used. A pedantic example which breaks these rules is someone who claims to be born on “a Monday at the start of May in 1934” (i.e. day but not date unknown). Either the clinician determines what date the first Monday in May 1934 actually was and record that (assuming the patient’s way of accurately remembering just happens to be via day rather than date), or else records a partial date of the form “May 1934” (in ISO 8601 form, "1934-05") if they determine that the patient really is unsure.”

I would like to be able to record an approximate date that is more exact than a partial date (eg month and year) but annotate that it is not precise. I definitely do not want to record an interval such as ‘2020-08-01 - 2020-08-05’.

What I’d like to be able to record is “~2020-08-03” in the same way as DV_QUANTIFIED’s "~" magnitude_status. It is closer to real-life recording in many situations. Any age or duration derived from it would be more accurate, even if qualified by a similar ‘~’, than an age or duration calculated or derived from a partial date.

There conceivably may be similar use cases for ‘<’ or ‘>’, as in ‘earlier than/before’ or ‘later than/after’ a specified date, although I don’t have a specific use case at present. Clearly these can’t be used to drive decision support, but an interval or a partial date has the same significant limitation.
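To illustrate the point about derived ages and durations, here is a minimal sketch of how a qualifier on a date might carry over to an elapsed time calculated from it. The function `elapsed_days` and the flip table are hypothetical and not part of the openEHR RM; the status symbols are the ones discussed in this thread.

```python
from datetime import date
from typing import Tuple

# Elapsed time runs opposite to the date: an event "before" a date
# happened "more than" the computed number of days ago.
_DERIVED_STATUS = {"=": "=", "~": "~", "<": ">", ">": "<", "<=": ">=", ">=": "<="}


def elapsed_days(recorded: date, as_of: date, magnitude_status: str = "=") -> Tuple[str, int]:
    """Return (status, days) for the time elapsed between two dates.

    Hypothetical helper: the qualifier on the recorded date is carried
    over (and flipped where needed) to the derived duration.
    """
    return _DERIVED_STATUS[magnitude_status], (as_of - recorded).days


status, days = elapsed_days(date(2020, 8, 3), date(2021, 8, 3), "~")
print(f"{status}{days} days")  # prints "~365 days"
```

A partial date such as "2020-08" would instead force the calculation to assume some day within the month before any duration could be derived.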

bna · 3 August 2021 07:57

Yes - this has been a need for many years. The first openEHR template we brought to production was one for personal injury. Secondary usage of the data is for a national registry.

It’s hard to define when an accident happened. Sometimes specific timing has no clinical use; sometimes it does. A registry might need to know the context. For example, around 0200 on Sunday morning is when bars and pubs are closing, and 0800 on Monday morning is the morning rush hour in traffic. What I learned is that clinicians want to record the date-time as specifically as possible, while still making the uncertainty known. For example: it happened at 1000, but it could have been any time from after 0900 until 1030.

We have not found a good way to model this yet.

pablo · 3 August 2021 17:13

I understand the use case and it makes perfect sense. The current date/datetime/time types are not good at representing these fuzzy date/time expressions.

New types, something like APPROXIMATED_DATE or APPROXIMATED_DATETIME, might be needed, because in some cases not even a partial date helps. Take “last week” when the 1st of the current month fell within that week: YYYY-MM is not accurate because the week spans two different months. ISO 8601 does have a week-number notation, which we don’t use much, that could help in that case.

With APPROXIMATED_DATETIME we might be able to say “yesterday around 2AM”, which in data would be something like (“YYYY-MM-DD”, “~02:00”), and could include the TZ and the width of the range, like “around 2AM +/- 1hour”: (“YYYY-MM-DD”, “~02:00”, “-03:00”, “1h”).
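As a rough sketch of the tuple above, the four parts could be held in a small value class that can compute the implied range. `ApproximatedDateTime` and its field names are hypothetical, taken only from the idea in this post; no such type exists in the openEHR RM, and the date used is just an example.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta, timezone
from typing import Tuple


@dataclass(frozen=True)
class ApproximatedDateTime:
    """Hypothetical type mirroring ("YYYY-MM-DD", "~02:00", "-03:00", "1h")."""
    day: date
    approx_time: time       # the "~02:00" part
    utc_offset: timedelta   # the "-03:00" part
    half_width: timedelta   # the "+/- 1h" part

    def bounds(self) -> Tuple[datetime, datetime]:
        """Earliest and latest instants covered by the approximation."""
        centre = datetime.combine(
            self.day, self.approx_time, tzinfo=timezone(self.utc_offset)
        )
        return centre - self.half_width, centre + self.half_width


around_2am = ApproximatedDateTime(
    day=date(2021, 8, 2),
    approx_time=time(2, 0),
    utc_offset=timedelta(hours=-3),
    half_width=timedelta(hours=1),
)
earliest, latest = around_2am.bounds()
print(earliest.isoformat(), latest.isoformat())
```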

I don’t think the current DVs can represent these use cases correctly, and I think the key is not just the representation: these kinds of values should also be interpreted correctly when retrieved from the openEHR system. The ISO 8601 week notation and time interval notation could help with some of these use cases, but I’m not 100% sure DV_DATE and DV_DATE_TIME support those formats.

I also found the Extended Date/Time Format (EDTF) Specification from the Library of Congress (check the uncertain and approximate qualifiers).

thomas.beale · 3 August 2021 18:40

You can already do this - it’s built in to the RM. See the inherited fields magnitude_status and accuracy from DV_QUANTIFIED.

To be a bit more graphical… here’s the generated view of the DV_DATE type in ADL Workbench, including inherited attributes:

AWB_DV_DATE

The ‘value’ in yellow is the main value you are used to; the accuracy field allows you to express a numerical accuracy, e.g. +/- P3d, and the magnitude_status field allows you to have the '~'.
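To make that concrete, here is a minimal sketch of a date carrying both fields. `DvDateSketch` is a simplified stand-in and not the real RM class; in the RM the accuracy of a date is a DV_DURATION, for which a `timedelta` is substituted here as an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Tuple


@dataclass
class DvDateSketch:
    """Simplified stand-in for DV_DATE with its inherited fields."""
    value: date
    magnitude_status: str = "="            # "=", "<", ">", "<=", ">=" or "~"
    accuracy: Optional[timedelta] = None   # None is used here for "unknown"

    def render(self) -> str:
        """Show the status as a prefix, e.g. '~2020-08-03'."""
        prefix = "" if self.magnitude_status == "=" else self.magnitude_status
        return f"{prefix}{self.value.isoformat()}"

    def implied_range(self) -> Optional[Tuple[date, date]]:
        """Range implied by reading accuracy as +/- around the value."""
        if self.accuracy is None:
            return None
        return self.value - self.accuracy, self.value + self.accuracy


approx = DvDateSketch(date(2020, 8, 3), magnitude_status="~")
print(approx.render())  # prints "~2020-08-03"

with_accuracy = DvDateSketch(date(2020, 8, 3), accuracy=timedelta(days=3))
print(with_accuracy.implied_range())
```

Note that the two fields are independent in this sketch: a value can be marked '~' without any numerical accuracy being given.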

pablo · 4 August 2021 22:33

@thomas.beale does DV_DURATION allow both signs “+/-”, or do you mean “accuracy” should be interpreted that way without storing “+/-” explicitly?

heather.leslie · 5 August 2021 00:47

Is this a documentation issue then? I couldn’t see that in my reading of the specs and that is why I asked.

When experienced developers are looking for the same thing, we seem to have a problem. Maybe it is just a matter of increasing clarity in the specs.

And of course, the modelling tools don’t have that capability yet but that’s another issue. If it is valid and confirmed to be documented in the specs, I can at least make a tooling CR.

H

thomas.beale · 5 August 2021 11:50

For DV_DATE, DV_DATE_TIME and DV_TIME, accuracy is absolute (meaning +/- x). For types that inherit from DV_AMOUNT, which includes DV_DURATION, accuracy may be expressed either as an absolute value (+/- x) or as a percentage (meaning +/- x%) using the accuracy_is_percent flag. There is no need to store any ‘+/-’; that’s conceptually part of the definition.
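A minimal sketch of how the two forms of accuracy could be interpreted for a DV_AMOUNT-style value follows. The function name `accuracy_half_width` is hypothetical, and the magnitudes are plain numbers (the example treats a duration as a number of seconds, which is an assumption of this sketch).

```python
def accuracy_half_width(magnitude: float, accuracy: float, accuracy_is_percent: bool) -> float:
    """Return x such that the value is read as magnitude +/- x.

    No sign is stored: the '+/-' is implied by the definition.
    """
    if accuracy_is_percent:
        return abs(magnitude) * accuracy / 100.0
    return accuracy


ten_days = 10 * 24 * 3600.0  # a duration expressed in seconds

# 10% of ten days is one day either side.
print(accuracy_half_width(ten_days, 10.0, accuracy_is_percent=True))     # 86400.0
# An absolute accuracy of one hour is used as-is.
print(accuracy_half_width(ten_days, 3600.0, accuracy_is_percent=False))  # 3600.0
```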

The screenshot above shows DV_DATE, so you just see accuracy. For DV_DURATION, it looks like this:
AWB_DV_DURATION

As we see in the spec, DV_DURATION inherits from DV_AMOUNT, and DV_AMOUNT and DV_QUANTIFIED carry the accuracy fields. I would say it is pretty clear from the diagram as well, but maybe we need to add further text.

thomas.beale · 5 August 2021 11:58

Well the Better Archetype Designer uses the BMM files as their input, so the accuracy fields are there (the accuracy: ANY field is extraneous - I have reported this as a bug).

heather.leslie · 5 August 2021 17:25

I can see that what you show in the screenshot is true, but all we can do is relabel them as best I can tell from the tooling, when we actually need to be able to explicitly model to indicate where the magnitude status of ~, >, < are valid or not etc.
The +/- accuracy is not a common requirement - think EDD in pregnancy, but it is not used frequently in other use cases. I’d rather specify that ~August 1 is valid than specify that July 31-August 2 be recorded. The same goes for partial dates: ~Jan 2006 rather than October 2005-March 2006.

thomas.beale · 5 August 2021 19:54

I forgot to mention - you can do that in the magnitude_status field, which is also visible in those screenshots. How well it is supported in tools like AD I have not verified yet…

sebastian.garde · 12 August 2021 07:55

@thomas.beale Would you expect the magnitude_status to be commonly constrained in an archetype?

It is legal of course, and CKM accepts it. (At present it does not explicitly display this though, but that is easy to add.)

However, I assume that in most cases and if I understand @heather.leslie 's example above correctly, modellers would not want to mandate that a magnitude is only approximate rather than exact, but just indicate that it is acceptable if nothing better is available. This means that constraining a magnitude_status to “~”/approximate in an archetype is not giving you that in my understanding. Rather you’d need to constrain magnitude_status to “=”/exact for any place where as a modeller you don’t want to allow any approximate value to be documented?

My understanding of the meaning if magnitude_status is unconstrained in an archetype (~100% of current archetypes) is that the magnitude in data could be anything of approximate, smaller, larger, or exact.

As a sidenote: I wonder how well the magnitude_status is actually supported in systems, especially looking towards decision support:
If magnitude_status is not constrained in the archetype to be exact, any quantified value could just be approximate and to know whether that is the case, you need to look at magnitude_status as well as the accuracy. I then wonder about the relationship and consistency between the ‘fluffy’ magnitude_status (“non-quantified indication of accuracy”) and the accuracy attribute which is the quantified equivalent? (Not to speak of “accuracy_unknown”, its representation of -1 or void in data, and accuracy_is_percent.)
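The reading above can be sketched as a small check. The helper names and the use of a Python set for the archetype constraint are hypothetical, and treating an absent magnitude_status as "=" is an assumption about the RM default that should be verified against the spec.

```python
from typing import Optional, Set

ALL_STATUSES = frozenset({"=", "<", ">", "<=", ">=", "~"})


def effective_status(magnitude_status: Optional[str]) -> str:
    """Assumption: an absent status is read as exact ('=')."""
    return magnitude_status if magnitude_status else "="


def is_valid(magnitude_status: Optional[str], allowed: Optional[Set[str]] = None) -> bool:
    """Check data against a constraint; None models an unconstrained archetype."""
    permitted = ALL_STATUSES if allowed is None else allowed
    return effective_status(magnitude_status) in permitted


def usable_as_exact(magnitude_status: Optional[str]) -> bool:
    """What a decision support rule might check before trusting a value."""
    return effective_status(magnitude_status) == "="


print(is_valid("~"))                  # True: unconstrained, anything goes
print(is_valid("~", allowed={"="}))   # False: modeller required exact values
print(usable_as_exact("~"))           # False
```

Under this reading, constraining to {"~"} would force every value to be approximate, which is why allowing approximation means leaving the field open, or listing both "=" and "~".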

heather.leslie · 12 August 2021 08:45

I think the assumption should be “=”/exact unless it is deliberately constrained differently, especially for decision support use cases etc. Whereas I guess at the moment unconstrained means any and every flavour of date/time is possible.

“~”/approximate allows for clinicians to indicate uncertainty, which is often relevant. But it needs to be configurable in tools.

thomas.beale · 12 August 2021 09:32

I would never expect it to be constrained in an archetype or a template…!

This field originated from lab, and possibly should only / mainly be used there. Although Heather’s example seems reasonable as well.

To be clear here: it doesn’t indicate ‘uncertainty’ in the sense that a clinician might be uncertain (right now) that a patient has meningitis rather than influenza; it’s an uncertainty in value coming from a machine, or the patient him/herself, which is really inaccuracy. Just to be pedantic…

heather.leslie · 18 August 2021 02:43

I think we’re deviating a little from the initial question about how to record ‘approximation’ for a specific date/time. It reflects a clinical need to faithfully and appropriately record an inexactness or uncertainty from the subject of care or record author.

It is not clear to me that we have an agreed answer.

thomas.beale · 18 August 2021 10:35

If we can agree that what we are talking about is ‘accuracy’ in a datum supplied by any means (the doc, the doc using a machine, someone else using a machine, someone else, the patient, …), then I think we are fine.