Should Compositions be written with local timezone or UTC datetimes?

joshua.grisham · 12 September 2023 10:41

Are there any consensus, guidelines, or previous experience that could help us to decide what timezone should be used when storing date and/or time data which is actually written and saved to our (for example) Compositions within an openEHR context?

We at Karolinska University Hospital have had a few discussions on this and can see pros and cons to each approach, but thought to pose the question to the international community here.

From a developer and system perspective it is often a typical first reaction that we want to store dates and times in UTC, and then the UTC timestamps can be converted to whatever locale you want when you are displaying the content somewhere.

However, within a healthcare context maybe it does make sense for certain “clinical” dates/times to be written into (for example) Compositions using the local timestamp where the actual events occurred?

What we are leaning towards is that we will try to start with writing “clinical” date and times with the local timestamp where the actual event occurred (for example the CONTEXT’s START_TIME), and technical timestamps (for example the DV_DATE_TIME under FEEDER_AUDIT) using UTC.

Unfortunately, we do see some inconsistent behavior when querying with the data by timezone in AQL depending on the specific openEHR platform we test with and variations of timestamps either written in the composition and/or given as query parameters. It seems that this could probably be one of the main “sticking points” with picking one way or the other given the state of currently available implementations?

In my mind, for example, we should be able to give a timestamp with a timezone as a value in the WHERE clause of an AQL query (e.g. WHERE c/context/start_time/value >= '2023-09-11T00:00:00.000Z' ) where the system could do a conversion of the timezone given in the query vs the timezone of all of the values it is doing the comparison against (for example that the previously mentioned WHERE clause would match and return a Composition with a context/start_time written/saved as 2023-09-10T22:00:00.000+05:00).

I have not been able to find anything that explicitly talks about “which” timezone should be saved (UTC vs local timezone, and/or a difference between “technical” vs “clinical” time values) and how mixed timezones are/should be compared against each other, only that there “should” exist a timezone on data which is saved and ideally given as parameters to queries (as was discussed in this thread: Datetimes in AQL and API)

joshua.grisham · 12 September 2023 11:27

Regarding different behaviors with different openEHR implementations…

With EHRBase (0.30.0) for example it feels like there is actually a bug and it does not even work to filter on a DV_DATE_TIME

I created 2 example compositions with different timezones which you can see in this AQL query result:

{
    "q": "SELECT c/context/start_time/value AS start_time FROM EHR e CONTAINS COMPOSITION c",
    "columns": [
        {
            "path": "/context/start_time/value",
            "name": "start_time"
        }
    ],
    "rows": [
        [
            "2023-09-08T06:35:48.291Z"
        ],
        [
            "2023-09-07T22:35:48.291+07:00"
        ]
    ]
}

But it seems like querying with WHERE against c/context/start_time doesn’t really work at all that I can tell? If I try to use c/context/start_time/value (with the query “SELECT c/context/start_time/value AS start_time FROM EHR e CONTAINS COMPOSITION c WHERE c/context/start_time/value >= '2023-09-08T00:00:00.000Z”) then I get a Postgres error:

{
    "error": "Bad Request",
    "message": "Could not perform SQL query:org.postgresql.util.PSQLException: ERROR: syntax error at or near \",\"\n  Position: 532, AQL expression:SELECTc/context/start_time/valueASstart_timeFROMEHReCONTAINSCOMPOSITIONcWHEREc/context/start_time/value>='2023-09-08T00:00:00.000Z'<EOF>, Translated SQL:select jsonb_extract_path_text(cast(\"ehr\".\"js_dv_date_time\"(\n  \"ehr\".\"event_context\".\"start_time\",\n  event_context.START_TIME_TZID\n) as jsonb),'value') as \"start_time\" from \"ehr\".\"entry\" right outer join \"ehr\".\"composition\" as \"composition_join\" on \"composition_join\".\"id\" = \"ehr\".\"entry\".\"composition_id\" join \"ehr\".\"event_context\" on \"ehr\".\"event_context\".\"composition_id\" = \"ehr\".\"entry\".\"composition_id\" where (\"ehr\".\"entry\".\"template_id\" = ? and (jsonb_extract_path_text(cast(\"ehr\".\"js_dv_date_time\"(\n  \"ehr\".\"event_context\".,\n  event_context.START_TIME_TZID\n) as jsonb),'value')  >= '2023-09-08T00:00:00.000Z'))"
}

If I try against c/context/start_time then the results are not as expected at all (note >= '2023-09-15...' resulted in compositions with dates before that date):

{
    "q": "SELECT c/context/start_time/value AS start_time FROM EHR e CONTAINS COMPOSITION c WHERE c/context/start_time >= '2023-09-15T00:00:00.000Z'",
    "columns": [
        {
            "path": "/context/start_time/value",
            "name": "start_time"
        }
    ],
    "rows": [
        [
            "2023-09-08T06:35:48.291Z"
        ],
        [
            "2023-09-07T22:35:48.291+07:00"
        ]
    ]
}

With Cambio’s openEHR CDR it seems more like it is doing a string comparison which ignores any kind of timezone mismatch or comparison?

Here are both timestamps I have written there (no WHERE clause):

{
    "meta": {
        "_executed_aql": "SELECT c/context/start_time/value AS start_time FROM EHR e[ehr_id/value=\"63f67128-28df-43cb-8c84-a776abb373ca\"] CONTAINS COMPOSITION c"
    },
    "name": null,
    "q": "SELECT c/context/start_time/value AS start_time FROM EHR e[ehr_id/value=\"63f67128-28df-43cb-8c84-a776abb373ca\"] CONTAINS COMPOSITION c",
    "columns": [
        {
            "name": "start_time"
        }
    ],
    "rows": [
        [
            "2023-09-07T22:35:48.291+07:00"
        ],
        [
            "2023-09-08T06:35:48.291+02:00"
        ]
    ]
}

And then if I filter against the start_time/value it does return without error, and filters successfully, but from what I can tell it is only a string comparison and not taking the timezone into account? This query should in theory return both of the same as above, but does not:

{
    "meta": {
        "_executed_aql": "SELECT c/context/start_time/value AS start_time FROM EHR e[ehr_id/value=\"63f67128-28df-43cb-8c84-a776abb373ca\"] CONTAINS COMPOSITION c WHERE c/context/start_time/value >= '2023-09-08T00:00:00.000Z'"
    },
    "name": null,
    "q": "SELECT c/context/start_time/value AS start_time FROM EHR e[ehr_id/value=\"63f67128-28df-43cb-8c84-a776abb373ca\"] CONTAINS COMPOSITION c WHERE c/context/start_time/value >= '2023-09-08T00:00:00.000Z'",
    "columns": [
        {
            "name": "start_time"
        }
    ],
    "rows": [
        [
            "2023-09-08T06:35:48.291+02:00"
        ]
    ]
}

joshua.grisham · 12 September 2023 11:47

Quick update if I roll back a bit with EHRBase to now 0.21.1 I am able to run the query as I expect but I think it is doing a similar thing as I posted above with Cambio’s platform (string comparison and not checking the timezone)

With EHRBase 0.21.1, using the same dates above, I should again get both compositions returned, but only getting one:

{
    "q": "SELECT c/context/start_time/value AS start_time FROM EHR e CONTAINS COMPOSITION c WHERE c/context/start_time/value>='2023-09-08T00:00:00.000Z'",
    "columns": [
        {
            "path": "/context/start_time/value",
            "name": "start_time"
        }
    ],
    "rows": [
        [
            "2023-09-08T06:35:48.291Z"
        ]
    ]
}

The one with the timestamp 2023-09-07T22:35:48.291+07:00 is being thrown out of the result.

stefanspiska · 12 September 2023 12:06

According to the spec the order for subtypes of DV_QUANTIFIED Data Types Information Model

is defined by order by magnitude and thus allows for comparisons of Time with different timezone as well as incomplete Date/times.

This will be implemented in further version of ehrbase.

erik.sundvall · 18 September 2023 09:51

Hi @stefanspiska! Do you know:

if the problem reported at Unexpected result in AQL query with a DV_DATE_TIME element without fractions of seconds · Issue #989 · ehrbase/ehrbase · GitHub will get solved by the future improved time-implementation in EHRbase and
if there is any info available regarding expected release time/milestone/roadmap for this fix?

stefanspiska · 18 September 2023 10:11

@erik.sundvall

This should then also be fixed by using magnitude.

We are currently working on some improvements for the AQl engine but don’t expect this to become available before Q1 2024.

pablo · 18 September 2023 21:15

In our implementation (CaboLabs) we choose to allow any timezone in the composition, but the query results are all normalized to UTC, so there are no mixed TZs in query results, though we implement a different query formalism than AQL.

The issue with returning different TZs in a query result is the user needs to normalize all the datetimes to one TZ, if everything is in UTC, there is no need for such normalization, of if a transformation to another TZ is needed, the transformation is easier when everything is in the same TZ.

But there is no “right” answer, at least not in terms of the openEHR specification.

erik.sundvall · 19 September 2023 15:10

Not sure I agree to this, I think it depends who you consider being the “user” (app developer or clinical end user?) and if/when and how often they really need to convert anything.

I’d guess (as @stefanspiska hints to above) that a correctly implemented CDR is fully timezone-aware e.g. when in AQL comparing timestamps with WHERE or sorting returned entries using ORDER BY. I hope everybody agrees that far.

But what might sadly (yet) be undefined in the specs is what the shown timestamp in the returned data should be. Personally I’d prefer if it was exactly what was used in the submitted data (for examle suffixed by +01:00 for CET and +02:00 for CEST (Daylight Saving Time) in parts of Europe).

If so, then we can in normal simple user interfaces (such as clinical notes/documents) see e.g. that the patient got the medication at 18:00 their local time no matter if it happened to be winter or summer (without having to do any fancy conversions in the UI). At Karolinska, and likely many other hospitals, we aim at serving patients from many timezones with our openEHR-based solutions.

Also simple time-sorted UI lists that base the time ordering on the returned order from AQL’s ORDER BY, would work straight out of the box, even during the nights shifting between summer and winter time, but still be able to show the local time used for each entry, making it easy to correlate to other systems’ timestamps and real-world events using local time.

If the server instead hides all TZ info (e.g. by converting to UTC as you suggest) in it’s API responses, such TZ-aware UIs have no easy way of knowing the TZ of the original data.

More clever UIs/widgets would likely be more deeply timezone (TZ) aware and could make timestamps deviating from current local time (of the clinical end user) stand out (e.g. by showing them with suffix or an added symbol) or choose to nomalize to local TZ if that is better for the use case at hand, e.g. in UI visualization code plotting and relating things on a graphical timeline.

pablo · 19 September 2023 16:53

“user” in this context is the client of the API.

ian.mcnicoll · 20 September 2023 07:38

Is that correct? Is it not possible to back-calculate a local datetime for a client in a particular Timezone, even for historical data? They are indeed fancy conversions but I had assumed there were libraries that handled these.

In mixed TZ ‘economy’, I would assume that a local use would want to see a historical time in their local TZ even if t was entered in a neighbouring and different TZ .

erik.sundvall · 20 September 2023 12:10

Hi @ian.mcnicoll! Yes it is possible to convert form UTC (or whatever default local time the server returns as default) to what local time was used at entry, if you have all the other metadata, like where in the world the data was entered and a database with info about at what dates/times different jurisdictions shifted to/from daylight saving time any particular year. Pretty messy to maintain though.

I would think it is a lot easier to go the other way by storing clinically interesting dates in the local time of the clinical setting/source at the time of data capture, including with full timezone info in the timestamp suffix.

Yes, that will for several use cases likely be wanted (see e.g. my example about graphical timeline above) and is a fairly simple thing to do in reusable UI-widget code (and likely suported in many SDKs/libraries) if:

the data source contains full timestamp including the timezone suffix/shift used at entry time and
the widget is aware of what timezone the user wants to view things from.

…but if you think about other common use cases like in the winter reading a normal clinical note from the summer, do you then really want the UI to shift the times entered in the summer an hour so that it in the winter looks like the patient during the summer got the medication one hour shifted from the usual time they get it in the winter?