FHIR ID and openEHR

Hi
We are discussing how to handle identifiers in a FHIR API over openEHR data. A simple example is body mass index. When the FHIR API returns data, it should set a value on Observation.id. This field is limited to 64 characters, so it is not possible to put the complete EHR-URI there.

Another approach is to use Observation.identifier which can handle the full URI to the given locatable.

In the CDR we do not store a UID on all LOCATABLEs. We have, of course, the COMPOSITION.uid, but no other uid is required. We use the full EHR-URI for references within the system.

So the problem remains: how do we manage a GUID for Observation.id in the FHIR API?

How are other implementations handling this?

It is strongly recommended to store a UID (LOCATABLE.uid) on all ENTRYs as well. The reason for that is that Entries should be referenceable independently from the Compositions they are in - i.e. they are meaningful on their own.

To use such ids from outside an EHR means you have to maintain a global GUID → object map.

I agree with @thomas.beale

Just for the sake of discussion, another option could be to use the compo.uid + instance path to the observation

That is a unique identifier that can be calculated instead of stored. It identifies the specific observation inside the compo by traversing from the root compo to the observation, for example: uid/content(1)
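To make the idea concrete, here is a minimal sketch of computing such an id and parsing it back. The function names and the single-level `content(n)` path are assumptions made for illustration; they are not taken from any openEHR or FHIR library.

```python
import re
from typing import Tuple

# Hypothetical id shape: "<composition uid>/content(<index>)"
_ENTRY_ID = re.compile(r"^(?P<uid>[^/]+)/content\((?P<index>\d+)\)$")


def build_entry_id(composition_uid: str, content_index: int) -> str:
    """Derive an entry id from the composition uid and the entry's position."""
    return f"{composition_uid}/content({content_index})"


def parse_entry_id(entry_id: str) -> Tuple[str, int]:
    """Split a derived entry id back into composition uid and position."""
    match = _ENTRY_ID.match(entry_id)
    if match is None:
        raise ValueError(f"not a derived entry id: {entry_id!r}")
    return match.group("uid"), int(match.group("index"))
```

Nothing is stored here: the id is recomputed from the composition on every read, so it is only as stable as the position it encodes.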

Are there not some restrictions on string length for FHIR identifiers which might cause issues with compId + instance path?

Based on Datatypes - FHIR v6.0.0-ballot4, it should be at most 64 bytes long.

So this is the max id you can set using the object_id:

d8ac9049-8479-40a6-9148-baa55407e812/content(999999999999999999)
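The arithmetic behind that maximum can be checked with a short sketch. The constant and function names are hypothetical. It checks length only: as far as I read the datatype page, the FHIR id type also restricts the allowed characters to letters, digits, '-' and '.', so a value containing '/' and parentheses would need some encoding before it could be used as a literal id.

```python
FHIR_ID_MAX_LENGTH = 64   # length limit of the FHIR id datatype
UUID_STRING_LENGTH = 36   # canonical UUID text form, including hyphens


def max_index_digits(path_template: str = "/content()") -> int:
    """Characters left for the positional index after the uid and the path syntax."""
    return FHIR_ID_MAX_LENGTH - UUID_STRING_LENGTH - len(path_template)


uid = "d8ac9049-8479-40a6-9148-baa55407e812"
longest = f"{uid}/content({'9' * max_index_digits()})"

print(max_index_digits(), len(longest))
```

With a single `content(n)` segment there is plenty of room for the index, but every extra path segment eats into the same budget.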

It gets trickier with the full version_id because the system_id length can be unbounded.

Though, when we map ids for full compos, we use the FHIR meta to set the system_id and version_id parts of an openEHR object_version_id. So in theory, by using a combination of meta and id you can still reference a lot of entries inside a compo, via the instance path, in a FHIR resource.

You can of course hash long ids into short ones (like URL shortening) but you have to maintain the reverse map somewhere.
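A minimal sketch of that approach, assuming SHA-256 as the hash and an in-memory dict as the reverse map (a real system would need a persistent store). The class and method names are made up for the example.

```python
import hashlib
from typing import Dict


class HashedIdMap:
    """Shortens long ids by hashing and remembers how to get back."""

    def __init__(self) -> None:
        self._long_by_short: Dict[str, str] = {}

    def shorten(self, long_id: str) -> str:
        # A SHA-256 hex digest is always 64 characters from [0-9a-f].
        short_id = hashlib.sha256(long_id.encode("utf-8")).hexdigest()
        self._long_by_short[short_id] = long_id
        return short_id

    def resolve(self, short_id: str) -> str:
        # The hash cannot be inverted, so the map is the only way back.
        return self._long_by_short[short_id]


id_map = HashedIdMap()
short = id_map.shorten(
    "d8ac9049-8479-40a6-9148-baa55407e812::some.very.long.system.id::3/content(1)"
)
print(len(short), id_map.resolve(short))
```

The short id is deterministic, so it can be recomputed at any time, but resolving it still depends on the map being available.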

Hashing is one way, another is compression: you can use “/c” instead of “/content”.

Another option is to store the UUID as a number instead of a string: it is really a 128-bit number (16 bytes), whereas the formatted UUID string is 36 bytes long.
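The size difference is easy to see with Python's standard `uuid` module. This is only an illustration of the representations; how the 16 bytes would then be rendered into an id string is left open.

```python
import uuid

u = uuid.UUID("d8ac9049-8479-40a6-9148-baa55407e812")

text_form = str(u)     # canonical string with hyphens
hex_form = u.hex       # same value without hyphens
raw_bytes = u.bytes    # the underlying 128-bit value as 16 bytes

print(len(text_form), len(hex_form), len(raw_bytes))

# The string form can be rebuilt from the raw bytes without any lookup table.
assert uuid.UUID(bytes=raw_bytes) == u
```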

With hashing you need the mappings to get back the original value; with compression you can decode the value without having to maintain a lookup table on the side.
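For contrast, a sketch of the compression variant, using the "/c" for "/content" substitution mentioned above. The function names are hypothetical, and the round trip only holds under the assumption that "/c(" never occurs in an uncompressed id.

```python
LONG_SEGMENT = "/content("
SHORT_SEGMENT = "/c("


def compress_id(object_id: str) -> str:
    """Replace the verbose path segment with its short form."""
    if SHORT_SEGMENT in object_id:
        # Guard the assumption that makes expand_id() unambiguous.
        raise ValueError("id already contains the short segment")
    return object_id.replace(LONG_SEGMENT, SHORT_SEGMENT)


def expand_id(compressed_id: str) -> str:
    """Restore the original id; no lookup table is involved."""
    return compressed_id.replace(SHORT_SEGMENT, LONG_SEGMENT)


original = "d8ac9049-8479-40a6-9148-baa55407e812/content(2)"
compressed = compress_id(original)
print(compressed, expand_id(compressed) == original)
```

Each compressed segment saves six characters, which helps with deeper paths but does not remove the limit.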

Note I’m not saying this is good or bad, just talking about options here.

Indeed. If you develop bullet-proof rules, this is a perfectly good way to do compression as well. However, you might also want to obfuscate paths in the data so that what it is part of cannot be easily determined from what is shared. There are many variant use cases. The right solution just depends on what features we really want.

We can add requirements on top. Though if the requirement is to stop at ‘identification of the entry’, I think what we discussed actually covers most of the options:

  • direct identification of entry
  • identification of parent + instance path to entry
    • direct
    • hashed
    • compressed

I think we should update the specification with this recommendation. Currently we have the following recommendation: Common Information Model

We stored a UID on all entries 10 years ago, but left that for a path-based approach, which has served us well until we met the 64-character requirement that led to this topic.

We also looked into hashing and other means of manipulating the ID. None of them seems attractive, due to the need for a two-way hash/compression that is deterministic and fast.

One solution might be to use the identifier in FHIR, which is unlimited in size, and to add to the capability statement of the FHIR API that the way to look up data is to search using the EHR URI.
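As a sketch of what such a lookup could look like from the client side, the snippet below builds a FHIR token search on `identifier` in the usual `system|value` form. The base URL, the identifier system and the EHR URI are placeholders invented for the example, not values from any real deployment.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def identifier_search_url(base_url: str, system: str, value: str) -> str:
    """Build an Observation search URL using the identifier token parameter."""
    query = urlencode({"identifier": f"{system}|{value}"})
    return f"{base_url}/Observation?{query}"


# All three values below are hypothetical placeholders.
url = identifier_search_url(
    "https://fhir.example.org",
    "urn:example:openehr:ehr-uri",
    "ehr://example-system/example-ehr-id/example-composition-uid/content(1)",
)
print(url)

# Decoding the query string gives back the original token unchanged.
token = parse_qs(urlsplit(url).query)["identifier"][0]
```

Observation.id could then be any server-assigned value, since clients would locate the resource through the identifier instead.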

From the responses so far in this thread, and from investigating different projects, I think it is fair to state that there is no established pattern, guideline or recommended way to overcome this problem.

That’s fine. Someone must move first. We will work on this during the summer 🌞 ⛱️

I hope we find a manageable solution.

Do you have some examples at hand to check how your implementation reached the 64 byte limit?

Are you using the full object_version_id or the object_id alone?

In my experience, solutions not based on LOCATABLE.uid are conceptually too open or error-prone. But as always, it depends. The problem at hand gets much simpler if we only talk about a read-only FHIR facade on top of the openEHR persistence. I’m talking about a bidirectional FHIR API.

Adding metadata to the FHIR resource is very helpful, especially in mapping scenarios. But hard-linking the FHIR resource’s version (meta.versionId) to the openEHR composition’s version blocks cases with cardinalities other than 1:1, which need independent version handling for the composition container and its (in this case multiple) virtual outer representations. An example is one lab report containing multiple readings, where one arrives later or has to be corrected.

The pathing is something for which I don’t have a fully working idea yet. Do we really have a specified pathing scheme that fits the requirements?

In an example like d8ac9049-8479-40a6-9148-baa55407e812/content(2), where does the positional information come from and how is it guaranteed to be stable, even as data might get serialized and deserialized etc.?

What about arbitrarily deep paths? In my experience, FHIR resource concepts often match at the cluster archetype level. So how would a path safely identify a somewhat open slot sub-structure in a composition, possibly nested even deeper?


For general reference: 11.2.4. Data Paths and Uniqueness