I’ve read up a bit more, and using PUT just potentially gets us into a tangle with the URL path or complexities around headers.
From my reading of the RESTful difference between PUT and POST, this is still a POST.
We are adding a new unique resource to the collection of Compositions, not updating an existing composition. The only difference is that we are assigning the uid client-side, not server-side.
The only downside I can see of using the uid within the Composition body is that some generated composition examples populate the uid, which might cause confusion, so I’d want that behaviour to change.
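To make the shape of this concrete, here is a minimal sketch of a composition body carrying a client-assigned uid. The uid value follows the openEHR versioned-object pattern, but the payload is trimmed right down and the helper function is hypothetical:

```python
import re

# Hypothetical (heavily trimmed) payload for POST /ehr/{ehr_id}/composition
# where the client supplies the versioned uid instead of the server assigning one.
composition = {
    "_type": "COMPOSITION",
    # uid preserved from the source CDR: object_id::creating_system_id::version
    "uid": {
        "_type": "OBJECT_VERSION_ID",
        "value": "8849182c-82ad-4088-a07f-48ead4180515::source.cdr.example::1",
    },
    "archetype_node_id": "openEHR-EHR-COMPOSITION.encounter.v1",
}

# uuid :: system id :: version number
UID_PATTERN = re.compile(r"^[0-9a-f-]{36}::[^:]+::\d+$")

def has_client_uid(body: dict) -> bool:
    """Return True when the composition body carries a well-formed versioned uid."""
    uid = body.get("uid", {}).get("value", "")
    return bool(UID_PATTERN.match(uid))

print(has_client_uid(composition))  # True
```

On the server side, the only behavioural change against a plain POST is "honour this uid if present and well-formed, reject if it already exists" rather than "mint a new one".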
We are looking at full vendor-to-vendor transfer of all EHRs. This would obviously be more efficient at a lower level, but that may not be a high priority for CDR vendors to implement, so this was an experiment to see how much could be done with the standard API.
If nothing else it is a good advert for tech/vendor neutrality in a way that is non-opaque, and helps tease out the exact requirements.
Good point re the commit_time - we need to think about that.
That’s an interesting scenario, and probably needed sooner rather than later.
Since different vendors can implement different persistence technologies, it’s reasonable to work it out at the API level. But it’s not only about copying parts of the loaded EHRs. As @thomas.beale said, you should replicate all metadata (commit times, etc.), templates loaded, FOLDERs, CONTRIBUTIONs, demographics, and any other configuration at the EHR level.
The solution probably is to design a dump/load API endpoint, having some kind of container for all that information. Maybe the EHR_EXTRACT class, or an evolution of it, could be a good candidate.
This would be computationally heavy, but that’s another topic.
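As a rough illustration of what such a dump/load container might hold: this is not the actual EHR_EXTRACT class, and all the field names here are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative container for a dump/load payload, loosely inspired by the
# EHR_EXTRACT idea: everything needed to replay an EHR in another CDR.
@dataclass
class EhrDump:
    ehr_id: str
    system_id: str                                           # source CDR identity
    ehr_status: Dict = field(default_factory=dict)
    contributions: List[Dict] = field(default_factory=list)  # original audits / commit times
    compositions: List[Dict] = field(default_factory=list)   # all versions, original uids
    folders: List[Dict] = field(default_factory=list)
    templates: List[str] = field(default_factory=list)       # template ids loaded in the source

dump = EhrDump(
    ehr_id="7d44b88c-4418-4bdf-9163-0d76d6a32f7c",
    system_id="source.cdr.example",
)
dump.templates.append("IDCR - Vital Signs Encounter.v1")
```

The point of the structure is that the load side replays it through normal versioning machinery, rather than the file being written straight into the target store.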
We are hoping to cover everything in there other than Folders and Demographics, as neither is included in our current scope. Folders would probably not be too hard to do, but there really are very few (if any) independent openEHR Demographics services out there, and in any case the coupling is, at least right now, very loose or non-standardised.
But that is what we are doing / have done.
Agree that ideally this should be nicely wrapped up into a simpler service, but for now it is probably pretty useful to see the moving parts, if only so we can raise some of these issues as we go along.
If we can agree on the POST /composition with uid resolution and have a list-contributions endpoint, we are probably very close to having something working. When I say we, I mean the Future Perfect guys like @Simon!! I’m just giving a little advice.
This would be my preferred approach to design. Even though we did not explicitly define it, there is an implicit context to the operations supported by the REST API (or so I think). Restoring data from another CDR is a richer operation in terms of its semantics: we’re replaying an actual insert that took place in another CDR, and there is more data of interest than in a direct call to a REST endpoint, say from a mobile app.
There is the time of the original operation (as Tom points out), the fact that this operation is a replay of a previous one, the time of the replay (which is potentially meaningful for audit purposes), and the information related to the original CDR. There is also the identity of the bulk import itself, to be assigned and tracked so that we know how many items succeeded, failed, etc. And there is the need to keep the original uids.
I would not like to encode all of this within the semantics of a single REST endpoint; it’s too big a semantic overload for my taste, because current REST APIs have no awareness of a bulk operation (this is PUT/POST 16 of 345,000… → where do you keep that info? REST is supposed to encapsulate the whole state in the call).
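A tiny sketch of the bookkeeping that a per-resource REST call has nowhere to put: the identity of the bulk import itself, plus its success/failure counts. All names here are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import List

# Sketch of bulk-import bookkeeping that plain per-resource REST calls
# cannot carry: the job's own identity, progress, and per-item outcomes.
@dataclass
class ImportJob:
    source_system: str
    job_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    succeeded: int = 0
    failed: List[str] = field(default_factory=list)  # uids that failed, for retry

    def record(self, uid: str, ok: bool) -> None:
        """Track the outcome of replaying one version in the target CDR."""
        if ok:
            self.succeeded += 1
        else:
            self.failed.append(uid)

job = ImportJob(source_system="source.cdr.example")
job.record("comp-1::src::1", True)
job.record("comp-2::src::1", False)
print(job.succeeded, len(job.failed))  # 1 1
```

A dedicated dump/load service owns this state naturally; with bare endpoints it has to live in the client, which is exactly the overload being objected to.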
Finally, things get interesting when you scale up to tens of millions of compositions. We’re currently planning a data migration between two Ocean systems which will take days using specialised tools. If it were done via REST endpoints, we’d be talking months…
This is definitely a dump/load thing. Some initial modelling on the abstract version of that here.
The reason why it has to be done as a dump-load service is to allow for differences in concrete data representation inside the CDRs of different vendors, and even different releases from the same vendor. No vendor would be able to guarantee anything if someone just slid the DB tables or even a full JSON dump straight into the target DB, even if in some cases it might work. The creation of Versions and Contributions has to be done by appropriate calls taking the relevant content as an argument.
There are clearly multiple forms of this:
a partial EHR extract → import (e.g. as for transfer of care) - this is a merging scenario and you might want to preserve original commit times if the source and target CDRs are within the same health service, or you might not, if the patient treatment is now being taken over at the destination.
the dump-load situation for a whole CDR’s contents, to upgrade it in situ to a new CDR - preserve original commits
dump-load of a full EHR from one CDR to another one - probably want new commit times, indicating the earliest time the local staff at the destination could have seen the data.
These correspond to real world scenarios like:
merging a piece of an EHR to a CDR in a new location
moving an entire EHR to a CDR in a new location (for patient move, e.g. new country, region etc)
replacing the CDR in situ, data should look the same as before
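The scenarios above suggest different commit-time policies. A hedged sketch of how that choice might be expressed follows; the scenario names and the rule itself are illustrative, not a spec:

```python
from enum import Enum

class TransferScenario(Enum):
    PARTIAL_EXTRACT_MERGE = "partial extract merge"   # e.g. transfer of care
    CDR_UPGRADE_IN_SITU = "in-situ CDR replacement"
    FULL_EHR_MOVE = "full EHR move to a new CDR"

def preserve_commit_times(scenario: TransferScenario,
                          same_health_service: bool = False) -> bool:
    """Illustrative policy following the scenarios above: an in-situ
    replacement always keeps the original commit times; a full move to a
    new location wants new ones; a partial merge depends on whether the
    source and target belong to the same health service."""
    if scenario is TransferScenario.CDR_UPGRADE_IN_SITU:
        return True
    if scenario is TransferScenario.FULL_EHR_MOVE:
        return False
    return same_health_service

print(preserve_commit_times(TransferScenario.CDR_UPGRADE_IN_SITU))  # True
```

Even if the rule ends up more nuanced than this, making the scenario an explicit parameter of the load operation keeps the policy out of individual endpoint semantics.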
Of course, and what you say is a totally needed approach. Anything else we could propose now would not be available for several years.
@Seref I was not talking of any specific solution at this point, just about how to approach this problem. Then will come the technical problems, and the possible solutions:
Probably we are not talking about a REST transfer of data, but an export to some kind of file. Or maybe communication via other formats (protobuf). We could also consider using a compression format to reduce the size of the data.
Probably this is not a single method, but a set of more granular operations.
We can learn a lot from SQL dumps: we could let users decide to export just the “schema” (templates, etc.) or the schema + data. Or just the data for some templates or patients.
And for sure, there will be many other optimizations.
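By analogy with SQL dumps, an export request might be parameterised along these lines. This is a sketch only; the parameter and section names are invented for illustration:

```python
from typing import Iterable, Optional

# Sketch of SQL-dump-style granularity for a CDR export: schema only
# (templates etc.), schema + data, or data filtered by template or patient.
def plan_export(include_schema: bool = True,
                include_data: bool = True,
                template_ids: Optional[Iterable[str]] = None,
                ehr_ids: Optional[Iterable[str]] = None) -> dict:
    """Return an illustrative export plan; all names are hypothetical."""
    plan = {"sections": []}
    if include_schema:
        plan["sections"].append("templates")
    if include_data:
        plan["sections"].append("compositions")
        if template_ids:
            plan["filter_templates"] = list(template_ids)  # only these templates
        if ehr_ids:
            plan["filter_ehrs"] = list(ehr_ids)            # only these patients
    return plan

print(plan_export(include_schema=True, include_data=False))
# {'sections': ['templates']}
```

The useful property borrowed from SQL tooling is that "what to export" is declared up front, so the same machinery serves full migrations and small partial extracts.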
As mentioned in the past, in the context of [SPECITS-62] - openEHR JIRA issue, I’m still in favor of extending the semantics of our PUT Composition to be used also for creation, not only for update. This would be in line with RFC 9110: HTTP Semantics and is most likely exactly what a programmer using a REST API would expect from the openEHR API.
The PUT Contribution should also be changed to support creation of a Contribution with a given uid.
We should also keep this behavior consistent, and specify PUT for all resources working in the same way (i.e. create & update).
These changes will be made in the upcoming REST API specification release, and should facilitate most of the replay functionality described above, keeping original IDs in the target system, although I can imagine system creation timestamps might not always be honored - an aspect we can still discuss in the SEC meetings.
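In the spirit of the RFC 9110 PUT semantics being proposed here, a minimal in-memory sketch of create-or-update against a known uid; the store and the status codes chosen are illustrative:

```python
from typing import Dict

# Minimal sketch of PUT-as-create-or-update: a PUT to a known uid creates
# the resource if absent (201 Created) and replaces it if present (200 OK).
store: Dict[str, dict] = {}

def put_composition(uid: str, body: dict) -> int:
    """Store the body under the client-supplied uid; return an HTTP-style status."""
    created = uid not in store
    store[uid] = body
    return 201 if created else 200

assert put_composition("c1::src::1", {"v": 1}) == 201  # first PUT creates
assert put_composition("c1::src::1", {"v": 2}) == 200  # second PUT replaces
```

Repeating the same PUT leaves the store in the same state, which is the idempotency PUT promises; only the status code distinguishes create from replace.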
The composition uid is just the Common Information Model locatable uid, so it’s a bit strange that one is special and needs a specific syntax.
Also, a use case we have is to create a composition and then add it to the directory in one contribution, which is only possible if you can set the uid in the composition in the contribution and then put the uid in the folder update in the same contribution.
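A hedged sketch of that use case: one contribution carrying both the new composition and the folder update that references it, which only works because the uid is fixed client-side first. The payload shape here is illustrative, not the exact openEHR REST format:

```python
# The composition's uid is chosen before the commit is built, so the FOLDER
# item in the same contribution can already reference it.
comp_uid = "2f1d3c9a-9f44-4b5e-8d10-6a1c2b3d4e5f::local.cdr.example::1"

contribution = {
    "versions": [
        # version 1: the new composition, with the client-assigned uid
        {"_type": "COMPOSITION", "uid": {"value": comp_uid}},
        # version 2: the directory update pointing at that same uid
        {"_type": "FOLDER",
         "items": [{"id": {"value": comp_uid}, "type": "COMPOSITION"}]},
    ]
}

# The folder entry resolves to the uid chosen before the commit was built.
assert contribution["versions"][1]["items"][0]["id"]["value"] == comp_uid
```

With server-assigned uids this has to be two round trips and two contributions; with a client-assigned uid it is a single atomic commit.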
(clinical hacker alert!) I looked at RFC 9110: HTTP Semantics and I could still not see a good argument for using PUT. Can you point to the exact part of the spec which makes you think PUT over POST? I can see that a create is allowed by PUT, but the spec does not explain why or when it is preferred. We are definitely creating new resources here, not re-allocating uids to an existing resource (or at least that’s my reading!)
The requirements for POST seem fully met, and those specs do not help me understand where a create via PUT might be reasonable.