Synthetic openEHR Data Generator - Open Source

Hi folks - the wait is over!

I created a rather sophisticated synthetic openEHR data generator.

It’s written in Python and runs in Docker containers. The README contains clear instructions to set up and run it.

For the fancy stuff (Webtemplates and flat Compositions) it needs an EHRbase (or other compliant CDR/openEHR API) instance. I included an improved Dockerfile and docker-compose.yml so you can spin it up easily.

It’s a CLI (terminal based) app. Strongly suggest running on Linux (bare metal or WSL on Windows).
It does two things:

  1. duplicate existing compositions (originally a fork, but now a completely different codebase);
  2. generate from OPTs (operational templates).

For the latter, it generates Webtemplates and flat JSON composition skeletons, and then you can generate as many compositions as you want. These conform to Archetype and Template constraints while jittering values: coded texts are picked from valid value sets, and timestamps are varied within plausible ranges. No lorem ipsum nonsense, no values that break your schema. And it’s surprisingly fast!
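
To give a rough idea of what "jittering within constraints" means, here is a minimal sketch. It is not the actual code from the repo: the flat paths, the `CONSTRAINTS` table, and the `jitter_value` and `generate` functions are all hypothetical names made up for illustration. In the real tool the constraints would be derived from the Webtemplate rather than written by hand.

```python
import random
from datetime import datetime, timedelta

# Hypothetical, hand-written constraints keyed by flat path.
# Assumption: a real run would derive these from the Webtemplate.
CONSTRAINTS = {
    "vitals/temperature|magnitude": {"kind": "quantity", "min": 35.0, "max": 41.0},
    "vitals/position|code": {"kind": "coded", "codes": ["at0001", "at0002", "at0003"]},
    "vitals/time": {"kind": "datetime", "max_shift_hours": 48},
}

# Hypothetical flat composition skeleton (made-up paths and values).
SKELETON = {
    "vitals/temperature|magnitude": 37.0,
    "vitals/temperature|unit": "Cel",
    "vitals/position|code": "at0001",
    "vitals/time": "2024-03-01T10:00:00+00:00",
}


def jitter_value(value, rule, rng):
    """Return a varied value that still satisfies the given rule."""
    if rule["kind"] == "quantity":
        # Nudge by up to 5% of the allowed range, then clamp to the range.
        spread = (rule["max"] - rule["min"]) * 0.05
        candidate = value + rng.uniform(-spread, spread)
        return round(min(rule["max"], max(rule["min"], candidate)), 1)
    if rule["kind"] == "coded":
        # Only ever pick from the valid value set.
        return rng.choice(rule["codes"])
    if rule["kind"] == "datetime":
        # Shift the timestamp by a bounded random number of hours.
        shift = rng.uniform(-rule["max_shift_hours"], rule["max_shift_hours"])
        shifted = datetime.fromisoformat(value) + timedelta(hours=shift)
        return shifted.isoformat(timespec="seconds")
    return value


def generate(skeleton, constraints, count, seed=0):
    """Produce `count` jittered copies of a flat composition skeleton."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    return [
        {
            path: jitter_value(value, constraints[path], rng)
            if path in constraints
            else value  # paths without a rule are copied unchanged
            for path, value in skeleton.items()
        }
        for _ in range(count)
    ]


if __name__ == "__main__":
    for composition in generate(SKELETON, CONSTRAINTS, count=3):
        print(composition)
```

Paths without a rule (like the unit here) are copied as-is, which is the simple way to make sure nothing in the output breaks the schema.
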
Feel free to use, break or improve it.
