# New Python openEHR Synthetic Data Generator

**Category:** [Tools](https://discourse.openehr.org/c/tool-dev/36)
**Created:** 2026-03-02 12:14 UTC
**Views:** 53
**Replies:** 0
**URL:** https://discourse.openehr.org/t/new-python-openehr-synthetic-data-generator/11796

---

## Post #1 by @Koray_Atalag

Howdy,

my weekend hack turned into something I thought might be helpful to others chasing test data - heaps of it!

https://github.com/atalagk/openEHR-Data-Generator

The project started as a fork of Berlin-Institute-of-Health / Genkidata ([https://github.com/Berlin-Institute-of-Health/Genkidata](https://github.com/Berlin-Institute-of-Health/Genkidata)). It has significant additions:

* Instead of just duplicating existing Compositions, it uses:
  * An NLP library to replace DV_TEXT values with synonyms. While not perfect from a clinical semantics point of view, it's much better than lorem ipsum stuff!
  * For quantities, it changes values randomly between -15 and +15 percent, so the result is likely to be clinically plausible.
* In addition, there is a new feature to create canonical Compositions from Webtemplates (it’s a biggie! It possibly still has errors, but it passed all tests on the ehrbase SDK test webtemplates using Pablo’s validation tool).

When you run the app, it prompts you with three options:

1. API Upload (into ehrbase or another CDR)
2. Jitter Existing Compositions (\\source_models\\compositions); rather than just duplicating them as the original app did, it creates new values
3. Stored (Source Webtemplates)

Existing Compositions are taken from the test data in [https://github.com/ehrbase/openEHR_SDK](https://github.com/ehrbase/openEHR_SDK), so they pretty much cover all possible variations.

The number of Compositions and EHRs is defined by user input.

Resulting canonical Compositions are saved into:

> /dist/compositions

You can put your own Compositions (to duplicate, but with new values) and Webtemplates into:

> /source_models/compositions
>
> /source_models/webtemplates

Enjoy!
And comments / tickets / pull requests welcome.
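
For readers curious how the quantity jitter described above might look, here is a minimal sketch. It is not taken from the repository: the function names `jitter_quantities` and `_jitter_in_place` are hypothetical, and rounding to two decimals is an assumption. It walks a canonical JSON Composition and scales each `DV_QUANTITY` magnitude by a random factor within ±15 percent.

```python
import copy
import random


def jitter_quantities(composition, pct=0.15, rng=random):
    """Return a copy of a canonical JSON composition with every
    DV_QUANTITY magnitude scaled by a random factor in [1-pct, 1+pct].

    Hypothetical helper for illustration; not the project's own code.
    """
    result = copy.deepcopy(composition)  # leave the source composition untouched
    _jitter_in_place(result, pct, rng)
    return result


def _jitter_in_place(node, pct, rng):
    """Recursively visit dicts and lists, jittering DV_QUANTITY nodes."""
    if isinstance(node, dict):
        magnitude = node.get("magnitude")
        if node.get("_type") == "DV_QUANTITY" and isinstance(magnitude, (int, float)):
            factor = 1 + rng.uniform(-pct, pct)
            # Rounding to 2 decimals is an assumption made for readability.
            node["magnitude"] = round(magnitude * factor, 2)
        for value in node.values():
            _jitter_in_place(value, pct, rng)
    elif isinstance(node, list):
        for item in node:
            _jitter_in_place(item, pct, rng)


if __name__ == "__main__":
    # Tiny made-up fragment in canonical JSON style.
    fragment = {
        "_type": "ELEMENT",
        "value": {"_type": "DV_QUANTITY", "magnitude": 120.0, "units": "mm[Hg]"},
    }
    print(jitter_quantities(fragment, rng=random.Random(42)))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which can be handy when the generated Compositions feed automated tests.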