I am no longer knowledgeable enough to weigh the requirement and implementation issues in play in this discussion. It is clearly an important matter, though, as I outlined before. I hope the following further reflections may be useful in helping to resolve the immediate issues you face in handling Z-scores.
Silje’s question is important, and the approach she rightly points out, defining a reference range for a relevant, well-defined and well-sampled reference population, seems unarguable and useful to have available in the context of interpreting the raw measurement. This use relates directly to the datum itself, not to a statistic derived both from the datum and from summary characteristics of an assumed population from which it is drawn (mean and standard deviation, in the case of the Z-score). That difference may carry significant consequences downstream, both for openEHR methodology and for how it is used in practical clinical contexts. It can open a door to noise and bias in judgement (see below), so it needs to be understood carefully. One has to think about how such information is going to be used.
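To make that distinction concrete, here is a minimal sketch in plain Python (the numbers are purely illustrative and the functions hypothetical, not part of any openEHR API): a reference-range check touches only the datum, while a Z-score also imports assumptions about the population it is drawn from.

```python
# Minimal sketch: raw datum versus derived statistic.
# All numbers below are illustrative, not from any real study.

def in_reference_range(value: float, low: float, high: float) -> bool:
    """Reference-range check: relates only to the datum itself."""
    return low <= value <= high

def z_score(value: float, pop_mean: float, pop_sd: float) -> float:
    """Z-score: standard deviations from the mean of an *assumed*
    reference population; change the population, change the score."""
    return (value - pop_mean) / pop_sd

value = 5.8  # the measured datum, in whatever units were reported

print(in_reference_range(value, low=3.5, high=6.0))  # True

# The same unchanged measurement yields different Z-scores under
# different assumed populations:
print(z_score(value, pop_mean=4.8, pop_sd=0.6))  # ~1.67
print(z_score(value, pop_mean=5.2, pop_sd=0.4))  # ~1.50
```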
I see long lists of clinical measurements and their reference ranges on my phone, in my GP health record. At my age, I get screened from time to time using automated tests of blood biochemistry that report maybe 50 such numbers. The GP is alerted about any that may be drifting outside their range and is invited to comment for the record when they do. Usually ‘acceptable’ or ‘will keep an eye on this next year’ or the like; in my case, fortunately, still! It’s a rule-of-thumb judgement based on personal experience: the GP’s experience of lots of folks in their mid-seventies, like me, and of my particular profile. It is a judgement, and as Kahneman emphasises in his brilliant recent book (Noise: A Flaw in Human Judgment), there is a lot to be said for treating such judgement as akin to a measurement, with associated bias and noise. The examples he uses in developing this argument, notably in judicial proceedings, are mind-boggling! It’s a great read.
In the clinical world, trends within a statistically defined reference range may well be individually relevant and may be significant (someone normally at one end of the range and progressing steadily towards the other end, for example). Normal clinical variability in a population is notoriously wide and can mask ongoing trends that are already well advanced by the time they drift outside the defined reference range. It’s hard not to see this surveillance of data merging into the world of analysis and judgement. The issues being debated here lie in a grey area between characterising and recording measurement and characterising and recording judgement and action.
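As a toy illustration of that point (invented serial results, with an ordinary least-squares slope from the standard library), every value can sit comfortably inside the range while the trend is unmistakable:

```python
# Toy illustration with invented serial results: every value lies
# inside the (illustrative) reference range 3.5..6.0, yet the
# trend across the series is steady and clear.
from statistics import linear_regression  # Python 3.10+

years   = [0, 1, 2, 3, 4, 5]
results = [3.7, 4.1, 4.5, 4.9, 5.3, 5.7]

# A range check alone never flags this patient:
print(all(3.5 <= r <= 6.0 for r in results))   # True

slope, _intercept = linear_regression(years, results)
print(f"slope = {slope:.2f} units per year")    # slope = 0.40
```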
The kinds of measurements Thomas was mentioning, partial pressure of CO2 or oxygen saturation, are not so easy, or perhaps so useful, to compartmentalise in reference ranges. They reflect all sorts of non-linearities that can quickly become acute problems. The shape of the oxygen dissociation curve is such that the body copes well until deoxygenation progresses to a different region of the curve, where the blood’s ability to transport oxygen to support metabolism comes rapidly under threat. I doubt anyone would thank their doctors for awaiting a crisis alert at oxygen saturation as low as Thomas quoted! Likewise, the CO2 range he mentions needs to be understood in the context of the ongoing physiological mechanisms of lung function, gas exchange and tissue metabolism. The numbers quoted are so wide-ranging that they mean rather little in this practical context of management. I spent nearly 20 years modelling such physiology and putting it to work in clinical contexts. It’s hard to get beyond applying fairly simple, experience-based rules of thumb in this area and hoping for the best! I’ve seen this reality at first hand in both neonatal and adult ICU settings over the years.
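For anyone who wants to see that non-linearity on the page, the classic Hill approximation of the dissociation curve makes the point. The parameter values below are standard textbook figures for adult haemoglobin, and the model is of course a simplification (real curves shift with pH, temperature and more):

```python
# Hill-equation approximation of the oxyhaemoglobin dissociation
# curve. P50 of ~26.6 mmHg and n of ~2.7 are standard textbook
# values for adult haemoglobin; shifts from pH, CO2 etc. ignored.

def sao2(po2_mmhg: float, p50: float = 26.6, n: float = 2.7) -> float:
    """Fractional haemoglobin saturation at a given PO2."""
    return po2_mmhg**n / (p50**n + po2_mmhg**n)

for po2 in (100, 80, 60, 40, 27):
    print(f"PO2 {po2:3d} mmHg -> SaO2 {sao2(po2):.0%}")

# Output: 97%, 95%, 90%, 75%, 51%. The flat upper shoulder, then
# the steep fall below about 60 mmHg, is why a single wide range
# says so little: the same numeric drop in PO2 costs far more
# saturation lower down the curve.
```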
It seems to me unwise to let decision-support-like concerns creep too far into data definitions. That’s why this particular use case around the Z-score seems a fairly pivotal one. I assume no one is arguing for a machine learning algorithm to become associated with data definitions; if so, we should probably abandon openEHR and let the machines get on with it! We’re not quite ready for that (yet, or hopefully ever!), but where do we steer our modelling paradigm as boundary cases like this challenge existing openEHR methodology and require it to evolve further? It’s an empirical matter, and that’s why implementation, implementation, implementation (in all the domains openEHR interacts with) must remain the top three priorities guiding development of both openEHR methodology and its wider community. That was written into our community ethos from day one.
I hope this doesn’t sound too irritatingly detached from the nitty-gritty of what you are trying to resolve. One tends towards a helicopter view of much of life as one gets older!