# Pathology numeric values not supported in DV_Quantity

**Category:** [Technical (archive)](https://discourse.openehr.org/c/technical-archive/156)
**Created:** 2006-03-01 01:41 UTC
**Views:** 16
**Replies:** 52
**URL:** https://discourse.openehr.org/t/pathology-numeric-values-not-supported-in-dv-quantity/14516

---

## Post #1 by @Sam

Hi everyone,

We want to report an issue that has arisen in data processing in Australia. The issue is the somewhat random ability of systems to report a >xx or , <, >=, <=, ~ (~ => approximately, I => Inaccurate) the value "I" may be used for instance when an analyser returns a potassium value which on subsequent examination of the blood is shown to be erroneously high due to haemolysis. This is usually accompanied by some text which is displayed instead of the numeric value, e.g. HAEM, but the underlying numeric value needs to be stored as well. This of course makes the logic for deciding whether a result is within a normal range more interesting, and graphing routines etc. need to take this flag into account. I don't feel strongly about whether you deal with this as part of the quantity datatype or have a new datatype inheriting from quantity.

Regards
Vince

Dr Vincent McCauley MB BS, Ph.D
McCauley Software Pty Ltd

---

## Post #3 by @Tom_Tuddenham

I may be missing the mark, as I come originally from a process control background, so apologies if this sounds like an engineering solution. There is a mechanism in OPC (a common protocol for getting information out of machines) whereby each data point can be identified according to the quality of the data, e.g. OPC_QUALITY_GOOD, OPC_QUALITY_BAD, OPC_QUALITY_UNCERTAIN. There is a further qualification for why the data is bad (connection problems, config error, etc.), but the record still contains an actual value, so it can still be plotted, but it can also be filtered out. I guess this is essentially what you were saying, Vince.
Unless I've misunderstood what Sam has proposed, the problem with a substituted value is that it's not going to reflect the recorded value, i.e. a chart won't show the "true" erroneous data.

-Tom

---

## Post #4 by @system

Hi,

yy What does it mean? To my mind it semantically means a state of exception: not only that the measurement is yy, but that it is unmeasurable. If this reasoning is true, then each archetype with a measurement needs an exception attribute. In general this will be true in many more circumstances.

Each possible statement (data item and/or archetype) can have a few states:

requested/expected - unrequested/not expected (e.g. a TSH measurement is expected, but the unrequested and unexpected response is TSH>2000 as an indication of exception)

As exception there are at least two possibilities:

known-unknown (e.g. RR 120/unknown mmHg. TSH was measured and presented but it must not be considered a real result; it is in doubt)

true-untrue (e.g. I measured RR 60/80; this measurement I consider untrue, but it is what was measured. TSH >2000 but is untrue because it was unmeasurable)

Gerard

-- -- Gerard Freriks, arts Huigsloterdijk 378 2158 LR Buitenkaag The Netherlands T: +31 252 544896 M: +31 654 792800

---

## Post #5 by @Sam

Vince

I like the flags - I wonder if we should have a ? or ! for the value affected by a quality issue - what do others think? The quality issues are dealt with in the laboratory archetypes.

Sam

Vincent McCauley wrote:

---

## Post #6 by @Karsten_Hilbert

Probably "?" for dubitable results. "!" is commonly used here for marking up (perhaps unexpectedly) clinically important results (such as an *unusually* high titer of something where I expected either normal or somewhat elevated).

Karsten

---

## Post #7 by @thomas.beale

Just going through the replies we have had on this one.....

- Gerard's point about <5 etc being an exception is not quite right - it's very common; it's usually to do with sensitivity of instruments (i.e.
accuracy), but there are also analytes which are reported as just being over a threshold since any number larger than X is fine (e.g. glomerulin, Sam tells me).
- this is not an indication that the data type is really a DV_INTERVAL or DV_QUANTITY_RANGE - it is clearly not. When we see "HCO3: <5 mmol/L" we are not reporting an interval of 0 - 5 mmol/L, we are reporting a point value somewhere in 0-5, but we don't quite know where.
- Tom Tuddenham's point is also correct. In openEHR, we actually do have a data quality marker (I used to work in SCADA as well, and lived with this kind of stuff for years!). It is called null_flavour and is defined on the ELEMENT class, next to the value attribute, which is the one that holds the Quantity that we are talking about (or some other kind of data value in other circumstances). Here we have a more fine-grained occurrence of the same problem, for slightly different reasons: the instrument or measuring method and data communications are working as they should; it's just that either the value is too low or high for quantification by the instrument, or else the instrument doesn't bother reporting it above or below a certain threshold, since it is known that any value above/below is healthy. Nevertheless, we have to treat it in a similar way - probably with a flag that indicates 'status' of the value.
- in practical terms we have to deal with the fact that quantities in the form of single-sided intervals with <, >, <=, >= can be mixed in with normal point value quantities, or replace them, on a per test-result basis.
- we also have to have a solution that is easily comprehensible in the model and for software developers. Allowing INTERVALs to magically replace QUANTITY as is done in HL7 is not the way to do it, since there is no clean basis in the modelling to do this (i.e.
it's not normally possible in OO languages - you have to do something quirky to make it happen.); in any case, as pointed out above, DV_INTERVAL is not semantically correct in these cases anyway.

My analysis is that we need to slightly extend DV_QUANTIFIED (supertype of DV_COUNT and DV_QUANTITY, as well as all the date/time types), in the way that Vince has said (probably Vince worked this solution out years ago;-)...so that the semantics are:

- a magnitude
- NEW ATTRIBUTE: a status flag - with the following possible values:
  - \> : greater than
  - \< : less than
  - \>= : greater than or equal to (Vince, do we really need this and the next one - do you get real values where it is reported like this?)
  - \<= : less than or equal to
  - = : exact point value (i.e. the default situation)
  - ~ : approximately equal to, i.e. like '=' but with some unknown error
  - ? : inaccurate...what does this mean? If it is due to haemolysed blood then is it "inaccurate" or is it really just plain "wrong" ("incorrect")?
- accuracy
- ..other attributes, depending on subtype

Adding a flag will be easy in modelling and software terms. What we have to do is carefully design the values; Vince has provided what is probably just about right, but I would like to be sure - see notes above on the list. Also, remember the openEHR QUANTIFIED class already has accuracy as a Real - it can be a % or absolute value, so that any DV_QUANTIFIED can be created with a +/- 5% or whatever. Given this, do we need the '~' flag (maybe we do: maybe there is no accuracy data available, and all we can get from a legacy feed is '~')? And isn't the "inaccurate" flag (as Vince named it) about something else?

As Vince said, doing this means more careful data analysis to determine whether a value is normal or not, and how it should be graphed.
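
Editorial illustration of the flag proposed in this post: the minimal Python sketch below pairs a magnitude with a qualifier and parses report strings such as "<5 mmol/L". The names `QualifiedQuantity` and `parse_result`, and the accepted string format, are assumptions made for this example only; they are not part of the openEHR reference model. The "?" (inaccurate) marker is left out on purpose, since the thread has not settled where it belongs.

```python
import re
from dataclasses import dataclass

# Hypothetical illustration only: these names are not openEHR classes.


@dataclass(frozen=True)
class QualifiedQuantity:
    magnitude: float
    units: str
    qualifier: str = "="  # "=" is the default: an exact point value


# Optional qualifier, then a number, then optional units.
# Two-character qualifiers are listed first so ">=" is not read as ">".
_RESULT = re.compile(r"^\s*(<=|>=|<|>|=|~)?\s*([-+]?\d+(?:\.\d+)?)\s*(\S.*)?$")


def parse_result(text: str) -> QualifiedQuantity:
    """Split a reported result such as '<5 mmol/L' into its parts."""
    match = _RESULT.match(text)
    if match is None:
        # Purely textual results (e.g. 'HAEM') carry no magnitude here.
        raise ValueError(f"unrecognised result: {text!r}")
    qualifier, number, units = match.groups()
    return QualifiedQuantity(
        magnitude=float(number),
        units=(units or "").strip(),
        qualifier=qualifier or "=",
    )


if __name__ == "__main__":
    print(parse_result("<5 mmol/L"))
    print(parse_result("4.2 mmol/L"))
```

Keeping the magnitude as a plain number alongside the flag means the value can still be stored and plotted, while any consumer that ignores the flag is at least visibly making that choice.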
Do we need to take this into account in the model in some way? There is already another CR to adjust how normal_range is modelled, and we have an is_normal function defined on DV_ORDERED (the ancestor of all the Quantity types in openEHR).

If we can get a bit more discussion on these details, I think we can fairly quickly state what changes are needed and write a CR for them.

- thomas

Sam Heard wrote:

---

## Post #8 by @system

Thomas,

I agree it is very common. But when <5 is reported, in essence it means that it is an exception. It is not a precise result. It does not mean that it is less than 5 only. It means that something of an exceptional state is in order. It could be zero, it could be 4.999.. and anything in between, but not an exact figure like x=5.1 units of a kind.

Gerard

-- -- Gerard Freriks, arts Huigsloterdijk 378 2158 LR Buitenkaag The Netherlands T: +31 252 544896 M: +31 654 792800

---

## Post #9 by @Heath_Frankel

Gerard,

There are cases where we get a result like > 60 which is not an exception because the normal range is >60.

Heath

---

## Post #10 by @Vincent_McCauley

Hi Thomas,

Specific points:

1. My pathology software supports <= and >= in this context but I have not come across an automated blood analyser interface that supports or requires this, and in the couple of databases I looked at (approx. 15X10^6 numeric values over 8 years) no user has used these values. So probably good for completeness but no apparent use in the real world!
2. The "Inaccurate flag" generally means that for some reason this value should be treated with caution or is unreliable. It may in fact be perfectly accurate (as in the haemolysed blood K+ example) but not actually a measure of the defined analyte - i.e. rather than a serum K+ value what was measured was serum K+ contaminated with Intracellular K+.
Similar issues occur with cold agglutinins (inaccurate values if performed on a specimen not kept at body temp) and serum glucose measured on a specimen which does not contain a metabolic inhibitor ("fluoride tube") and which has been kept at room temp for too long. At other times it may actually be inaccurate, e.g. due to failure to calibrate an analyser correctly.

The fact that the value is unreliable often only becomes apparent after the event, e.g. the doctor rings up to query a high normal K+ on a patient whose K+ value was expected to be low, and a visual/microscopic examination of the specimen reveals haemolysis. In the case of a badly calibrated analyser, statistical analysis of values performed routinely may demonstrate that there has been an unacceptable variability in results, or the average result was significantly higher than expected, or (as happened very memorably in my practice in the Emergency department) the values over a period of time fail to correlate with the clinical condition of the patients.

So it is absolutely necessary to be able to record (and keep) the value but be able to flag it either at the time or sometime later as unreliable/inaccurate. It is probably not worthwhile (or, in some cases, even possible) to decide whether an erroneous result is inaccurate or unreliable.

3. Rules need to be provided as to how such values should be treated when comparing with normal ranges. For example, if the normal range is 0-6 and the value given is <5 then this is normal. However, if the normal range is 0-3, is this a normal value or not? This can be dealt with by "flavours of null" on a "normality flag".
4. Applications such as graphing, statistics packages etc. need to be aware of such values and treat them appropriately. Some general guidance/rules around this for developers/users may be appropriate.
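
Editorial illustration of point 3: one possible rule, sketched in Python, treats a qualified value as the set of values it could stand for and compares that set with the normal range. The names `Normality` and `classify`, the three-way outcome, the closed normal range, and the `floor` of 0 for "less than" results are all assumptions for this example, not rules taken from openEHR or from any laboratory system.

```python
import math
from enum import Enum


class Normality(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"
    INDETERMINATE = "indeterminate"  # the flag alone cannot decide


def classify(qualifier: str, magnitude: float,
             low: float, high: float, floor: float = 0.0) -> Normality:
    """Compare a qualified value with the closed normal range [low, high].

    Assumption: a '<' result cannot be below `floor` (0 for concentrations).
    """
    if qualifier == "=":
        lo, hi = magnitude, magnitude
    elif qualifier in ("<", "<="):
        lo, hi = floor, magnitude
    elif qualifier in (">", ">="):
        lo, hi = magnitude, math.inf
    else:
        raise ValueError(f"unsupported qualifier: {qualifier!r}")

    # Every possible value lies inside the normal range.
    if lo >= low and hi <= high:
        return Normality.NORMAL

    # No possible value lies inside the normal range.
    below = hi < low or (hi == low and qualifier == "<")
    above = lo > high or (lo == high and qualifier == ">")
    if below or above:
        return Normality.ABNORMAL

    # The possible values straddle a boundary of the normal range.
    return Normality.INDETERMINATE


if __name__ == "__main__":
    print(classify("<", 5, 0, 6))          # range 0-6, value <5
    print(classify("<", 5, 0, 3))          # range 0-3, value <5
    print(classify(">", 60, 60, math.inf))  # range >60, value >60
```

Under these assumptions the 0-6 case comes out normal and the 0-3 case comes out indeterminate, which is roughly what a "flavour of null" on a normality flag would need to express.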
Regards
Vince

---

## Post #11 by @grahamegrieve

hi

I don't think that the concept of <,> etc should be conflated with the concept of approximately and doubtful in the model. The approximate and doubtful always raise the issue of why and how, and so I think that should be a matter for the archetype to resolve. However < and > etc. should be a data type thing.

Grahame

Thomas Beale wrote:

---

## Post #12 by @system

Raise the proper flag that indicates that it is TRUE and we know how to interpret.

GF

-- -- Gerard Freriks, arts Huigsloterdijk 378 2158 LR Buitenkaag The Netherlands T: +31 252 544896 M: +31 654 792800

---

## Post #13 by @thomas.beale

Concerning my last reply on this subject,

I feel the appropriate solution is:

* add an attribute value_qualifier of type STRING with allowable values >, <, >=, <=, = (since this is a closed list, using coded terms doesn't seem to be useful)
* allow ELEMENT.null_flavour and DV_QUANTIFIED.accuracy to be used to cover Vince's Inaccurate (probably wrong) and '~' (slightly inaccurate, but usable value) cases respectively. In the latter case, it seems to me that if accuracy is going to be reported, it should be quantified, the way we do it in openEHR, i.e. +/- 5%, +/-2 and so on. Vince - am I being unreasonable? Did you have '~' because lab devices output this?

Unless better ideas surface, I will create a CR on the basis of the above.

- thomas

Grahame Grieve wrote:

> hi
>
> I don't think that the concept of <,> etc should be conflated with the concept of approximately and doubtful in the model. the approximate and doubtful always raise the issue of why and how and so I think that should be a matter for the archetype to resolve. However < and > etc, should be a data type thing.
>
> Grahame

Grahame, you are right - to express ">5 (inaccurate)" we need two flags...
I can't think of great names off the top of my head, but how about:

* value_qualifier - the attribute that carries the <, >, = etc
* value_status - an attribute that carries some other possible flags, e.g. ?, ~, others?

I am suggesting that Vince's '~' is more like a data quality marker than an indicator of how to read the value...'?' means inaccurate....possibly wildly? Are '~' and '?' really different? If the second flag was just to say accurate / inaccurate then we could just use a Boolean. That would probably cover 95% of needs and be simple at the same time....Vince - any comments on that?

I think we are close to a solution here.

- thomas

---

## Post #14 by @Tim_Churches

Thomas Beale wrote:

> Concerning my last reply on this subject,
>
> I feel the appropriate solution is:
> * add an attribute value_qualifier of type STRING with allowable values >, <, >=, <=, = (since this is a closed list, using coded terms doesn't seem to be useful)
> * allow ELEMENT.null_flavour and DV_QUANTIFIED.accuracy to be used to cover Vince's Inaccurate (probably wrong) and '~' (slightly inaccurate, but usable value) cases respectively. In the latter case, it seems to me that if accuracy is going to be reported, it should be quantified, the way we do it in openEHR, i.e. +/- 5%, +/-2 and so on. Vince - am I being unreasonable? Did you have '~' because labs devices output this?

If you are going to capture error limits around a scalar quantity, then you need to also capture the nature of those limits. Sometimes they are simply co-efficients of variation, sometimes one or two (or 1.96) standard deviations (as frequentist confidence intervals for normally distributed data, or asymptotically normal confidence limits for non-normally distributed data), sometimes they are non-normal confidence limits, and occasionally (but often with clinical trials etc) they are Bayesian credible intervals.
Then there is the confidence level \- often 95% but sometimes 99%, sometimes less\. Will the proposed solution cover these and other scenarios? Tim C --- ## Post #15 by @thomas.beale Tim Churches wrote: > Thomas Beale wrote: >   >> Concerning my last reply on this subject, >> >> I feel the appropriate solution is: >> \* add an attribute value\_qualifier of type STRING with allowable values >>     >>> , <, >=, <=, = \(since this is a closed list, using coded terms doesn't >>>       >> >> seem to be useful\) >> \* allow ELEMENT\.null\_flavour and DV\_QUANTIFIED\.accuracy to be used to >> cover Vince's Inaccurate \(probably wrong\) and '\~' \(slightly inaccurate, >> but usable value\) cases respectively\. In the latter case, it seems to me >> that if accuracy is going to be reported, it should be quantified, the >> way we do it in openEHR, i\.e\. \+/\- 5%, \+/\-2 and so on\. Vince \- am I being >> unreasonable? Did you have '\~' because labs devices output this? >>     > If you are going to capture error limits around a scalar quantity, then > you need to also capture the nature of those limits\. Sometimes they are > simply co\-efficients of variation, sometimes one or two \(or 1\.96\) > standard deviations \(as frequentist confidence intervals for normally > distributed data, or asymptotically normal confidence limits for > non\-normally distributed data\), sometimes they are non\-normal confidence > limits, and occasionally \(but often with clinical trials etc\) they are > Bayesian credible intervals\. Then there is the confidence level \- often > 95% but sometimes 99%, sometimes less\. Will the proposed solution cover > these and other scenarios? >   Hm\.\.\.\.that's a good question\. 
Currently the model \(see http://www.openehr.org/uml/Browsable/_9_0_76d0249_1109599337877_94556_1510Report.html) only captures limits as either a \+/\- percent, or as a \+/\- absolute value \(see the accuracy attributes in the diagram\) \- it does this via the attribute accuracy\_is\_percent which is just a Boolean\. What you are asking for would be accommodated by making it a code which indicated the meaning of the accuracy band\. So far we have not had such requirements expressed for the openEHR models, but as I happen to know you are coming form an epidemiological/public health/statistical point of view, clearly we need to accommodate them\. Tim, if the accuracy\_is\_percent attribute was upgraded to a coded value, could you suggest a set of meanings that would cover all the epi/PH needs? \- thomas --- ## Post #16 by @Tim_Churches > > Tim Churches wrote: > > If you are going to capture error limits around a scalar quantity, > then > > you need to also capture the nature of those limits\. Sometimes they > are > > simply co\-efficients of variation, sometimes one or two \(or 1\.96\) > > standard deviations \(as frequentist confidence intervals for normally > > distributed data, or asymptotically normal confidence limits for > > non\-normally distributed data\), sometimes they are non\-normal > confidence > > limits, and occasionally \(but often with clinical trials etc\) they are > > Bayesian credible intervals\. Then there is the confidence level \- > often > > 95% but sometimes 99%, sometimes less\. Will the proposed solution > cover > > these and other scenarios? > > > Hm\.\.\.\.that's a good question\. 
> Currently the model (see
> http://www.openehr.org/uml/Browsable/_9_0_76d0249_1109599337877_94556_1510Report.html)
> only captures limits as either a +/- percent, or as a +/- absolute value

I think the term "absolute value" as you are using it here (and I understand how you are using it) can lead you into a semantic trap, because +/-2 is almost always a confidence band - the producer of the data is rarely absolutely certain that the true value is no more than 2 higher or 2 lower than the nominal value, just that there is a 95% (or whatever) likelihood that the true value is within those limits (and for the epidemiological and statistical purists, yes I know that that is not actually a correct statement for frequentist limits, but it is the most commonly used, albeit slightly mistaken, interpretation...). Note also that confidence limits are not always symmetrical (because not all values have a normal distribution), so you need to be able to capture both upper and lower limits as separate values.

> (see the accuracy attributes in the diagram) - it does this via the
> attribute accuracy_is_percent which is just a Boolean. What you are
> asking for would be accommodated by making it a code which indicated the
> meaning of the accuracy band.
>
> So far we have not had such requirements expressed for the openEHR
> models, but as I happen to know you are coming from an
> epidemiological/public health/statistical point of view, clearly we need
> to accommodate them.

Almost all lab results are just best guesses from a statistical distribution, so it is not just an epidemiological/public health concern.

> Tim, if the accuracy_is_percent attribute was upgraded to a coded value,
> could you suggest a set of meanings that would cover all the epi/PH
> needs?

You'll have to tell me what that would involve. A single coded value? Upper and lower limits? Confidence level. Type of limit?
Tim C

---

## Post #17 by @thomas.beale

Tim Churches wrote:

>> Tim, if the accuracy_is_percent attribute was upgraded to a coded value, could you suggest a set of meanings that would cover all the epi/PH needs?
>
> You'll have to tell me what that would involve. A single coded value? Upper and lower limits? Confidence level. Type of limit?

well, essentially what you are proposing would require (let's not get too pure about how I use the word "accuracy" here for the moment):

- lower accuracy limit: Real
- upper accuracy limit: Real
- accuracy limit type: coded term
- confidence level (or this could be part of the previous coded attribute, since only a small number of confidence bands are used in practice, aren't they?)

Now, what we currently have is a set of general purpose quantity classes designed to enable recording of any quantitative data we have come across so far. Between various MDs such as Sam, Vince and others, I think we have pathology covered from a practical point of view (well, we do once we get this <, >, etc thing sorted).

The real question is: what is the type & origin of data that needs to be represented in the more sophisticated way that we are now suggesting? Is it a different category of data? Should we leave the current DV_QUANTITY as is and add a new subtype? Or is it that we should consider a quantity with a 95% T-distribution confidence interval as a pretty normal thing? Should we then start considering the "simple" idea of a symmetric accuracy range (+/- xxx) as really just one specific type of a confidence interval (it might translate to something like 98% on a normal curve)? In other words, should we generalise the "accuracy" notion into a "confidence interval" notion?

- thomas

---

## Post #18 by @system

Hi,

A few words from a non-techie.

Quantity means that what is the resulting figure expressing a quantity.
Hb: 8.5 mmol/L

A property of the Hb measurement can be an uncertainty.
This is not an uncertainty of the figure "8.5", but of the Hb measurement where 8.5 is the correct resulting number and mmol/L the code for the units. There can be the question that the reported 8.5 really is 8.5 with/without roundoff error. Only roundoff could be added to DV-QUANTITY as an added extra property, I think. Uncertainty is added information that the uncertainty of the measurement is plus or minus something according to a specified (or implied) distribution type. In my view uncertainty is the property of the measurement i.e. the specific archetype/template that will express the number This uncertainty will be expressed in an archetype using attributes using DV-QUANTITY expressing the uncertainty as limits and a distribution type term (with a default gaussian distribution?) Gerard Freriks -- -- Gerard Freriks, arts Huigsloterdijk 378 2158 LR Buitenkaag The Netherlands T: +31 252 544896 M: +31 654 792800 --- ## Post #19 by @Tim_Churches Gerard Freriks wrote: > Hi, > > A few words from a non\-techie\. > > Quantity means that what is the resulting figure expressing a quantity\. > Hb: 8\.5 mmol/L > > A property of the Hb measurement can be an uncertainty\. > This is not an uncertainty of the figure "8\.5", but of the Hb > measurement where 8\.5 is the correct resulting number and mmol/L the > code for the units\. > There can be the question that the reported 8\.5 really is 8\.5 > with/without roundoff error\. > Only roundoff could be added to DV\-QUANTITY as an added extra property, > I think\. > > Uncertainty is added information that the uncertainty of the measurement > is plus or minus something according to a specified \(or implied\) > distribution type\. > In my view uncertainty is the property of the measurement i\.e\. 
> the specific archetype/template that will express the number
> This uncertainty will be expressed in an archetype using attributes
> using DV-QUANTITY expressing the uncertainty as limits and a
> distribution type term (with a default gaussian distribution?)

The foregoing seems sensible to me. I think that uncertainty or confidence interval (or credible interval) information for a scalar quantity (such as a biochemistry result) should be treated in a similar manner to normal ranges for lab test results. How does openEHR handle normal ranges, which, depending on the type of test, may be specific to each lab/assay method or kit and reference population?

My microbiologist colleagues keep reminding me that most serological results can't really be interpreted in the absence of assay- and lab-specific reference titres, and the same is true of many NAT (nucleic acid test) assays, especially those involving PCR. Usually the microbiologist or pathologist will provide their lab- and assay-specific interpretation of the numbers for the requesting doctor, e.g. "titre 1 in 256 i.e. positive for XYZ", and it could be argued that it is enough to capture just the interpretation of the numbers - but that doesn't seem to be the guiding principle elsewhere in openEHR. For example, I marvel at the completeness of the archetype for capturing blood pressure measurements, right down to the detail of which phase of the Korotkoff sounds is used, as I recall.

Applying that same degree of attention to detail to lab results means having the ability to accommodate quite a lot of metadata about each scalar result. Mostly that detailed metadata about accuracy or confidence limits or about assay types won't be collected, won't be available or won't matter, but occasionally it will matter, and I suppose that's what openEHR needs to plan for, within reason.
Tim C --- ## Post #20 by @Tim_Churches Thomas Beale wrote: > Tim Churches wrote: >> >>> Tim, if the accuracy\_is\_percent attribute was upgraded to a coded >>> value, could you suggest a set of meanings that would cover all the >>> epi/PH needs? >>>     >> You'll have to tell me what that would involve\. A single coded value? >> Upper and lower limits? Confidence level\. Type of limit? >>   > > well, essentially what you are proposing Not proposing anything, I'm just asking the question "Have you thought about this?" > would require \(let's not get > too pure about how I use the word "accuracy" here for the moment\): > \- lower accuracy limit: Real > \- upper accuracy limit: Real > \- accuracy limit type: coded term > \- confidence level \(or this could be part of the previous coded > attribute, since only a small number of confidence bands are used in > practice aren't they?\) > > Now, what we currently have is a set of general purpose quantity classes > designed to enabled recording of any quantitative data we have come > across so far\. Between various MDs such as Sam, Vince and others, I > think we have pathology covered from a practical point of view \(well, we > do once we get this <, >, etc thing sorted\)\. Just curious: have you had much input from pathologists, microbiologists and lab scientists? The more one talks to such people, the more one discovers about the uncertainties inherent in certain assay techniques, and the differences in the scalar \(and qualitative or Boolean\) results produced by different assay kits and different labs\. Oh, there's another form of uncertainty which typically is of relevance to Boolean/dichotomous results \(positive/negative, detected/not detected etc\) and that is the sensitivity and specificity of the test, or the related quantities PPV \(positive predictive value\) and NPV\. 
\(Note to computer scientists: "specificity" and "sensitivity" are cognate with "precision" and "recall"\.\) > The real question is: what is the type & origin of data that need to > represented in the more sophisticated way that we are now suggesting? Is > it a different category of data? Should be leave the current DV\_QUANTITY > as is and add a new subtype? Or is it that we should consider a quantity > with a 95% T\-distribution confidence interval as a pretty normal thing? > Should we then start considering the "simple" idea of a symmetric > accuracy range \(\+/\- xxx\) as really just one specific type of a > confidence interval \(it might translate to something like 98% on a > normal curve\)\. In other words, should we generalise he "accuracy" notion > into a "confidence interval" notion? I think that a one or two day workshop with a range of pathologists, microbiologists, lab scientists, epidemiologists and statisticians \(and some clinicians and computer scientists, of course\) would suffice to come up with sensible answer to your question\. I'd be happy to participate and to suggest other participants\. First half day would need to be spent bringing everyone up to speed on openEHR so they understand the nature of the question\(s\) to be addressed \(and a good means of spreading the openEHR gospel while you're at it\.\.\.\)\. Might be possible to hold a cyber\-workshop instead, via email or real\-time conferencing\. The former would be much slower, of course\. Tim C --- ## Post #21 by @thomas.beale Sam would be better able to give an idea of all the health professionals who have been consulted, but certainly in Australia, Vince McCauley \(a pathologist\) has been extremely helpful on pathology result detail\. 
Also, people like Heath Frankel and Grahame Grieve who have worked with HL7v2 messages for years have provided quite a lot of input on details (for example in Release 1.0, there is now a summary attribute for Historical data structures, directly due to Grahame's advice on the shape of lab data his software handles - see http://www.openehr.org/uml/Browsable/_9_0_76d0249_1109157527311_729550_7234Report.html).

Is it enough? At this stage I would be fairly confident that the models are good enough for most pathology data (certainly everything any of the docs working with openEHR has seen). Are they perfect? Of course not. We always need more input. The confidence level stuff implied by your requirements (let's treat them as epi/public health data requirements) would make things better; we just have to determine a) what scope of data they apply to (e.g. how much sophistication do we need in the EHR compared to say a dedicated data warehouse designed for statistical studies?) and b) how to add them to the current model in a way compatible with what is there.

I think that the idea of a workshop is a good one; I would prefer to see clinical professionals here take up the suggestion and do something with it; I don't see these kinds of discussions as being IT driven - they are all about articulating requirements.

- t

Tim Churches wrote:

---

## Post #22 by @grahamegrieve

I agree with this - that it's good enough now.

I think this thread is starting to talk about things which aren't properly part of the data type; they are conceptual things about the result values, and should be modelled explicitly in the archetypes.

Grahame

Thomas Beale wrote:

---

## Post #23 by @thomas.beale

Grahame Grieve wrote:

> I agree with this - that's it good enough now.
>
> I think this thread is starting to talk about things which aren't properly part of the data type, they are conceptual things about
> the result values, and should be modelled explicitly in the archetypes.
Grahame, is it your feeling that we need to have a better model of accuracy, i\.e\. more like the confidence interval idea? Or are we ok with what we have? My gut feeling is to leave the current DV\_QUANTITY the way it is and consider either a\) doing nothing \- treat Tim's requirements as requirements not on primary data going into the EHR, but on generated data of the kind found in an epidemiological/statistical style of system or b\) add a variant of DV\_QUANTITY \(probably a subtype\) that does the full deal \(and is convertible into a vanilla DV\_QUANTITY\)\. thoughts anyone? \- thomas --- ## Post #24 by @grahamegrieve hi Thomas > is it your feeling that we need to have a better model of accuracy, i\.e\. more like the confidence interval idea? Or are we ok with what we have? well\. a measured quantity is a group of data, with some or all of the following things known: \- what was measured \- how it was measured \(\+ who & when & where & environment conditions\!\) \- units for the values \- a possible range of values What was measured does usually need to be known\. In terms of modeling this as a data type, I note that what was measured is usually considered to be outside the data type\. And though I wonder whether this is actually right, we're not discussing this right now\. In real life, we generally don't count how it was measured as part of the value\. The idea is that you go read the "methods section" \(whatever that means\) if you care\. Except that in clinical medicine, there's a few things where how something is measured matters\. in clinical medicine, we generally say that something else was measured at this point\. A classic example is Total Calcium and Ionized Calcium\. \(and it's not wrong to call them different things, my point is that it's arbitrary\)\. Anyhow, I've never heard someone argue that they should be part of the datatype\. 
I don't think there's much point in differentiating between a measured and a non\-measured quantity \- that's for philosophy\.

So, back to the possible range of values\. This is a complex concept\. Generally, the possible range of values is a bell shaped probability distribution \(or a log bell curve\), but it's rarely properly known whether it actually is \- it's generally assumed that it is a bell curve\. You could \*approximate\* the concept of a probability distribution by reporting a central value with a \+/\-, or an interval that covers the central 95% of the distribution\.

I know that we got taught in uni to track uncertainties \(and sometimes even to quantitate the distribution curve\), and to bring them through our equations \(and conclusions\!\), but out in the real world, it's rarely done in published papers \(shame, really\) and I've never seen it done in clinical work \(even in clinical research\)\.

In clinical medicine, the only behaviour I've seen is to report a single value, what was actually measured, and not say anything at all about the uncertainty\. No, I'm wrong\. Once I used to perform an assay where the methodological uncertainty in the number was clinically significant\. We used to report a range rather than a point value, so the doctors couldn't be mistaken about its meaning\.

Reporting <X or >X for a value is something that you have to do if you aren't normally reporting a range of values\. So you said you didn't want to model that as an interval, but I was less than convinced \- if you always reported an interval, it would be consistent\. But even if you were consistent in this way, the methodological basis for the "interval" <5 or >5000 is not the same as the methodological basis for 100\-110\.

These concepts overlap\. If you added confidence interval \- as an optional item \- then you get an interesting situation\. If I say that this value is <50 \(ci=100\), what am I saying? \(and don't laugh, this is a common clinical result value to report\)\.
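
A minimal sketch of the "always report an interval" idea discussed above, in Python. The names `Interval` and `flagged_value_to_interval` are hypothetical and are not part of any openEHR specification. The open side is assumed to be unbounded, although for a concentration a lower bound of zero would be more realistic. It shows how "<5" and a point value can share one shape, even though (as noted above) their methodological basis differs.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical interval type; bounds may be infinite."""
    lower: float
    upper: float
    lower_included: bool
    upper_included: bool


def flagged_value_to_interval(flag: str, value: float) -> Interval:
    """Map a flagged lab value such as '<5' onto an interval.

    Assumption: the side not limited by the flag is unbounded.
    """
    if flag == "<":
        return Interval(-math.inf, value, False, False)
    if flag == "<=":
        return Interval(-math.inf, value, False, True)
    if flag == ">":
        return Interval(value, math.inf, False, False)
    if flag == ">=":
        return Interval(value, math.inf, True, False)
    if flag == "=":
        # A point value becomes a degenerate, closed interval.
        return Interval(value, value, True, True)
    raise ValueError(f"unsupported flag: {flag!r}")


below_five = flagged_value_to_interval("<", 5.0)
above_limit = flagged_value_to_interval(">", 5000.0)
```

Note that nothing in such an interval says whether it came from an instrument limit or from measurement uncertainty; that distinction would still have to be recorded somewhere else.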
In clinical medicine, also, the things that may corrupt the result due to interference from drugs, unusual medical conditions, etc, these don't contribute to the distribution range, so it's not usually significant\.

This is starting to ramble\. As I said, in clinical medicine, we only report a single value, let the interpreter figure out the distribution themselves\. If they're not sure, they should contact the number on the report \(in all legal jurisdictions I know, there must be one\)\.

I think that for the rare cases where the distribution range needs to be conveyed/stored outside the generating system, then the archetype should store it\. The archetype already includes some of the other stuff in my original data grouping, so I don't see it as inappropriate to solve it this way\.

so, leave it as it is\.

Grahame

---

## Post #25 by @Tim_Churches

Thomas Beale wrote:

> Sam would be better able to give an idea of all the health professionals who have been consulted, but certainly in Australia, Vince McCauley \(a pathologist\) has been extremely helpful on pathology result detail\. Also, people like Heath Frankel and Grahame Grieve who have worked with HL7v2 messages for years have provided quite a lot of input on details \(for example in Release 1\.0, there is now a summary attribute for Historical data structures, directly due to Grahame's advice on the shape of lab data his software handles \- see http://www.openehr.org/uml/Browsable/_9_0_76d0249_1109157527311_729550_7234Report.html).
>
> Is it enough? At this stage I would be fairly confident that the models are good enough for most pathology data \(certainly everything any of the docs working with openEHR has seen\)\. Are they perfect? Of course not\. We always need more input\.
> The confidence level stuff implied by your requirements \(let's treat them as epi/public health data requirements\) would make things better; we just have to determine a\) what scope of data they apply to \(e\.g\. how much sophistication do we need in the EHR compared to say a dedicated data warehouse designed for statistical studies?\) and b\) how to add them to the current model in a way compatible with what is there\.

Sure, that's a very reasonable position\. I was not suggesting that openEHR \*must\* accommodate such things, but as someone else opened the Pandora's Box of \+/\- accuracy as a data value property, as opposed to part of a higher\-level Archetype construct, I felt obliged to point out that there was more to it than there might appear at first glance\. But a system for EHRs can't accommodate every subtlety of the Universe, so best to force the lid back down on the Box in this case\.

> I think that the idea of a workshop is a good one; I would prefer to see clinical professionals here take up the suggestion and do something with it; I don't see these kinds of discussions as being IT driven \- they are all about articulating requirements\.

Happy to participate and to suggest other participants if someone wishes to organise one\.

Tim C

---

## Post #26 by @Tim_Churches

Thomas Beale wrote:

> Grahame Grieve wrote:
>> I agree with this \- that it's good enough now\.
>>
>> I think this thread is starting to talk about things which aren't properly part of the data type, they are conceptual things about the result values, and should be modelled explicitly in the archetypes\.
>
> Grahame,
>
> is it your feeling that we need to have a better model of accuracy, i\.e\. more like the confidence interval idea? Or are we ok with what we have?
> My gut feeling is to leave the current DV\_QUANTITY the way it is and consider either
> a\) doing nothing \- treat Tim's requirements as requirements not on primary data going into the EHR, but on generated data of the kind found in an epidemiological/statistical style of system or

I think that is the best option, but I must point out that not all of the things I mentioned were purely of interest to epidemiologists and statisticians\. I don't have time to look right now, but I am sure openEHR has the issue of normal/reference ranges for lab results well and truly covered, but the issue of the specificity/sensitivity/PPV/NPV of a particular type/method/brand of test \*is\* of immediate clinical interest in some circumstances, and doesn't belong only in a data warehouse\. But it probably belongs in the Archetype, not the reference model\.

Tim C

---

## Post #27 by @Sam

Dear Tom and all

I must say that quantifying accuracy and uncertainty is very difficult - and I do like the inclusion of ~ in the set of flags to mean approximately - when there is no idea of accuracy from a mathematical point of view. I think we may lose something if we try and get it all in the notion we currently have of a measured accuracy.

Sam

Thomas Beale wrote:

---

## Post #28 by @thomas.beale

Sam Heard wrote:

> Dear Tom and all
>
> I must say that quantifying accuracy and uncertainty is very difficult \- and I do like the inclusion of \~ in the set of flags to mean approximately \- when there is no idea of accuracy from a mathematical point of view\. I think we may lose something if we try and get it all in the notion we currently have of a measured accuracy\.
>
> Sam

notwithstanding the fact that the '\~' flag probably worked well in Vince's system, I cannot help but wonder what it really adds; if you don't know either the absolute limits \(of the measuring device\) or statistical confidence limits, how do you compute with it?
How can I write a decision support program that takes notice of a '\~' flag?

\- thomas

---

## Post #29 by @Heath_Frankel

Tom,

Does leaving "the current DV\_QUANTITY the way it is" include the ability to record "< 5 mmol/L" for example?

Heath

---

## Post #30 by @thomas.beale

Heath Frankel wrote:

> Tom,
> Does leaving "the current DV\_QUANTITY the way it is" include the ability to record "< 5 mmol/L" for example?

yes \- sorry \- that was ambiguous \- we have to make that addition \(using a coded attribute\)\.

\- t

---

## Post #31 by @Colin_Sutton

The 'coding' is surely 'Accuracy' \('Measurement' has 'Accuracy'\) where this can be None|\~|Unknown|Percentage\(value\)|SD\(Distribution type,value\) which would cover any measurement \(e\.g\. height, heart rate\), not just pathology lab values

Regards, Colin Sutton

---

## Post #32 by @Sam

It is a flag that says the value is very uncertain - accuracy is not known - how do we say this - or a quality factor makes the reading very uncertain. I just want to be able to see how we express when accuracy is poor but not quantifiable.

Sam

Thomas Beale wrote:

---

## Post #33 by @thomas.beale

Sam Heard wrote:

> It is a flag that says the value is very uncertain \- accuracy is not known \- how do we say this \- or a quality factor makes the reading very uncertain\. I just want to be able to see how we express when accuracy is poor but not quantifiable\.
> Sam

doesn't it mean that the value is completely useless, and that instead, a null\-flavour flag should be set in the Element, and no data value be recorded at all?

\- thomas

---

## Post #34 by @system

Thomas,

In a data type like DV \- in my mind \- only flags can be raised that indicate the technicalities of that number\. And that means "round off error" with which it is reported\. All other flags are at the archetype level\. Null\-Flavors belong there\. It is all at the semantic level, the knowledge level\. And not the numeric interpretation level\.
Gerard

\-\- CEN/tc251 Convenor \-\- Gerard Freriks, MD convenor CEN/tc251 WG1 TNO ICT Brassersplein 2 Delft T: \+31 15 2857105 M: \+31 6 54792800

---

## Post #35 by @Tim_Churches

Thomas Beale wrote:

> Sam Heard wrote:
>> It is a flag that says the value is very uncertain \- accuracy is not known \- how do we say this \- or a quality factor makes the reading very uncertain\. I just want to be able to see how we express when accuracy is poor but not quantifiable\.
>> Sam
>
> doesn't it mean that the value is completely useless, and that instead, a null\-flavour flag should be set in the Element, and no data value be recorded at all?

There is no such thing as a pre\-condition in clinical medicine: the clinician has to compute with whatever parameter values are provided, no matter how shabby, incomplete and inconsistent they are\.\.\. Floyd and Hoare would have hated being doctors of medicine, I suspect\. ;\-\)

Tim C

---

## Post #36 by @Sam

Tom

That is unlikely - it just means it is the best that could be done. Remember this can be used usefully outside of labs too.

Sam

Thomas Beale wrote:

---

## Post #37 by @thomas.beale

Colin Sutton wrote:

> The 'coding' is surely 'Accuracy' \('Measurement' has 'Accuracy'\) where this can be None|\~|Unknown|Percentage\(value\)|SD\(Distribution type,value\) which would cover any measurement \(e\.g\. height, heart rate\), not just pathology lab values

this seems pretty close to a correct model\. Slight corrections I would suggest are:

- I am still uncomfortable with '\~', since it seems to mean "approximate", but "we don't know how approximate"\.\.\.
- does "None" mean a\) none recorded \(i\.e\. don't know, i\.e\. same as '\~'\) or b\) no accuracy, i\.e\. an exact value \(reasonable for some things, e\.g\. the answer to the question "number of previous pregnancies"\)?
- in the case of a statistical distribution, one value may not be enough to characterise the limits, since the distribution may be asymmetric \(I don't remember enough beyond normal/T/Chi2 to remember if there are distributions that need even more parameters\)\.

The question for us in openEHR is how much to implement of such a model: we have to be driven by real use cases\.

\- thomas beale

---

## Post #38 by @Tim_Churches

Thomas Beale wrote:

> Colin Sutton wrote:
>> The 'coding' is surely 'Accuracy' \('Measurement' has 'Accuracy'\) where this can be None|\~|Unknown|Percentage\(value\)|SD\(Distribution type,value\) which would cover any measurement \(e\.g\. height, heart rate\), not just pathology lab values
>
> this seems pretty close to a correct model\. Slight corrections I would suggest are:
>
> - I am still uncomfortable with '\~', since it seems to mean "approximate", but "we don't know how approximate"\.\.\.
> - does "None" mean a\) none recorded \(i\.e\. don't know, i\.e\. same as '\~'\) or b\) no accuracy, i\.e\. an exact value \(reasonable for some things, e\.g\. the answer to the question "number of previous pregnancies"\)?
> - in the case of a statistical distribution, one value may not be enough to characterise the limits, since the distribution may be asymmetric \(I don't remember enough beyond normal/T/Chi2 to remember if there are distributions that need even more parameters\)\.

In terms of statistical confidence limits/intervals, the parameters are: the type of limits/interval \(frequentist "confidence interval" or Bayesian "credible interval"\), the confidence value \(typically 95%, but often not\), and the underlying assumed \*error\* distribution \(normal, Poisson, Student's T, Weibull etc etc\)\.
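
The three parameters listed above could be carried in a small record, sketched here in Python. `UncertaintyInterval` and `normal_interval` are hypothetical names, not openEHR types, and the helper assumes a normally distributed error purely for illustration; any other error distribution would need its own calculation.

```python
from dataclasses import dataclass
from statistics import NormalDist


@dataclass(frozen=True)
class UncertaintyInterval:
    """Hypothetical record of an interval plus the parameters that qualify it."""
    kind: str          # "confidence" (frequentist) or "credible" (Bayesian)
    level: float       # e.g. 0.95
    distribution: str  # assumed error distribution, e.g. "normal"
    lower: float
    upper: float


def normal_interval(estimate: float, std_error: float,
                    level: float = 0.95) -> UncertaintyInterval:
    """Two-sided interval under an assumed normal error distribution."""
    # Quantile of the standard normal that leaves (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return UncertaintyInterval(
        kind="confidence",
        level=level,
        distribution="normal",
        lower=estimate - z * std_error,
        upper=estimate + z * std_error,
    )


ci = normal_interval(100.0, 2.0)  # roughly 96.08 to 103.92
```

Keeping the interval type and the assumed distribution next to the bounds is what would let a consumer tell a 95% confidence interval from, say, a 90% credible interval.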
However, confidence intervals/limits don't indicate where in a population distribution a particular value lies \- quantiles are more often used for this \- the actual quantile of the value \(eg for growth measurements read against a nomogram\), or the values of the quartiles or 5th and 95th percentile, or variations on that\.

> The question for us in openEHR is how much to implement of such a model: we have to be driven by real use cases\.

If you really want to nail this problem, a workshop involving a range of people \(from lab scientists to pathologists to clinicians to epidemiologists and biostatisticians\) is required, I think\. It could be in the form of a virtual workshop via email, but you really need to gather together a diverse group, state the problem/s to be solved \(eg lab values only, or physical measures only, or to include other things like measures derived from psychological scales or population or study measures like odds ratios and relative risks or age\-standardised rates?\), and get them to generate use cases and explore the issues\.

Happy to be involved, but as an epidemiologist, I'd feel more comfortable if some mathematical statisticians and some lab scientists were involved too\.

Tim C

---

## Post #39 by @system

Folks,

I will repeat myself. You are talking about a data type. This DV_Quantity is a number. The question is how do we embellish this data type and the number it presents with extra codes/numbers to indicate: types of certainty/uncertainty, and statistical distributions.

The only real meaning of an extra attribute as part of DV-Quantity pertains to the number given and not the meaning (interpretation). The extra attribute in DV-Quantity will provide information about the precision of the number, only.

Any extra information is a property of the concept in which DV-Quantity is used. E.g. certainty/uncertainty, distribution, etc.
It is related to the specific concept and its context that is being expressed and not the expression of a number/data type. ~, statistical distributions, etc will have to be expressed at the level of Concept definition and therefore the Archetype.

Greetings

Gerard

-- -- Gerard Freriks, arts Huigsloterdijk 378 2158 LR Buitenkaag The Netherlands T: +31 252 544896 M: +31 654 792800

---

## Post #40 by @thomas.beale

Tim,

I agree with the workshop idea, and assume that it could at least be done in Australia as a starting point\. Thus, for the short term, I am inclined to add only the very simple "<, >, <=, >=, =" indicator, and possibly consider the "\~" one \(since these at least allow us to properly represent very low and very high path test values that are sent as "<5" and similar\)\. The complex stuff that Tim has described below needs proper modelling and in the end will lead to new data types \(and as Gerard says, it may well lead to something in the archetypes\)\. As with everything, we need to really understand the exact requirements first, and that probably won't happen without a workshop\.

\- thomas

Tim Churches wrote:

---

## Post #41 by @system

I agree. A workshop, a moment of reflection, is needed. We must understand better the real facts, the use cases, the requirements, before we come to wrong constructs in the wrong models or the correct ones.

Next we need to use the same definitions for: Data Type, Composite Data Type, Archetype. We need a common understanding of the function and meaning of **Class**, its **Attributes**, and **Data Types**. (Data Types are used to define the interface at the field/number/text level with other system components)

Gerard

Data Type, (e.g. a Floating Point Number)
Composite Data Type, (e.g. Floating Point number, plus truncation)
Archetype, (e.g.
Measurement and its interpretation: ~, <, <<, >>, >, good, bad, not to be trusted, etc, etc)

-- -- Gerard Freriks, arts Huigsloterdijk 378 2158 LR Buitenkaag The Netherlands T: +31 252 544896 M: +31 654 792800

---

## Post #42 by @Heath_Frankel

Tom,

Not sure of the need for <= or >=\. It's either beyond the value reading capability of the device or an actual value is recorded \(within some accuracy tolerance\)\.

Heath

---

## Post #43 by @thomas.beale

Heath Frankel wrote:

> Tom,
> Not sure of the need for <= or >=\. It's either beyond the value reading capability of the device or an actual value is recorded \(within some accuracy tolerance\)\.
>
> Heath

yes, this has already been mentioned \- Vince has never seen it in the millions of results his software has processed\. Vince suggested we put it in for completeness, but now that you have made me think of it\.\.\.maybe we should only allow what makes sense, i\.e\. =, >, <\.\.\.\.\.with \~ still to be resolved\.\.\.

\- thomas

---

## Post #44 by @William_E_Hammond

Although it is possible to write the logic to avoid this need, I find it useful at times to express the logic with greater than or equal to make it clear what the logic says\. The overhead is minor\. I argue for retaining >= and <=\.
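
To make the options concrete, here is a minimal Python sketch of a quantity carrying one of the comparison flags debated in this thread, plus a reference-range check that takes the flag into account. `FlaggedQuantity` and `within_range` are hypothetical names and do not reproduce the openEHR reference model; the three-valued result (True, False, or None for "cannot decide") is an assumption about how decision-support logic might treat flagged values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlaggedQuantity:
    """Hypothetical quantity with a comparison flag."""
    magnitude: float
    units: str
    status: str = "="  # one of "=", "<", ">", "<=", ">=", "~"


def within_range(q: FlaggedQuantity, low: float, high: float) -> Optional[bool]:
    """Return True/False when decidable, None when the flag leaves it open."""
    if q.status == "=":
        return low <= q.magnitude <= high
    if q.status in ("<", "<="):
        # The true value lies at or below the magnitude (strictly below for "<").
        if q.magnitude < low or (q.status == "<" and q.magnitude == low):
            return False  # every possible value is below the range
        return None
    if q.status in (">", ">="):
        # The true value lies at or above the magnitude (strictly above for ">").
        if q.magnitude > high or (q.status == ">" and q.magnitude == high):
            return False  # every possible value is above the range
        return None
    return None  # "~" or anything else: no basis for a decision


tsh = FlaggedQuantity(2000.0, "mIU/L", ">")
potassium = FlaggedQuantity(5.0, "mmol/L", "<")
print(within_range(tsh, 0.4, 4.0))        # False
print(within_range(potassium, 3.5, 5.0))  # None
```

The only place where `<` and `<=` behave differently in this sketch is when the reported magnitude coincides with a range boundary, which is one way of seeing both sides of the argument about whether the extra flags are needed.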
Ed Hammond

Thomas Beale wrote \(04/25/2006 05:03 AM\):

Heath Frankel wrote:

> Tom,
> Not sure of the need for <= or >=\. It's either beyond the value reading capability of the device or an actual value is recorded \(within some accuracy tolerance\)\.
>
> Heath

yes, this has already been mentioned \- Vince has never seen it in the millions of results his software has processed\. Vince suggested we put it in for completeness, but now that you have made me think of it\.\.\.maybe we should only allow what makes sense, i\.e\. =, >, <\.\.\.\.\.with \~ still to be resolved\.\.\.

\- thomas

---

## Post #45 by @Karsten_Hilbert

<= >= do happen over here in Germany\.

Karsten

---

## Post #46 by @thomas.beale

Karsten Hilbert wrote:

---

## Post #47 by @Karsten_Hilbert

Much to my dismay a quick grep over a couple hundred results idling in files on my machine doesn't yield a "<=" or ">=" offhand but I'm positive I've seen it before\. When I come across one I'll post it to the list\.

Checking the lab data transmission specs confirms that the result can be a free\-text field which easily allows for any of the discussed accuracy modifiers\.
Karsten

---

## Post #48 by @system

Dear all,

The CEN EN13606 EHRcom standard is using an attribute to indicate that 'something is going on' and that precautions must be taken. Resulting most often in the need for human intervention. Is such an attribute a possible intermediate solution for the problem here?

Gerard

-- -- Gerard Freriks, arts Huigsloterdijk 378 2158 LR Buitenkaag The Netherlands T: +31 252 544896 M: +31 654 792800

---

## Post #49 by @William_E_Hammond

Just yesterday I ran into this construct in looking at a dosing algorithm for pediatrics\. Without the detail, the first time\-related logic specified for the period of less than 7 days \(<7 days\)\. The next logic line specified >= 7 days\. Without the >=, the logic would have been awkward at best\.

Ed Hammond

Karsten Hilbert wrote \(04/26/2006 09:24 AM\):

>> <= >= do happen over here in Germany\.
>
> can you provide an example of a path result with either of those in it?

Much to my dismay a quick grep over a couple hundred results idling in files on my machine doesn't yield a "<=" or ">=" offhand but I'm positive I've seen it before\. When I come across one I'll post it to the list\.
Checking the lab data transmission specs confirms that the result can be a free\-text field which easily allows for any of the discussed accuracy modifiers\.

Karsten

---

## Post #50 by @thomas.beale

William E Hammond wrote:

> Just yesterday I ran into this construct in looking at a dosing algorithm for pediatrics\. Without the detail, the first time\-related logic specified for the period of less than 7 days \(<7 days\)\. The next logic line specified >= 7 days\. Without the >=, the logic would have been awkward at best\.
>
> Ed Hammond

Hi Ed,

these are the kind of cases we need, and you have probably seen more data than anyone\. I am convinced to leave <=, >= in ;\-\)

\- thomas

---

## Post #51 by @Heath_Frankel

Tom,

Well this use case is a DV\_Duration\. Therefore should this new attribute be put on DV\_MEASURABLE? This might have some use cases for other Date/times as well\. This is different to partial dates\. This would allow us to say something like "this occurred before 1 May 2006"\. Or is this what the DV\_Time\_Specifcation types are used for?

Heath

---

## Post #52 by @thomas.beale

Heath Frankel wrote:

> Tom,
> Well this use case is a DV\_Duration\. Therefore should this new attribute be put on DV\_MEASURABLE? This might have some use cases for other Date/times as well\. This is different to partial dates\. This would allow us to say something like "this occurred before 1 May 2006"\. Or is this what the DV\_Time\_Specifcation types are used for?

In theory it should be on DV\_QUANTIFIED\.\.\.\.\.\.\.\. Time specification types have the purpose of specifying time in the future, not the past \(hence the name\)\.
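
As an illustration of placing the flag on a common ancestor, here is a minimal Python sketch in which one generic type carries the status for quantities, durations and dates alike. `Quantified` is a hypothetical stand-in, not the openEHR DV\_QUANTIFIED class, and the attribute names are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Quantified(Generic[T]):
    """Hypothetical base type: any ordered magnitude plus a comparison flag."""
    magnitude: T
    status: str = "="  # "=", "<", ">", "<=", ">="

    def describe(self) -> str:
        # Exact values print bare; flagged values print with their flag.
        if self.status == "=":
            return str(self.magnitude)
        return f"{self.status} {self.magnitude}"


potassium = Quantified(5.0, "<")               # lab value below detection limit
age_band = Quantified(timedelta(days=7), ">=")  # the dosing rule ">= 7 days"
onset = Quantified(date(2006, 5, 1), "<")       # "occurred before 1 May 2006"

print(onset.describe())  # < 2006-05-01
```

Because the flag lives on the shared type, the duration and date cases need no extra machinery beyond what the lab-value case already uses.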
\- thomas

---

## Post #53 by @system

What about the next url: [http://www.cenorm.be/cenorm/businessdomains/businessdomains/generalstandards/uncertainty+of+measurement.asp](http://www.cenorm.be/cenorm/businessdomains/businessdomains/generalstandards/uncertainty+of+measurement.asp)

gerard

-- -- Gerard Freriks, arts Huigsloterdijk 378 2158 LR Buitenkaag The Netherlands T: +31 252 544896 M: +31 653 108732

---

**Canonical:** https://discourse.openehr.org/t/pathology-numeric-values-not-supported-in-dv-quantity/14516
**Original content:** https://discourse.openehr.org/t/pathology-numeric-values-not-supported-in-dv-quantity/14516