# Flavour of null

**Category:** [Technical (archive)](https://discourse.openehr.org/c/technical-archive/156)
**Created:** 2005-04-08 21:27 UTC
**Views:** 4
**Replies:** 31
**URL:** https://discourse.openehr.org/t/flavour-of-null/14479

---

## Post #1 by @Sam

Dear All

A reminder on why flavour of null is at the ELEMENT level: it allows a composition with mandatory data to be saved even if the data is not available, or allows a reason to be stated for data that is missing. It also allows us to deal with the HL7 flavour of null on the data types.

I am concerned that the flavour of null is set to DV_CODED_TEXT and not DV_TEXT (i.e. it has to be coded from a terminology). I agree that some systems will want things coded for safety in some situations, but I believe that this should be handled through archetypes and templates.

Laboratories will want to use this for all sorts of reasons; one clear example is when an electrolyte sample has haemolysed and they cannot give a potassium reading (they do not want to omit it!).

So I want to propose that the flavour of null is set to DV_TEXT.

Cheers
Sam Heard

---

## Post #2 by @grahamegrieve

Hi Sam

I've discussed this particular case at HL7 before, but I don't remember whether any answer was agreed. But to me, this case needs to be coded - there's a fairly small set of reasons why the laboratory would report that an answer was not available, and the reasons themselves may have meaning.

I advance this small hierarchical vocab:

+ NS - not suitable
  + HM - haemolysed
    + HM1
    + HM2 { rating for how haemolysed
    + HM3 { ? maybe a separate element
    + HM4
  + LP - lipaemic
    + LP1
    + LP2 { rating for how lipaemic
    + LP3 { ? maybe a separate element
    + LP4
  + WP - wrong preservative
  + INS - insufficient sample
+ ERR - handling error
  + AGE - too long to deliver to lab or other delivery problem
  + LACC - laboratory accident
  + FAIL - specimen could not be analysed for technical reasons that were not accidental

I may have missed some haem and micro specific reasons - I worked in the core lab.

Some Australian laboratories are reporting meaningless numbers and then reporting the error as a comment, rather than reporting a null value - so they can be paid. In spite of my strong clinical objection to this practice, this suggests that this isn't a null-flavour issue, and indeed, for lipaemic samples, except for a few analytes, I used to report the numbers and just note that the numbers were lower because of the volume effects.

So I think that this is a "laboratory quality indicator" that is a separate element to the actual value, since there are various cases where you'd want to report both - and I think this is worth modelling in the base pathology result archetype.

Grahame

---

## Post #3 by @system

I agree.

The function of a data type is to enable interpretation of a series of bits. It expresses one facet of a semantic notion.

The function of things above (other than) the data type is to indicate semantic notions. Including the semantic notion of Null.

I can't understand why HL7 views Null as part of the data type specs.

It is not a good sign that HL7 keeps the Null at this place after all these years of comments.

Gerard

Web Definitions of data type on the Web:

- The properties and internal representation that characterize data and functions. www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IBMp690/IBM/usr/share/man/info/en_US/xlf/html/UG137.HTM
- A set of data with values having predefined characteristics. Examples of data types are: integer, floating point, unit number, character, string, and pointer.
  Usually a limited number of data types are built into a programming language. The language usually specifies the range of values for a given data type, how the computer processes the values, and how they are stored. usnet03.uc-council.org/glossary/
- classification of data as either qualitative or quantitative ori.dhhs.gov/education/products/n_illinois_u/data_management/datamanagement/dmglossary.html
- Indicates the nature of the data found in the content portion of a data element. If applicable, Data Type is expressed as MIME type. Examples of data types identified to date (04/97) include: memory.loc.gov/ammem/techdocs/repository/attdefs.html
- A coding scheme recognized by system software for representing organizational data. www.cbu.edu/~lschmitt/I351/glossary.htm
- Data Type describes the format of the data element. The current Data Types in DOEInfo are as follows: Character, Numeric, Date. Element Name: Data_Type; Table: MetaData; Length: 30; Data Type: Character. https://mis.doe.gov/doeinfo/infoTerm.cfm
- The characteristic of a data element that describes whether it is numeric, alphabetic, or alphanumeric. www.oregoninnovation.org/pressroom/glossary.d-f.html
- (n.) a named category of data that is characterized by a set of values, together with a way to denote those values and a collection of operations that interpret and manipulate the values. For an intrinsic data type, the set of data values depends on the values of the type parameters. www.hpc.nw.ru/ENG/COURSES/fortgloss.html
- The characteristic of a field that determines what type of data it can hold. Data types include Boolean, Integer, Long, Currency, Single, Double, Date, String, and Variant (default). www.microsoft.com/resources/documentation/sts/2001/all/proddocs/en-us/siteadmin/wsagloss.mspx
- Structural metadata associated with digital data that indicates the digital format or the application used to process the data. www.cs.cornell.edu/wya/DigLib/new/glossary.html
- The attribute of a field that determines the kind of data the field can contain. www.nova.edu/techtrain/apptrain/access/accessglossary.html
- This indicates the structural format of data contained in the field. For example: N=Numeric A=Alphabetic AN=Alphanumeric criminaljustice.state.ny.us/dict/intro_details2.htm
- In programming, classification of a particular type of information. It is easy for humans to distinguish between different types of data. We can usually tell at a glance whether a number is a percentage, a time, or an amount of money. We do this through special symbols -- %, :, and $ -- that indicate the data's type. Similarly, a computer uses special internal codes to keep track of the different types of data it processes. www.angelfire.com/anime3/internet/programming.htm
- Unit of data storage in a software system. Some of Oracle's built-in data types are: NUMBER, CHAR, VARCHAR2, DATE, BLOB, CLOB, etc. Abstract data types can also be defined. orafaq.cs.rmit.edu.au/glossary/faqglosd.htm
- a set of values and one or more operations on the values. Examples include integer, real, rational, character, string, and list. www.cs.utexas.edu/users/novak/cs307vocab.html
- Classifications of data, including text, number, currency, dates, logical operators, and files. www.jqjacobs.net/edu/cis105/concepts/CIS105_concepts_13.html
- Each attribute has a data type that specifies what kind of data will be stored and the size, if appropriate. The default type is text with a default size of 50 bytes. Data types include: text, currency, integer, date/time, auto number and memo. bus.tsud.edu/cis3330/protected/glossary.htm
- characterizes (a subset of) data by the number of object classes and the attribute types of the attributes (that exist in this subset). Typical data types are rectangular and multi relational. atrey.karlin.mff.cuni.cz/~doug/magdon/terms.html
- Any one of several different types of data, such as integer, real, double precision, complex, logical, and Hollerith. Each has a different mathematical or logical significance and may have different internal representation. www.control.co.kr/dic/dic-d.htm
- The kind of data being stored or manipulated. In programming, the data type specifies the range of values that a variable or constant can hold, and how that information is stored in the computer memory. There are four basic types: Floating-point, Integer, Double and the String/character data type. www.angelfire.com/ny3/diGi8tech/DGlossary.html
- The kind of data stored in a field. NUMERIC, COMPUTED, and WORD-PROCESSING are examples of VA FileMan DATA TYPEs. www.hardhats.org/fileman/u1/glossary.htm
- A set of values. Optionally may include the set of operators used for manipulating those values. declaration statement - A statement that is used to specify the attributes (ie, properties) of an identifier. cisnet.baruch.cuny.edu/friedman/cplusplus/glossary.htm
- Total value of landings www.fao.org/DOCREP/003/X2465E/x2465e0g.htm
- The type of value represented by a constant, variable, or some other program object. Java data types include the integer types byte, short, int, and long; the floating-point types float and double; the character type char; and the Boolean type boolean. docs.rinet.ru/KofeynyyPrimer/ch38.htm
- A description on a field that determines what kind of information you can enter in the field. Field data types include Text, Memo, and Number.
  www.cs.rtu.lv/PharePub/Microsoft%20Access%2097%20Quick%20Reference/htm/ch09.htm
- In computer science, a datatype (often simply type) is a name or label for a set of values and some operations which can be performed on that set of values. Programming languages implicitly or explicitly support one or more datatypes; these types may act as a statically or dynamically checked constraint on the programs that can be written in a given language. en.wikipedia.org/wiki/Data_type

-- <private> --

Gerard Freriks, arts
Huigsloterdijk 378
2158 LR Buitenkaag
The Netherlands
+31 252 544896
+31 654 792800

---

## Post #4 by @thomas.beale

Grahame Grieve wrote:

> Hi Sam
>
> I've discussed this particular case at HL7 before, but I don't remember whether any answer was agreed. But to me, this case needs to be coded - there's a fairly small set of reasons why the laboratory would report that an answer was not available, and the reasons themselves may have meaning.
>
> I advance this small hierarchical vocab:
>
> + NS - not suitable
>   + HM - haemolysed
>     + HM1
>     + HM2 { rating for how haemolysed
>     + HM3 { ? maybe a separate element
>     + HM4
>   + LP - lipaemic
>     + LP1
>     + LP2 { rating for how lipaemic
>     + LP3 { ? maybe a separate element
>     + LP4
>   + WP - wrong preservative
>   + INS - insufficient sample
> + ERR - handling error
>   + AGE - too long to deliver to lab or other delivery problem
>   + LACC - laboratory accident
>   + FAIL - specimen could not be analysed for technical reasons that were not accidental
>
> I may have missed some haem and micro specific reasons - I worked in the core lab.
>
> Some Australian laboratories are reporting meaningless numbers and then reporting the error as a comment, rather than reporting a null value - so they can be paid. In spite of my strong clinical objection to this practice, this suggests that this isn't a null-flavour issue, and indeed, for lipaemic samples, except for a few analytes, I used to report the numbers and just note that the numbers were lower because of the volume effects.
>
> So I think that this is a "laboratory quality indicator" that is a separate element to the actual value, since there are various cases where you'd want to report both - and I think this is worth modelling in the base pathology result archetype.

I agree 100% - I don't see this as a flavour of null problem, because flavour of null is/should be about:

- inability to provide a value to the computer system at runtime.

A possible value set I have proposed in the past:

- **no information**: No information provided; nothing can be inferred as to the reason why, including whether there might be a possible applicable value or not
- **not available** (unknown): A possible value exists but is not provided (ask user)
- **masked**: The value has not been provided due to privacy settings (settable by extract / message serialiser)
- **not applicable**: No valid value exists for this data item in this context (should be knowable by application)

This value set works for all contexts, is independent of setting, and (I believe) should be settable by software. I have my doubts as to whether there is any mileage in having the first two distinct.

In any case, this idea of null value is only partly the same as the use case here. In the lab situation, some information items are not available, so you could set the null flavour to "not available", but the actual reasons for this are specific to the setting and the test. Clearly we cannot have a single vocabulary for flavour-of-null which rolls in the value sets of all the possible flavours-of-null for all settings, tests etc, such as the one Grahame has used above.
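
The split argued for here, a small generic null flavour that software can query plus a setting-specific indicator modelled separately, could be sketched roughly as follows. This is only an illustration in Python, not openEHR's actual model: `Element`, `LabResult`, `quality_indicator` and `missing_values` are hypothetical names, the `"HM"` code is borrowed from Grahame's vocabulary, and the sodium value is made up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class NullFlavour(Enum):
    """Generic, setting-independent null flavours (the four values listed above)."""
    NO_INFORMATION = "no information"
    NOT_AVAILABLE = "not available"
    MASKED = "masked"
    NOT_APPLICABLE = "not applicable"


@dataclass
class Element:
    """Hypothetical stand-in for an ELEMENT: holds a value or a null flavour."""
    name: str
    value: Optional[float] = None
    null_flavour: Optional[NullFlavour] = None


@dataclass
class LabResult:
    """Hypothetical pathology result; the quality indicator is a separate item."""
    analyte: Element
    quality_indicator: Optional[str] = None  # lab-specific code, e.g. "HM"


def missing_values(results: List[LabResult], flavour: NullFlavour) -> List[str]:
    """Generic query: needs no knowledge of any lab-specific vocabulary."""
    return [r.analyte.name for r in results if r.analyte.null_flavour is flavour]


results = [
    # Haemolysed sample: no potassium value, generic flavour plus lab-specific code.
    LabResult(Element("potassium", null_flavour=NullFlavour.NOT_AVAILABLE), "HM"),
    # Illustrative value only.
    LabResult(Element("sodium", value=140.0)),
]
print(missing_values(results, NullFlavour.NOT_AVAILABLE))
```

The query only touches the generic enum, so it would work unchanged for any setting, while the lab-specific code travels alongside the value rather than inside the null flavour.
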

One solution initially appears to be to allow the flavour of null vocabulary itself to be settable (i.e. that in archetypes you could set a different flavour of null vocabulary depending on the field), but this is flawed, since we still want a generic flavour of null value (e.g. one of the 4 above), so that querying can work properly. So we either need two flavour of null values for each value field - one generic, one specific to setting & context - which seems somewhat excessive, or we need to regard the specific "flavour of null" as something else, probably something like a lab quality indicator as Grahame suggests.

I agree that pathology archetypes should probably include such indicators explicitly in their model of data.

\- thomas beale

---

## Post #5 by @thomas.beale

Gerard Freriks wrote:

> I agree.
>
> The function of a data type is to enable interpretation of a series of bits. It expresses one facet of a semantic notion.
>
> The function of things above (other than) the data type is to indicate semantic notions.
> Including the semantic notion of Null.
>
> I can't understand why HL7 views Null as part of the data type specs.
>
> It is not a good sign that HL7 keeps the Null at this place after all these years of comments.

yes, well, that's another issue with it....in openEHR and CEN it is modelled based on the real world situation (i.e. that null flavour is not inherently part of a data value, but only part of its use in some context, e.g. in ELEMENT.value), although I can still imagine improvements. But we need to walk (steadily) first.

\- thomas

---

## Post #6 by @William_E_Hammond

Maybe in the US it is only a US realm solution. I agree that it seems to be stretching it to be part of a data type. It is clearly a value response.

Ed Hammond

---

## Post #7 by @Sam

Dear All

OK, I was just checking to see where the detailed reason for a result being NULL should be and am happy to build this into the laboratory archetype in the way Grahame suggests, and to leave the generic Flavour of NULL as Tom thinks.

Cheers, Sam

---

## Post #8 by @Elkin_Peter_L_M.D

Sam,

By way of a friendly amendment, I would say that the Information Model of Null should include both the type of Null and separately the reason for it being Null as separate attributes (employing an Ontology of Null). I agree that Null should be part of the Information Model explicitly rather than a datatype. For example:

Null

- Unknown
- Not available
- Not evaluated
- Insufficient Information
- Result out of Valid Range
- Testing yielded no value

Reason_for_Null

- Lost the Sample
- Specimen destroyed
- Sample Hemolyzed
- Sample Lipemic
- Sample too long in transport
- Specimen Clotted
- Machine Error
- Human Error
- Etc.

Warm regards,

Peter

Peter L. Elkin, MD
Professor of Medicine
Director, Laboratory of Biomedical Informatics
Department of Internal Medicine
Mayo Clinic, College of Medicine
Mayo Clinic, Rochester
(507) 284-1551
Fax: (507) 284-5370

---

## Post #9 by @Koray_Atalag

Dear All,

I have been pretty busy with an ambitious EU FP6 project proposal related to my thesis work, as most of you are aware: CEREBRUS, which we were not able to finish by the March 22 deadline. But we decided to move on, wait for the next calls and look for other funding/support opportunities. Even the proposal activity is Open Source and at SF.NET: http://cerebrus-fp6.sourceforge.net

All these messages about "Flavours of Null" are indeed very useful solution proposals, I think, pointing to the very fact that "Null" is not such an "easy" entity to handle... Apart from the examples in the clinical lab and quality measures (I think in the US CLIA 88 regulations), there are many other contexts where this paradigm is a real problem, such as the Bethesda System 2001 in cervical smear reporting. I have been working on this since 2001 and in fact developed many working information systems (freely available as open source from SF.NET: http://sourceforge.net/projects/pathos-web/ ).

I think we should not add too many meanings/information/pre-information/contextual information to a single entity, and we had better separate the levels of:

A) data
B) information
C) knowledge

in approaching the problem of "NULL"...

In order to make "sensible", really "implementable" and "user-friendly" systems, the approach to the problem has to be restated as below, possibly formulated as a methodology, and solved as in my proposal:

Problem statement and initial analysis from a historical perspective:

1) From a computational point of view: we should evaluate whether it makes any "sense" to a computational/information system, i.e. whether all the additional "contextual/information/knowledge" level attributes are of any use or change the "result/response" expected from it. The current "digital", or I should say "binary", computers do not understand null at the lowest level. It is either "present/positive/one" or "absent/negative/zero". That is why the "boolean/bit" data type is modelled in all early programming and database systems, and it is very useful for many systems in many domains.

2) As informatics evolved, real systems had to incorporate the "empty/null" aspects of data... Then two bits are appended to incorporate this "more than 1 bit" of information, and in many cases it is not needed and 1 bit is lost for nothing: more memory, more storage, etc. Even the Y2K problem originated from such an approach. This is strongly related to "Fuzzy Systems", and there is a vast amount of research and solutions as far as I know. I think in the end we will probably need to change the very architecture of our computer systems and storage technologies so as to handle this: 3-state and continuous/near-analogue processors and storage schemes.

3) In complex domains such as clinical medicine, we are also adding the "Flavours of Null" and now considering incorporating them into the very heart of our models: the Data Types... If this happens then we will be spending a whole lot of the memory/storage and eventually processing performance of our "next-generation" systems, as the Americans did with their cars back in the 60s...

Methodology:

1) Analyse the data values expected to be encountered: if they are mostly 0/1, Yes/No, True/False, Positive/Negative, then do not bother with flavours of Null; use the "good-old" technique and spend 1 bit.

2) If there is a need to add more information to the value (information/context), then start with the "essential" flavours of null - the true natural features, not the ones we humankind created to make this complex world even more complicated! These are in fact the "context and domain" free aspects of being "Null" or "Empty", as Thomas Beale pointed out in his nice and useful message:

   - *no information*: No information provided; nothing can be inferred as to the reason why, including whether there might be a possible applicable value or not
   - *not available* (unknown): A possible value exists but is not provided (ask user)
   - *masked*: The value has not been provided due to privacy settings (settable by extract / message serialiser)
   - *not applicable*: No valid value exists for this data item in this context (should be knowable by application)

   This can be further discussed and improved in a more "ICT philosophical" way, I believe...

3) Put all the "non-essential" and "contextual" components in a "contextual archetype-component", as in the solution for "protocols", and employ the "knowledge" aspects, meaning which information is appropriate in that particular situation/context. So this is a Knowledge Enabled Contextual Plug-In or Add-On approach, as I propose on this beautiful Sunday morning! But they should all map to the "essential" information entities given in Methodology 2.
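
The third step of this methodology, contextual vocabularies that map back onto the essential flavours, might look something like the sketch below. It is a rough Python illustration under stated assumptions: the reason strings, the `LAB_REASON_TO_ESSENTIAL` table and the choice of which essential flavour each reason maps to are hypothetical, not something agreed in this thread.

```python
from enum import Enum
from typing import Dict


class EssentialNull(Enum):
    """The context-free "essential" flavours from Methodology 2."""
    NO_INFORMATION = "no information"
    NOT_AVAILABLE = "not available"
    MASKED = "masked"
    NOT_APPLICABLE = "not applicable"


# Hypothetical contextual "plug-in" for one setting (a pathology lab).
# Both the reason strings and their targets are illustrative assumptions.
LAB_REASON_TO_ESSENTIAL: Dict[str, EssentialNull] = {
    "haemolysed": EssentialNull.NOT_AVAILABLE,
    "insufficient sample": EssentialNull.NOT_AVAILABLE,
    "laboratory accident": EssentialNull.NOT_AVAILABLE,
}


def essential_flavour(reason: str, mapping: Dict[str, EssentialNull]) -> EssentialNull:
    """Map a contextual reason to an essential flavour.

    Unknown reasons fall back to NO_INFORMATION, since nothing can be
    inferred about them.
    """
    return mapping.get(reason, EssentialNull.NO_INFORMATION)


print(essential_flavour("haemolysed", LAB_REASON_TO_ESSENTIAL).value)
print(essential_flavour("unrecognised code", LAB_REASON_TO_ESSENTIAL).value)
```

Each setting would supply its own table, so generic software only ever needs to understand the four essential values.
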

I would recommend that you all examine closely how openSDE approaches this problem in its newer version developed by the Erasmus team, mainly by Astrid van Ginneken and Marcel de Wilde... There are many years of practice and feedback from real clinical use and users behind this work. It is also Open Source and freely available from SF.NET: http://sourceforge.net/projects/opensde/

At the end of the day, the information/computational system makes use of the true/unknown/false aspects of information provided by the user to make inferences or just produce reports, etc.

I also want to point out that the real "Null" concept is a problem of infinity in math: you can in fact model and represent any value with a continuous number space between -1 0 +1 ....

There is not much debate on the "Flavours of True/Positive or False/Negative", but in fact they also exist, and in clinical medicine they would be better presented with at least sensitivity/specificity measures. But the current debate on "Flavours of Null" is more complicated, as it is an essential paradigm that is present in the universe in many interesting domains.

So I think all these concepts have to be approached a little "philosophically" but solved in a "sensible and practical" way.

So my proposal in short is:

1) Examine the possible data values to be expected in a particular field: if it can be solved with a simple True/False then assign a bit.
2) If not, then employ the "Essential Flavours of Null", which should appear as a separate Data Type, I believe.
3) If extra contextual information is needed at design time or will probably be needed in future (this requires a careful study taking into consideration all viewpoints: legal, epidemiological, etc.), then assign "Knowledge-Enabled Contextual Archetype Plug-Ins" and provide mappings to the essential ones.

That is my "simple" solution proposal that I have been thinking deeply about for some years!

Best regards,
Dr.
Koray Atalag
METU Informatics Institute
Ph.D. Candidate on Information Systems

---

## Post #10 by @thomas.beale

Elkin, Peter L., M.D. wrote:

> Sam,
>
> By way of a friendly amendment, I would say that the Information Model of Null should include both the type of Null and separately the reason for it being Null as separate attributes (employing an Ontology of Null). I agree that Null should be part of the Information Model explicitly rather than a datatype. For example:
>
> Null
>
> - Unknown
> - Not available
> - Not evaluated
> - Insufficient Information
> - Result out of Valid Range
> - Testing yielded no value
>
> Reason_for_Null
>
> - Lost the Sample
> - Specimen destroyed
> - Sample Hemolyzed
> - Sample Lipemic
> - Sample too long in transport
> - Specimen Clotted
> - Machine Error
> - Human Error
> - Etc.

Peter,

I think if we go that way, we need to recognise that the reason_for_null won't be a single vocabulary - it will be many. The values you give are all pathology related; but I can imagine a vocabulary set that deals with psychiatric interviews (when patients may be refusing or failing to provide answers for a whole variety of reasons), with general contacts (when a patient just can't remember family history events), and so on.

Then there is the question of whether the reason_for_null should be a second field next to the flavour of null (in openEHR - in ELEMENT), or only occur in archetyped information structures where it makes sense (on the assumption that there is no reason_for_null vocabulary a lot of the time). We are currently working on the latter assumption, but someone may be able to prove that it should be otherwise.

I think it all comes down to two things:

- what is flavour of null/reason for null used for (i.e. what queries does it satisfy)?
- what is the generality of any particular solution?

I suspect that the best approach in a better world would be to have a more powerful ontology, e.g. one in which many small reason_for_null vocabularies have IS-A relationships with more general null terms like those in your first list. However, I doubt if this is generally available in a practical sense for the time being.

\- thomas

---

## Post #11 by @Elkin_Peter_L_M.D

Very sensible. I agree.

Peter

Peter L. Elkin, MD
Professor of Medicine
Director, Laboratory of Biomedical Informatics
Department of Internal Medicine
Mayo Clinic, College of Medicine
Mayo Clinic, Rochester
(507) 284-1551
Fax: (507) 284-5370

---

## Post #12 by @Philippe_AMELINE1

Hi to all,

This is fine, and I will immediately start adding these concepts to Odyssee's ontology, with the proper IS A relationships.

However, I would put the "reasons for null" as children (and not brothers) of the "flavour of null", since these reasons may vary depending on the flavour; for example, you cannot have "Result out of Valid Range" because "Lost the Sample".

Furthermore, the "flavours of null" and the valid "reasons for it" are very dependent on the information so qualified. For example, if you expect the left ventricular ejection ratio measured from echocardiography instead of a lab result, you will probably provide specific flavours of null with accurate reasons for them (for example, patient's obesity leading to a bad quality exam).

In Odyssee, we can do it with Fils guides.

However, the genuinely hard task is on the interface side: where there was just an "edit field" waiting for a numerical value (and its unit), we must provide the ability to enter descriptive items.

Cheers,

Philippe

---

## Post #13 by @Philippe_AMELINE1

Hi Koray,

Don't you think that "Null" is not a singularity (I mean an isolated point), but the extreme value of a linear cursor we could name "validity" or "confidence"?

To give a matter-of-fact example, I could say that:

- I can provide a value without any comment: I am confident in the quality level of the measurement process
- I can provide a value saying that an average (or poor) level of quality must be noted when using this information
- I can decide not to provide a value and explain why

This is close to error bars in scientific papers; I don't mean you must provide a calculated accuracy level (it is usually not possible), but that when a piece of information was not obtained with the usual level of precision, it should always be noted. For example, some calculated values use a measured value raised to the power 3 - you can imagine how much the errors are amplified.

So, maybe we should always provide room for a validity indicator (that would become the list of reasons for null when "flavours of null" replace the requested value).

Cheers,

Philippe

---

## Post #14 by @thomas.beale

Philippe AMELINE wrote:

> Hi Koray,
>
> Don't you think that "Null" is not a singularity (I mean an isolated point), but the extreme value of a linear cursor we could name "validity" or "confidence"?
>
> To give a matter-of-fact example, I could say that:
>
> - I can provide a value without any comment: I am confident in the quality level of the measurement process
> - I can provide a value saying that an average (or poor) level of quality must be noted when using this information
> - I can decide not to provide a value and explain why

Hi Philippe,

our analysis in GEHR/openEHR has always been that confidence and null-flavour are two different things:

- null / data quality - indicates that some datum was not obtainable
- confidence is how likely a datum is to be correct, in the opinion of the health care professional (or maybe someone else); it can only be set when there is a value

\- thomas

---

## Post #15 by @Dr_LONJON_Roger

Hello Philippe and Thomas,

excuse me for intervening in poor English.

In medicine, for me, a result must be validated and signed by its producer. Such a result therefore automatically carries total confidence. This is a very important notion legally when these results are made available in a shared medical record (web server).

Conversely, if the result is approximate, with a significant margin of error, it is not validated data and therefore not publishable, because the liability consequences for its author are unforeseeable if the patient files a complaint.

I do not know whether this aspect of the problem enters into your thinking.

Cordially

Dr R LONJON
France

---

## Post #16 by @Philippe_AMELINE1

Hi Tom,

I agree with you. My idea was that if we express the process quality, and consider data measured during this process, we end up with a range of data qualities - from "plain" to "not measurable" - and this last one leads to a Null value.

For example, if I say that my echocardiography (the process) was globally of poor quality due to the patient's overweight, I can still correctly measure some information, have some other items evaluated as "approximate" but with a value provided, and some others being Null.

So, as a summary, I would say that:

1) confidence applies only when there is a value; it can be a numerical rating (0 to 100% in Odyssee)
2) a Null value can have an explanation for null
3) the process itself should have a quality descriptor, and it would be nice to have a relationship between this global information and "local" data confidence.
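
These three rules can be written down as simple constraints. The following is a minimal Python sketch under stated assumptions, not Odyssee's or openEHR's actual data model: `Measurement`, `Exam`, the field names and the numbers in the example are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Measurement:
    """One datum: either a value (optionally with confidence) or a null with a reason."""
    name: str
    value: Optional[float] = None
    confidence: Optional[float] = None  # percent, 0 to 100; rule 1
    null_reason: Optional[str] = None   # explanation for a missing value; rule 2

    def __post_init__(self) -> None:
        if self.value is None and self.confidence is not None:
            raise ValueError("confidence applies only when there is a value")
        if self.value is not None and self.null_reason is not None:
            raise ValueError("null_reason applies only when there is no value")
        if self.confidence is not None and not 0 <= self.confidence <= 100:
            raise ValueError("confidence must be between 0 and 100")


@dataclass
class Exam:
    """The process, carrying its own global quality descriptor; rule 3."""
    name: str
    process_quality: str
    measurements: List[Measurement] = field(default_factory=list)


# Illustrative numbers only: a poor-quality echocardiography yields one
# approximate value and one value that could not be measured at all.
echo = Exam(
    name="echocardiography",
    process_quality="poor (patient overweight)",
    measurements=[
        Measurement("ejection fraction", value=55.0, confidence=60.0),
        Measurement("wall thickness", null_reason="not measurable"),
    ],
)
print([m.name for m in echo.measurements if m.value is None])
```

How the global `process_quality` should constrain each local `confidence` is left open here, as it is in the post.
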

What led me to think that way is that among the reasons for Null that were proposed in former messages, most of them were indeed qualifiers for the process as a whole (and I can even say that some were qualifiers for the workflow, the process above the process). I wonder how we can "chain" the quality indicators in the workflow-process-data sequence; for example, an exam can be technically good, but lead to wrong information because it was performed too early or too late in a workflow.

Cheers,

Philippe

---

## Post #17 by @system

I can have 100% confidence that the patient remained silent when asked: "Do you have HIV?" by refusing to answer the question.

Gerard

-- <private> --

Gerard Freriks, arts
Huigsloterdijk 378
2158 LR Buitenkaag
The Netherlands
+31 252 544896
+31 654 792800

---

## Post #18 by @thomas.beale

Dr LONJON Roger wrote:

> Hello Philippe and Thomas,
>
> excuse me for intervening in poor English.
>
> In medicine, for me, a result must be validated and signed by its producer. Such a result therefore automatically carries total confidence. This is a very important notion legally when these results are made available in a shared medical record (web server).
>
> Conversely, if the result is approximate, with a significant margin of error, it is not validated data and therefore not publishable, because the liability consequences for its author are unforeseeable if the patient files a complaint.
>
> I do not know whether this aspect of the problem enters into your thinking.

It is actually quite common: consider that in a differential diagnosis, confidences are always expressed in each of the possible diagnoses, e.g. 90%, 9%, 1% for possible reasons for a child's fever. I don't see it as being about mistakes; it's about the estimation by a clinical professional of the probability of correctness of an opinion. In openEHR, confidences always appear in data of the EVALUATION type.
There is no question of clinician confidence in OBSERVATIONs - they are for all intents and purposes objective. Of course, machines may have limited accuracy (inbuilt error) and numeric results may be reported with limited precision; these situations can be archetyped.

- thomas

---

## Post #19 by @thomas.beale

Thomas Beale wrote:

> Philippe AMELINE wrote:
>
>> Hi Koray,
>>
>> Don't you think that "Null" is not a singularity (I mean an isolated point), but the extreme value of a linear cursor we could name "validity" or "confidence"?
>>
>> To give a matter-of-fact example, I could say that:
>>
>> I can provide a value without any comment: I am confident in the quality level of the measurement process
>> I can provide a value saying that an average (or poor) level of quality must be noted when using this information
>> I can decide not to provide a value and explain why
>
> Hi Philippe,
>
> our analysis in GEHR/openEHR has always been that confidence and null-flavour are two different things:
> - null / data quality - indicates that some datum was not obtainable
> - confidence is how likely a datum is to be correct, in the opinion of the health care professional (or maybe someone else); it can only be set when there is a value

I should modify this: it is the likelihood of a datum (e.g. a certain level of blood sugar) indicating a particular problem or condition.

- thomas

---

## Post #20 by @lavanian

That's a critical requirement that's normally taken care of by any standard software with the help of digital signatures, data encryption and hardware/biometric authentication of the EHR, whether it be in a HIS, RIS or a Telemedicine app.
With warm regards,

Wg Cdr (Retd) Dr D Lavanian
MBBS, MD, Prim Av Med, MISHWM, MISAM
Certified HL7 V2.3 Specialist
Domain Expert & Business Manager - Telemedicine
Apollo Health Street Ltd
Apollo Hospitals, Jubilee Hills, Hyderabad, India
Tel: +91-40-23554350
Fax: +91-40-23554354
lavanian_d@apollolife.com
Mobile: +91-9885023504

---

## Post #21 by @Elkin_Peter_L_M.D

Dear Roger and Thomas,

We have looked extensively at multivalued logic for quantitating uncertainty. It turns out that most folks in that world have taken 0 as false and 1 as true, with a number of discrete, usually equally spaced values in between for uncertainty.

After a long-winded go-around with a Prof of Philosophical Logic at Princeton (Dr. Graham), we determined that there are at least three reproducible types of uncertainty (with good inter-rater reliability) and about seven semantic categories.

The types are Probable (our guess is around 85% true +/- 5%), Unlikely (our guess is around 15% true +/- 5%) and Just as likely as not (again our guess is around 50% +/- 15%). These numbers come from the average PPV of the evidence when a physician "makes a diagnosis" and the NPV when a physician rules one out.

Other distinctions are less reproducible. Taken together, most clinicians would say that Probable is stronger than Likely; however, the assignment to actual cases is not, in our experience, reproducible between knowledgeable reviewers.

I suggest that you first code (as we do) True, False or Uncertain. Then qualify Uncertain with a semantic type indicating strength. This allows a model that can grow with our ability to represent evidence-based medicine more closely.

Warm regards,

Peter

Peter L. Elkin, MD
Professor of Medicine
Director, Laboratory of Biomedical Informatics
Department of Internal Medicine
Mayo Clinic, College of Medicine
Mayo Clinic, Rochester
(507) 284-1551
Fax: (507) 284-5370

---

## Post #22 by @thomas.beale

Elkin, Peter L., M.D.
wrote:

> Dear Roger and Thomas,
>
> We have looked extensively at multivalued logic for quantitating uncertainty. It turns out that most folks in that world have taken 0 as false and 1 as true, with a number of discrete, usually equally spaced values in between for uncertainty.
>
> After a long-winded go-around with a Prof of Philosophical Logic at Princeton (Dr. Graham), we determined that there are at least three reproducible types of uncertainty (with good inter-rater reliability) and about seven semantic categories.
>
> The types are Probable (our guess is around 85% true +/- 5%), Unlikely (our guess is around 15% true +/- 5%) and Just as likely as not (again our guess is around 50% +/- 15%). These numbers come from the average PPV of the evidence when a physician "makes a diagnosis" and the NPV when a physician rules one out.

[with appropriate excuses in advance for my engineer's view of clinical things ;-]

I presume that these values (which seem entirely reasonable to me) were obtained by a statistical study of clinicians' notes? Or interviews? But the problem we are always concerned with is: what does one clinician mean when s/he says "probable rheumatoid arthritis"? We can't assume it can be translated into 85% +/- 5%, can we? The particular physician who said it might habitually and unconsciously put "probable" all over the place, when they should really put "possible".

Sam's point of view so far has been: make them enter a number (prompt = "% probability of being true" or similar). I know that doesn't address the perfectly reasonable need to allow clinical people to write "probable", "possible" etc., so maybe it's not a long-term answer.

But let's just consider what doing clinical medicine is about: it's just scientific problem-solving.
The goal is to fix a problem (with the patient); the method is to iteratively gather information until a conclusion (diag = Rh Arthritis) can be drawn or a decision can be made (commence ibuprofen). Fixing a problem may involve many repetitions of this until the problem is fixed. Now, whenever (lack of) confidence or uncertainty occurs, it means that we don't have enough information to make a decision or draw a conclusion, at least not the next one in the chain. But we do have an indication of what to do next - usually gather more information.

So perhaps the way we view words like "possible", "likely", "probable" should be as motivators to perform more actions to reduce the uncertainty. If a doctor writes "possible malaria" re: a patient just back from a holiday in Vietnam, with heavy flu-like symptoms, the obvious implication is to do the appropriate microscopy and other diagnostic procedures for malaria, to rule it out or otherwise. For most diseases, a diagnostic algorithm or guideline is available, and the physician having used a word implying uncertainty just means that the diagnostic process is currently at some interior node of such a guideline tree. The key question is probably _which_ of the possible next steps to rule out / rule in one of the differential diagnoses to do, and in which order - i.e. which is cheapest, fastest, most relevant to patient health etc.

So my question to clinicians is this: doesn't a note containing "possible X", "likely Y" really imply a differential diagnosis, even if only one of the possibilities is actually noted? If so, it may not matter so much what the level of uncertainty is; what matters (among other things) is the severity of the consequences of any of the possible branches of the differential diagnosis. E.g.
if one of the implied or noted branches of a differential diagnosis for a patient presenting with fever is malaria, presumably both patient and doctor want to discount it as fast as possible, and pursue the appropriate steps to do so. But if none of the branches is life-threatening, a reasonable action may be "wait 12 hours" and re-assess. The very common situation of an infant presenting with fever must present such a quandary daily.

I'm wondering if there is a meta-algorithm of some sort lurking behind the scenes, which takes account of uncertainty in a note, and also severity of non-discounted possibilities, as a way of deciding what to do next. There is undoubtedly published work on this...

thoughts?

- thomas beale

---

## Post #23 by @Tim_Churches

Thomas Beale wrote:

> I'm wondering if there is a meta-algorithm of some sort lurking behind
> the scenes, which takes account of uncertainty in a note, and also
> severity of non-discounted possibilities, as a way of deciding what to
> do next. There is undoubtedly published work on this...

This is a very brief but reasonable introduction to Bayesian probability (which includes calculation of utility), which is what I think you are grasping at: http://en.wikipedia.org/wiki/Bayesian_probability

Tim C

---

## Post #24 by @thomas.beale

Tim Churches wrote:

> Thomas Beale wrote:
>
>> I'm wondering if there is a meta-algorithm of some sort lurking behind
>> the scenes, which takes account of uncertainty in a note, and also
>> severity of non-discounted possibilities, as a way of deciding what to
>> do next. There is undoubtedly published work on this...
>
> This is a very brief but reasonable introduction to Bayesian probability
> (which includes calculation of utility), which is what I think you are
> grasping at: http://en.wikipedia.org/wiki/Bayesian_probability

Hi Tim, and there are quite a few decision support products based on Bayesian logic as well.
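As a rough illustration of the Bayesian reasoning referred to here, the sketch below updates the probability of a disease after one test result and then weighs the expected harm of waiting against the cost of testing. It is a minimal sketch, not a description of any decision support product: the function names, the single-test model, the simple "harm versus cost" rule, and all numbers are assumptions made up for illustration.

```python
def posterior(prior: float, sensitivity: float, specificity: float, positive: bool) -> float:
    """Probability of disease after one test result, by Bayes' theorem."""
    if positive:
        hit = sensitivity * prior                          # P(test+ and disease)
        false_alarm = (1.0 - specificity) * (1.0 - prior)  # P(test+ and no disease)
        return hit / (hit + false_alarm)
    miss = (1.0 - sensitivity) * prior                     # P(test- and disease)
    correct_negative = specificity * (1.0 - prior)         # P(test- and no disease)
    return miss / (miss + correct_negative)


def worth_testing_now(p_disease: float, harm_if_missed: float, test_cost: float) -> bool:
    """Test now if the expected harm of waiting exceeds the cost of testing."""
    return p_disease * harm_if_missed > test_cost


# Made-up numbers, for illustration only.
after_positive = posterior(prior=0.10, sensitivity=0.90, specificity=0.90, positive=True)
after_negative = posterior(prior=0.10, sensitivity=0.90, specificity=0.90, positive=False)
```

With these made-up inputs, a positive result raises a 10% prior to about 50%, while a negative result lowers it to a little over 1%. The second function shows why severity matters as much as probability: the same 5% chance can justify immediate testing for a dangerous condition and watchful waiting for a benign one.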
But I wonder if they have been applied to the problem of determining next best steps based not just on clinical data so far, but also on cost, duration, and perceived severity of the consequences of not doing something. And I think that Bayesian products should take as inputs only weightings proven by population studies, whereas physician belief is often supported by informal but often quite accurate personal experience (i.e. experience of the patient population of the practice).

In any case, can we argue that there is no point caring about any finer gradations of true/false than true/false/maybe, as Peter Elkin has said they are doing at Mayo?

- thomas

---

## Post #25 by @b.cohen

I agree that clinical diagnosis is about problem solving, although its 'scientific' credentials are often rather weak.

From a scientific point of view, one's 'confidence' in a hypothesis, e.g. a diagnosis, does not correspond to a quantifiable 'likelihood of truth' (as Popper has insisted for years) but to the set of hypotheses all of which account for the facts observed so far. Scientific problem solving seeks optimal ways of making observations that could refute as many members of this set as possible. Certainty is reached when the set of valid hypotheses is singular. Note that this certainty is not equivalent to truth, since the 'true' hypothesis may be one that one has not yet been able to articulate; hence Kuhn's 'paradigm shift'.

In clinical medicine, 'Clinical Practice Guidelines' (CPGs) provide what practitioners have discovered, so far, to be optimal strategies for diagnostic testing in certain areas.

Quoting Thomas Beale <thomas@deepthought.com.au>:

---

## Post #26 by @Elkin_Peter_L_M.D

Dear Thomas,

I think we need clinicians to be more precise in these declarations. If we begin to train clinicians that Probable should mean ~85% probability +/- 5%, then we will move closer to stability.
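The coding scheme suggested earlier in the thread (code True, False or Uncertain first, then qualify Uncertain with a strength) could be sketched as follows. The percentage bands are the estimates quoted in the thread; the type names `Truth`, `Strength` and `CodedAssertion` are hypothetical and do not come from any terminology or from the Mayo system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Truth(Enum):
    TRUE = "true"
    FALSE = "false"
    UNCERTAIN = "uncertain"


class Strength(Enum):
    # (centre, half-width) in percent, as quoted in the thread
    PROBABLE = (85, 5)
    AS_LIKELY_AS_NOT = (50, 15)
    UNLIKELY = (15, 5)

    def interval(self) -> Tuple[int, int]:
        """Return the (low, high) percentage band for this qualifier."""
        centre, half_width = self.value
        return (centre - half_width, centre + half_width)


@dataclass(frozen=True)
class CodedAssertion:
    statement: str
    truth: Truth
    strength: Optional[Strength] = None

    def __post_init__(self) -> None:
        # The strength qualifier is only meaningful for an uncertain assertion.
        if self.strength is not None and self.truth is not Truth.UNCERTAIN:
            raise ValueError("strength qualifies only an UNCERTAIN assertion")


note = CodedAssertion("rheumatoid arthritis", Truth.UNCERTAIN, Strength.PROBABLE)
```

Keeping the three-valued code separate from the qualifier means that further strength categories could be added later without touching records that were coded only as True, False or Uncertain.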
Although the goal of reducing uncertainty is in general laudable, there are some problems that crop up. First, clinicians are usually only about 90% sure by evidence when they "make a diagnosis", if looked at from an EBM perspective. Also, the path to reduction of uncertainty takes into account what prior data is available, the risk-benefit ratio of obtaining each piece of data, and patient preference.

Interesting but not easy.

Peter

Peter L. Elkin, MD
Professor of Medicine
Mayo Clinic College of Medicine

---

## Post #27 by @system

-1- Almost never a diagnosis is 100% certain.
-2- Almost always a test result has uncertainty attached to it.
-3- Many times a conclusion is reached based on many uncertain and conflicting facts.
-4- Quite often a condition, a diagnosis, is assumed that gives rise to a treatment. This does not indicate that the patient is suffering from this condition; the treatment is used as a test procedure. Doing nothing is such a test procedure.

Eric Wulff (from Denmark) published philosophical texts about health care and these topics.

gerard

-- <private> --
Gerard Freriks, arts
Huigsloterdijk 378
2158 LR Buitenkaag
The Netherlands

+31 252 544896
+31 654 792800

---

## Post #28 by @Arild_Faxvaag1

Hi all.

This is an important topic. Here are some references / pointers for those who wish to read more:

"Decision making in health and medicine. Integrating evidence and values"
Myriam Hunink and Paul Glasziou
Cambridge University Press (ISBN 0 521 77029 7)

Society for Medical Decision Making: http://www.smdm.org/

I also recommend journal articles written by Vimla L Patel (Columbia University, New York), for instance:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=11418539
(A primer on aspects of cognition for medical informatics)

regards
Arild Faxvaag

On 22 Apr. 2005, at 07.42, Gerard Freriks wrote:

---

## Post #29 by @lakewood

Hi Arild,

Another site is the MIT Group on Clinical Decision Making: [ http://medg.lcs.mit.edu/ ].

"... a research group dedicated to exploring and furthering the application of technology and artificial intelligence to clinical situations. Because of the vital and crucial nature of medical practice, and the need for accurate and timely information to support clinical decisions, the group is also focused on the gathering, availability, security and use of medical information throughout the human "life cycle" and beyond ..."

Unfortunately, patient decision-making receives less emphasis, and studies seem to miss some fundamental factors (e.g., it is private) [ http://www.ahrq.gov/research/rtisumm.htm ]

Regards!

-Thomas Clark

Arild Faxvaag wrote:

---

## Post #30 by @Patrick_Lefebvre

> -1- Almost never a diagnosis is 100% certain.

Most of the healthcare professionals I have met consider that the term "Diagnosis" means "working hypothesis."

Regards,

-- Patrick Lefebvre ( plefebv@free.fr ) --
"What I write commits only me, and only until my next idea."

---

## Post #31 by @system

> We could say that physicians _infer_ diagnostic hypotheses based on
>
> - knowledge of the tentative underlying disease,
> - the patient's subjective experiences
> - phenomena registered in the patient's body

In any case it is a subjective statement and is a professional opinion based on more (lab results, x-rays) or less (patient history) objective data.

> Phrases such as "cannot be excluded", "might be due to", "probably", "definitely", "beyond doubt" are statements of probability of the inference being correct (and what to do next).

Inferences expressed in the subjective statements documenting the treatment of the patient.
> Can one say that diagnoses belong to the class of statements whereas the disease itself belongs to the class of natural phenomena?

Disease is an abstraction of reality that **for the moment, for the next decision** is considered to represent the reality about the health of the patient. Diagnosis is the professional but subjective opinion about a disease of a patient.

There is a continuum:

- Real pathological, physiological phenomena in a patient.
- Certain manifestations of these phenomena.
- That are (or are not) experienced by the patient or another person.
- The arousal of distress, anxiety, etc., triggering a visit to a physician.
- What is said (or not) about the manifestations of the phenomena during the visit. And how it is said. How it is measured and documented.
- What is understood of what was said or measured about the manifestations of the phenomena.
- How all this was matched to the state-of-the-art knowledge, or interpreted in the context of a limited amount of available knowledge.
- What was recorded about all these steps above.
- How the same person (or others) interpret the recorded 'facts' at a later stage?

So what do we record in an EHR? And what do we interpret when reading an EHR?

Then ... What is certain? And what is uncertain? Certain or uncertain in what domain, in what line above, at what level of the whole described continuum?

±25% of the extremely well researched patients in one University Hospital were not diagnosed correctly during their lifetime, as could be concluded after an autopsy.
So what do we really know about disease and complaints?
What is certainty?
What is it referring to?
Do we understand this minefield well enough?

> The diagnosis establishes a relation between the subjective experiences / phenomena and the disease that induces those symptoms and findings.
>
> Example:
>
> Experiences and phenomena: Pain in the wrist joints, feeling of joint stiffness, joint tenderness, joint swelling, elevated sedimentation rate.
>
> Diagnostic inference: Rheumatoid arthritis.
>
> Relation: Might be induced by/due to
>
> Can statements of probability be considered statements regarding the strength of these relations?

This is what they are at best.

---

## Post #32 by @Arild_Faxvaag1

> ±25% of the extremely well researched patients in one University Hospital were not diagnosed correctly during their lifetime,
> as could be concluded after an autopsy.
> So what do we really know about disease and complaints?
> What is certainty?
> What is it referring to?
> Do we understand this minefield well enough?

I recommend this book: "Decision making in health and medicine. Integrating evidence and values" by M Hunink and P Glasziou.

Speaking with experience from rheumatology and general practice in Norway, I agree that the state of the diagnosis in medical records is lousy. Just a few points:

- as stated by Gerard, physicians very often come up with the wrong diagnosis, sometimes with fatal consequences.
- how strongly the physician believes in the diagnostic hypothesis is not stated explicitly.
-- whether this certainty/probability is above or below the test-treat threshold (which depends, among other things, on the expected utility of the treatment) is not stated.
-- whether this certainty/probability is above or below the no treat (wait) - test threshold
- too many resources are spent on excluding differential diagnoses whose (pretest) probabilities are already below the no treat - test threshold
-- this is to maintain the trust of the patient / avoid the risk of litigation.

The actions of health care personnel have norms. What entity a "diagnosis" is can also be evaluated according to what the norms say it _should_ be. The diagnosis _ought_ to be an inference drawn from medical knowledge and information which stems from the patient.

It is not the underlying disease but the physician's inferred diagnosis that forms the basis of all health interventions.
Because of this, it deserves to be represented as more than a subjective statement. In an EHR system, it should be possible to link the diagnosis statement with its underlying premises. It should also be possible to link the diagnosis to the set of plans/actions that follow as a consequence. This would lay the foundation for EHR systems that visualize the consequences of physicians' actions in a much better way than the systems of today.

regards
Arild fax

---

**Canonical:** https://discourse.openehr.org/t/flavour-of-null/14479
**Original content:** https://discourse.openehr.org/t/flavour-of-null/14479