# Intro & Questions: null

**Category:** [Technical (archive)](https://discourse.openehr.org/c/technical-archive/156)
**Created:** 2003-03-25 09:17 UTC
**Views:** 2
**Replies:** 2
**URL:** https://discourse.openehr.org/t/intro-questions-null/15735

---

## Post #1 by @grahamegrieve

Hi all

I have a series of questions about OpenEHR, but I should start with an introduction. I am a software engineer for a health systems vendor. I am currently implementing something that looks a whole lot like archetypes. I'm currently evaluating OpenEHR to see how much of the thinking or specifics I can use, with the intention of long-term harmonisation with OpenEHR. I'm also an editor of the HL7 V3 data types specification.

And sadly, my first question relates to data types, specifically, nulls. I promise to keep away from this subject in the future.

The content referred to here is from 7.7.3 in Design Principles v2.4.

The OpenEHR approach is to include a second element/attribute that is a comment upon the quality of the first, thereby "avoiding the problem of pseudo-values. The result is that attribute values always satisfy the type rules of the system, and the data interpretation marker indicates how attribute values should be read"

Further, there is the comment:

> Null values should not be mixed in with the value domains of true data types, a practice which compromises the comprehensibility and computability of data; they should be represented using a distinct data interpretation marker associated with each data value.

I believe that this has merely shunted the problem around, it hasn't actually solved anything.

The first issue for me is the idea that "attribute values always satisfy the type rules of the system". If I don't know the value, then I might not be able to provide information in the record to meet these requirements. For instance, say that the element is a simple integer, and I don't know what the value is.
What can I place in the element so that it is a valid integer? I worry that this will lead to fabrication of data. A possible response to this is that you don't represent the element, and it has an implicit value of... null?

The second issue I have is that it is claimed that the introduction of a datum interpretation indicator is "safer" for applications. I don't see why it's safer - less chance of bugs - to have the option of overlooking the data quality indicator rather than being forced to deal with it since it's built into the type somehow. You can clearly fail to deal with it either way, but you don't so easily entirely miss the quality assessment

The third issue is the presence of such a datum interpretation indicator - will it actually be available when I want it? What will I do when I don't have the information but there is no spot for my system to say so? How will my system even know that a given attribute has a datum interpretation indicator associated with it?

This last question raises another question - where are such datum interpretation indicators introduced - in the reference model, or the archetypes? This is part of a bigger question, which I will return to later (and is my major concern) - can archetypes actually work in practice at all? (but don't try to answer this till I have added more to this question later)

Grahame

---

## Post #2 by @thomas.beale

Grahame Grieve wrote:

> Further, there is the comment:
>
>> Null values should not be mixed in with the value domains of true data types, a practice which compromises the comprehensibility and computability of data; they should be represented using a distinct data interpretation marker associated with each data value.
>
> I believe that this has merely shunted the problem around, it hasn't actually solved anything.

it solves one thing: it means that no value in a 'value' field is anything other than a data value.
No special terms set to "unknown", or numerics set to -1 or -999999 or whatever. This means that such fields are more interoperable, since there is no special processing required on them.

> The first issue for me is the idea that "attribute values always satisfy the type rules of the system". If I don't know the value, then I might not be able to provide information in the record to meet these requirements. For instance, say that the element is a simple integer, and I don't know what the value is. What can I place in the element so that it is a valid integer? I worry that this will lead to fabrication of data. A possible response to this is that you don't represent the element, and it has an implicit value of... null?

the value field will probably wind up with a value of 0, since this is a typical default for integers, but it is of no consequence. The second field - what I would normally call the "data quality" field (but we call it the "null flavour" field to fit in with HL7) is the one you read to find out how to interpret the 'value' field. If this second field says "unknown", you just ignore the 'value' field. If it says "known" then you use it. This is one of the shortcomings with HL7's null flavours by the way - it should really include the idea of "known". We get around this by allowing the null flavour field to be Void itself, meaning "known".

This is a simple and general approach, and means that there is never any special processing.

> The second issue I have is that it is claimed that the introduction of a datum interpretation indicator is "safer" for applications. I don't see why it's safer - less chance of bugs - to have the option of overlooking the data quality indicator rather than being forced to deal with it since it's built into the type somehow.
> You can clearly fail to deal with it either way, but you don't so easily entirely miss the quality assessment

you can, but with the data quality indicator approach you always know where to look to find out the data quality - there are no special values for Integer, Real, String, Coded_text etc to express the various possible data qualities. Data quality values have nothing to do with data values, and should not be mixed in with them. By the way, we did not invent this, it is used on control and monitoring systems, where they have to have a data quality marker for data items scanned from field devices. This is never mixed in with the data value itself.

There are on the contrary many examples of systems which contain software bugs due to misaligned understandings of special values within data fields.

> The third issue is the presence of such a datum interpretation indicator - will it actually be available when I want it? What will I do when I don't have the information but there is no spot for my system to say so? How will my system even know that a given attribute has a datum interpretation indicator associated with it?

It is in every ELEMENT in the openEHR models.

> This last question raises another question - where are such datum interpretation indicators introduced - in the reference model, or the archetypes?

Have a look at the ELEMENT class in the Data structures RM.

> This is part of a bigger question, which I will return to later (and is my major concern) - can archetypes actually work in practice at all? (but don't try to answer this till I have added more to this question later)

quick answer: yes - we know because we have built the software (an Eiffel-based system) to prove it; so has the DSTC (an XML-based system). So has the team at UCL (Java/ObjectStore), and so has a company in Sydney called Meridian, which makes a system called Obstet, which I have personally reviewed.
There are at least 3 other distinct systems that I know of.

\- thomas

---

## Post #3 by @grahamegrieve

>>> Null values should not be mixed in with the value domains of true data types, a practice which compromises the comprehensibility and computability of data; they should be represented using a distinct data interpretation marker associated with each data value.
>>
>> I believe that this has merely shunted the problem around, it hasn't actually solved anything.
>
> it solves one thing: it means that no value in a 'value' field is anything other than a data value. No special terms set to "unknown", or numerics set to -1 or -999999 or whatever. This means that such fields are more interoperable, since there is no special processing required on them.

oh yes - it has solved that. sorry, I was writing strictly from comparison with HL7v3. Still, compared to the HL7 approach, I think that you've merely shunted the same problem around. All HL7 types include the concept of null, so I might encounter it anywhere. All openEHR elements include a datum for quality, so I might encounter unknown values anywhere. In the end, I don't see a difference?

oh yes - the difference is that null flavours can be associated with bits of a data type in HL7. This sectional approach could be quite useful in coded types. I will see if I can pick up any flaws in the openEHR coded types relating to this

>> The first issue for me is the idea that "attribute values always satisfy the type rules of the system". If I don't know the value, then I might not be able to provide information in the record to meet these requirements. For instance, say that the element is a simple integer, and I don't know what the value is. What can I place in the element so that it is a valid integer? I worry that this will lead to fabrication of data. A possible response to this is that you don't represent the element, and it has an implicit value of...
>> null?
>
> the value field will probably wind up with a value of 0, since this is a typical default for integers, but it is of no consequence. The second field - what I would normally call the "data quality" field (but we call it the "null flavour" field to fit in with HL7) is the one you read to find out how to interpret the 'value' field. If this second field says "unknown", you just ignore the 'value' field. If it says "known" then you use it. This is one of the shortcomings with HL7's null flavours by the way - it should really include the idea of "known". We get around this by allowing the null flavour field to be Void itself, meaning "known".

oh - sounds like exactly how HL7 do it - if the null flavour is null, then the item has a known value

> This is a simple and general approach, and means that there is never any special processing.
>
>> The second issue I have is that it is claimed that the introduction of a datum interpretation indicator is "safer" for applications. I don't see why it's safer - less chance of bugs - to have the option of overlooking the data quality indicator rather than being forced to deal with it since it's built into the type somehow. You can clearly fail to deal with it either way, but you don't so easily entirely miss the quality assessment
>
> you can, but with the data quality indicator approach you always know where to look to find out the data quality - there are no special values for Integer, Real, String, Coded_text etc to express the various possible data qualities. Data quality values have nothing to do with data values, and should not be mixed in with them. By the way, we did not invent this, it is used on control and monitoring systems, where they have to have a data quality marker for data items scanned from field devices. This is never mixed in with the data value itself.
>
> There are on the contrary many examples of systems which contain software bugs due to misaligned understandings of special values within data fields.

oh yes - but I don't see that the HL7 approach is any different in outcome to the OpenEHR approach (see above)

sorry to all normal people to have a long running discussion between Thomas and me about V3 data types continuing to break out on this list - but Thomas did ask me to put my comments on this list

>> The third issue is the presence of such a datum interpretation indicator - will it actually be available when I want it? What will I do when I don't have the information but there is no spot for my system to say so? How will my system even know that a given attribute has a datum interpretation indicator associated with it?
>
> It is in every ELEMENT in the openEHR models.
>
>> This last question raises another question - where are such datum interpretation indicators introduced - in the reference model, or the archetypes?
>
> Have a look at the ELEMENT class in the Data structures RM.

I can't believe I missed this. thanks for correcting me.

>> This is part of a bigger question, which I will return to later (and is my major concern) - can archetypes actually work in practice at all? (but don't try to answer this till I have added more to this question later)
>
> quick answer: yes - we know because we have built the software (an Eiffel-based system) to prove it; so has the DSTC (an XML-based system). So has the team at UCL (Java/ObjectStore), and so has a company in Sydney called Meridian, which makes a system called Obstet, which I have personally reviewed. There are at least 3 other distinct systems that I know of.

ok, I don't doubt that archetypes will (and do) work. But hang on!
Let's consider the claims for what archetypes achieve (from "design principles"):

* domain experts, rather than IT people, need to be able to directly define and manage the knowledge definitions of their systems - we call this 'domain empowerment'.
* it must be possible to deploy systems prior to having created formal knowledge models of the domain, in other words, systems must be 'future-proof'.

I'm working on an archetype based system now. I don't have the full infrastructure of OpenEHR, but I have issues with whether either of these is achievable as stated, and I don't think my issues arise out of not having the full OpenEHR infrastructure.

regarding the first, if non-IT people define the knowledge definitions, they will not acquire the features that are needed for real work. IT people are slowly acquiring a knowledgebase of how to define systems - and OpenEHR is at the forefront here in my view - but this has not been easy to acquire. If archetypes are to be usable, there will need to be substantial IT input into their design. We've already discussed some of these things off the list - about reliable identification of concepts and instances of data within the archetypes

regarding the second, it will only be possible to deploy systems that are future proof - i.e. archetype definitions can be changed without significant redesign of the system - if the systems themselves can deal with the change to the archetypes. Where the system is merely a purveyor of archetypes, and a human actually deals with the data, then this comes easy.
But where the system itself must understand the data, then it is hard work to make it act like this, and then an archetype is so beefed up that changing it constitutes as complete an upgrade to the system as if it had been reprogrammed in parts, and similar change management processes will be required

So for these reasons I'm doubtful that the concept of archetypes will "work" in that I don't believe the goals as stated are achievable. On the other hand, I think that the 2 level design model that archetypes represent will get us closer to that point than anything else.

It seems to me that both subjects in this email - nulls and archetypes - deal with wicked complexity. It doesn't go away, all we can do is move it around. And what we need to do is move it so that it arises where we are most tooled up to deal with it.

back to nulls - I think that the OpenEHR approach suits what OpenEHR is trying to do best - it places the problem in the right place - but that the HL7 approach suits messaging best. Whether the HL7 approach will remain useful when HL7 moves into EHR - that depends on how far they move and when.

Grahame

---

**Canonical:** https://discourse.openehr.org/t/intro-questions-null/15735
**Original content:** https://discourse.openehr.org/t/intro-questions-null/15735
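
As an illustration of the two-field approach discussed in this thread, here is a minimal Python sketch. It is not taken from either poster or from the openEHR or HL7 specifications: the `Element` class, its field names, the `"unknown"` marker string, and the `mean_of_known` helper are all hypothetical. It only assumes what the thread states, namely that a value field always holds a valid value of its type, that a separate null flavour field says how to read it, and that an empty (Void) null flavour means "known".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for an ELEMENT: a typed value plus a separate quality marker."""
    value: int                          # always a valid integer, never a sentinel such as -1
    null_flavour: Optional[str] = None  # None (Void) is taken to mean "known"

    def is_known(self) -> bool:
        return self.null_flavour is None


def mean_of_known(elements: List[Element]) -> Optional[float]:
    """Average only the values whose marker says they are known."""
    known = [e.value for e in elements if e.is_known()]
    return sum(known) / len(known) if known else None


# The middle reading is unknown; its value of 0 is a default and must be ignored.
readings = [Element(72), Element(0, null_flavour="unknown"), Element(68)]
print(mean_of_known(readings))  # 70.0
```

The sketch also shows the risk raised in Post #1: nothing stops a caller from reading `value` directly and skipping `is_known()`, so the default 0 could silently enter a calculation.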