Data Types

Hi Tom,
I do not know which HL7 TCs and SIGs should host this debate. It is not
humanly possible to follow everything, but I would have thought that EHR and
MnM should also have a point of view.

My thrust all along has been to make sure that we really understand what are
the specific requirements of each use case, which is less controversial than
asking what is the set of all possible requirements.

Once the requirements are clear, then it is another debate about how best to
realise them in a design specification. I still feel that these two stages
are being melded together, when they should be kept as separate as possible.

Tim

Tim Benson wrote:

Hi Tom,
I do not know which HL7 TCs and SIGs should host this debate. It is not
humanly possible to follow everything, but I would have thought that EHR and
MnM should also have a point of view.

My thrust all along has been to make sure that we really understand what are
the specific requirements of each use case, which is less controversial than
asking what is the set of all possible requirements.

it is, and it is the right thing to do when building a system. But when building a standard, or a product, or something which is clearly going to have application outside the situations which can possibly be thought of when it is being written, things are somewhat different - we cannot just stick to software-engineering as usual.

This is why the GEHR/openEHR work uses a 2-level framework rather than the typical single-level one, which is very limited and does not behave well in time.

That said, I am all for finding use cases to justify the inclusion of things, and avoiding unnecessary optionality - within an overall framework which provides for future-proofing.

- thomas

Hi,

In my mind I always placed the Null-stuff as an attribute within an
element/Class. Not in a datatype. It is meta-information.

Gerard

Tim Benson wrote:

Tom,
I do not think that structure can be justified if that structure is unlikely
to add either value or safety down the line. So a situation where we are not
able to rely on a time being either a strict point in time or an interval is
likely to create semantic problems. Unless you can rely on
strict chronological listing it is unhelpful to try to give spurious
precision. So my suggestion is that such fuzzy dates should be put into
free text only and all dates associated with any entry should only be the
ones we can rely on, such as date and time of entry.

We are having an interesting debate on just this topic on the HL7 CQ
list (don't know if you get that). The HL7 data type modelling approach
seems to be to include Null markers all over the place inside the data
types, so that no matter how little you know, you can still create an
instance of a structured data item. My response to this has been:

- it makes the data type specification quite a lot more complex, since
now the semantics have to always include the possibility of an attribute
or function result of a data item being Null (just start thinking about
this and it will become more obvious)
- it will make the implementation of data types and also software that
uses them more complicated
- it will create some data instances where parts of the item are
missing, which will IMO be quite unexpected by most software. E.g.
IVL<T>s with missing upper and lower limits (but the principle is
general and applies to all data types). I think there is the potential
for unsafe data via this approach.

In the long term, I think this may cause pollution of EHRs and other
systems with unreliable data items, and cause erroneous results in some
decision support and query-based applications. It will also prevent
applications based on a more typical concept of data types from working
properly.

I am not saying the HL7 approach is invalid - it is valid - but it is
also quite complex, and overkill in most cases (in some parts of the RIM
it is in fact in error, but that's another argument).

The openEHR approach is much simpler:
- data types are "clean" - Null markers are specified at the next level
up in the model
- some special partial data types such as PARTIAL_DATE are specified,
because they occur commonly. The model of PARTIAL_DATE explicitly says
what can be missing and what cannot be, and defines all its semantics
accordingly
- if not enough information is known to create a data item, it should be
recorded as narrative. This way, decision support and querying will not
be operating on unreliable data.

This approach can be summarised as an "all-or-nothing" approach - either
you have the required values to create the data item, or you don't. The
HL7 approach can be described as an "anything-goes" approach - you can
create a structured data item no matter how little you know; it will
just have more or fewer Null markers.
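To make the contrast concrete, here is a minimal sketch (hypothetical Python, not the actual openEHR specification) of a "clean" date type under the all-or-nothing rule: construction either succeeds with complete, valid data or fails outright, so no instance ever carries internal Null markers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Date:
    """A 'clean' date: every instance is guaranteed complete and valid."""
    year: int
    month: int
    day: int

    def __post_init__(self):
        # All-or-nothing: refuse to construct a half-known date.
        if not (1 <= self.month <= 12 and 1 <= self.day <= 31):
            raise ValueError("cannot create a Date without a complete y/m/d")

d = Date(1976, 7, 4)       # fine: all required values supplied
try:
    Date(1976, 0, 0)       # incomplete data: no instance is created at all
except ValueError:
    pass                   # caller must fall back to narrative text instead
```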

I am partway through writing up the different design approaches, which I
will post if anyone wants to see it.

I wonder what others think.

- thomas beale

-
If you have any questions about using this list,
please send a message to d.lloyd@openehr.org

-- <private> --
Gerard Freriks, arts
Huigsloterdijk 378
2158 LR Buitenkaag
The Netherlands

+31 252 544896
+31 654 792800

Tom Beale wrote:

But when
building a standard, or a product, or something which is clearly going
to have application outside the situations which can possibly be thought
of when it is being written, things are somewhat different - we cannot
just stick to software-engineering as usual.

Well, here I disagree. It is always the case that something designed to do a
set of well-specified jobs well will find other valuable uses.

This is why the GEHR/openEHR work uses a 2-level framework rather than
the typical single-level one, which is very limited and does not behave
well in time.

I am all in favour of multilevel frameworks. You might like to check out
the e-Service Development Framework at:
http://www.govtalk.gov.uk/documents/eSDFprimerV1b.pdf

This has a three level framework with a high level architecture, reusable
elements and standards (which can be implemented). In another
representation this is presented as the central three layer sandwich of a
5-layer framework with generic standards (such as UML and XML) above the
architecture and actual instantiations below the standards giving:

- Generic standards (e.g. XML)
- High-level information architecture
- Reusable elements
- Standards
- Instantiation

Each of these 5 layers can be divided vertically into Requirements, Design
and Technology Implementation, giving a matrix.

Tim

[many very good points deleted for brevity]

> Envision being able to scan a medical record for all partial dates.
> Retrieve those dates along with some context of the CONTRIBUTION. A
> computer could do very little with that information in most cases.
> But a human mind (physician) could probably see
> relationships/patterns very quickly.

Perhaps. Or it could be a mess of obscurantist screen junk. But if the
mess was organised as the text of a story told by one human being to
another at a particular time it might be OK.

Exactly.

The implementation of that vision is indeed very tricky. My point
is that the "model" must accommodate it before any attempt at
implementation.

The other point, of course, that is very important for usability is the need
for the record to present less and less precision as the years recede:

Really? I had not considered this. Is it really distracting to
'see' a full date?
Does the brain not transform that during the process? Is this
'really important' or is it a 'really cool' technology problem to
conquer?

The _apparent_ need (or acceptability) for less precision in dates as they age is not necessarily true. For example, in retrospective review of 'causes and effects' (e.g. a drug trial) time _intervals_ will be required to be calculable at optimum precision. A general principle is, surely, that we can't and should not anticipate the future use/value of any of the data!

Tony Grivell

The _apparent_ need (or acceptability) for less precision in dates as
they age is not necessarily true. For example, in retrospective review
of 'causes and effects' (e.g. a drug trial) time _intervals_ will be
required to be calculable at optimum precision. A general principle is,
surely, that we can't and should not anticipate the future use/value of
any of the data!

It was my impression that he was speaking of an implementation
issue. Not that the actual data should degrade in precision over
time. So the presentation to the physician is what is in question
here. If that is what users would want/need to see then I have no
objection to it. I was merely asking if that was truly a 'need'.

Tim

Tim,

This is definitely a mistake - many disorders have a date of onset that is
fuzzy from a month point of view but is worthwhile - last Pap smear, last
attendance at the ophthalmologist, etc. The point about a fuzzy date is that
it is helpful for human interpretation - the month that a spouse died will be
very worthwhile even if the day is not known - when chasing records at another
centre, knowing whether a date is accurate or not will overcome a lot of
frustration.

Sam

Time to weigh in on fuzzy dates. We have been using fuzzy dates at Duke
and in TMR since the early 70s for just the reason Sam states. Often
patients will know only the year, more frequently the month and year but
not the day. We discover that partial data is much more useful than no
data.

So we used fuzzy dates. The fuzzy dates are displayed with ?? for the
unknown parts. Whenever we sort, a fuzzy day sorts to the 15th of the
month, and a fuzzy year sorts to July. Statisticians are generally unhappy
with fuzzy dates and want to throw them out. But everyone seems happy
when someone records the date of onset for hypertension as July 4, 1976.
Where are the hour, minutes and seconds? I argue that fuzzy dates are
acceptable and valid data points and should be used in statistical
analysis.
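Ed's convention might be sketched like this (a hypothetical illustration, not TMR code; note that, as clarified later in the thread, it is an unknown month that sorts to July):

```python
def display(year, month, day):
    # Unknown parts are shown as "??" (or "????" for an unknown year).
    ys = f"{year:04d}" if year else "????"
    ms = f"{month:02d}" if month else "??"
    ds = f"{day:02d}" if day else "??"
    return f"{ys}-{ms}-{ds}"

def sort_key(year, month, day):
    # For sorting: an unknown day maps to the 15th, an unknown month to July.
    return (year, month or 7, day or 15)

dates = [(1976, 7, 4), (1976, None, None), (1976, 7, None)]
ordered = sorted(dates, key=lambda t: sort_key(*t))
```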

In a datetime stamp, unknowns are stored as 00. Thank goodness, we use
another symbol for a totally unknown date.

Ed Hammond

Sam, I think you have misunderstood me. Human beings love complex patterns,
but computers hate them. Of course you must keep the richness of "the day
before the big storm", but you should not try to put that sort of thing into
a Julian date field. Let people do what they are good at and let us use
computing for what it is good at. The fact is computers do not like
ambiguity. The question is always what do we want to use this info for? Is
it to structure a record in chronological order or what?

Tim

Tim Benson wrote:

Sam, I think you have misunderstood me. Human beings love complex patterns,
but computers hate them. Of course you must keep the richness of "the day
before the big storm", but you should not try to put that sort of thing into
a Julian date field. Let people do what they are good at and let us use
computing for what it is good at. The fact is computers do not like
ambiguity. The question is always what do we want to use this info for? Is
it to structure a record in chronological order or what?

Tim

I think Tim's general considerations are correct (or at least I agree with them ;-) - the reasons to use structured v non-structured data items (or any items for that matter) are:
- if you have enough raw data to build the structured item
- if the information is to be used in computation

I think these principles are correct, but we do need to understand them.

The general design of the openEHR data types follows these principles in that you cannot create any item unless you can provide the required data to the creation routine; i.e. you can only create valid data items, be they quantities, terms or whatever.

However, there are times when you don't quite have all the raw data, but a) you have enough to build a reasonable version of a data instance, and b) you want to be able to compute on the instance. Partial dates and times fall into this category, and this is why we have created separate classes for them. If you have year and month only, you cannot create a valid DATE instance, but you can create a valid PARTIAL_DATE instance, which will still satisfy the computational requirements of DATEs (by synthesising reasonable mid-month dates, etc).
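A sketch of the idea (illustrative Python; the class and method names are mine, not the openEHR model's): a partial date requires at least a year, and synthesises mid-period values when asked for a concrete date.

```python
import datetime

class PartialDate:
    """Year is mandatory; month and day may be unknown (None)."""
    def __init__(self, year, month=None, day=None):
        if year is None:
            raise ValueError("a year is the minimum required data")
        self.year, self.month, self.day = year, month, day

    def as_date(self):
        # Synthesise a reasonable mid-period date for computation.
        if self.month is None:
            return datetime.date(self.year, 6, 30)      # mid-year
        return datetime.date(self.year, self.month, self.day or 15)

pd = PartialDate(1995, 3)                               # day unknown
delta = datetime.date(1995, 6, 1) - pd.as_date()        # still computable
```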

For data which is really quite unreliable, we suggest that it be recorded as narrative text, as Tim mentioned earlier.

Contrast this with the HL7 data type approach, where every type and every attribute and function result can be Null, indicating it is unknown. The idea of this (according to Gunther) is that no matter how little you know, you can record it in structured form. We can think of this design approach as a completely fuzzy approach. As an example, you can have an IVL<TS> (interval of time) with unknown low and high values. I have noted that this makes it nearly useless for computation, since you can't even call the contains(a_time:TS) routine - well, you can, but you will get back "UNK" (unknown) as a value.
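The computational consequence can be seen in a toy sketch (my own simplification, not the full HL7 semantics): once bounds may be Null, contains() must return a third value, and every caller has to handle it.

```python
UNK = "UNK"   # the third truth value that Null-able bounds force on callers

class Interval:
    def __init__(self, low=None, high=None):   # None models a Null bound
        self.low, self.high = low, high

    def contains(self, t):
        if self.low is None or self.high is None:
            return UNK            # cannot decide with a missing bound
        return self.low <= t <= self.high

ivl = Interval()                  # both bounds unknown
result = ivl.contains(20020101)   # callers get UNK, not True/False
```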

I see dangers in this approach:
- the specification is more complex, since the semantics have to include the case where each and every attribute might be Null. Complex specifications are more likely to lead to implementation bugs
- software will be more complex because it has to be able to handle UNKs
- unreliable raw data is being used to create structured data instances which might be treated by software as being more reliable than they really are
- if there is software operating on the data that does not understand the possibility that UNK can be returned from function calls, it is not clear that the data is safely processable

I can see the theoretical interest of recording unreliable data in a structured way, even if half of it is missing, but practically I don't think that it is a very useful thing to do, except in exceptional (and common) cases like date & time. Gunther says that people may come back and fill in the missing bits, but in general I think this is quite unlikely - no-one has time. (Exceptions might be partial data gathered in A&E or similar situations).

Hence we have opted for a simpler approach:
- in general data types are designed in a pure fashion - no general facility for unknown elements
- special data types for partial data are specified; the advantage of this is that the semantics of these types are clear
- Null markers are recorded, not inside data instances, but where they are used, e.g. in the ELEMENT class in the EHR reference model

Thoughts?

- thomas beale

William E Hammond wrote:

Time to weigh in on fuzzy dates. We have been using fuzzy dates at Duke
and in TMR since the early 70s for just the reason Sam states. Often
patients will know only the year, more frequently the month and year but
not the day. We discover that partial data is much more useful than no
data.

So we used fuzzy dates. The fuzzy dates are displayed with ?? for the
unknown parts. Whenever we sort, a fuzzy day sorts to the 15th of the
month, and a fuzzy year sorts to July.

Ed, presumably you meant "a fuzzy month". This is the design we have used, so that's encouraging (when can we install it at Duke? ;-)

Statisticians are generally unhappy
with fuzzy dates and want to throw them out.

I am not convinced that the statistical arguments are so great - I can see that there would be a skew towards things that happen more often on the 15th of the month, due to the day-less dates in the system, but I can't think of any clinical research that would be looking at that. Are there any studies on the dangers of fuzzy dates in statistical analysis?

But everyone seems happy
when someone records the date of onset for hypertension as July 4, 1976.
Where are the hour, minutes and seconds? I argue that fuzzy dates are
acceptable and valid data points and should be used in statistical
analysis.

In a datetime stamp, unknowns are stored as 00. Thank goodness, we use
another symbol for a totally unknown date.

Ed Hammond

- thomas beale

Thomas,
You are correct - I meant fuzzy month, not year. I wish Duke were in a
position to let you install.

Ed

I can see the theoretical interest of recording unreliable data in a
structured way, even if half of it is missing, but practically I don't
think that it is a very useful thing to do, except in exceptional (and
common) cases like date & time. Gunther says that people may come back
and fill in the missing bits, but in general I think this is quite
unlikely - no-one has time. (Exceptions might be partial data gathered
in A&E or similar situations).

All clinical data is unreliable data. Or at least, all clinical data has
varying degrees of reliability. On paper records, the structure tends to
allow the person reading the record to make a shrewd assessment of its
reliability. We pay more attention to the consultant's clinic letter than
the house officer's handwritten note at 4am. We note the gaps as much as the
actual data. We cast an eye down the problem list summary and instantly have
a feel for how well it has been maintained, without necessarily
obsessively correlating that list with the narrative in the notes.

If I see a patient in A&E who has been punched on the nose, look up his
nose, draw a picture of his nose, and write "no septal haematoma" in the
notes it is pretty clear that when the patient comes back with a septal
haematoma that it developed subsequent to my examination. (Yes, it turns out
the patient was punched a second time...). If I make a statement only about
the external bruising, there will be doubt.

If I see a patient who subsequently turns out to have thyrotoxicosis, but do
not record the presence or absence of certain key clinical findings (e.g.
pulse, weight, tremor), and do not order thyroid function blood tests, then
there must be doubt if I even considered the diagnosis.

Abstracting clinical data out of context is problematic.

One of the skills of an expert is to read a record, and quickly form an
accurate impression of the patient's problems. Part of this is knowing what
to ignore, as well as recognising positive cues. Now paper records have many
problems, but unfortunately most computerised records seem to elevate the
desire of computer scientists, ontologists, expert system designers,
statisticians, politicians, and other busybodies to atomise data elements
from the records of clinical consultations for their own purposes. Fine.
There are lots of good reasons for doing this. But it must not be to the
detriment of the primary utility of the record for the clinician.

I know it's a windup to make this statement to this list, but we now have
enough cheap gadgets and computing power at the desktop to model a paper
record graphically. Maybe this would be a good starting point for a clinical
record that truly gave first priority to the clinicians using it.

Would the open-ehr archetypes provide the building blocks for a designer who
wanted to take this approach?

D.

What Ed has described is the proper way to bet on the number that is not there
when you have to for statistical purposes.

You have to assume some complete date for the purpose of the statistics.
Taking the mean date for the month or the mean month for the year is the best
you can do. (We have used the first of the month and the first of the year and
that DOES produce a bias.)

As it turns out, the Social Security Administration and the tumor registry
systems do the same thing (that Ed recommends) when the specifics are not
known.

Thomas Beale wrote:

Clem McDonald wrote:

What Ed has described is the proper way to bet on the number that is not there
when you have to for statistical purposes.

You have to assume some complete date for the purpose of the statistics.
Taking the mean date for the month or the mean month for the year is the best
you can do. (We have used the first of the month and the first of the year and
that DOES produce a bias.)

Right, so we have followed the first approach in the model:

day missing from date => synthesize 15th / MM / yyyy
day and month missing from date => synthesize 30 / JUN / yyyy

We also added the function

possible_dates:INTERVAL<DATE>

For a missing day => synthesize {1 / MM / yyyy - days_in_month(MM) / MM / yyyy}
For a missing month => synthesize {1 / JAN / yyyy - 31 / DEC / yyyy}
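A possible_dates function along these lines might be sketched in Python as (hypothetical code following the rules above, not the actual openEHR implementation):

```python
import calendar
import datetime

def possible_dates(year, month=None, day=None):
    """Return the interval of concrete dates a partial date could denote."""
    if month is None:
        return (datetime.date(year, 1, 1), datetime.date(year, 12, 31))
    if day is None:
        last = calendar.monthrange(year, month)[1]   # length of that month
        return (datetime.date(year, month, 1),
                datetime.date(year, month, last))
    d = datetime.date(year, month, day)
    return (d, d)                                    # fully known: a point

lo, hi = possible_dates(1976, 2)   # Feb 1976: 1st..29th (leap year)
```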

These functions can be used to catch the fuzzy dates when querying. I guess there will be a skew when multiple successive queries are run on the same data because fuzzy dates will match more queries, so there must be a better way to do this. Methods I can think of include:

- assigning a random date to each fuzzy date. This is hard, because as more dates are added to the system, you have to keep monitoring them to be sure that you are still setting the fuzzy ones to a truly random value

- using the interval method above, but when doing a series of queries, remembering when you already have matched an item, to prevent double inclusion.
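The second method might look like this (an illustrative sketch, with possible-date intervals reduced to plain numbers): the query series keeps a set of record ids it has already matched, so a fuzzy date overlapping several query windows is counted only once.

```python
def run_query_series(records, windows):
    """records: id -> (low, high) possible-date interval; windows: query ranges."""
    seen, results = set(), []
    for lo, hi in windows:
        hits = [rid for rid, (p_lo, p_hi) in records.items()
                if p_lo <= hi and p_hi >= lo     # intervals overlap
                and rid not in seen]             # not already matched
        seen.update(hits)
        results.append(hits)
    return results

records = {"rec1": (5, 15), "rec2": (10, 10)}
hits = run_query_series(records, [(1, 10), (11, 20)])  # rec1 counted once
```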

Do Regenstrief or Duke have any method for making fuzzy dates match queries?

- thomas beale

Douglas Carnall wrote:

...

If I see a patient who subsequently turns out to have thyrotoxicosis, but do
not record the presence or absence of certain key clinical findings (e.g.
pulse, weight, tremor), and do not order thyroid function blood tests, then
there must be doubt if I even considered the diagnosis.

Abstracting clinical data out of context is problematic.

this is certainly our point of view. We would say:
- record what is stated, measured, checked, etc.
- do it in such a way that it works a) for patient care b) for decision support c) for other uses, in that order.
- assume that physicians and other health workers are thinking people, and will in general use data to make inferences leading to care decisions; don't try to record data in a way that makes presumptions about this, or prejudices the thinking process of the clinician

That said, there is still the technical challenge, at the reductionist end of the data-recording spectrum, of when to try and record data items in structured form (so they are computable) and when not to. Structured data is much better for:
- computation, especially decision support
- interoperability, since every communicating party can agree on the one standard for what a "Quantity" etc looks like

However, many people, including myself, have strong reservations about recording very unreliable data (either partially specified or from a known/suspected unreliable source) in structured form, particularly if values are synthesized to make it fit the requirements of creation of the structured data object in question.

I know it's a windup to make this statement to this list, but we now have
enough cheap gadgets and computing power at the desktop to model a paper
record graphically. Maybe this would be a good starting point for a clinical
record that truly gave first priority to the clinicians using it.

Would the open-ehr archetypes provide the building blocks for a designer who
wanted to take this approach?

first thing to say is that the CEN/GEHR/openEHR approaches do not (we hope) constrain the visual appearance of EHRs in applications to any particular model; there is no reason why the clinician's view of the record on the screen should not look like the paper record they are used to. Once you start looking at forms for recording information in the paper record, it is clear that these forms often represent a) a long-term refinement of important data items for the purpose, and b) a long-term refinement of the arrangement of the questions and way of recording answers. So in many cases, forms will be a starting point for archetypes.

But I should stress that archetypes (as we have defined them in GEHR/openEHR) are constraint models of data, not models for forms as such. Now consider a form like the diabetic interview form in our current project. The first time interview form has boxes for information that clinicians recognise as being in various well-known categories, such as lifestyle (the smoking, diet and exercise questions), family history (diabetes in the family), current medications, and so on. We envisage archetypes primarily for structuring data in the record, so there will be archetypes for each of these well-known categories of information. This means that if a different clinician uses an unrelated form for the patient, which also asks for (probably different) data to do with lifestyle, family history and so on, what we want are archetypes for lifestyle, fam hist etc, which cover the data being asked for in each place. Over time, the design of such archetypes crystallises, and specialisations may be created for certain kinds of patients.

Where does this leave forms? One of the reasons I / DSTC have proposed a more formal concept of "contributions" is so that data gathered on a form, which might well be committed to different parts of the EHR according to various thematic (data-oriented rather than screen-oriented) archetypes, can be re-assembled easily into the original form. Secondly, there are various people thinking about "visual archetypes" and stylesheets for archetypes, and I have seen a system in Europe which I think could be integrated with the GEHR archetypes to build screen forms whose elements and element groups are based on archetypes, but where the overall design of the screen form resembles something the clinicians are used to seeing.

It is early days yet....

- thomas beale

Hmm. If I saw a patient with an absent pulse and failed to note that
finding, either mentally or in a clinical record, I think I might rightly be
accused of being a poor observer of the human condition. Come to think of
it, a weightless patient would be interesting too :-)

And everyone has a physiological tremor, though it is often exaggerated in
moderate to severe untreated thyrotoxicosis.

So in fact, now I come to think about it again, I didn't mean presence or
absence of pulse, weight or tremor, but presence or absence of--all right,
use the word, fuzzy, values for pulse, weight or tremor.

But you knew what I meant didn't you?

;-)

D.

One thing to be clear on - we must differentiate between "not recorded" and "not there". Not recording someone's weight does not make them "weightless" (don't worry I understood the joke, but this is a serious point as well). A better example would be - not recording smoking status doesn't make the patient a non-smoker.

There are 5 possible situations I know of that can occur with data:

1. it is not recorded (nothing is recorded)
2. it was asked (e.g. by an application GUI) but remains unknown due to various reasons (patient unconscious, refused to divulge, etc)
3. it is completely known and recorded
4. it is recorded, but there are bits missing
5. it is recorded, but in the negative (no known allergies, no previous surgery, etc etc)

Cases 2, 4 and 5 have not always been properly catered for in systems.

Case 2 is dealt with by the use of what I would call "data quality markers", i.e. what HL7 calls "flavours of Null". Actually, we call them that in the openEHR model, and use HL7's flavours of null (although we use them in a different way).

Case 4 is dealt with in openEHR by partial data types e.g. DV_PARTIAL_DATE, and with Null Flavours in HL7.

Case 5 requires proper structuring of the health record, so that negatives can be recorded; archetypes/templates help in this.
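In rough Python terms (field names are illustrative, not the exact openEHR reference model), the placement of the Null marker looks like this: the data value stays clean, and the enclosing ELEMENT records a null flavour when no value could be obtained.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Element:
    name: str
    value: Optional[object] = None       # a clean data value, or absent
    null_flavour: Optional[str] = None   # e.g. "unknown", "not asked"

weight = Element("weight", value=72.5)                       # case 3 above
smoking = Element("smoking status", null_flavour="unknown")  # case 2 above
```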

- thomas beale

Douglas Carnall wrote:

Hi,

When I was thinking about this many years ago, I needed 4 attributes for a
data item/statement/fragment/transaction/archetype, etc.

One on Asked/Answered:

              Answered Yes                Answered No
Asked Yes     The 'normal' answer (3)     No response (2)
Asked No      Unsolicited answer          The real 'Null', not recorded (1)

And then there is the attribute on Certainty.
And then the one on Completeness.
And the one on Negation.

Gerard



Tom,

This area is interesting within the concept of the EHR as we have developed
it. Firstly, the placeholders for key information may be an organiser
rather than an entry. For example with adverse reactions to medication - a
statement that the patient reports no known adverse reactions or even
allergies is worth knowing - but it is not an entry of the type allergy or
adverse reaction.

A negative report such as this requires updating - how long is it reasonable
to assume that no new allergies have developed? - whereas a report of an
allergy to penicillin, if well documented, will probably have the same status
from that point on. Reporting that someone is a non-smoker is not of the
same order - it is a negative finding but if the person is an adult you
would anticipate that it is a stable finding - whereas ex-smoker has a
different 'half-life' - it is also likely to be included in an entry about
smoking - how many per day, perhaps the number of pack years, and the
person's current smoking status. So negative recordings might be different.

The difference then is that a state of tobacco intake = 0, i.e. a non-smoker,
is different from the state of having no known allergies - the count of
allergies = 0 - but this is not an attribute of most people's health records.

I would propose that we have an entry that is EMPTY and returns a DV_TEXT
that can be displayed if required - but will be dated and we will know who
added it. This will allow organisers to be useful mandatory placeholders and
know unambiguously that there are no allergies for example (and when it was
asked and by whom).
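Sam's proposal might be sketched as follows (hypothetical Python; the names are mine, and the author is a made-up example): an EMPTY entry carries a displayable text, a commit time, and an author, so an organiser can hold an unambiguous, dated, attributed negative statement.

```python
import datetime

class EmptyEntry:
    """A dated, attributed placeholder entry with displayable narrative."""
    def __init__(self, text, author):
        self.text = text                          # a DV_TEXT-style value
        self.author = author                      # who recorded it
        self.committed = datetime.datetime.now()  # when it was recorded

entry = EmptyEntry("No known allergies", author="Dr A. Clinician")
```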

Cheers, Sam


--
..............................................................
Deep Thought Informatics Pty Ltd

mailto:thomas@deepthought.com.au
open EHR - http://www.openEHR.org
Archetype Methodology -

http://www.deepthought.com.au/it/archetypes.html
Community Informatics -
http://www.deepthought.com.au/ci/rii/Output/mainTOC.html
..............................................................