Data Types

Tim_Cook5 · 4 June 2002 04:57

DV_PARTIAL_DATE - Purpose: Incorrectly assumes that a 'day' is
unknown with a known or unknown month.
It is very realistic to see a situation in which a person will
recall that something occurred on the 1st of the month 10 years ago
but cannot recall if it was June or July. People relate things in
their life and if an event is recurrent on a specific day of the
month they will recall that though they may have no reference to
which month it was.

DV_PARTIAL_TIME - Purpose: Incorrectly assumes that an hour will be
known. Same reasoning as above that a person may not be certain if
an event occurred at half-past 10 or half-past 11.

R/S,
Tim Cook

thomas.beale · 4 June 2002 09:32

[to readers of this list: please remember to cc:all when replying! The list policy is not to automatically set reply-to: to the sender due to spam and vacation email problems]

Tim Cook wrote:

DV_PARTIAL_DATE - Purpose: Incorrectly assumes that a 'day' is
unknown with a known or unknown month.
It is very realistic to see a situation in which a person will
recall that something occurred on the 1st of the month 10 years ago
but cannot recall if it was June or July. People relate things in
their life and if an event is recurrent on a specific day of the
month they will recall that though they may have no reference to
which month it was.

yes, we had this debate, and I seem to remember that we thought that a date like 2/?/1993 was artificial in the sense that even if the patient did remember for a fact that it was the 1st or whatever, the date was no better than as if the day was also forgotten - from a mathematical/processing point of view. Personally I'm agnostic on this, and I would lean toward the "faithfulness" requirement of GEHR which would say record it anyway.

What do others think?

DV_PARTIAL_TIME - Purpose: Incorrectly assumes that an hour will be
known. Same reasoning as above that a person may not be certain if
an event occurred at half-past 10 or half-past 11.

same argument either way for time I guess.

- thomas beale

Tim_Benson · 4 June 2002 10:46

Surely the criterion for any structured data is whether another application
is expected to use that structured data in a way that (a) adds value and (b)
is safe. If either (a) or (b) is not true then structure simply adds cost
and complexity without benefit.

thomas.beale · 4 June 2002 09:57

Tim Benson wrote:

Surely the criterion for any structured data is whether another application
is expected to use that structured data in a way that (a) adds value and (b)
is safe. If either (a) or (b) is not true then structure simply adds cost
and complexity without benefit.

Tim, I agree with the premise; but what is your solution in this case? The structure would only change in a very trivial way i.e. by adding a flag which means "day_unknown". Are you asking for a use case which proves that this should exist? I agree - that's what we need. Tim has provided the simplest of all - if the patient said it, we should record it. Is it enough - I don't know...

- thomas beale

Sam · 4 June 2002 16:25

Tim

I think this is true but from a date point of view we can only know the year
if the month is unknown - if it is one or two then the person will have to
guess and store it as a fuzzy date. I think this is the only sensible
approach. We can record in text the time issues that have been mentioned.

Sam

Sam · 4 June 2002 16:31

I do not think he is right - Sam

Tony_Grivell · 5 June 2002 01:23

I agree with both aspects of Thomas's argument. In terms of future practical "longitudinal" use of the data, the effective time-precision will be set much more by the month than the day. However, it's also true that there _may_ be some other (presently-unimagined) use for the more-precisely defined day - which argues in favour of it being recorded.

tony grivell

thomas.beale · 5 June 2002 02:17

Tony Grivell wrote:

I agree with both aspects of Thomas's argument. In terms of future practical "longitudinal" use of the data, the effective time-precision will be set much more by the month than the day. However, it's also true that there _may_ be some other (presently-unimagined) use for the more-precisely defined day - which argues in favour of it being recorded.

one use case I was trying to imagine was - what if the remembered date corresponded to some other significant date in the patient's history, and doctors were trying to figure out say medications or other interventions which the patient was unsure about/couldn't remember. So let's say there is an admission for 15/dec/1990 (start of an episode where a fracture was treated), and at some later time, the patient is telling their GP that on the "15th of I forget what month near the end of 1990, I fractured my leg". The GP might review the record and think that the two dates were probably the same, and that it was therefore the same fracture. I would guess that this is a real contrived long shot, and probably unrealistic, but we need some more evidence from clinicians...
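
A minimal sketch of the fracture scenario above, assuming a hypothetical helper `candidate_dates` that expands "the 15th of some month near the end of 1990" into concrete dates and compares them with a recorded admission date. The function name and the reading of "near the end of the year" as October to December are illustrative assumptions, not part of the openEHR model.

```python
from datetime import date


def candidate_dates(year, day, months):
    """Expand a known day-of-month with an uncertain month into concrete dates."""
    dates = []
    for month in months:
        try:
            dates.append(date(year, month, day))
        except ValueError:
            # Skip impossible combinations such as 31 November.
            continue
    return dates


# "The 15th of I forget what month near the end of 1990"
# (October to December is an assumed interpretation).
recalled = candidate_dates(1990, 15, months=[10, 11, 12])

# Recorded start of the episode in which the fracture was treated.
admission = date(1990, 12, 15)

# True when the recorded admission is one of the dates the patient could mean.
same_event_possible = admission in recalled
```

A check like this can only suggest that the two events may be the same; the clinician would still have to judge it from the context of the record.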

- thomas beale

Tim_Cook5 · 5 June 2002 05:00

record and think that the two dates were probably the same, and that it
was therefore the same fracture. I would guess that this is a real
contrived long shot, and probably unrealistic, but we need some more
evidence from clinicians...

*** CAUTION *** This turned into a rambling about sociological
issues and may have no technical merit.

I am not a clinician, but ..............

I believe this is NOT unrealistic. It is 'how' people think.
Especially so when confronted with a question they did not expect in
a usually stressful environment. Yes, a family physician's office
is a stressful environment for most people. It is unfamiliar, they
are usually ill and the questions seem to come from out of nowhere.

Physicians tend to put together puzzles about their patients and
they seldom have all the pieces at one time. We probably cannot
name a system (at this time) that allows physicians to quickly and
easily put those pieces together. But, if given a place to store
those pieces, "in context", an implementation of this model will be
able to perform those types of retrieval functions.

Envision being able to scan a medical record for all partial dates.
Retrieve those dates along with some context of the CONTRIBUTION. A
computer could do very little with that information in most cases.
But a human mind (physician) could probably see
relationships/patterns very quickly.

The 'idea' of an EHR should be to provide the clinician with
appropriate information quickly, so they can do their job better
(improved patient care).

Dr. Robert Shepherd so aptly describes the real benefits an EHR
(with or without decision support) provides to a family physician.
I hope I can paraphrase it in an understandable way. He says that a
family doc will spend 90% of their clinical time on things they are
familiar with. The other 10% is where they benefit the most from
the added information that an EHR can provide quickly. That
information may come in the form of linked guidelines, accurate and
searchable documentation (possibly from other patients' EHRs),
current patient's history and problem list or sophisticated decision
support system. Dr. Shepherd supports this with community /
sociological reasoning. So even though a family doc is a
generalist, they tend to specialize to some degree because of these
social realities.

So, for these reasons I believe that providing ways to capture
imprecise dates/times in other than DV_TEXT is of benefit and
allowing for them to contain only 1/3 of the total data is not
without merit.

pschloeffel · 5 June 2002 10:51

I would support Tim's view on the grounds of faithfulness also.

Peter Schloeffel

pschloeffel · 5 June 2002 21:56

This would obviously be very uncommon but certainly a possible clinical scenario
which supports the case for inclusion.

Peter Schloeffel

Tim_Benson · 7 June 2002 08:49

Tom,
I do not think that structure can be justified if that structure is unlikely
to add either value or safety down the line. A situation where we
are not able to rely on a time as being either a strict point in time or an
interval is likely to create semantic problems. Unless you can rely on
strict chronological listing it is unhelpful to try to give spurious
precision. So my suggestion is that such fuzzy dates should be put into
free text only and all dates associated with any entry should only be the
ones we can rely on, such as date and time of entry.

What is more precise: "the first of the month, but do not remember which
month", "the night it rained" or "the morning that the kids were late for
school"? To me there is no point in using anything other than free text for
any of these. Julian dates can be very useful, but not all date information
fits the simple model and errors are made when we try to force it in.

We should always have a time stamp for computer entry, which should be
flagged if this is the only Julian-type date information that is available
(and must be used with great caution alongside free text data).

Tim

thomas.beale · 7 June 2002 08:10

Tim Benson wrote:

Tom,
I do not think that structure can be justified if that structure is unlikely
to add either value or safety down the line.

a priori, I agree 100% with this; our unwritten motto is: only make it structured if it is safely computable...

A situation where we
are not able to rely on a time as being either a strict point in time or an
interval is likely to create semantic problems. Unless you can rely on
strict chronological listing it is unhelpful to try to give spurious
precision. So my suggestion is that such fuzzy dates should be put into
free text only and all dates associated with any entry should only be the
ones we can rely on, such as date and time of entry.

well, I think it depends on how fuzzy. If someone just can't remember which date in the first week of July 2001 she first experienced organ rejection symptoms after a kidney transplant, that's not a completely unusable date; and one can easily imagine researchers wanting this kind of date to be in computable form in a study which is trying to characterise the efficacy of immunosuppressant drugs on transplant patients. It seems to me that this is a good case for creating a PARTIAL_DATE object with the day unknown (or maybe an INTERVAL<DATE> - it doesn't matter).
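
To illustrate why such a date is still computable, here is a rough sketch that stores "some day in the first week of July 2001" as a closed interval and tests it against a study window. The class `DateInterval` is a hypothetical stand-in for INTERVAL<DATE>, and the study window dates are made up for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateInterval:
    """Closed interval of calendar dates (hypothetical stand-in for INTERVAL<DATE>)."""
    lower: date
    upper: date

    def overlaps(self, other: "DateInterval") -> bool:
        # Two closed intervals overlap when each starts no later than the other ends.
        return self.lower <= other.upper and other.lower <= self.upper


# "Some day in the first week of July 2001"
symptom_onset = DateInterval(date(2001, 7, 1), date(2001, 7, 7))

# An illustrative study window; the dates are invented for this sketch.
study_window = DateInterval(date(2001, 6, 10), date(2001, 7, 9))

onset_in_window = symptom_onset.overlaps(study_window)
```

A query written this way matches the fuzzy date without pretending to know the exact day.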

We also added the following routines to PARTIAL_DATE:
probable_date: DATE
possible_dates: INTERVAL<DATE>

These provide statistically reasonable approximations to the true date, for the purposes of querying and research.

So the question in my mind is: when is a date or a time _too_ fuzzy to record as a structured object? (I should point out that we agree completely that very unreliable dates/times should indeed be text; but where is the cut-off point?)

What is more precise: "the first of the month, but do not remember which
month", "the night it rained" or "the morning that the kids were late for
school"? To me there is no point in using anything other than free text for
any of these. Julian dates can be very useful, but not all date information
fits the simple model and errors are made when we try to force it in.

again, on the face of it I agree. But it may be that in some circumstances, during a 1 minute discussion with the receptionist at the surgery, the patient agrees that "the night it rained" was indeed either Tuesday or Wednesday of last week; then we do in fact have a reasonable partial date...

apart from these (probably few) cases, I agree, we do not want to suggest that all statements about time no matter how fuzzy should be encoded as structured instances.

We should always have a time stamp for computer entry, which should be
flagged if this is the only Julian-type date information that is available
(and must be used with great caution alongside free text data).

agree. Ultimately it is up to the clinician to read the entry and act properly upon it of course.

I think we need to make some additions to the openEHR RM text to do with "when is a date too fuzzy to record in a structured way".

thanks for your comments

- thomas

Paul_Stephen_Woolman · 7 June 2002 10:18

In matching patient records for statistical purposes or for longitudinal
tracking it is frequently the case that dates are entered erroneously,
sometimes by the data entry person, sometimes by the patient themselves. For
instance the date of an operation can easily be "around March 1993". If there
is a date field in the system the operator will usually add "1st" so the date
becomes 1993-03-01 which is clearly incorrect but is the best assumption
given the information. Often the operator cannot proceed through the system
unless the date field is entered. We are humans and most clinical people will
do this sort of approximation in entering dates and "safety" depends a lot on
clinical judgement. Most computer systems do not allow for free text entry of
dates so I don't think that is a starter. What is useful in longitudinal
tracking of records is having some knowledge of the accuracy of a date so a
field like "date certainty" is used and has values like "certain, sure,
approximate, guess".

Here is an example of matching two records for longitudinal study:

record 1

surname BLOGGS
date of birth 1993-03-01 certainty APPROXIMATE
first name JAMES STUART
Health number 1234567890

record 2

surname BLOGGS
date of birth 1993-03-12 certainty CERTAIN
first name JAMEY S.
Health number 123456789

these two records are on the face of it different and could be a different
person but in fact are of the same person recorded in different places at
different times by different people. Human error explains the different
recordings. Having the certainty field helps to match the records together by
giving less weight to the approximate date in the first record. A matching
algorithm does this as well as taking the initial in the second record as
matching the second forename in the first record.
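
As a rough sketch of how a certainty field could down-weight a date during matching: the weights in `CERTAINTY_WEIGHT` and the function `dob_evidence` are hypothetical and chosen only for illustration; a real matching algorithm would calibrate them and combine evidence from all fields.

```python
from datetime import date

# Hypothetical weights per certainty level; real values would need calibration.
CERTAINTY_WEIGHT = {"CERTAIN": 1.0, "SURE": 0.8, "APPROXIMATE": 0.3, "GUESS": 0.1}


def dob_evidence(dob_a, cert_a, dob_b, cert_b):
    """Return signed evidence: positive for agreement, negative for disagreement.

    The magnitude is limited by the less certain of the two dates, so a
    mismatch involving an approximate date counts against a match only weakly.
    """
    weight = min(CERTAINTY_WEIGHT[cert_a], CERTAINTY_WEIGHT[cert_b])
    return weight if dob_a == dob_b else -weight


record_1 = {"surname": "BLOGGS", "dob": date(1993, 3, 1), "certainty": "APPROXIMATE"}
record_2 = {"surname": "BLOGGS", "dob": date(1993, 3, 12), "certainty": "CERTAIN"}

# Mismatch where one date is only approximate: a weak penalty.
approx_penalty = dob_evidence(
    record_1["dob"], record_1["certainty"], record_2["dob"], record_2["certainty"]
)

# The same mismatch if both dates had been recorded as certain: a strong penalty.
certain_penalty = dob_evidence(date(1993, 3, 1), "CERTAIN", date(1993, 3, 12), "CERTAIN")
```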

Paul Woolman
XML program manager
Information and Statistics Division
NHS Scotland

system · 7 June 2002 19:49

Tim,

The receiving system is not the criterion.
The criterion is the requirement to be able to store narrative, and to
retrieve and search narrative text, in a safe way.

Gerard

Surely the criterion for any structured data is whether another application
is expected to use that structured data in a way that (a) adds value and (b)
is safe. If either (a) or (b) is not true then structure simply adds cost
and complexity without benefit.

Gerard Freriks, arts

thomas.beale · 8 June 2002 01:57

Our current solution to this situation, as I mentioned in another post, was to add the following routines to the PARTIAL_DATE type:
probable_date: DATE
possible_dates: INTERVAL<DATE>

If the user GUI was then constructed so that a blank date, and a blank month were allowed, the software behind would create a PARTIAL_DATE object. These functions would provide sensible values for statistical computation (i.e. 15th of the month if date unknown, 1st/June if date and month unknown). The possible_dates function provides the outer limits of the possible values of the date, which can be used for query matching. I am not sure of how much skew this introduces, but it has to be better than having falsely accurate dates, or else no structured date at all.
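
A minimal sketch of the two routines as described above, assuming a hypothetical Python class `PartialDate` in which the year is always known, the day is always unknown, and the month may or may not be known. This is an illustration of the stated rules, not the openEHR specification, and a plain tuple stands in for INTERVAL<DATE>.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class PartialDate:
    """Hypothetical sketch of PARTIAL_DATE: day unknown, month possibly unknown."""
    year: int
    month: Optional[int] = None  # None means the month is unknown as well

    def probable_date(self) -> date:
        """Single representative date for statistical computation."""
        if self.month is None:
            # Day and month unknown: 1 June, following the rule in the post.
            return date(self.year, 6, 1)
        # Day unknown: 15th of the known month.
        return date(self.year, self.month, 15)

    def possible_dates(self) -> Tuple[date, date]:
        """Outer limits (first, last) of the dates this value could denote."""
        if self.month is None:
            return date(self.year, 1, 1), date(self.year, 12, 31)
        last_day = calendar.monthrange(self.year, self.month)[1]
        return date(self.year, self.month, 1), date(self.year, self.month, last_day)


# "Around March 1993": the day was left blank in the GUI.
operation = PartialDate(1993, 3)
probable = operation.probable_date()
earliest, latest = operation.possible_dates()
```

Here `probable` would feed statistical computation, while `earliest` and `latest` bound the range used for query matching.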

Paul, what do you think of this approach?

- thomas beale

system · 8 June 2002 15:54

Language is a funny thing.

Sometimes a concept that is used is very precise. (a specific date, a
specific object, e.g. a particular chair)
Sometimes the concept is vague. (some time, some date, any object with the
name chair)
Sometimes it is possible to define exactly what we mean. But what is a
perfect definition of a chair, so a person not having seen any in his life
understands it?

In Informatics many tend to think that everything can be defined precisely.
The reality is that, more often than not, it can't.
The problem is how to handle these imprecise concepts faithfully.

By having an attribute (like HL7) indicating this?

Gerard

Ps:
Hi Paul.
Is NHS Scotland interested in taking part in EN 13606 work with CEN?

Tim_Cook5 · 10 June 2002 00:28

[many very good points deleted for brevity]

> Envision being able to scan a medical record for all partial dates.
> Retrieve those dates along with some context of the CONTRIBUTION. A
> computer could do very little with that information in most cases.
> But a human mind (physician) could probably see
> relationships/patterns very quickly.

Perhaps. Or it could be a mess of obscurantist screen
junk. But if the mess was organised as the text of a story told by one human
being to another at a particular time it might be OK.

Exactly.

The implementation of that vision is indeed very tricky. My point
is that the "model" must accommodate it before any attempt at
implementation.

the other point of course that is very important for usability is the need for the
record to present less and less precision as the years recede:

Really? I had not considered this. Is it really distracting to
'see' a full date?
Does the brain not transform that during the process? Is this
'really important' or is it a 'really cool' technology problem to
conquer?

thomas.beale · 10 June 2002 08:03

Tim Benson wrote:

>Tom,
>I do not think that structure can be justified if that structure is unlikely
>to add either value or safety down the line. So in the situation where we
>are not able to rely on a time as being either a strict po
