# Suggestion wrt XML Archetypes & Templates

**Category:** [Technical (archive)](https://discourse.openehr.org/c/technical-archive/156)
**Created:** 2007-11-29 14:45 UTC
**Views:** 2
**Replies:** 23
**URL:** https://discourse.openehr.org/t/suggestion-wrt-xml-archetypes-templates/14714

---

## Post #1 by @Adam_Flinton

Dear All,

1\) Problem statement
2\) Solution
3\) Points to Note
4\) XSLT Sheet
5\) Summary

1\) Problem statement

I have been writing an OpenEHR publishing & QA routine for the NHS which is basically Ant, and which includes running XSLT tasks.

There is a problem with the current structure of the XML archetypes & templates, which is that the values are contained as a text() child of an element & sometimes as the text() child of a value child of the element. This is dangerous & (IMHO) wrong. The reasons being that:

A) A single value of that sort should be contained in an attribute.

B) It leads to a world of pain wrt "pretty-print"/indentation.

As an example, XMLSpy will automatically pretty-print XML because that makes it readable to the (human) reader. Equally, XSLT sheets often use indent="yes" in the output declaration:

    <xsl:output method="xml" version="1.0" encoding="utf-8"
        indent="yes" />

Firstly it means that what looks like

    <rm_type_name>                             ELEMENT </rm_type_name>

is actually:

    &#xA;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;ELEMENT&#xA;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;&#x9;

As a really quick example of this: get an XML archetype, open it in XMLSpy, press save, then open it in the Ocean Archetype Editor. Admire the way the text is now all over the place and has empty square boxes for the line endings (i.e. &\#xA;). Now try to save it as ADL. If you save as XML, the formatting etc. is retained.

Basically: open up an XML archetype in XMLSpy, click save, and you have a corrupt archetype.
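To make the effect described above concrete, here is a minimal editorial sketch in Python (not part of the original post), using only the standard library's `xml.etree.ElementTree`. The re-indented fragment is a hypothetical stand-in modelled on the `rm_type_name` example, and the `value` attribute form is the one proposed in this thread, not the published openEHR schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: an editor has re-indented the element's text content.
pretty_printed = "<rm_type_name>\n\t\t\tELEMENT\n\t\t</rm_type_name>"

# The same value carried in an attribute (the form proposed in this thread).
attribute_form = '<rm_type_name value="ELEMENT"/>'

text_value = ET.fromstring(pretty_printed).text
attr_value = ET.fromstring(attribute_form).get("value")

# The parser hands back the indentation as part of the text node.
print(repr(text_value))
print(repr(attr_value))

# A consumer of the element form has to strip the value itself.
print(text_value.strip() == attr_value)
```

Whether a given editor actually rewrites text content this way depends on the tool and on the schema type of the element, which is the point debated in the replies.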
Before people decry pretty-printing per se, bear in mind that:

i) A single long string is not readable.

ii) The ADL is pretty-printed, i.e. your ADL files do not come as one long string but are formatted in much the same way as XML is pretty-printed. The ADL takes care of this in basically the same way I am going to suggest that the XML does, i.e.

    description = <"Clinical description of the meconium">

vs

    description = Clinical description of the meconium

or

    description =                                 Clinical description of the meconium

etc.

Any XSLT which tries to extract values from the present structure must engage in code such as:

    <!-- select= yields the bare character; element content '&#x9;' would also carry the quote marks -->
    <xsl:variable name="tab" select="'&#x9;'"/>
    <xsl:variable name="nl" select="'&#xA;'"/>
    <xsl:variable name="v_rm_type_name_no_pp"
        select="translate(translate($v_rm_type_name/text(),$tab,''),$nl,'')" />

& that in itself is dangerous, as some editor might put in some formatting chars which are not being filtered out.

2\) Solution

Instead of using a text child, any value should go in a value attribute, e.g.

    <items id="description"> Clinical description of the meconium </items>

becomes:

    <items value="Clinical description of the meconium" id="description"/>

3\) Points to Note

A) The result is actually closer to the ADL, e.g.

    <items code="at0061">
       <items value="Clinical description of the meconium" id="description"/>
       <items value="Description" id="text"/>
    </items>
    <items code="at0062">
       <items value="Colour of meconium" id="description"/>
       <items value="Colour" id="text"/>
    </items>

vs

    ["at0061"] = <
        description = <"Clinical description of the meconium">
        text = <"Description">
    >
    ["at0062"] = <
        description = <"Colour of meconium">
        text = <"Colour">
    >

B) The files are approximately two thirds the size of the originals. This could be reduced further by using a shorter attribute name (e.g. val or even v).

C) The archetypes are much more readable to the average human, e.g.

    <details>
       <language>
          <terminology_id value="ISO_639-1"/>
          <code_string value="en"/>
       </language>
       <purpose value="To describe body fluids and secretions"/>
       <use/>
       <misuse/>
    </details>

vs:

    <details>
        <language>
            <terminology_id>
                <value>ISO_639-1</value>
            </terminology_id>
            <code_string>en</code_string>
        </language>
        <purpose>To describe body fluids and secretions</purpose>
        <use/>
        <misuse/>
    </details>

or

    <occurrences>
       <lower_included value="true"/>
       <upper_included value="true"/>
       <lower_unbounded value="false"/>
       <upper_unbounded value="false"/>
       <lower value="1"/>
       <upper value="1"/>
    </occurrences>

vs:

    <occurrences>
        <lower_included>true</lower_included>
        <upper_included>true</upper_included>
        <lower_unbounded>false</lower_unbounded>
        <upper_unbounded>false</upper_unbounded>
        <lower>1</lower>
        <upper>1</upper>
    </occurrences>

4\) XSLT Sheet

I have attached a mini XSLT sheet which takes a template or XML archetype & renders it into this format. Run the XSLT with Saxon, as Xalan.... shows how fragile the current situation is, as it picks up the "pretty-print" chars as text children & puts them in where there is no text child except the formatting chars.

5\) Summary

i) The present situation/structure is dangerous.

ii) Pretty-print is the norm & even the ADL is pretty-printed and has adopted a similar method to cope.

iii) The solution simplifies the XML in terms of both processing and human readability.

iv) The solution shrinks the file sizes.

Yours

Adam Flinton

[details="(attachments)"]
[setTextAsVal.xslt|attachment](upload://yQjHaRN9f71pJUjGOiK9929Gfbh.xslt) (1.55 KB)
[/details]

---

## Post #2 by @Heath_Frankel3

Adam,

> i) The present situation/structure is dangerous.

You need to get a better tool; Oxygen never splits an element value over multiple lines or adds whitespace. A tool that automatically does this is dangerous. I used to use XMLSpy and never experienced this, but after hearing this I am glad I was convinced to move to Oxygen.

> ii) Pretty-print is the norm & even the ADL is pretty printed and has adopted a similar method to cope.

Sure, but the tool should never add whitespace to a value; that is not the norm, it is simply wrong.

> iii) The solution simplifies the XML in terms of both processing and human readability.

I do not see this at all; in fact your solution breaks much processing which is derived directly from the Archetype Model. Microsoft uses a lot of XML documents in its products and many of them use elements to contain values. In fact if you go to W3Cschools you will see the majority of examples using element values, and this is a resource teaching the basics of XML.

> iv) The solution shrinks the file sizes.

Turning an element value into an attribute with name value saves a very minimal set of characters; I find it hard to see how you save a third. In some cases you might save a third (such as lower_included) but in others your solution actually increases the size. Take your example of lower and upper: a start tag of 5 characters, add the angle brackets and you have 7 characters. Using your solution, you have the attribute name of value, which is 5, plus 2 quotes, an equals sign and a space between the tag and the attribute, totalling 9 characters.

In the case of occurrences (or DV_INTERVALs in general), I think we should treat the unbounded and included properties as attributes because they provide metadata about how to interpret the real data, lower and upper. You will never utilise the unbounded and included values in isolation; they are always used in conjunction with the lower and upper. So I would suggest a change as follows:

    <occurrences>
        <lower included="true" >1</lower>
        <upper unbounded="true"/>
    </occurrences>

The included and unbounded attributes exist for both lower and upper with default values of false. Due to the openEHR assertions, you will never need more than 1 attribute on each element, as included and unbounded cannot both be true.

The thing is, if we start entertaining these kinds of changes we will end up in endless debates based on the religious beliefs of XML style. XML is just another computer language; all computer professionals have different styles when using those languages. There is no right and wrong style, just guidelines, but these are usually employed for consistency purposes assisting the readability, not because one style is more readable than another. Currently, the schema is as consistent as you will ever get.
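
As an editorial illustration of how the occurrences representation suggested above could be read back, here is a minimal Python sketch using the standard library. The `Bound` type and `read_bound` helper are hypothetical names introduced for this example; the only things taken from the post are the element and attribute names and the rule that `included` and `unbounded` default to false.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional


class Bound(NamedTuple):
    value: Optional[int]
    included: bool
    unbounded: bool


def read_bound(parent: ET.Element, tag: str) -> Bound:
    """Read <lower> or <upper>; both attributes default to false."""
    elem = parent.find(tag)
    included = elem.get("included", "false") == "true"
    unbounded = elem.get("unbounded", "false") == "true"
    # Strip the text so that a re-indented value still parses.
    text = (elem.text or "").strip()
    return Bound(int(text) if text else None, included, unbounded)


occurrences = ET.fromstring(
    "<occurrences>"
    '<lower included="true" >1</lower>'
    '<upper unbounded="true"/>'
    "</occurrences>"
)
lower = read_bound(occurrences, "lower")
upper = read_bound(occurrences, "upper")
print(lower)
print(upper)
```

Note that the bound itself still travels as element text in this layout, so the reader strips it before converting.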

If anything is going to be changed, then the representation of INTERVAL is probably the only candidate (there may be another one or two in similar vein: metadata assisting in the interpretation of the value).

Regards

Heath

---

## Post #3 by @system

Dear Adam,

I totally understand the XML issues that you described in your previous email. However, this problem doesn't exist if you use the oXygen XML editor. I just downloaded Altova XMLSpy 2008. I opened an archetype XML file using XMLSpy 2008, pretty-printed it and then saved it. I don't have any issues opening the saved XML using Ocean Archetype Editor (Release 1 candidate (1241)).

Additionally, putting the element text value in an attribute value would make the XML file look very ugly when the value is a long string, e.g. people can put a very long string (100 words or 200 words or even more) in the purpose, description, and use fields.

Regards,

Chunlan

---

## Post #4 by @Adam_Flinton

Heath Frankel wrote:

> Adam,
>
>> i) The present situation/structure is dangerous.
>
> You need to get a better tool, Oxygen never splits an element value over multiple lines or adds whitespace. A tool that automatically does this is dangerous. I used to use XMLSpy and never experienced this, but after hearing this I am glad I was convinced to move to Oxygen.

I like Oxygen but

A) XMLSpy is our std tool

B) http://www.oxygenxml.com/xml_pretty_print.html

C) Anything doing pretty print (inc. Oxygen) does the same things.

To quote from the Oxygen XML page above:

"Although writing documents with no indentation is a perfectly acceptable practice, it makes editing difficult and is error prone. It also makes the identification of exact error positions difficult. Formatting and Indenting, also called "Pretty Print", enables the XML documents to be neatly arranged in a manner that is consistent and promotes easier reading."

>> ii) Pretty-print is the norm & even the ADL is pretty printed and has adopted a similar method to cope.
>
> Sure, but the tool should never add whitespace to a value, that is not the norm, it is simply wrong.

Not true.

See above wrt Oxygen XML's view. I can quote you the relevant sections from the XML docs, e.g.

http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/

http://www.w3.org/TR/xmlschema-2/#rf-whiteSpace

>> iii) The solution simplifies the XML in terms of both processing and human readability.
>
> I do not see this at all, in fact your solution breaks much processing which is derived directly from the Archetype Model.

Why is that?

> Microsoft uses a lot of XML documents in its products and many of them use elements to contain values. In fact if you go to W3Cschools you will see the majority of examples using element values, and this is a resource teaching the basics of XML.

For instructional documents aimed at those learning XML it is nice and simple. If however you are looking to create a bullet-proof serialization in XML where the values matter, then it is a poor design.

>> iv) The solution shrinks the file sizes.
>
> Turning an element value into an attribute with name value saves a very minimal set of characters, I find it hard to see how you save a third. In some cases you might save a third (such as lower_included) but in others you solution actually increase the size. Take you example of lower and upper, a start tag of 5 characters, add the angle brackets and you have 7 characters. Using your solution, you have the attribute name of value, which is 5 plus 2 quotes, an equals sign and a space between the tag and the attribute, totalling 9 characters.

Run the XSLT on a set of files so as to get a reasonable average. I have done so on the NHS ones. It is about two thirds.
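
The character counts being traded in this exchange can be checked mechanically. The sketch below is an editorial illustration, not code from either poster. It assumes the simplest serializations, `<name>v</name>` versus `<name value="v"/>`, with no indentation and no `<value>` wrapper element, and the two helper functions are hypothetical names.

```python
def element_form(name: str, value: str) -> str:
    """Value carried as the element's text child."""
    return f"<{name}>{value}</{name}>"


def attribute_form(name: str, value: str) -> str:
    """Value carried in a 'value' attribute on an empty element."""
    return f'<{name} value="{value}"/>'


# Element names and values taken from the occurrences example in this thread.
samples = [("lower", "1"), ("upper", "1"),
           ("lower_included", "true"), ("upper_unbounded", "false")]

for name, value in samples:
    elem = element_form(name, value)
    attr = attribute_form(name, value)
    # Markup overhead is 2*len(name) + 5 for the element form and
    # len(name) + 12 for the attribute form, so they break even at 7.
    print(f"{name:16} element={len(elem):3} attribute={len(attr):3}")
```

Under these assumptions the attribute form is shorter only when the element name is longer than 7 characters: `lower` grows by two characters while `lower_included` shrinks by seven. The overall saving on a real corpus therefore depends on the mix of element names, on indentation, and on how many `<value>` wrapper elements disappear, which is why measuring a set of files, as suggested above, is the more reliable test.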

> In the case of occurrences (or DV_INTERVALs in general), I think we should treat the unbounded and included properties as attributes because they provide meta data about how to interpret the real data, lower and upper. You will never utilise the unbounded and included values in isolation, they are always used in conjunction with the lower and upper. So I would suggest a change as follows:
>
>     <occurrences>
>         <lower included="true" >1</lower>
>         <upper unbounded="true"/>
>     </occurrences>

How about in a template, e.g.

    <Items archetype_id="openEHR-EHR-CLUSTER.symptom.v2" path="/data[at0001]/events[at0002]/data[at0003]/items[at0005]/items" xsi:type="CLUSTER">

vs, say, in an archetype, where the same thing would be shown as:

    <archetype_id><value>openEHR-EHR-ACTION.procedure.v1draft</value></archetype_id>

So are templates wrong & archetypes right, or vice versa?

> The included and unbounded attributes exist for both lower and upper with default values of false. Due to the openEHR assertions, you will never need more than 1 attribute on each element as included and unbounded cannot be both true.
>
> The thing is, if we start entertaining these kinds of changes we will end up in endless debates based on the religious beliefs of XML style.

This is not about style, it's about safety. I have been involved in many large-scale XML projects. I have seen this before & it ends up with ugly situations. You cannot assume whitespace will not be added, as it is legitimate to pretty-print a document. If you are serious about a singular value, it goes in an attribute.

> Xml is just another computer language, all computer professionals have different styles when using those languages. There is no right and wrong style, just guidelines, but these are usually employed for consistency purposes assisting the readability, not that one style is more ready than another.
> Currently, the schema is as consistent as you will ever get.
>
> If anything is going to be changed, then the representation of INTERVAL is probably the only candidate (there may be another one or two in similar vein, meta data assisting in the interpretation of the value).
>
> Regards
>
> Heath

Then at each stage involving the use of archetypes and templates you are going to have to build in text normalization routines as per:

http://www.w3.org/TR/xpath#function-normalize-space

&

http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/

"[Definition:] The *normalized value* of an element or attribute information item is an ·initial value· <http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/#key-iv> whose white space, if any, has been normalized according to the value of the whiteSpace facet <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/datatypes.html#rf-whiteSpace> of the simple type definition used in its ·validation· <http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/#key-vn>:

*preserve*: No normalization is done, the value is the ·normalized value· <http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/#key-nv>

*replace*: All occurrences of #x9 (tab), #xA (line feed) and #xD (carriage return) are replaced with #x20 (space).

*collapse*: Subsequent to the replacements specified above under *replace*, contiguous sequences of #x20s are collapsed to a single #x20, and initial and/or final #x20s are deleted."

Also:

http://www.w3.org/TR/xmlschema-2/#rf-whiteSpace

&

http://www.w3.org/TR/REC-xml/#NT-S

"2.3 Common Syntactic Constructs

This section defines some symbols used widely in the grammar.

S <http://www.w3.org/TR/REC-xml/#NT-S> (white space) consists of one or more space (#x20) characters, carriage returns, line feeds, or tabs.
White Space

    [3] S ::= (#x20 | #x9 | #xD | #xA)+

*Note:* The presence of #xD in the above production is maintained purely for backward compatibility with the First Edition <http://www.w3.org/TR/1998/REC-xml-19980210>. As explained in *2.11 End-of-Line Handling* <http://www.w3.org/TR/REC-xml/#sec-line-ends>, all #xD characters literally present in an XML document are either removed or replaced by #xA characters before any other processing is done. The only way to get a #xD character to match this production is to use a character reference in an entity value literal."

Adam

Note that that means that you would almost certainly have to specify collapse & that no values could ever start or end with a space or contain more than one contiguous space.

---

## Post #5 by @system

My question:

- What is your justification for your statement?

> > Microsoft uses a lot of XML documents in its products and many of them use elements to contain values. In fact if you go to W3Cschools you will see the majority of examples using element values, and this is a resource teaching the basics of XML.
>
> For instructional documents aimed at those learning XML it is nice and simple. If however you are looking to create a bullet proof serialization in XML where the values matter then it is a poor design.

In my opinion things can equally be expressed in attributes and in 'elements', as this is subject to (local) agreements. Although CEN/tc251 published a report (CEN/tc251 TS 15211) some years ago in which they proposed to express data values as an attribute, I have my doubts. I think it is more correct to reserve attributes to express metadata about the data value in the 'XML-element'. Attributes to express: language, coding system, precision, etc.

Gerard Freriks

--

Gerard Freriks, MD
Huigsloterdijk 378
2158 LR Buitenkaag
The Netherlands

T: +31 252544896
M: +31 620347088
E: [gfrer@luna.nl](mailto:gfrer@luna.nl)

Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. Benjamin Franklin 11 Nov 1755

---

## Post #6 by @Andrew_Patterson

> Note that that means that you would almost certainly have to specify collapse & that no values could ever start or end with a space or contain more than one contiguous space.

This is what is specified in the openEHR schema for most of the elements you are talking about. For instance, all types derived from OBJECT_ID have a value element with type xs:token; this has the whitespace facet set to collapse, and is why I presume XMLSpy is feeling free to pretty-print using extra spaces and tabs. In the case where the type is xs:string, I'm guessing XMLSpy wouldn't add extra leading or trailing spaces (as that would be clearly changing the meaning of any element with whitespace set to preserve). I don't use Oxygen or XMLSpy so I can't really test this out.

Rather than the question of whether to use attributes or elements, my question is: do we have xs:token or xs:string set correctly as the type in the schema for all elements? I'm not sure anyone has really gone through and systematically determined what the whitespace ramifications are for each element (should LOCATABLE_REF/path be a string or a token, for instance? Without having the formal spec here with me, I would think that a path would normally ignore leading and trailing spaces, so perhaps it should be a token).

Andrew

---

## Post #7 by @thomas.beale

Adam Flinton wrote:

> To quote from the oxygen xml page above:
>
> "Although writing documents with no indentation is a perfectly
> acceptable practice, it makes editing difficult and is error prone. It
> also makes the identification of exact error positions difficult.
> Formatting and Indenting, also called "Pretty Print", enables the XML
> documents to be neatly arranged in a manner that is consistent and
> promotes easier reading."

but no-one is advocating creating documents with no whitespace, particularly, although many tools do, since the XML is intended for consumption by computers, not people. But whitespace between Elements is not the same as white space in an Element value.

>> Sure, but the tool should never add whitespace to a value, that is not the norm, it is simply wrong.
>
> Not true.
>
> See above wrt Oxygen XML's view. I can quote you the relevant sections from the XML docs e.g.
>
> http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/
>
> http://www.w3.org/TR/xmlschema-2/#rf-whiteSpace

well what this tells me is that if the whitespace facet of the type in a schema is set to 'preserve' then the whitespace is not changed. What happens to whitespace _between_ Elements doesn't matter too much (i.e. between tag end and new tag start), since this is just a question of indented formatting. What the debate here is about, as far as I understand, is whitespace within textual Element values, which should of course be preserved, else XML can't be used to send normal documentary text around.

> If however you are looking to create a bullet proof serialization in XML where the values matter then it is a poor design.

well \- let's have some evidence of that. If it is true, then change needs to be considered. But let's have the hard evidence first.

\- thomas beale

---

## Post #8 by @ian.mcnicoll

Just for info, I have the latest version of XMLSpy 2008 and cannot reproduce the problem with pretty-printing adding whitespace to element values. Although XMLSpy rather nicely word-breaks long text lines and indents appropriately, none of this whitespace appears to be saved.

Incidentally, I personally find element-preponderant XML easier to read than the attribute-laden equivalent.
Chaque a son gout!

Ian

---

## Post #9 by @Adam_Flinton

Chunlan Ma wrote:

> Dear Adam,
>
> I totally understand the XML issues that you described in your previous email. However, this problem doesn't exist if you use oXygen xml editor. I just downloaded Altova XMLSpy 2008. I opened an archetype XML file using XMLSpy 2008 and did pretty-print and then saved it. I don't have any issues to open the saved xml using Ocean Archetype Editor (Release 1 candidate (1241)). Additionally, putting element text value as an attribute value would make the xml file looks very ugly when the value is a long string, e.g. people can put very long string (100 words or 200 words or even more) for the purpose, description, and use fields.
>
> Regards,
>
> Chunlan

A) If you don't care about the layout of the text, e.g. the introduction of line endings & tabs etc., then wrt text that is true. If you do care about the layout then it would be best to have a \<text\> \</text\> child which contains a markup/layout dialect such as XHTML.

B) The ADL is pretty-printed & deals with this by in effect using the same markup as an XML attribute, e.g.:

    ["at0002"] = <
        description = <"*">
        text = <"Procedure started date time">

C) Yes, Oxygen can/does do pretty print. It is a std part of XML & has been since before 1.0, e.g.

http://www.oxygenxml.com/xml_pretty_print.html

Adam

---

## Post #10 by @Adam_Flinton

Gerard Freriks wrote:

> My question:
>
> \- What is your justification for your statement?

a) Safety

b) Efficiency

c) Best practice

> "
>>> Microsoft uses a lot of XML documents in its products and many of them use
>>> elements to contain values. In fact if you go to W3Cschools you will see
>>> the majority of examples using element values, and this is a resource
>>> teaching the basics of XML.
>>
>> For instructional documents aimed at those learning XML it is nice and simple. If however you are looking to create a bullet proof serialization in XML where the values matter then it is a poor design."
>
> In my opinion things can equally expressed in attributes and the 'elements', as this is subject to (local) agreements.
> Although CEN/tc251 has published a report (CEN/tc251 TS 15211) some years ago where they proposed to express data values as an attribute, I have my doubts.
> I think it is more correct to reserve attributes to express meta-data about the date value in the 'XML-element'.
> Attributes to express: language, coding system, precision, etc.

    <definition archetype_id="openEHR-EHR-EVALUATION.check_list-condition-third_party.v1" xsi:type="EVALUATION">
        <Rule name="Has anyone in your family had:" path="/data[at0001]/items[at0004]"/>
        <Rule name="Diabetes" path="/data[at0001]/items[at0004 and name/value='Question group']/items[at0002]"/>
    </definition>

????

From an OpenEHR template. Is this wrong? I would argue this is much better XML than that found in the XML serialization of an archetype.

Adam

---

## Post #11 by @Adam_Flinton

Gerard Freriks wrote:

> Thanks.
>
> But I'm curious in:
> Why?
>
> Why is you solution more safe?

A) You are definitively bookending the string. This is exactly the same as you do within the ADL, e.g.

    ["at0002"] = <
        description = <"*">
        text = <"Procedure started date time">
    >

The ADL above does not say:

    description = * text = Procedure started due time.

etc. Why is that? & would that be the same as:

    description = * text = Procedure started due time.

?

B) Even worse is the fact that an XML element can contain many text children even where it may look like there is just one. This can cause all sorts of fun, e.g.

http://www.informit.com/articles/article.aspx?p=31273&seqNum=12&rl=1

"The text of an element is considered *normalized* when it contains no two adjacent Text nodes, as was shown above. In general, deserializing an XML document into a DOM will yield normalized elements. However, when new Text nodes are inserted into the hierarchy, one can wind up with a denormalized element. While completely legal, various XML technologies have a difficult time handling denormalized elements. XPath, for example, depends on a normalized document tree structure to behave properly. Performing an XPath traversal against a document with denormalized elements would yield unexpected results. This can be prevented using the Node.normalize method, which recursively normalizes all ancestor Text nodes. Consider the following Java code:

    import org.w3c.dom.*;

    void appendText(Document doc, Node elem) {
      int nChildren = elem.getChildNodes().getLength();
      Node text1 = doc.createTextNode("hello ");
      // Declared as Text rather than Node: splitText is defined on org.w3c.dom.Text
      Text text2 = doc.createTextNode("world");
      elem.appendChild(text1);
      elem.appendChild(text2);
      // Splits "world" into two adjacent Text nodes, "wo" and "rld"
      text2.splitText(2);
      assert(elem.getChildNodes().getLength() == nChildren + 3);
      // Merges the adjacent Text nodes back into one
      elem.normalize();
      assert(elem.getChildNodes().getLength() == nChildren + 1);
    }

As shown in Figure 2.12, after the call to Text.splitText, there are three new Text node children. However, after the call to Node.normalize, the three adjacent Text nodes are folded into a single node containing the string "hello, world"."

> Why is your solution more efficient?

A) File sizes are smaller/the XML is less verbose.

B) The fact that you know exactly where the string starts and finishes means that using SAX etc. can be much faster as there is no need to normalize, i.e.
at present you would already have more verbose XML & then the only safe option is to always normalize the whole document before processing it.

C) XML attribute values are structural vs a function in most of the XML processing languages, e.g. XSLT or XPath. Compare

    /a/b/@c

vs

    /a/b/text()

or

    /a/b[@c="bob"]

vs

    /a/b[text() = "bob"]

> Why is your solution a better Best Practice?

In part for the reasons above. In part because experiences of failures caused by the ambiguities wrt the text child in XML have driven people to be pretty careful about using text unless they really need to. If you want a single string containing a value which will not contain child elements, e.g.

Good use for text child:

    some <strong>bold text</strong> in some documentation

Bad use for text child:

    at003

Again I would refer you to your very own ADL, which in essence has adopted the exact same solution to avoiding any textual ambiguities, via markup such as:

    ["at0030"] = <
        description = <"*">
        text = <"Material used">
    >
    ["at0031"] = <
        description = <"*">
        text = <"Procedure comments">
    >
    ["at0032"] = <
        description = <"*">
        text = <"Procedure comments">
    >
    ["at0033"] = <
        description = <"*">
        text = <"Procedure end date time">
    >

Were, say, the first element above to be rewritten, it could be seen as

    <at0030 description="*" text="Material used"/>

Adam

---

## Post #12 by @Adam_Flinton

Ian McNicoll wrote:

> Just for info, I have the latest version of XMLSpy 2008 and cannot
> reproduce the problem with Pretty-printing adding whitespace to
> element values. Although XMLspy rather nicely word breaks long text
> lines and indents appropriately, none of this whitesapce appears to be
> saved.

It can pretty print, as that's a std part of XML, e.g.

http://www.altova.com/manual2008/XMLSpy/spyenterprise/pretty_printxmltext.htm

> Incidentally, I personally find element-preponderant XML easier to read than the attribute laden equivalent. Chaque a son gout!

It's not about human readability. It is about having to normalize every archetype & template prior to loading it, e.g. right now the Ocean Archetype editor doesn't normalize & thus it breaks.

Adam

---

## Post #13 by @Adam_Flinton

Thomas Beale wrote:

> Adam Flinton wrote:
>
>> To quote from the oxygen xml page above:
>>
>> "Although writing documents with no indentation is a perfectly acceptable practice, it makes editing difficult and is error prone. It also makes the identification of exact error positions difficult. Formatting and Indenting, also called "Pretty Print", enables the XML documents to be neatly arranged in a manner that is consistent and promotes easier reading."
>
> but no-one is advocating creating documents with no whitespace, particularly, although many tools do, since the XML is intended for consumption by computers, not people. But whitespace between Elements is not the same as white space in an Element value.

http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-whiteSpace

"*whiteSpace* is applicable to all ·atomic· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-atomic> and ·list· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-list> datatypes.
For all ·atomic· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-atomic> datatypes other than string <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#string> \(and types ·derived· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-derived> by ·restriction· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-restriction> from it\) the value of \*whiteSpace\* is |collapse| and cannot be changed by a schema author; for string <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#string> the value of \*whiteSpace\* is |preserve|; for any type ·derived· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-derived> by ·restriction· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-restriction> from string <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#string> the value of \*whiteSpace\* can be any of the three legal values\. For all datatypes ·derived· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-derived> by ·list· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-list> the value of \*whiteSpace\* is |collapse| and cannot be changed by a schema author\. For all datatypes ·derived· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-derived> by ·union· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-union> \*whiteSpace\* does not apply directly; however, the normalization behavior of ·union· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-union> types is controlled by the value of \*whiteSpace\* on that one of the ·memberTypes· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-memberTypes> against which the ·union· <http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-union> is successfully validated\." So for string it's fine to format the text for readability by introducing tabs, linefeeds etc\. >>> Sure, but the tool should never add whitespace to a value, that is not the >>> norm, it is simply wrong\. >>> >> >> Not true\. >> >> See above wrt Oxygen XML's view\. 
I can quote you the relevant sections >> from the XML docs e\.g\. >> >> http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/ >> >> http://www.w3.org/TR/xmlschema-2/#rf-whiteSpace >>   > > well what this tells me is that if the whitespace facet of the type in a > schema is set to 'preserve' then the whitespace is not changed\. What > happens to whitespace \_between\_ Elements doesn't matter too much \(i\.e\. > between tag end and new tag start\), since this is just a question of > indented formatting\. What the debate here is about, as far as I > understand, is about whitespace within textual Element values \- which > should of course be preserved, else XML can't be used to send normal > documentary text around\. >   Or it has to be normalized at every point it is read in\. That means \(for example\) that you can never have text starting with or ending with a space & possibly you can never include tabs, linefeeds etc in your text\. >> If however you are looking to create a bullet proof serialization in XML >> where the values matter then it is a poor design\. >>   > > well \- let's have some evidence of that\. If it is true, then change > needs to be considered\. But let's have the hard evidence first\. > Look at every other major standard for XML\. e\.g\. XMI:   <eAnnotations xmi:id="\_2IHQUQ3aEdy0fvloa5NWrg" source="uml2\.diagrams"/>   <ownedComment xmi:id="\_ABfwQA3bEdy0fvloa5NWrg" body="Advanced Trace allows an external system to identify a Patient based on an NHS Number or a variety of search criteria including name, address and a date range for either birth or death\. Historic data may optionally also be searched, and/or returned\. Current and future dated data is always returned\. 
Multiple matching records may be returned, along with a MatchingLevel \(this will only be populated for an algorithmic search\) for each match, indicating the confidence of the match\.&\#xD;&\#xA;See the PDS \[SSRS\] for details of the responses returned by Advanced Trace\.&\#xD;&\#xA;" annotatedElement="\_2IHQUA3aEdy0fvloa5NWrg">     <eAnnotations xmi:id="\_ABfwQQ3bEdy0fvloa5NWrg" source="appliedStereotypes">       <contents xmi:type="Default\_0:Default\_\_Documentation" xmi:id="\_ABfwQg3bEdy0fvloa5NWrg"/>     </eAnnotations>   </ownedComment>   <packageImport xmi:type="uml:ProfileApplication" xmi:id="\_2IHQVA3aEdy0fvloa5NWrg">     <eAnnotations xmi:id="\_2IHQVQ3aEdy0fvloa5NWrg" source="attributes">       <details xmi:id="\_2IHQVg3aEdy0fvloa5NWrg" key="version" value="0"/>     </eAnnotations>     <importedPackage xmi:type="uml:Profile" href="pathmap://UML2\_PROFILES/Basic\.profile\.uml2\#\_6mFRgK86Edih9\-GG5afQ0g"/>     <importedProfile href="pathmap://UML2\_PROFILES/Basic\.profile\.uml2\#\_6mFRgK86Edih9\-GG5afQ0g"/>   </packageImport> Note how the documentation includes the formatting entities but that they are intended to be there\. 
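A short illustration of why the `&#xD;&#xA;` references in the XMI `body` attribute above survive a round trip: under XML 1.0 attribute-value normalization, a literal line break typed inside an attribute value is replaced by a space, whereas a character reference is kept as the character it names. The following is a minimal sketch using Python's standard-library `xml.etree.ElementTree`; the element name `comment` and the sample strings are made up for illustration and are not taken from any XMI file.

```python
import xml.etree.ElementTree as ET

# Character references inside an attribute value are kept by the parser
# as the characters they name (carriage return and line feed here).
with_refs = ET.fromstring('<comment body="line one&#xD;&#xA;line two"/>')

# A literal newline inside an attribute value is normalized to one space.
with_literal = ET.fromstring('<comment body="line one\nline two"/>')

body_from_refs = with_refs.get("body")
body_from_literal = with_literal.get("body")

print(repr(body_from_refs))
print(repr(body_from_literal))
```

In other words, formatting that is meant to be part of an attribute value has to be written as character references, which is what the XMI serializer above appears to do.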
SVG: <g fill="rgb\(232,232,255\)" stroke\-miterlimit="0" font\-family="'Arial'" stroke\-linejoin="round" stroke="rgb\(232,232,255\)"> <rect x="10" y="10" clip\-path="url\(\#clipPath1\)" width="1500" height="202" stroke="none"/> <rect x="10" y="10" clip\-path="url\(\#clipPath1\)" fill="none" width="1500" height="202" stroke="black"/> <image stroke="black" transform="matrix\(1,0,0,1,21,97\)" width="15" xlink:show="embed" xlink:type="simple" fill="black" clip\-path="url\(\#clipPath2\)" preserveAspectRatio="none" height="23" x="0" y="0" xlink:href="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAA8AAAAXCAYAAADUUxW8AAAAgUlEQVR42s1UwQnA&\#13;&\#10;MAiMwSGy/3TZIq0PQcRqVQoV8hBz5jw1sPc5o2hzNKwFRumsBSHgLtMGU4ASyAsp&\#13;&\#10;2pygXHP55ZbaRJmPFXPVlpTZZ5AuByNqnvoz09efjqelqPbd8fyEttV7jAAeK4wA&\#13;&\#10;Xp/x7UCEYL2OUSLwPsB0zU\+ts5bjArh6RxD5kW1tAAAAAElFTkSuQmCC" xlink:actuate="onLoad"/> <rect x="130" y="89" clip\-path="url\(\#clipPath3\)" fill="white" width="111" rx="4\.5" ry="4\.5" height="61" stroke="none"/> <rect x="130" y="89" clip\-path="url\(\#clipPath3\)" fill="none" width="110" rx="4" ry="4" height="60" stroke="black"/> <text x="153" y="124" clip\-path="url\(\#clipPath4\)" fill="black" stroke="none" xml:space="preserve">AR1\_Task1</text> <circle clip\-path="url\(\#clipPath5\)" fill="white" r="14\.5" cx="70\.5" cy="119\.5" stroke="none"/> <circle clip\-path="url\(\#clipPath5\)" fill="none" r="14\.5" cx="70\.5" cy="119\.5" stroke="black"/> <text x="58" y="147" clip\-path="url\(\#clipPath6\)" fill="black" stroke="none" xml:space="preserve">Start</text> <line clip\-path="url\(\#clipPath7\)" fill="none" x1="36" x2="36" y1="11" y2="210" stroke="rgb\(169,169,169\)"/> <rect x="10" y="10" clip\-path="url\(\#clipPath1\)" fill="none" width="1500" height="202" stroke="rgb\(169,169,169\)"/> <rect x="10" y="265" clip\-path="url\(\#clipPath8\)" width="1500" height="200" stroke="none"/> <rect x="10" y="265" clip\-path="url\(\#clipPath8\)" fill="none" width="1500" height="200" 
stroke="black"/> <image stroke="black" transform="matrix\(1,0,0,1,21,351\)" width="15" xlink:show="embed" xlink:type="simple" fill="black" clip\-path="url\(\#clipPath2\)" preserveAspectRatio="none" height="23" x="0" y="0" xlink:href="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAA8AAAAXCAYAAADUUxW8AAAAhElEQVR42s1TQQ7A&\#13;&\#10;IAgbpo/w/6/zF25eFsKQyuZhJBwEiy2ItNb78dKKDdQqt3s5bbDJi8njso6FL2sb&\#13;&\#10;oOEeCwrWRbwCZeXSjLps7fZ/RqX7YHWDUbMFU5pnwP0NyxhYR\+1Zy8Cqvk\+0vdmD&\#13;&\#10;ASJWYIBozlj9EBRsV5IVClcyrXk2Om85TlGWUZ\+e/kVBAAAAAElFTkSuQmCC" xlink:actuate="onLoad"/> <rect x="293" y="294" clip\-path="url\(\#clipPath9\)" fill="white" width="153" rx="4\.5" ry="4\.5" height="94" stroke="none"/> <rect x="293" y="294" clip\-path="url\(\#clipPath9\)" fill="none" width="152" rx="4" ry="4" height="93" stroke="black"/> XSLT:     <xsl:strip\-space elements="\*" />     <xsl:param name="p\_FileListDoc" />     <xsl:param name="p\_ProcFileListDoc" />     <xsl:param name="p\_ReportDoc" />     <xsl:template match="/">         <\!\-\- <xsl:message>             p\_configDoc = <xsl:value\-of select="$p\_configDoc" />             </xsl:message> \-\->         <xsl:element name="root" namespace="">             <xsl:element name="errors" namespace="">                 <xsl:call\-template name="ValidateADL">                     <xsl:with\-param name="p\_FileListDoc"                         select="$p\_FileListDoc" />                 </xsl:call\-template>                 <xsl:call\-template                     name="Process\_Contains\-draft\-archetype">                     <xsl:with\-param name="p\_rootNode" select="\." />                 </xsl:call\-template>             </xsl:element>             <xsl:element name="files" namespace="">                 <xsl:call\-template name="ProcessFileNames">                     <xsl:with\-param name="p\_rootNode" select="\." 
/>                 </xsl:call\-template>             </xsl:element>         </xsl:element>     </xsl:template> XSD: <xs:complexType name="ApplicationRolesType">         <xs:annotation>             <xs:documentation>Collects the application roles defined in this file</xs:documentation>         </xs:annotation>         <xs:sequence>             <xs:element name="ApplicationRole" type="RimArtefactType" minOccurs="2" maxOccurs="unbounded">                 <xs:annotation>                     <xs:documentation>A definition of an application role\.</xs:documentation>                 </xs:annotation>             </xs:element>         </xs:sequence>     </xs:complexType> ebXML:   <BinaryCollaboration name="Request Catalog">     <AuthorizedRole name="requestor"/>     <AuthorizedRole name="provider"/>     <BusinessTransactionActivity name="Catalog Request"                                  businessTransaction="Catalog Request"                                  fromAuthorizedRole="requestor"                                  toAuthorizedRole="provider"/>   </BinaryCollaboration> Take an easy example, which is \(X\)HTML: <a class="docSubTitle" href="\#Table\_of\_Contents:"     name="Getting\_ANT">\(1\) Getting ANT </a> looks exactly the same to the viewer as: <a class="docSubTitle" href="\#Table\_of\_Contents:"     name="Getting\_ANT"> \(1\)             Getting                                                         ANT </a> This is because the value contained as the text child has no use for, or interest in, any textual formatting at that level, i\.e\. a new line is a <br/> etc\. Adam --- ## Post #14 by @Lisa_Thurston1 Hi Adam and all I think there might have been some misunderstanding regarding the problem you raised. Yes, Oxygen and XMLSpy will add unpredictable whitespace characters into an element when it is pretty printed and resaved. The problem is that the XML tools don't know they are dealing with elements whose space must be preserved. 
That can be fixed using xml:space="preserve" attributes on all leaf elements which are values ([http://www.w3.org/TR/2000/REC-xml-20001006#sec-white-space](http://www.w3.org/TR/2000/REC-xml-20001006#sec-white-space)). Then the archetype XML can safely be pretty printed or collapsed back into ugly print without affecting the content or meaning. There doesn't appear to be any 'best practice' on this particular question. I don't think this naturally means we should be using XML attributes to store the values. There is only a tiny 'efficiency' gain by using attributes (approx. 6 characters less per value) which I don't think offsets the value of having elements consistently used all the way throughout the serialisation. What I think it does mean is that we should add xml:space="preserve" attributes to our free-text leaf elements at the time of serialisation so that the current problem is resolved. Lisa --- ## Post #15 by @Adam_Flinton Lisa Thurston wrote: > Hi Adam and all > > I think there might have been some misunderstanding regarding the > problem you raised\. Yes Oxygen and XmlSpy will add unpredictable > whitespace characters into an element when pretty printed and resaved\. > The problem is that the XML tools don't know they are dealing with > elements whose space must be preserved\. That can be fixed using > xml:space="preserve" attributes on all leaf elements which are values > \(http://www.w3.org/TR/2000/REC-xml-20001006#sec-white-space). Then the > archetype XML can safely be pretty printed or collapsed back into ugly > print without affecting the content or meaning\. > > There doesn't appear to be any 'best practice' on this particular > question\. I don't think this naturally means we should be using XML > attributes to store the values\. There is only a tiny 'efficiency' gain > by using attributes \(approx\. 
6 characters less per value\) which I > don't think offsets the value of having elements consistently used all > the way throughout the serialisation\. > > What I think it does mean is that we should add xml:space="preserve" > attributes to our free\-text leaf elements at the time of serialisation > so that the current problem is resolved\. > I reserve my views wrt attributes vs text\(\); however, that would do, on the proviso of a bit of testing with many tools, as it used to be patchily supported by different tools\. I accept that was a few years back & things may well have improved\. So the next question then is when will the tools support this? Adam --- ## Post #16 by @thomas.beale Adam Flinton wrote: > > I reserve my views wrt attributes vs text\(\) however that would do on the > proviso of a bit of testing with many tools as it used to be patchily > supported by different tools\. > > I accept that was a few years back & things may well have improved\. > > So the next question then is when will the tools support this? >   looks like we have arrived at a useful point \- first thing we need is an analysis of changes to the XML\-schemas\. If Lisa's change is all that is needed and someone wants to update the current schemas to make this work, we can put it on the main TRUNK so that everyone can have access to it\. Further analysis will be needed for the tools, but I would not expect big problems\. Generally they are using orthodox XML parsers which I assume respect the whitespace settings in an XML schema\.\.\. \- thomas --- ## Post #17 by @Heath_Frankel3 No XML Schema changes are required; in fact, the schema already indicates that the string data should have space preserved as per the W3C references provided by Adam\. 
The problem is that, because the schema specifies something is a string type, it is not required to be specified in the XML document, and when a tool such as XMLSpy reads the document it doesn't know what type the element is without referencing the schema, so it doesn't apply the default space='preserve' attribute when it does a pretty\-print\. So technically there is nothing wrong with the current XML\. However, to support these tools that apply pretty print before checking the schema to determine if they are allowed to, we could explicitly add this space attribute in the data \(alternatively, we might be able to provide the type attribute instead, but we haven't tested this yet\)\. The problem is forcing the XML serialiser to put these explicit attributes in the data\. We will explore this\. Stepping back a bit, would it be sufficient \(in the short term at least\) to just have the XML pretty printed out of the tools rather than a single line so that you are not inclined to use the problematic XMLSpy pretty print? Heath --- ## Post #18 by @Adam_Flinton Thomas Beale wrote: > Adam Flinton wrote: >> >> I reserve my views wrt attributes vs text\(\) however that would do on >> the proviso of a bit of testing with many tools as it used to be >> patchily supported by different tools\. >> >> I accept that was a few years back & things may well have improved\. >> >> So the next question then is when will the tools support this? >>   > > looks like we have arrived at a useful point \- first thing we need is > an analysis of changes to the XML\-schemas\. If Lisa's change is all > that is needed and someone wants to update the current schemas to make > this work, we can put it on the main TRUNK so that everyone can have > access to it\. > > Further analysis will be needed for the tools, but I would not expect > big problems\. Generally they are using orthodox XML parsers which I > assume respect the whitespace settings in an XML schema\.\.\. 
> > \- thomas > I would, though, like to enquire wrt the rationale of containing \_id info in a separate <value/> element\. If you are being consistent, instead of:        <terminology\_id>            <value>ISO\_639\-1</value>        </terminology\_id> it should be simply:        <terminology\_id>ISO\_639\-1</terminology\_id> or <terminology\_id value="ISO\_639\-1"/> Adam --- ## Post #19 by @Adam_Flinton Heath Frankel wrote: > No XML Schema changes required, in fact the schema already indicates that > the string data should have space preserved as per the W3C references > provided by Adam\. The problem is that because the schema specifies > something is a string type it is not required to be specified in the XML > document and when a tool such as XMLSpy reads the document it doesn't know > what type the element is without referencing the schema, so it doesn't apply > the default space='preserve' attribute when it does a pretty\-print\. > > So technically there is nothing wrong with the current XML\. However, to > support these tools that apply pretty print before checking the schema to > determine if they are allowed to, we could explicitly add this space > attribute in the data \(alternately, we might be able to provide the type > attribute instead, but we haven't tested this yet\)\. The problem is forcing > the XML serialiser to put these explicit attributes in the data\. We will > explore this\. > > Stepping back a bit, would it be sufficient \(in the short term at least\) to > just have the XML pretty printed out of the tools rather than a single line > so that you are not inclined to use the problematic XMLSpy pretty print? > That might work\. I say might as A\) XMLSpy pretty prints by default & it might still think that the pretty printed doc isn't pretty enough\. B\) Ditto wrt XSLT with indent="yes"\. 
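To make the failure mode discussed in this thread concrete, here is a minimal sketch in Python (standard-library `xml.etree.ElementTree` only). It does not reproduce XMLSpy or an XSLT processor; the padded document is written by hand to imitate what a re-indenting tool might emit, and the wrapper element `node` and the helper `normalized` are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

# Value held as element text, as originally serialised on one line.
compact = ET.fromstring("<node><rm_type_name>ELEMENT</rm_type_name></node>")

# The same document after a tool has re-indented inside the leaf element.
padded = ET.fromstring(
    "<node>\n\t<rm_type_name>\n\t\tELEMENT\n\t</rm_type_name>\n</node>"
)

# Value held in an attribute (the alternative proposed in this thread).
as_attribute = ET.fromstring('<node>\n\t<rm_type_name value="ELEMENT"/>\n</node>')


def normalized(text):
    """Collapse runs of whitespace and trim both ends (lossy for free text)."""
    return " ".join((text or "").split())


compact_text = compact.find("rm_type_name").text
padded_text = padded.find("rm_type_name").text
attribute_value = as_attribute.find("rm_type_name").get("value")

print(repr(compact_text))             # the value as written
print(repr(padded_text))              # the value plus the indentation
print(repr(normalized(padded_text)))  # recovered only by normalizing
print(repr(attribute_value))          # unaffected by indentation between tags
```

The reader of the padded document only gets the original value back by normalizing, and that normalization would also strip leading or trailing spaces that were intended, which is the objection raised earlier in the thread.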
Adam --- ## Post #20 by @Laura_Sato_NHS_CFH All \- this is not my area of expertise, but please find attached a response from a colleague in the UK \- for consideration, and some information about W3C direction\. Best regards, Laura --- ## Post #21 by @Lisa_Thurston1 Adam Flinton wrote: > I would like though to enquire wrt the rationale of containing \_id info > in a separate <value/> element\. > > If you are being consistent > instead of : > >        <terminology\_id> >            <value>ISO\_639\-1</value> >        </terminology\_id> > > it should be simply: > >        <terminology\_id>ISO\_639\-1</terminology\_id> > > or <terminology\_id value="ISO\_639\-1"/> > > Adam >   There is no special rationale\. It is simply the default serialisation of the type TERMINOLOGY\_ID\. Lisa --- ## Post #22 by @Heath_Frankel3 Adam & Lisa, There is a very specific rationale and it is consistent\. The XML Schema is a direct serialisation of the UML openEHR reference models\. Every class is an XML schema type and every attribute is an element, except for archetype\_node\_id as it is a metadata attribute\. So in the case of CODE\_PHRASE, it has an attribute of terminology\_id \(hence the element terminology\_id\) of type TERMINOLOGY\_ID\. TERMINOLOGY\_ID has an attribute of value of type string \(hence the element value\)\. Paths are absolutely critical in openEHR and they are based on XPath\. If we start arbitrarily changing the schema based on what someone thinks is good XML, we will break the openEHR path to XPath correspondence, making any mapping rules more complex and error prone\. This is why the terminology ID value is not represented as a value attribute or element text node\. Yes, it might seem inefficient, but it was deemed to be important to make sure the logical model to implementation mapping was consistent and to ensure paths worked\. In comparison to HL7 v3, the data:noise ratio is still considerably better\. 
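The path argument can be sketched as follows. This is an illustrative Python fragment using the standard-library `xml.etree.ElementTree` and its limited XPath support, not openEHR tooling; the wrapper element `language` and the `en` code are assumptions made for the example, while `terminology_id`, `value` and `code_string` follow the attribute names of the model types mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: every model attribute is serialised as an element,
# so element names mirror the attribute names of CODE_PHRASE and TERMINOLOGY_ID.
code_phrase = ET.fromstring(
    "<language>"
    "<terminology_id><value>ISO_639-1</value></terminology_id>"
    "<code_string>en</code_string>"
    "</language>"
)

# The model path terminology_id/value is used unchanged as the XML path.
model_path = "terminology_id/value"
terminology = code_phrase.findtext(model_path)

# If the value were flattened into an XML attribute, the same path
# would no longer resolve and a special-case mapping rule would be needed.
flattened = ET.fromstring(
    '<language><terminology_id value="ISO_639-1"/>'
    "<code_string>en</code_string></language>"
)
missing = flattened.findtext(model_path)

print(terminology)
print(missing)
```

With the element-per-attribute serialisation the lookup needs no knowledge of which types are "simple"; with the flattened form, the path would have to be rewritten to an attribute lookup for exactly those types.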
Regards Heath --- ## Post #23 by @thomas.beale Adam Flinton wrote: > I would like though to enquire wrt the rationale of containing \_id info > in a separate <value/> element\. > > If you are being consistent > instead of : > >        <terminology\_id> >            <value>ISO\_639\-1</value> >        </terminology\_id> > > it should be simply: > >        <terminology\_id>ISO\_639\-1</terminology\_id> > > or <terminology\_id value="ISO\_639\-1"/> > Adam, when you say it 'should' be \- either pulled up a level, with an object attribute removed OR represented as an XML attribute \- what is the driver? Is it semantic \(you think there is something wrong with the representation of the object structure defined by the specification\) or is it to do with space/signal\-to\-noise \(using one of the last two methods uses fewer characters\)? The way it currently is, is due to a direct machine\-performed object serialisation process \- in other words, it simply follows the same rules for transforming any object data into XML\. Your suggestion \(I presume\) is a special case of the general idea of representing all so\-called basic types \(Strings, Integers, dates etc\) as XML attributes rather than as XML elements\. But we have already just discussed and agreed that long text strings \(especially containing unicode, backslash quoting and whitespace\) should be XML elements\. As I have said before, what I think is most important is regular encoding from data to and from XML, so that a\) software is as simple and clean as possible and b\) changes are not needed due to particular content \(i\.e\. data\)\. Now, ideally we would minimise use of bandwidth / space with the representation as well\. The problem is that XML is pretty poorly designed for efficiently representing data, and has a poor signal to noise ratio\.\.\.making data serialise in a way that is either 'more aesthetic' or smaller always implies more complex software containing exceptional rules\. 
Further, although XML isn't well designed for data representation, in its original design, 'attributes' were intended for meta\-data items, rather than 'data'\. Whether this semantic needs to be retained in the XML we are talking about here is a question\. So the question is: at what level do we include exceptional processing to reduce space wastage, since this complicates the software? How much do we compromise the intended semantics of XML, where attributes are designed for holding meta\-data \(including real meta\-data, e\.g\. things like xsi:type etc\)? Any idea of saving space has to be done on the basis of a study of high volumes of representatively diverse data\. Saving 10 bytes is not interesting, but saving 10Gb/minute in a large data processing system is\. I will go out on a limb and say that 'style' has no place in good engineering, only good engineering does \- correctness, performance, maintainability etc\. With all that in mind \- if the community wants to make the appropriate analysis of data and propose a more space\-efficient schema, I am not against it\. But the needs of correctness \(= patient safety\) must be satisfied\. \- thomas beale --- ## Post #24 by @Adam_Flinton > Any idea of saving space has to be done on the basis of a study of high > volumes of representatively diverse data\. Saving 10 bytes is not > interesting, but saving 10Gb/minute in a large data processing system > is\. I will go out on a limb and say that 'style' has no place in good > engineering, only good engineering does \- correctness, performance, > maintainability etc\. > > With all that in mind \- if the community wants to make the appropriate > analysis of data and propose a more space\-efficient schema, I am not > against it\. But the needs of correctness \(= patient safety\) must be > satisfied\. > > \- thomas beale > When will the tooling decorate the generated XML archetypes with the required attribute? Pretty printing is the norm\. 
The text should be normalized & the normalization should be enforceable\. Adam --- **Canonical:** https://discourse.openehr.org/t/suggestion-wrt-xml-archetypes-templates/14714 **Original content:** https://discourse.openehr.org/t/suggestion-wrt-xml-archetypes-templates/14714