Documentation Desperation

Hi All,

Over the past several years I have discussed this issue with Tom Beale
on mailing lists, off mailing lists and in person.

The issue is that Framemaker is a proprietary and basically non standard
document format. I fully understand that Tom enjoys the desktop
publishing capabilities that it gives him and that he is familiar with
the application.

However, we the "open content" community end up with a proprietary
format (Framemaker) and a dead-end format (PDF) for specifications that
are advertised as being open and available.

It is almost the ultimate sarcastic humor (on the scale of Monty
Python) that here we are trying to deliver computable healthcare
information and our own specifications are locked up in these two
formats. We cannot manipulate them into any kind of help files in order
to integrate them into an application and god forbid we think about
machine translation into other languages.

So, I have to ask myself, as well as all of the members of the openEHR
community: what is wrong with the international, open standard for
document layout, (La)TeX? It seems to work well for all major
publishers, so why can't it work for openEHR?

Why do we not insist on our documentation being in a format that is more
useful to us as a broad and open community?

Thanks for listening.

--Tim

Hi Tim,

I have said often in the past I would be happy to see the documents move to such a format if:
a) anything remotely approaching the quality of FrameMaker can be found (I have yet to see it; OpenOffice is not even close)
b) someone is going to fund the changeover

Others in this community may not care about visual quality - I am interested in feedback on this point.

I personally have no problem with PDF - it is an output format, not an editing format, and most tools output PDF (including OpenOffice). Personally, for readability, I think it leaves wikis and web sites for dead, but of course the latter have a different function.

I did have a plan for which I don’t currently have enough time, which is to get FrameMaker 9 (I have done that part of the plan) and convert all the docs to structured DITA XML (which FM9 supports, along with a lot of other tools I have never tried - http://www.ditanews.com/tools/desktop_editors/) from which they could potentially be converted to some other XML for another tool. I don’t have my head around all the publishing XML formats yet, but I suspect it is possible. On the topic of help files, it has already been done with FM9 XML output, and integrated experimentally into a Java environment, so one thing Frame doesn’t do is lock you into anything (unless you think DITA XML is a lock-in).

I don’t know whether Latex is the right solution or not; I don’t know whether the GUI editors (main one seems to be Lyx) are any good or not; editing raw markup is out of the question.

I would be interested to know what the community thinks we should be aiming for in terms of output and quality; the question of how to get there can then be thought about separately.

  • thomas beale

Tim Cook wrote:

It’s understandable that openEHR needs more resources to maintain the documentation source files in some other (non-proprietary and open) format. But even just a parallel publication of the source files as such (DITA XML seems a good candidate) would be low cost and useful for consumption by other tools/XSLT to produce different views – even if use of Framemaker is continued for actual editing of the documentation source.
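
As a rough illustration of the point above about other tools consuming the source, here is a minimal sketch in Python. It assumes the specifications were published as DITA topics; the topic below, its id, title and paragraphs are invented placeholders (not real openEHR content), and `topic_to_help_text` is a hypothetical helper name. It relies only on the `topic`, `title`, `body` and `p` elements of a basic DITA topic.

```python
import xml.etree.ElementTree as ET

# A tiny, invented DITA topic used purely as a placeholder.
DITA_TOPIC = """<topic id="example_topic">
  <title>Example topic</title>
  <body>
    <p>First placeholder paragraph.</p>
    <p>Second placeholder paragraph.</p>
  </body>
</topic>"""


def topic_to_help_text(xml_source):
    """Turn one DITA <topic> into a plain-text help entry (title plus paragraphs)."""
    topic = ET.fromstring(xml_source)
    title = topic.findtext("title", default="(untitled)")
    # Collect the text of every <p>, including text inside nested inline elements.
    paragraphs = ["".join(p.itertext()).strip() for p in topic.iter("p")]
    return "\n".join([title, "=" * len(title)] + paragraphs)


help_text = topic_to_help_text(DITA_TOPIC)
print(help_text)
```

The same parsed tree could just as well feed an HTML page or an application help index; the point is only that an XML source can be read by ordinary tools without FrameMaker.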

Just my 2 cents.
Lisa

Thomas Beale wrote:

T[io]m,

I don't think the documentation issue is as clear cut as Tim suggests.
Here are my observations:-

1. The existing PDF documentation is excellent - far better than many
commercial projects. This is partly due to the use of Framemaker, but
mostly due to Tom's commitment, knowledge and skills.

2. In the ideal world, openEHR would have infrastructure that could
readily allow for distributed documentation authoring in multiple
languages with solid version management, from which PDF, XML and HTML
outputs could be autogenerated, and which clearly differentiated
Release vs. Draft status of both the specifications to which the
documentation referred as well as the documentation versions
themselves. Indexing of the documentation artefacts would be
comprehensive and capable of being integrated with the HTML
documentation bundle(s).

3. There is no affordable suite of products that can easily lead to
such an ideal world. Any real-world solution will involve trade-offs.

4. Framemaker can, and does produce the best quality documentation in
PDF format.

5. Framemaker is expensive and recent versions only run on Windows.

6. The openEHR community is still small such that few people will have
the time, skills and willingness to contribute to authoring
documentation.

7. Both Latex and DITA offer potential for single sourcing of
documentation in PDF, XML and HTML. Both are supported by a range of
cross-platform tools starting at zero cost. From a PDF quality
perspective, neither matches Framemaker, particularly in the areas of
UML and other diagram integration, version management through change
bars or indexing.

8. DITA offers the best potential for distributed authoring,
semi-automatic language translation, automated documentation builds. There
is a rapidly growing array of tools supporting DITA editing and
transformations.

9. DITA transformations to PDF, XML or HTML are normally done using
the freely available, Java-based DITA Open Toolkit. This DITA-OT is
bundled with a number of XML editors, including oXygen. I've run DITA
transformations in oXygen on Linux, Mac OS X, and Windows XP - from
the same DITA source files on the same networked folder!! The
behaviour and output is identical!! One rarely comes across such
platform independence. However, tracking down errors within the
DITA-OT can be time-consuming and frustrating.

10. DITA-based authoring using these modified XML editors has a long
way to go to be usable by non-XML speaking authors - a parallel with
the LaTeX world, by the way. Framemaker 9 and Framemaker >7.1
supplemented with Leximation's DITA plugin are both potential options
as a high quality authoring environment based on DITA. I've dabbled
with both, and have my doubts - particularly if there's a desire to
broaden the authoring base.

11. The existing Framemaker files could be converted to DITA using the
low cost (Windows) tool MIF2Go. This (plus oXygen) could provide a
relatively quick and painless path for Tim and others to produce
XML/HTML renditions of the current specifications for incorporation into
applications.

12. My recommendation would be to consider migrating to DITA, with the
proviso that:- Producing and maintaining documentation is a
time-consuming (and often thankless) task. Producing and maintaining
documentation of the current quality of the openEHR material will
likely be an ongoing challenge to the openEHR community for years to
come.

eric

Hi Eric,

T[io]m,

I don't think the documentation issue is as clear cut as Tim suggests.
Here are my observations:-

1. The existing PDF documentation is excellent - far better than many
commercial projects. This is partly due to the use of Framemaker, but
mostly due to Tom's commitment, knowledge and skills.

This last phrase is no doubt true.

You made many good observations. However, is having perfect PDFs a good
trade-off for NOT having a format that can easily be translated into
other languages, a format that is supported by open source tools on
multiple platforms and a format that can be reused inside applications?

I've looked around at many open source projects and many of them have
pretty good looking documentation coming out of LaTeX source. Some use LyX
to produce their PDFs; some use Scribus. But almost all of them have
multiple language versions.

Another important point: what if Tom were hit by a bus tomorrow?

--Tim

Hi Tim and all,

I agree with your proposal, (La)TeX, because it lets a community
share documents with SCM tools (such as CVS, Subversion, git). Such tools
make the points below clear:
* Who committed the change.
* When it was committed.
* Where the change was made (diff).
For these reasons, plain text with some formalism would be preferred over
a binary file like PDF.
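
The "where" point can be sketched in a few lines of Python using the standard `difflib` module. The file name and the two revisions below are invented for illustration, and `text_diff` is a hypothetical helper; a real SCM tool would additionally record who committed the change and when.

```python
import difflib

# Two revisions of a hypothetical plain-text source file (invented content).
old = ["The record contains compositions.\n", "Each composition has one author.\n"]
new = ["The record contains compositions.\n", "Each composition has one composer.\n"]


def text_diff(before, after, name="spec.tex"):
    """Return a unified diff, similar to what an SCM tool shows for a change."""
    return "".join(
        difflib.unified_diff(
            before, after, fromfile=name + " (old)", tofile=name + " (new)"
        )
    )


diff = text_diff(old, new)
print(diff)
```

Running the same comparison on two versions of a PDF or a FrameMaker binary would at best report that the files differ, not which sentence changed.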
On the other hand, (La)TeX takes some experience before people can get the
layout they want. I once tried an XML format to share a translation
document in a community, but one of the members corrupted the XML with
MS Word. I think there is no silver bullet for describing text with figures
in plain text.
How about a wiki or Google Docs? They let a community share documents with
layout. Recent wiki engines can output PDF as necessary. Google Apps
is also proprietary software, but it is easy for everyone to edit.

Cheers,
Shinji

I’ve been avoiding busses for years now :wink: But I take these points on board. I will have a look at what has to be done to get DITA XML output - I am not sure if it is the native FM9 output or whether I have to convert the docs to structured Frame form, which would take some time. But we can do an experiment at least. I still don’t see a nice editing alternative though (and that includes one that properly computes x-refs, pagination, heading numbering, running headers and footers…)

Tim, can you point us to some decent documentation online done using the Latex approach?

  • thomas

Tim Cook wrote:

Two quick ones I can think of are LyX itself:
http://wiki.lyx.org/LyX/Documentation

and CLIPS:

http://clipsrules.sourceforge.net/

They aren't as 'pretty' as openEHR specs but I believe that has more to
do with the artist than the tools.

--Tim

Has anyone already mentioned DocBook?
(I am sorry if I mention it again if someone already did)
It is an OASIS standard. Here is a list of others who use it:
http://wiki.docbook.org/topic/WhoUsesDocBook

People can format the layout as they like in any file-format they like.

Eventually a standard PDF can be created for people who do not need or
like to do extra work.

I write my documentation very often in DocBook, and OxygenXMLEditor
exports it to PDF

Some interesting reading: DocBook XSL: The Complete Guide, by Bob
Stayton: http://www.sagehill.net/docbookxsl/index.html

Bert

Hi!
Some of us are very happy with the old and obscure "Word" format (.doc
ending). We know that this is the root of all evil, true. It is also
true that it recently was format-contaminated by the infamous letter "x"
at the end, leaving many of us helpless. Anyway, strangely it seems to
have done the job in standardisation groups.

Tim, if you really want to go for it, there is also the ISO document
standard: ISO/IEC 29500:2008.
http://en.wikipedia.org/wiki/Office_Open_XML
That would be standards taken to the edge, for the really brave hearted.
Or is this exactly the "x" mentioned above?

And Tim, my research brain draws me to (La)TeX, too, be assured of
that. The flesh however draws me to the counter, where I really would
like to take this further over a few beers.
Just 2 cents from ancient Vienna to the other side of the globe. Will be
quiet from now on about this :wink:

See you around,
Stefan Sauermann

Acting Program Director
Biomedical Engineering Sciences (Master)
University of Applied Sciences Technikum Wien
Hoechstaedtplatz 5, 1200 Vienna, Austria
Tel: +43 (1) 333 40 77 - 988
mobile: +43 (664) 6192555
sauermann@technikum-wien.at

http://www.technikum-wien.at
http://www.healthy-interoperability.at

Tim Cook wrote:

Hi!

... our own specifications are locked up in these two
formats. We cannot manipulate them into any kind of help files in order
to integrate them into an application and god forbid we think about
machine translation into other languages.

This discussion regards the technical hurdles of extracting
specification content in order to include it in help files etc, but
what about the legal copyright related possibilities for that?
According to the copyright notice of the openEHR specifications
(included below) it is not clear to me that you can take specification
content and include it in programs (or documents) that might get used
commercially or for non-educational purposes. Isn't this an even more
urgent issue to solve than the technical extraction?

In a previous license discussion I suggested the much more commonly
understood and more open CC-BY licence
(http://creativecommons.org/licenses/by/3.0/) to be used for the
specification documents, but I believe the discussion then slipped
over to just licensing for archetypes. Can we solve this while we are
at it?

From the documentation...

"© Copyright openEHR Foundation 2001 - 2008
All Rights Reserved
1. This document is protected by copyright and/or database right throughout the
world and is owned by the openEHR Foundation.
2. You may read and print the document for private, non-commercial use.
3. You may use this document (in whole or in part) for the purposes of making
presentations and education, so long as such purposes are non-commercial and
are designed to comment on, further the goals of, or inform third parties
about, openEHR.
4. You must not alter, modify, add to or delete anything from the document you
use (except as is permitted in paragraphs 2 and 3 above).
5. You shall, in any use of this document, include an acknowledgement in the form:
“© Copyright openEHR Foundation 2001-2008. All rights reserved. www.openEHR.org”
6. This document is being provided as a service to the academic community and on
a non-commercial basis. Accordingly, to the fullest extent permitted under
applicable law, the openEHR Foundation accepts no liability and offers no
warranties in relation to the materials and documentation and their content.
7. If you wish to commercialise, license, sell, distribute, use or otherwise copy
the materials and documents on this site other than as provided for in
paragraphs 1 to 6 above, you must comply with the terms and conditions of the
openEHR Free Commercial Use Licence, or enter into a separate written agreement
with openEHR Foundation covering such activities. The terms and conditions of
the openEHR Free Commercial Use Licence can be found at
http://www.openehr.org/free_commercial_use.htm"

...by the way there is currently nothing at
http://www.openehr.org/free_commercial_use.htm

Best regards,
Erik Sundvall
http://www.imt.liu.se/~erisu/
erik.sundvall@liu.se (previously erisu@imt.liu.se)
Tel: +46-13-227579 (soon changing to +46-13-286733)

Well, I'm still waiting to hear from the openEHR Foundation Board
(officially) on this issue since they are the only governing body we
have.

I'm not personally concerned with the notice you pointed out because my
re-use strictly adheres to items 2 and 3. However, commercial
users/developers such as Ocean Informatics may or may not be in breach
of that license. That is for the Foundation Board to decide. There
does seem to be some conflict with some of the content notices and
licenses regarding commercial use though. It basically depends on where
you look on the website.

I assume the openEHR Foundation, as a legal entity in the UK (and, the
web site claims, globally) supported by CHIME/UCL and Ocean Informatics,
has sought proper legal counsel?

--Tim

Dear all,
I’d like to express my concerns about practical outcomes of suggested changes, changes based on potential benefits. I’d appreciate your input about the use cases we are discussing just to make sure that I get this right.
First of all, translation of openEHR documentation to other languages is a very critical task, which would be quite a challenge, because we are talking about very high quality documentation, to which I keep going back quite often, mostly to find out that a point that I was missing has already been there, expressed carefully. At one point I thought about translating the docs to Turkish, my mother tongue, and realized that not having a Framemaker licence was the least of my problems. Reflecting the same quality, and more important than that, the same semantics consistently in other languages is a huge challenge. It requires understanding of the domain, the standard, and possession of more than ordinary control over two languages, one being English. Also, as a member of the openEHR community I would not like to see translations of the specs in the wild, with no official approval or inclusion from the openEHR Foundation, since this can easily lead to confusing documentation on an already confusing topic, which is challenging enough to master with really good docs.
I would like to know if there are efforts, or even intentions, of translating this documentation to other languages, and who the owners of these intentions are. How many translations of the documentation will there be for Spanish, for example? If a person would give this task a try, due to reasons expressed above, he/she would have to possess quite a lot of time and skills, and he/she would have to communicate with openEHR to make sure that the outcomes do not do harm instead of doing good. My opinion is, this would be an effort linked to an institution like a university, or a government agency, working with openEHR. I can’t see people working in their homes/offices on their own, doing this whole work, and if there are people like this, I really want to know them. The point? Well, the translation would most likely be performed by people with resources. A Framemaker 9 licence would be the least of their problems. Again, please let us know if there is a person out there, committing to translation, committing to ensure its quality, and committing to its maintenance, who is not able to move forward just because he/she can’t afford a licence for Framemaker.
I appreciate the effort for preserving the idea of openness in all aspects of openEHR, but I want to see Tom producing documentation efficiently. This is his time spent in front of a computer, and I do not want him working slower, or producing inferior quality output, which is what will obviously happen if he does not use Framemaker. I have to confess that I am failing to see the fairness of asking Tom to commit more of his time today, for potential future benefits, which have significant prerequisites that must be covered, before they can be realized.
Having used Framemaker HTML and XML outputs to produce documentation for Eclipse plugins, I’m fine with the idea of documentation being exported to these formats from Framemaker. PDF outputs are simply read-only docs, doing exactly what they are created for, providing cross-platform access to documentation. So I don’t see the point of criticizing them for not being appropriate for translation either, since they are not produced to be edited at all.
Conclusion: please let us see concrete use cases that justify making the suggested changes, built not only on idealism but also on actual cost-benefit analysis, and we can build a solution, or a roadmap, from there. I’d rather see this wonderful community move forward, trying to stay as close to its principles as it can with its available resources, than see it watch others progress while we fail to do so just because we’re getting ready for a better future all the time.

Best Regards
Seref

Hi Seref,

Thanks for your concerns and well thought out points.

If you read my original posting, I didn't ask Tom to stop using
Framemaker. I asked for some output in place of (or in addition to) the
PDF and Framemaker formats. I'll happily accept .doc files at this
point.

It also seems that we have different perspectives on the sense of trust
in the community. But that is an entirely different subject. :slight_smile:

--Tim

For everyone’s information - the specification use license was created by UCL lawyers some years ago, based (I think) on a fairly academic model of fair use. I can almost guarantee that they were not thinking about creating extracted help files, and at the time neither was I. The board is working on the CC-BY / BY-SA question for archetypes, and given the thinking about specification use for help files, maybe we need to revisit the documentation license as well. I wonder what precedents there are for help documentation generated from an existing corpus of documentation / specifications?

  • thomas beale

Tim Cook wrote:

Unfortunately I can’t make any conversion mission a top priority, but let’s commit at least to an experiment which I can initiate - I will generate the ‘standard as-is’ XML output from one specification (say the data types) and make that available - Seref or someone else may be able to determine what rules it is following; in the meantime I can do a bit of research on what needs to be done to a FM document to make its XML output DITA based.

  • thomas

Tim Cook wrote:

Tom,
I’d be happy to help you out, just let me know what you need me to do. I’ll be putting all of the documentation into Eclipse plugins of Opereffa anyway. We can turn that task into an experiment to lay out some sort of method for transformation of documentation to other formats.

Cheers

Hi All,

I really appreciate the “mental” exercise to achieve better documentation; however, I must say I am really surprised to watch recent discussions like this one, because I wonder if we, as a community yet to solve many fundamental problems and overcome many challenges, have enough resources to deal with this at the moment. Frankly, I disagree with the need to translate all the specs and documentation into other languages right now - not to say that this is trivial, but I don’t think we are at that stage yet. And when we become (if ever!) a multi-million $$$ foundation, then I suggest looking at how ISO or national bodies approach the multi-lingual documentation problem.

While I believe in and, most importantly, own a couple of open source projects myself, I see many from FOSS circles falling into the pitfall of seeing software as either evil or good, or having the illusion that open source is a merit by itself. That is not true… I hope we don’t end up trying to FOSS everything “just for the sake of” the “open” in our prefix :wink:

And Şeref, I don’t think there are many people left in Turkey to bother with openEHR anyway!

Cheers,

-koray

Seref Arikan wrote:

+1

Grahame

+2

Stef