Invariants are not part of the BMM files

borut.jures · 1 December 2021 10:09

I just learned about the “Invariants” section of the specifications.

I would like to use them when generating solutions based on the BMM specifications.

I know this is probably a big undertaking but can the invariants be added to the BMMs?

Having invariants in BMMs would avoid the above problem in the openEHR solutions built/generated on computable specifications.

sebastian.iancu · 1 December 2021 14:19

There are more things missing from BMM files (like descriptions for instance) - if we have a working solution to extract the invariants, we should not forget about the others…

borut.jures · 1 December 2021 14:40

Since my approach is to generate “everything” I can only use what is in the BMMs.

Please let me know if there is anything I can do to help get things from UML(?) into BMM.

If everything else fails I can parse them from the specifications HTML.

p.s.
Are invariants using the expression language or something else?

sebastian.iancu · 1 December 2021 15:48

The specifications are generated from UML files via an extractor - Thomas knows more about this. BMM files are currently hand-made (as far as I know). The plan is to extract/generate them from UML. Once that is done, we could consider dropping UML and using BMM as the source of all specs (meaning we’ll also need to generate adoc files from BMM).
Do you think you could “extract” all the necessary info from UML into BMM? That would be helpful…

borut.jures · 1 December 2021 16:19

I can try. Can you (or @thomas.beale) please send me an example UML extractor file?

My approach would be to write everything in an adoc file and extract the BMM part from it.

A core BMM part of such a file would be extracted for the development tools.
The cleaned adoc version of such an adoc file would get generated.
Or the specifications site would use a JSON file prepared from the “extended adoc” as a source for what it displays (it can also be a pregenerated static site). An example is my UML site where each page has a .json file from which the page is built when requested.

sebastian.iancu · 1 December 2021 16:37

I guess Thomas will point you to that extractor (I don’t know that in detail).

That would make the adoc the source-of-truth - but in our view that actually should be the BMM (files). Those files are now used to generate JSON schema, code, archie, your UML, etc … and eventually also the classes as adoc files. The conversion from adoc towards static html is done and is solid - no real need to change that now.
So …if you could generate BMMs from UMLs that would be the most helpful contribution at this time.

thomas.beale · 1 December 2021 17:12

This will not happen overnight unfortunately. The current version of BMM formalism you are looking at is BMM2. To represent ‘everything’, including descriptions, invariants etc, needs BMM3, which is described in the working copy of the BMM spec.

However… the current UML extractor (java source here) does already extract out all descriptions, preconditions, postconditions and invariants - that’s how we make those blue tables in the specs. But the extract format is .adoc text files (like you see here).

So the situation is currently:

BMM2 doesn’t know about methods, pre-conditions, post-conditions or class invariants;
UML’s representation of pre-conditions, post-conditions and class invariants as we have them in our models is not formal - they are just expression strings, and they are not checked by the UML tool. (Theoretically, they could be re-written in OCL, but no-one uses OCL and most people can’t read it, and I don’t know how good tool support for it is anyway);
The BMM3 meta-model isn’t quite finished, and I have not finalised the syntax, but you can see a description in the new draft Expression Language, and you can also see (working) Antlr4 files.

I am working in the background on the Antlr4 files which is a good way to find and fix various deficiencies in the BMM3 meta-model and particularly the new expression language - in the latter, we need to be careful of syntax, even simple things like what ‘’ can represent, and allowing inline JSON and other tricks.

What could we do in the short term?

we can obtain the documentation from those extracted .adoc class files, and BMM2 files (which conform to the P_BMM meta-model) already have documentation fields to put it in;
we could do the same with the invariants; there is currently no field for them, but adding a field invariants: List<String> would be an easy enough thing to do.

Now, the current UML extractor doesn’t write BMM files, and I have not had time to code that. It would require a fair bit of work - probably about a month. So our concrete problem is how to get those documentation and invariant items into the current BMM files, which are manually maintained. We don’t want to do this manually (really - it would take about 3 weeks just to do that, and is instantly out of date whenever we touch the UML, because that is still the primary representation of everything).

Could we do an interim hack? Well, in @borut.jures's environment, you could write some script / code to extract the relevant fields (documentation and invariants) from the .adoc class files, which are all visible in the specifications repos. This will be throwaway code for sure, and it would probably have to filter a bit of .adoc crap, but that would not be hard. The output of this script should be something like a JSON file (YAML, whatever you like - I personally prefer JSON5) with a structure like:

classes: [
  <class_name_1> : {
    documentation: "<class doc string here, in .adoc markdown>"
    properties: [
        <property_name_1>: {
          documentation: "<property doc string here, in .adoc markdown>"
        }
        <property_name_2>: {
          documentation: "<property doc string here, in .adoc markdown>"
        }
        etc: {}
    ]
    invariants: [
        "<invariant_1>",
        "<invariant_2>",
        "<invariant_n>"
    ]
  }
  <class_name_2> : {
    documentation: "<class doc string here, in .adoc markdown>"
    properties: []
  }
]

or close to that. A second script could be run over this to inject these fields into the current BMM files. This could all be one script of course, but in my experience, if we do this, we might end up living with this arrangement for some time, and both pieces of logic need to be changeable independently.
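To make the second (injection) step concrete, here is a minimal sketch in Python. It is not code from this thread, and every name in it is an assumption: it supposes the BMM has already been converted to a JSON-like dict with a `class_definitions` map, and that the extract uses maps keyed by class and property name (rather than the bracketed lists sketched above). The `inject_extract` function and the `invariants` key are hypothetical.

```python
import copy


def inject_extract(bmm: dict, extract: dict) -> dict:
    """Return a copy of `bmm` with documentation and invariants merged in.

    Assumed (hypothetical) shapes:
      bmm     = {"class_definitions": {CLASS: {"properties": {PROP: {...}}}}}
      extract = {"classes": {CLASS: {"documentation": str,
                                     "properties": {PROP: {"documentation": str}},
                                     "invariants": [str, ...]}}}
    """
    result = copy.deepcopy(bmm)  # never mutate the caller's model
    class_defs = result.get("class_definitions", {})

    for class_name, info in extract.get("classes", {}).items():
        class_def = class_defs.get(class_name)
        if class_def is None:
            continue  # class exists in the extract but not in the BMM: skip it

        if info.get("documentation"):
            class_def["documentation"] = info["documentation"]
        if info.get("invariants"):
            # 'invariants' is the proposed new field, a plain list of strings
            class_def["invariants"] = list(info["invariants"])

        props = class_def.get("properties", {})
        for prop_name, prop_info in info.get("properties", {}).items():
            # only annotate properties the BMM already declares
            if prop_name in props and prop_info.get("documentation"):
                props[prop_name]["documentation"] = prop_info["documentation"]

    return result


if __name__ == "__main__":
    bmm = {"class_definitions": {"EXAMPLE_CLASS": {"properties": {"value": {}}}}}
    extract = {
        "classes": {
            "EXAMPLE_CLASS": {
                "documentation": "<class doc string here>",
                "properties": {"value": {"documentation": "<property doc string here>"}},
                "invariants": ["<invariant_1>", "<invariant_2>"],
            }
        }
    }
    print(inject_extract(bmm, extract))
```

Keeping this step as a pure function over two dicts means the extraction logic and the injection logic can change independently, which is the point of splitting them into two scripts.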

If these scripts could be made reliable, we could use them to update the current official BMMs, not just Borut’s private working copies.

I would suggest nobody rush into anything here - even this hack will be quite some work. And there are things to know about how to handle the specifications repos, described here.

If people really want to try this, let’s discuss it a bit more to make sure we are all on the same page, and we got the design right…!

NOTE: This still won’t represent any methods at all (and there are quite a few in the openEHR specs). But it would be a reasonable start.

sebastian.iancu · 1 December 2021 20:10

Hmm, … it looks like I was too enthusiastic, and also not aware of the BMM2 vs BMM3 difference - so I’m sorry @borut.jures that I put you on the wrong track. @thomas.beale knows the situation/status better.

What I can add is that 3 years ago I had a PHP reader for our UML files that transformed them into JSON schema, and it could also render invariants, methods, inheritance, etc - it worked for most of the classes, except a few. I was not very happy with that code and I’m still planning to revive it one day.

borut.jures · 1 December 2021 22:02

I had to try

The first part - JSON.

Discourse doesn’t let me upload a ZIP file so here is a link: neoehr.com/files/adoc_extractor_json.zip

borut.jures · 2 December 2021 11:54

I’ve used the generated JSON files to augment the models after they are loaded from BMM. First implementation is for the UML models. I didn’t “clean” method declarations (they contain links to the specifications).

I’ll try to use invariants in their current form and generate validations for the models.

It is probably best to wait with adding methods and invariants until BMM3 is ready.

@thomas.beale I found only one error in the adoc files:

“proportion_kind.adoc” for RM Release 1.1.0 specifies PROPORTION_KIND as class instead of enumeration.

thomas.beale · 2 December 2021 17:18

Borut - I had a look at the JSON - looks pretty close - nice work as usual! Might be good to have a switch in your generator that allows methods to be ignored / not ignored. The output we could use is without methods, which will make the files smaller and a bit easier to deal with.
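A switch like that could be as small as the following Python sketch. It is only an illustration under assumptions: the generator in question is not shown in this thread, so the `filter_extract` function, the `include_methods` flag and the `methods` key are all hypothetical, and the extract is assumed to be a dict with a `classes` map keyed by class name.

```python
def filter_extract(extract: dict, include_methods: bool = False) -> dict:
    """Return a copy of the extract, optionally without per-class methods.

    Assumes (hypothetically) that each class entry may carry a "methods" key
    alongside "documentation", "properties" and "invariants".
    """
    if include_methods:
        # shallow copies are enough here: nothing is modified, only omitted
        return {"classes": {name: dict(info)
                            for name, info in extract.get("classes", {}).items()}}

    return {
        "classes": {
            name: {key: value for key, value in info.items() if key != "methods"}
            for name, info in extract.get("classes", {}).items()
        }
    }
```

With the default `include_methods=False`, the output contains only documentation, properties and invariants, which is the smaller form that would be injected into the BMM files.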

I won’t have time this week to try injecting that extract into the official BMM files, but that would be the intention. We’d need to make this little tool chain repeatedly runnable at any time. Ideally it will run natively on Debian - then I can run it on the server and my dev environment.

Maybe the fastest way to do the injection is with a Java program using Archie to do BMM → JSON, then inject the documentation + invariants, then write BMM back out from JSON. @pieterbos is there enough in Archie to try this?