openEHR grammar generalities

thomas.beale · 20 September 2021 21:18

Following @pablo's recent post on Terminology Ids, I was having a look at some really basic things, including the definition of Integer, which appears in version ids as well as in numbers and as parts of Reals/floats, etc.

We currently define it as follows:

INTEGER: DIGIT+ ;

But that allows zero-filled integers such as 007, which are at least weird as numbers; more often than not, such a string is really some sort of zero-padded identifier rather than an integer…

I had a look at Kotlin (just picking on a ‘modern’ language), and it has:

IntegerLiteral
    : DecDigitNoZero DecDigitOrSeparator* DecDigit
    | DecDigit // including '0'
    ;

i.e. no zero-filling. Java is similar:

fragment
DecimalNumeral
	:	'0'
	|	NonZeroDigit (Digits? | Underscores Digits)
	;

Now when we create version ids, we don’t want zero-filling, at least I don’t think we do. I wonder if we ought to define INTEGER like this:

INTEGER: '0' | [1-9][0-9]* ;
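As a quick sanity check of the difference between the two definitions, here is a minimal Python sketch. The regexes are my own hand translations of the two lexer rules, and the names `DIGITS_RULE`, `NO_ZERO_FILL_RULE` and `accepts` are made up for illustration; none of this comes from the openEHR grammars.

```python
import re

# Hand-translated regex equivalents of the two lexer rules (assumed, for illustration only).
DIGITS_RULE = re.compile(r"[0-9]+")               # INTEGER: DIGIT+ ;
NO_ZERO_FILL_RULE = re.compile(r"0|[1-9][0-9]*")  # INTEGER: '0' | [1-9][0-9]* ;


def accepts(rule, text):
    """Return True only if the whole string matches the rule."""
    return rule.fullmatch(text) is not None


if __name__ == "__main__":
    # Compare both rules on plain and zero-filled inputs.
    for sample in ["0", "7", "42", "007", "00"]:
        print(sample, accepts(DIGITS_RULE, sample), accepts(NO_ZERO_FILL_RULE, sample))
```

Note that this models whole-string acceptance only. A real ANTLR lexer using the second rule would not reject `007` outright; it would most likely tokenise it as three separate INTEGER tokens, so the rejection would have to come from the parser rule or the surrounding token.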

Then Version id is something like this:

VERSION_ID     : INTEGER '.' INTEGER '.' INTEGER VERSION_MOD? ;
fragment VERSION_MOD: ( '-rc' | '-alpha' ) ( '.' DIGIT+ )? ;

This allows zero-filling in the numeric part of the version modifier, on the assumption that this part isn’t necessarily understood as a proper number.
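To see what the VERSION_ID rule above would and would not accept, here is a rough Python sketch using a regex assembled from the same pieces. The regex is my own translation of the rules, and `is_version_id` is a hypothetical helper name, not anything in the openEHR code base.

```python
import re

# Regex fragments mirroring the lexer rules (assumed translations, for illustration only).
INTEGER = r"(?:0|[1-9][0-9]*)"                # non-zero-filled integer
VERSION_MOD = r"(?:-rc|-alpha)(?:\.[0-9]+)?"  # modifier; its numeric part may be zero-filled
VERSION_ID = re.compile(rf"{INTEGER}\.{INTEGER}\.{INTEGER}(?:{VERSION_MOD})?")


def is_version_id(text):
    """Return True only if the whole string is a version id under the rules above."""
    return VERSION_ID.fullmatch(text) is not None


if __name__ == "__main__":
    # Print the verdict for a few plain, modified and zero-filled candidates.
    for sample in ["1.0.0", "1.2.10-rc.1", "1.0.0-rc.01", "1.02.0", "01.0.0"]:
        print(sample, is_version_id(sample))
```

Under these assumptions `1.0.0-rc.01` is accepted, because the modifier's numeric part is still `DIGIT+`, while `1.02.0` and `01.0.0` are rejected.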

Related: should we allow zero-filled Real numbers? Kotlin says yes, so does Java.

Reals probably don’t matter that much, since they serve only one purpose (representing numeric values), but Integers are both numbers and also pieces of many kinds of Ids, including version ids, OIDs, and so on.

I am tempted to follow Kotlin and Java, and define INTEGER as a non-zero-filled numeric. Would this break anything?

We can also solve the problem another way, as I have done now, and just define a separate matcher fragment for non-zero-filled Integers:

fragment NUMBER : '0' | [1-9][0-9]* ;

Thoughts?