Did anybody implement AQL with a LL parser framework?

Greetings,
The AQL grammar from the wiki has direct and indirect left recursion. Which means without changes in the grammar, LL parser generators (both JavaCC and Anltr) can’t generate parsers for this grammar.

I’m curious if anybody has refactored this grammar for LL parser generators. Shinji? Your latest release includes an AQL parser does not it? Could you please share your method? I can always look at the code, but you’d probably save me time :slight_smile:

I’m interested in experiences of others too.

Kind regards
Seref

So has AQL been selected as the official openEHR query language?

there is no guarantee that the current grammar is the best one, and in
my view, a better grammar could be developed. We just need to make sure
it deals with the typical queries we write, and ones that are already
written (or at least that the latter could safely be translated).

Other thing to keep in mind: many non-LL grammars are probably
computable as LL grammars with sufficient tokens of lookahead.

- thomas

Antlr is LL(*), but LL can’t handle direct/indirect recursion, which is a matter of grammar rules. Direct recursion is easier to handle, but indirect recursion makes my head hurt :slight_smile:

not officially. It has been implemented in at least 2 production systems
(it may be 3), so we know it 'works'. But at least from my point of
view, and I am sure the primary developers (Chunlan Ma, Heath Frankel)
would agree - it can only benefit from wider inspection and testing by
the community. I would suggest that the current AQL spec can be
considered to be in a 'trial' state now.

There is also a very good piece of work called a-path which Zilics did
in Brazil a few years ago, and I think that should be integrated into
ADL 1.5 (rules section) and probably also AQL.

There are others thinking about an expanded AQL, with the equivalent of
schema definition operations of SQL as well.

Once the current specification governance has been clarified, I suggest
that the current AQL spec be declared 'trial' just so we have something
solid to work with, and then a Query project group be set up to
consolidate work on a next generation AQL.

- thomas

Hi Seref,

My ADL parser does not include AQL parsing.
I used Treetop, which is an Ruby implementation of PEG/Packrat parsing
algorithm,
not LL/LR. PEG/Packrat parser algorithm was described in this paper.
http://bford.info/pub/lang/packrat-icfp02/

Antlr is an implementation of PEG parser by LL techniques. I do not
know Antlr so much.

Packrat parser does not need to separate scanner/parser/lexer and is capable to
infinite look ahead recursive.

I do not know why are you parsing AQL, but this proceeding about querying EHR
by archetype might be helpful for your research.
http://web-ext.u-aizu.ac.jp/labs/sw-db/7108/71080109.pdf

Best regards,
Shinji

Thanks Shinji,
In general, Antlr has some convenient features, infinite lookahead being one of them. I’ve quickly checked, and Treetop does not seem to support left recursion either. So you must have modified the grammar to make it work.
I’m referring to grammar rules such as

Tom made the point earlier. At one point it would be good to unify various AQL implementation experiences. I’ll check out the papers.

Best regards
Seref

Hi Seref,

Left recursive rule is theoretically rewrite to equivalent right recursive.
In ADL grammar, cADL has left recursive rule that cannot be parsed by LL,
so I converted them to right recursive rules.
For example of left recursive to explain(shorten for convenience)

arithmetic_expression:
   arithmetic_leaf

arithmetic_node

arithmetic_node:
  arithmetic_expression '+' arithmetic_leaf

...

arithmetic_leaf:
(an arithmetic element)

I converted this grammar to this right recursive rule.

arithmetic_expression:
   arithmetic_leaf

arithmetic_node

arithmetic_node:
  arithmetic_leaf '+' arithmetic_expression

...

These conversions were the most tough points.
I wish I could help you.

Regards,
Shinji

Hi!

We implemented an AQL parser using JavaCC. My colleague Mikael Nyström made some transformations to make the published AQL grammar work in JavaCC. Mikael is on vacation right now, but I’m sure he does not mind sharing his experiences once he gets back.

I do think it would be interesting to switch to ANTLR sooner or later in order to share efforts between projects with different implementation/target-languages and because the ANTLRWorks environment http://www.antlr.org/works/index.html looks promising compared to the pretty bad JavaCC-plugin in e.g. Eclipse.

Our parser (and thus also the modified grammar) will soon be open sourced so you are free to use it. So if you are not in an extreme hurry I’d suggest using or getting inspiration from what we have already done.

Best regards,
Erik Sundvall
erik.sundvall@liu.se http://www.imt.liu.se/~erisu/ Tel: +46-13-286733

P.S.
I leaned much from Andrew W. Appel, the chapter 3 of "Modern Compiler
Implementation in ML"

Thanks Erik,
AntlrWorks is nice, but it has a problem of slowing down for some reason, even if the grammar is not that big. Appears to be a known issue, but the latest version still has this behaviour. Still beats anything else out there though.

For me, ANTLR’s main advantage is its infrastructure to support multiple target languages. I’ve used JavaCC a long time ago, and I don’t think it is inferior to Antlr, though Antlr has a bit of a learning curve.

I’m working on the refactoring of the existing grammar via elimination of left recursions, but my point is pretty much the same with yours; moving grammars to Antlr would help different groups develop parsers/tools easier. I’ve asked the original question to see if people did the actual work of eliminating recursions and basically making grammar LL compatible, and based on the responses I can see that they have.

This is good news, since it means we may be able to build a community around Antlr based implementations. Good to hear that you’ll be putting something out soon, I’ll try to do the same :wink:

Regards
Seref

Thanks Shinji,
The conversions are indeed though :slight_smile: The problem is with the indirect/mutual recursions. Even if you deal with recursions, Anltr manages to find ambiguities, but they can be dealt with backtracking or syntactic/semantic predicates of Antlr. Maybe I should fire up a Ruby environment and see your solutions :slight_smile:

Regards
Seref

Ps: For those who have lost precious brain cells to parsers at one point in their lives, I’m still resisting the idea of going back to “the dragon book”…

Hi,

we have implemented AQL parser with Antlr.
We started off with a grammar file we found on openehr site and then
worked through it to convert it to Antlr specifics.

Best regards,
Bostjan Lah
Marand d.o.o.

Hi Seref and Erik,
The grammar published on openEHR by Ocean Informatics was what we used with a proprietary third party tool. If people have converted the grammar to work with more standard parsing tools and want to post it to the AQL wiki page for others to use then we too can test it with our tool and if successful can deprecate the original.

Heath

Hi,

we’ve added our ANTLR on this page: http://www.openehr.org/wiki/display/spec/AQL-+Archetype+Query+Language

Best regards,
Bostjan

Hi,

I used the standard approach for rewriting left recursion to non left recursion described in computer grammar books (and in Wikipedia http://en.wikipedia.org/wiki/Left_recursion). It was a quite straight forward translation, but it was necessary to be careful with all the details. :slight_smile: We will share the grammar when we have cleaned up the file a little more.

Regards,

Mikael

Thanks Mikael,
I’ve also found left recursion to be less of a problem as I focused on the grammar. The details, and especially the division of tasks between lexer and parser in Antlr is what is tricky, as you’ve also pointed out.

I decided to keep working on my version without peeking at any other work, to see what comes out of that (probably lots of things to laugh at). There is now already an Antlr grammar out there, thanks to Marand, but I find the idea of having multiple works out there very exciting. 2012 has certainly started well for the openEHR community!

Best regards
Seref

2012/1/13 Mikael Nyström <mikael.nystrom@liu.se>

Hi,

I found both left recursion as well as other aspects of ANTLR quite tricky esp as I had very little experience with ANTRL and grammars in general.
There is also no “quick and dirty” way to do it - books on the subject are quite hefty :-(
Anyway, I am really looking forward to another ANTLR grammar - perhaps we can merge both at the end and come out with one that works for really all cases.

Kind regards,
Bostjan

My observations so far:

Don’t assume that you can handle everything using Anltr’s build in mechanisms: semantic & syntactic predicates are powerful, but they are emphasized too much, so it makes you try to solve all problems with them. I guess the reason people can use JavaCC easier is that there is not this much specific features, so less confusion.

Also, if you try to solve problems with specific token declaration in the lexer, parser rules become complicated very quickly, because all types of tokens are sent to parser rules, and parser rules become complicated as a result.

My feeling at the moment is, if you’re using Antlr for Aql, keep the lexer rules small, and handle most of the work in parser rules. Once I make some more progress, I’d really like to discuss in depth with authors of JavaCC based parsers.

Hi Seref,

This is the demo site for AQBE dynamic query generation.
http://wako3.u-aizu.ac.jp:8080/aqbe/
It is wonderful.

Regards,
Shinji