Writing a Parser/Interpreter Using Go and ANTLR4
I recently wrote an interpreter for the openCypher graph query
language using the ANTLR4 parser generator. This took some trial and
error. At first I tried to write this interpreter using the listener
feature provided by ANTLR4. In this approach, you register a listener
that contains an Enter and an Exit method for each grammar
derivation. This turned out to be difficult.
The discovery phase of this project took longer than I hoped because I had to read the generated source and figure out how to use it, first to build an abstract syntax tree (AST) from the parser output, and then to build an interpreter on top of that AST. Because this is a “discovered” approach rather than a “learned” one, what I did may not be exactly what the tool developers intended, but everything works in the end. So this is how it is done:
The first step in the process is writing or acquiring a grammar file; in my case, Cypher.g4. This file defines the grammar and the lexical elements of the openCypher language.
Next, generate the parser from it. You must have antlr4 installed
for this. In one of the Go source files, add a go:generate directive:
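A minimal sketch of such a directive, assuming the antlr4 command is on your PATH and Cypher.g4 sits in the same directory as the Go file. The package name cypher is a placeholder, and the exact set of flags is an assumption rather than the one used in this project:

```go
// Package cypher is a placeholder package name for this sketch.
package cypher

// -Dlanguage=Go selects the Go target, and -o sets the output directory.
// Assumes the antlr4 command is on PATH and Cypher.g4 is in this directory.
//go:generate antlr4 -Dlanguage=Go -o parser Cypher.g4
```

Running go generate ./... from the module root then invokes the command named in the directive.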
This will generate the parser source code under the parser/
directory.
There are two data types that are important at this step:
The CypherListener interface contains two methods for each grammar derivation: EnterX(*XContext) and ExitX(*XContext), where X is the name of the grammar derivation. For example, the top-level derivation for the openCypher grammar is called oC_Cypher, which appears as OC_Cypher in the generated Go names:
oC_Cypher
: SP? oC_Statement ( SP? ';' )? SP? EOF ;
The method EnterOC_Cypher is called when the parser starts
parsing input that matches the oC_Cypher rule, and the method
ExitOC_Cypher is called when the parser completes parsing input
that matches the oC_Cypher rule.