Writing a Parser/Interpreter Using Go and ANTLR4

I recently wrote an interpreter for the openCypher graph query language using the ANTLR4 parser generator. This took some trial and error. At first I tried to write the interpreter using the listener feature provided by ANTLR4. In this approach, you register a listener that contains an Enter and an Exit method for each grammar derivation. This turned out to be difficult.

The discovery phase of this project took longer than I had hoped, because I had to read the generated source to figure out how to first build an abstract syntax tree (AST) as the output of the parser, and then build an interpreter on top of that AST. Because this is a “discovered” approach rather than a “learned” one, it is possible that what I did is not exactly what the tool developers intended, but everything works in the end. So this is how it is done:

The first step in the process is writing or acquiring a grammar file; in my case, Cypher.g4. This file defines the grammar and the lexical elements of the openCypher language.

Next, generate the parser from it. You must have antlr4 installed for this. In one of the Go source files, add a go:generate directive:

package opencypher

//go:generate antlr4 -Dlanguage=Go Cypher.g4 -o parser

This will generate the parser source code under the parser/ directory.

There are two data types that are important at this step:

  • The CypherListener interface contains two methods for each grammar derivation: EnterX(*XContext) and ExitX(*XContext), where X is the name of the grammar derivation with its first letter capitalized. For example, the top-level derivation of the openCypher grammar is called oC_Cypher:
oC_Cypher
      :  SP? oC_Statement ( SP? ';' )? SP? EOF ;

The method EnterOC_Cypher is called when the parser starts parsing input that matches the oC_Cypher rule, and the method ExitOC_Cypher is called when the parser finishes parsing that input.