h2. Architecture description

Every language, whether a spoken language or a computer language, has a grammar. The grammar defines the rules, or syntax, of that language. Many languages, including ABAP, don't have an officially published grammar, but the grammar is still available in other forms or can be constructed from the language's syntax help.

So what is this grammar, and is there a representation for it? Yes. EBNF (Extended Backus-Naur Form) is a notation for describing grammars. Like lex and yacc, ANTLR is one such tool: a parser generator that understands EBNF grammar files.

The idea behind this ABAP parser is to have a grammar defined for each ABAP keyword or statement. All the grammar files are compiled to generate Java files, which are then packaged into a JAR using an Ant task. The generated Java files call the semantic code for either interpreting or translating. This ABAP parser is coded entirely in ANTLR v3.0.1.

Please note that the EBNF grammar for ABAP provided here must not be used to validate ABAP syntax, because its purpose is only to parse. The reason is that ABAP supports multiple additions for many statements, and a further complexity is that those additions can occur in any order. If the goal is only to parse and to make sure the parser traverses all the tokens, allowing zero or more occurrences of each addition is an easy solution. As a validator, however, that approach fails: it does not reject a statement in which the same addition occurs twice. There are options such as memoize=true in ANTLR, but some ABAP commands do allow multiple additions.

h2. A sample rule

To explain this a bit further, let's take only the SELECT statement in ABAP.
The diagram above shows the whole flow of the parsing logic for the SELECT statement. Below is the EBNF grammar for SELECT written in ANTLR.

select : 'SELECT' ('SINGLE' | 'FOR' 'UPDATE' | 'DISTINCT')* (abapVariable+ | '*' | fieldVariable)
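The option group @('SINGLE' | 'FOR' 'UPDATE' | 'DISTINCT')*@ in the rule above is an instance of the "zero or more occurrences" approach: it accepts the options in any order and any number of times. The following is a minimal hand-written Java sketch of that behaviour, not the code ANTLR generates. The class and method names are hypothetical, and the input is assumed to be already split into upper-case tokens. It shows that the loop happily consumes a repeated option, and that spotting the repetition needs a separate check outside the grammar.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not part of the ANTLR-generated parser. */
public class SelectOptionsSketch {

    /**
     * Mirrors the group ('SINGLE' | 'FOR' 'UPDATE' | 'DISTINCT')* that follows 'SELECT'.
     * Options are consumed in any order and any number of times.
     * Returns the options in the order they were consumed.
     */
    public static List<String> consumeOptions(List<String> tokens) {
        if (tokens.isEmpty() || !tokens.get(0).equals("SELECT")) {
            throw new IllegalArgumentException("expected SELECT");
        }
        List<String> consumed = new ArrayList<>();
        int i = 1;
        while (i < tokens.size()) {
            String t = tokens.get(i);
            if (t.equals("SINGLE") || t.equals("DISTINCT")) {
                consumed.add(t);
                i++;
            } else if (t.equals("FOR") && i + 1 < tokens.size()
                    && tokens.get(i + 1).equals("UPDATE")) {
                consumed.add("FOR UPDATE");
                i += 2;
            } else {
                break; // first token that is not an option ends the loop
            }
        }
        return consumed;
    }

    /** The extra check a validator would need: options that occur more than once. */
    public static Set<String> repeated(List<String> options) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String option : options) {
            if (!seen.add(option)) {
                duplicates.add(option);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // SINGLE appears twice, yet the permissive loop consumes it without complaint.
        List<String> tokens = List.of("SELECT", "SINGLE", "FOR", "UPDATE", "SINGLE", "*");
        List<String> options = consumeOptions(tokens);
        System.out.println("consumed: " + options);           // consumed: [SINGLE, FOR UPDATE, SINGLE]
        System.out.println("repeated: " + repeated(options)); // repeated: [SINGLE]
    }
}
```

Keeping the duplicate check apart from the consuming loop fits the stated goal of the grammar: the parser only has to get through all the tokens, and any stricter validation can be layered on afterwards where it is actually wanted.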