https://github.com/ftomassetti/javacc2antlr
- Host: GitHub
- URL: https://github.com/ftomassetti/javacc2antlr
- Owner: ftomassetti
- License: apache-2.0
- Created: 2017-12-17T13:33:09.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2022-10-18T13:09:06.000Z (about 2 years ago)
- Last Synced: 2024-04-16T18:15:21.345Z (9 months ago)
- Topics: antlr, antlr4, javacc
- Language: Java
- Size: 1.48 MB
- Stars: 16
- Watchers: 8
- Forks: 10
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
# JavaCC2ANTLR
[![Build Status](https://travis-ci.org/ftomassetti/JavaCC2ANTLR.svg?branch=master)](https://travis-ci.org/ftomassetti/JavaCC2ANTLR)
JavaCC is an old and venerable tool, used in many projects. In recent years, however, ANTLR seems to have gained a growing community,
and there are various tools that support it. ANTLR can also generate parsers for many target languages that
JavaCC does not support. So I hacked together this little project, in Kotlin.

For now it takes a JavaCC grammar and produces an ANTLR lexer grammar and an ANTLR parser grammar, which should hopefully be equivalent.
## Generate ANTLR Lexer & Parser
Simply look at the class `JavaCCToAntlrConverter`. It takes the file name of the JavaCC grammar and outputs
a lexer grammar and a parser grammar.

## Generate an ANTLR Parser in Memory
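```kotlin
import java.io.File

// Load the JavaCC grammar and derive the ANTLR grammar name from the file name.
val file = File("src/test/resources/java.jj")
val grammarName = file.nameWithoutExtension.capitalize()
val javaCCGrammar = loadJavaCCGrammar(file)
// Convert it to an ANTLR grammar, build a parser in memory, and parse a snippet.
val antlrGrammar = javaCCGrammar.convertToAntlr(grammarName)
val genericParser = antlrGrammar.genericParser()
val ast = genericParser.parse("class A { }")
```

## Push/Pop Mode Commands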
By default, JavaCC has no way for a token to change the token manager's lexical state while remembering the previous one, as ANTLR does
with the `pushMode` and `popMode` commands. For example, to parse a balanced set of parentheses such as
`((()) ())` as a single token, you might have the following JavaCC grammar:
```
TOKEN_MGR_DECLS : {
    static List<Integer> lexicalStateStack = new ArrayList<Integer>();

    static void openParen() {
        lexicalStateStack.add(curLexState);
    }

    static void closeParen() {
        SwitchTo(lexicalStateStack.remove(lexicalStateStack.size() - 1));
    }
}

<DEFAULT, LEVEL1, LEVELN> SKIP : {
    < " " >
}

<LEVELN> MORE : {
    <LPAREN: "("> { openParen(); }
|   <RPAREN: ")"> { closeParen(); }
}

<DEFAULT> MORE : {
    < "(" > { openParen(); } : LEVEL1
}

<LEVEL1> MORE : {
    < "(" > { openParen(); } : LEVELN
}

<LEVEL1> TOKEN : {
    <BALANCED_PARENS: ")"> { closeParen(); } : DEFAULT
}

void Start(): {} { <BALANCED_PARENS> <EOF> }
```

However, the generated ANTLR lexer would not behave correctly, because we cannot infer from the `SwitchTo` statements
executed as part of the actions whether the corresponding ANTLR rules should use the `mode`, `pushMode`, or `popMode` command:

```
lexer grammar Lexer;

SKIP0 : ' ' -> skip ;
MORE0 : '(' -> more, mode(LEVEL1) ;

mode LEVEL1;
LEVEL1_SKIP0 : SKIP0 -> skip ;
MORE1 : '(' -> more, mode(LEVELN) ;
BALANCED_PARENS : ')' -> mode(DEFAULT_MODE) ;

mode LEVELN;
LEVELN_SKIP0 : SKIP0 -> skip ;
LPAREN : '(' -> more ;
RPAREN : ')' -> more ; // PROBLEM: Cannot escape this mode!

parser grammar Parser;
options { tokenVocab=Lexer; }
start : BALANCED_PARENS EOF ;
```

In order to handle such actions, you must add the following fields to your `TOKEN_MGR_DECLS`, with values set to the names
of the functions that should map to the `pushMode` and `popMode` commands respectively:

```
TOKEN_MGR_DECLS : {
...
final static String pushStateFunc = "openParen";
final static String popStateFunc = "closeParen";
}
```

Now the lexer is generated correctly:

```
SKIP0 : ' ' -> skip ;
MORE0 : '(' -> more, pushMode(LEVEL1) ;

mode LEVEL1;
LEVEL1_SKIP0 : SKIP0 -> skip ;
MORE1 : '(' -> more, pushMode(LEVELN) ;
BALANCED_PARENS : ')' -> popMode ;

mode LEVELN;
LEVELN_SKIP0 : SKIP0 -> skip ;
LPAREN : '(' -> more, pushMode(LEVELN) ;
RPAREN : ')' -> more, popMode ;
```

## Licensing
The project is made available under the Apache License, Version 2.0. Please see the file called [LICENSE](LICENSE).
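
## Appendix: Mode Stack Sketch

To make the difference between `mode` and `pushMode`/`popMode` from the Push/Pop Mode Commands section more concrete, here is a minimal sketch in plain Java. It is not code from this project and it does not use the ANTLR runtime: the class `ModeStackSketch` and the method `countBalancedParens` are hypothetical names invented for illustration. The sketch only tracks which lexer mode is active after each character and counts how many `BALANCED_PARENS` tokens would be emitted; it ignores token text entirely.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only: mimics the mode transitions of the two
// generated lexers shown in the README, not the ANTLR runtime itself.
class ModeStackSketch {
    enum Mode { DEFAULT, LEVEL1, LEVELN }

    /**
     * Counts the BALANCED_PARENS tokens emitted for the input.
     * useStack == false follows the rules that use plain mode(...);
     * useStack == true follows the rules that use pushMode/popMode.
     */
    static int countBalancedParens(String input, boolean useStack) {
        Deque<Mode> stack = new ArrayDeque<>();
        Mode mode = Mode.DEFAULT;
        int tokens = 0;
        for (char c : input.toCharArray()) {
            if (c == ' ') {
                continue; // spaces are skipped in every mode
            }
            boolean open = (c == '(');
            if (!open && c != ')') {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
            if (!open && mode == Mode.DEFAULT) {
                throw new IllegalArgumentException("no rule matches ')' in DEFAULT");
            }
            if (open) {
                // '(' moves DEFAULT -> LEVEL1, and LEVEL1 or LEVELN -> LEVELN
                if (useStack) {
                    stack.push(mode); // pushMode remembers the current mode
                }
                mode = (mode == Mode.DEFAULT) ? Mode.LEVEL1 : Mode.LEVELN;
            } else {
                if (mode == Mode.LEVEL1) {
                    tokens++; // BALANCED_PARENS is only defined in LEVEL1
                }
                if (useStack) {
                    mode = stack.pop(); // popMode returns to the remembered mode
                } else if (mode == Mode.LEVEL1) {
                    mode = Mode.DEFAULT; // mode(DEFAULT_MODE)
                }
                // Without a stack, ')' in LEVELN never changes the mode.
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(countBalancedParens("((()) ())", false)); // prints 0
        System.out.println(countBalancedParens("((()) ())", true));  // prints 1
    }
}
```

Without the stack, the second `(` moves the sketch into `LEVELN` and nothing ever leaves it, so no token is emitted. With the stack, every `)` returns to the mode that was active before the matching `(`, so the final `)` is seen in `LEVEL1` and produces the token.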