Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/c2nes/javalang
Pure Python Java parser and tools
https://github.com/c2nes/javalang
java library parser python
Last synced: 9 days ago
JSON representation
Pure Python Java parser and tools
- Host: GitHub
- URL: https://github.com/c2nes/javalang
- Owner: c2nes
- License: mit
- Created: 2012-09-29T19:21:02.000Z (about 12 years ago)
- Default Branch: master
- Last Pushed: 2023-09-17T20:29:07.000Z (over 1 year ago)
- Last Synced: 2024-12-07T11:08:09.216Z (16 days ago)
- Topics: java, library, parser, python
- Language: Python
- Homepage:
- Size: 131 KB
- Stars: 740
- Watchers: 19
- Forks: 162
- Open Issues: 90
-
Metadata Files:
- Readme: README.rst
- License: LICENSE.txt
Awesome Lists containing this project
README
========
javalang
========.. image:: https://travis-ci.org/c2nes/javalang.svg?branch=master
:target: https://travis-ci.org/c2nes/javalang.. image:: https://badge.fury.io/py/javalang.svg
:target: https://badge.fury.io/py/javalangjavalang is a pure Python library for working with Java source
code. javalang provides a lexer and parser targeting Java 8. The
implementation is based on the Java language spec available at
http://docs.oracle.com/javase/specs/jls/se8/html/.The following gives a very brief introduction to using javalang.
---------------
Getting Started
---------------.. code-block:: python
>>> import javalang
>>> tree = javalang.parse.parse("package javalang.brewtab.com; class Test {}")This will return a ``CompilationUnit`` instance. This object is the root of a
tree which may be traversed to extract different information about the
compilation unit,.. code-block:: python
>>> tree.package.name
u'javalang.brewtab.com'
>>> tree.types[0]
ClassDeclaration
>>> tree.types[0].name
u'Test'The string passed to ``javalang.parse.parse()`` must represent a complete unit
which simply means it should represent a complete, valid Java source file. Other
methods in the ``javalang.parse`` module allow for some smaller code snippets to
be parsed without providing an entire compilation unit.Working with the syntax tree
^^^^^^^^^^^^^^^^^^^^^^^^^^^^``CompilationUnit`` is a subclass of ``javalang.ast.Node``, as are its
descendants in the tree. The ``javalang.tree`` module defines the different
types of ``Node`` subclasses, each of which represent the different syntaxual
elements you will find in Java code. For more detail on what node types are
available, see the ``javalang/tree.py`` source file until the documentation is
complete.``Node`` instances support iteration,
.. code-block:: python
>>> for path, node in tree:
... print path, node
...
() CompilationUnit
(CompilationUnit,) PackageDeclaration
(CompilationUnit, [ClassDeclaration]) ClassDeclarationThis iteration can also be filtered by type,
.. code-block:: python
>>> for path, node in tree.filter(javalang.tree.ClassDeclaration):
... print path, node
...
(CompilationUnit, [ClassDeclaration]) ClassDeclaration---------------
Component Usage
---------------Internally, the ``javalang.parse.parse`` method is a simple method which creates
a token stream for the input, initializes a new ``javalang.parser.Parser``
instance with the given token stream, and then invokes the parser's ``parse()``
method, returning the resulting ``CompilationUnit``. These components may be
also be used individually.Tokenizer
^^^^^^^^^The tokenizer/lexer may be invoked directly be calling ``javalang.tokenizer.tokenize``,
.. code-block:: python
>>> javalang.tokenizer.tokenize('System.out.println("Hello " + "world");')
This returns a generator which provides a stream of ``JavaToken`` objects. Each
token carries position (line, column) and value information,.. code-block:: python
>>> tokens = list(javalang.tokenizer.tokenize('System.out.println("Hello " + "world");'))
>>> tokens[6].value
u'"Hello "'
>>> tokens[6].position
(1, 19)The tokens are not directly instances of ``JavaToken``, but are instead
instances of subclasses which identify their general type,.. code-block:: python
>>> type(tokens[6])
>>> type(tokens[7])
**NOTE:** The shift operators ``>>`` and ``>>>`` are represented by multiple
``>`` tokens. This is because multiple ``>`` may appear in a row when closing
nested generic parameter/arguments lists. This abiguity is instead resolved by
the parser.Parser
^^^^^^To parse snippets of code, a parser may be used directly,
.. code-block:: python
>>> tokens = javalang.tokenizer.tokenize('System.out.println("Hello " + "world");')
>>> parser = javalang.parser.Parser(tokens)
>>> parser.parse_expression()
MethodInvocationThe parse methods are designed for incremental parsing so they will not restart
at the beginning of the token stream. Attempting to call a parse method more
than once will result in a ``JavaSyntaxError`` exception.Invoking the incorrect parse method will also result in a ``JavaSyntaxError``
exception,.. code-block:: python
>>> tokens = javalang.tokenizer.tokenize('System.out.println("Hello " + "world");')
>>> parser = javalang.parser.Parser(tokens)
>>> parser.parse_type_declaration()
Traceback (most recent call last):
File "", line 1, in
File "javalang/parser.py", line 336, in parse_type_declaration
return self.parse_class_or_interface_declaration()
File "javalang/parser.py", line 353, in parse_class_or_interface_declaration
self.illegal("Expected type declaration")
File "javalang/parser.py", line 122, in illegal
raise JavaSyntaxError(description, at)
javalang.parser.JavaSyntaxErrorThe ``javalang.parse`` module also provides convenience methods for parsing more
common types of code snippets.