https://github.com/felipetomazec/lexical-analyzer

A basic Lexical-Analyzer written in Java.
compilers java javafx javafx-application lexical-analysis

Last synced: 9 months ago

Host: GitHub
URL: https://github.com/felipetomazec/lexical-analyzer
Owner: FelipeTomazEC
Created: 2019-06-27T23:55:59.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2023-04-15T19:50:00.000Z (over 2 years ago)
Last Synced: 2025-04-10T03:53:44.366Z (9 months ago)
Topics: compilers, java, javafx, javafx-application, lexical-analysis
Language: Java
Homepage:
Size: 204 KB
Stars: 29
Watchers: 0
Forks: 11
Open Issues: 1
Metadata Files:
- Readme: README.md

README

# Lexical Analyzer

The main job of a compiler is to translate high-level code, i.e., code written in a programming language, into a format that a computer can understand: binary code. Put simply, a compiler can be split into 3 parts:

- Lexical Analyzer (LA)

- Syntax Analyzer (SA)

- Semantic Analyzer (SMA)

The Lexical Analyzer is responsible for separating the source code into lexemes, which are the words that compose the code. After separating all lexemes, the LA classifies each one into a token class. **Keywords**, **Special Symbols**, **Identifiers** and **Operators** are examples of tokens. The Lexical Analyzer also removes white space and comments from the source code. The output of this process is a table containing the lexemes and their token classification. Lexical errors, such as invalid constructions of lexemes (*e.g. '12variableName'*, *'na;;me'*), are also caught by the LA.
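The lexeme table described above can be sketched roughly as follows. This is a minimal illustration and not the code from this repository: the class name `LexemeTableSketch`, the reduced keyword set, the `OPERATOR` class name and the regular expressions are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class LexemeTableSketch {
    // Hypothetical, reduced keyword set used only for this illustration.
    private static final Set<String> KEYWORDS = Set.of("for", "while", "if", "else");

    /** Maps a single lexeme to an illustrative token class name. */
    static String classify(String lexeme) {
        if (KEYWORDS.contains(lexeme)) return "KEYWORD";
        if (lexeme.matches("[A-Za-z_][A-Za-z0-9_]*")) return "IDENTIFIER";
        if (lexeme.matches("[0-9]+")) return "INTEGER";
        if (lexeme.matches("[+\\-*/]")) return "OPERATOR";
        return "INVALID"; // e.g. "12variableName" or "na;;me"
    }

    /** Splits the source on white space and pairs each lexeme with its class. */
    static List<String[]> analyze(String source) {
        List<String[]> table = new ArrayList<>();
        for (String lexeme : source.trim().split("\\s+")) {
            if (!lexeme.isEmpty()) {
                table.add(new String[] {lexeme, classify(lexeme)});
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // Prints one "lexeme<TAB>token" row per lexeme.
        for (String[] row : analyze("while count + 12variableName")) {
            System.out.println(row[0] + "\t" + row[1]);
        }
    }
}
```

Each row of the returned list holds the lexeme and the class assigned to it, which mirrors the two-column table that the LA produces.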

This project is an implementation of a **simple** Lexical Analyzer written in Java. It provides a GUI where the user can type code and get its tokens. It is also possible to load the code from a file and run the analysis on it.

### Recognized Tokens

The Lexical Analyzer of this project recognizes the following classes of tokens:

- **IDENTIFIER** - Variable names;

- **STRING** - Words between double quotes "";

- **INTEGER** - Numbers without a dot ( . );

- **FLOAT** - Floating-point numbers;

- **PLUS** - ( + );

- **MINUS** - ( - );

- **TIMES** - ( * );

- **DIVIDE** - ( / );

- **KEYWORD** - for, while, do, if, else, print, switch, case, default and null;

- **INVALID**;

- **ASSIGN_OP** - Assignment operator ( = );

- **SEMICOLON** - ( ; );

- **LEFT_PARENTHESIS** - '(';

- **RIGHT_PARENTHESIS** - ')';

- **LEFT_BRACE** - ( { );

- **RIGHT_BRACE** - ( } );

- **COMMA** - ( , );

- **DOT** - ( . );

- **DOTDOT** - ( .. );

- **COLON** - ( : );

- **EQUAL** - ( == );

- **LOWER_OR_EQUALS** - ( <= );

- **GREATER_OR_EQUALS** - ( >= );

- **NOT_EQUALS** - ( <> );

- **GREATER_THAN** - ( > );

- **LOWER_THAN** - ( < );

- **AT_SIGN** - ( @ ).
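Because the operator and punctuation classes above each correspond to one fixed spelling, they can be resolved with a plain lookup table. The sketch below shows one possible way to do that. The token names are taken from the list above, while the class `SymbolTableSketch` and its `lookup` method are hypothetical and are not claimed to match the repository's code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class SymbolTableSketch {
    // Fixed-spelling lexemes mapped to the token class names listed above.
    private static final Map<String, String> SYMBOLS = new HashMap<>();

    static {
        SYMBOLS.put("+", "PLUS");
        SYMBOLS.put("-", "MINUS");
        SYMBOLS.put("*", "TIMES");
        SYMBOLS.put("/", "DIVIDE");
        SYMBOLS.put("=", "ASSIGN_OP");
        SYMBOLS.put(";", "SEMICOLON");
        SYMBOLS.put("(", "LEFT_PARENTHESIS");
        SYMBOLS.put(")", "RIGHT_PARENTHESIS");
        SYMBOLS.put("{", "LEFT_BRACE");
        SYMBOLS.put("}", "RIGHT_BRACE");
        SYMBOLS.put(",", "COMMA");
        SYMBOLS.put(".", "DOT");
        SYMBOLS.put("..", "DOTDOT");
        SYMBOLS.put(":", "COLON");
        SYMBOLS.put("==", "EQUAL");
        SYMBOLS.put("<=", "LOWER_OR_EQUALS");
        SYMBOLS.put(">=", "GREATER_OR_EQUALS");
        SYMBOLS.put("<>", "NOT_EQUALS");
        SYMBOLS.put(">", "GREATER_THAN");
        SYMBOLS.put("<", "LOWER_THAN");
        SYMBOLS.put("@", "AT_SIGN");
    }

    /** Returns the token class for a fixed symbol, or empty if the lexeme is not one. */
    static Optional<String> lookup(String lexeme) {
        return Optional.ofNullable(SYMBOLS.get(lexeme));
    }
}
```

An empty result means the lexeme is not a fixed symbol and has to be checked against the remaining classes (keyword, identifier, number, string) before it is reported as **INVALID**.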

***P.S. 1**: Text that follows // or is enclosed between /* */ is treated as a comment and does not appear in the output.*

***P.S. 2**: Lexemes must be separated by at least one white space (' ') to be recognized as separate lexemes.*
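The two notes above can be illustrated with a short sketch that drops comments and then splits on white space. It is a deliberately naive approximation, not the repository's implementation: the class `LexemeSplitterSketch` and its methods are hypothetical, and the regular expressions do not account for comment markers inside **STRING** lexemes or for strings that contain spaces.

```java
import java.util.ArrayList;
import java.util.List;

final class LexemeSplitterSketch {
    /** Replaces block comments and line comments with a single space. */
    static String stripComments(String source) {
        // (?s) lets '.' match newlines so a block comment may span several lines.
        String withoutBlocks = source.replaceAll("(?s)/\\*.*?\\*/", " ");
        // A line comment runs from "//" to the end of that line.
        return withoutBlocks.replaceAll("//[^\\n]*", " ");
    }

    /** Returns the white-space-separated lexemes that remain after comment removal. */
    static List<String> split(String source) {
        List<String> lexemes = new ArrayList<>();
        for (String piece : stripComments(source).trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                lexemes.add(piece);
            }
        }
        return lexemes;
    }
}
```

With this approach `x = 1 ;` yields four lexemes, while `x=1;` stays a single lexeme, which is the behavior that P.S. 2 describes.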

### Screenshot



### Conclusion

This is a very simple example that demonstrates how a Lexical Analyzer can be implemented. This project is also a usage example of Finite-State Automata, a very powerful and useful tool.
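To give an idea of how a finite-state automaton can drive token classification, here is a small hand-written sketch that recognizes identifiers, integers and floating-point numbers. The states, the transition rules (for example, allowing `_` in identifiers and requiring digits on both sides of the dot) and the class name `NumberAutomatonSketch` are assumptions for illustration only and may differ from the automaton used in this project.

```java
final class NumberAutomatonSketch {
    private enum State { START, IDENT, INT, DOT, FRAC, ERROR }

    /** Transition function: next state for the current state and input character. */
    private static State step(State state, char c) {
        boolean digit = c >= '0' && c <= '9';
        boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
        switch (state) {
            case START: return letter ? State.IDENT : digit ? State.INT : State.ERROR;
            case IDENT: return (letter || digit) ? State.IDENT : State.ERROR;
            case INT:   return digit ? State.INT : c == '.' ? State.DOT : State.ERROR;
            case DOT:   return digit ? State.FRAC : State.ERROR;
            case FRAC:  return digit ? State.FRAC : State.ERROR;
            default:    return State.ERROR; // ERROR is a trap state
        }
    }

    /** Runs the automaton over the lexeme and maps the final state to a token class. */
    static String classify(String lexeme) {
        State state = State.START;
        for (int i = 0; i < lexeme.length(); i++) {
            state = step(state, lexeme.charAt(i));
        }
        switch (state) {
            case IDENT: return "IDENTIFIER";
            case INT:   return "INTEGER";
            case FRAC:  return "FLOAT";
            default:    return "INVALID"; // non-accepting states: START, DOT, ERROR
        }
    }
}
```

Here `IDENT`, `INT` and `FRAC` are the accepting states. A lexeme such as `12variableName` moves from `INT` into the `ERROR` trap state at the first letter and is therefore reported as **INVALID**.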
