https://github.com/oguzhan-yilmaz/lexical-analyzer-for-psi-lang

A lexical analyzer for the custom-defined PSI language, developed for a Programming Languages course.


Host: GitHub
URL: https://github.com/oguzhan-yilmaz/lexical-analyzer-for-psi-lang
Owner: oguzhan-yilmaz
License: mit
Created: 2020-08-03T14:14:53.000Z (about 5 years ago)
Default Branch: master
Last Pushed: 2020-08-03T14:22:36.000Z (about 5 years ago)
Last Synced: 2025-03-23T08:44:03.405Z (7 months ago)
Topics: c, lexical-analyzer
Language: C
Homepage:
Size: 4.88 KB
Stars: 3
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# lexical-analyzer-for-psi-lang
A lexical analyzer for the custom-defined PSI language, developed for a Programming Languages course.

## Objective
Implement the lexical analysis phase of the PSI++ compiler in C. The analyzer reads a ‘.psi’ source file and writes the tokens it produces to a ‘.lex’ file.

## Detailed Description of Token Handling

### Keywords
The PSI++ language has 18 keywords. For this program’s purposes, a keyword must be surrounded by whitespace and match one entry in the predefined set of keywords. Keywords are tokenized as ‘Keyword’.
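
The keyword check described above can be sketched as a lookup against a table. This is a minimal sketch, not the repository's code: the function name `is_keyword` is hypothetical, the comparison is assumed to be case-sensitive, and the table is passed in by the caller because the README does not list the 18 keywords.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true when `lexeme` exactly matches one entry
 * of the caller-supplied keyword table. Assumption: matching is
 * case-sensitive; the README does not say either way. */
bool is_keyword(const char *lexeme, const char *const keywords[], size_t count)
{
    assert(lexeme != NULL && keywords != NULL);

    for (size_t i = 0; i < count; i++) {
        if (strcmp(lexeme, keywords[i]) == 0)
            return true;
    }
    return false;
}
```

A linear scan is enough for a table of 18 entries; a lexeme that fails this check would then be tested against the identifier rules.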
### Identifiers
Identifiers in PSI++ are arbitrary names, so they are defined by rules rather than by a predefined set. Any lexeme delimited by whitespace or operators that is shorter than 30 characters, starts with an alphabetic character, and contains only alphanumeric characters and underscores is tokenized as ‘Identifier’.
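
The identifier rules above can be checked on an already-isolated lexeme roughly as follows. This is a sketch under assumptions: `is_valid_identifier` and `MAX_IDENTIFIER_LEN` are illustrative names, not taken from the repository, and splitting the input on whitespace or operators is assumed to happen elsewhere.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* From the rule "length less than 30 characters": 29 is the longest allowed. */
#define MAX_IDENTIFIER_LEN 30

/* Hypothetical helper: returns true when `lexeme` is non-empty, shorter than
 * MAX_IDENTIFIER_LEN, starts with a letter, and otherwise contains only
 * letters, digits, and underscores. */
bool is_valid_identifier(const char *lexeme)
{
    assert(lexeme != NULL);

    size_t len = strlen(lexeme);
    if (len == 0 || len >= MAX_IDENTIFIER_LEN)
        return false;

    /* The first character must be alphabetic, so a leading '_' is rejected. */
    if (!isalpha((unsigned char)lexeme[0]))
        return false;

    for (size_t i = 1; i < len; i++) {
        unsigned char c = (unsigned char)lexeme[i];
        if (!isalnum(c) && c != '_')
            return false;
    }
    return true;
}
```

The casts to `unsigned char` avoid undefined behavior when `char` is signed and the input contains bytes outside the ASCII range.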
### String Constants
Any sequence of characters enclosed in double quotes is tokenized as ‘StringConst’.
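
One way to find the extent of a string constant is to look for the next double quote after the opening one. The sketch below assumes that PSI++ strings have no escape sequences (the README mentions none) and that an unterminated string is reported as no match; `scan_string_const` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if `src` begins with a double quote, returns the
 * number of characters in the string constant including both quotes.
 * Returns 0 when `src` does not start a string or the closing quote is
 * missing. Assumption: no escape sequences such as \" exist. */
size_t scan_string_const(const char *src)
{
    assert(src != NULL);

    if (src[0] != '"')
        return 0;

    const char *close = strchr(src + 1, '"');
    if (close == NULL)
        return 0; /* unterminated string */

    return (size_t)(close - src) + 1;
}
```

The caller can emit a ‘StringConst’ token for the returned span and continue scanning immediately after it.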
### Integer Constants
Any sequence of digits delimited by whitespace whose length is less than 10 is tokenized as ‘IntConst’.
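
The integer rule can be sketched in the same style. The names `is_valid_int_const` and `MAX_INT_CONST_LEN` are assumptions made for illustration; the sketch treats a sign character as not part of the constant, since the README describes only digits.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* From the rule "length is less than 10": at most 9 digits are allowed. */
#define MAX_INT_CONST_LEN 10

/* Hypothetical helper: returns true when `lexeme` is a non-empty run of
 * decimal digits shorter than MAX_INT_CONST_LEN. */
bool is_valid_int_const(const char *lexeme)
{
    assert(lexeme != NULL);

    size_t len = strlen(lexeme);
    if (len == 0 || len >= MAX_INT_CONST_LEN)
        return false;

    for (size_t i = 0; i < len; i++) {
        if (!isdigit((unsigned char)lexeme[i]))
            return false;
    }
    return true;
}
```

A useful side effect of the 9-digit limit is that every accepted constant fits in a 32-bit signed integer.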
### Operators
When one of the predefined operators is found in the source code of a PSI++ program, it is tokenized as ‘Operator’.
### End of Line
Semicolons are tokenized as ‘EndOfLine’.
### Brackets
Parentheses, curly brackets, and square brackets are each tokenized under their respective names.
### Comments
The analyzer ignores comments, which are any text between ‘/*’ and ‘*/’. Comments do not appear in the ‘.lex’ file.
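
Comment skipping can be sketched as a function that reports how many characters to discard. This is an illustration rather than the repository's implementation: `skip_comment` is a hypothetical name, comments are assumed not to nest, and an unterminated comment is assumed to swallow the rest of the input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if `src` begins with the comment opener, returns the
 * number of characters up to and including the matching closer. Returns 0
 * when `src` does not start a comment. Assumption: an unterminated comment
 * consumes everything that remains. */
size_t skip_comment(const char *src)
{
    assert(src != NULL);

    if (strncmp(src, "/*", 2) != 0)
        return 0;

    /* Search after the opener so that "/ * /" written without spaces is
     * not mistaken for a complete comment. */
    const char *end = strstr(src + 2, "*/");
    if (end == NULL)
        return strlen(src);

    return (size_t)(end - src) + 2;
}
```

The main scanning loop would advance its position by the returned count without writing anything to the ‘.lex’ file.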
