https://github.com/finnff/bbycblcompiler


Host: GitHub
URL: https://github.com/finnff/bbycblcompiler
Owner: finnff
Created: 2025-06-22T21:08:51.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2025-06-22T21:54:17.000Z (about 1 year ago)
Last Synced: 2025-06-22T22:31:27.231Z (about 1 year ago)
Language: Python
Size: 15.3 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

## Mainframe Minds bbyCBL Antlr4 Parser 👶

### Prerequisites

1. Extract test files:

```bash
cd tests
tar -xvf recombined_formatted.tar.gz
```

2. Install requirements (`antlr4-tools` + `antlr4-python3-runtime`)

```bash
pip3 install -r requirements.txt
```

3. Generate parser

```bash
antlr4 -Dlanguage=Python3 bbyCBL.g4 -o parser
```

### Usage

1. Run Parser on test cases

```bash
python3 parserRunner.py
```

* *Some debugging options (printing ASTs, logging, number of iterations) can be configured directly in* `parserRunner.py`.

The script also accepts the following command-line flags:

```bash
python3 parserRunner.py              # Run all tests
python3 parserRunner.py -FAILED      # Run only previously failed tests
python3 parserRunner.py -CLEAR       # Clear the failed tests directory
python3 parserRunner.py -WORKERS N   # Use N worker processes (default: all CPU cores)
```
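
As a rough illustration of how single-dash flags like these can be declared, here is a minimal `argparse` sketch. It is not taken from `parserRunner.py`; the function name `build_arg_parser` and the help texts are assumptions made for the example.

```python
import argparse
import os


def build_arg_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI that mirrors the flags listed above."""
    parser = argparse.ArgumentParser(prog="parserRunner.py")
    # argparse accepts multi-character options with a single dash;
    # the attribute name is the option without the dash (e.g. args.FAILED).
    parser.add_argument("-FAILED", action="store_true",
                        help="run only previously failed tests")
    parser.add_argument("-CLEAR", action="store_true",
                        help="clear the failed tests directory")
    parser.add_argument("-WORKERS", type=int, default=os.cpu_count(),
                        metavar="N", help="number of worker processes")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_arg_parser().parse_args(["-FAILED", "-WORKERS", "4"])
print(args.FAILED, args.CLEAR, args.WORKERS)
```

Defaulting `-WORKERS` to `os.cpu_count()` matches the "all CPU cores" default described above, though the real script may compute it differently.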

#### recombined_formatted Test Case coverage:

[Test case coverage (attachment)](https://github.com/user-attachments/assets/897e8f60-5af5-4ef5-be55-d0d27c2b18b6)

> [!WARNING]
> This implementation of the parser is tuned almost exclusively for the recombined_formatted test cases. It is not syntactically pretty, particularly readable, or adaptable to anything else, and it is not something we want to keep working with while implementing the LLVM compiler backend. We are working on a more generalized version in the [dev branch](https://github.com/SilasGitHub/software-evolution/blob/dev/bbyCBL.g4), using takeaways from the lecture on 7-5, but that version still lacks some of the features this one has for claiming XP.

### Lexer + Parser Progress:

##### Basic Parsing Requirements ✅ 

| Requirement | Implemented | Location |
|-------------|-------------|----------|
| 1XP - Basic statements (ACCEPT, ALTER, GO TO, IF, PERFORM, SIGNAL) | ✅ | bbyCBL.g4: `acceptStmt`, `alterStmt`, `gotoStmt`, `ifStmt`, `performStmt`, `signalStmt` |
| 2XP - COPY, DISPLAY, IDENTIFICATION DIVISION | ✅ | bbyCBL.g4: `identificationDivision`, `identificationClause`, `copyStmt`, `displayStmt` |
| 2XP - Arithmetic operations (ADD, CALL, DIVIDE, MOVE, MULTIPLY, SUBTRACT) | ✅ | bbyCBL.g4: `addStmt`, `callStmt`, `divideStmt`, `moveStmt`, `multiplyStmt`, `subtractStmt` |
| 3XP - DATA DIVISION (nested structures with OF) | ✅ | bbyCBL.g4: `dataDivision`, `dataEntry`, and `qualifiedId` for OF references |
| 3XP - EVALUATE, LOOP (detachable clauses) | ✅ | bbyCBL.g4: `evaluateStmt`, `loopStmt` with interleaved `loopControl` and `statement` |
| 4XP - Sentences, statements (NEXT SENTENCE, STOP) | ✅ | bbyCBL.g4: `sentence`, `nextSentenceStmt`, `stopStmt` |

##### Position-based Parsing & Line Continuations (4XP) ✅ 

| Requirement | Implemented | Location |
|-------------|-------------|----------|
| Ignore columns 1-6 (sequence number) | ✅ | preprocess_cobol(): Skips first 6 columns |
| Process column 7 (line status indicator) | ✅ | preprocess_cobol(): Uses `indicator = line[6]` |
| Handle space for normal line | ✅ | preprocess_cobol(): `if indicator == " "` |
| Handle asterisk for comment line | ✅ | preprocess_cobol(): `if indicator == "*": continue` |
| Handle hyphen for line continuation | ✅ | preprocess_cobol(): `if indicator == "-"` |
| Raise error for other indicators | ✅ | preprocess_cobol(): `raise ValueError(f"Invalid line indicator...")` |
| Parse Area A (columns 8-11) | ✅ | preprocess_cobol(): `area_a = line[7:11]` and `validate_area()` |
| Parse Area B (columns 12-72) | ✅ | preprocess_cobol(): `area_b = line[11:72]` and `validate_area()` |
| Ignore columns 73-80 | ✅ | preprocess_cobol(): Only uses columns up to 72 |
| Process combined lines with continuations | ✅ | preprocess_cobol(): `current_line = current_line.rstrip() + glue + continuation` |
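
The column rules in this table can be sketched as a small standalone function. This is a simplified illustration, not the repository's `preprocess_cobol()`: the name `preprocess_fixed_format` is hypothetical, Area A and Area B are not validated separately, and joining continuations with a single space (`glue=" "`) is an assumption.

```python
def preprocess_fixed_format(source: str, glue: str = " ") -> list[str]:
    """Collapse fixed-format source lines into logical lines (sketch)."""
    logical_lines: list[str] = []
    for raw in source.splitlines():
        if not raw.strip():
            continue                      # skip completely blank lines
        line = raw.ljust(7)               # pad so that column 7 always exists
        indicator = line[6]               # column 7: line status indicator
        content = line[7:72]              # columns 8-72; 1-6 and 73-80 are dropped
        if indicator == "*":
            continue                      # comment line
        if indicator == " ":
            logical_lines.append(content.rstrip())
        elif indicator == "-":
            if not logical_lines:
                raise ValueError("Continuation line with nothing to continue")
            # Append the continuation to the previous logical line.
            logical_lines[-1] = logical_lines[-1].rstrip() + glue + content.strip()
        else:
            raise ValueError(f"Invalid line indicator {indicator!r}")
    return logical_lines


sample = "\n".join([
    "000100 IDENTIFICATION DIVISION.",
    "000200* a comment line",
    "000300     DISPLAY 'HELLO'",
    "000400-        'WORLD'.",
])
for logical in preprocess_fixed_format(sample):
    print(repr(logical))
```

In this sketch the comment line disappears and the hyphen line is merged into the `DISPLAY` line before it. How the real preprocessor chooses its `glue` (for example, when to join without a space) is not shown here.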

##### Case Insensitivity (5XP) ✅ 

| Requirement | Implemented | Location |
|-------------|-------------|----------|
| Case-insensitive keywords and identifiers | ✅ | normalize_case(): Converts keywords to uppercase |
| Case-sensitive string literals | ✅ | normalize_case(): `if in_string: out.append(ch)` |
| Case-sensitive PICTURE clauses | ✅ | normalize_case(): `if in_picture: out.append(ch)` |
| Case-sensitive comment lines | ✅ | preprocess_cobol(): Comment lines are skipped entirely |
| Case-sensitive ID division values | ✅ | normalize_case(): `if in_id_value: out.append(ch)` |
| Ambiguity resolution - uppercase keywords | ✅ | normalize_case(): `if lower in ALL_KEYWORDS: out.append(word.upper())` |
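
To make the idea concrete, here is a minimal sketch of case normalization that leaves string literals and PICTURE strings untouched. It is not the repository's `normalize_case()`: the name `normalize_case_sketch` is hypothetical, it uppercases every word outside the protected regions rather than consulting a keyword list such as `ALL_KEYWORDS`, and it does not handle ID division values.

```python
import re

# Tokens: double-quoted literal, single-quoted literal, whitespace run,
# run of other characters, or any single leftover character (e.g. a stray quote).
TOKEN = re.compile(r'''"[^"]*"|'[^']*'|\s+|[^\s"']+|.''')


def normalize_case_sketch(line: str) -> str:
    """Uppercase words outside string literals and PICTURE strings."""
    out = []
    keep_next_word = False                 # True right after PIC / PICTURE [IS]
    for tok in TOKEN.findall(line):
        if tok[0] in "\"'" or tok.isspace():
            out.append(tok)                # literals and whitespace stay as-is
            continue
        upper = tok.upper()
        if keep_next_word and upper != "IS":
            out.append(tok)                # the picture string keeps its case
            keep_next_word = False
        else:
            out.append(upper)
            if upper in ("PIC", "PICTURE"):
                keep_next_word = True
    return "".join(out)


print(normalize_case_sketch("display 'Hello World' with no advancing"))
print(normalize_case_sketch("03 amount pic s9(4)v99."))
```

A token-based pass like this is easier to read than a character-by-character state machine, at the cost of not supporting literals that span several tokens in unusual ways (for example, doubled quotes inside a literal).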

##### Whitespace Insignificance (5XP) ⚠️

| Requirement | Implemented | Location |
|-------------|-------------|----------|
| Preserve whitespace in strings | ✅ | normalize_case(): Preserves all characters in strings |
| Preserve whitespace in comments | ✅ | preprocess_cobol(): Comment lines are kept intact (though skipped) |
| Preserve whitespace in ID division values | ✅ | normalize_case(): Preserves characters in ID values |
| Ignore other whitespace | ⚠️ | Partial: ANTLR grammar has `WS: [ \t\r\n]+ -> skip;` but doesn't handle all cases |
| Ambiguity resolution with whitespace | ⚠️ | Partial: join_without_space logic helps but doesn't fully implement the test cases |

#### Example Parse Error debug info:

[Example parse error debug output (attachment)](https://github.com/user-attachments/assets/19600a5c-9751-4663-b377-640dadfed993)
