https://github.com/absarraashid3/translodep2c
TranslodeP2C is an AI-powered pseudocode-to-C++ transformer, built with a seq2seq model. It preprocesses structured pseudocode, trains on paired datasets, and generates efficient C++ code. With an intuitive Streamlit UI, TranslodeP2C enables seamless and intelligent code synthesis from natural language descriptions. 🚀
artificial-intelligence genrative-ai machine-learning
- Host: GitHub
- URL: https://github.com/absarraashid3/translodep2c
- Owner: AbsarRaashid3
- Created: 2025-02-27T12:45:03.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-03-02T10:45:29.000Z (3 months ago)
- Last Synced: 2025-03-02T11:27:43.114Z (3 months ago)
- Topics: artificial-intelligence, genrative-ai, machine-learning
- Language: Python
- Homepage:
- Size: 9.04 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# TranslodeP2C
## Overview
TranslodeP2C is an AI-powered pseudocode-to-C++ conversion system.
Leveraging a Transformer-based seq2seq model,
it translates pseudocode descriptions into structured C++ programs.
The project includes preprocessing, vocabulary building, training,
and inference, with an interactive Streamlit UI.
## Features
* Transformer-based sequence-to-sequence model for code generation.
* Converts pseudocode to C++ using deep learning.
* Preprocessing and vocabulary management for structured learning.
* Training pipeline with customizable hyperparameters.
* Inference system with greedy decoding.
* Streamlit-based web UI for user-friendly interactions.
## Installation
### Prerequisites
Ensure you have the following installed:
* Python 3.8+
* PyTorch
* Streamlit
* tqdm
## Setup
1. Clone the repository:
git clone https://github.com/absarraashid3/translodep2c.git
cd translodep2c
2. Install dependencies:
pip install -r requirements.txt
3. Prepare your dataset and place it in data/train/split/.
## Usage
### Preprocessing
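As a rough picture of what this step involves, the sketch below reads a TSV with a header row and keeps only the rows that have both a pseudocode line and a code line. The column names `text` and `code`, the tab-separated output layout, and the function names are assumptions made for illustration; the actual behavior of `src/preprocess.py` may differ.

```python
import csv
import io


def tsv_to_pairs(tsv_file, text_col="text", code_col="code"):
    """Yield (pseudocode, code) pairs from a TSV stream with a header row.

    The column names are assumptions, not taken from the repository.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        pseudo = (row.get(text_col) or "").strip()
        code = (row.get(code_col) or "").strip()
        # Skip rows missing either side, e.g. a lone closing brace
        # that has no pseudocode description.
        if pseudo and code:
            yield pseudo, code


def write_pairs(pairs, out_file):
    """Write one 'pseudocode<TAB>code' pair per line (hypothetical layout)."""
    for pseudo, code in pairs:
        out_file.write(f"{pseudo}\t{code}\n")


# Tiny in-memory example instead of a real dataset file.
sample = "text\tcode\nread n\tcin >> n;\n\t}\nprint n\tcout << n;\n"
pairs = list(tsv_to_pairs(io.StringIO(sample)))

out = io.StringIO()
write_pairs(pairs, out)
```

With a real dataset, the same functions would be called on file handles opened with `open(path, newline="", encoding="utf-8")`.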
Convert TSV training data into paired pseudocode-code format:
**python src/preprocess.py --input_tsv "C:\Projects\GenAi\data\train\split\spoc-train-train.tsv" --output_txt "C:\Projects\GenAi\data\train_pairs.txt"**
### Building Vocabulary
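A vocabulary for a seq2seq model is typically a mapping from tokens to integer ids, with a few reserved special tokens. The sketch below is one minimal way to build and pickle such a mapping. It assumes whitespace tokenization and the special tokens `<pad>`, `<sos>`, `<eos>`, and `<unk>`; none of these details are confirmed by the repository, and the function names are hypothetical.

```python
import pickle
from collections import Counter

# Reserved tokens (assumed): padding, start, end, unknown.
SPECIALS = ["<pad>", "<sos>", "<eos>", "<unk>"]


def build_vocab(sentences, min_freq=1):
    """Map each whitespace-separated token to an integer id."""
    counts = Counter(tok for s in sentences for tok in s.split())
    # Most frequent first, ties broken alphabetically, so ids are deterministic.
    tokens = sorted(
        (t for t, c in counts.items() if c >= min_freq),
        key=lambda t: (-counts[t], t),
    )
    return {tok: i for i, tok in enumerate(SPECIALS + tokens)}


def encode(sentence, vocab):
    """Convert a sentence to ids, wrapped in <sos> ... <eos>."""
    unk = vocab["<unk>"]
    body = [vocab.get(tok, unk) for tok in sentence.split()]
    return [vocab["<sos>"]] + body + [vocab["<eos>"]]


src_vocab = build_vocab(["read n", "print n"])

# Round-trip through pickle, as one would when saving to a .pkl file.
restored = pickle.loads(pickle.dumps(src_vocab))
ids = encode("read n x", restored)
```

Tokens that never appeared in the training pairs, such as `x` here, fall back to the `<unk>` id.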
Generate vocabulary pickle files from training pairs:
**python src/vocab.py --pairs_file "C:\Projects\GenAi\data\train_pairs.txt" --src_vocab_file "src/src_vocab.pkl" --tgt_vocab_file "src/tgt_vocab.pkl"**
## Training the Model
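Seq2seq training usually batches variable-length sequences by padding them and uses teacher forcing, where the decoder receives the target shifted by one position. The sketch below shows that data preparation in plain Python. It is a common setup, not a description of `src/train.py`; the pad id `0` and the helper names are assumptions.

```python
PAD = 0  # assumed padding id


def pad_batch(seqs, pad_id=PAD):
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


def make_decoder_io(tgt_batch):
    """Teacher forcing: the decoder sees tokens[:-1] and predicts tokens[1:]."""
    dec_in = [seq[:-1] for seq in tgt_batch]
    dec_out = [seq[1:] for seq in tgt_batch]
    return dec_in, dec_out


def loss_mask(dec_out, pad_id=PAD):
    """True where a position holds a real target, False where it is padding."""
    return [[tok != pad_id for tok in seq] for seq in dec_out]


# Two target sequences in the form <sos>=1 ... <eos>=2.
tgt = pad_batch([[1, 7, 8, 2], [1, 9, 2]])
dec_in, dec_out = make_decoder_io(tgt)
mask = loss_mask(dec_out)
```

In a PyTorch training loop, the same effect as the mask is commonly achieved by passing `ignore_index` set to the pad id to `torch.nn.CrossEntropyLoss`.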
Train the Transformer model for pseudocode-to-C++ conversion:
**python src/train.py --pairs_file "C:\Projects\GenAi\data\train_pairs.txt" --src_vocab_file "src/src_vocab.pkl" --tgt_vocab_file "src/tgt_vocab.pkl" --epochs 10 --batch_size 8**
## Inference
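The feature list mentions greedy decoding: at each step the model scores every vocabulary token given the tokens generated so far, and the highest-scoring token is appended until an end-of-sequence token appears. The sketch below shows that loop with a scripted stand-in for the model. `step_fn`, `toy_step`, and the token ids are hypothetical and are not taken from `src/infer.py`.

```python
def greedy_decode(step_fn, sos_id, eos_id, max_len=50):
    """Repeatedly pick the highest-scoring next token until <eos> or max_len.

    step_fn(prefix) must return one score per vocabulary id.
    """
    tokens = [sos_id]
    for _ in range(max_len):
        scores = step_fn(tokens)
        # Index of the largest score (first one wins on ties).
        next_id = max(range(len(scores)), key=scores.__getitem__)
        if next_id == eos_id:
            break
        tokens.append(next_id)
    return tokens[1:]  # drop the leading <sos>


def toy_step(prefix):
    """Stand-in for a trained model: emits id 4, then id 5, then <eos> (id 2)."""
    script = {1: 4, 2: 5}  # prefix length -> preferred token id
    best = script.get(len(prefix), 2)
    return [1.0 if i == best else 0.0 for i in range(6)]


decoded = greedy_decode(toy_step, sos_id=1, eos_id=2)

# A model that never emits <eos> is cut off by max_len.
capped = greedy_decode(lambda prefix: [0.0, 0.0, 0.0, 1.0], sos_id=1, eos_id=2, max_len=3)
```

Greedy decoding commits to one token per step and never revisits earlier choices, which is why beam search is a common follow-up improvement.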
Generate C++ code from input pseudocode:
**python src/infer.py --model_checkpoint transformer_seq2seq.pt --src_vocab_file "src/src_vocab.pkl" --tgt_vocab_file "src/tgt_vocab.pkl" --pseudocode "read n print factorial of n"**
## Web Application
Launch the Streamlit UI:
**streamlit run src/app.py**

Enter pseudocode and get auto-generated C++ code!
## Future Enhancements
* Implement beam search decoding for better predictions.
* Fine-tune on additional programming languages.
* Optimize the model for faster inference.
# 🚀 Transform pseudocode into real C++ with TranslodeP2C!



