https://github.com/lorenzomaiuri-dev/quantum-gpt
A hybrid Quantum-Classical Transformer implementation based on nanoGPT, using PyTorch and PennyLane to replace attention heads with Variational Quantum Circuits (VQC).
gpt language-model nanogpt natural-language-processing pennylane pytorch qnlp quantum quantum-computing quantum-machine-learning research transformers variational-quantum-circuit
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/lorenzomaiuri-dev/quantum-gpt
- Owner: lorenzomaiuri-dev
- License: mit
- Created: 2025-12-21T17:11:24.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2026-01-06T23:12:11.000Z (5 months ago)
- Last Synced: 2026-01-09T03:46:19.749Z (5 months ago)
- Topics: gpt, language-model, nanogpt, natural-language-processing, pennylane, pytorch, qnlp, quantum, quantum-computing, quantum-machine-learning, research, transformers, variational-quantum-circuit
- Language: Python
- Homepage:
- Size: 2.98 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Quantum GPT (Hybrid QNN-NanoGPT)




A hybrid Quantum-Classical implementation of a Generative Pre-trained Transformer (GPT).
This project adapts Andrej Karpathy's `nanoGPT` architecture by replacing classical linear layers in the Self-Attention mechanism with **Variational Quantum Circuits (VQC)** using PennyLane.
## 🚀 Scientific Concept
In a standard Transformer, the Attention Head projects input tokens into Query, Key, and Value spaces using linear matrices ($W_Q, W_K, W_V$).
In this **Quantum-Hybrid architecture**, we replace these dense layers with a parameterized quantum evolution:
$$
x \xrightarrow{\text{Adapter}} z \in \mathbb{R}^n \xrightarrow{R(\phi)} |\psi(z)\rangle \xrightarrow{U(\theta)_{\text{entangle}}} \langle Z \rangle \to y
$$
**Where:**
* **Adapter**: A classical bottleneck layer compressing high-dimensional embeddings to $n$ real values, one rotation angle per qubit.
* $R(\phi)$: **Angle embedding** encoding classical data into quantum states.
* $U(\theta)$: A sequence of trainable entangling layers (**Strongly Entangling Layers**).
* $\langle Z \rangle$: **Expectation value measurement** returning the projected vector.
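For intuition, the encode-and-measure steps can be worked out by hand for a single qubit. This is a sketch under two assumptions that the README does not state: the angle embedding uses $R_X$ rotations (the default of PennyLane's `AngleEmbedding`), and the entangling block $U(\theta)$ is skipped:

$$
R_X(z)|0\rangle = \cos\tfrac{z}{2}\,|0\rangle - i\sin\tfrac{z}{2}\,|1\rangle
\quad\Rightarrow\quad
\langle Z \rangle = \cos^2\tfrac{z}{2} - \sin^2\tfrac{z}{2} = \cos z
$$

Under these assumptions each output would depend on a single input coordinate and stay within $[-1, 1]$. The trainable $U(\theta)$ is what mixes information across qubits before measurement.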
### Why?
This architecture lets us study whether the high-dimensional **Hilbert space** and **quantum interference** can capture semantic relationships with fewer parameters than classical linear layers, within the limits of simulating NISQ-scale circuits. In practice, it is a testbed for the expressivity of quantum circuits on a sequence modeling task.
Note: We employ a Quantum Bottleneck architecture. High-dimensional classical embeddings are projected down to a lower-dimensional quantum latent space via a trainable adapter, processed by the VQC, and projected back. This maintains computational feasibility while exploiting quantum interference.
## 📂 Project Structure
```text
quantum-gpt/
├── checkpoints/ # Saved models
├── data/ # Input text data
├── src/ # Source code
│ ├── config.py # Hyperparameters & flags
│ ├── dataset.py # Tokenizer & Dataloader
│ ├── model.py # Transformer Architecture
│ └── quantum_layers.py # PennyLane Circuits & Hybrid Layers
├── main.py # Entry point (Train/Generate)
└── requirements.txt # Dependencies
```
## 🛠️ Installation
Clone the repository:
```bash
git clone https://github.com/lorenzomaiuri-dev/quantum-gpt.git
cd quantum-gpt
```
Install dependencies:
```bash
pip install -r requirements.txt
```
## ⚡ Usage
### Training
To train the model on the Shakespeare dataset (included in `data/`):
```bash
python main.py --mode train
```
Note: Quantum simulation is CPU-intensive. The default configuration uses a "Quantum Bottleneck" (4-8 qubits) to keep training times feasible on consumer hardware.
### Generation
To generate text using the trained checkpoint:
```bash
python main.py --mode generate
```
## ⚙️ Configuration
You can modify hyperparameters in `src/config.py`:
```python
# Quantum Settings
USE_QUANTUM = True # Set False to use standard Linear Layers
N_QUBITS = 4 # Number of qubits per head
N_QLAYERS = 2 # Depth of the quantum circuit
```
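To get a feel for what these two settings cost, the sketch below counts the trainable circuit angles and the size of the simulated state vector. It assumes the circuit uses PennyLane's `StronglyEntanglingLayers`, whose weight tensor has shape `(layers, qubits, 3)`. The helper `circuit_cost` is hypothetical, and classical adapter weights are not counted.

```python
def circuit_cost(n_qubits: int, n_qlayers: int) -> tuple[int, int]:
    """Return (trainable circuit angles, state-vector amplitudes)."""
    # Three rotation angles per qubit per layer.
    n_params = n_qlayers * n_qubits * 3
    # A full state-vector simulator tracks 2**n complex amplitudes.
    n_amplitudes = 2 ** n_qubits
    return n_params, n_amplitudes

N_QUBITS = 4
N_QLAYERS = 2

params, amplitudes = circuit_cost(N_QUBITS, N_QLAYERS)
print(f"circuit angles: {params}, amplitudes: {amplitudes}")
# → circuit angles: 24, amplitudes: 16
```

The parameter count grows linearly with `N_QUBITS` and `N_QLAYERS`, while the simulated state grows exponentially with `N_QUBITS`, which is why the qubit count is kept small.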
## 🧠 Architecture Details
Embedding Dimension: 8 (scaled down for simulation speed)\
Heads: 2\
Qubits per Head: 4
## 📊 Preliminary Results (Coming Soon)
Planned comparison of classical (64 parameters) and hybrid quantum (4 qubits) attention heads:
- [ ] Loss Convergence: Comparing training stability.
- [ ] Parameter Efficiency: Can quantum circuits learn with fewer parameters?
- [ ] Runtime Analysis: Quantifying the overhead of quantum simulation.
## 🙏 Acknowledgements
Andrej Karpathy for the original nanoGPT and the accompanying video lecture.\
Xanadu for the PennyLane library used for quantum machine learning.