https://github.com/iitis/llm-network-traffic-generation
https://github.com/iitis/llm-network-traffic-generation
Last synced: 9 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/iitis/llm-network-traffic-generation
- Owner: iitis
- Created: 2025-04-25T09:22:35.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2025-09-02T10:26:52.000Z (10 months ago)
- Last Synced: 2025-09-02T10:31:17.176Z (10 months ago)
- Language: Jupyter Notebook
- Size: 28.4 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Network Traffic Generation by Large Language Models (LLMs)
This repository presents a novel approach for generating **realistic network traffic** using **Large Language Models (LLMs)**, specifically **OpenAI’s GPT-4.1** and **GPT-5**.
Our method, called the **Large Language Model Network Traffic Generator (LLM-NTG)**, aims to bridge the gap between **realistic traffic generation** and the **expressive capabilities of LLMs**.
We employ a few-shot learning framework combined with a **human-in-the-loop feedback mechanism**, where generated traffic is continuously evaluated and refined.
---
## 📂 Repository Structure
### 1. `Datasets/`
Contains datasets required for traffic generation:
- **One-way** and **two-way traffic datasets**.
- **Sample inputs** in `.json`, `.pcapng`, and `.csv` formats for traffic generation.
---
### 2. `GPT-4.1/`
Experiments performed with **GPT-4.1**.
This folder has two subfolders: `Experiment_1/` and `Experiment_2/`.
Each experiment includes:
- **`Exp1_Sample_Packets_Extraction.ipynb`**
Extracts traffic data from the dataset and prepares **sample packets** for generation.
- **`Exp1_Traffic_Generation_gpt4.1.ipynb`**
Generates synthetic traffic and saves the output as `.json` files.
- **`Exp1_Statistics_of_Generated_Traffics_gpt4.1.ipynb`**
Computes statistics of the generated traffic.
(`Experiment_2/` follows the same structure.)
---
### 3. `GPT-5/`
Experiments performed with **GPT-5**.
This folder also has `Experiment_1/` and `Experiment_2/`.
Since the **same input samples** are used as in GPT-4.1, there is no extraction notebook here.
Each experiment includes:
- **`Exp1_Traffic_Generation_gpt5.ipynb`**
Generates synthetic traffic with GPT-5.
- **`Exp1_Statistics_of_Generated_Traffics_gpt5.ipynb`**
Computes statistics of the generated traffic.
(`Experiment_2/` follows the same structure.)
---
### 4. `Generated_Traffic/`
Contains the **generated traffic outputs** in `.json` format and .pcap files.
- Each experiment’s results are stored here.
- The `JSON_files/` subfolder contains details such as: .json format of generated traffic
- The `PCAP_files/` subfolder contains details such as: There are .pcap files of the generated traffi,c and Wireshark can be used to view them.
- The `Results/` subfolder contains details such as:
- **Token usage**
- **Computation time**
---
### 5. `pcap_converter.py`
A transformation script that converts generated `.json` traffic files into **`.pcap` format**, enabling further analysis with standard network traffic tools (e.g., Wireshark).
---
### 6. PCAP_GPT4.1_Experiment_1.png and PCAP_GPT5_Experiment_1.png
Wireshark representation of traffic generated by GPT 4.1 and 5 for Experiment~1.
---
## 🚀 Usage
1. Explore the **datasets** under `Datasets/`.
2. Run the notebooks in `GPT-4.1/` or `GPT-5/` for traffic generation.
3. Generated traffic will be saved under `Generated_Traffic/`.
4. Optionally, convert `.json` traffic files into `.pcap` format using:
```bash
python pcap_converter.py input.json output.pcap