https://github.com/zurd46/zurdsynthdatagen
This Electron project uses the OpenAI ChatCompletion API to generate synthetic datasets in either German (DE) or English (EN).
data data-structures dataset electron json jsonl nodejs openai synthetic
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/zurd46/zurdsynthdatagen
- Owner: zurd46
- Created: 2025-01-22T09:38:46.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-22T12:53:51.000Z (about 1 year ago)
- Last Synced: 2025-04-02T10:18:04.786Z (12 months ago)
- Topics: data, data-structures, dataset, electron, json, jsonl, nodejs, openai, synthetic
- Language: JavaScript
- Homepage:
- Size: 110 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Zurd SynthDataGen
This Electron project uses the OpenAI ChatCompletion API to generate synthetic datasets in either German (DE) or English (EN). The records generated by each request are automatically split across three JSONL files (`train.jsonl`, `val.jsonl`, `test.jsonl`), and new data is always **appended**, never overwritten. The user interface, however, displays only the **train** dataset in a table.
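
The split-and-append behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the 80/10/10 ratios, the function names `splitRecords`, `appendSplits`, and `readJsonl`, and the `outDir` parameter are all assumptions, since the README does not state how records are distributed across the three files.

```javascript
const fs = require("fs");
const path = require("path");

// Assumed ratios; the README does not state the real proportions.
const TRAIN_SHARE = 0.8;
const VAL_SHARE = 0.1;

// Partition records in order: train first, then val, the remainder goes to test.
function splitRecords(records) {
  const trainEnd = Math.floor(records.length * TRAIN_SHARE);
  const valEnd = trainEnd + Math.floor(records.length * VAL_SHARE);
  return {
    "train.jsonl": records.slice(0, trainEnd),
    "val.jsonl": records.slice(trainEnd, valEnd),
    "test.jsonl": records.slice(valEnd),
  };
}

// Append each partition to its file, one JSON object per line.
// appendFileSync creates the file if it is missing and never truncates it.
function appendSplits(records, outDir) {
  const parts = splitRecords(records);
  for (const [file, rows] of Object.entries(parts)) {
    if (rows.length === 0) continue;
    const lines = rows.map((row) => JSON.stringify(row)).join("\n") + "\n";
    fs.appendFileSync(path.join(outDir, file), lines, "utf8");
  }
  return parts;
}

// Read a JSONL file back into an array, e.g. to fill the train table.
function readJsonl(filePath) {
  if (!fs.existsSync(filePath)) return [];
  return fs
    .readFileSync(filePath, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}
```

Under these assumptions, calling `appendSplits` twice with the same batch doubles the row count in each file, and `readJsonl` on `train.jsonl` returns the rows a table view would display.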
## Features
- **Electron app** (GUI) with Materialize CSS in dark mode
- **Language selection** (DE/EN)
- **Model selection** (e.g., `gpt-4` or `gpt-3.5-turbo`)
- **Continuous appending** to `train.jsonl`, `val.jsonl`, and `test.jsonl` (no overwriting)
- **UI table** only shows entries from `train.jsonl` (train split)
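
To illustrate how the language and model selections might feed into a request, here is a hedged sketch that builds a Chat Completions request body. The `model` and `messages` fields follow the shape the OpenAI Chat Completions API expects; everything else (the prompt texts, the `buildChatRequest` name, and the `topic` and `count` parameters) is hypothetical and not taken from the project.

```javascript
// Hypothetical system prompts; the app's real prompts are not shown in the README.
const SYSTEM_PROMPTS = {
  DE: "Du erzeugst synthetische Datensätze. Antworte ausschließlich im JSONL-Format.",
  EN: "You generate synthetic datasets. Reply in JSONL format only.",
};

// Build a request body from the UI selections.
// `topic` and `count` are illustrative parameters, not confirmed app inputs.
function buildChatRequest({ language, model, topic, count }) {
  const systemPrompt = SYSTEM_PROMPTS[language];
  if (!systemPrompt) {
    throw new Error(`Unsupported language: ${language}`);
  }
  return {
    model,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: `Topic: ${topic}\nRecords: ${count}` },
    ],
  };
}
```

Keeping the payload construction in a pure function like this makes it easy to check the language and model handling without sending a real request.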

## Requirements
1. **Node.js** (version 14 or higher)
2. **OpenAI API Key** (in a `.env` file or set as an environment variable)
## Installation & Setup
1. **Clone the repository**:
```bash
git clone https://github.com/zurd46/ZurdSynthDataGen.git
cd ZurdSynthDataGen
```
2. **Install dependencies**:
```bash
npm install
```
3. **Set up your OpenAI API key**:
Create a file named `.env` in the project root and add:
```bash
OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxx
```
4. **Start the app**:
```bash
npm start
```
This opens an Electron window with the Zurd SynthDataGen interface.
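
If the app fails to start or requests are rejected, a missing key is a likely cause. The sketch below shows one way to fail early when `OPENAI_API_KEY` is absent. The `getApiKey` helper is hypothetical, and it assumes the `.env` file has already been loaded into the environment by whatever mechanism the project uses, which the README does not specify.

```javascript
// Hypothetical helper: return the API key or fail with a clear message.
// Accepts an env object so it can be checked without touching process.env.
function getApiKey(env = process.env) {
  const key = (env.OPENAI_API_KEY || "").trim();
  if (!key) {
    throw new Error(
      "OPENAI_API_KEY is not set. Add it to .env or set it as an environment variable."
    );
  }
  return key;
}
```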