https://github.com/starjob42/Starjob

JSSP dataset for LLMs
Last synced: 3 months ago

Host: GitHub
URL: https://github.com/starjob42/Starjob
Owner: starjob42
Created: 2024-10-01T19:07:58.000Z (7 months ago)
Default Branch: main
Last Pushed: 2024-10-21T10:26:46.000Z (6 months ago)
Last Synced: 2024-10-21T14:59:32.145Z (6 months ago)
Language: Python
Size: 12.7 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesome_ai_agents - Starjob - JSSP dataset for LLMs (Building / Datasets)

README

# Starjob: A Dataset Designed to Train LLMs on JSSP

## Dataset Overview

**Dataset Name:** jssp_llm_format_120k.json
**Number of Entries:** 120,000
**Number of Fields:** 6

## Fields Description

1. **num_jobs**
- **Type:** int64
- **Number of Unique Values:** 12

2. **num_machines**
- **Type:** int64
- **Number of Unique Values:** 12

3. **instruction**
- **Type:** object
- **Number of Unique Values:** 120,000
- **Description:** Initial description of the problem, detailing the number of jobs and machines involved

4. **input**
- **Type:** object
- **Number of Unique Values:** 120,000
- **Description:** The problem description in LLM format

5. **output**
- **Type:** object
- **Number of Unique Values:** 120,000
- **Description:** The solution in LLM format

6. **matrix**
- **Type:** object
- **Number of Unique Values:** 120,000
- **Description:** The input problem, the OR-Tools makespan, and the solution in matrix format
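
A quick way to check a downloaded copy against the field list above is to load it and summarize it. The following is a minimal sketch, assuming the file is a single JSON array of objects (the README does not state the exact layout); `load_entries` and `summarize` are hypothetical helper names, not part of the repository.

```python
import json
from collections import Counter

# Field names taken from the "Fields Description" list.
EXPECTED_FIELDS = {
    "num_jobs", "num_machines", "instruction", "input", "output", "matrix",
}


def load_entries(path):
    """Load the dataset, assuming the file holds one JSON array of objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize(entries):
    """Return a count of entries per (num_jobs, num_machines) pair and the
    set of expected fields that are missing from at least one entry."""
    sizes = Counter((e.get("num_jobs"), e.get("num_machines")) for e in entries)
    missing = set()
    for e in entries:
        missing |= EXPECTED_FIELDS - set(e)
    return sizes, missing
```

For example, `sizes, missing = summarize(load_entries("data/jssp_llm_format_120k.json"))` (the path is an assumption) would show how entries are distributed across problem sizes and whether any listed field is absent.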

## Usage

This dataset can be used to train LLMs on the job-shop scheduling problem (JSSP). Each entry gives the number of jobs, the number of machines, and the problem and its solution formatted in natural language.
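
As an illustration of how the `instruction`, `input`, and `output` fields might be combined for supervised fine-tuning, here is a minimal sketch. The prompt template and the function names `build_prompt` and `build_training_text` are assumptions made for this example; the repository's training script may format entries differently.

```python
def build_prompt(entry):
    """Build the text shown to the model: instruction plus problem description.

    The section headers below are an assumed, illustrative template.
    """
    return (
        f"### Instruction:\n{entry['instruction']}\n\n"
        f"### Input:\n{entry['input']}\n\n"
        f"### Response:\n"
    )


def build_training_text(entry):
    """Append the reference solution to the prompt to form one training example."""
    return build_prompt(entry) + entry["output"]
```

Keeping the prompt separate from the full training text makes it possible to reuse `build_prompt` at inference time, when no solution is available.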

# Setting Up Your Python Environment

Follow these instructions to create a virtual environment and install the necessary libraries.

## Step 1: Create a Virtual Environment

```bash
python3 -m venv llm_env
```

## Step 2: Activate the Virtual Environment

After creating the virtual environment, activate it with the command for your platform.

On Windows
```bash
.\llm_env\Scripts\activate
```

On macOS and Linux
```bash
source llm_env/bin/activate
```

## Step 3: Install the Required Libraries
```bash
pip install -r requirements.txt
```

# Training
Make sure to put `dataset.json` in the `data` directory, then run the training script:

```bash
python train_llama_3.py
```

## License

This dataset is licensed under the Creative Commons Attribution-ShareAlike 4.0 International license (CC BY-SA 4.0). For more details, see the [license description](https://creativecommons.org/licenses/by-sa/4.0/). The dataset will remain accessible for an extended period.
