Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/starjob42/Starjob
JSSP dataset for LLMs
Last synced: 3 days ago
- Host: GitHub
- URL: https://github.com/starjob42/Starjob
- Owner: starjob42
- Created: 2024-10-01T19:07:58.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-10-21T10:26:46.000Z (3 months ago)
- Last Synced: 2024-10-21T14:59:32.145Z (3 months ago)
- Language: Python
- Size: 12.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome_ai_agents - Starjob - JSSP dataset for LLMs (Building / Datasets)
README
# Starjob Dataset designed to train LLMs on JSSP
## Dataset Overview
**Dataset Name:** jssp_llm_format_120k.json
**Number of Entries:** 120,000
**Number of Fields:** 5
## Fields Description
1. **num_jobs**
- **Type:** int64
- **Number of Unique Values:** 12
2. **num_machines**
- **Type:** int64
- **Number of Unique Values:** 12
3. **instruction**
- **Type:** object
- **Number of Unique Values:** 120,000
- **Initial description of the problem detailing the number of jobs and machines involved.**
4. **input**
- **Type:** object
- **Number of Unique Values:** 120,000
- **Description of the problem in LLM format**
5. **output**
- **Type:** object
- **Number of Unique Values:** 120,000
- **Solution in LLM format**
6. **matrix**
- **Type:** object
- **Number of Unique Values:** 120,000
- **Input problem, OR-Tools makespan, and solution in matrix format**
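
The field list above can be checked against a downloaded copy of the file. The following is a minimal sketch, assuming the file is a top-level JSON array of objects (the README does not confirm the layout); `load_entries` and `missing_fields` are hypothetical helper names, not part of the repository.

```python
import json

# Field names taken from the README's field list.
EXPECTED_FIELDS = {
    "num_jobs", "num_machines", "instruction", "input", "output", "matrix",
}


def load_entries(path):
    """Load the dataset.

    Assumption (not confirmed by the README): the file holds a single
    top-level JSON array whose items are objects.
    """
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def missing_fields(entry):
    """Return the expected field names absent from one entry, sorted."""
    return sorted(EXPECTED_FIELDS - set(entry))
```

Calling `missing_fields` on each item returned by `load_entries("jssp_llm_format_120k.json")` should yield an empty list for every entry if the file matches the description.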
## Usage
This dataset can be used to train LLMs on job-shop scheduling problems (JSSP). Each entry gives the number of jobs, the number of machines, and the problem description and its solution formatted in natural language.
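
One way to turn an entry into a supervised fine-tuning pair is to join `instruction` and `input` into a prompt and use `output` as the target. The sketch below is illustrative only: the prompt template and the helper names `build_prompt` and `build_example` are assumptions, and `train_llama_3.py` may format entries differently.

```python
def build_prompt(entry):
    """Join instruction and input into one prompt string.

    The section markers below are a hypothetical template choice,
    not something the dataset or the training script prescribes.
    """
    return (
        f"### Instruction:\n{entry['instruction']}\n\n"
        f"### Input:\n{entry['input']}\n\n"
        "### Response:\n"
    )


def build_example(entry):
    """Return a (prompt, target) pair for supervised fine-tuning."""
    return build_prompt(entry), entry["output"]


# Hypothetical record: the strings are placeholders, not real dataset content.
sample = {
    "num_jobs": 2,
    "num_machines": 2,
    "instruction": "<problem statement naming the number of jobs and machines>",
    "input": "<problem description in LLM format>",
    "output": "<solution in LLM format>",
    "matrix": "<matrix-format problem and solution>",
}

prompt, target = build_example(sample)
print(prompt + target)
```

Keeping the prompt and target separate makes it possible to compute the training loss on the target tokens only, if the training setup supports that.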
# Setting Up Your Python Environment
Follow these instructions to create a virtual environment and install the necessary libraries.
## Step 1: Create a Virtual Environment
```bash
python3 -m venv llm_env
```
## Activate the Virtual Environment
After creating the virtual environment, activate it using the following command:
On Windows:
```bash
.\llm_env\Scripts\activate
```
On macOS and Linux:
```bash
source llm_env/bin/activate
```
# Install the Required Libraries
```bash
pip install -r requirements.txt
```
# Training
Make sure to put `dataset.json` under the `data` directory:
```bash
python train_llama_3.py
```
## License
This dataset is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license. For more details, see the [license description](https://creativecommons.org/licenses/by-sa/4.0/). The dataset will remain accessible for an extended period.