https://github.com/kyflores/scorelm
Analyze musical scores with language models!
- Host: GitHub
- URL: https://github.com/kyflores/scorelm
- Owner: kyflores
- Created: 2023-07-24T05:53:40.000Z (over 1 year ago)
- Default Branch: develop
- Last Pushed: 2023-11-16T23:35:01.000Z (about 1 year ago)
- Last Synced: 2024-11-09T15:05:51.188Z (3 months ago)
- Language: Python
- Size: 70.3 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
# ScoreLM
Analyze musical scores with language models!

## Setup
First, create a new virtualenv or conda environment with python & pip.
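For example, using the standard `venv` module (conda works just as well; the environment name is just an example):

```
# Create and activate a fresh environment
python3 -m venv scorelm-env
source scorelm-env/bin/activate
```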
Then:
```
pip install transformers datasets accelerate music21 deepspeed
```

Finally, note that the scripts generate some very large files, like datasets or model
weights. **Please DO NOT commit these to git!** The default names for these files are in the gitignore,
but please be careful.

## Data generation
Run `generate_data.py` to process scores from composers in music21's database into the model's language.
This script produces one `.jsonl` file per composer in the `dataset/` directory. To create a training dataset, use
`cat` to merge the composers you want into one file called `data.jsonl`:
```
cat bach.jsonl mozart.jsonl > data.jsonl
```
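Each line of these files is a JSON object containing the encoded score text. The real encoding lives in `generate_data.py`; as a purely illustrative sketch (the token format and field name below are assumptions, not the actual ones), writing such a line might look like:

```python
import json
import os

# Hypothetical (pitch, duration) events for one score; the real encoding
# produced by generate_data.py from music21 scores may differ.
events = [("C4", 1.0), ("E4", 1.0), ("G4", 2.0)]

# "|" marks the start of a measure (see the inference section).
text = "| " + " ".join(f"{pitch}:{dur}" for pitch, dur in events)

os.makedirs("dataset", exist_ok=True)
with open("dataset/example.jsonl", "a") as f:
    f.write(json.dumps({"text": text}) + "\n")
```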
Filenames can be repeated to include a composer multiple times. This can be used to influence the composition
of the dataset, or to balance it. For instance, `palestrina.jsonl` is the largest by a wide margin, so other
composers might need to be repeated to get more diverse and interesting generations.

## Training
Use `train.py` to train the model. This script finetunes Eleuther's Pythia model for the score text
format. Right now we use Pythia for a few reasons:
* Easy to access from Hugging Face: no downloading delta weights and merging them, as with the official LLaMA release.
* Offers many scaling options for different hardware configurations. The 70m variant is very manageable and can
be trained on average hardware without `deepspeed`.
* Pythia is trained on [The Pile](https://arxiv.org/pdf/2101.00027.pdf), which contains code text. Pretraining
on code (or other formal languages, like mathematics) is useful if the score text format resembles code.

Check out one of the [Pythia model cards](https://huggingface.co/EleutherAI/pythia-12b-v0) for more info.
Parameters are selected in `train_cfg.json`.
In particular, check out these parameters:
```
model_name:
Set by default to EleutherAI/pythia-70m-deduped, the smallest Pythia variant.
If you have enough VRAM, you can select 160m, 410m, etc.
max_length:
The sequence length the model can process, in tokens. Reduce if you're running out of VRAM.
batchsize:
How many items to process in a batch. Use the largest batchsize your VRAM allows. Reduce if
out of VRAM. When reducing batchsize, you may also need to reduce lr.
```
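Putting those together, a minimal `train_cfg.json` might look like this (the exact key set is whatever `train.py` reads; the values here are illustrative, not recommendations):

```
{
  "model_name": "EleutherAI/pythia-70m-deduped",
  "max_length": 512,
  "batchsize": 8,
  "lr": 1e-4
}
```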
TODO:
* Support checkpointing and checkpoint loading.

### Deepspeed
Deepspeed implements optimizations for training that can significantly reduce the VRAM
requirement, at the cost of additional main RAM use. To use deepspeed, launch training through `accelerate`:
```
accelerate launch --config_file accel_config.yaml train.py
```
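If you don't already have an `accel_config.yaml`, running `accelerate config` will generate one interactively. A DeepSpeed-enabled config might contain something along these lines (the values are illustrative and depend on your hardware):

```
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 2
  offload_optimizer_device: cpu
mixed_precision: fp16
num_processes: 1
```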
If you receive an error about CPU Adam like this: `AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'`,
then try reinstalling deepspeed with these arguments: `DS_BUILD_CPU_ADAM=1 BUILD_UTILS=1 pip install deepspeed -U`.
I have hit this problem on ROCm setups but not on CUDA.

## Inference
Use `infer.py` to output text with a trained model. This script loads the model
from a directory named `score-lm`, which is created by the training script when it finishes.
```
python infer.py --help
python infer.py -p "|"
```
The `-p` argument is the text to start generating from. Since a `|` is the start of a new measure,
it is the usual way to start generation of a new score. You could also manually enter your own score,
and let it continue from that.

`temperature` and `topk` are the key parameters to modify during inference. Both result
in more randomness when set higher. At low temperatures (<0.5) you'll likely get the same
chord repeated over and over, and at higher temperatures you'll get scores that Nancarrow would love.
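Conceptually, both parameters act on the model's next-token distribution: temperature rescales the logits before the softmax, and top-k discards all but the k most likely tokens. A stdlib-only sketch of that sampling step (not the actual `infer.py` code):

```python
import math
import random

def sample_top_k(logits, temperature=1.0, top_k=5):
    """Sample a token index after temperature scaling and top-k filtering."""
    # Keep only the indices of the top_k highest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Temperature < 1 sharpens the distribution; > 1 flattens it.
    scaled = [logits[i] / temperature for i in top]
    # Softmax over the surviving logits (shifted by the max for stability).
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(top, weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0]
# Low temperature concentrates mass on the argmax; high temperature spreads it out.
print(sample_top_k(logits, temperature=0.2, top_k=2))
```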
TODO:
* Expose these on the CLI.

## Reconstructing the score
`infer.py` will take the generated output from the model and try to reconstruct a musical score
out of it. Naturally, the model output is imperfect, so the decoder will ignore or fix certain
errors:
* Invalid note names (like H) or octaves (like -6) are ignored. This is done on a note-by-note
basis within a chord, so a single bad note doesn't drop the entire chord.
* Invalid durations (anything too small for musescore to display) are changed to 0.5.
* Duplicate notes are left alone, though this is probably worth fixing: Musescore throws a warning,
but renders them anyway.
* We simplify ties to be only `start` or `None`, ignoring the `continue` or `stop` codes in music21.
* Currently, we do not follow the measure tokens generated by the model, and instead append
everything into one long stream and let music21 figure out where the measure boundaries should be.
* It's a future goal to benchmark the model's ability to generate notes that properly sum to a measure.
* Handling of non-power-of-two durations (like triplets) seems bugged and needs more investigation, but
there aren't many triplets in the training data at the moment. They render, but Musescore produces
warnings.

## WSL2
Deepspeed hits an out-of-memory issue on WSL2, which is discussed in [this thread](https://github.com/microsoft/DeepSpeed/issues/2977).
Apply the workaround suggested there, removing the pin-memory call.