https://github.com/georgepar/slp
Utils and modules for Speech Language and Multimodal processing using pytorch and pytorch lightning
https://github.com/georgepar/slp
multimodal multimodal-deep-learning multimodal-learning natural-language-processing pytorch pytorch-lightning wandb
Last synced: 21 days ago
JSON representation
Utils and modules for Speech Language and Multimodal processing using pytorch and pytorch lightning
- Host: GitHub
- URL: https://github.com/georgepar/slp
- Owner: georgepar
- License: mit
- Created: 2018-11-21T21:46:19.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2023-02-16T03:00:13.000Z (over 2 years ago)
- Last Synced: 2024-11-16T03:25:45.998Z (11 months ago)
- Topics: multimodal, multimodal-deep-learning, multimodal-learning, natural-language-processing, pytorch, pytorch-lightning, wandb
- Language: Python
- Homepage:
- Size: 2.02 MB
- Stars: 21
- Watchers: 6
- Forks: 7
- Open Issues: 10
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# slp
* **Repo:** [https://github.com/georgepar/slp](https://github.com/georgepar/slp)
* **Documentation:** [https://georgepar.github.io/slp/latest/](https://georgepar.github.io/slp/latest/)slp is a framework for fast and reproducible development of multimodal models, with emphasis on
NLP models.It started as a collection of scripts and code I wrote / collected during my PhD and it evolves
accordingly.As such, the framework is opinionated and it follows a convention over configuration approach.
A heavy emphasis is put on:
- Enforcing best practices and reproducibility of experiments
- Making common things fast at the top-level and not having to go through extensive configuration options
- Remaining extendable. Extensions and modules for more use cases should be easy to add
- Out of the box extensive logging and experiment management
- Separating dirty / scratch code (at the script level) for quick changes and clean / polished code at the library levelThis is currently in alpha release under active development, so things may break and new features
will be added.## Dependencies
We use [Pytorch](https://pytorch.org/) (1.7) and the following libraries
- [Pytorch Lightning](https://pytorch-lightning.readthedocs.io/en/stable/)
- [huggingface/transformers](https://huggingface.co/transformers/)
- [Wandb](https://wandb.ai/)
- Python 3.8## Installation
You can use slp as an external library by installing from PyPI with
```
pip install slp
```Or you can clone it from github
```
git clone git@github.com:georgepar/slp
```We use [poetry](https://python-poetry.org/) for dependency management
When you clone the repo run:
```bash
pip install poetry
poetry install
```and a clean environment with all the dependencies will be created.
You can access it with `poetry shell`.**Note**: Wandb logging is enabled by default. You can either
- Create an account and run `wandb login` when you clone the repo in a new machine to store the results in the online managed environment
- Run `wandb offline` when you clone the repo to disable remote sync or use the `--offline` command
line argument in your scripts
- Use one of their self-hosted solutions## Create a new project based on slp
You can use the template at [https://github.com/georgepar/cookiecutter-pytorch-slp](https://github.com/georgepar/cookiecutter-pytorch-slp)
to create a new project based on slp```
pip install cookiecutter poetry
cookiecutter gh:georgepar/cookiecutter-pytorch-slp
# Follow the interactive configuration and a new folder with the project name you provided will appear
cd $PROJECT_NAME
poetry install # Installs slp and all other dependencies
```And you are good to go. Follow the instructions in the README of the new project you created. Happy coding
## Contributing
You are welcome to open issues / PRs with improvements and bug fixes.
Since this is mostly a personal project based around workflows and practices that work for me, I don't guarantee I will accept every change, but I'm always open to discussion.
If you are going to contribute, please use the pre-commit hooks under `hooks`, otherwise the PR will not go through the CI. And never, ever touch `requirements.txt` by hand, it will automatically be exported from `poetry`
```bash
cat <> .git/hooks/pre-commit
#!/usr/bin/env bashbash hooks/export-requirements-txt
bash hooks/checks
EOTchmod +x .git/hooks/pre-commit # Keep an up-to-date requirements.txt and run Linting, typechecking and tests
ln -s $(pwd)/hooks/commit-msg .git/hooks/commit-msg # Sign-off your commit
```## Cite
If you use this code for your research, please include the following citation
```
@ONLINE {,
author = "Georgios Paraskevopoulos",
title = "slp",
year = "2020",
url = "https://github.com/georgepar/slp"
}
```## Roadmap
* Optuna integration for hyperparameter tuning
* Add dataloaders for popular multimodal datasets
* Add multimodal architectures
* Add RIM, DNC and Kanerva machine implementations
* Write unit tests