Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/idorce/torchtrain
A small tool for PyTorch training
- Host: GitHub
- URL: https://github.com/idorce/torchtrain
- Owner: idorce
- License: mit
- Created: 2019-10-25T17:28:31.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2021-03-14T13:39:43.000Z (almost 4 years ago)
- Last Synced: 2024-09-13T13:34:51.375Z (3 months ago)
- Topics: deep-learning, machine-learning, ml, neural-network, python, pytorch, pytorch-tool, pytorch-training
- Language: Python
- Homepage:
- Size: 122 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# 🔥 torchtrain 💪
A small tool for PyTorch training.
## Features
- Avoid boilerplate code for training.
- Stepwise training.
- Automatic TensorBoard logging and tqdm progress bar.
- Count model parameters and save hyperparameters.
- DataParallel support.
- Early stopping.
- Save and load checkpoints to continue training.
- Catch out-of-memory exceptions to avoid breaking training.
- Gradient accumulation.
- Gradient clipping.
- Run only a few epochs, steps, and batches for code testing.

## Install
```
pip install torchtrain
```

## Example
See the docstring of the [`Trainer` class](https://github.com/idorce/torchtrain/blob/master/torchtrain/trainer.py) for detailed configuration options.
An incomplete minimal example:
```python
data_iter = get_data()
model = Bert()
optimizer = Adam(model.parameters(), lr=cfg["lr"])
criteria = {"loss": AverageAggregator(BCELoss())}
trainer = Trainer(model, data_iter, criteria, cfg, optimizer)
trainer.train(stepwise=True)
```

Or this version:

```python
from argparse import ArgumentParser

from sklearn.model_selection import ParameterGrid
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR
from transformers import AutoModel, BertTokenizer

from data.load import get_batch_size, get_data
from metrics import BCELoss
from models import BertSumExt
from torchtrain import Trainer
from torchtrain.metrics import AverageAggregator
from torchtrain.utils import set_random_seeds


def get_args():
    parser = ArgumentParser()
    parser.add_argument("--seed", type=int, default=233666)
    parser.add_argument("--run_ckp", default="")
    parser.add_argument("--run_dataset", default="val")
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--warmup", type=int, default=10000)
    parser.add_argument("--stepwise", action="store_false")
    # torchtrain cfgs
    parser.add_argument("--max_n", type=int, default=50000)
    parser.add_argument("--val_step", type=int, default=1000)
    parser.add_argument("--save_path", default="/tmp/runs")
    parser.add_argument("--model_name", default="BertSumExt")
    parser.add_argument("--cuda_list", default="2,3")
    parser.add_argument("--grad_accum_batch", type=int, default=1)
    parser.add_argument("--train_few", action="store_true")
    return vars(parser.parse_args())


def get_param_grid():
    param_grid = [
        {"pretrained_model_name": ["voidful/albert_chinese_tiny"], "lr": [6e-5]},
    ]
    return ParameterGrid(param_grid)


def get_cfg(args={}, params={}):
    cfg = {**args, **params}
    # other cfgs
    return cfg


def run(cfg):
    set_random_seeds(cfg["seed"])
    tokenizer = BertTokenizer.from_pretrained(cfg["pretrained_model_name"])
    bert = AutoModel.from_pretrained(cfg["pretrained_model_name"])
    data_iter = get_data(
        cfg["batch_size"], tokenizer, bert.config.max_position_embeddings
    )
    model = BertSumExt(bert)
    optimizer = Adam(model.parameters(), lr=cfg["lr"])
    scheduler = LambdaLR(
        optimizer,
        lambda step: min(step ** (-0.5), step * (cfg["warmup"] ** (-1.5)))
        if step > 0
        else 0,
    )
    criteria = {"loss": AverageAggregator(BCELoss())}
    trainer = Trainer(
        model,
        data_iter,
        criteria,
        cfg,
        optimizer,
        scheduler,
        get_batch_size=get_batch_size,
    )
    if cfg["run_ckp"]:
        return trainer.test(cfg["run_ckp"], cfg["run_dataset"])
    return trainer.train(stepwise=cfg["stepwise"])


def main():
    param_grid = get_param_grid()
    for i, params in enumerate(param_grid):
        print("Config", str(i + 1), "/", str(len(param_grid)))
        cfg = get_cfg(get_args(), params)
        metrics = run(cfg)
        print("Best metrics:", metrics)


if __name__ == "__main__":
    main()
```
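To make the feature list above concrete, here is a minimal sketch of what gradient accumulation, gradient clipping, checkpointing, and early stopping look like in a plain PyTorch training loop. This is an illustration only, not torchtrain's internals; the toy data, model, and thresholds are made up for the example.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Tiny synthetic regression task: y is the sum of 4 features.
x = torch.randn(16, 4)
y = x.sum(dim=1, keepdim=True)
batches = [(x[i:i + 4], y[i:i + 4]) for i in range(0, 16, 4)]

model = nn.Linear(4, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

grad_accum_batches = 2   # step the optimizer every 2 mini-batches
max_grad_norm = 1.0      # clip accumulated gradients to this L2 norm
patience = 3             # stop after 3 epochs without improvement

best_loss = float("inf")
best_state = None
stale_epochs = 0

for epoch in range(30):
    epoch_loss = 0.0
    optimizer.zero_grad()
    for i, (xb, yb) in enumerate(batches, start=1):
        # Scale the loss so accumulated gradients average over the group.
        loss = loss_fn(model(xb), yb) / grad_accum_batches
        loss.backward()
        epoch_loss += loss.item() * grad_accum_batches
        if i % grad_accum_batches == 0:
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
            optimizer.step()
            optimizer.zero_grad()
    if epoch_loss < best_loss:
        best_loss = epoch_loss
        # "Checkpoint": keep a copy of the best weights in memory.
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
        stale_epochs = 0
    else:
        stale_epochs += 1
        if stale_epochs >= patience:
            break  # early stop

model.load_state_dict(best_state)  # restore the best checkpoint
```

torchtrain's `Trainer` wraps this kind of loop behind configuration keys (e.g. `grad_accum_batch` in the example above), so the bookkeeping does not have to be rewritten per project.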