https://github.com/NERSC/pytorch-examples
PyTorch examples for NERSC systems
- Host: GitHub
- URL: https://github.com/NERSC/pytorch-examples
- Owner: NERSC
- Created: 2018-10-08T19:25:18.000Z (almost 7 years ago)
- Default Branch: main
- Last Pushed: 2024-10-28T23:52:26.000Z (8 months ago)
- Last Synced: 2025-03-04T19:02:08.400Z (4 months ago)
- Language: Jupyter Notebook
- Size: 371 KB
- Stars: 31
- Watchers: 18
- Forks: 12
- Open Issues: 5
Metadata Files:
- Readme: README.md
README
# NERSC PyTorch examples
This repository contains some PyTorch example models and training code
with support for distributed training on NERSC systems.

The layout of this package can also serve as a template for PyTorch
projects, and the provided `BaseTrainer` and `train.py` script can be used to
reduce boilerplate.

## Package layout
The directory layout of this repo is designed to be flexible:
- Configuration files (in YAML format) go in `configs/`
- Dataset specifications using PyTorch's Dataset API go into `datasets/`
- Model implementations go into `models/`
- Trainer implementations go into `trainers/`. Trainers inherit from
  `BaseTrainer` and are responsible for constructing models as well as training
  and evaluating them.

All examples are run with the generic training script, `train.py`.
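
The division of labor described above can be sketched in plain Python. This is only an illustration of the pattern, not the repository's actual code: the `BaseTrainer` below is a stand-in, and its method names (`build_model`, `train_epoch`, `evaluate`), the `TRAINERS` registry, `get_trainer`, and the config keys are all assumptions made for the example. The toy model is a single scalar weight, so the sketch runs without PyTorch.

```python
from abc import ABC, abstractmethod


class BaseTrainer(ABC):
    """Stand-in for the repository's BaseTrainer (hypothetical interface)."""

    def __init__(self, config):
        self.config = config
        self.model = None

    @abstractmethod
    def build_model(self):
        """Construct the model and store it on self.model."""

    @abstractmethod
    def train_epoch(self, data):
        """Run one pass over the training data and return a summary dict."""

    @abstractmethod
    def evaluate(self, data):
        """Evaluate the current model and return a summary dict."""


class HelloTrainer(BaseTrainer):
    """Toy trainer: fits one scalar weight w so that w * x approximates y."""

    def build_model(self):
        self.model = {"w": 0.0}

    def train_epoch(self, data):
        lr = self.config["lr"]
        for x, y in data:
            # Gradient of the squared error (w * x - y) ** 2 with respect to w.
            grad = 2.0 * (self.model["w"] * x - y) * x
            self.model["w"] -= lr * grad
        return self.evaluate(data)

    def evaluate(self, data):
        loss = sum((self.model["w"] * x - y) ** 2 for x, y in data) / len(data)
        return {"loss": loss}


# Hypothetical registry: a generic script can select a trainer by name.
TRAINERS = {"hello": HelloTrainer}


def get_trainer(name, config):
    return TRAINERS[name](config)


def main(config):
    """Generic driver: build the trainer named in the config, then train."""
    trainer = get_trainer(config["trainer"], config)
    trainer.build_model()
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy samples of y = 2 * x
    history = [trainer.train_epoch(data) for _ in range(config["n_epochs"])]
    return trainer, history


# In the repository, settings like these would come from a YAML file in configs/.
config = {"trainer": "hello", "lr": 0.05, "n_epochs": 5}
trainer, history = main(config)
print(history[-1], trainer.model)
```

Keeping model construction, training, and evaluation inside the trainer means the driver only needs a trainer name and a config, which is what allows a single generic script to run every example.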
## Examples
This package currently contains the following examples:
- CIFAR10 classification with ResNet50 or a generic CNN model.
- HEP-CNN classification (https://arxiv.org/abs/1711.03573).
- Minimal Hello World example.

## How to run
To run the examples on the Perlmutter supercomputer, you may use the
provided example Slurm batch script:

`sbatch -N 4 scripts/train_perlmutter.sh configs/cifar10.yaml`
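
If you want to submit the same configuration at several node counts, for example to compare scaling, the command line can be assembled programmatically. The helper below is a hypothetical convenience and is not part of the repository. It only builds and prints the commands; the script and config paths are the ones from the command above, and `-N` is the standard `sbatch` option for the number of nodes.

```python
import shlex


def sbatch_command(config_path, n_nodes, script="scripts/train_perlmutter.sh"):
    """Return the sbatch argument list for one training job (hypothetical helper)."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return ["sbatch", "-N", str(n_nodes), script, config_path]


# Build one submission command per node count without running anything.
commands = [sbatch_command("configs/cifar10.yaml", n) for n in (1, 2, 4)]
for cmd in commands:
    print(shlex.join(cmd))
```

To actually submit a job from a login node, each argument list could be passed to `subprocess.run(cmd, check=True)`.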