https://github.com/kurama622/picopebble
A lightweight distributed machine learning training framework for beginners
- Host: GitHub
- URL: https://github.com/kurama622/picopebble
- Owner: Kurama622
- License: mit
- Created: 2024-01-28T02:18:57.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-11T14:59:22.000Z (about 1 year ago)
- Last Synced: 2025-04-07T03:24:58.847Z (6 months ago)
- Language: C++
- Homepage:
- Size: 3.16 MB
- Stars: 34
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
**ENGLISH** | **[中文版](./README_CN.md)**
# Introduction
PicoPebble is a lightweight distributed machine learning training framework for beginners. It uses MPI to pass parameters and update gradients between multiple machines, and it also allows for training on a single machine. The features currently supported by PicoPebble include:
- Synchronous training
- Asynchronous training
- Data parallelism
- Pipeline model parallelism

There are also several features in the development pipeline:
- Tensor model parallelism
- Passing parameters through Gloo
- Disaster recovery

# Dependency
Currently, PicoPebble relies on MPI for parameter synchronization, so you need to install OpenMPI. Please note that you should not install both OpenMPI and MPICH at the same time.
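To make "parameter synchronization" concrete, here is a minimal single-process sketch of what a synchronous data-parallel step computes. It is not PicoPebble's code: `average_gradients` and `sgd_step` are hypothetical helpers, and the workers are simulated as plain vectors instead of MPI ranks. In an MPI program, the averaging step would typically be an `MPI_Allreduce` with `MPI_SUM` followed by a division by the number of ranks.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of PicoPebble): element-wise mean of the
// gradients computed by each worker on its own shard of the data.
// With MPI, an all-reduce (sum) plus a division by the worker count would
// leave this same result on every rank.
std::vector<double> average_gradients(
    const std::vector<std::vector<double>>& worker_grads) {
  if (worker_grads.empty()) {
    throw std::invalid_argument("need at least one worker");
  }
  const std::size_t dim = worker_grads.front().size();
  std::vector<double> mean(dim, 0.0);
  for (const auto& grad : worker_grads) {
    if (grad.size() != dim) {
      throw std::invalid_argument("gradient sizes differ between workers");
    }
    for (std::size_t i = 0; i < dim; ++i) {
      mean[i] += grad[i];
    }
  }
  for (double& g : mean) {
    g /= static_cast<double>(worker_grads.size());
  }
  return mean;
}

// Hypothetical helper: one synchronous SGD update. Every replica applies the
// same averaged gradient, so all parameter copies stay identical.
void sgd_step(std::vector<double>& params,
              const std::vector<double>& mean_grad,
              double learning_rate) {
  if (params.size() != mean_grad.size()) {
    throw std::invalid_argument("parameter and gradient sizes differ");
  }
  for (std::size_t i = 0; i < params.size(); ++i) {
    params[i] -= learning_rate * mean_grad[i];
  }
}
```

For example, with three simulated workers reporting gradients `{1, 2}`, `{3, 4}` and `{5, 6}`, `average_gradients` returns `{3, 4}`; calling `sgd_step` on parameters `{1, 1}` with a learning rate of `0.5` then gives `{-0.5, -1}`.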
## CentOS 8
```bash
sudo yum install openmpi-devel -y
```

## Ubuntu
```bash
sudo apt install openmpi-bin libopenmpi-dev
```

## Arch Linux
```bash
sudo pacman -S openmpi
```

## Docker
```bash
docker build -t picopebble -f Dockerfile .

# for podman
# podman build -t picopebble -f Dockerfile .
```

# Build && run
## single-node or single-machine
```bash
# ./build_run.sh
./build_run.sh 1
```

## multi-node
```bash
./build_run.sh 3
```

# Reference
- [https://foundationsofdl.com/2022/02/12/neural-network-from-scratch-part-5-c-deep-learning-framework-implementation/](https://foundationsofdl.com/2022/02/12/neural-network-from-scratch-part-5-c-deep-learning-framework-implementation/)