Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/wh1isper/davincirunsdk
为类Jupyter交互式环境提供Notebook友好的Ascend分布式训练SDK
https://github.com/wh1isper/davincirunsdk
Last synced: about 1 month ago
JSON representation
为类Jupyter交互式环境提供Notebook友好的Ascend分布式训练SDK
- Host: GitHub
- URL: https://github.com/wh1isper/davincirunsdk
- Owner: Wh1isper
- License: mit
- Created: 2022-07-25T03:24:19.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2022-10-28T14:56:06.000Z (about 2 years ago)
- Last Synced: 2024-04-24T00:29:44.467Z (7 months ago)
- Language: Python
- Size: 5.87 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
![](https://img.shields.io/github/license/wh1isper/davincirunsdk)
![](https://img.shields.io/github/v/release/wh1isper/davincirunsdk)
![](https://img.shields.io/pypi/dm/davincirunsdk)
![](https://img.shields.io/github/last-commit/wh1isper/davincirunsdk)
![](https://img.shields.io/pypi/pyversions/davincirunsdk)
![](https://readthedocs.org/projects/davincirunsdk/badge/?version=latest)# davincirunsdk
此项目荣获[2022启智社区优秀开源项目](https://openi.org.cn/html/2022/notices_1024/653.html)
为类Jupyter交互式环境提供Notebook友好的Ascend分布式训练SDK,推荐在[AI靶场](https://git.openi.org.cn/datai/datai_tutorial)使用本脚本进行python脚本训练
## 特性
- `davincirun`命令,支持Modelarts Ascend训练作业,不再需要打包davinci文件夹
- `init_rank_table` 支持转换v0.1 hccl json -> v1.0 hccl json
- `start_distributed_train`, `wait_distributed_train` 根据v1.0 hccl json启动并等待分布式训练完成
- notebook友好,`output_notebook=True`支持在notebook中输出分布式训练日志更多见[SDK文档](http://davincirunsdk.readthedocs.io/)
![](./image/example.gif)
## 安装及使用
### 安装
`$pip install davincirunsdk`
### 调试环境(开发环境)
以[MindSpore1.5分布式训练教程](https://www.mindspore.cn/tutorials/zh-CN/r1.5/intermediate/distributed_training/distributed_training_ascend.html) 为例,使用本SDK可改造为
```python
import osos.environ['DATA_PATH'] = '/cache/cifar-10-batches-bin'
from davincirunsdk import start_and_wait_distributed_traincmd = ['python', 'resnet50_distributed_training.py']
start_and_wait_distributed_train(cmd, output_notebook=True)
```### 训练作业
以下命令将等价于`python davincirun.py train.py`
```bash
$davincirun train.py
```或在python文件中使用:
```python
from davincirunsdk import init_rank_table, start_and_wait_distributed_traininit_rank_table()
start_and_wait_distributed_train(['python', 'train.py'])
```### AI靶场全量运行
同[调试环境(开发环境)](#调试环境(开发环境)),不需要额外修改
更多细节见[案例](#案例)
## LICENSE
MIT License
## 开发指南
### 克隆并安装
```bash
$git clone https://git.openi.org.cn/Wh1isper/davincirunsdk.git
$cd davincirunsdk
$pip install -e ./
```### 单元测试
```bash
$pip install pytest
$pytest .
```### 文档构建
```bash
$pip install Sphinx sphinx-rtd-theme
$make html
```### 项目构成
`notebook`文件夹下是针对notebook运行环境修改的davincirun文件,以及sdk入口
`davincirunsdk`目录下,除了`notebook`外的文件,是原有davincurun代码,进行了python包改造,并按需启用了moxing对obs文件的支持
各项文档和API功能还在完善中,欢迎各位在issue中进行反馈
## 案例
[AI靶场分布式训练支持](https://git.openi.org.cn/Wh1isper/distrubuted-trainning-on-datai)
[盘古协同计算](https://git.openi.org.cn/PCL-Platform.Intelligence/AISynergy/src/branch/V2.0.0/examples/Pangu-AISync)
## 鸣谢
感谢华为云、鹏城实验室、AI靶场对本项目的大力支持和帮助,该项目已贡献给[AI靶场](https://git.openi.org.cn/datai)
# LINK
不如在这里给我们一个Star🌟鼓励一下!
🌟🌟🌟[Github](https://github.com/Wh1isper/davincirunsdk) 🌟🌟🌟
🌟🌟🌟[OpenI](https://git.openi.org.cn/Wh1isper/davincirunsdk) 🌟🌟🌟