{"id":19054679,"url":"https://github.com/ssbuild/deep_training","last_synced_at":"2025-05-16T16:03:24.019Z","repository":{"id":63268935,"uuid":"561596976","full_name":"ssbuild/deep_training","owner":"ssbuild","description":"deep learning","archived":false,"fork":false,"pushed_at":"2025-03-09T18:55:10.000Z","size":2258,"stargazers_count":150,"open_issues_count":7,"forks_count":17,"subscribers_count":6,"default_branch":"dev","last_synced_at":"2025-03-30T13:07:38.040Z","etag":null,"topics":["adalora","alora","deep","deep-training","lora","models","torch","training","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ssbuild.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-11-04T03:15:39.000Z","updated_at":"2025-03-09T15:38:37.000Z","dependencies_parsed_at":"2024-02-14T18:46:35.800Z","dependency_job_id":"cb0ec492-bce6-452d-9d49-685bfaf43f94","html_url":"https://github.com/ssbuild/deep_training","commit_stats":{"total_commits":307,"total_committers":3,"mean_commits":"102.33333333333333","dds":0.3908794788273615,"last_synced_commit":"484397014a037bc35194b82593aae8b56148185c"},"previous_names":[],"tags_count":35,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ssbuild%2Fdeep_training","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ssbuild%2Fdeep_training/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ssbuild%2Fdeep_training/releases","manifests
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ssbuild%2Fdeep_training/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ssbuild","download_url":"https://codeload.github.com/ssbuild/deep_training/tar.gz/refs/heads/dev","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247492513,"owners_count":20947544,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adalora","alora","deep","deep-training","lora","models","torch","training","transformers"],"created_at":"2024-11-08T23:39:20.830Z","updated_at":"2025-05-16T16:03:24.004Z","avatar_url":"https://github.com/ssbuild.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## transformer is all you need.\r\n- deep training framework based on transformers\r\n\r\n## install and download\r\n\r\n- pip install -U deep_training\r\n- 源码安装\r\n```text\r\npip install -U git+https://github.com/ssbuild/deep_training.git\r\n```\r\n\r\n- 源码重装\r\n```text\r\npip install -U git+https://github.com/ssbuild/deep_training.git --no-deps --force-reinstall\r\n```\r\n\r\n## support python 3.10 3.11 3.12 3.13\r\n  \r\n## update\r\n- \u003cstrong\u003e2024-06-10\u003c/strong\u003e\r\n   - 0.3.1 support glm4 https://github.com/ssbuild/glm4_finetuning \r\n   glm4v https://github.com/ssbuild/glm4v_finetuning \r\n\r\n- \u003cstrong\u003e2024-02-15\u003c/strong\u003e\r\n   - 0.2.11 support internlm2 https://github.com/ssbuild/internlm2_finetuning\r\n\r\n- \u003cstrong\u003e2023-12-02\u003c/strong\u003e\r\n   - 0.2.10 update qwen model for 1.8b 7b 14b 
72b\r\n   - 0.2.10.post0 fix qwen attention_mask\r\n\r\n\r\n- \u003cstrong\u003e2023-11-13\u003c/strong\u003e\r\n  - 0.2.9 release\r\n  - 0.2.9.post0 support chatglm3-6b-32k\r\n\r\n- \u003cstrong\u003e2023-10-22\u003c/strong\u003e\r\n  - 0.2.7\r\n    - support clip 完整训练 https://github.com/ssbuild/clip_finetuning \r\n    - support asr seq2seq 完整训练 https://github.com/ssbuild/asr_seq2seq_finetuning\r\n    - support asr ctc 完整训练 https://github.com/ssbuild/asr_ctc_finetuning\r\n    - support object detection 完整训练 https://github.com/ssbuild/detection_finetuning\r\n    - support semantic segmentation 完整训练 https://github.com/ssbuild/semantic_segmentation\r\n    - support chatglm3  完整训练 https://github.com/ssbuild/chatglm3_finetuning\r\n  - 0.2.7.post1\r\n    - support skywork 完整训练 https://github.com/ssbuild/skywork_finetuning\r\n  - 0.2.7.post2\r\n    - support bluelm 完整训练 https://github.com/ssbuild/bluelm_finetuning\r\n  - 0.2.7.post3\r\n    - support yi 完整训练 https://github.com/ssbuild/yi_finetuning\r\n  - 0.2.7.post4\r\n    - fix dataclass serialization in deepspeed\r\n    \r\n- \u003cstrong\u003e2023-10-16\u003c/strong\u003e\r\n  - 0.2.6 support muti-model\r\n    - visualglm 完整训练 https://github.com/ssbuild/visualglm_finetuning  \r\n    - qwen-vl 完整训练  https://github.com/ssbuild/qwen_vl_finetuning \r\n\r\n- \u003cstrong\u003e2023-10-07\u003c/strong\u003e\r\n  - 0.2.5 \r\n    - support colossalai 训练 ，策略 ddp ,gemini,gemini_auto，zero2,zero2_cpu,3d\r\n  - 0.2.5.post2 \r\n    - support accelerator 训练 , fix some bug in accelerator and hf trainer\r\n  - 0.2.5.post4 \r\n    - fix trainer some bug\r\n\r\n- \u003cstrong\u003e2023-09-26\u003c/strong\u003e\r\n  - 0.2.4 \r\n    - support transformers trainer and qwen-7b 新版 和 qwen-14b ， 旧版不再支持，旧版可以安装 deep_training \u003c= 0.2.3 \r\n  - 0.2.4.post3 \r\n    - support ia3 finetuning\r\n\r\n- \u003cstrong\u003e2023-09-21\u003c/strong\u003e\r\n  - 0.2.3 \r\n    - support dpo 完整训练 
[dpo_finetuning](https://github.com/ssbuild/dpo_finetuning)\r\n\r\n- \u003cstrong\u003e2023-09-06\u003c/strong\u003e\r\n  - 0.2.2 \r\n    - 调整baichuan模块命名 adjust baichuan v2 完整训练 [baichuan2_finetuning](https://github.com/ssbuild/baichuan2_finetuning)\r\n  - 0.2.2.post0 \r\n    - fix baichuan ptv2\r\n  - 0.2.2.post1 \r\n    - fix rwkv4 a bug\r\n  - 0.2.2.post4 \r\n    - fix llama and baichuan mask bug\r\n\r\n- \u003cstrong\u003e2023-09-02\u003c/strong\u003e\r\n  - 0.2.1 \r\n    - fix llama model\r\n\r\n- \u003cstrong\u003e2023-08-23\u003c/strong\u003e\r\n  - 0.2.0 \r\n    - release lora内部调整\r\n  - 0.2.0.post1 \r\n    - add xverse-13b chat  and fix muti lora\r\n  \r\n- \u003cstrong\u003e2023-08-16\u003c/strong\u003e\r\n  - 0.1.21 \r\n    - release 增加 5种 rope scale 方法 ， fix chatglm2-6b-32k 推理 rope_ratio\r\n  - 0.1.21.post1 \r\n    - fix moss rope\r\n  \r\n- \u003cstrong\u003e2023-08-09\u003c/strong\u003e\r\n  - 0.1.17 \r\n    - update qwen model\r\n  - 0.1.17.post0 \r\n    - update qwen config\r\n\r\n- \u003cstrong\u003e2023-08-08\u003c/strong\u003e\r\n  - 0.1.15.rc2 \r\n    - support XVERSE-13B  完整训练 [xverse_finetuning](https://github.com/ssbuild/xverse_finetuning)\r\n\r\n  \r\n- \u003cstrong\u003e2023-08-05\u003c/strong\u003e\r\n  - 0.1.13\r\n    - support qwen(千问)  完整训练 [qwen_finetuning](https://github.com/ssbuild/qwen_finetuning)\r\n  - 0.1.13.post2 \r\n    - fix quantization bug\r\n  - 0.1.14 \r\n    - release fix qwen stream\r\n\r\n- \u003cstrong\u003e2023-07-18\u003c/strong\u003e\r\n  - 0.1.12\r\n    - support InternLm(书生)  完整训练 [internlm_finetuning](https://github.com/ssbuild/internlm_finetuning)\r\n    - support baichuan v2 完整训练 [baichuan2_finetuning](https://github.com/ssbuild/baichuan2_finetuning)\r\n    - fix adalora some bugs\r\n    - support rwkv world training\r\n\r\n  \r\n- \u003cstrong\u003e2023-07-04\u003c/strong\u003e\r\n  - 0.1.11 rc1 \r\n    - support baichuan model  完整训练 [baichuan_finetuning](https://github.com/ssbuild/baichuan_finetuning)\r\n    
- support chatglm2 model  完整训练 [chatglm2_finetuning](https://github.com/ssbuild/chatglm2_finetuning)\r\n  - 0.1.11  \r\n    - fix baichuan and chatglm2 some bugs \r\n    - support conv2d for lora \r\n    - support arrow parquet dataset\r\n    \r\n- \u003cstrong\u003e2023-06-06\u003c/strong\u003e\r\n\r\n  \r\n- \u003cstrong\u003e2023-06-06\u003c/strong\u003e\r\n  - 0.1.10 \r\n    - release add qlora and support more optimizer and scheduler\r\n    - support lora prompt for deepspeed training\r\n    - support rwkv4  完整训练 [rwkv_finetuning](https://github.com/ssbuild/rwkv_finetuning)\r\n  - 0.1.10.post0 \r\n     - fix package setup for cpp and cu code for rwkv4\r\n  - 0.1.10.post1 \r\n     - fix infer for rwkv4\r\n\r\n\r\n- \u003cstrong\u003e2023-05-24\u003c/strong\u003e\r\n  - 0.1.8  \r\n    - fix load weight in prompt_tuning,p_tuning,prefix_tuning,adaption_prompt\r\n\r\n- \u003cstrong\u003e2023-05-19\u003c/strong\u003e\r\n  - 0.1.7 \r\n    - fix 0.1.5 rl bugs\r\n  - 0.1.7.post1 \r\n    - fix chatglm-6b-int4,chatglm-6b-int4 p-tuning-v2 training , fix ilql lightning import\r\n    - fix load weight in prompt_tuning,p_tuning,prefix_tuning,adaption_prompt\r\n  \r\n- \u003cstrong\u003e2023-05-10\u003c/strong\u003e\r\n  - 0.1.5 \r\n    - fix lora v2 modules_to_save 自定义额外训练模块\r\n    - support reward ppo  llm 完整训练 [rlhf_llm](https://github.com/ssbuild/rlhf_llm)\r\n    - support reward ppo  chatglm 完整训练 [rlhf_chatglm](https://github.com/ssbuild/rlhf_chatglm)\r\n    - support reward ppo  chatyuan 完整训练 [rlhf_chatyuan](https://github.com/ssbuild/rlhf_chatyuan)\r\n  - 0.1.5.post2 release\r\n    - fix prompt modules_to_save 自定义额外训练模块\r\n    - support ilql 离线模式训练 ilql  完整训练 [rlhf_llm](https://github.com/ssbuild/rlhf_llm)\r\n  - 0.1.5.post4 release\r\n    - fix opt model hidden_size for ppo ilql \r\n    - fix ppotrainer ilqltrainer deepspeed save weight\r\n    - import AdmaW from transformers or but torch firstly\r\n  \r\n- \u003cstrong\u003e2023-05-02\u003c/strong\u003e\r\n  - 0.1.4 
\r\n    - support prompt_tuning,p_tuning,prefix_tuning,adaption_prompt\r\n\r\n- \u003cstrong\u003e2023-04-21\u003c/strong\u003e\r\n  - 0.1.3rc0 \r\n    - support moss chat模型 完整训练参考 [moss_finetuning](https://github.com/ssbuild/moss_finetuning)\r\n    - moss 量化int4 int8推理\r\n  - 0.1.3.post0 \r\n    - 新版本基于lightning, pytorch-lightning 更名 lightning,分离numpy-io模块\r\n\r\n\r\n\r\n\r\n- \u003cstrong\u003e2023-04-11\u003c/strong\u003e\r\n  - 0.1.2 \r\n    - 重构lora v2, 增加adalora\r\n  - 0.1.2.post0 \r\n    - fix lova v1,lova v2 load_in_8bit\r\n- \u003cstrong\u003e2023-04-07\u003c/strong\u003e\r\n  - deep_training 0.1.1 \r\n    - update chatglm config \r\n- \u003cstrong\u003e2023-04-02\u003c/strong\u003e\r\n  - release 0.1.0 and lightning \u003e= 2\r\n- \u003cstrong\u003e2023-03-15\u003c/strong\u003e\r\n  - 0.0.18\r\n    - support ChatGLM模型(稳定版\u003e=0.0.18.post7) 完整训练参考 [chatglm_finetuning](https://github.com/ssbuild/chatglm_finetuning)\r\n  - fix deepspeed进程数据平衡 \r\n  - 0.0.18.post9 \r\n    - 增加流式输出接口stream_chat接口\r\n  - 0.0.20 ChatGLM lora \r\n    - 加载权重继续训练 ， 修改数据数据编码 ，权重自适应\r\n  - 0.0.21.post0 \r\n    - fix ChatGLM deepspeed stage 3 权重加载\r\n- \u003cstrong\u003e2023-03-09\u003c/strong\u003e\r\n  - 增加LLaMA 模型(并行版) 完整训练参考 [llama_finetuning](https://github.com/ssbuild/llama_finetuning)\r\n- \u003cstrong\u003e2023-03-08\u003c/strong\u003e\r\n  - 增加LLaMA 模型(非模型并行版) 完整训练参考 [poetry_training](https://github.com/ssbuild/poetry_training)\r\n- \u003cstrong\u003e2023-03-02\u003c/strong\u003e\r\n  - 增加loRA 训练 , lion,lamb优化器 , 完整训练参考 [chatyuan_finetuning](https://github.com/ssbuild/chatyuan_finetuning)\r\n- \u003cstrong\u003e2023-02-15\u003c/strong\u003e\r\n  - 增加诗歌PaLM预训练模型 \r\n- \u003cstrong\u003e2023-02-13\u003c/strong\u003e\r\n  - 增加中文语法纠错模型gector, seq2seq语法纠错模型 \r\n- \u003cstrong\u003e2023-02-09\u003c/strong\u003e\r\n  - 增加诗歌t5decoder预训练, 诗歌laMDA预训练模型 , t5encoder 预训练模型\r\n- \u003cstrong\u003e2023-02-07\u003c/strong\u003e\r\n  - 增加层次分解位置编码选项，让transformer可以处理超长文本\r\n- 
\u003cstrong\u003e2023-01-24\u003c/strong\u003e\r\n  - 增加诗歌gpt2预训练,诗歌t5预训练，诗歌unilm预训练\r\n- \u003cstrong\u003e2023-01-20\u003c/strong\u003e\r\n  - 增加对抗训练 FGM, FGSM_Local,FreeAT, PGD, FGSM,FreeAT_Local, 其中FreeAT推荐使用FreeAT_Local,FGSM 推荐使用 FGSM_Local\r\n- \u003cstrong\u003e2023-01-19\u003c/strong\u003e\r\n  - 增加promptbertcse监督和非监督模型\r\n- \u003cstrong\u003e2023-01-16\u003c/strong\u003e\r\n  - 增加diffcse 监督和非监督模型\r\n- \u003cstrong\u003e2023-01-13\u003c/strong\u003e\r\n  - 增加ESimcse 模型\r\n- \u003cstrong\u003e2023-01-11\u003c/strong\u003e\r\n  - 增加TSDAE句向量模型\r\n- \u003cstrong\u003e2023-01-09\u003c/strong\u003e\r\n  - 增加infonce监督和非监督,simcse监督和非监督,SPN4RE关系模型抽取\r\n- \u003cstrong\u003e2023-01-06\u003c/strong\u003e\r\n  - 增加onerel关系模型抽取，prgc关系模型抽取，pure实体模型提取\r\n- \u003cstrong\u003e2022-12-24\u003c/strong\u003e\r\n  - 增加unilm模型蒸馏和事件抽取模型\r\n- \u003cstrong\u003e2022-12-16\u003c/strong\u003e\r\n  - crf_cascad crf级联抽取实体\r\n  - span ner 可重叠多标签，非重叠多标签两种实现方式抽取实体\r\n  - mhs_ner 多头选择实体抽取模型\r\n  - w2ner 实体抽取模型\r\n  - tplinkerplus 实体抽取\r\n  - tpliner 关系抽取模型\r\n  - tplinkerplus 关系抽取模型\r\n  - mhslinker 多头选择关系抽取模型\r\n\r\n- \u003cstrong\u003e2022-11-17\u003c/strong\u003e: \r\n  - simcse-unilm 系列\r\n  - simcse-bert-wwm 系列 \r\n  - tnews circle loss\r\n  - afqmc siamese net similar\r\n- \u003cstrong\u003e2022-11-15\u003c/strong\u003e: \r\n  - unilm autotitle seq2seq autotitle\r\n  - 普通分类,指针提取命名实体,crf提取命名实体\r\n  - prefixtuning 分类 , prefixtuning 分类 , prefixtuning 指针提取命名实体 , prefixtuning crf 提取命名实体\r\n- \u003cstrong\u003e2022-11-12\u003c/strong\u003e: \r\n  - gplinker (全局指针提取)\r\n  - casrel (A Novel Cascade Binary Tagging Framework for Relational Triple Extraction 参考 https://github.com/weizhepei/CasRel)\r\n  - spliner (指针提取关系 sigmoid pointer or simple pointer)\r\n- \u003cstrong\u003e2022-11-11\u003c/strong\u003e: \r\n  - cluener_pointer 中文命名实体提取 和 cluener crf 中文命名实体提取\r\n  - tnews 中文分类\r\n- \u003cstrong\u003e2022-11-06\u003c/strong\u003e: \r\n  - mlm,gpt2,t5等模型预训练任务\r\n\r\n\r\n\r\n## tasks\r\n- 
\u003cstrong\u003epretraining\u003c/strong\u003e:\r\n  - \u003cstrong\u003e reference data \u003c/strong\u003e [subset of the THUCNews news text classification dataset](https://pan.baidu.com/s/1eS-QZpWbWfKtdQE4uvzBrA?pwd=1234)\r\n  - \u003cstrong\u003emlm pretraining\u003c/strong\u003e example: Chinese pretraining for bert, roberta, and similar models \r\n  - \u003cstrong\u003elm pretraining\u003c/strong\u003e example: Chinese pretraining for gpt2 and similar models \r\n  - \u003cstrong\u003eseq2seq pretraining\u003c/strong\u003e example: Chinese pretraining for t5 small and similar models \u0026nbsp;\u0026nbsp;\r\n  - \u003cstrong\u003eunilm pretraining\u003c/strong\u003e example: Chinese pretraining for unilm, bert, roberta, and similar models \u0026nbsp;\u0026nbsp;\r\n- \u003cstrong\u003eChinese classification\u003c/strong\u003e:\r\n  - example \u003cstrong\u003etnews Chinese classification\u003c/strong\u003e\r\n- \u003cstrong\u003enamed entity extraction\u003c/strong\u003e: \r\n  - \u003cstrong\u003ereference data\u003c/strong\u003e  cluener\r\n  - \u003cstrong\u003ecluener global pointer extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003ecluener crf extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003ecluener crf prompt extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003ecluener mhs ner multi-head selection extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003ecluener span pointer extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003ecluener crf cascade extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003ecluener tplinkerplus extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003epure extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003ecluener w2ner extraction\u003c/strong\u003e\r\n- \u003cstrong\u003erelation extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003ereference data\u003c/strong\u003e  [duie and CAIL (法研杯) first-stage data](https://github.com/ssbuild/cail2022-info-extract)\r\n  - \u003cstrong\u003egplinker relation extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003ecasrel relation extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003espliner relation extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003emhslinker relation extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003etplinker relation extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003etplinkerplus relation extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003eonerel relation extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003eprgc relation extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003espn4re relation extraction\u003c/strong\u003e\r\n- 
\u003cstrong\u003eevent extraction\u003c/strong\u003e\r\n  - \u003cstrong\u003ereference data\u003c/strong\u003e DuEE event extraction [DuEE v1.0 dataset](https://aistudio.baidu.com/aistudio/competition/detail/46/0/datasets)\r\n  - \u003cstrong\u003egplinker event extraction\u003c/strong\u003e\r\n- \u003cstrong\u003e prompt series\u003c/strong\u003e: \r\n  - example \u003cstrong\u003eprefixprompt tnews Chinese classification\u003c/strong\u003e\r\n  - example \u003cstrong\u003eprefixtuning tnews Chinese classification\u003c/strong\u003e\r\n  - example \u003cstrong\u003eprefixtuning cluener named entity extraction with global pointer\u003c/strong\u003e\r\n  - example \u003cstrong\u003eprefixtuning cluener named entity extraction with crf\u003c/strong\u003e\r\n  - example \u003cstrong\u003eprompt mlm: build your own data template set; for training, see pretrain/mlm_pretrain\u003c/strong\u003e\r\n  - example \u003cstrong\u003eprompt lm: build your own data template set; for training, see pretrain/seq2seq_pretrain ,  pretrain/lm_pretrain\u003c/strong\u003e\r\n- \u003cstrong\u003e simcse series\u003c/strong\u003e: \r\n  - \u003cstrong\u003esimcse-unilm series\u003c/strong\u003e  example unilm+simcse  \u0026nbsp;\u0026nbsp; \r\n  reference data\u0026nbsp;\u0026nbsp; [subset of the THUCNews news text classification dataset](https://pan.baidu.com/s/1eS-QZpWbWfKtdQE4uvzBrA?pwd=1234)\r\n  - \u003cstrong\u003esimcse-bert-wwm series\u003c/strong\u003e example mlm+simcse \u0026nbsp;\u0026nbsp;\r\n  reference data\u0026nbsp;\u0026nbsp; [subset of the THUCNews news text classification dataset](https://pan.baidu.com/s/1eS-QZpWbWfKtdQE4uvzBrA?pwd=1234)\r\n- \u003cstrong\u003e sentence embedding\u003c/strong\u003e: \r\n  - \u003cstrong\u003ecircle loss \u003c/strong\u003e example tnews circle loss\r\n  - \u003cstrong\u003esiamese net \u003c/strong\u003e example afqmc siamese net similar\r\n\r\n\r\n\r\n## optimizer\r\n```text\r\n   lamb,adma,adamw_hf,adam,adamw,adamw_torch,adamw_torch_fused,adamw_torch_xla,adamw_apex_fused,\r\n   adafactor,adamw_anyprecision,sgd,adagrad,adamw_bnb_8bit,adamw_8bit,lion,lion_8bit,lion_32bit,\r\n   paged_adamw_32bit,paged_adamw_8bit,paged_lion_32bit,paged_lion_8bit,\r\n   lamb_fused_dp adagrad_cpu_dp adam_cpu_dp adam_fused_dp\r\n```\r\n\r\n## scheduler\r\n```text\r\n  linear,WarmupCosine,CAWR,CAL,Step,ReduceLROnPlateau, 
cosine,cosine_with_restarts,polynomial,\r\n  constant,constant_with_warmup,inverse_sqrt,reduce_lr_on_plateau\r\n```\r\n\r\n  \r\n## works\r\nProvide a model factory and a lightweight, efficient training program, so that training models is easier to get started with.\r\n\r\n\r\n\r\n## Related links\r\n\r\n- [pytorch-task-example](https://github.com/ssbuild/pytorch-task-example)\r\n- [chatmoss_finetuning](https://github.com/ssbuild/chatmoss_finetuning)\r\n- [chatglm_finetuning](https://github.com/ssbuild/chatglm_finetuning)\r\n- [chatglm2_finetuning](https://github.com/ssbuild/chatglm2_finetuning)\r\n- [t5_finetuning](https://github.com/ssbuild/t5_finetuning)\r\n- [llm_finetuning](https://github.com/ssbuild/llm_finetuning)\r\n- [llm_rlhf](https://github.com/ssbuild/llm_rlhf)\r\n- [chatglm_rlhf](https://github.com/ssbuild/chatglm_rlhf)\r\n- [t5_rlhf](https://github.com/ssbuild/t5_rlhf)\r\n- [rwkv_finetuning](https://github.com/ssbuild/rwkv_finetuning)\r\n- [baichuan_finetuning](https://github.com/ssbuild/baichuan_finetuning)\r\n- [internlm_finetuning](https://github.com/ssbuild/internlm_finetuning)\r\n- [qwen_finetuning](https://github.com/ssbuild/qwen_finetuning)\r\n- [xverse_finetuning](https://github.com/ssbuild/xverse_finetuning)\r\n- [auto_finetuning](https://github.com/ssbuild/auto_finetuning)\r\n- [aigc_serving](https://github.com/ssbuild/aigc_serving)\r\n## \r\n    Pure and clean code\r\n\r\n## License\r\nThe code in this repository is open-sourced under the Apache-2.0 license.\r\n\r\n\r\n## discuss\r\nQQ group: 821096761\r\n\r\n\r\n## Star History\r\n\r\n[![Star History Chart](https://api.star-history.com/svg?repos=ssbuild/deep_training\u0026type=Date)](https://star-history.com/#ssbuild/deep_training\u0026Date)\r\n\r\n\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fssbuild%2Fdeep_training","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fssbuild%2Fdeep_training","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fssbuild%2Fdeep_training/lists"}