{"id":21220728,"url":"https://github.com/ydrmaster/operators","last_synced_at":"2025-07-10T12:31:04.840Z","repository":{"id":243977342,"uuid":"806384013","full_name":"YdrMaster/operators","owner":"YdrMaster","description":"算子库","archived":false,"fork":false,"pushed_at":"2024-06-12T08:20:44.000Z","size":52,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-06-13T10:43:17.224Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/YdrMaster.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-27T05:14:38.000Z","updated_at":"2024-06-12T08:20:48.000Z","dependencies_parsed_at":"2024-06-16T17:48:02.668Z","dependency_job_id":null,"html_url":"https://github.com/YdrMaster/operators","commit_stats":null,"previous_names":["ydrmaster/operators"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/YdrMaster%2Foperators","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/YdrMaster%2Foperators/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/YdrMaster%2Foperators/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/YdrMaster%2Foperators/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/YdrMaster","download_url":"https://codeload.github.com/YdrMaster/operators/tar.gz/refs/heads/main","host":{"name":"GitHub",
"url":"https://github.com","kind":"github","repositories_count":225637113,"owners_count":17500365,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-20T22:13:52.812Z","updated_at":"2025-07-10T12:31:04.834Z","avatar_url":"https://github.com/YdrMaster.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"﻿# InfiniOperators 算子库\n\n跨平台高性能统一算子库。形式为 C 接口动态库。\n\n## 简介\n\n### 算子接口设计\n\n采用3+1段式算子设计，每个算子都实现并对外暴露以下的 C 接口:\n\n- 第一阶段：构造硬件控柄（Handle）。用户提供控柄地址、硬件类型以及硬件序号。控柄所在的内存空间由用户管理。\n\n  ```C\n  infiniopStatus_t infiniopCreateHandle(infiniopHandle_t *handle_ptr, int device, int device_id);\n  ```\n\n- 第二阶段：构造算子描述（Descriptor）。用户提供描述符地址、硬件控柄、以及算子涉及的张量描述（含张量数据类型、形状和步长）。这一步会完成算子所需的与张量数据无关的预计算。\n\n  ```C\n  infiniopStatus_t infiniopCreateOpDescriptor(infiniopHandle_t handle, infiniopOpDescriptor_t *desc_ptr, infiniopTensorDescriptor_t t, ...);\n  ```\n\n- 第三阶段（可选）：计算额外工作空间。根据算子描述，计算算子所需的额外工作空间大小，并存储于用户提供的位置。具体空间分配由用户负责。\n\n  ```C\n  infiniopStatus_t infiniopGetOpWorkspaceSize(infiniopOpDescriptor_t desc, uint64_t *size);\n  ```\n\n- 第四阶段：计算。根据算子描述符，在指定的硬件上执行相应计算，用户需要提供输入输出的数据，以及硬件计算流（CPU 为 NULL）。\n\n  ```C\n  infiniopStatus_t infiniopGetOp(infiniopOpDescriptor_t desc, [void *workspace, uint64_t workspace_size,] void *output_data, void *input_data, ..., void *stream);\n  ```\n\n- 销毁描述和硬件控柄。\n\n  ```C\n  infiniopStatus_t infiniopDestroyOpDescriptor(infiniopOpDescriptor_t desc);\n  infiniopStatus_t infiniopDestroyHandle(infiniopHandle_t handle);\n  ```\n\n### 
Tensor descriptor design\n\nA tensor descriptor consists of the following parts:\n\n1. Data type, represented in 4 bytes in total by five fields: packing size (how many data values one element represents), sign bit, element size, mantissa bits, and exponent bits. It is defined as follows:\n\n```C\ntypedef struct DataLayout {\n    unsigned short\n        packed : 8,\n        sign : 1,\n        size : 7,\n        mantissa : 8,\n        exponent : 8;\n} DataLayout;\n```\n\n2. Number of dimensions: how many dimensions the tensor has. Type: `uint64_t`.\n\n3. Shape: the size of each dimension of the tensor. Type: `uint64_t*`.\n\n4. Strides: the stride of each dimension of the tensor. Type: `uint64_t*`.\n\nInterfaces for creating and destroying a tensor descriptor:\n\n```C\ninfiniopStatus_t infiniopCreateTensorDescriptor(infiniopTensorDescriptor_t *desc_ptr, DataLayout layout, uint64_t ndim, uint64_t *shape, uint64_t *strides);\ninfiniopStatus_t infiniopDestroyTensorDescriptor(infiniopTensorDescriptor_t desc);\n```\n\n## I. Usage\n\n### 1. Configuration\n\n#### View the current configuration\n\n```xmake\nxmake f -v\n```\n\n#### Configure for CPU (default)\n\n```xmake\nxmake f --cpu=true -cv\n```\n\n#### Configure for GPU\n\nThe CUDA path must be specified; it is usually given by `CUDA_HOME` or `CUDA_ROOT`.\n\n```xmake\nxmake f --nv-gpu=true --cuda=$CUDA_HOME -cv\n```\n\n#### Configure for MLU\n\n```xmake\nxmake f --cambricon-mlu=true -cv\n```\n\n#### Configure for NPU\n\n```xmake\nxmake f --ascend-npu=true -cv\n```\n\n### 2. Build and install\n\n```xmake\nxmake build \u0026\u0026 xmake install\n```\n\n### 3. Set environment variables\n\nSet the `INFINI_ROOT` and `LD_LIBRARY_PATH` environment variables as prompted by the output.\n\n### 4. Run operator tests\n\n```bash\ncd operatorspy/tests\npython operator_name.py [--cpu | --cuda | --cambricon | --ascend]\n```\n\n## II. Development\n\n### Directory structure\n\n```bash\n├── xmake.lua  # xmake build script\n├── include\n│   ├── ops\n│   │   ├── [operator_name].h  # Exported operator C interface and descriptor definitions\n│   ├── tensor\n│   │   ├── tensor_descriptor.h  # Exported tensor descriptor definition\n│   ├── handle\n│   │   ├── handle_export.h  # Exported hardware handle definition\n│   ├── *.h  # Exported core struct definitions\n├── src\n│   ├── devices\n│   │   ├── [device_name]\n│   │       ├── *.cc/.h # Code shared across one hardware type (e.g. cpu, NVIDIA)\n│   ├── ops\n│   │   ├── utils.h  # Code shared by all operators (e.g. assert)\n│   │   ├── [operator_name]  # Operator implementation directory\n│   │       ├── operator.cc # Operator C interface implementation (dispatches to an operator implementation based on the descriptor)\n│   │       ├── [device_name]\n│   │       │   ├── *.cc/.h/... 
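# Operator implementation code for a specific hardware type\n│   ├── *.h  # Core struct definitions\n│  \n├── operatorspy  # Python wrappers and test scripts\n    ├── tests\n    │   ├── operator_name.py  # Test script\n    ├── *.py     # Python wrapper code\n```\n\n### Adding new hardware\n\n- Add the new hardware type in `src/device.h` and `operatorspy/devices.py`; the entries in the two files must correspond one to one;\n- Add the build options and build rules for the new hardware in `xmake.lua`;\n- Write the hardware-specific handle implementation and shared code under `src/devices/[device_name]`;\n- Implement the operators for that hardware.\n\n### Adding a new operator\n\n- In `src/ops/[operator_name]`, add the C interfaces for creating and destroying the operator descriptor and for running the operator; note that declarations in the C interface header use the `__C __export` prefix;\n- In `src/ops/[operator_name]/[device_name]`, add the operator's implementation for each hardware type;\n- In `operatorspy/tests/[operator_name].py`, add the operator test.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fydrmaster%2Foperators","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fydrmaster%2Foperators","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fydrmaster%2Foperators/lists"}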