{"id":18895601,"url":"https://github.com/kindlingproject/space-capsule","last_synced_at":"2025-04-15T01:14:10.758Z","repository":{"id":37343079,"uuid":"459872125","full_name":"KindlingProject/space-capsule","owner":"KindlingProject","description":null,"archived":false,"fork":false,"pushed_at":"2022-12-16T10:01:46.000Z","size":54061,"stargazers_count":28,"open_issues_count":11,"forks_count":10,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-15T01:13:58.779Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KindlingProject.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-02-16T05:58:43.000Z","updated_at":"2025-03-24T03:49:09.000Z","dependencies_parsed_at":"2023-01-29T12:30:58.934Z","dependency_job_id":null,"html_url":"https://github.com/KindlingProject/space-capsule","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KindlingProject%2Fspace-capsule","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KindlingProject%2Fspace-capsule/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KindlingProject%2Fspace-capsule/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KindlingProject%2Fspace-capsule/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KindlingProject","download_url":"https://codeload.github.com/KindlingProject/space-capsule/tar.gz/refs/heads/main","host":{"name":"GitHu
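b","url":"https://github.com","kind":"github","repositories_count":248986315,"owners_count":21194025,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-08T08:28:59.528Z","updated_at":"2025-04-15T01:14:10.742Z","avatar_url":"https://github.com/KindlingProject.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Space-capsule: Space Capsule Project\n\n\u003cimg src=\"https://user-images.githubusercontent.com/64676199/172799692-068e06cc-5283-4d8c-a835-03ab3d2a621f.png\" width=\"200px\"\u003e\n\n## Introduction\n\nKindling is an open-source monitoring tool designed around standardized fault demarcation and localization. Its core idea is that, for the many kinds of failures an application can hit in cloud and container environments, a standardized sequence of steps should pinpoint where the fault occurred and why, without depending on the user's understanding of the application or accumulated knowledge of k8s and Linux.\n\nThe Space Capsule Project exists to test the standardized troubleshooting capability of monitoring tools and to drive the continued evolution of the Kindling project toward covering every class of fault.\n\nThe project collects common application failure scenarios, drawn from the project team's development and operations experience. They cover the kinds of failures seen in cloud projects, with causes including network devices, runtime resources, and application defects.\n\nThese scenarios are packaged as a preconfigured demo application plus fault injection programs, and can be reproduced in various k8s and cloud host environments.\n\nUsers can quickly turn their own test environment into a Space Capsule environment, deploy the demo application, and inject faults themselves to verify a monitoring tool's error detection, demarcation, and localization capabilities.\n\nWe hope the Kindling project can cover as many of the failure scenarios users meet in the cloud as possible. Interested developers are welcome to contribute defect scenarios we have not thought of, or to run error detection and localization tests against other monitoring tools, to help all eBPF monitoring tools improve together.\n\n## How to start on K8s\n### Supported k8s versions\n    v1.18.1 -\u003e 1.23.1 verified\n    If another version does not work, please open an issue; pull requests are even more welcome\n### Getting started on k8s\n1. Download the release to the master node (any node that can run kubectl commands will do) and extract it\n2. Create the demo namespace -\u003e kubectl create namespace practice\n3. 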
cd into the space-capsule directory and run ./install.sh to install [chaosblade-operator](https://github.com/chaosblade-io/chaosblade-operator) and the demo application\n- Run kubectl get po -n chaosblade to check whether chaosblade-operator was installed successfully\n```\n\nNAME                                   READY   STATUS    RESTARTS   AGE\nchaosblade-operator-748dc7588b-z9kts   1/1     Running   1          5d22h\nchaosblade-tool-69whc                  1/1     Running   3          12d\nchaosblade-tool-8jrxx                  1/1     Running   1          12d\nchaosblade-tool-8mcjx                  1/1     Running   1          4d23h\nchaosblade-tool-987fq                  1/1     Running   3          12d\nchaosblade-tool-ksw9w                  1/1     Running   2          12d\nchaosblade-tool-mwt76                  1/1     Running   1          12d\n```\n- Run kubectl get po -n practice to check whether the demo application was installed successfully\n```\n\nNAME                             READY   STATUS             RESTARTS   AGE\nbop-67bddbd49-n5nzf              1/1     Running            0          20h\nconfigservice-7f67b8846d-4h6kv   1/1     Running            0          3h49m\nconfigservice-fbcb85d77-8zqhb    1/1     Running            0          3h56m\ncoreservice-5779d97d6b-4hhs8     1/1     Running            0          3h51m\ncronservice-77dbf765f9-zfsrp     1/1     Running            0          4h7m\ndataservice-96d785bdc-r7lww      1/1     Running            0          3h51m\netcs-7b9d99bbb5-vs2ng            1/1     Running            0          3h50m\ngateway-f5c68b974-z4vcm          1/1     Running            0          3h50m\nng-b76f67475-2qwth               1/1     Running            0          3h50m\nriskservice-78956c8bf8-8ckqx     1/1     Running            0          3h49m\nrocketmq-0                       1/1     Running            0          4h\n```\n3.  
ng is the entry-point application. By default the ng-svc port is exposed through ingress-nginx (which you must set up yourself); if ingress-nginx is not available, change ng-svc to the NodePort type\n\n```\n[root@10 space-capsule-alpha]# kubectl get svc -n practice\nNAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE\nbop-svc        ClusterIP   10.99.95.76     \u003cnone\u003e        8080/TCP   35d\ncore-svc       ClusterIP   10.96.122.223   \u003cnone\u003e        8080/TCP   35d\ncron-svc       ClusterIP   10.108.84.197   \u003cnone\u003e        8080/TCP   34d\ndata-svc       ClusterIP   10.99.95.79     \u003cnone\u003e        8080/TCP   35d\netcs-svc       ClusterIP   10.99.95.80     \u003cnone\u003e        8080/TCP   35d\ngateway-svc    ClusterIP   10.99.95.75     \u003cnone\u003e        8080/TCP   35d\nng-svc         ClusterIP   10.99.95.71     \u003cnone\u003e        8080/TCP   36d\nrisk-svc       ClusterIP   10.99.95.78     \u003cnone\u003e        8080/TCP   35d\nrocketmq-svc   ClusterIP   10.99.95.77     \u003cnone\u003e        8080/TCP   5d20h\n```\n\n3. Use ./space-capsule case1 to create the first fault scenario; the example below injects case12 into a specific pod\n\n```\n[root@10 space-capsule-alpha]# ./space-capsule case12 --namespace practice --pod coreservice-67dd66b57c-hsm2b\nCheck result True\nChaosblade Exist\nCopy file finished\n['bash', '-c', '/opt/chaosblade/blade prepare jvm  --process java    ']\n{\"code\":200,\"success\":true,\"result\":\"01aa8dca66762cf7\"} None\nagent ['01aa8dca66762cf7']\nslow_code injected done！\n```\n4. Use ./space-capsule undo case1 to revert the first fault scenario; the example below reverts case12\n\n```\n[root@10 space-capsule-alpha]# ./space-capsule undo case12\n{\"code\":200,\"success\":true,\"result\":{\"target\":\"jvm\",\"action\":\"delay\",\"flags\":{\"classname\":\"com.imooc.appoint.service.Impl.PracticeServiceImpl\",\"methodname\":\"httpTxn1\",\"offset\":\"100\",\"process\":\"java\",\"time\":\"3000\"}}}\n```\n\n## How to start on vm\n1. Download the release to the host and extract it\n2. cd into the space-capsule directory\n3. 
Use ./space-capsule case1-vm to create the first fault case; on VMs, append the -vm suffix to the case name\n\n```\n[root@nginx space-capsule-alpha]# ls\nexample  history  install.sh  space-capsule\n[root@nginx space-capsule-alpha]# ./space-capsule case1-vm\n\n```\n4.  Use ./space-capsule undo case1-vm to revert the first fault scenario\n\n```\n[root@nginx space-capsule-alpha]# ./space-capsule undo case1-vm\n{\"code\":200,\"success\":true,\"result\":{\"target\":\"network\",\"action\":\"delay\",\"flags\":{\"interface\":\"ens192\",\"local-port\":\"8080,8081\",\"offset\":\"100\",\"time\":\"2000\"}}}\n```\n## Sending traffic\n### JMeter script and usage on k8s\n1. The script is in the /space-capsule/jemter folder\n2. In the user-defined variables, set ngIp and ngport to the address and port through which nginx is exposed, or to the port exposed via NodePort\n3. Push the configuration\n- Disable the start thread group, enable the init thread group, and run. In the results tree, a non-empty data field in every request means the configuration was pushed successfully.\n```\n{\n    \"data\": {\n        \"mysql_resp_interval\": \"-1\",\n        \"get_http_url1\": \"http:\\/\\/gateway-svc:8080\\/bookDemo\\/gateway\\/practice\\/httpTxn1\",\n        \"get_http_url2\": \"http:\\/\\/gateway-svc:8080\\/bookDemo\\/gateway\\/practice\\/httpTxn1\",\n        \"get_http_url3\": \"http:\\/\\/gateway-svc:8080\\/bookDemo\\/gateway\\/practice\\/httpTxn1\",\n        \"get_http_url4\": \"http:\\/\\/gateway-svc:8080\\/bookDemo\\/gateway\\/practice\\/httpTxn2And4\",\n        \"http_error_resp_code\": \"200\",\n        \"http_resp_interval\": \"10\"\n    },\n    \"success\": true,\n    \"Connection\": \"keep-alive\"\n}\n```\n4. 
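Send traffic\n- Disable the init thread group, enable the start thread group, run, and check the results tree:\n\n```\n{\n    \"data\": {\n        \"result\": \"Success\",\n        \"data\": \"{\\\"data\\\":{\\\"result\\\":\\\"Success\\\",\\\"d`...`\"}\",\n        \"status\": 200\n    },\n    \"success\": true,\n    \"Connection\": \"keep-alive\"\n}\n```\n\n### JMeter script and usage on VMs\n- Not started\n## Prebuilt defect scenarios and causes\n\n| No     | Cause | Symptom | Type | k8s support | Cloud host support |\n|--------|-------|---------|------|-------------|--------------------|\n| case1  | High network latency between host NICs | Request timeout / slow response | Network fault | ✅ | ✅ |\n| case2  | High network latency between container NICs | Request timeout / slow response | Network fault | ✅ | NA |\n| case3  | Packet loss between host NICs | Request timeout / abnormal response | Network fault | ✅ | ✅ |\n| case4  | Packet loss between container NICs | Request timeout / abnormal response | Network fault | ✅ | NA |\n| case5  | Network isolation within the cluster | Connection failure | Network fault | ✅ | NA |\n| case6  | DNS resolution failure | Connection failure / no requests arrive | Network fault | ✅ | Not started |\n| case7  | Long-lived TCP connections dropped by the host firewall | Request timeout / abnormal response | Network fault | Not started | Not started |\n| case8  | Host node overloaded; the application cannot get sufficient resources | Request timeout / slow response | Resource fault | ✅ | ✅ |\n| case9  | Unreasonable k8s resource configuration; the limit cannot meet the program's needs | Request timeout / slow response | Resource fault | ✅ | NA |\n| case10 | Open file count in the container/host reaches its limit; no more connections can be established | Cannot establish connection | Resource fault | In progress | Not started |\n| case11 | Namespace resource quota prevents the service from creating instances | Cannot establish connection | Resource fault | ✅ | NA |\n| case12 | Inefficient Java code running for a long time | Request timeout / slow response | Program defect | ✅ | Not started |\n| case13 | Java program deadlock | Request timeout / slow response | Program defect | ✅ | ✅ |\n| case14 | Uncaught exception causes fatal termination of the Java program | Abnormal response / connection failure | Program defect | ✅ | ✅ |\n| case15 | Slow disk I/O | Slow request processing | Resource fault | In progress | Not started |\n| case16 | Thread pool used by the application is exhausted | Business request timeout | Resource fault | Not started | Not started |\n| case17 | Memory used by the Java program exceeds the Xmx limit | Request error | Resource fault | Not started | Not started |\n| case18 | SQL statements used by the application take too long to execute | Request timeout | Resource fault | Not started | Not started |\n\n### 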
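Demo application\n\nThe demo application is a demonstration program built on SpringBoot that simulates user-facing services running under normal conditions.\n\n#### Overall call topology of the demo application\n\n```mermaid\ngraph LR;\n    Nginx--\u003eGateway;\n    Gateway--\u003eBop;\n    Bop--\u003eCoreservice;\n    Bop--\u003eCronservice;\n    Coreservice--\u003eRiskservice;\n    Riskservice--\u003eDataservice;\n    Cronservice--\u003eDataservice;\n    Dataservice--\u003eRocketmq;\n```\n\n#### Demo application business call chain 1\n```mermaid\ngraph LR;\n    Nginx--\u003eGateway;\n    Gateway--\u003eBop;\n    Bop--\u003eCronservice;\n    Cronservice--\u003eDataservice;\n    Dataservice--\u003eRocketmq;\n```\n\n#### Demo application business call chain 2\n\n```mermaid\ngraph LR;\n    Nginx--\u003eGateway;\n    Gateway--\u003eBop;\n    Bop--\u003eCronservice;\n    Cronservice--\u003eBop;\n    Bop--\u003eCoreservice;\n    Coreservice--\u003eRiskservice;\n    Riskservice--\u003eDataservice;\n    Dataservice--\u003eRocketmq;\n```\n\n## Capabilities\n\n- Quickly build an experiment environment with a command-line tool\n- Catalog common problem scenarios and the causes behind them\n- Quickly reproduce user-facing failure scenarios through injection programs, with fast recovery\n- Reproduce the problem scenario triggered by a specified cause\n\n## How it works\n\n- Network faults: network fault injection tooling wrapped around chaosblade, implemented underneath with the tc command\n- Resource faults: control of application resources and node resources through chaosblade and the k8s api-server\n- Application code faults: chaosblade's JVM code injection tooling\n- Permission and policy faults: the k8s api-server and the demo application's internal logic\n\n## Fault injection\n\n- Supports latency and packet loss on outbound and inbound traffic, at the following granularities: node, workload, pod, containers\n- Supports resource contention scenarios covering cpu, mem, and disk, at the following granularities: node, workload, pod, containers\n- Supports k8s resource limit scenarios, including requests and limits for cpu, memory, and ephemeral storage\n- Supports Java application failure scenarios, including deadlock, abnormal resource usage, RuntimeError, and blocking on external resources\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkindlingproject%2Fspace-capsule","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkindlingproject%2Fspace-capsule","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkindlingproject%2Fspace-capsule/lists"}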