{"id":13565210,"url":"https://github.com/kubeop/k8s","last_synced_at":"2025-12-29T16:29:05.574Z","repository":{"id":37631073,"uuid":"159332014","full_name":"kubeop/k8s","owner":"kubeop","description":"Deploy a  Production Ready Kubernetes High Availability Cluster with Binary","archived":false,"fork":false,"pushed_at":"2024-04-12T07:57:21.000Z","size":682,"stargazers_count":195,"open_issues_count":0,"forks_count":115,"subscribers_count":13,"default_branch":"main","last_synced_at":"2024-05-21T19:17:16.915Z","etag":null,"topics":["ansible","container","containerd","docker","etcd","inventory","k8s","kubernetes","pod"],"latest_commit_sha":null,"homepage":"https://www.kubeop.com","language":"Jinja","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kubeop.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-11-27T12:27:55.000Z","updated_at":"2024-06-05T09:50:05.630Z","dependencies_parsed_at":"2023-02-08T09:17:15.967Z","dependency_job_id":"a64d8362-c0de-4504-93e7-f5df69c05cf2","html_url":"https://github.com/kubeop/k8s","commit_stats":null,"previous_names":["sonicma09/k8s","kubeop/k8s"],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kubeop%2Fk8s","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kubeop%2Fk8s/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kubeop%2Fk8s/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kubeop%2Fk8s/manifests","owner_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/owners/kubeop","download_url":"https://codeload.github.com/kubeop/k8s/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247694877,"owners_count":20980733,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ansible","container","containerd","docker","etcd","inventory","k8s","kubernetes","pod"],"created_at":"2024-08-01T13:01:42.601Z","updated_at":"2025-12-29T16:29:05.568Z","avatar_url":"https://github.com/kubeop.png","language":"Jinja","funding_links":[],"categories":["Jinja"],"sub_categories":[],"readme":"![](https://img.shields.io/github/forks/kubeop/k8s?style=social)\n![](https://img.shields.io/github/watchers/kubeop/k8s?style=social )\n![](https://img.shields.io/github/stars/kubeop/k8s?color=green\u0026style=social)\n\n## Supported Distributions\n\n- AlmaLinux 8, 9, 10\n- RockyLinux 8, 9, 10\n- Ubuntu Server 20.04, 22.04, 24.04\n- Debian 12, 13\n- TencentOS Server 3\n- Kylin Server v11\n\n\n\n## Supported Platforms\n\n- amd64\n- arm64\n\n\n\n## Supported Kubernetes Versions\n\n- 1.26.x ~ 1.35.x\n\n\n\n## Supported Components\n\n- Core\n  - [kubernetes](https://github.com/kubernetes/kubernetes)\n  - [etcd](https://github.com/etcd-io/etcd)\n  - [containerd](https://github.com/containerd/containerd)\n- Network Plugin\n  - [cni-plugins](https://github.com/containernetworking/plugins)\n  - [calico](https://github.com/projectcalico/calico)\n  - [cilium](https://github.com/cilium/cilium)\n  - [flanneld](https://github.com/flannel-io/flannel)\n  - [kube-router](https://github.com/cloudnativelabs/kube-router)\n- Application\n  - 
[coredns](https://github.com/coredns/coredns)\n  - [node-local-dns](https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns/nodelocaldns)\n  - [metrics-server](https://github.com/kubernetes-sigs/metrics-server)\n  - [nvidia_device_plugin](https://github.com/NVIDIA/k8s-device-plugin)\n  - [helm](https://github.com/helm/helm)\n\n\n\n## Configuration\n\n\u003e This project is backward compatible: pull the latest code and adjust the relevant parameters in `group_vars/all.yml` and `inventory` as needed.\n\u003e\n\u003e To update a local checkout, back up `group_vars/all.yml` and `inventory`, update the code to the latest version, and then copy the settings you need from the backups into the new `group_vars/all.yml` and `inventory` files.\n\u003e\n\u003e For Kubernetes \u003e=1.32.0, kernel 4.19 or later is recommended; 5.x and 6.x kernels are also supported. For cgroups v2 support, the minimum kernel version is 4.15 and the recommended version is 5.8+.\n\u003e\n\u003e Starting with Kubernetes 1.35.0, `cgroup v1` support is removed and the project moves fully to `cgroup v2`. The `ipvs` mode of **kube-proxy** is marked as deprecated; migrating to nftables mode is recommended. 1.35.x is the last release that supports the `containerd v1.x` series, so you must switch to `containerd 2.0` or later before upgrading to the next Kubernetes version.\n\n\n\n### Configure the Ansible Control Node\n\nInstall suitable Python and Ansible versions according to the table below.\n\n| Component               | Version  |\n| ----------------------- | -------- |\n| **Control Node Python** | \u003e=3.9    |\n| **Target Python**       | \u003e=3.6    |\n| **Ansible**             | \u003e=2.14.0 |\n\n\n\nInstall Ansible\n\n```shell\npip3 install \"ansible\u003e=7.0.0,\u003c10.0.0\" -i https://mirrors.ustc.edu.cn/pypi/web/simple\npip3 install \"netaddr\u003e=0.10.1\" -i https://mirrors.ustc.edu.cn/pypi/web/simple\n```\n\n- For the Ansible support matrix across Python versions, see: https://docs.ansible.com/ansible-core/devel/reference_appendices/release_and_maintenance.html#ansible-core-support-matrix\n\n\n\n### Enable Dual-Stack Networking\n\nIn `group_vars/all.yml`, setting both `ipv4_stack` and `ipv6_stack` to `true` enables IPv4/IPv6 dual-stack networking; setting only `ipv4_stack` to `true` enables IPv4 single-stack networking; setting only `ipv6_stack` to `true` enables IPv6 single-stack networking. Before using IPv6, make sure the network the cluster nodes are on has IPv6 support enabled.\n\n⚠️: When using Calico IPIP on a cloud platform, do not enable dual-stack networking, because Calico IPv6 networking does not support IPIP; see [this issue](https://github.com/projectcalico/calico/issues/5206).\n\n\n\n### Edit the inventory\n\nEdit the hosts and groups following the format of the inventory template.\n\n- When haproxy and kube-apiserver are deployed on the same server, make sure their ports do not conflict.\n\n\n\n### Configure group_vars\n\nEdit 
group_vars/all.yml to match your environment.\n\nNotes:\n\n- The minimum supported **Kubernetes** version is v1.26.\n\n- Where possible, install etcd on dedicated servers; co-locating it with the master nodes is not recommended. Use SSDs for the data disks where possible.\n- Use reserved private IP ranges for the Pod and Service networks. The Pod range must not overlap with the Service range or the host network, and should not conflict with the subnet of the docker0 interface. Choose from the following ranges and their subnets:\n  - Pod network\n    - Class A: 10.0.0.0/8\n    - Class B: 172.16.0.0/12\n    - Class C: 192.168.0.0/16\n  - Service network\n    - Class A: 10.0.0.0/16-24\n    - Class B: 172.16-31.0.0/16-24\n    - Class C: 192.168.0.0/16-24\n\n\n\n\n### Mount the Data Disk\n\nIf you have already formatted the disk and mounted the directory yourself, skip this step.\n\n```shell\nansible-playbook fdisk.yml -i inventory -e \"disk=sdb dir=/data\"\n```\n\n- Optional variables: `-e \"disk=sdb dir=/data num=1\"`\n\nFor NVMe disks, use the following form:\n\n```shell\nansible-playbook fdisk.yml -i inventory -e \"disk=nvme0n1 dir=/data num=p1\"\n```\n\n⚠️:\n\n- This playbook formats the disk specified by {{disk}} and mounts it at the {{dir}} directory.\n- It also binds the data directories `/var/lib/etcd`, `/var/lib/containerd`, `/var/lib/kubelet`, and `/var/log/pods` to `{{dir}}/containers/etcd`, `{{dir}}/containers/containerd`, `{{dir}}/containers/kubelet`, and `{{dir}}/containers/pods` on the data disk, so that several data directories share one data disk without changing the Kubernetes data directory paths.\n\n\n\nTo mount different data disks on different directories, mount each one separately with a command like the following:\n\n```shell\nansible-playbook fdisk.yml -i inventory -l etcd -e \"disk=sdb dir=/var/lib/etcd\" --skip-tags=bind_dir\n```\n\nIf the data disk is already formatted and mounted, use the following command to bind the data directories to it:\n\n```shell\nansible-playbook fdisk.yml -i inventory -l master,worker -e \"disk=sdb dir=/data\" -t bind_dir\n```\n\n\n\n### Install the GPU Driver\n\nIf any cluster nodes are GPU nodes, first install the driver by following [nvidia](nvidia.md).\n\n\n\n### Download the Offline Packages\n\n```shell\n# To download from your own file server, change the default download URLs in group_vars/all.yml\nansible-playbook download.yml\n```\n\n- Make sure the Ansible control node can reach the **Internet**; otherwise the offline packages cannot be downloaded.\n- Alternatively, download the packages on another node with **Internet** access and copy them into the {{ download.dest }} directory, preserving the directory structure.\n\n\n\n### Sync Images\n\n```shell\n# It is recommended to sync the images defined in group_vars/all.yml to your own private registry and change the addresses to point to it\n# Currently the images defined in group_vars/all.yml are synced automatically to an Alibaba Cloud registry, which may be unstable or become unavailable.\n```\n\n\n\n## Deploy the Cluster\n\n```shell\n# Before running, make sure the disks have been partitioned\n# Before running, make sure ansible-playbook download.yml has been run to download the packages\nansible-playbook cluster.yml -i 
inventory\n```\n\nIn a public or private cloud environment, use the cloud provider's load balancer (configure it in advance); haproxy and keepalived do not need to be installed.\n\n```shell\nansible-playbook cluster.yml -i inventory --skip-tags=haproxy,keepalived\n```\n\n- By default the nodes are initialized, and each cluster node is named using the last two segments of its hostname and its IP.\n\nTo allow workloads to be scheduled on the master nodes as well, use the following option:\n\n```shell\nansible-playbook cluster.yml -i inventory --skip-tags=create_master_taint\n```\n\n\n\n## Scale Out Nodes\n\n### Add master nodes\n\nTo scale out, add the new servers to the master group in the inventory file (when running the playbooks, always use the `-l` option to specify the IP).\n\nFormat and mount the data disk\n\n```shell\nansible-playbook fdisk.yml -i inventory -l ${SCALE_MASTER_IP} -e \"disk=sdb dir=/data\"\n```\n\nGenerate the node certificates\n\n```shell\nansible-playbook cluster.yml -i inventory -t cert\n```\n\nInitialize the node\n\n```shell\nansible-playbook cluster.yml -i inventory -l ${SCALE_MASTER_IP} -t verify,init\n```\n\nAdd the node to the cluster\n\n```shell\nansible-playbook cluster.yml -i inventory -l ${SCALE_MASTER_IP} -t master,containerd,worker --skip-tags=bootstrap,create_worker_label\n```\n\n\n\n### Add worker nodes\n\nTo scale out, add the new servers to the worker group in the inventory file (when running the playbooks, always use the `-l` option to specify the IP).\n\nFormat and mount the data disk\n\n```shell\nansible-playbook fdisk.yml -i inventory -l ${SCALE_WORKER_IP} -e \"disk=sdb dir=/data\"\n```\n\nGenerate the node certificates\n\n```shell\nansible-playbook cluster.yml -i inventory -t cert\n```\n\nInitialize the node\n\n```shell\nansible-playbook cluster.yml -i inventory -l ${SCALE_WORKER_IP} -t verify,init\n```\n\nAdd the node to the cluster\n\n```shell\nansible-playbook cluster.yml -i inventory -l ${SCALE_WORKER_IP} -t containerd,worker --skip-tags=bootstrap,create_master_label\n```\n\n\n\n## Replace Cluster Certificates\n\nFirst back up and delete the certificate directory {{cert.dir}}, recreate {{cert.dir}}, and copy the token, sa.pub, and sa.key files into the new {{cert.dir}} (these three files must be kept unchanged). Then run the following steps to regenerate and distribute the certificates.\n\n```shell\nansible-playbook cluster.yml -i inventory -t cert,dis_certs\n```\n\nThen restart each node in turn.\n\nRestart etcd\n\n```shell\nansible -i inventory etcd -m systemd -a \"name=etcd state=restarted\"\n```\n\nVerify etcd\n\n```shell\netcdctl endpoint health \\\n        --cacert=/etc/etcd/pki/etcd-ca.pem \\\n        --cert=/etc/etcd/pki/etcd-healthcheck-client.pem \\\n        --key=/etc/etcd/pki/etcd-healthcheck-client.key \\\n        
--endpoints=https://10.43.75.201:2379,https://10.43.75.202:2379,https://10.43.75.203:2379\n```\n\nDelete the old kubelet certificates node by node\n\n```shell\nansible -i inventory master,worker -m shell -a \"rm -rf /etc/kubernetes/pki/kubelet*\"\n```\n\n- Set the `-l` argument to the IP of the specific node.\n\nRestart the nodes one by one\n\n```shell\nansible-playbook cluster.yml -i inventory -l ${IP} -t restart_apiserver,restart_controller,restart_scheduler,restart_kubelet,restart_proxy,healthcheck\n```\n\n- If services such as calico or metrics-server also use the cluster certificates, remember to update those certificates as well.\n- Set the `-l` argument to the IP of the specific node.\n\nRestart the network plugin\n\n```shell\nkubectl get pod -n kube-system | grep -v NAME | grep cilium | awk '{print $1}' | xargs kubectl -n kube-system delete pod\n```\n\n- Updating the certificates may cause the network plugin to misbehave, so restarting it is recommended.\n- The example restarts the cilium plugin; adapt the command for your network plugin.\n\n\n\n## Upgrade the Cluster Version\n\nFirst edit group_vars/all.yml and set kubernetes.version to the new version.\n\nDownload the packages for the new version\n\n```shell\nansible-playbook download.yml\n```\n\nUpgrade etcd (the upgrade restarts etcd automatically; upgrade only if you need to)\n\n```shell\nansible-playbook cluster.yml -i inventory -l ${IP} -t install_etcd,dis_etcd_config\n```\n\n- Set the `-l` argument to the IP of the specific node.\n\nInstall the Kubernetes components\n\n```shell\nansible-playbook cluster.yml -i inventory -l ${IP} -t install_kubectl,install_master,install_worker\n```\n\n- Set the `-l` argument to the IP of the specific node.\n\nUpdate the configuration files\n\n```shell\nansible-playbook cluster.yml -i inventory -l ${IP} -t dis_master_config,dis_worker_config\n```\n\n- Set the `-l` argument to the IP of the specific node.\n- Not recommended for clusters deployed with a release earlier than v7.x, because node naming changed in v7.x.\n\nDrain the node\n\n```shell\nkubectl drain --ignore-daemonsets --force \u003cnode-name\u003e\n```\n\nUpgrade the containerd components (the upgrade restarts containerd automatically; upgrade only if you need to)\n\n```shell\nansible-playbook cluster.yml -i inventory -l ${IP} -t install_runc,install_cni,install_containerd,install_critools,containerd_config\n```\n\n- Set the `-l` argument to the IP of the specific node.\n\nThen restart each Kubernetes component in turn.\n\n```shell\nansible-playbook cluster.yml -i inventory -l ${IP} -t restart_apiserver,restart_controller,restart_scheduler,restart_kubelet,restart_proxy,healthcheck\n```\n\n- Set the `-l` argument to the IP of the specific node.\n\nResume scheduling\n\n```shell\nkubectl uncordon \u003cnode-name\u003e\n```\n\n\n\n## Reset Cluster Nodes\n\n```shell\nansible-playbook reset.yml -i inventory -l ${IP} -e 
\"flush_iptables=true\"\n```\n\n\n\n## Support the Project\n- If you find this project useful, give it a Star to show your support\n- Use it in company or personal projects and help recommend it to others\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkubeop%2Fk8s","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkubeop%2Fk8s","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkubeop%2Fk8s/lists"}