{"id":38670216,"url":"https://github.com/deepghs/cyberharem","last_synced_at":"2026-01-17T09:54:55.791Z","repository":{"id":184210027,"uuid":"671436607","full_name":"deepghs/cyberharem","owner":"deepghs","description":"100% Automated Anime Character Lora Training Pipeline","archived":false,"fork":false,"pushed_at":"2025-07-22T04:41:47.000Z","size":455,"stargazers_count":67,"open_issues_count":1,"forks_count":7,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-22T06:48:44.630Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deepghs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-07-27T10:06:59.000Z","updated_at":"2025-07-22T04:41:50.000Z","dependencies_parsed_at":"2023-09-29T11:52:38.838Z","dependency_job_id":"9425b86c-30c6-44f3-9a2d-b56d3542a7b8","html_url":"https://github.com/deepghs/cyberharem","commit_stats":null,"previous_names":["deepghs/cyberharem"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/deepghs/cyberharem","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepghs%2Fcyberharem","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepghs%2Fcyberharem/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepghs%2Fcyberharem/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepghs%2Fcyberharem/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/owners/deepghs","download_url":"https://codeload.github.com/deepghs/cyberharem/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deepghs%2Fcyberharem/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28505565,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T06:57:29.758Z","status":"ssl_error","status_checked_at":"2026-01-17T06:56:03.931Z","response_time":85,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-17T09:54:55.024Z","updated_at":"2026-01-17T09:54:55.785Z","avatar_url":"https://github.com/deepghs.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CyberHarem\n\n[![Discord](https://img.shields.io/discord/1157587327879745558?style=social\u0026logo=discord\u0026link=https%3A%2F%2Fdiscord.gg%2FTwdHJ42N72)](https://discord.gg/TwdHJ42N72)\n![GitHub Org's stars](https://img.shields.io/github/stars/deepghs)\n[![GitHub stars](https://img.shields.io/github/stars/deepghs/cyberharem)](https://github.com/deepghs/cyberharem/stargazers)\n[![GitHub forks](https://img.shields.io/github/forks/deepghs/cyberharem)](https://github.com/deepghs/cyberharem/network)\n![GitHub commit activity](https://img.shields.io/github/commit-activity/m/deepghs/cyberharem)\n[![GitHub 
issues](https://img.shields.io/github/issues/deepghs/cyberharem)](https://github.com/deepghs/cyberharem/issues)\n[![GitHub pulls](https://img.shields.io/github/issues-pr/deepghs/cyberharem)](https://github.com/deepghs/cyberharem/pulls)\n[![Contributors](https://img.shields.io/github/contributors/deepghs/cyberharem)](https://github.com/deepghs/cyberharem/graphs/contributors)\n[![GitHub license](https://img.shields.io/github/license/deepghs/cyberharem)](https://github.com/deepghs/cyberharem/blob/master/LICENSE)\n\nCyberHarem Automated Waifu Training Pipeline\n\n(NOTE: This project is still a work in progress. It has only been tested on an A100 80G in an Ubuntu environment.)\n\n**(NOTE: HCP-Diffusion has been removed from CyberHarem. CyberHarem is now based on the kohya script and a41 webui.)**\n\n## Install\n\nClone and install this project\n\n```shell\ngit clone https://github.com/deepghs/cyberharem.git\ncd cyberharem\npip install -r requirements.txt\n```\n\nThis project works on HuggingFace. You should **set the namespace on HuggingFace before you start using it**\n\n```shell\n# set your huggingface username or organization name\nexport CH_NAMESPACE=my_hf_username\n\n# set your huggingface token\nexport HF_TOKEN=your_huggingface_token\n```\n\nAfter setting `CH_NAMESPACE`, your datasets and models will be saved to `my_hf_username/xxxxx`.\n\n## Dataset Making\n\n### Create Dataset With Waifuc\n\nHere is the [waifuc project](https://github.com/deepghs/waifuc), an efficient training data collector for anime\nwaifus.\nWe recommend that you learn how to use it before reading this\npart: https://deepghs.github.io/waifuc/main/index.html\n\nAfter that, run the following code\n\n```python\nfrom waifuc.source import DanbooruSource\n\nfrom cyberharem.dataset import crawl_dataset_to_huggingface\n\ns = DanbooruSource(['surtr_(arknights)'])\n\ncrawl_dataset_to_huggingface(\n    # your cyberharem datasource\n    source=s,\n\n    # name of dataset, trigger word of model\n    
name='surtr_arknights',\n\n    # display name (for others to see, e.g. on civitai)\n    display_name='surtr/スルト/史尔特尔 (Arknights)',\n\n    # how many images you need\n    limit=500,\n)\n\n```\n\nThe dataset with 500 original images will be pushed to dataset repository `my_hf_username/surtr_arknights`.\nThis step may take several hours.\n\nIt is worth noting that you do not have to attach actions to the source. They will be added inside the\n`crawl_dataset_to_huggingface` function, and the datasource will be auto-cleaned and processed.\n\nIn some cases, if you do not want it to process your dataset (e.g. your datasource is trusted or already processed), just run it\nas follows\n\n```python\nfrom waifuc.source import LocalSource\n\nfrom cyberharem.dataset import crawl_dataset_to_huggingface\n\ns = LocalSource('/my/local/directory')\n\ncrawl_dataset_to_huggingface(\n    source=s,\n    name='surtr_arknights',\n    display_name='surtr/スルト/史尔特尔 (Arknights)',\n\n    # no limit on the quantity\n    limit=None,\n\n    # skip all preprocessing\n    skip_preprocess=True,\n)\n```\n\n### Batch Process Anime Videos\n\nFirst, you need to download the anime videos to your local environment (e.g. to the folder `/my/anime/videos`)\n\n```shell\nexport CH_BG_NAMESPACE=bg_namespace\npython -m cyberharem.dataset.video huggingface -i /my/anime/videos -n 'Name of The Anime'\n```\n\nThen the bangumi dataset will be pushed to `bg_namespace/nameoftheanime`.\n\nMore options can be found with the `-h` option\n\n```\nUsage: python -m cyberharem.dataset.video huggingface [OPTIONS]\n\n  Publish to huggingface\n\nOptions:\n  -r, --repository TEXT   Repository to publish to.\n  -R, --revision TEXT     Revision for pushing the model.  [default: main]\n  -i, --input TEXT        Input videos.  [required]\n  -n, --name TEXT         Bangumi name  [required]\n  -s, --min_size INTEGER  Min size of image.  
[default: 320]\n  -E, --no_extract        No extraction from videos.\n  -h, --help              Show this message and exit.\n```\n\n### Extract Training Dataset from Bangumi Dataset\n\nYou can extract images from the bangumi dataset (e.g. `BangumiBase/fatestaynightufotable`,\nabovementioned `bg_namespace/nameoftheanime`), like this\n\n```python\nfrom cyberharem.dataset import crawl_base_to_huggingface\n\ncrawl_base_to_huggingface(\n    # bangumi repository id\n    source_repository='BangumiBase/fatestaynightufotable',\n\n    ch_id=[18, 19],  # index numbers in bangumi repository\n    name='Illyasviel Von Einzbern',  # official name of this waifu\n    limit=1000,  # max number of images you need\n)\n```\n\nThen the bangumi-based dataset will be uploaded to `my_hf_username/illyasviel_von_einzbern_fatestaynightufotable`.\nLike this: https://huggingface.co/datasets/CyberHarem/illyasviel_von_einzbern_fatestaynightufotable .\n\n## Train LoRA\n\n~~The training method we employ is pivotal tuning, which stores the trigger words of LoRA in an embedding file. The\nactivation of LoRA is achieved by triggering the embedding file during use. 
We refer to this as P-LoRA.~~\n\nThat is all history; now we use the kohya script to train common LoRAs.\n\nYou can train a LoRA with the dataset on huggingface\n\n(PS: if you need to use a reg dataset for training, please set the `REG_HOME` directory. This directory is used for reg\ndataset and latent cache management.)\n\n```python\nfrom ditk import logging\n\nfrom cyberharem.train import train_lora, set_kohya_from_conda_dir, set_kohya_from_venv_dir\n\nlogging.try_init_root(logging.INFO)\n\n# if your kohya script is in conda\nset_kohya_from_conda_dir(\n    # name of the conda environment\n    conda_env_name='kohya',\n\n    # directory of kohya sd-script\n    kohya_directory='/my/path/sd-script',\n)\n\n# # if your kohya script is in venv\n# set_kohya_from_venv_dir(\n#     # there should be a venv folder in this directory\n#     kohya_directory='/my/path/sd-script',\n#    \n#     # name of venv, default is `venv`\n#     venv_name='venv',\n# )\n\nif __name__ == '__main__':\n    workdir = train_lora(\n        ds_repo_id='CyberHarem/surtr_arknights',\n\n        # use your own template file\n        # this one is the default config template, you can just use it\n        template_file='ch_lora_sd15.toml',\n\n        # use reg dataset\n        use_reg=True,\n\n        # hyperparameters for training\n        bs=8,  # training batch size\n        unet_lr=0.0006,  # learning rate of unet\n        te_lr=0.0006,  # learning rate of text encoder\n        train_te=False,  # do not train text encoder\n        dim=4,  # dim of lora\n        alpha=2,  # alpha of lora\n        resolution=720,  # resolution: 720x720\n        res_ratio=2.2,  # min_res: 720 // 2.2, max_res: 720 * 2.2\n    )\n\n```\n\nPlease note that this script takes up to about 18G of GPU memory. We can run it on an A100 80G, but you may not be able to\nrun it on a 2060. 
If an OOM error occurs, just lower `bs` and `max_reg_bs`.\n\nAlso, you can specify the directory and environment information of the kohya script through environment variables, as follows:\n\n* Set kohya in conda environment\n\n```shell\nexport CH_KOHYA_DIR=/my/path/sd-script\nexport CH_KOHYA_CONDA_ENV=kohya\nunset CH_KOHYA_VENV\n```\n\n* Set kohya in venv\n\n```shell\nexport CH_KOHYA_DIR=/my/path/sd-script\nexport CH_KOHYA_VENV=venv\nunset CH_KOHYA_CONDA_ENV\n```\n\nBy using these variables, you do NOT have to specify them in your python code.\n\n## Evaluate LoRA and Publish It To HuggingFace\n\nWe can automatically use a1111's webui to generate images, assess which LoRA step is the best, and publish the results to the\nhuggingface hub.\n\nNOTE:\n\n1. **[Dynamic Prompts Plugin](https://github.com/adieyal/sd-dynamic-prompts) is REQUIRED for image batch\n   generation!!!** Please install it before batch inference.\n2. Please start your webui in API mode, using the `--api` and `--nowebui` arguments.\n\n```python\nfrom cyberharem.infer import set_webui_server, set_webui_local_dir\nfrom cyberharem.publish import deploy_to_huggingface\n\n# your a41 webui server\nset_webui_server('127.0.0.1', 10188)\n\n# your directory of a41 webui\n# this should have `models/Lora` inside\nset_webui_local_dir('/my/a41_webui/stable-diffusion-webui')\n\ndeploy_to_huggingface(\n    workdir='runs/surtr_arknights',  # work directory of training\n    eval_cfgs=dict(\n        # basic infer arguments\n        base_model='meinamix_v11',  # use the base model in a41 webui\n        batch_size=64,  # we can use bs64 on A100 80G, lower this value if you can't\n        sampler_name='DPM++ 2M Karras',\n        cfg_scale=7,\n        steps=30,\n        firstphase_width=512,\n        firstphase_height=768,\n        clip_skip=2,\n\n        # hires fix\n        enable_hr=True,\n        hr_resize_x=832,\n        hr_resize_y=1216,\n        denoising_strength=0.6,\n        hr_second_pass_steps=20,\n        hr_upscaler='R-ESRGAN 4x+ Anime6B',\n\n     
   # adetailer, useful for fixing the eyes\n        # will be ignored when adetailer is not installed\n        enable_adetailer=True,\n\n        # weight of lora\n        lora_alpha=0.8,\n    )\n)\n\n```\n\nImages will be generated to evaluate each step. After that, the best step will be recommended, and all the information\n(images, model files, data archives and LoRAs) will be pushed to model repository `my_hf_username/surtr_arknights`.\n\nAlso, if you do not want to set webui settings in python code, just use the following environment variables\n\n```shell\nexport CH_WEBUI_SERVER=http://127.0.0.1:10188\nexport CH_WEBUI_DIR=/my/a41_webui/stable-diffusion-webui\n```\n\n## Upload to CivitAI\n\nBefore uploading, you need to create a civitai session\nwith [civitai_client](https://github.com/narugo1992/civitai_client).\n\n```python\nfrom cyberharem.publish import civitai_upload_from_hf\n\ncivitai_upload_from_hf(\n    repository='my_hf_username/surtr_arknights',\n    civitai_session='your_civitai_session.json',\n\n    # use the best step; you can also specify the step you like best\n    step=None,\n\n    # upload nsfw images (please pay attention to the TOS of civitai)\n    allow_nsfw=True,\n\n    publish_at=None,  # publish now\n    # publish_at='2030-01-01 08:00:00+00:00', # schedule to publish at '2030-01-01 08:00:00+00:00'\n\n    # if you have already uploaded an older version, put the model id here\n    # existing_model_id=None,\n)\n```\n\n## F.A.Q.\n\n### Will Private Repository Or Local Directory Be Supported?\n\nNo, and never will be. We developed and open-sourced this project with the intention of making the training of waifus\nsimpler and more convenient, while also ensuring more stable quality control. Resources such as datasets and models\nshould belong to all anime waifu enthusiasts. They are created by a wide range of anime artists and are collected and\ncompiled fully automatically by tools like cyberharem and cyberharem. 
**Our hope is for these resources to be widely\ncirculated, rather than monopolized in any form**. If you do not agree with this philosophy, we do not recommend that\nyou continue using this project.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeepghs%2Fcyberharem","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeepghs%2Fcyberharem","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeepghs%2Fcyberharem/lists"}