{"id":21433268,"url":"https://github.com/guokr/caver","last_synced_at":"2025-07-14T13:30:51.709Z","repository":{"id":62561078,"uuid":"140542641","full_name":"guokr/Caver","owner":"guokr","description":"Caver: a toolkit for multilabel text classification.","archived":false,"fork":false,"pushed_at":"2019-06-11T02:46:57.000Z","size":11826,"stargazers_count":39,"open_issues_count":0,"forks_count":3,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-04-08T09:11:12.189Z","etag":null,"topics":["attention-model","cnn","deep-learning","multi-label-classification","nlp","pytorch","text-classification"],"latest_commit_sha":null,"homepage":"https://guokr.github.io/Caver/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/guokr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-07-11T08:07:39.000Z","updated_at":"2023-03-18T08:22:58.000Z","dependencies_parsed_at":"2022-11-03T15:00:28.460Z","dependency_job_id":null,"html_url":"https://github.com/guokr/Caver","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/guokr/Caver","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guokr%2FCaver","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guokr%2FCaver/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guokr%2FCaver/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guokr%2FCaver/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/guokr","download_url":"https://codeload.github.com/guokr/Caver/tar.gz
/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/guokr%2FCaver/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265297455,"owners_count":23742586,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["attention-model","cnn","deep-learning","multi-label-classification","nlp","pytorch","text-classification"],"created_at":"2024-11-22T23:27:02.887Z","updated_at":"2025-07-14T13:30:51.674Z","avatar_url":"https://github.com/guokr.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003eCaver\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003eRaise a torch in the cave to see the words on the wall: tag your short text in 3 lines. 
Caver uses Facebook's \u003ca href=\"https://pytorch.org/\"\u003ePyTorch\u003c/a\u003e project to make the implementation easier.\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://pypi.org/project/caver/\"\u003e\n      \u003cimg src=\"https://img.shields.io/pypi/v/caver.svg?colorB=brightgreen\"\n           alt=\"Pypi package\"\u003e\n    \u003c/a\u003e\n  \u003ca href=\"https://github.com/guokr/caver/releases\"\u003e\n      \u003cimg src=\"https://img.shields.io/github/release/guokr/caver.svg\"\n           alt=\"GitHub release\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/guokr/caver/issues\"\u003e\n        \u003cimg src=\"https://img.shields.io/github/issues/guokr/caver.svg\"\n             alt=\"GitHub issues\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://travis-ci.org/guokr/Caver/\"\u003e\n    \u003cimg src=\"https://travis-ci.org/guokr/Caver.svg\"\n         alt=\"Travis CI\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#quick-demo\"\u003eDemo\u003c/a\u003e •\n  \u003ca href=\"#requirements\"\u003eRequirements\u003c/a\u003e •\n  \u003ca href=\"#install\"\u003eInstall\u003c/a\u003e •\n  \u003ca href=\"#did-you-guys-have-some-pre-trained-models\"\u003ePre-trained models\u003c/a\u003e •\n  \u003ca href=\"#how-to-train-on-your-own-dataset\"\u003eTrain\u003c/a\u003e •\n  \u003ca href=\"#more-examples\"\u003eExamples\u003c/a\u003e •\n  \u003ca href=\"https://guokr.github.io/Caver/\"\u003eDocument\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\".github/demo.gif?raw=true\" width=\"550\"\u003e\n \u003c/p\u003e\n\n\u003ch2 align=\"center\"\u003eQuick Demo\u003c/h2\u003e\n\n```python\nfrom caver import CaverModel\nmodel = CaverModel(\"./checkpoint_path\")\n\nsentence = [\"看 美 剧 学 英 语 靠 谱 吗\",\n            \"科 比 携 手 姚 明 出 任 2019 篮 球 世 界 杯 全 球 大 使\",\n            \"如 何 在 《 权 力 的 游 戏 》 中 苟 到 最 后\",\n            \"英 雄 联 盟 LPL 夏 季 赛 RNG 能 否 击 败 TOP 战 
队\"]\n\nmodel.predict([sentence[0]], top_k=3)\n# returns ['美剧', '英语', '英语学习']\n\nmodel.predict([sentence[1]], top_k=5)\n# returns ['篮球', 'NBA', '体育', 'NBA 球员', '运动']\n\nmodel.predict([sentence[2]], top_k=7)\n# returns ['权力的游戏（美剧）', '美剧', '影视评论', '电视剧', '电影', '文学', '小说']\n\nmodel.predict([sentence[3]], top_k=6)\n# returns ['英雄联盟（LoL）', '电子竞技', '英雄联盟职业联赛（LPL）', '游戏', '网络游戏', '多人联机在线竞技游戏 (MOBA)']\n```\n\n\u003ch2 align=\"center\"\u003eRequirements\u003c/h2\u003e\n\n* PyTorch\n* tqdm\n* torchtext\n* numpy\n* Python 3\n\n\u003ch2 align=\"center\"\u003eInstall\u003c/h2\u003e\n\n```bash\n$ pip install caver --user\n```\n\n\u003ch2 align=\"center\"\u003eDid you guys have some pre-trained models\u003c/h2\u003e\n\nYes, we have released two pre-trained models trained on the Zhihu NLPCC 2018 [open dataset](http://tcci.ccf.org.cn/conference/2018/taskdata.php).\n\nTo use a pre-trained model for text tagging, download it (along with the other files needed for inference) from the Caver releases page. 
Alternatively, you can run the following commands to download and extract the files into your current directory:\n\n```bash\n$ wget -O - https://github.com/guokr/Caver/releases/download/0.1/checkpoints_char_cnn.tar.gz | tar zxvf -\n$ wget -O - https://github.com/guokr/Caver/releases/download/0.1/checkpoints_char_lstm.tar.gz | tar zxvf -\n```\n\n\u003ch2 align=\"center\"\u003eHow to train on your own dataset\u003c/h2\u003e\n\n```bash\n$ python3 train.py --input_data_dir {path to your original dataset} \\\n                   --output_data_dir {path to store the preprocessed dataset} \\\n                   --train_filename train.tsv \\\n                   --valid_filename valid.tsv \\\n                   --checkpoint_dir {path to save the checkpoints} \\\n                   --model {fastText/CNN/LSTM} \\\n                   --batch_size {16, you can change this to suit your own setup} \\\n                   --epoch {10}\n\n```\n\n\u003ch2 align=\"center\"\u003eMore Examples\u003c/h2\u003e\n\nThis section is still being updated; for now, see the [examples](https://github.com/guokr/Caver/tree/master/examples) directory.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fguokr%2Fcaver","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fguokr%2Fcaver","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fguokr%2Fcaver/lists"}