{"id":13788226,"url":"https://github.com/Tony-Wang/YaYaNLP","last_synced_at":"2025-05-12T02:33:04.789Z","repository":{"id":97171964,"uuid":"47542344","full_name":"Tony-Wang/YaYaNLP","owner":"Tony-Wang","description":" Pure python NLP toolkit","archived":false,"fork":false,"pushed_at":"2016-01-20T08:55:44.000Z","size":68852,"stargazers_count":55,"open_issues_count":1,"forks_count":16,"subscribers_count":8,"default_branch":"master","last_synced_at":"2024-11-18T02:37:08.234Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Tony-Wang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-12-07T09:28:28.000Z","updated_at":"2024-06-20T00:35:19.000Z","dependencies_parsed_at":"2023-03-22T04:50:29.456Z","dependency_job_id":null,"html_url":"https://github.com/Tony-Wang/YaYaNLP","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tony-Wang%2FYaYaNLP","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tony-Wang%2FYaYaNLP/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tony-Wang%2FYaYaNLP/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Tony-Wang%2FYaYaNLP/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Tony-Wang","download_url":"https://codeload.github.com/Tony-Wang/YaYaNLP/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253662781,"owners_count":
21944129,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T21:00:39.658Z","updated_at":"2025-05-12T02:33:04.490Z","avatar_url":"https://github.com/Tony-Wang.png","language":"Python","funding_links":[],"categories":["Chinese NLP Toolkits 中文NLP工具"],"sub_categories":["Toolkits 综合NLP工具包"],"readme":"# YaYaNLP: Chinese Language Processing\nYaYaNLP is a Chinese natural language processing package written in pure Python. Its name comes from “牙牙学语”, a Chinese idiom for a baby babbling while learning to speak.\nYaYaNLP provides the following features:\n- Chinese word segmentation\n- Part-of-speech tagging\n- Named entity recognition\n * Person name recognition\n * Place name recognition\n * Organization name recognition\n- Simplified/Traditional Chinese conversion\n\n## Project\n\nProject homepage: [https://github.com/Tony-Wang/YaYaNLP](https://github.com/Tony-Wang/YaYaNLP)\n\nMy homepage: [www.huangyong.me](http://www.huangyong.me)\n\n## Installation\n\n### Download the source package, unpack it, and run\n\n``` bash\npython setup.py install\n```\n\n### Download the dictionary and model files\n\nYaYaNLP uses dictionary data compatible with HanLP; compiled dictionary data is saved with the .ya extension.\nThe data can be downloaded directly from the HanLP project: [data-for-1.2.4.zip](http://pan.baidu.com/s/1gd1vo8j)\n\n### Configure the data file path\n\nSet your own data file path in **yaya/config.py**:\n``` python\nDATA_ROOT = \"/your/data/path\"\n```\n\n## Features\n\n### Person name recognition\n\n``` python\n# Recognize person names\ntext = u\"签约仪式前，秦光荣、李纪恒、仇和等一同会见了参加签约的企业家。\"\nterms = segment.seg(text)\nprint_terms(terms)\n```\n\n```\n签约/vi\n仪式/n\n前/f\n，/w\n秦光荣/nr\n、/w\n李纪恒/nr\n、/w\n仇和/nr\n等/udeng\n一同/d\n会见/v\n了/ule\n参加/v\n签约/vi\n的/ude1\n企业家/nnt\n。/w\n```\n\n\n### Ambiguous word recognition\n\n``` python\n# Recognize ambiguous words\ntext = u\"龚学平等领导说,邓颖超生前杜绝超生\"\nterms = segment.seg(text)\nprint_terms(terms)\n```\n\n```\n龚学平/nr\n等/udeng\n领导/n\n说/v\n,/w\n邓颖超/nr\n生前/t\n杜绝/v\n超生/vi\n```\n\n### Place name recognition\n\n``` python\n# Recognize place names\ntext = u\"蓝翔给宁夏固原市彭阳县红河镇黑牛沟村捐赠了挖掘机\"\nterms = segment.seg(text)\n
print_terms(terms)\n```\n\n```\n蓝翔/nt\n给/p\n宁夏/ns\n固原市/ns\n彭阳县/ns\n红河镇/ns\n黑牛沟村/ns\n捐赠/v\n了/ule\n挖掘机/n\n```\n\n### Organization name recognition\n\n``` python\n# Recognize organization names\ntext = u\"济南杨铭宇餐饮管理有限公司是由杨先生创办的餐饮企业\"\nterms = segment.seg(text)\nprint_terms(terms)\n```\n\n```\n济南杨铭宇餐饮管理有限公司/nt\n是/vshi\n由/p\n杨先生/nr\n创办/v\n的/ude1\n餐饮企业/nz\n```\n\n### Simplified/Traditional Chinese conversion\n\n``` python\n# Simplified to Traditional\ntext = u\"以后等你当上皇后，就能买草莓庆祝了\"\nprint(segment.simplified_to_traditional(text))\n```\n\n```\n以後等妳當上皇后，就能買士多啤梨慶祝了\n```\n\n``` python\n# Traditional to Simplified\ntext = u\"用筆記簿型電腦寫程式HelloWorld\"\nprint(segment.traditional_to_simplified(text))\n```\n\n```\n用笔记本电脑写程序HelloWorld\n```\n\n## Acknowledgements\nThis project draws on the implementation principles of the [hankcs/HanLP](https://github.com/hankcs/HanLP/) project and uses its dictionary and model files.\n\n\n## Copyright\n* Apache License Version 2.0\n* Any project, product, article, or other work that uses all or part of YaYaNLP's features, dictionaries, or models must explicitly credit YaYaNLP and this project homepage.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FTony-Wang%2FYaYaNLP","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FTony-Wang%2FYaYaNLP","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FTony-Wang%2FYaYaNLP/lists"}