{"id":13754313,"url":"https://github.com/JavaStudenttwo/ccks_kg","last_synced_at":"2025-05-09T22:31:55.307Z","repository":{"id":60337633,"uuid":"311369465","full_name":"JavaStudenttwo/ccks_kg","owner":"JavaStudenttwo","description":"Method summary of the fifth-place entry in the CCKS 2020 evaluation on ontology-based automatic construction of financial knowledge graphs","archived":false,"fork":false,"pushed_at":"2022-10-28T15:32:38.000Z","size":2105,"stargazers_count":48,"open_issues_count":3,"forks_count":22,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-11-16T07:33:22.110Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JavaStudenttwo.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-11-09T14:44:35.000Z","updated_at":"2024-08-01T07:16:16.000Z","dependencies_parsed_at":"2023-01-20T19:49:44.556Z","dependency_job_id":null,"html_url":"https://github.com/JavaStudenttwo/ccks_kg","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JavaStudenttwo%2Fccks_kg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JavaStudenttwo%2Fccks_kg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JavaStudenttwo%2Fccks_kg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JavaStudenttwo%2Fccks_kg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JavaStudenttwo","download_url":"https://codeload.github.com/JavaStudenttwo/ccks_kg/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":
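"github","repositories_count":253335803,"owners_count":21892737,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T09:01:54.216Z","updated_at":"2025-05-09T22:31:50.276Z","avatar_url":"https://github.com/JavaStudenttwo.png","language":"Python","readme":"# Method overview\n\nThis method uses the open-source tool HanLP and the official pretrained model bert-base-chinese.\n\nThe project directory structure is as follows:\n\n![](images/d1.PNG)\n\nThe expirement_attr, expirement_er, and expirement_re folders contain experiments run during the evaluation, and the data folder holds the evaluation data.\n\n## 1. Entity extraction\n\nEntities of the two types “person” (人物) and “institution” (机构) are extracted with HanLP's named entity recognition tool.\n\nEntities of the four types “research report” (研报), “article” (文章), “risk” (风险), and “institution” (机构) are extracted with rules.\n\nBesides rule matching, distant supervision can also be used, mainly to extract entities from research reports. The workflow is shown in the figure below:\n\n![](images/d2.PNG)\n\n1. Extract an initial set of entities with rules and external tools.\n\n2. Split the raw data evenly into two halves, one for training and one for testing, and label the training half with distant supervision.\n\n3. Split the distantly supervised data 4:1 into training and validation sets, and train the model.\n\n4. Run the model trained in the previous step on the test set to extract more entities.\n\n5. Check whether the stopping condition for the loop is met, and stop if it is.\n\n6. Filter out some of the entities with rule matching, add the remaining entities to the seed knowledge graph, then return to step 2 and repeat the training, extracting entities iteratively.\n\n## 2. Attribute extraction\n\nAttributes are extracted by rule matching.\n\n## 3. Relation extraction\n\nRelations are extracted by rule matching.\n\n# Running the program\n\nInstall Python 3.7 and PyTorch 1.3 first.\n\nThen install the required dependencies with the following commands:\n\n```\npip install jieba\npip install hanlp\npip install pytorch_pretrained_bert\n```\n\n\nStart the program with:\n\n```\npython main.py\n```\n\nThe final results are saved as answers.json in the output folder.\n\n\n\n","funding_links":[],"categories":["知识图谱"],"sub_categories":["其他_文本生成、文本对话"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJavaStudenttwo%2Fccks_kg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FJavaStudenttwo%2Fccks_kg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJavaStudenttwo%2Fccks_kg/lists"}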