{"id":18860356,"url":"https://github.com/lining0806/textfilter","last_synced_at":"2025-04-14T12:30:45.532Z","repository":{"id":30945985,"uuid":"34504040","full_name":"lining0806/TextFilter","owner":"lining0806","description":"敏感词过滤系统","archived":false,"fork":false,"pushed_at":"2015-12-01T04:49:22.000Z","size":1890,"stargazers_count":60,"open_issues_count":1,"forks_count":36,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-28T01:50:03.354Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"h31/ProgrammingLabTask2","license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lining0806.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-04-24T07:21:08.000Z","updated_at":"2025-03-18T15:47:04.000Z","dependencies_parsed_at":"2022-09-09T03:12:17.393Z","dependency_job_id":null,"html_url":"https://github.com/lining0806/TextFilter","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lining0806%2FTextFilter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lining0806%2FTextFilter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lining0806%2FTextFilter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lining0806%2FTextFilter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lining0806","download_url":"https://codeload.github.com/lining0806/TextFilter/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":
248881303,"owners_count":21176828,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-08T04:23:43.900Z","updated_at":"2025-04-14T12:30:44.118Z","avatar_url":"https://github.com/lining0806.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Sensitive Word Filtering System\r\n\r\n### **For more, see [TextMining](https://github.com/lining0806/TextMining)**\r\n\r\n***\r\n\r\n**Environment setup on Ubuntu Linux:**\r\n \r\n    sudo apt-get install python-pip  \r\n    pip install nltk  \r\n    pip install jieba  \r\n    pip install pymongo  \r\n\r\n**The config file under Config:**  \r\n* Configures the server and the fields (columns) of the specified collection in the database to filter.  \r\n* Selects the language (Chinese or English).  \r\n* Sets the number of articles to filter; by default, articles are taken from the most recent backwards in time.  \r\n* Adds email notification: edit the line SendMailFlag = \"Yes\" # \"No\" to choose whether to receive email notifications.  \r\n* Result: a filter_status field of 1 means the article passed the filter; 0 means it did not.  \r\n\r\n**stopwords_chs and stopwords_eng are the blacklists of words to filter**    \r\n* Words to filter can be added at any time, one per line.  \r\n* If jieba cannot segment an added word correctly, add that word and its word frequency in the same way to the main dictionary file dict or to the user dictionary user_dict, one per line (the word frequency may be omitted).  \r\n* For example, add “阿尼玛” on its own line in stopwords_chs, and add “阿尼玛 3” to dict, where 3 is the word frequency; the higher the frequency, the more accurate the segmentation.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flining0806%2Ftextfilter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flining0806%2Ftextfilter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flining0806%2Ftextfilter/lists"}