{"id":39640925,"url":"https://github.com/46319943/newsforpaddle","last_synced_at":"2026-01-18T09:01:17.276Z","repository":{"id":54113236,"uuid":"318540572","full_name":"46319943/NewsForPaddle","owner":"46319943","description":"借助百度飞桨对新闻进行分析，并将结果进行地图可视化。Analyze news with Paddle and visualize the result on the map.","archived":false,"fork":false,"pushed_at":"2021-03-09T05:51:56.000Z","size":1732,"stargazers_count":2,"open_issues_count":0,"forks_count":3,"subscribers_count":1,"default_branch":"master","last_synced_at":"2023-03-05T02:12:59.937Z","etag":null,"topics":["baidu-api","mapbox-gl","news","paddle","paddlehub","paddlepaddle"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/46319943.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-12-04T14:27:28.000Z","updated_at":"2022-07-04T09:03:54.000Z","dependencies_parsed_at":"2022-08-13T06:50:58.700Z","dependency_job_id":null,"html_url":"https://github.com/46319943/NewsForPaddle","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"purl":"pkg:github/46319943/NewsForPaddle","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/46319943%2FNewsForPaddle","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/46319943%2FNewsForPaddle/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/46319943%2FNewsForPaddle/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/46319943%2FNewsForPaddle/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/46319943",
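"download_url":"https://codeload.github.com/46319943/NewsForPaddle/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/46319943%2FNewsForPaddle/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28534154,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-18T00:39:45.795Z","status":"online","status_checked_at":"2026-01-18T02:00:07.578Z","response_time":98,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["baidu-api","mapbox-gl","news","paddle","paddlehub","paddlepaddle"],"created_at":"2026-01-18T09:00:46.182Z","updated_at":"2026-01-18T09:01:17.218Z","avatar_url":"https://github.com/46319943.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# News Map Visualization\n- In the big-data era, news is an effective way to obtain information, but it suffers from cluttered content, unclear categorization, and unintuitive presentation. Our team therefore kept looking for a new way to present news and eventually settled on maps. As an important carrier of information, a map presents information intuitively and along multiple dimensions.\n- We therefore combine news with maps, using Baidu PaddlePaddle to visualize news on a map. Based on the news text, we can also tentatively explore how news sentiment and topics are distributed across space and time.\n\n# Experiment Steps\n- Completed in Baidu AI Studio; project link: https://aistudio.baidu.com/aistudio/projectdetail/1301096\n- In this example, we first load the sample data\n- Use the Senta model to analyze the sentiment polarity of the text and compute a sentiment score\n- Use the LDA topic model to analyze the text, obtaining the keyword distribution of each topic and the topic assigned to each news item\n- Then use PaddlePaddle for named entity recognition to extract place names from the news, and geocode them with Baidu Maps\n- Finally, use Mapbox-GL to visualize the news on a map, presenting the earlier analysis results on it\n\n# Technical Approach\n- Use Baidu PaddlePaddle (Paddle) for sentiment analysis, word segmentation, and named entity recognition\n- Use Baidu Maps for geocoding\n- Use Gensim, Requests, and other libraries for news topic analysis\n- Use Mapbox-GL to visualize the results on a map\n\n# Data Source\n- In this example, we provide 244 news items published on Changjiang Net (长江网) from May 1 to May 5 as sample data\n
- On the actual platform, we use a scheduled distributed crawler, combined with an intelligent news-extraction algorithm, to extract news from each news source website. See base_scrape.py for the code\n\n# Extracting Place Names from News\n- In this example, we only demonstrate the extraction of place names in Wuhan; extracting place names nationwide requires additional processing steps, see geocoder.py\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F46319943%2Fnewsforpaddle","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F46319943%2Fnewsforpaddle","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F46319943%2Fnewsforpaddle/lists"}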