{"id":13672573,"url":"https://github.com/GreatV/CloudMusic-Crawler","last_synced_at":"2025-04-27T22:32:23.857Z","repository":{"id":50636762,"uuid":"89907179","full_name":"GreatV/CloudMusic-Crawler","owner":"GreatV","description":"NetEase Cloud Music crawler with data visualization.","archived":true,"fork":false,"pushed_at":"2019-06-02T02:39:45.000Z","size":34,"stargazers_count":367,"open_issues_count":0,"forks_count":110,"subscribers_count":19,"default_branch":"master","last_synced_at":"2024-11-11T10:42:33.379Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/GreatV.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-05-01T07:57:48.000Z","updated_at":"2024-11-11T07:05:42.000Z","dependencies_parsed_at":"2022-09-19T04:30:55.957Z","dependency_job_id":null,"html_url":"https://github.com/GreatV/CloudMusic-Crawler","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GreatV%2FCloudMusic-Crawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GreatV%2FCloudMusic-Crawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GreatV%2FCloudMusic-Crawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/GreatV%2FCloudMusic-Crawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/GreatV","download_url":"https://codeload.github.com/GreatV/CloudMusic-Crawler/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","reposi
tories_count":251219601,"owners_count":21554444,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T09:01:39.817Z","updated_at":"2025-04-27T22:32:23.472Z","avatar_url":"https://github.com/GreatV.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"**A new version is coming soon...**\n\n![](https://i.loli.net/2018/01/23/5a672e59bbfab.png)\n\n![](https://i.loli.net/2018/01/23/5a672e63457b7.png)\n\n## Introduction\n\nI came across the article [我用Python分析了42万字的歌词，为了搞清楚民谣歌手们在唱些什么](https://ask.hellobi.com/blog/spuerwdk/6336) (*I analyzed 420,000 characters of lyrics with Python to find out what folk singers are singing about*), thought it looked fun, and decided to implement the idea myself. This project is the result.\n\n## Crawler\n\nThe crawler mostly calls existing APIs. For this part, see [NetEase-MusicBox](https://github.com/darknessomi/musicbox), a command-line client for NetEase Cloud Music; I tried it and it works well. The crawler is based mainly on that project's [api.py](https://github.com/darknessomi/musicbox/blob/master/NEMbox/api.py).\n\n![Screenshot3.png](https://i.loli.net/2017/12/28/5a44fdcfc0ba9.png)\n\n## File processing\n\nThis step writes all the lyrics into a single file and also writes each artist's lyrics into a file of their own, ready for the analysis that follows.\n\n![Screenshot4.png](https://i.loli.net/2017/12/28/5a44fdcfdffae.png)\n\nThis run collected roughly 26,000 lines of lyrics.\n\n## Text analysis\n\nWord segmentation is done with [jieba (“结巴”中文分词)](https://github.com/fxsjy/jieba).\n\nI first picked one singer as a representative and analyzed the word frequencies, shown below:\n\n![shisanfigure_2.png](https://i.loli.net/2017/12/28/5a44fdcf52893.png)\n\n![figure_bar01.png](https://i.loli.net/2017/12/28/5a44fdcf44e0e.png)\n\n![figure_pie01.png](https://i.loli.net/2017/12/28/5a44fdcf85627.png)\n\nI also made a word cloud:\n\n![shisanfigure_1.png](https://i.loli.net/2017/12/28/5a44fdcf7d383.png)\n\nThen I analyzed all the lyrics together, which produced the following pie chart:\n\n![fm3.png](https://i.loli.net/2017/12/28/5a44fdcf7efac.png)\n\nI also made a word cloud of them, shown below:\n\n![fm0.png](https://i.loli.net/2017/12/28/5a44fdcf7cca2.png)\n\n## Future work\n\n- Sentiment analysis\n- The comments on Cloud Music are great; analyzing them could turn up something interesting\n\n## Usage\n\n```\ngit clone https://github.com/GreatV/CloudMusic-Crawler.git\n\ncd CloudMusic-Crawler\n\npython3 -m venv venv\n\nsource venv/bin/activate\n\npip install -r requirements.txt\n\ncd NEMCrawler\n\npython NEM_spider.py\n\npython text_mining.py\n\nfirefox render.html\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGreatV%2FCloudMusic-Crawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FGreatV%2FCloudMusic-Crawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FGreatV%2FCloudMusic-Crawler/lists"}