{"id":19140999,"url":"https://github.com/riophae/lynda-video-transcripts","last_synced_at":"2025-07-14T01:16:07.413Z","repository":{"id":146016102,"uuid":"48271584","full_name":"riophae/lynda-video-transcripts","owner":"riophae","description":"A crawler script for batch-downloading Lynda video transcripts","archived":false,"fork":false,"pushed_at":"2017-06-23T15:46:28.000Z","size":34,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-22T19:14:23.527Z","etag":null,"topics":["cralwer","lynda","transcripts"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/riophae.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-12-19T06:39:58.000Z","updated_at":"2017-06-26T11:53:47.000Z","dependencies_parsed_at":null,"dependency_job_id":"142581fb-e386-4166-89c6-12336e590f89","html_url":"https://github.com/riophae/lynda-video-transcripts","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/riophae/lynda-video-transcripts","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/riophae%2Flynda-video-transcripts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/riophae%2Flynda-video-transcripts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/riophae%2Flynda-video-transcripts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/riophae%2Flynda-video-transcripts/manifests","owner_url"
:"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/riophae","download_url":"https://codeload.github.com/riophae/lynda-video-transcripts/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/riophae%2Flynda-video-transcripts/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265228987,"owners_count":23731092,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cralwer","lynda","transcripts"],"created_at":"2024-11-09T07:19:44.424Z","updated_at":"2025-07-14T01:16:07.387Z","avatar_url":"https://github.com/riophae.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Lynda Video Transcripts\n\nA crawler script that batch-downloads video transcripts from [Lynda](http://www.lynda.com/).\n\n## Requirements\n\n- Node.js\n- PhantomJS 2.x\n\n## Installation\n\n```bash\n$ git clone https://github.com/riophae/lynda-video-transcripts.git\n$ cd lynda-video-transcripts\n$ npm install # install dependencies\n$ # edit config.yaml (see Configuration)\n$ npm run build # rebuild after every change to the config\n$ npm start # run the crawler script\n```\n\n## Configuration\n\nCopy `config.example.yaml`, rename the copy to `config.yaml`, and edit it:\n\n- `detectNetworkCondition`: whether to check network connectivity at startup (`yes`/`no`)\n- `userAgent`: the user agent string to send; matching the browser you normally use is probably a good idea\n- `captureScreenAutomatically`: whether to take screenshots automatically at regular intervals while the crawler runs (`yes`/`no`)\n- `viewportSize`: viewport size of the browser the crawler uses; any value works as long as it is not too small\n- `username`, `password`: your lynda.com account name and password\n- `courses`: the list of courses to crawl\n- `intervalBetweenTutorialVisits`: the interval between fetching consecutive tutorials; avoid setting it too short, or the crawler may trigger anti-abuse measures\n\n#### `courses`\n\nTwo forms are supported. You can specify both the output directory and the starting point for each course:\n\n```yaml\ncourses:\n  - dirName: \u003cCOURSE_OUTPUT_DIR\u003e\n    startPoint: \u003cSTART_POINT_URL\u003e\n  - dirName: ...\n    startPoint: ...\n  - dirName: ...\n    startPoint: ...\n```\n\nAlternatively, specify only the starting point of each course, and the program derives the output directory from the course name:\n\n```yaml\ncourses:\n  - \u003cSTART_POINT_URL\u003e\n  - \u003cANOTHER_START_POINT_URL\u003e\n  - ...\n```\n\nInternally, the crawler starts fetching transcripts at the specified starting point and continues through the last tutorial of the course.\n\n## Caveats\n\nEvery time the crawler script starts, it empties the output directory (`output/`), so move any files you want to keep out of it promptly.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Friophae%2Flynda-video-transcripts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Friophae%2Flynda-video-transcripts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Friophae%2Flynda-video-transcripts/lists"}