{"id":24625671,"url":"https://github.com/awpala/udemy-my-courses-data-parser","last_synced_at":"2026-05-07T03:37:01.450Z","repository":{"id":193594889,"uuid":"689111774","full_name":"awpala/udemy-my-courses-data-parser","owner":"awpala","description":"Download Udemy lists and courses metadata for authenticated student user","archived":false,"fork":false,"pushed_at":"2025-01-05T03:21:07.000Z","size":1041,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-19T17:58:56.799Z","etag":null,"topics":["data","scripts","udemy"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/awpala.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-08T20:40:31.000Z","updated_at":"2025-01-05T03:21:10.000Z","dependencies_parsed_at":"2025-01-05T04:19:04.415Z","dependency_job_id":"680549fc-bbd4-4ed9-8d27-cb40cf1790cc","html_url":"https://github.com/awpala/udemy-my-courses-data-parser","commit_stats":null,"previous_names":["awpala/udemy-my-courses-data-parser"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/awpala/udemy-my-courses-data-parser","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awpala%2Fudemy-my-courses-data-parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awpala%2Fudemy-my-courses-data-parser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awpala%2Fudemy-my-courses-data-parser/releases","ma
nifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awpala%2Fudemy-my-courses-data-parser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/awpala","download_url":"https://codeload.github.com/awpala/udemy-my-courses-data-parser/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awpala%2Fudemy-my-courses-data-parser/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262085287,"owners_count":23256418,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","scripts","udemy"],"created_at":"2025-01-25T04:39:37.209Z","updated_at":"2026-05-07T03:37:01.399Z","avatar_url":"https://github.com/awpala.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Downloading Metadata for Udemy Student Account Courses Lists\n\n## Overview\n\nA semi-automated, scripts-based approach for downloading courses metadata for an authenticated [Udemy](https://www.udemy.com) user account. Based on Udemy v.2.0 API. Corresponding data is based on HTTP requests inferred from user-authenticated view `https://www.udemy.com/home/my-courses/lists/`.\n\n### Use Policy\n\n***Caution***: Use of this script and corresponding compliance with [Udemy's terms and related policies](https://www.udemy.com/terms/) are the **sole responsibility** of the user and corresponding Udemy account holder. 
This script is provided for ***personal use only***, and is primarily intended for data organization and/or for related pedagogical purposes (e.g., courses scheduling/planning). The author neither endorses nor encourages the use of these scripts for \"abusive,\" counter-to-terms, or related purposes, and correspondingly claims **no** responsibility for any such misuses. ***Download responsibly***.\n\n## Prereqs\n\nIntended for running/use on Unix-based systems (drafted on Ubuntu 22.04), or equivalent (e.g., [Git Bash](https://gitforwindows.org/), [WSL](https://aka.ms/wsl), etc. on Windows). Additional external dependencies are as follows:\n  * [`jq`](https://jqlang.github.io/jq/) command-line utility for JSON parsing\n  * (***optional***) Node.js v.16+\n  * (***optional***) Postgres v.13+\n\n***N.B.*** ***optional*** dependencies only required for JSON to SQL conversion. See corresponding steps `2` and `3` in the next section.\n\n## Steps for Use\n\n***N.B.*** All terminal commands in this section assume a reference location of the top-level repo directory (i.e., `.../udemy-my-courses-data-parser/`).\n\n### 1. Authentication\n\nTo download the JSON payload containing the lists and courses data, first you must authenticate into your Udemy user account. To do this most simply, use the [Udemy website](https://www.udemy.com) itself and log in there.\n\nOnce authenticated, retrieve your `access_token` via browser developer tools or equivalent. Using Google Chrome Developer Tools as a representative example, this can be accomplished as follows:\n\n\u003ccenter\u003e\n\u003cimg src=\"./assets/access-token.png\"\u003e\n\u003c/center\u003e\n\n***N.B.*** The `access_token` value is an alphanumeric string, approximately 40 total alphanumeric characters in length. Furthermore, note that this value is **sensitive** and should **NOT** be shared accordingly, as it provides **direct access** to your Udemy account!\n\n### 2. 
Downloading Data as JSON\n\nTo download all lists and their constituent courses as a flattened JSON array (i.e., of general form `[ { list1 }, { list2 }, ...]`), run the corresponding parsing script from the command line as follows:\n\n```bash\n./udemy-list-parser.sh\n```\n\nAt the terminal prompt `Enter your Udemy account access token: `, provide the `access_token` value from the previous step and then press `ENTER` to run the script.\n\n***N.B.*** The script has dynamically set parameters, which can be adjusted accordingly to your preference. These are set here to run with relatively \"slow delay\" (ca. 8 ± 0.25 seconds, downloading/proceeding at a rate of page size `1` and page count `1`, where a page here corresponds to a single list) to ensure full data download and no corresponding prematurely canceled requests. Correspondingly, this may take a few minutes to run.\n\nOn successful completion of the script, the following message will appear in the terminal:\n\n```bash\nCombined JSON data saved to udemy_lists_\u003ctimestamp\u003e.json\n```\n\nwhere `\u003ctimestamp\u003e` is the corresponding [Unix timestamp](https://www.unixtimestamp.com/) taken on initial run of the script.\n\nThis resulting payload has the following **general form**:\n\n```json\n[\n  // list 1\n  {\n    \"id\": \u003cnumber\u003e,\n    \"title\": \u003cstring\u003e,\n    \"description\": \u003cstring\u003e,\n    \"list_id\": \u003cnumber\u003e,\n    \"courses\": [\n      // list 1, course 1\n      {\n        \"_class\": \"course\",\n        \"id\": \u003cnumber\u003e,\n        \"title\": \u003cstring\u003e,\n        \"url\": \u003cstring\u003e,\n        \"is_paid\": \u003cboolean\u003e,\n        \"visible_instructor\": [\n          // list 1, course 1, instructor 1\n          {\n            \"_class\": \"user\",\n            \"id\": \u003cnumber\u003e,\n            \"title\": \u003cstring\u003e,\n            \"name\": \u003cstring\u003e,\n            \"display_name\": \u003cstring\u003e,\n  
          \"job_title\": \u003cstring\u003e,\n            \"image_50x50\": \u003cstring\u003e,\n            \"image_100x100\": \u003cstring\u003e,\n            \"initials\": \u003cstring\u003e,\n            \"url\": \u003cstring\u003e\n          },\n          ... // list 1, course 1, instructors 2...N\n        ],\n        \"image_240x135\": \u003cstring\u003e,\n        \"is_practice_test_course\": \u003cboolean\u003e,\n        \"image_480x270\": \u003cstring\u003e,\n        \"published_title\": \u003cstring\u003e,\n        \"tracking_id\": \u003cstring\u003e,\n        \"headline\":  \u003cstring\u003e,\n        \"num_subscribers\": \u003cnumber\u003e,\n        \"avg_rating\": \u003cnumber\u003e,\n        \"num_reviews\": \u003cnumber\u003e,\n        \"favorite_time\": \u003cstring | null\u003e,\n        \"archive_time\": \u003cstring | null\u003e,\n        \"completion_ratio\": \u003cnumber\u003e,\n        \"num_quizzes\": \u003cnumber\u003e,\n        \"num_lectures\": \u003cnumber\u003e,\n        \"is_private\": \u003cboolean\u003e,\n        \"status_label\": \u003cstring\u003e,\n        \"features\": {\n          \"_class\": \"course\",\n          \"discussions_create\": \u003cboolean\u003e,\n          \"discussions_view\": \u003cboolean\u003e,\n          \"discussions_replies_create\": \u003cboolean\u003e,\n          \"enroll\": \u003cboolean\u003e,\n          \"reviews_create\": \u003cboolean\u003e,\n          \"reviews_view\": \u003cboolean\u003e,\n          \"reviews_responses_create\": \u003cboolean\u003e,\n          \"announcements_comments_view\": \u003cboolean\u003e,\n          \"educational_announcements_create\": \u003cboolean\u003e,\n          \"promotional_announcements_create\": \u003cboolean\u003e,\n          \"promotions_create\": \u003cboolean\u003e,\n          \"promotions_view\": \u003cboolean\u003e,\n          \"students_view\": \u003cboolean\u003e\n        },\n        \"is_published\": \u003cboolean\u003e,\n        \"primary_category\": {\n    
      \"id\": \u003cnumber\u003e,\n          \"title\": \u003cstring\u003e,\n          \"title_cleaned\": \u003cstring\u003e,\n          \"url\": \u003cstring\u003e,\n          \"icon_class\": \u003cstring\u003e,\n          \"type\": \"category\",\n          \"channel_id\": null,\n          \"_class\": \"course_category\"\n        },\n        \"primary_subcategory\": {\n          \"id\": \u003cnumber\u003e,\n          \"title\": \u003cstring\u003e,\n          \"title_cleaned\": \u003cstring\u003e,\n          \"url\": \u003cstring\u003e,\n          \"icon_class\": \u003cstring\u003e,\n          \"type\": \"subcategory\",\n          \"channel_id\": null,\n          \"_class\": \"course_subcategory\"\n        },\n        \"created\": \u003cstring\u003e,\n        \"estimated_content_length\": \u003cnumber\u003e,\n        \"buyable_object_type\": \"course\",\n        \"last_accessed_time\": \u003cstring\u003e,\n        \"enrollment_time\": \u003cstring\u003e,\n        \"last_update_date\": \u003cstring\u003e,\n        \"context_info\": {\n          \"category\": {\n            \"id\": \u003cnumber\u003e,\n            \"title\": \u003cstring\u003e,\n            \"url\": \u003cstring\u003e,\n            \"tracking_object_type\": \"cat\"\n          },\n          \"subcategory\": null,\n          \"label\": {\n            \"id\": \u003cnumber\u003e,\n            \"display_name\": \u003cstring\u003e,\n            \"title\": \u003cstring\u003e,\n            \"topic_channel_url\": \u003cstring\u003e,\n            \"url\": \u003cstring\u003e,\n            \"tracking_object_type\": \"cl\"\n          }\n        }\n      },\n      ... // list 1, courses 2...N\n    ],\n  },\n  ... // lists 2...N\n]\n```\n\n### 3. 
(***Optional***) Transform JSON Payload to SQL Tables\n\nIf desired, the resulting JSON payload from the previous step can be transformed to SQL tables for additional querying, transformation, etc.\n\nTo do this, connect to a live Postgres server instance, create a new database (or simply use default database `postgres` or equivalent if not specified otherwise), and then create tables using the script provided in `create_tables.sql`. This will define tables as follows (under schema `student`):\n\n| Schema-Qualified Table Name | Entity or Join Table Type|\n|:--:|:--:|\n| `student.course` | entity |\n| `student.list` | entity |\n| `student.instructor` | entity |\n| `student.category` | entity |\n| `student.subcategory` | entity |\n| `student.topic` | entity |\n| `student.course_list` | join |\n| `student.course_instructor` | join |\n| `student.course_category` | join |\n| `student.course_subcategory` | join | \n| `student.course_topic` | join |\n\n***N.B.*** In general, there is a many-to-many relationship between `course` and the other entities, which in turn is captured via the respective `course_\u003c...\u003e` join tables accordingly.\n\nThe corresponding [ER diagram](https://dbdiagram.io/d/64fbb50802bd1c4a5e3d738e) is as follows:\n\n\u003ccenter\u003e\n\u003cimg src=\"./assets/er-diagram.png\"\u003e\n\u003c/center\u003e\n\nWith the tables created, to create the seed data to populate the tables from the downloaded JSON payload in the previous step, run the generator script `create-seed.js` via Node.js from the command line as follows:\n\n```bash\nnode create-seed.js udemy_lists_\u003ctimestamp\u003e.json\n```\n where `udemy_lists_\u003ctimestamp\u003e.json` is the JSON payload file generated in the previous step.\n\nOn successful completion of the script, the following message will appear in the terminal:\n\n```bash\nEntity SQL insert statements have been generated and saved to seed_data_\u003ctimestamp\u003e.sql\nJoin SQL insert statements have been generated 
and saved to seed_data_\u003ctimestamp\u003e.sql\n```\n\nwhere `\u003ctimestamp\u003e` is the corresponding [Unix timestamp](https://www.unixtimestamp.com/) taken on initial run of the script.\n\nFurthermore, as this message suggests, this creates a new file `seed_data_\u003ctimestamp\u003e.sql` which provides the corresponding `INSERT ...` statements to populate the tables.\n\nWith the tables populated, this provides a useful \"bird's eye\" view of your courses (i.e., relative to more-tedious browser-/UI-based navigation to discern the same data), along with enhanced querying capabilities, e.g.,:\n\n```sql\nSET search_path TO student; -- use schema `student`\n\nSELECT \n  ARRAY_AGG(DISTINCT list.title ORDER BY list.title) AS \"lists\",\n  course.title AS \"course\",\n  ARRAY_AGG(DISTINCT instructor.display_name ORDER BY instructor.display_name) AS \"instructors\",\n  ROUND(course.estimated_content_length / 60.0, 2) AS \"length (hrs)\",\n  category.title AS \"category\",\n  subcategory.title AS \"subcategory\",\n  topic.title AS \"topic\"\nFROM course\nLEFT JOIN course_list ON course_list.course_id = course.id\nLEFT JOIN course_category ON course_category.course_id = course.id\nLEFT JOIN course_subcategory ON course_subcategory.course_id = course.id\nLEFT JOIN course_topic ON course_topic.course_id = course.id\nLEFT JOIN list ON course_list.list_id = list.id\nLEFT JOIN category ON course_category.category_id = category.id\nLEFT JOIN subcategory ON course_subcategory.subcategory_id = subcategory.id\nLEFT JOIN topic ON course_topic.topic_id = topic.id\nLEFT JOIN course_instructor ON course_instructor.course_id = course.id\nLEFT JOIN instructor ON course_instructor.instructor_id = instructor.id\nGROUP BY course.title, course.estimated_content_length, topic.title, category.title, subcategory.title\nORDER BY \"lists\", \"category\", \"instructors\", \"subcategory\", \"course\"\n;\n```\n\n### 4. 
Updating the Data\n\nTo update the existing data with live data from Udemy, re-run the script in step `2` to regenerate the JSON payload. To reseed the database, first run the script in `truncate_tables.sql` to empty the existing tables, and then repeat step `3`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fawpala%2Fudemy-my-courses-data-parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fawpala%2Fudemy-my-courses-data-parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fawpala%2Fudemy-my-courses-data-parser/lists"}