{"id":21890070,"url":"https://github.com/ejfox/youtube-explore-to-gexf","last_synced_at":"2026-04-09T08:13:53.407Z","repository":{"id":57404287,"uuid":"181743776","full_name":"ejfox/youtube-explore-to-gexf","owner":"ejfox","description":"A tool to convert youtube-explore outputs to gexf","archived":false,"fork":false,"pushed_at":"2019-04-16T18:53:35.000Z","size":18,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-19T20:05:34.045Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ejfox.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-04-16T18:19:47.000Z","updated_at":"2019-04-26T01:00:04.000Z","dependencies_parsed_at":"2022-09-08T14:11:55.550Z","dependency_job_id":null,"html_url":"https://github.com/ejfox/youtube-explore-to-gexf","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ejfox%2Fyoutube-explore-to-gexf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ejfox%2Fyoutube-explore-to-gexf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ejfox%2Fyoutube-explore-to-gexf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ejfox%2Fyoutube-explore-to-gexf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ejfox","download_url":"https://codeload.github.com/ejfox/youtube-explore-to-gexf/tar.gz/refs/heads/master","host":{"name":
"GitHub","url":"https://github.com","kind":"github","repositories_count":244898409,"owners_count":20528335,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-28T11:28:34.461Z","updated_at":"2025-10-30T01:40:21.373Z","avatar_url":"https://github.com/ejfox.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Usage\nRun the [youtube-explore]() tool with a command similar to `python2.7 follow-youtube-recommendations.py --query=\"global warming,vaccines,nasa\" --searches=4 --branch=4 --depth=4 --name=\"science\"`\n\nThis will generate a JSON file. This tool provides the ability to convert the resulting JSON file to the [.gexf](https://gephi.org/gexf/format/) file format so it can be easily imported into [Gephi](https://gephi.org/)\n\n1. Git clone this repo\n1. Copy the JSON file you want to analyze into the folder\n1. Run `node convertYoutubeScrapeToGexf.js --filename=video-infos-SOURCE.json \u003e OUTPUT.gexf` to send the output of the conversion to a new file named `OUTPUT.gexf`\n1. In Gephi `File \u003e Open` and then select the newly created output .gexf file\n1. 
Analyze your recommendation network with Gephi\n\n## Global installation\nThis tool is also available as an npm package.\n\n+ `npm install -g youtube-explore-to-gexf`\n\n### Usage\n+ `youtube-explore-to-gexf --filename=video-infos-SOURCE.json \u003e OUTPUT.gexf`\n\n## How it works\nWe are going to convert the output JSON of a YouTube recommendation scrape performed with https://github.com/pnbt/youtube-explore\n\nLet's load the file:\n```javascript\nvar fs = require('fs') // To load things from filesystem\nvar _ = require('lodash') // To process data\nvar gexf = require('gexf') // To read and write gexf data\nconst argv = require('yargs').argv // To read command-line flags\n// File to process: the --filename flag if given, otherwise a sample scrape\nvar filename = argv.filename || 'data/video-infos-candidates2020-Andrew Yang-20190415.json'\nvar scrape = JSON.parse(fs.readFileSync(filename, 'utf8')) // Load the file\n```\n\nThe script accepts a filename when run through the CLI: add the flag `--filename=PATH-TO-YOUR-FILE`.\n\n#### Example command\n`\u003e node convertYoutubeScrapeToGexf.js --filename=data/video-infos-candidates2020-Bernie\\ Sanders-20190415.json \u003e bernie.gexf`\n\n## The data\n+ At the top level is an array of videos that are the result of the search\n+ If you dive into any of these videos... 
(which have a unique ID like `4U2eDJnwz_s`)\n\n#### You get the following data:\n|key | value |\n|----|--------|\n|pubdate|   \"2016-08-14\"|\n|views|     10545448|\n|dislikes|  3374|\n|likes|     93179|\n|key|       [ (0) ]|\n|duration|  1076|\n|id|        \"4U2eDJnwz_s\"|\n|mult|      0.8598425196850393|\n|title|     \"Auto Lending: Last Week Tonight with John Oliver (HBO)\"|\n|nb_recommendations| 2|\n|depth|     1|\n|recommendations|    [ (20) video ID strings ]|\n|channel|   \"LastWeekTonight\"|\n\nThe unique ID can be used to generate a link to the actual video, like so:\n\n+ Example: https://www.youtube.com/watch?v=UNIQUE-ID-HERE\n+ This video's URL: https://www.youtube.com/watch?v=4U2eDJnwz_s\n\nThe `recommendations` key contains an array of 20 more IDs of videos that are recommended from this video.\n\n## Converting to GEXF (an XML-based format)\n\nOkay, now that we know the general shape of our source data, we need to figure out how to convert it so that it can be read by Gephi or other software through the GEXF graph format.\n\nLike most directed graphs, the two main components are:\n+ Nodes: A list of every entity, in our case YouTube videos\n+ Edges: A list of every link between entities, in our case recommendations\n\n## Nodes\nNodes have an ID and a label, and other data can be attached for analysis in Gephi.\nYou end up with something like `\u003cnode id=\"0\" label=\"Hello\" /\u003e`\n\n## Links\nLinks (called edges in GEXF) connect two nodes:\n+ Each link has a unique ID\n+ The \"source\" field is the unique ID of the source node, in our case the video that the recommendation comes from\n+ The \"target\" field is the unique ID of the recommended video\n\nA hello world GEXF graph would look something like this:\n```xml\n\u003c?xml version=\"1.0\" encoding=\"UTF-8\"?\u003e\n\u003cgexf xmlns=\"http://www.gexf.net/1.2draft\" version=\"1.2\"\u003e\n    \u003cmeta lastmodifieddate=\"2009-03-20\"\u003e\n        \u003ccreator\u003eGexf.net\u003c/creator\u003e\n        \u003cdescription\u003eA hello world! 
file\u003c/description\u003e\n    \u003c/meta\u003e\n    \u003cgraph mode=\"static\" defaultedgetype=\"directed\"\u003e\n        \u003cnodes\u003e\n            \u003cnode id=\"0\" label=\"Hello\" /\u003e\n            \u003cnode id=\"1\" label=\"Word\" /\u003e\n        \u003c/nodes\u003e\n        \u003cedges\u003e\n            \u003cedge id=\"0\" source=\"0\" target=\"1\" /\u003e\n        \u003c/edges\u003e\n    \u003c/graph\u003e\n\u003c/gexf\u003e\n```\n\nSo let's loop over our JSON file and fill our nodes and links:\n```javascript\nvar nodes = [] // One entry per scraped video\nvar links = [] // One entry per recommendation\n_.each(scrape, function(video) {\n  var sourceID = video.id // Get the source video ID\n  nodes.push({\n    id: sourceID,\n    label: video.title,\n    attributes: {\n      duration: video.duration,\n      likes: video.likes,\n      dislikes: video.dislikes,\n      pubdate: video.pubdate,\n      mult: video.mult,\n      channel: video.channel\n    }\n  }) // Add source video as node\n  _.each(video.recommendations, function(rec, i) {\n    // Generate new links for every recommended video\n    var newLink = {\n      id: sourceID + '-' + rec + '-' + i,\n      source: sourceID,\n      target: rec\n    }\n    links.push(newLink) // Add to our list of links\n  })\n})\n```\n\nSo now we have our nodes and links as plain JavaScript objects.\nWe're going to use the [gexf library](https://github.com/Yomguithereal/gexf#writer) to convert them to GEXF.\n\n```javascript\nvar convertedGraph = gexf.create({\n  defaultEdgeType: 'directed',\n  model: {\n    node: [\n      {\n        id: 'duration',\n        type: 'float',\n        title: 'duration'\n      },\n      {\n        id: 'likes',\n        type: 'float',\n        title: 'likes'\n      },\n      {\n        id: 'dislikes',\n        type: 'float',\n        title: 'dislikes'\n      },\n      {\n        id: 'mult',\n        type: 'float',\n        title: 'mult'\n      },\n      {\n        id: 'pubdate',\n        type: 'string',\n        title: 'pubdate'\n      },\n      {\n        id: 'channel',\n        type: 'string',\n  
      title: 'channel'\n      }\n    ]\n  },\n  nodes: nodes,\n  edges: links\n})\n```\n\n# Made by EJ Fox 🌞\n#### Apr 16, 2019\n\nejfox@ejfox.com // Questions, comments, collaboration welcome\n\nSupport for my work is appreciated! https://ejfox.com/donate/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fejfox%2Fyoutube-explore-to-gexf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fejfox%2Fyoutube-explore-to-gexf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fejfox%2Fyoutube-explore-to-gexf/lists"}