{"id":18300278,"url":"https://github.com/docnow/twarc-videos","last_synced_at":"2025-04-09T09:26:23.788Z","repository":{"id":62585642,"uuid":"350913757","full_name":"DocNow/twarc-videos","owner":"DocNow","description":null,"archived":false,"fork":false,"pushed_at":"2021-05-31T21:04:36.000Z","size":24,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-19T04:46:19.196Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DocNow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-03-24T01:51:32.000Z","updated_at":"2021-07-09T07:23:16.000Z","dependencies_parsed_at":"2022-11-03T22:06:38.164Z","dependency_job_id":null,"html_url":"https://github.com/DocNow/twarc-videos","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DocNow%2Ftwarc-videos","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DocNow%2Ftwarc-videos/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DocNow%2Ftwarc-videos/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DocNow%2Ftwarc-videos/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DocNow","download_url":"https://codeload.github.com/DocNow/twarc-videos/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248010510,"owners_count":21032927,"icon_url":"https:/
/github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-05T15:11:56.223Z","updated_at":"2025-04-09T09:26:23.746Z","avatar_url":"https://github.com/DocNow.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# twarc-videos\n\nThis twarc plugin uses [youtube_dl] to download videos and their metadata from\ntweets. This is nice because youtube_dl downloads video from [many more\nplatforms] than YouTube including Twitter itself.\n\nTo use twarc-videos first you need to install it:\n\n    pip install twarc-videos\n\nNow you can collect data using the core twarc utility. 
For example this search\nfinds tweets that mention the word \"nirvana\" and also have native video\n(Twitter video) or a link to YouTube:\n\n    twarc2 search 'nirvana (has:videos OR url:\"https://youtu.be\")' \u003e nirvana-tweets.jsonl\n\nAnd you have a new subcommand `videos` that is supplied by twarc-videos.\n\n    twarc2 videos nirvana-tweets.jsonl\n\nOnce it is finished you will have a new `videos` directory that looks something\nlike:\n\n```\nvideos\n├── archive.txt\n├── mapping.tsv\n├── twitter\n│   ├── 1339223561731530753\n│   │   ├── Psychedelia_-_Nirvana_-_Come_As_You_Are.description\n│   │   ├── Psychedelia_-_Nirvana_-_Come_As_You_Are.info.json\n│   │   └── Psychedelia_-_Nirvana_-_Come_As_You_Are.mp4\n│   ├── 1341668409428353025\n│   │   ├── Rt_Your_Fav_Bands_-_Nirvana_Come_As_You_Are.description\n│   │   ├── Rt_Your_Fav_Bands_-_Nirvana_Come_As_You_Are.info.json\n│   │   └── Rt_Your_Fav_Bands_-_Nirvana_Come_As_You_Are.mp4\n│   ├── 1374212180002926594\n│   │   ├── Hanna_-_She_s_in_Nirvana....description\n│   │   ├── Hanna_-_She_s_in_Nirvana....info.json\n│   │   └── Hanna_-_She_s_in_Nirvana....mp4\n│   ├── 1374467789885378569\n│   │   ├── MUSIC_NOSTALGIA_-_Nirvana_The_Man_Who_Sold_The_World_..description\n│   │   ├── MUSIC_NOSTALGIA_-_Nirvana_The_Man_Who_Sold_The_World_..info.json\n│   │   └── MUSIC_NOSTALGIA_-_Nirvana_The_Man_Who_Sold_The_World_..mp4\n│   ├── 1374469206226264067\n│   │   ├── Take_it_easy_-_Abuelo_donde_andas_Nirvana.description\n│   │   ├── Take_it_easy_-_Abuelo_donde_andas_Nirvana.info.json\n│   │   └── Take_it_easy_-_Abuelo_donde_andas_Nirvana.mp4\n│   ├── 1374631023502360576\n│   │   ├── OraEtLabora_-_Reel_Stories_-_Dave_Grohl_is_on_@bbctwo_this_Saturday_at_10.30pm...talking_@Nirvana_amp_@foofighters_with_Dermot_@radioleary_@wearecraftuk.description\n│   │   ├── OraEtLabora_-_Reel_Stories_-_Dave_Grohl_is_on_@bbctwo_this_Saturday_at_10.30pm...talking_@Nirvana_amp_@foofighters_with_Dermot_@radioleary_@wearecraftuk.info.json\n│   │   
└── OraEtLabora_-_Reel_Stories_-_Dave_Grohl_is_on_@bbctwo_this_Saturday_at_10.30pm...talking_@Nirvana_amp_@foofighters_with_Dermot_@radioleary_@wearecraftuk.mp4\n│   ├── 1374656171844329477\n│   ├── 1374656880694292483\n│   ├── 1374660019241762817\n│   ├── 1374664809078272000\n│   └── 1374671562016661506\n│       ├── John_-_Nirvana_-_In_Bloom_Live_at_Reading_1992_@YouTube.description\n│       ├── John_-_Nirvana_-_In_Bloom_Live_at_Reading_1992_@YouTube.info.json\n│       └── John_-_Nirvana_-_In_Bloom_Live_at_Reading_1992_@YouTube.mp4\n└── youtube\n    ├── 5X9CGFQyjN4\n    │   ├── Heart-Shaped_Box_Nirvana_Music_Box.description\n    │   ├── Heart-Shaped_Box_Nirvana_Music_Box.en.vtt\n    │   ├── Heart-Shaped_Box_Nirvana_Music_Box.info.json\n    │   └── Heart-Shaped_Box_Nirvana_Music_Box.mp4\n    ├── AhcttcXcRYY\n    │   ├── Nirvana_-_About_A_Girl_MTV_Unplugged.description\n    │   ├── Nirvana_-_About_A_Girl_MTV_Unplugged.en.vtt\n    │   ├── Nirvana_-_About_A_Girl_MTV_Unplugged.info.json\n    │   └── Nirvana_-_About_A_Girl_MTV_Unplugged.mp4\n    ├── AXU-LaaO_xQ\n    │   ├── Nirvana_Drain_You_lyrics_sub_espanol.description\n    │   ├── Nirvana_Drain_You_lyrics_sub_espanol.info.json\n    │   └── Nirvana_Drain_You_lyrics_sub_espanol.mp4\n    ├── D742dNm1f8Q\n    │   ├── Nirvana_-_In_Bloom_Live_at_Reading_1992.description\n    │   ├── Nirvana_-_In_Bloom_Live_at_Reading_1992.info.json\n    │   └── Nirvana_-_In_Bloom_Live_at_Reading_1992.mp4\n    ├── -fh-bqSV73E\n    │   ├── Becoming_a_minimalist_w_Matt_D_Avella.description\n    │   ├── Becoming_a_minimalist_w_Matt_D_Avella.en.vtt\n    │   ├── Becoming_a_minimalist_w_Matt_D_Avella.info.json\n    │   └── Becoming_a_minimalist_w_Matt_D_Avella.mp4\n    ├── hTWKbfoikeg\n    │   ├── Nirvana_-_Smells_Like_Teen_Spirit_Official_Music_Video.description\n    │   ├── Nirvana_-_Smells_Like_Teen_Spirit_Official_Music_Video.en.vtt\n    │   ├── Nirvana_-_Smells_Like_Teen_Spirit_Official_Music_Video.info.json\n    │   └── 
Nirvana_-_Smells_Like_Teen_Spirit_Official_Music_Video.mp4\n    ├── jWkSt4G8F18\n    │   ├── Nirvana_healing_centre_overview.description\n    │   ├── Nirvana_healing_centre_overview.info.json\n    │   └── Nirvana_healing_centre_overview.mp4\n    ├── MW6E_TNgCsY\n    │   ├── Everclear_-_Santa_Monica_Official_Music_Video.description\n    │   ├── Everclear_-_Santa_Monica_Official_Music_Video.info.json\n    │   └── Everclear_-_Santa_Monica_Official_Music_Video.mp4\n    ├── n6P0SitRwy8\n    │   ├── Nirvana_-_Heart-Shaped_Box.description\n    │   ├── Nirvana_-_Heart-Shaped_Box.info.json\n    │   └── Nirvana_-_Heart-Shaped_Box.mp4\n    ├── OgeR2oqZGTs\n    │   ├── Nirvana_-_The_Man_Who_Sold_The_World_Live_On_MTV_Unplugged_1993_Unedited.description\n    │   ├── Nirvana_-_The_Man_Who_Sold_The_World_Live_On_MTV_Unplugged_1993_Unedited.en.vtt\n    │   ├── Nirvana_-_The_Man_Who_Sold_The_World_Live_On_MTV_Unplugged_1993_Unedited.info.json\n    │   └── Nirvana_-_The_Man_Who_Sold_The_World_Live_On_MTV_Unplugged_1993_Unedited.mp4\n    ├── v9RY25eImcw\n    │   ├── Nirvana_-_Smells_Like_Teen_Spirit_Cover_RADIO_TAPOK.description\n    │   ├── Nirvana_-_Smells_Like_Teen_Spirit_Cover_RADIO_TAPOK.en.vtt\n    │   ├── Nirvana_-_Smells_Like_Teen_Spirit_Cover_RADIO_TAPOK.info.json\n    │   └── Nirvana_-_Smells_Like_Teen_Spirit_Cover_RADIO_TAPOK.mp4\n    ├── ycHvL3W3_PA\n    │   ├── Nirvana_-_Where_Did_You_Sleep_Last_Night_8D_Audio.description\n    │   ├── Nirvana_-_Where_Did_You_Sleep_Last_Night_8D_Audio.info.json\n    │   └── Nirvana_-_Where_Did_You_Sleep_Last_Night_8D_Audio.mp4\n    └── y-lQgqHD8Xs\n        ├── dodo_tofubeats_-_nirvana_Official_Music_Video.description\n        ├── dodo_tofubeats_-_nirvana_Official_Music_Video.info.json\n        └── dodo_tofubeats_-_nirvana_Official_Music_Video.mp4\n```\n\nThe `videos/mapping.tsv` file is a tab-separated values file of video URLs found\nand their corresponding location on disk. 
\n\n## Testing\n\nTo run the tests you will need to create a `.env` file that looks like:\n\n    BEARER_TOKEN=YOUR_TOKEN_HERE\n\nAnd then:\n\n    python setup.py test\n\n[twarc]: https://github.com/docnow/twarc \n[youtube_dl]: https://youtube-dl.org/ \n[many more platforms]: http://ytdl-org.github.io/youtube-dl/supportedsites.html\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdocnow%2Ftwarc-videos","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdocnow%2Ftwarc-videos","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdocnow%2Ftwarc-videos/lists"}