{"id":13440112,"url":"https://github.com/bibanon/tubeup","last_synced_at":"2025-05-14T19:06:09.983Z","repository":{"id":3853624,"uuid":"51124830","full_name":"bibanon/tubeup","owner":"bibanon","description":"Use yt-dlp to download video/metadata and upload to the Internet Archive.","archived":false,"fork":false,"pushed_at":"2025-04-03T01:15:16.000Z","size":341,"stargazers_count":446,"open_issues_count":6,"forks_count":71,"subscribers_count":24,"default_branch":"master","last_synced_at":"2025-04-13T13:57:49.010Z","etag":null,"topics":["archival","gplv3","internet-archive","preservation","python","video","youtube","youtube-dl","yt-dlp"],"latest_commit_sha":null,"homepage":"https://pypi.python.org/pypi/tubeup/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bibanon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-02-05T03:57:38.000Z","updated_at":"2025-04-09T14:59:19.000Z","dependencies_parsed_at":"2024-06-19T19:05:13.341Z","dependency_job_id":"21725590-5cc3-41e7-97cd-f15cded11f7b","html_url":"https://github.com/bibanon/tubeup","commit_stats":{"total_commits":312,"total_committers":26,"mean_commits":12.0,"dds":0.5,"last_synced_commit":"037cc14f0d9be1a60c417245c1de8dc878964865"},"previous_names":[],"tags_count":36,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bibanon%2Ftubeup","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bibanon%2Ftubeup/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/bibanon%2Ftubeup/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bibanon%2Ftubeup/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bibanon","download_url":"https://codeload.github.com/bibanon/tubeup/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248724586,"owners_count":21151559,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["archival","gplv3","internet-archive","preservation","python","video","youtube","youtube-dl","yt-dlp"],"created_at":"2024-07-31T03:01:19.890Z","updated_at":"2025-04-13T13:57:54.842Z","avatar_url":"https://github.com/bibanon.png","language":"Python","funding_links":[],"categories":["Python","[↑](#-table-of-contents) Video Search and Other Video Tools"],"sub_categories":["[↑](#-table-of-contents) Telegram","[↑](#-table-of-contents) GitHub"],"readme":"Tubeup - a multi-VOD service to Archive.org uploader\n==========================================\n\n![Unit Tests](https://github.com/bibanon/tubeup/workflows/Unit%20Tests/badge.svg)\n![Lint](https://github.com/bibanon/tubeup/workflows/Lint/badge.svg)\n\n`tubeup` uses yt-dlp to download a Youtube video (or [any other provider supported by yt-dlp](https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md)), and then uploads it with all metadata to the Internet Archive using the python module internetarchive.\n\nIt was designed by the [Bibliotheca Anonoma](https://github.com/bibanon/bibanon/wiki) to archive single videos, playlists (see warning below about more than 
video uploads) or accounts to the Internet Archive.\n\n## Prerequisites\n\nLinux or another POSIX system (such as macOS) is strongly recommended, preferably on a rented VPS and not your personal machine or phone.\n\nRecommended system specifications:\n- Linux VPS with Python 3.8 or higher and `pipx` installed\n- 2GB of RAM, 100GB of storage or much more for anything other than single short video mirroring. If your OS drive is too small, `symlink` it to something larger.\n\n## Setup and Installation\n\n1. Install `ffmpeg`, `pipx` (typically `python3-pipx` or in Arch `python-pipx`), and git.  \n   To install ffmpeg in Ubuntu, enable the Universe repository.\n\nFor Debian/Ubuntu:\n\n```\n   sudo apt remove yt-dlp ; sudo apt install ffmpeg python3-pipx git\n```\n\nThen run:\n\n```\n   pipx ensurepath\n```\n\n2. Use pipx to install the required Python packages.\n   At a minimum Python 3.9 and up is required (latest Python preferred).\n\n```\n   pipx install \"yt-dlp[default,curl-cffi]\"\n   pipx install internetarchive\n   pipx install tubeup\n```\n\n3. If you don't already have an Internet Archive account, [register for one](https://archive.org/account/login.createaccount.php) to give the script upload privileges.\n\n4. Configure `internetarchive` with your Internet Archive account.\n\n```\n   ia configure\n```\n\nYou will be prompted for your login credentials for the Internet Archive account you use.\n\nOnce configured to upload, you're ready to go.\n\n5. Start archiving a video by running the script on a URL (or multiple URLs) [supported by yt-dlp](https://github.com/yt-dlp/yt-dlp/blob/master/supportedsites.md). For YouTube, this includes account URLs and playlist URLs.\n\n```\n   tubeup \u003curl\u003e\n```\n\n6. Each archived video gets its own Archive.org item. 
Check out what you've uploaded at\n\n   `http://archive.org/details/@YOURUSERNAME`.\n\n\nPeriodically *before* running, upgrade `tubeup` and its dependencies by running:\n\n```\n   pipx upgrade-all\n```\n\n\n## Docker\n\nDockerized tubeup is provided by [etnguyen03/docker-tubeup](https://github.com/etnguyen03/docker-tubeup). Instructions are provided.\n   \n## Windows Setup\n\n1. Install WSL2 and pick a distribution of your choice. Ubuntu is popular and well-supported.\n2. Use Windows Terminal by Microsoft to interact with the WSL2 instance.\n3. Fully update the Linux installation with your package manager of choice.\n   ```sudo apt update ; sudo apt upgrade```\n4. Install Python `pipx` and `ffmpeg`.\n5. Install Tubeup using steps 4-6 in the Linux configuration guide above and configure `internetarchive` for your Archive.org account.\n6. Periodically update your Linux packages and Python packages.\n\n## Usage\n\n```\nUsage:\n  tubeup \u003curl\u003e... [--username \u003cuser\u003e] [--password \u003cpass\u003e]\n                  [--metadata=\u003ckey:value\u003e...]\n                  [--cookies=\u003cfilename\u003e]\n                  [--proxy \u003cprox\u003e]\n                  [--quiet] [--debug]\n                  [--use-download-archive]\n                  [--output \u003coutput\u003e]\n                  [--ignore-existing-item]\n  tubeup -h | --help\n  tubeup --version\n```\n```\nArguments:\n  \u003curl\u003e                         yt-dlp compatible URL to download.\n                                Check yt-dlp documentation for a list\n                                of compatible websites.\n  --metadata=\u003ckey:value\u003e        Custom metadata to add to the archive.org\n                                item.\nOptions:\n  -h --help                    Show this screen.\n  -p --proxy \u003cprox\u003e            Use a proxy while uploading.\n  -u --username \u003cuser\u003e         Provide a username, for sites like Nico Nico Douga.\n  -p --password 
\u003cpass\u003e         Provide a password, for sites like Nico Nico Douga.\n  -a --use-download-archive    Record the video url to the download archive.\n                               This will download only videos not listed in\n                               the archive file. Record the IDs of all\n                               downloaded videos in it.\n  -q --quiet                   Just print errors.\n  -d --debug                   Print all logs to stdout.\n  -o --output \u003coutput\u003e         yt-dlp output template.\n  -i --ignore-existing-item    Don't check if an item already exists on archive.org\n```\n\n## Metadata\n\nYou can specify custom metadata with the `--metadata` flag.\nBy default, this script uploads your video to the [Community Video collection](https://archive.org/details/opensource_movies).\nYou can specify a different collection with the `--metadata` flag:\n\n```\n   tubeup --metadata=collection:opensource_audio \u003curl\u003e\n```\n\nAny arbitrary metadata can be added to the item, with a few exceptions.\nYou can learn more about archive.org metadata [here](https://archive.org/services/docs/api/metadata-schema/).\n\n### Collections\n\nArchive.org users can upload to four open collections:\n\n* [Community Audio](https://archive.org/details/opensource_audio) where the identifier is `opensource_audio`.\n* [Community Software](https://archive.org/details/open_source_software) where the identifier is `open_source_software`.\n* [Community Texts](https://archive.org/details/opensource) where the identifier is `opensource`.\n* [Community Video](https://archive.org/details/opensource_movies) where the identifier is `opensource_movies`.\n\nNote that care should be taken when uploading entire channels.\nRead the appropriate section [in this guide](https://archive.org/about/faqs.php#Collections) for creating collections, and contact the [collections staff](mailto:collections-service@archive.org) if you're uploading a channel or 
multiple channels on one subject (gaming or horticulture for example). Internet Archive collections staff will either create a collection for you or merge any already-uploaded items into a new collection based on the YouTube uploader name.\n\n**Dumping entire channels into Community Video is abusive and may get your account locked.** _Talk to the Internet Archive admins before doing large uploads; it's better to ask for guidance or help first than run afoul of the rules._\n\n**If you do not own a collection you will need to be added as an admin for that collection if you want to upload to it.** Talk to the collection owner or staff if you need assistance with this.\n\n## Troubleshooting\n\n* Some videos are copyright blocked in certain countries. Use the `--proxy` option or a torrenting/privacy VPN to bypass this. Sweden and Germany are good countries to bypass geo-restrictions.\n* Upload taking forever? Getting S3 throttling on upload? Tubeup has specifically been tailored to wait the longest possible time before failing, and we've never seen an S3 outage that outlasted the insane wait times set in Tubeup. Disabling waits for S3 timeouts won't make the upload work; instead, it will leave the downloaded contents on your disk in the downloads folder (`~/.tubeup/downloads`) because the download will immediately fail instead of gracefully waiting. 
The waits are a safeguard in case timeouts occur; do not disable them.\n\n## A note on live videos\n\n- [yt-dlp cannot do simultaneous downloads and cannot prioritize live video over live chat on Youtube](https://github.com/bibanon/tubeup/discussions/283#discussioncomment-5625558). For YouTube, which is what most people use Tubeup for, the only fix would be to disable livechat ripping so that video ripping can start. Even if we accepted that and built in a flag on our end that disables chats to get video (again, unacceptable), it would be canceled out by the next problem.\n\n- [yt-dlp has an unacceptably high failure rate when `--live-from-start` is used](https://github.com/bibanon/tubeup/issues/186#issuecomment-1127698704): sometimes the result doesn't mux, in Twitch's case it is incomplete, and the flag isn't supported by all extractors. The yt-dlp maintainers consider this flag experimental and have said it is unsuitable for archival purposes.\n\nDo not use Tubeup to archive live Youtube (or any other site) video. 
We will not/cannot fix it, it's not even our problem, and any solutions are unpalatable since they involve more code complexity to be maintained on top of having to disable livechat for one extractor only for live video.\n\n## Major Credits (in no particular order)\n\n- [emijrp](https://github.com/emijrp/) who wrote the original [youtube2internetarchive.py](https://code.google.com/p/emijrp/source/browse/trunk/scrapers/youtube2internetarchive.py) in 2012\n- [Matt Hazinski](https://github.com/matthazinski) who forked emijrp's work in 2015 with numerous improvements of his own.\n- Antonizoon for switching the script to library calls rather than functioning as an external script, and many small improvements.\n- Small PRs from various people, both in and out of BibAnon.\n- vxbinaca for stabilizing downloads/uploads in `yt-dlp`/`internetarchive` library calls, cleansing item output, subtitles collection, and numerous small improvements over time.\n- mrpapersonic for adding logic to check if an item already exists in the Internet Archive and, if so, skip ingestion.\n- [Jake Johnson](https://github.com/jjjake) of the Internet Archive for adding variable collections ability as a flag, switching Tubeup from a script to PyPI repository, ISO-compliant item dates, fixing what others couldn't, and many improvements.\n- [Refeed](https://github.com/refeed) for re-basing the code to OOP, turning Tubeup itself into a library, adding download and upload bar graphs, and squashing bugs.\n\n## License (GPLv3)\n\nCopyright (C) 2024 Bibliotheca Anonoma\n\nThis program is free software: you can redistribute it and/or modify\nit under the terms of the GNU General Public License as published by\nthe Free Software Foundation, either version 3 of the License, or\n(at your option) any later version.\n\nThis program is distributed in the hope that it will be useful,\nbut WITHOUT ANY WARRANTY; without even the implied warranty of\nMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  
See the\nGNU General Public License for more details.\n\nYou should have received a copy of the GNU General Public License\nalong with this program.  If not, see \u003chttp://www.gnu.org/licenses/\u003e.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbibanon%2Ftubeup","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbibanon%2Ftubeup","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbibanon%2Ftubeup/lists"}