{"id":24511027,"url":"https://github.com/devcybiko/yt-dlp-fun","last_synced_at":"2025-03-15T09:43:10.772Z","repository":{"id":272223341,"uuid":"915867982","full_name":"devcybiko/yt-dlp-fun","owner":"devcybiko","description":null,"archived":false,"fork":false,"pushed_at":"2025-01-18T15:31:43.000Z","size":460,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-08T21:41:27.799Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/devcybiko.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-13T01:54:59.000Z","updated_at":"2025-01-18T15:31:45.000Z","dependencies_parsed_at":"2025-01-13T03:30:46.978Z","dependency_job_id":"20e52a9c-80b6-42b0-9c45-6c60c91071a0","html_url":"https://github.com/devcybiko/yt-dlp-fun","commit_stats":null,"previous_names":["devcybiko/yt-dlp-fun"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devcybiko%2Fyt-dlp-fun","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devcybiko%2Fyt-dlp-fun/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devcybiko%2Fyt-dlp-fun/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devcybiko%2Fyt-dlp-fun/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/devcybiko","download_url":"https://codeload.github.com/devcybiko/yt-dlp-fun/tar.
gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243713399,"owners_count":20335566,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-22T00:36:03.913Z","updated_at":"2025-03-15T09:43:10.748Z","avatar_url":"https://github.com/devcybiko.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# YT-DLP-FUN\n\n- A set of bash scripts to download favorite youtube subscriptions, extract the audio, and send them to a personal podcast channel for easy review\n\n## TODO\n\n- don't process LIVE or overly long videos / audios\n- purging of audio/video folder\n- set a lower limit on videos downloaded (avoiding shorts)\n- set the # of days (currently hard-coded to 7 days prior)\n- manage purging of reviewed audios/videos\n- verify that duplicates do not occur\n- downloads are in .webm format, is that a concern?\n- performance\n  - it takes about 5 secs per channel, and of course download rates when there are new videos\n  - can we download just the audio?\n- error handling.\n  - some of the scripts log to a log file, others not so much\n  - should we stop if there's an error? currently we push along\n\n## Running\n\n- The scripts also depend upon the AWS CLI being installed and configured. 
Installation is beyond the scope of this document\n- `brew install yt-dlp` # one-time setup\n- `source ./scripts/source.me`\n    - sets environment vars, path\n- `cd podcasts/drfrancintosh`\n- `all.sh`\n\n## Folder Structure\n\n- ./podcasts/ - a parent folder for all podcasts to be created, one per \"channel\"\n  - [podcast name]/\n    - videos/ - a folder to hold downloaded videos\n      - [channel] - a folder per channel that is subscribed to\n    - audios/ - a folder to hold extracted audios\n      - [channel] - a folder per channel that is subscribed to\n    - archive.txt - a file of videos previously downloaded that should not be re-downloaded\n    - artwork.jpg - the artwork for the podcast rss feed\n    - subscriptions.csv - the output from Google Takeout for all your subscriptions (optionally sorted)\n    - variables.env - variables that drive the scripts, customized for this podcast\n    - rss.xml - the generated rss feed\n\n## Secrets.env\n\n- You should keep any passwords, etc. in a ~/secrets.env file and source that file\n- Don't keep secrets in this folder\n\n## AWS S3\n\n- this project expects that you have access to S3 buckets\n- the bucket should\n  - Have ACLs turned off\n  - Have `Static website hosting` enabled\n  - Have the `Bucket Policy Permissions` set as below:\n- the `variables.env` file should set\n  - s3_bucket     # Replace with your S3 bucket name\n  - s3_folder     # Folder inside the S3 bucket to store files\n- all audio files and the rss.xml will be delivered to ${s3_bucket}/${s3_folder}/\n- NOTE: files are stored in a flat organization with [channel_name]:[file_name]\n\n```json\n{\n    \"Version\": \"2012-10-17\",\n    \"Statement\": [\n        {\n            \"Sid\": \"PublicReadForStaticWebsite\",\n            \"Effect\": \"Allow\",\n            \"Principal\": \"*\",\n            \"Action\": \"s3:GetObject\",\n            \"Resource\": \"arn:aws:s3:::[your-bucket-name]/*\"\n        }\n    ]\n}\n```\n## For SYNC 
capabilities\n\n```json\n{\n  \"Version\": \"2012-10-17\",\n  \"Statement\": [\n    {\n      \"Sid\": \"FullAccessForSelf\",\n      \"Effect\": \"Allow\",\n      \"Principal\": {\n        \"AWS\": \"arn:aws:iam::YOUR_ACCOUNT_ID:user/YOUR_USER_NAME\"\n      },\n      \"Action\": [\n        \"s3:ListBucket\",\n        \"s3:GetObject\",\n        \"s3:PutObject\",\n        \"s3:DeleteObject\"\n      ],\n      \"Resource\": [\n        \"arn:aws:s3:::your-bucket-name\",\n        \"arn:aws:s3:::your-bucket-name/*\"\n      ]\n    },\n    {\n      \"Sid\": \"PublicReadForStaticWebsite\",\n      \"Effect\": \"Allow\",\n      \"Principal\": \"*\",\n      \"Action\": \"s3:GetObject\",\n      \"Resource\": \"arn:aws:s3:::your-bucket-name/*\"\n    }\n  ]\n}\n\n```\n## Scripts\n\n- all.sh - run all the scripts\n  - yt-download.sh - download the most recent youtube videos from `subscriptions.csv` and store them in `./videos/channel-name`\n  - split-audios.sh - split the audio out of the videos and store in `./audios/channel-name`\n  - upload-files-to-s3.sh - send the files to the S3 bucket\n  - upload-rss-feed-to-s3.sh - send the rss feed to the S3 bucket\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevcybiko%2Fyt-dlp-fun","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdevcybiko%2Fyt-dlp-fun","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevcybiko%2Fyt-dlp-fun/lists"}