{"id":28222130,"url":"https://github.com/charlescol/news-domains-x-tracker","last_synced_at":"2025-06-11T16:32:02.336Z","repository":{"id":274227481,"uuid":"922287416","full_name":"charlescol/news-domains-X-tracker","owner":"charlescol","description":"An auto-refreshing dataset of major news domains and their X (formerly Twitter) accounts, complete with real-time stats like follower counts and engagement metrics. Made for tracking media trends and analytics at scale.","archived":false,"fork":false,"pushed_at":"2025-06-03T16:34:56.000Z","size":4180,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-04T02:09:04.453Z","etag":null,"topics":["news","news-aggregator","news-analysis","news-analytics","newsapi","trading","tradingbot","twitter","twitter-api","twitter-bot","twitter-client","twitter-sentiment-analysis","x","x-api"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/charlescol.png","metadata":{"files":{"readme":"README.md","changelog":"news-domains-x.csv","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-25T20:01:06.000Z","updated_at":"2025-06-03T16:34:58.000Z","dependencies_parsed_at":"2025-02-10T21:28:16.199Z","dependency_job_id":"424e1557-7051-4e9a-aece-23f780398a21","html_url":"https://github.com/charlescol/news-domains-X-tracker","commit_stats":null,"previous_names":["charlescol/us-news-domains-twitter-tracker","charlescol/news-domains-x-tracker"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/char
lescol/news-domains-X-tracker","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/charlescol%2Fnews-domains-X-tracker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/charlescol%2Fnews-domains-X-tracker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/charlescol%2Fnews-domains-X-tracker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/charlescol%2Fnews-domains-X-tracker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/charlescol","download_url":"https://codeload.github.com/charlescol/news-domains-X-tracker/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/charlescol%2Fnews-domains-X-tracker/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259297727,"owners_count":22836434,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["news","news-aggregator","news-analysis","news-analytics","newsapi","trading","tradingbot","twitter","twitter-api","twitter-bot","twitter-client","twitter-sentiment-analysis","x","x-api"],"created_at":"2025-05-18T06:10:05.841Z","updated_at":"2025-06-11T16:32:02.330Z","avatar_url":"https://github.com/charlescol.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![GitHub last commit](https://img.shields.io/github/last-commit/charlescol/news-domains-X-tracker)\n![GitHub repo size](https://img.shields.io/github/repo-size/charlescol/news-domains-X-tracker)\n![GitHub 
issues](https://img.shields.io/github/issues/charlescol/news-domains-X-tracker)\n![GitHub license](https://img.shields.io/github/license/charlescol/news-domains-X-tracker)\n\n# US \u0026 International News Domains Twitter Stats Tracker\n\n## **Project Overview**\n\nThis project compiles a list of major **news domains** along with their associated X (formerly Twitter) accounts. The repository includes an **auto-refreshing job** that fetches up-to-date statistics for these X accounts, such as follower count, tweet activity, and engagement metrics. You can find the dataset in **news-domains-x.csv**.\n\n- The top 100 accounts (sorted by followers) are updated daily.\n- The remaining records are updated in daily batches of 300.\n\nThe current dataset contains around 1550 news domains (\u003e10k followers) collected from multiple sources and will be continuously enriched and updated.\n\nThis project uses **multiple free-tier** X API accounts to implement its refresh strategy. Each API account can retrieve data for up to 100 X accounts daily, a limit imposed by the X API.\n\n---\n\n## **Auto-Update Process**\n\nThe project uses **GitHub Actions** to automatically update the statistics for tracked X accounts:\n\n### **Workflow:**\n\n1. **Job 1 (Real-time priority refresh):**\n   - Updates the **top 100 most-followed accounts** daily.\n2. **Job 2 \u0026 Job 3 \u0026 Job 4 (Incremental updates):**\n\n   - These jobs run in parallel, each processing a batch of 100 accounts. With 3 tokens currently available, **records are updated daily in batches of 300**.\n   - Progress is tracked in a JSON file (`state/progress.json`) to ensure no accounts are skipped.\n\n3. **Reordering and Cleaning:**\n\n   - Once the entire list has been processed, it is re-sorted by follower count.\n   - Inactive or suspended accounts are removed automatically.\n\n4. 
**Commit to GitHub:**\n   - The updated data is committed back to the repository, ensuring the latest statistics are always available.\n\n---\n\n## **Current Data Sources**\n\nThe data currently used in this project has been sourced from the following repositories:\n\n1. [ercexpo/us-news-domains](https://github.com/ercexpo/us-news-domains)\n2. [palewire/news-homepages](https://github.com/palewire/news-homepages)\n\nMore sources will be added over time.\n\n---\n\n## **Contributing**\n\nWe welcome contributions to expand the dataset and improve automation workflows. Feel free to submit issues and pull requests.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcharlescol%2Fnews-domains-x-tracker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcharlescol%2Fnews-domains-x-tracker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcharlescol%2Fnews-domains-x-tracker/lists"}