{"id":19860997,"url":"https://github.com/muneeb1030/webscrapper_politifact","last_synced_at":"2025-09-09T11:15:51.728Z","repository":{"id":222282295,"uuid":"753783555","full_name":"Muneeb1030/WebScrapper_Politifact","owner":"Muneeb1030","description":"This initiative seeks to extract and analyze fact-checking data from Politifact.com, providing valuable insights into political statements, rulings, and the evolving information landscape.","archived":false,"fork":false,"pushed_at":"2024-07-28T09:23:57.000Z","size":34,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-01T00:12:50.296Z","etag":null,"topics":["data","data-collection","dataanalysis","python3","scrapy","scrapy-spider","webscraping"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Muneeb1030.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-06T19:34:09.000Z","updated_at":"2024-07-28T09:24:00.000Z","dependencies_parsed_at":"2024-02-13T12:46:38.621Z","dependency_job_id":"4e733851-b765-4d78-af3c-b99820820833","html_url":"https://github.com/Muneeb1030/WebScrapper_Politifact","commit_stats":null,"previous_names":["muneeb1030/webscrapper_politifact"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Muneeb1030/WebScrapper_Politifact","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Muneeb1030%2FWebScrapper_Politifact","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
Muneeb1030%2FWebScrapper_Politifact/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Muneeb1030%2FWebScrapper_Politifact/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Muneeb1030%2FWebScrapper_Politifact/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Muneeb1030","download_url":"https://codeload.github.com/Muneeb1030/WebScrapper_Politifact/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Muneeb1030%2FWebScrapper_Politifact/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274290070,"owners_count":25258094,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-09T02:00:10.223Z","response_time":80,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","data-collection","dataanalysis","python3","scrapy","scrapy-spider","webscraping"],"created_at":"2024-11-12T15:07:46.228Z","updated_at":"2025-09-09T11:15:51.663Z","avatar_url":"https://github.com/Muneeb1030.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Politifact Web Scraping Project\n\n## Overview\n\nUnveiling the intricacies of political discourse, the Politifact Web Scraping Project is a Python-powered endeavor utilizing the Scrapy framework. 
This initiative seeks to extract and analyze fact-checking data from [Politifact.com](https://politifact.com), providing valuable insights into political statements, rulings, and the evolving information landscape.\n\n## Key Features\n\n1. **Data Extraction:** Scrapes author names, statement dates, headlines, rulings, publishers, and article URLs for a comprehensive dataset.\n2. **File Management:** Dynamically creates directories for organized storage of scraped data, ensuring a systematic approach from the project's outset.\n3. **Image Downloads:** Utilizes Scrapy's image pipeline for downloading header images, enhancing the visual context of each article.\n4. **Efficient CSV Handling:** Writes rows to CSV at regular intervals to prevent data loss and reduce memory use during asynchronous requests.\n\n\n## Requirements\n- **Python 3.x**\n- **Scrapy**\n- **Requests**\n- **Pandas**\n\n## Getting Started\n1. **Clone the Repository:**\n    ```\n    git clone https://github.com/Muneeb1030/WebScrapper_Politifact.git\n    ```\n\n2. **Install Dependencies:**\n    ```\n    pip install scrapy pandas requests\n    ```\n\n\n3. **Run the Scraper:**\n    ```\n    scrapy crawl politifact\n    ```\n\n## Additional Information\n- **Customization:**\n    - Tailor the scraper to your needs by modifying the Scrapy spiders.\n- **GitHub Repository:**\n    - Explore, contribute, and stay updated on the [GitHub repository](https://github.com/Muneeb1030/WebScrapper_Politifact.git).\n\n\n## Disclaimer\nThis project is intended for educational purposes and strictly adheres to Politifact's terms of service. 
Users are advised to deploy the scraper responsibly and in compliance with platform policies.\n\n## Additional Resources\n\nExplore the project in detail through my [Medium blog](https://medium.com/@m.muneeb.ur.rehman.2000/fact-checking-the-fact-checkers-scraping-politifact-com-for-political-truths-with-pythons-scrapy-fcfa42f5bcf2), where I share insights, motivation, and in-depth explanations about the Politifact Scraper.\n\n## Contributors\n- M Muneeb ur Rehman\n\nFeel free to fork, contribute, and enhance the capabilities of this Politifact scraper. Happy scraping! 🌐💻\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmuneeb1030%2Fwebscrapper_politifact","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmuneeb1030%2Fwebscrapper_politifact","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmuneeb1030%2Fwebscrapper_politifact/lists"}