{"id":28961841,"url":"https://github.com/official-imvoiid/multifetch","last_synced_at":"2026-05-19T19:10:19.954Z","repository":{"id":300115807,"uuid":"1005247497","full_name":"official-imvoiid/MultiFetch","owner":"official-imvoiid","description":"A high-performance web scraper for bulk image and GIF extraction from reliable sources — built for AI/ML data pipelines and large-scale media collection","archived":false,"fork":false,"pushed_at":"2025-06-19T23:39:50.000Z","size":48,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-20T00:32:14.544Z","etag":null,"topics":["aiml","data","dataset","gifscraper","imagescraper","python","pythontool","tools","webscraper","windows"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/official-imvoiid.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-19T23:31:44.000Z","updated_at":"2025-06-19T23:39:53.000Z","dependencies_parsed_at":"2025-06-20T00:42:18.432Z","dependency_job_id":null,"html_url":"https://github.com/official-imvoiid/MultiFetch","commit_stats":null,"previous_names":["official-imvoiid/multifetch"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/official-imvoiid/MultiFetch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/official-imvoiid%2FMultiFetch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/official-imvoiid%2FMultiFetch/tags","releases_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/official-imvoiid%2FMultiFetch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/official-imvoiid%2FMultiFetch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/official-imvoiid","download_url":"https://codeload.github.com/official-imvoiid/MultiFetch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/official-imvoiid%2FMultiFetch/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266590719,"owners_count":23952987,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-22T02:00:09.085Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aiml","data","dataset","gifscraper","imagescraper","python","pythontool","tools","webscraper","windows"],"created_at":"2025-06-24T02:05:36.676Z","updated_at":"2026-05-19T19:10:19.921Z","avatar_url":"https://github.com/official-imvoiid.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# WebScap CLI\n\nA powerful command-line image scraping tool designed for AI/ML research, dataset creation, and personal use. 
WebScap provides easy access to images from multiple platforms to help build comprehensive datasets for AI model training.\n\n## 🚀 Motivation\n\nThis project was born from the challenge of finding quality datasets for AI Text-To-Video model training. WebScap solves this problem by providing a simple, efficient way to gather large image datasets from various platforms, enabling:\n\n- **Training Data Collection**: Build robust datasets for AI model training\n- **Topic Understanding**: Help AI models understand specific subjects through visual data\n- **Vision Capability Enhancement**: Improve model performance by including diverse image datasets\n\n## ✨ Features\n\nWebScap supports the following platforms and tools:\n\n- **Pinterest** - Creative inspiration and lifestyle images\n- **DeviantArt** - Digital art and creative content\n- **Pixiv Art** - Japanese illustration and artwork (Requires PHPSESSID)\n- **Civitai** - AI-generated art and models (Requires API)\n- **Google Images** - Comprehensive web image search\n- **WebScap GIF** - Specialized GIF collection and animation scraping\n- **StaticPage** - Extract images from static websites and HTML pages\n- **Image Upscaler** - Enhance image quality automatically\n- **Image Converter** - Convert images between different formats\n\n## 📋 Requirements\n\n### System Requirements\n- **Operating System**: Windows 11\n- **Python**: Version 3.8+\n- **Browser**: Google Chrome (installed and set as default)\n\n### API Requirements\n- **Pixiv**: PHPSESSID token required\n- **Civitai**: API key required\n\n## 📊 Performance\n\n- **Tested Capacity**: Successfully scraped 1,700+ images\n- **API Calls**: Handles 200+ API requests efficiently\n- **GIF Support**: Optimized for animated content collection\n- **Static Pages**: Efficiently extracts images from HTML/CSS structures\n- **Scalability**: Potentially supports larger volumes (untested)\n\n## 🔒 Content Policy \u0026 NSFW Handling\n\nWebScap respects platform-specific content 
policies and user preferences:\n\n### NSFW Content Management\n- **Default Behavior**: Platforms maintain their original NSFW/SFW structure\n- **User Control**: Content filtering depends on your platform account settings\n- **Safe Mode**: Enable \"Safe=ON\" in your account settings to avoid NSFW content on supported platforms\n- **Platform Respect**: No modification of platform content policies - choice remains with users\n\n### Supported Platforms NSFW Policy\n- ✅ **Google Images**: Follows your SafeSearch settings\n- ✅ **Pinterest**: Default Safe Setting is on  \n- ✅ **DeviantArt**: Default Safe Setting is on\n- ✅ **Pixiv**: Follows account content filters\n- ✅ **Civitai**: Respects platform content settings\n\n## ⚠️ Important Disclaimer\n\n**Developer Responsibility Notice**: \nThe developers are not responsible for user actions. Please use this tool responsibly and ethically.\n\n### Acceptable Use\n✅ **Permitted Uses:**\n- AI/ML research and development\n- Academic research projects\n- Personal dataset creation\n- Fair use educational purposes\n\n❌ **Prohibited Uses:**\n- Commercial redistribution without permission\n- Violation of platform Terms of Service\n- Copyright infringement\n- Malicious or harmful activities\n\n### Legal Compliance\n- Always follow platform Terms of Service\n- Respect copyright and intellectual property rights\n- Use scraped content within fair use guidelines\n- Ensure compliance with local laws and regulations\n\n## 🛠️ Installation\n\n```bash\n# Clone the repository\ngit clone https://github.com/official-imvoiid/MultiFetch.git\n\n# Navigate to project directory\ncd MultiFetch\n\n# Install dependencies\npip install -r requirements.txt\n```\n\n## 🔧 Configuration\n\n### Required Setup\n1. Ensure Chrome is installed and set as default browser\n2. 
Obtain necessary API keys/tokens:\n   - **Pixiv**: Get your PHPSESSID from browser cookies\n   - **Civitai**: Register and obtain API key\n\n### Platform Account Settings\nFor optimal results and content filtering:\n1. Configure your account settings on each platform\n2. Set appropriate content filters (Safe=ON for family-friendly content)\n3. Adjust privacy and content preferences as needed\n\n## 📜 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n**Remember**: Always scrape responsibly and ethically. Respect platform terms of service and copyright laws.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fofficial-imvoiid%2Fmultifetch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fofficial-imvoiid%2Fmultifetch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fofficial-imvoiid%2Fmultifetch/lists"}