Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-web-scraping
A list of libraries, tools, and APIs for web scraping and data processing. Find everything you need for extracting, managing, and processing data from the web, from HTTP libraries to browser automation tools and proxy services.
https://github.com/luminati-io/awesome-web-scraping
Last synced: 6 days ago
Topics
- Recommended Headless Browsers - A list of the best headless browsers for web scraping.
- R - A collection of R libraries and tools for web scraping, data parsing, automation, and export, with support for HTTP clients, proxy integration, CAPTCHA solving, and user-agent spoofing.
- Rust - A collection of Rust tools and libraries for web scraping, parsing, and data automation, including HTTP clients, proxy integration, CAPTCHA handling, and browser automation.
- Go - A collection of Go tools and libraries for web scraping, parsing, and data automation, including HTTP clients, proxy integration, CAPTCHA solving, serialization, and task scheduling.
- Perl - A collection of Perl tools and libraries for web scraping, data parsing, and automation, with tools for HTTP clients, proxy integration, CAPTCHA solving, and data export.
- Java - A collection of Java tools and libraries for web scraping, parsing, and automation, including HTTP clients, proxy integration, CAPTCHA solving, data processing, and scheduling.
- Web Scraping Guides, Tips, and Tricks - A comprehensive document of web scraping guides, tips, and tricks for efficiently navigating web scraping challenges, handling anti-bot measures, optimizing proxy use, and much more.
- PHP - A collection of PHP libraries, frameworks, and tools for web scraping, data parsing, export, and automation, featuring solutions for proxy integration, CAPTCHA solving, and task scheduling.
- Ruby - A collection of Ruby resources for web scraping, data parsing, and automation, covering libraries for HTTP clients, parsers, proxy integration, CAPTCHA solving, and task scheduling.
- JavaScript - A collection of JavaScript resources for web scraping, data parsing, and automation, featuring libraries for HTTP clients, parsers, proxy integration, CAPTCHA solving, user-agent spoofing, and task scheduling.
- Python - A collection of Python libraries, tools, and frameworks for web scraping, data parsing, export, and processing, with support for anti-bot bypass, proxy integration, and automation.
Recommended CAPTCHA Solving Services
Recommended Proxy Types
- Residential Proxies - The perfect solution for large-scale and complicated projects that require real user IPs.
- Datacenter Proxies - A cost-effective, high-speed solution suitable for large-scale scraping on websites with less strict anti-bot measures.
Free Dataset Samples
- Amazon data
- Facebook data
- Zillow data
- LinkedIn data
- Crunchbase data
- Glassdoor data
- Target data
- Indeed data
- Walmart data
- Airbnb data
- Shopee data
- Shein data
- TikTok data
- Google Maps data
- ZoomInfo data
- Pinterest data
- Twitter data
- B2B data
Popular Web Scraping Videos (Bright Data's Collaborations)
- I turned Tinder into a pet adoption app
- 3 million dollar project ideas for developers
- Overcoming web scraping challenges, price alert monitoring with Puppeteer, NodeJS, and Hono
- What's the best Python web scraping library?
- How to create custom datasets to train LLMs using Bright Data
- Real estate end to end data engineering using AI
- eCommerce web scraping tutorial (Puppeteer, Cheerio, and Node.js)
- How to scrape any website (ft. scraping browser)
- Easiest way to web scraping using Playwright
- The ultimate guide to Python & ChatGPT data analysis
- Build a fullstack SEO rank tracker app with Next.js and Bright Data
- Web Data Masterclass
Keywords
- web-scraping: 25
- api: 23
- datasets: 19
- dataset: 14
- data-analysis: 12
- data: 12
- database: 8
- data-science: 7
- data-mining: 6
- webscraping: 6
- products: 6
- web-scraper: 4
- data-extraction: 4
- amazon: 2
- glassdoor-reviews: 2
- glassdoor-jobs: 2
- glassdoor: 2
- webscraper-api: 2
- crunchbase-scraper: 2
- crunchbase-api: 2
- crunchbase: 2
- ecommerce: 2
- sample: 2
- linkedin: 2
- zillow-scraper: 2
- zillow-house-listings: 2
- zillow: 2
- x: 2
- twitter-scraper: 2
- twitter-api: 2
- twitter: 2
- structured-data: 2
- pinterest-api: 2
- pinterest: 2
- zoominfo: 2
- companies: 2
- business: 2
- b2b: 2
- maps: 2
- google-maps: 2
- shein: 2
- data-abalysis: 2
- shopee: 2
- web-scraper-api: 2
- airbnb-listings: 2
- airbnb: 2
- walmart-scraper: 2
- walmart: 2
- jobs: 2
- indeed: 2