https://github.com/sanikamal/web-scraping-atoz
Extract data from websites using Python
beautifulsoup4 data-mining requests scrapy selenium web-scraping
- Host: GitHub
- URL: https://github.com/sanikamal/web-scraping-atoz
- Owner: sanikamal
- Created: 2021-09-26T14:51:19.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-12-31T14:31:15.000Z (about 2 years ago)
- Last Synced: 2025-01-10T03:54:21.667Z (about 11 hours ago)
- Topics: beautifulsoup4, data-mining, requests, scrapy, selenium, web-scraping
- Language: Jupyter Notebook
- Homepage:
- Size: 16.6 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Web Scraping with Python
> ### Web scraping is the process of extracting data from websites via programmatic means.

### What is Web Scraping
`Web Scraping` (also called `Screen Scraping`, `Web Data Extraction`, or `Web Harvesting`) is a technique for extracting large amounts of data from websites and saving it to a local file on your computer or to a database in tabular (spreadsheet-like) format.

### Popular web scraping tools:
- `BeautifulSoup` is a Python library for parsing HTML and XML files and pulling data out of them.
- `Scrapy` is a free, open source application framework for crawling websites and extracting structured data, which can be used for purposes such as data mining, research, information processing, or historical archival.
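
The extract-and-save workflow described above can be sketched without any third-party dependency. The example below is only an illustration, not code from this repository: it uses Python's standard-library `html.parser` in place of `BeautifulSoup` or `Scrapy`, and the `SAMPLE_HTML` page, the `TableParser` class, and the `html_table_to_csv` helper are all hypothetical names made up for the demonstration. A real scraper would first download the page (for example with `requests`) and would usually write the CSV to a file rather than to a string.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this HTML first.
SAMPLE_HTML = """
<table>
  <tr><th>Model</th><th>Price</th></tr>
  <tr><td>Sedan</td><td>18000</td></tr>
  <tr><td>Hatchback</td><td>15500</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # row currently being read, or None outside <tr>
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Parse table rows out of `html` and return (rows, csv_text)."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return parser.rows, buffer.getvalue()


rows, csv_text = html_table_to_csv(SAMPLE_HTML)
print(csv_text)
```

With `BeautifulSoup`, the hand-written parser class would typically be replaced by a few selector calls; the save step stays the same.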
### Contents
- [Scraping a Car Dealer Website](https://github.com/sanikamal/web-scraping-with-python/blob/main/notebook/Web_Scraping_a_Car_Dealer_Website.ipynb)

### Some Useful Links
- [Scrapy documentation](https://scrapy.org/)
- [Scrapinghub website](https://scrapinghub.com/)
- [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)