https://github.com/trilokida/web_scraping

Host: GitHub
URL: https://github.com/trilokida/web_scraping
Owner: TrilokiDA
Created: 2018-09-11T16:00:08.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2019-12-04T05:01:24.000Z (over 5 years ago)
Last Synced: 2025-01-17T03:16:53.918Z (6 months ago)
Topics: beautifulsoup, bs4, extract, information-extraction, machine-learning, python, requests, webscraper-website, webscraping
Language: Jupyter Notebook
Homepage:
Size: 14.6 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Web_Scraping

---

**Web Scraping** (also called ***Screen Scraping***, ***Web Data Extraction***, or ***Web Harvesting***) is a technique for extracting large amounts of data from websites and saving it to a local file on your computer or to a database in tabular (spreadsheet) format.

Most websites display data that can only be viewed in a web browser, and they offer no way to save a copy of that data for personal use. The only option then is to copy and paste the data manually, a tedious job that can take many hours or even days. **Web Scraping** automates this process: instead of copying the data from websites by hand, web scraping software performs the same task in a fraction of the time.
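As a rough illustration of the idea, the sketch below turns an HTML table into CSV rows. It is not taken from this repository's notebooks: the names `TableParser`, `scrape_table`, `rows_to_csv`, and `SAMPLE_HTML` are hypothetical, and the page content is an inline sample rather than a real website. It uses only the Python standard library so that it runs without installing anything; the `requests` and `beautifulsoup` packages listed in the repository topics would typically handle the fetching and parsing steps in a real scraper.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard any cell left open by malformed HTML


def scrape_table(html):
    """Return the table rows found in an HTML string."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize rows to CSV text (spreadsheet format)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample page; a real scraper would download this HTML first.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>19.50</td></tr>
</table>
"""

if __name__ == "__main__":
    print(rows_to_csv(scrape_table(SAMPLE_HTML)), end="")
```

Writing the returned text to a `.csv` file gives the "local file in spreadsheet format" described above. Before scraping a real site, check its terms of use and `robots.txt`.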
