https://github.com/noureldin2303/web-scraping-using-multithreading
extract data using web scraping with python
Last synced: 2 days ago
- Host: GitHub
- URL: https://github.com/noureldin2303/web-scraping-using-multithreading
- Owner: Noureldin2303
- License: mit
- Created: 2023-02-02T14:56:47.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-04-02T12:57:55.000Z (almost 2 years ago)
- Last Synced: 2023-04-03T10:00:02.183Z (almost 2 years ago)
- Topics: chrome, communityexchange, dataset, datasets, github-campus-experts, github-codespaces, learn, python, scraper, scraping-websites, selenium, selenium-python, selenium-webdriver, web-scraping
- Language: Jupyter Notebook
- Homepage:
- Size: 6.47 MB
- Stars: 4
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: License.md
README
# Web-scraping
**To extract data using web scraping with Python and multithreading:**
1- Find the URL that you want to scrape
2- Inspect the page
3- Find the data you want to extract
4- Write the code
5- Run the code and extract the data
6- Store the data in the required format
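The steps above can be sketched with the standard library alone. This is a minimal illustration, not the repository's code: the helper names (`fetch_page`, `extract_title`, `scrape_all`, `save_rows`) are hypothetical, the extracted field is assumed to be the page title, and CSV is assumed as the required format. A thread pool fetches several URLs at once, which is where multithreading helps, since most of the time is spent waiting on the network.

```python
import csv
import re
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_page(url):
    """Download one page and return its HTML as text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_title(html):
    """Return the text inside the first <title> element, or '' if absent."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else ""


def scrape_all(urls, fetch=fetch_page, max_workers=4):
    """Fetch every URL on a thread pool; results keep the order of `urls`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pages = list(pool.map(fetch, urls))
    return [{"url": url, "title": extract_title(page)} for url, page in zip(urls, pages)]


def save_rows(rows, path):
    """Store the extracted rows as a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["url", "title"])
        writer.writeheader()
        writer.writerows(rows)
```

Calling `save_rows(scrape_all(list_of_urls), "titles.csv")` would run steps 5 and 6 in one go. The `fetch` parameter exists so a different downloader, for example one backed by a browser driver, can be swapped in without touching the threading code.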
**Download the driver for the browser you are using.** For Chrome:
https://sites.google.com/chromium.org/driver/
- Import packages: from selenium import webdriver; from selenium.webdriver.chrome.service import Service; from selenium.webdriver.common.by import By
- Create a Chrome driver instance: driver = webdriver.Chrome(service=Service(r'Path in your computer where you have installed chromedriver'))
- Fetch the webpage: driver.get('URL')
- Parse the webpage using XPath: Data = driver.find_elements(By.XPATH, 'Xpath')
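The Selenium calls listed above can be combined into a small helper. The sketch below assumes Selenium 4, where the chromedriver path is passed through a `Service` object. The function names (`new_chrome_driver`, `extract_texts`, `scrape`) are hypothetical and not taken from the repository. Selenium is imported inside `new_chrome_driver` so that the other helpers can be tried with any object offering `get`, `find_elements`, and `quit`.

```python
def new_chrome_driver(driver_path):
    """Start Chrome through Selenium 4; `driver_path` points at chromedriver."""
    # Imported lazily so the helpers below do not require Selenium to be installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(driver_path))


def extract_texts(driver, url, xpath, by="xpath"):
    """Open `url` and return the text of every element matching `xpath`.

    `by` defaults to "xpath", which is the string value of Selenium's By.XPATH.
    """
    driver.get(url)
    return [element.text for element in driver.find_elements(by, xpath)]


def scrape(url, xpath, make_driver):
    """Create a driver, extract the texts, and always close the browser."""
    driver = make_driver()
    try:
        return extract_texts(driver, url, xpath)
    finally:
        driver.quit()
```

A call might look like `scrape('URL', 'Xpath', lambda: new_chrome_driver(r'Path in your computer where you have installed chromedriver'))`. Building the driver inside `scrape` means each call owns its browser, which is one way to keep drivers separate if several pages are scraped from different threads.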