https://github.com/alexmhack/js_driven_scraping

Scraping Javascript Driven Websites Using Python-Selenium

Host: GitHub
URL: https://github.com/alexmhack/js_driven_scraping
Owner: Alexmhack
Created: 2018-09-10T05:04:27.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2018-09-11T15:04:47.000Z (over 7 years ago)
Last Synced: 2025-10-09T02:44:19.152Z (5 months ago)
Topics: beginner-friendly, beginner-project, python-selenium, python-tutorial, python3
Language: Python
Homepage:
Size: 38.1 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# js_driven_scraping

Scraping Javascript Driven Websites Using Python

Run the ```js_scrape.py``` file and you should get fewer images than actually exist on the website. This happens because the page adds content with javascript after the initial HTML has loaded, and python ```requests``` only fetches that initial HTML without running any scripts, so it cannot scrape the rest for us.

For scraping javascript-driven websites we need a more powerful python package, which is [selenium-python](https://selenium-python.readthedocs.io/).
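The gap between what ```requests``` receives and what the browser shows can be illustrated without any network access. The sketch below is only an assumed example: the HTML string, the image names and the ```count_images``` helper are hypothetical, and it uses the standard library's ```html.parser``` in place of BeautifulSoup to stay dependency-free. It counts the ```<img>``` tags present in the raw HTML, which is all a plain HTTP fetch would see.

```python
from html.parser import HTMLParser


class ImageCounter(HTMLParser):
    """Counts the <img> tags found in the markup it is fed."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.count += 1


def count_images(html):
    """Return the number of <img> tags in a string of HTML."""
    parser = ImageCounter()
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical page: one image in the markup, two more added by a script.
STATIC_HTML = """
<html><body>
  <img src="first.png">
  <div id="gallery"></div>
  <script>
    // a browser would run this and add two more images to the page
    for (const name of ["second.png", "third.png"]) {
      const img = document.createElement("img");
      img.src = name;
      document.getElementById("gallery").appendChild(img);
    }
  </script>
</body></html>
"""

# The script is never executed here, so only the first image is counted.
print(count_images(STATIC_HTML))  # prints 1
```

A browser would end up showing three images for this page, while the static count stays at one.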

Using the instructions given in the docs for selenium-python, install selenium and the firefox driver for selenium. Be sure to download geckodriver:

![geckodriver](https://github.com/Alexmhack/js_driven_scraping/blob/master/images/Capture.PNG)

Download the ```32bit``` or ```64bit``` build according to your specs for windows, unzip the folder and add the path of that folder to the ```PATH``` entry in **system variables**.
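A quick way to confirm the driver is reachable is to ask python where it finds it. This is a minimal sketch, assuming the unzipped folder was added to the ```PATH``` variable; the ```find_driver``` helper is a hypothetical name and not part of selenium.

```python
import shutil


def find_driver(name="geckodriver"):
    """Return the full path of the driver executable, or None if it is not on PATH."""
    return shutil.which(name)


path = find_driver()
if path is None:
    # A terminal opened before PATH was edited will not see the change.
    print("geckodriver was not found; check the PATH entry and reopen the terminal")
else:
    print("geckodriver found at", path)
```

If the driver is not found, closing and reopening the terminal after editing the variables is usually the first thing to try.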

Create a new file named ```using_selenium.py```:

```
import requests
from bs4 import BeautifulSoup
from selenium import webdriver

# start a firefox session (needs geckodriver to be reachable)
driver = webdriver.Firefox()
```

When you run the file, a new firefox browser window should open.

To open a url in the browser window, close the firefox browser and add

```
driver.get('https://google.com')
```

at the end of the file. Then run the file again and the browser should open the google website.
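Once the browser has opened a page, its HTML can be read back from the driver and searched for images. The following is a rough sketch rather than the repository's own code: ```rendered_image_sources``` and ```ImageSources``` are hypothetical names, and the standard library's ```html.parser``` stands in for BeautifulSoup. It relies on ```driver.page_source```, which in practice tends to reflect the page after scripts have run, although content that loads late may need an explicit wait.

```python
from html.parser import HTMLParser


class ImageSources(HTMLParser):
    """Collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # skip images without a usable src
                self.sources.append(src)


def rendered_image_sources(driver, url):
    """Open url with the given driver and return the image sources in its HTML."""
    driver.get(url)
    parser = ImageSources()
    # page_source is the HTML of the page as the browser currently holds it
    parser.feed(driver.page_source)
    parser.close()
    return parser.sources
```

With a real browser, create ```driver = webdriver.Firefox()```, call ```rendered_image_sources(driver, 'https://google.com')``` and finish with ```driver.quit()``` to close the window.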
