https://github.com/fracpete/wai-scraper
Webscraping library for the University of Waikato.
- Host: GitHub
- URL: https://github.com/fracpete/wai-scraper
- Owner: fracpete
- License: apache-2.0
- Created: 2020-07-30T21:17:50.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2024-04-22T05:04:56.000Z (about 2 years ago)
- Last Synced: 2025-10-06T16:26:11.525Z (9 months ago)
- Topics: python, selenium
- Language: Python
- Homepage:
- Size: 22.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.rst
- License: LICENSE
README
# wai-scraper
Python webscraping library for the [University of Waikato](https://www.waikato.ac.nz/).
Uses [selenium](https://pypi.org/project/selenium/) and
[selenium-requests](https://pypi.org/project/selenium-requests/) under the hood.
While on campus, [Duo](https://duo.com/) two-factor authentication does not
prompt the user, so the library can run in non-interactive mode (`init_driver(False)`).
Off campus, it must run in interactive mode (`init_driver(True)`)
so that you can tick the *Remember me for 30 days* box and click the *Send me a push* button,
then accept the authentication request on your mobile device.
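The choice between the two modes could be made at start-up, as in the minimal sketch below. The `WAI_OFF_CAMPUS` environment variable and the helpers `needs_interactive` and `open_driver` are hypothetical and not part of *wai.scraper*; only `init_driver` comes from the library, and the sketch assumes its boolean argument selects interactive mode as described above.

```python
import os


def needs_interactive(env=None):
    """Return True if the hypothetical WAI_OFF_CAMPUS variable is set to a truthy value."""
    env = os.environ if env is None else env
    value = env.get("WAI_OFF_CAMPUS", "")
    return value.strip().lower() in {"1", "true", "yes"}


def open_driver(env=None):
    """Start the browser in the mode suggested by the environment (hypothetical helper)."""
    # Imported lazily so needs_interactive() can be used without wai.scraper installed.
    import wai.scraper as ws
    return ws.init_driver(needs_interactive(env))
```

Running a script with `WAI_OFF_CAMPUS=1` set would then request interactive mode, while leaving the variable unset keeps the non-interactive default.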
The use of selenium was inspired by:
https://stackoverflow.com/a/23929939/4698227
## Installation
Create a virtual environment:
```bash
virtualenv -p /usr/bin/python3 venv
```
Install *wai.scraper* in the virtual environment:
```bash
./venv/bin/pip install git+https://github.com/fracpete/wai-scraper.git
```
## Example
The following example logs into the university website via [SSO](https://en.wikipedia.org/wiki/Single_sign-on)
and outputs the HTML content of the staff landing page.
```python
import getpass
import wai.scraper as ws
# initialize logger with debugging output
ws.init_logger(True)
# run Firefox in interactive mode (e.g., when off-campus, for interacting with 2FA)
driver = ws.init_driver(True)
# log in via SSO
user = input("Enter user: ")
pw = getpass.getpass("Enter password: ")
ws.sso(driver, user, pw, delay=15)
url = 'https://www.waikato.ac.nz/landing/staff.shtml'
# obtain staff landing page via selenium
ws.driver_get(driver, "staff landing page", url)
print("--> selenium")
print(driver.page_source)
# close the session
ws.close_driver(driver)
```
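In the example above, the browser stays open if `sso` or `driver_get` raises before `close_driver` is reached. One possible way to guard against that is a small context manager, sketched here. `scraper_session` is a hypothetical helper, not part of *wai.scraper*; it relies only on the `init_driver` and `close_driver` calls shown above and takes the module as a parameter.

```python
from contextlib import contextmanager


@contextmanager
def scraper_session(ws, interactive):
    """Yield a driver from ws.init_driver() and always close it afterwards."""
    driver = ws.init_driver(interactive)
    try:
        yield driver
    finally:
        # Runs on normal exit and when the body of the with-block raises.
        ws.close_driver(driver)
```

With `import wai.scraper as ws`, the body of the example would then sit inside `with scraper_session(ws, True) as driver:`, and the explicit `ws.close_driver(driver)` call at the end would no longer be needed.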