Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/hhalaby/web-crawling-automation
Educational example codes for web crawling and web monitoring
- Host: GitHub
- URL: https://github.com/hhalaby/web-crawling-automation
- Owner: hhalaby
- License: Unlicense
- Created: 2022-01-16T13:57:13.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-01-22T02:58:59.000Z (almost 3 years ago)
- Last Synced: 2024-06-27T14:31:04.503Z (5 months ago)
- Topics: automation, python, python3
- Language: Python
- Homepage:
- Size: 6.84 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Website Crawling & Automation
A repository for learning to crawl dynamic websites, print pages to PDF, and check websites for available appointments or product stock.
## Links
- [Repo](https://github.com/hhalaby/web-crawling-automation "Web Crawl Repo")
- [Web Crawling Docs](https://github.com/hhalaby/web-crawling-automation)

## General Info
This is a collection of code I have written as small personal projects and wanted to share. Each file contains code for a specific task that I wanted to learn how to automate.
They are the following:
- Reserving an appointment on a website. The sort of appointments that are filled within seconds!
- Downloading PDFs from a **dynamic** website (so. many. clicks.)
- Checking product stock on a website

P.S.: *Good knowledge of Selenium, HTML, and websites in general is needed.*
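As a rough illustration of the last task, here is a minimal Selenium sketch of a product stock check. The URL, CSS selector, and function names are assumptions for illustration, not the actual code or API of this repository:

```python
def is_in_stock(availability_text: str) -> bool:
    """Interpret an availability label scraped from a product page.

    Kept as a pure function so the parsing logic can be tested
    without launching a browser.
    """
    return "out of stock" not in availability_text.strip().lower()


def check_product(url: str, selector: str) -> bool:
    """Load a (possibly dynamic) page and read its availability label."""
    # Selenium is imported lazily so the module loads even where it
    # is not installed; only this function needs a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # run without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        label = driver.find_element(By.CSS_SELECTOR, selector)
        return is_in_stock(label.text)
    finally:
        driver.quit()


if __name__ == "__main__":
    # Placeholder target, not a real product page.
    print(check_product("https://example.com/product", ".availability"))
```

In practice you would loop over `check_product` on a timer and send yourself a notification when it flips to `True`.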
## Disclaimer
Please note that this repository is provided for educational purposes only. Illegal use of web crawling and web monitoring is strongly discouraged. If you need to crawl a site that is not your own, you are strongly encouraged to reach out to the website owners and ask for their express permission.

## Setup
To run this project, follow the steps below:
- Create a Python virtual environment
- Clone the repository into the directory that contains the virtual environment folder
- Install the requirements from the [requirements.txt](https://github.com/hhalaby/web-crawling-automation/blob/main/requirements.txt) file
- **Edit the code to your own needs**
- Import the relevant class from the three available classes. For example:
  ```python
  from check_appointment import CheckAppointment
  ```
- Use the available [commands](https://github.com/hhalaby/web-crawling-automation#commands)

## Current Status
This is an initial commit to get the project rolling. The code should be tailored to your specific needs, as it will most probably not work out of the box.

## Future Updates
- Easier user interaction (maybe through a GUI)
- Customizable commands

## Commands
The following table summarizes the most used commands:
| Command | Code | Description |
|-----------|:-------------:|-------------|
| To be filled | ```To be filled``` | To be filled |