Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/hhalaby/web-crawling-automation
Educational example codes for web crawling and web monitoring
- Host: GitHub
- URL: https://github.com/hhalaby/web-crawling-automation
- Owner: hhalaby
- License: Unlicense
- Created: 2022-01-16T13:57:13.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-01-22T02:58:59.000Z (almost 3 years ago)
- Last Synced: 2024-06-27T14:31:04.503Z (5 months ago)
- Topics: automation, python, python3
- Language: Python
- Homepage:
- Size: 6.84 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Website Crawling & Automation
A repository for learning to crawl dynamic websites, print pages to PDF, and check websites for available appointments or product stock.
## Links
- [Repo](https://github.com/hhalaby/web-crawling-automation "Web Crawl Repo")
- [Web Crawling Docs](https://github.com/hhalaby/web-crawling-automation)

## General Info
This is a collection of code I have written as small personal projects and wanted to share. Each file contains code for a specific task that I wanted to learn how to automate.
They are the following:
- Reserving an appointment on a website. The sort of appointments that are filled within seconds!
- Downloading PDFs from a **dynamic** website (so. many. clicks.)
- Checking product stock on a website

P.S.: *Good knowledge of Selenium, HTML, and websites in general is needed.*
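As a rough illustration of the last task, here is a minimal Selenium sketch of a product stock check. The URL, CSS selector, and function names are assumptions for illustration, not the actual code or API of this repository:

```python
def is_in_stock(availability_text: str) -> bool:
    """Interpret an availability label scraped from a product page.

    Kept as a pure function so the parsing logic can be tested
    without launching a browser.
    """
    return "out of stock" not in availability_text.strip().lower()


def check_product(url: str, selector: str) -> bool:
    """Load a (possibly dynamic) page and read its availability label."""
    # Selenium is imported lazily so the module loads even where it
    # is not installed; only this function needs a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # run without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        label = driver.find_element(By.CSS_SELECTOR, selector)
        return is_in_stock(label.text)
    finally:
        driver.quit()


if __name__ == "__main__":
    # Placeholder target, not a real product page.
    print(check_product("https://example.com/product", ".availability"))
```

In practice you would loop over `check_product` on a timer and send yourself a notification when it flips to `True`.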
## Disclaimer
Please note that this repository is provided for educational purposes only. Illegal use of web crawling and web monitoring is strongly discouraged. If you need to crawl a site that is not your own, you are strongly encouraged to reach out to the website owners and ask for their express permission.

## Setup
To run this project, follow the steps below:
- Create a Python virtual environment
- Clone the repository into the directory that contains the virtual environment folder
- Install the requirements from the [requirements.txt](https://github.com/hhalaby/web-crawling-automation/blob/main/requirements.txt) file
- **Edit the code to your own needs**
- Import the relevant class from the three available classes. For example:
  ```python
  from check_appointment import CheckAppointment
  ```
- Use the available [commands](https://github.com/hhalaby/web-crawling-automation#commands)

## Current Status
This is an initial commit to get the project rolling. The code should be tailored to your specific needs, as it will most probably not work out of the box.

## Future Updates
- Easier user interaction (maybe through a GUI)
- Customizable commands

## Commands
The following table summarizes the most used commands:
| Command | Code | Description |
|-----------|:-------------:|-------------|
| To be filled | ```To be filled``` | To be filled |