https://github.com/maxralph1/webscraper-python
python webscraper
- Host: GitHub
- URL: https://github.com/maxralph1/webscraper-python
- Owner: maxralph1
- Created: 2023-08-06T02:08:01.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-08-06T04:35:41.000Z (almost 3 years ago)
- Last Synced: 2025-03-03T03:26:39.043Z (over 1 year ago)
- Topics: python, webscraper
- Language: Python
- Homepage:
- Size: 66.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Webscraper Project
This is a web scraper project written in Python.
## Installation
### Virtual Environment
First, create a virtual environment if you do not want to install the required dependencies globally.
Run one of the following commands in the directory where you intend to house your application.
```
py -m venv venv
```
OR
```
python -m venv venv
```
Then, activate the virtual environment by running:
On Windows:
```
.\venv\Scripts\activate
```
OR on GitBash (on Mac / Linux, the activation script is at `./venv/bin/activate` instead):
```
source ./venv/Scripts/activate
```
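To confirm that activation worked, a quick check from Python can help. The sketch below is not part of the project; the function name `in_virtual_environment` is hypothetical, and the check relies only on the standard library's `sys.prefix` and `sys.base_prefix`, which differ when the interpreter runs inside a virtual environment.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter prefix:", sys.prefix)
```

If this reports `False` after activation, the shell is probably still using the global interpreter.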
### Install the dependencies:
```
pip install -r requirements.txt
```
### Optional: Global Installation of Dependencies
You may instead install the dependencies globally, skipping the virtual environment and requirements.txt steps above.
If so, run these commands in any order:
```
pip install beautifulsoup4
```
```
pip install lxml
```
```
pip install requests
```
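Whichever installation route you take, it can be useful to verify that the three packages are importable before running the scraper. The following is a minimal sketch, not a script from this repository: `REQUIRED_MODULES` and `missing_modules` are hypothetical names, and the list assumes the import names `bs4` (for beautifulsoup4), `lxml`, and `requests`.

```python
from importlib.util import find_spec

# Import names for the packages listed above; note that the
# beautifulsoup4 package is imported as "bs4".
REQUIRED_MODULES = ("bs4", "lxml", "requests")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

Run it with the same interpreter you will use for the scraper, so the check reflects the environment that is actually active.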
## To run the webscraper:
Use either of the following commands, depending on your operating system (`py` is the Python launcher on Windows).
```
py index.py
```
OR
```
python index.py
```
## To run the webscraper and have the results saved to a file:
Use either of the following commands, depending on your operating system (`py` is the Python launcher on Windows).
```
py file_index.py
```
OR
```
python file_index.py
```