https://github.com/shubhamagrawal1507/playwright-datascraping-validation
This project is a web scraping and data validation tool built with Playwright, pytest, and pandas. It scrapes country data (such as names, capitals, and currencies) from a specified website and validates it against expected results. The project follows the Page Object Model (POM) design pattern for better maintainability and readability.
- Host: GitHub
- URL: https://github.com/shubhamagrawal1507/playwright-datascraping-validation
- Owner: shubhamagrawal1507
- Created: 2024-07-06T09:07:07.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-07-06T09:38:18.000Z (12 months ago)
- Last Synced: 2025-01-22T12:46:17.025Z (5 months ago)
- Topics: data-scraper-framework, data-validation, page-object-model, playwright-python, pytest, pytest-html
- Language: HTML
- Homepage:
- Size: 19.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Playwright Country Data Scraper and Validator
## Description
This project is a web scraping and data validation tool built with Playwright, pytest, and pandas. It scrapes country data (such as names, capitals, and currencies) from a specified website and validates it against expected results. The project follows the Page Object Model (POM) design pattern for better maintainability and readability.
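
As a rough illustration of the Page Object Model mentioned above, a page object for the country page might look like the sketch below. The URL, the CSS selectors, and the `Country` and `CountryPage` names are all hypothetical placeholders, not taken from this repository; only `goto`, `locator`, `all`, and `inner_text` are real Playwright calls.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # needed only for type hints, so the sketch loads without Playwright
    from playwright.sync_api import Page


@dataclass(frozen=True)
class Country:
    name: str
    capital: str
    currency: str


class CountryPage:
    """Page object: keeps the URL and selectors in one place, away from the tests."""

    # Hypothetical URL and selectors (assumptions); replace with the real site's values.
    URL = "https://example.com/countries"
    ROW = "table#countries tbody tr"
    NAME = "td.name"
    CAPITAL = "td.capital"
    CURRENCY = "td.currency"

    def __init__(self, page: Page) -> None:
        self.page = page

    def open(self) -> None:
        """Navigate the browser page to the country listing."""
        self.page.goto(self.URL)

    def countries(self) -> list[Country]:
        """Read every table row and return it as a Country record."""
        rows = self.page.locator(self.ROW).all()
        return [
            Country(
                name=row.locator(self.NAME).inner_text().strip(),
                capital=row.locator(self.CAPITAL).inner_text().strip(),
                currency=row.locator(self.CURRENCY).inner_text().strip(),
            )
            for row in rows
        ]
```

A test would then construct `CountryPage(page)` with a Playwright page, call `open()`, and work with the list returned by `countries()`, so a change to the site's markup only requires updating the selectors in this one class.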
## Features
- Scrapes country data from a website
- Validates scraped data against expected results stored in a CSV file
- Generates detailed HTML test reports using pytest-html
- Automatically reruns failed tests to ensure robustness
## Project Structure
country_scraper/
│
├── data/
│ ├── expected_results.csv # Expected data for validation
│ └── scraped_data.csv # Scraped data from the website
│
├── pages/
│ └── country_page.py # Page Object Model for the country data page
│
├── reports/
│ └── test_report.html # HTML test report
│
├── tests/
│ └── test_country_data.py # Test script for scraping and validation
│
├── src/
│ ├── data_loader.py # Utility functions for loading and saving data
│ └── validation.py # Utility functions for data validation
│
├── README.md # Project documentation
│
├── requirements.txt # List of Python dependencies
│
└── pytest.ini # Pytest configuration file
## Installation
- Clone the repository:

      git clone https://github.com/shubhamagrawal1507/playwright-datascraping-validation.git

- Install the dependencies:

      pip install -r requirements.txt
      playwright install

## Usage
- Run the tests:

      python -m pytest

- View the HTML report:

  After running the tests, open reports/test_report.html in your web browser to view the detailed test report.
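
The comparison these tests make between the scraped data and data/expected_results.csv can be sketched roughly as follows. This is a minimal illustration, not the code in src/validation.py: it uses the standard-library `csv` module instead of pandas to stay dependency-free, and the function names and the `name`, `capital`, and `currency` column names are assumptions.

```python
from __future__ import annotations

import csv
from pathlib import Path


def load_rows(path: str | Path) -> list[dict[str, str]]:
    """Read a CSV file into a list of row dictionaries keyed by column header."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def find_mismatches(
    scraped: list[dict[str, str]],
    expected: list[dict[str, str]],
    key: str = "name",  # assumed column that identifies a country
) -> list[str]:
    """Return human-readable differences; an empty list means the data matches."""
    scraped_by_key = {row[key]: row for row in scraped}
    expected_by_key = {row[key]: row for row in expected}
    problems: list[str] = []

    # Countries present on only one side.
    for name in sorted(expected_by_key.keys() - scraped_by_key.keys()):
        problems.append(f"missing from scraped data: {name}")
    for name in sorted(scraped_by_key.keys() - expected_by_key.keys()):
        problems.append(f"unexpected in scraped data: {name}")

    # Countries present on both sides: compare every expected column.
    for name in sorted(expected_by_key.keys() & scraped_by_key.keys()):
        for column, want in expected_by_key[name].items():
            got = scraped_by_key[name].get(column)
            if got != want:
                problems.append(f"{name}: {column} expected {want!r}, got {got!r}")
    return problems
```

A test could assert that `find_mismatches(load_rows("data/scraped_data.csv"), load_rows("data/expected_results.csv"))` is empty, so that any differences are listed in the failure message shown in the HTML report. Because the function works on plain row dictionaries, it would also accept the output of a pandas `DataFrame.to_dict("records")` call.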