https://github.com/matheussc017/passmarkwebscraping
Script for extracting data on hardware such as video cards, hard disks and CPUs
- Host: GitHub
- URL: https://github.com/matheussc017/passmarkwebscraping
- Owner: MatheusSC017
- License: mit
- Created: 2022-07-28T09:06:49.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2023-01-11T10:45:13.000Z (over 3 years ago)
- Last Synced: 2025-01-13T00:33:25.767Z (over 1 year ago)
- Topics: pandas, python, webscraping
- Language: Jupyter Notebook
- Homepage:
- Size: 6.25 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Web Scraping - PassMark
This script extracts data on hardware such as video cards, hard disks and CPUs. The data can be obtained raw or cleaned, in two formats:
- Python list
- pandas DataFrame
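
As a rough illustration of the two output formats, the sketch below turns a few made-up rows into either a plain list or a pandas DataFrame, with an optional cleaning step. The sample rows, the column names and the helper names (`clean_row`, `as_list`, `as_dataframe`) are assumptions made for this example and are not taken from the script.

```python
import pandas as pd

# Hypothetical raw rows, as a scraper might return them: every value is a string.
raw_rows = [
    ["Example CPU A", "12,345", "$199.99"],
    ["Example GPU B", "6,789", "NA"],
]

# Hypothetical column names for the example data.
COLUMNS = ["name", "score", "price"]


def clean_row(row):
    """Convert the string fields of one row into numeric types."""
    name, score, price = row
    score = int(score.replace(",", ""))  # "12,345" -> 12345
    # Treat "NA" as a missing price; otherwise strip "$" and thousands separators.
    price = None if price == "NA" else float(price.lstrip("$").replace(",", ""))
    return [name, score, price]


def as_list(rows, clean=False):
    """Return the data as a list of rows, raw or cleaned."""
    return [clean_row(r) for r in rows] if clean else [list(r) for r in rows]


def as_dataframe(rows, clean=False):
    """Return the same data as a pandas DataFrame."""
    return pd.DataFrame(as_list(rows, clean=clean), columns=COLUMNS)
```

Calling `as_list(raw_rows)` keeps the original strings, while `as_dataframe(raw_rows, clean=True)` yields numeric `score` and `price` columns that can be sorted or aggregated directly.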
The class responsible for extracting the data also provides a save() method, which exports the data to an Excel file.
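
An Excel export of this kind can be sketched with pandas' `DataFrame.to_excel`. The function name `save_to_excel`, its parameters and the sheet name are hypothetical; the signature of the script's own save() may differ. Writing `.xlsx` files requires an Excel engine such as openpyxl to be installed.

```python
import pandas as pd


def save_to_excel(df, path="hardware.xlsx"):
    """Write a DataFrame to an Excel workbook and return the path used."""
    # index=False leaves the DataFrame's row index out of the sheet.
    # pandas needs an Excel engine (for example openpyxl) to write .xlsx files.
    df.to_excel(path, index=False, sheet_name="data")
    return path
```

For example, `save_to_excel(pd.DataFrame({"name": ["Example CPU A"], "score": [12345]}), "cpus.xlsx")` would write a one-row workbook to `cpus.xlsx`.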
This project uses the BeautifulSoup4 library to extract data from the PassMark site, respecting the guidelines in the site's robots.txt file. For more details, consult the documentation in the script itself.
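
The general approach can be sketched as follows: check a URL against robots.txt rules with the standard library's `urllib.robotparser`, then parse the page with BeautifulSoup. To keep the example runnable offline, both the robots.txt rules and the HTML are inlined and entirely hypothetical; the real PassMark rules and page markup may look different, and the helper names `is_allowed` and `extract_rows` are assumptions for this example.

```python
from urllib.robotparser import RobotFileParser

from bs4 import BeautifulSoup

# Hypothetical robots.txt rules, inlined so the sketch runs without network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical markup; the real PassMark pages may be structured differently.
SAMPLE_HTML = """
<table id="hardware">
  <tr><th>Name</th><th>Score</th></tr>
  <tr><td>Example CPU A</td><td>12,345</td></tr>
  <tr><td>Example GPU B</td><td>6,789</td></tr>
</table>
"""


def is_allowed(url, user_agent="*"):
    """Return True if the inlined robots.txt rules permit fetching the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def extract_rows(html):
    """Return the text of each data row in the table as a list of strings."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table#hardware tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # the header row has only <th> cells, so it is skipped
            rows.append(cells)
    return rows
```

In a real scraper, the rules would be loaded from the site's own robots.txt (for instance with `RobotFileParser.set_url` and `read`), and a page would be requested only when `can_fetch` returns `True` for its URL.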