Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/yonathanguez/crawler_python3
Training: this project is dedicated to getting information from websites
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/yonathanguez/crawler_python3
- Owner: YonathanGuez
- Created: 2020-03-03T17:44:53.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T03:43:53.000Z (about 2 years ago)
- Last Synced: 2023-02-28T04:38:36.229Z (almost 2 years ago)
- Language: Python
- Size: 765 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
README
# Beginning Crawling / Scraping with Python 3
This project is dedicated to getting information from websites and to testing libraries such as BeautifulSoup, ChromeDriver, and requests.

## Install
```bash
$ pip install -r requirements.txt
```
## Check the Hierarchy of All Heading Tags
This script checks every Hn heading tag and counts the characters in each one. It applies a technical SEO check: verifying the hierarchy of the Hn tags.
The hierarchy matters for the site's ranking: if the headings are used badly, the site can rank poorly. The script uses only two libraries: requests and BeautifulSoup.
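As an illustration of the check described above, here is a minimal sketch. It is not the code from `check_heading_hierarchy.py`; the function names (`extract_headings`, `find_hierarchy_issues`, `report`) are hypothetical, and the rule "a heading may go at most one level deeper than the previous one" is an assumption about what a sound hierarchy means.

```python
def extract_headings(html):
    """Return (level, text) pairs for every h1..h6 tag, in document order."""
    # Imported here so the hierarchy check below stays usable on plain data.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    tags = soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"])
    return [(int(tag.name[1]), tag.get_text(strip=True)) for tag in tags]


def find_hierarchy_issues(headings):
    """Report headings that skip a level, such as an h4 right after an h2."""
    issues = []
    previous = 0  # 0 = no heading seen yet, so the first one should be h1
    for level, text in headings:
        if level > previous + 1:
            if previous == 0:
                issues.append(f"first heading is h{level}, expected h1: {text!r}")
            else:
                issues.append(f"h{level} follows h{previous}, skipping a level: {text!r}")
        previous = level
    return issues


def report(html):
    """Print each heading with its character count, then any hierarchy issues."""
    headings = extract_headings(html)
    for level, text in headings:
        print(f"h{level} ({len(text)} chars): {text}")
    for issue in find_hierarchy_issues(headings):
        print("WARNING:", issue)
```

With the page fetched through requests, this could be called as `report(requests.get(url).text)`.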
### Configuration:
- Python 3.5
- pip 20

### Run the code:
```
python check_heading_hierarchy.py --url
```
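The script's own argument handling is not shown here. As a rough sketch, a `--url` option like the one above could be read with the standard `argparse` module; the `parse_args` helper and the help texts are assumptions, not taken from the repository.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser(
        description="Check the heading hierarchy of a web page."
    )
    # --url is required: the script has nothing to check without a page address.
    parser.add_argument("--url", required=True, help="address of the page to check")
    return parser.parse_args(argv)
```

The parsed address is then available as `parse_args().url`.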
### Example of how to use it:
Click here :
## Help_archive:
### Chrome
Version: 80.0.3XXXX

### Configuration
- Python 3.5
- pip 20
- ChromeDriver

Examples and tests that helped me build the project:
```
python get_all_anchors.py
python check_heading_hierarchy.py
python get_load_time_chromedrive.py
python get_meta_title.py
python get_requests_status_headers.py
python scrapjs_chrome_headless.py
```