Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/yonathanguez/crawler_python3

Training: This project is dedicated to getting some information from websites

Last synced: about 1 month ago

Host: GitHub
URL: https://github.com/yonathanguez/crawler_python3
Owner: YonathanGuez
Created: 2020-03-03T17:44:53.000Z (almost 5 years ago)
Default Branch: master
Last Pushed: 2022-12-08T03:43:53.000Z (about 2 years ago)
Last Synced: 2023-02-28T04:38:36.229Z (almost 2 years ago)
Language: Python
Size: 765 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md

README

# Beginning Crawler / Scraping with Python 3
This project is dedicated to extracting information from websites and testing libraries such as BeautifulSoup, ChromeDriver, and requests.

## Install
```bash
$ pip install -r requirements.txt
```
## Check the Hierarchy of All Heading Tags
This script checks every Hn heading tag on a page and reports the number of characters in each one.

It applies a technical SEO practice: checking the hierarchy of the Hn tags.

The heading hierarchy matters for a site's ranking: if headings are used poorly, the site can rank badly.

This script uses only the requests and BeautifulSoup libraries.
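
The check described above can be sketched as follows. This is a minimal illustration, not the code of `check_heading_hierarchy.py`: it assumes that "checking the hierarchy" means flagging a heading that jumps more than one level deeper than the previous one, and it swaps in the standard library's `html.parser` for requests and BeautifulSoup so that it runs without dependencies. The names `HeadingCollector` and `find_heading_problems` are hypothetical.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1..h6 element, in page order."""

    def __init__(self):
        super().__init__()
        self.headings = []   # finished headings as (level, text)
        self._level = None   # level of the heading currently open, if any
        self._chunks = []    # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._chunks = []

    def handle_data(self, data):
        if self._level is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            # Collapse runs of whitespace so the character count is meaningful.
            text = " ".join("".join(self._chunks).split())
            self.headings.append((self._level, text))
            self._level = None


def find_heading_problems(html):
    """Return (headings, problems) for an HTML string.

    Assumption: a heading is a problem when it is more than one level
    deeper than the heading before it (for example h2 followed by h4).
    """
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    problems = []
    previous = 0  # 0 means "no heading seen yet", so the page should open with h1
    for level, text in parser.headings:
        if level > previous + 1:
            problems.append(f"h{level} '{text}' skips a level (previous level: {previous})")
        previous = level
    return parser.headings, problems


if __name__ == "__main__":
    sample = "<h1>Title</h1><h2>Intro</h2><h4>Too deep</h4>"
    headings, problems = find_heading_problems(sample)
    for level, text in headings:
        print(f"h{level} ({len(text)} characters): {text}")
    for problem in problems:
        print("Problem:", problem)
```

To apply this to a live page, the HTML string could come from `requests.get(url).text`, which is closer to what the README describes.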

### Configuration:
- Python 3.5
- pip 20

### Run the code :
```
python check_heading_hierarchy.py --url
```
### Example of how to use it:
Click here :

## Help_archive:
### Chrome
version : 80.0.3XXXX

### Configuration
- Python 3.5
- pip 20
- ChromeDriver

Example and test scripts that helped me build the project:
```
python get_all_anchors.py
python check_heading_hierarchy.py
python get_load_time_chromedrive.py
python get_meta_title.py
python get_requests_status_headers.py
python scrapjs_chrome_headless.py
```
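
As an illustration of the kind of task these helper scripts cover, here is a minimal sketch of collecting every anchor on a page. It is not the code of `get_all_anchors.py`, whose contents are not shown here: the names `AnchorCollector` and `collect_anchor_urls` are hypothetical, and the standard library's `html.parser` stands in for BeautifulSoup so the example runs without dependencies.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:  # skip anchors without an href (or with an empty one)
            self.links.append(urljoin(self.base_url, href))


def collect_anchor_urls(html, base_url):
    """Return absolute URLs for all anchors in an HTML string, in page order."""
    parser = AnchorCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = '<a href="/about">About</a> <a href="https://example.org/">External</a>'
    for url in collect_anchor_urls(page, "https://example.com/blog/"):
        print(url)
```

Resolving each `href` with `urljoin` turns relative links into absolute URLs, which makes the result usable as input for a later crawl step.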