https://github.com/dubzzz/py-seo-helper

Scan a website and give some points that could be modified

Host: GitHub
URL: https://github.com/dubzzz/py-seo-helper
Owner: dubzzz
License: mit
Created: 2014-04-07T23:17:36.000Z (over 11 years ago)
Default Branch: master
Last Pushed: 2016-09-26T22:57:05.000Z (about 9 years ago)
Last Synced: 2025-01-25T22:38:22.364Z (9 months ago)
Language: Python
Size: 3.59 MB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# SEO Helper

## Search Engine Optimization Tool

SEO Helper is a search engine optimization tool. It runs tests on several features that a top-ranked website is expected to have.

It points out what can be improved in an attempt to raise the website's ranking in search engines.

## Tested functionalities

SEO Helper tests 27 points that can enhance your ranking. Among those:
+ robots.txt exists?
+ broken links
+ broken resources
+ duplicated title
+ duplicated description
+ balance between internal and external links
+ at least one h1 tag
+ title and description meta
+ 3-click rule

All these tests are defined in the init method of the WebSite class. Results can then be exported to PDF and sent by email.
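
To illustrate one of the checks listed above, here is a minimal sketch of how a 3-click rule could be verified with a breadth-first search. It is not taken from the project's code: the function names `click_depths` and `pages_beyond` and the `site` graph are hypothetical, and the sketch assumes the crawler has already produced a mapping from each page to the pages it links to.

```python
from collections import deque


def click_depths(links, start):
    """Return the minimum number of clicks needed to reach each page from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in a BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_beyond(links, start, max_clicks=3):
    """Return the pages that need more than max_clicks clicks to reach."""
    depths = click_depths(links, start)
    return sorted(page for page, depth in depths.items() if depth > max_clicks)


# Hypothetical site graph: page -> pages it links to
site = {
    "/": ["/a", "/b"],
    "/a": ["/a/1"],
    "/a/1": ["/a/1/x"],
    "/a/1/x": ["/a/1/x/deep"],
    "/b": [],
}

print(pages_beyond(site, "/"))  # ['/a/1/x/deep']
```

A breadth-first search fits here because the first time it reaches a page is always by a shortest path, so each recorded depth is the minimum click count.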

## Make it work

This project requires [wkhtmltopdf](https://github.com/wkhtmltopdf/wkhtmltopdf), which it uses to export reports to PDF. The binary must be placed at ./bin/wkhtmltopdf.

The code to launch the scan is in ./src/scan.py.
Example: ./src/scan.py --url=http://portfolio.dubien.me/ --nofollow --noindex

Parameters for scan.py:
+ **-h, --help**: Display help.
+ **-u [url], --url=[url]**: URL where the analysis will start. The crawler will then crawl from page to page. The URL must end with /.
+ **-d [max-depth], --max-depth=[max-depth]**: Specify a different value for max-depth. By default: max-depth=5.
+ **--nofollow**: By default, the crawler follows every link in the page. By using this option, the crawler will not follow links tagged with rel='nofollow'.
+ **--noindex**: By default, the output includes every page of your website that can be reached in max-depth clicks. With the --noindex option, results for pages tagged noindex will not be displayed.
+ **-m [me@domain.com], --email=[me@domain.com]**: Specify the email address of the user who should receive the PDF report. If omitted, no email is sent and the PDF is generated in ./output/pdf.pdf.
+ **-a, --deep**: Deep analysis. Instead of requesting only the headers of external webpages (default behaviour), it requests the complete webpage.
This kind of analysis takes longer. It makes it possible to follow redirections and check whether or not the targeted element is accessible.
+ **-n, --retry=[num-retry]**: Specify the number of times to retry queries before failing.
+ **-c, --color**: Colored output. Default: no-colors.
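
As an illustration of what the --nofollow option describes, the sketch below collects the links of a page and optionally skips those tagged with rel='nofollow'. It uses only Python's standard `html.parser` module. The names `LinkCollector` and `extract_links` are hypothetical and are not claimed to match how scan.py is implemented.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags, optionally ignoring rel='nofollow' links."""

    def __init__(self, skip_nofollow=False):
        super().__init__()
        self.skip_nofollow = skip_nofollow
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel may hold several space-separated tokens, or no value at all
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if self.skip_nofollow and "nofollow" in rel_tokens:
            return
        self.links.append(href)


def extract_links(html, skip_nofollow=False):
    collector = LinkCollector(skip_nofollow)
    collector.feed(html)
    return collector.links


page = '<a href="/about">About</a> <a rel="nofollow" href="/login">Login</a>'
print(extract_links(page))                      # ['/about', '/login']
print(extract_links(page, skip_nofollow=True))  # ['/about']
```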

## What's next?

Some functionalities are still missing:
+ Sitemap in robots.txt
+ Broken-links in Sitemap
+ URLs that can be read and understood by humans (not ?id=782)
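
As a rough idea of how the first missing check could look, the sketch below lists the Sitemap entries declared in a robots.txt file using Python's standard `urllib.robotparser` module (`site_maps()` requires Python 3.8 or later). The function name `declared_sitemaps` and the sample content are hypothetical. The sketch assumes the robots.txt text has already been downloaded.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present
    return parser.site_maps() or []


# Hypothetical robots.txt content
sample = """User-agent: *
Disallow: /private/
Sitemap: http://example.com/sitemap.xml
"""

print(declared_sitemaps(sample))
```

An empty list would then be reported as a point to improve.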
