Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dubzzz/py-seo-helper
Scan a website and report points that could be improved
- Host: GitHub
- URL: https://github.com/dubzzz/py-seo-helper
- Owner: dubzzz
- License: MIT
- Created: 2014-04-07T23:17:36.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2016-09-26T22:57:05.000Z (about 8 years ago)
- Last Synced: 2024-05-02T00:04:37.526Z (7 months ago)
- Language: Python
- Size: 3.59 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# SEO Helper
## Search Engine Optimization Tool
SEO Helper is a search engine optimization tool. It runs tests on several features that a top-ranked website is expected to have, and reports the points that can be enhanced in an attempt to improve the website's ranking on search engines.
## Tested functionalities
SEO Helper tests 27 points that can enhance your ranking. Among those:
+ robots.txt exists?
+ broken links
+ broken resources
+ duplicated title
+ duplicated description
+ balance between internal and external links
+ at least one h1 tag
+ title and description meta
+ 3-click rule

All these tests are defined in the `__init__` method of the `WebSite` class. Results can then be exported to PDF and sent by email.
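The `WebSite` class itself is not reproduced here; as a rough, hypothetical sketch (the function names and structure are assumptions, not the project's actual code), two of the checks above might look like this:

```python
# Hypothetical sketch of two of the checks listed above; the real
# WebSite class in this repository may be structured differently.
import urllib.error
import urllib.request
from collections import Counter

def robots_txt_exists(base_url):
    """Check whether <base_url>robots.txt answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "robots.txt") as resp:
            return resp.status == 200
    except urllib.error.URLError:
        return False

def duplicated_titles(pages):
    """Given {url: title}, return every title shared by several pages."""
    counts = Counter(pages.values())
    return [title for title, n in counts.items() if n > 1]
```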
## Make it work
This project requires [wkhtmltopdf](https://github.com/wkhtmltopdf/wkhtmltopdf) to work properly: wkhtmltopdf is used to export reports to PDF. The binary needs to be placed at ./bin/wkhtmltopdf.
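For reference, driving the binary from Python amounts to a plain subprocess call; the HTML input name below is illustrative (the project itself writes the report to ./output/pdf.pdf by default):

```python
import subprocess

# Convert an HTML report to PDF with the bundled binary.
# "report.html" is an illustrative name, not a file the project guarantees.
subprocess.run(["./bin/wkhtmltopdf", "report.html", "./output/pdf.pdf"],
               check=True)
```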
The code that launches the scan is in ./src/scan.py, e.g.:

    ./src/scan.py --url=http://portfolio.dubien.me/ --nofollow --noindex

Parameters for scan.py (a sketch of equivalent option parsing follows the list):
+ **-h, --help**: Display help.
+ **-u [url], --url=[url]**: URL where the analysis will start. The crawler will then crawl from page to page. The URL must end with a /.
+ **-d [max-depth], --max-depth=[max-depth]**: Specify a different value for max-depth. By default: max-depth=5.
+ **--nofollow**: By default, the crawler follows every link in the page. By using this option, the crawler will not follow links tagged with rel='nofollow'.
+ **--noindex**: By default, the output includes every page of your website that can be reached in max-depth clicks. With the --noindex option, results tagged noindex will not be displayed.
+ **-m [email], --email=[email]**: Specify the email address of the user who should receive the PDF report. If not specified, no email is sent, but the PDF is still generated at ./output/pdf.pdf.
+ **-a, --deep**: Deep analysis. Instead of requesting only the headers of external webpages (the default behaviour), it requests the complete page. This kind of analysis takes a bit longer, but it allows the crawler to follow redirections and to check whether or not the targeted element is accessible.
+ **-n, --retry=[num-retry]**: Specify the number of times to retry queries before failing.
+ **-c, --color**: Colored output. Default: no colors.
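The actual option parsing inside ./src/scan.py is not shown here; a minimal argparse equivalent of the flags above could look like the following sketch (defaults and types are assumptions based on the descriptions):

```python
import argparse

# Sketch of an argparse parser mirroring the documented flags;
# scan.py's real implementation may differ.
parser = argparse.ArgumentParser(description="SEO Helper scan")
parser.add_argument("-u", "--url", required=True,
                    help="URL where the analysis starts; must end with a /")
parser.add_argument("-d", "--max-depth", type=int, default=5,
                    help="maximum crawl depth (default: 5)")
parser.add_argument("--nofollow", action="store_true",
                    help="do not follow links tagged rel='nofollow'")
parser.add_argument("--noindex", action="store_true",
                    help="do not display results tagged noindex")
parser.add_argument("-m", "--email",
                    help="address that should receive the PDF report")
parser.add_argument("-a", "--deep", action="store_true",
                    help="fetch complete external pages instead of headers only")
parser.add_argument("-n", "--retry", type=int, default=0,
                    help="number of times to retry queries before failing")
parser.add_argument("-c", "--color", action="store_true",
                    help="colored output (default: no colors)")
args = parser.parse_args()
```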
## What's next?
Some functionalities are still missing:
+ Sitemap in robots.txt
+ Broken links in the sitemap
+ URLs that can be read and understood, not ?id=782 (see the sketch below)
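As a rough illustration of the last point, a "readable URL" heuristic might simply flag opaque query-string identifiers; the pattern below is an assumption, not the project's planned rule:

```python
import re

def looks_readable(url):
    """Heuristic: reject URLs that end in an opaque id such as ?id=782."""
    return not re.search(r"\?(?:\w+=)?\d+$", url)

assert not looks_readable("http://example.com/page?id=782")
assert looks_readable("http://example.com/blog/seo-basics")
```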