https://github.com/emmanuelroecker/php-linkchecker
Check broken links in html / json files, sitemap.xml, markdown and robots.txt
- Host: GitHub
- URL: https://github.com/emmanuelroecker/php-linkchecker
- Owner: emmanuelroecker
- License: mit
- Created: 2015-03-27T15:38:55.000Z (about 11 years ago)
- Default Branch: master
- Last Pushed: 2021-01-19T18:14:33.000Z (over 5 years ago)
- Last Synced: 2024-08-10T10:12:02.038Z (almost 2 years ago)
- Topics: html, links, php, php-linkchecker
- Language: PHP
- Homepage:
- Size: 41 KB
- Stars: 28
- Watchers: 3
- Forks: 8
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# php-linkchecker
[Scrutinizer Code Quality](https://scrutinizer-ci.com/g/emmanuelroecker/php-linkchecker/?branch=master)
[Travis CI Build Status](https://travis-ci.org/emmanuelroecker/php-linkchecker)
[Coveralls Coverage Status](https://coveralls.io/github/emmanuelroecker/php-linkchecker?branch=master)
[SensioLabs Insight](https://insight.sensiolabs.com/projects/4f63b147-1922-4527-9d0d-e369397a1c13)
Check broken links in html / json files, sitemap.xml, markdown and robots.txt.
It works with:
* [Guzzle](http://docs.guzzlephp.org)
* [Symfony Finder Component](http://symfony.com/doc/2.3/components/finder.html)
* [Glicer Simply-html Component](https://github.com/emmanuelroecker/php-simply-html)
## Installation
This library can be found on [Packagist](https://packagist.org/packages/glicer/link-checker).
The recommended way to install it is through [Composer](http://getcomposer.org).
Edit your `composer.json` and add:
```json
{
    "require": {
        "glicer/link-checker": "dev-master"
    }
}
```
Install the dependencies:
```bash
php composer.phar install
```
## How to check links in HTML / JSON files?
```php
<?php
require 'vendor/autoload.php';

use GlLinkChecker\GlLinkChecker;
use GlLinkChecker\GlLinkCheckerReport;
use Symfony\Component\Finder\Finder;

// Relative URLs are resolved against the host http://lyon.glicer.com
$linkChecker = new GlLinkChecker('http://lyon.glicer.com');

// Build the list of local HTML and JSON files to check
$finder = new Finder();
$files  = $finder->files()->in('./public')->name('*.html')->name('*.json');

// Launch the link check
$result = $linkChecker->checkFiles(
    $files,
    function ($nbr) {
        // Called at the beginning; $nbr is the number of URLs to check
    },
    function ($url, $files) {
        // Called for each $url; $files lists the files containing that link
    },
    function () {
        // Called at the end
    }
);

// Convert the $result array into a temporary HTML file
$filereport = GlLinkCheckerReport::toTmpHtml('lyonCheck', $result);

// $filereport contains the full path to the HTML file
print_r($filereport);
```
You can open the file at `$filereport` in your browser.
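The three callbacks passed to `checkFiles()` can be used to report progress. The following is a minimal sketch, not part of the library: `makeProgressCallbacks()` is a hypothetical helper, and it assumes that `$files` in the second callback is an array (or other countable list) of file names, as the comment in the example suggests.

```php
<?php
/**
 * Hypothetical helper (not part of php-linkchecker): builds the three
 * callbacks expected by checkFiles() and writes progress lines to $stream.
 *
 * @param resource $stream an open, writable stream such as STDERR
 * @return callable[] [onStart, onUrl, onEnd]
 */
function makeProgressCallbacks($stream): array
{
    $total = 0; // number of URLs announced by the first callback
    $done  = 0; // number of URLs reported so far

    return [
        // Called once at the beginning with the number of URLs to check
        function ($nbr) use (&$total, $stream) {
            $total = $nbr;
            fwrite($stream, "Checking $nbr urls\n");
        },
        // Called for each URL; $files is assumed to be a countable list
        function ($url, $files) use (&$done, &$total, $stream) {
            $done++;
            $count = is_countable($files) ? count($files) : 0;
            fwrite($stream, sprintf("[%d/%d] %s (%d file(s))\n", $done, $total, $url, $count));
        },
        // Called once at the end
        function () use (&$done, $stream) {
            fwrite($stream, "Done: $done urls checked\n");
        },
    ];
}
```

With this helper, the call could look like `[$onStart, $onUrl, $onEnd] = makeProgressCallbacks(STDERR);` followed by `$linkChecker->checkFiles($files, $onStart, $onUrl, $onEnd);`. Writing to `STDERR` keeps progress output separate from anything printed with `print_r()`.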
## How to check links in robots.txt and sitemap files?
```php
<?php
require 'vendor/autoload.php';

use GlLinkChecker\GlLinkChecker;

// Check the robots.txt and sitemap files of http://lyon.glicer.com
$linkChecker = new GlLinkChecker('http://lyon.glicer.com');
$result = $linkChecker->checkRobotsSitemap();
print_r($result);
```
`GlLinkChecker::checkRobotsSitemap()` returns an array like:
```php
$result = [
    'disallow' => [
        'error' => ['/img/', '/download/']
    ],
    'sitemap' => [
        'ok' => [
            '/sitemap.xml' => [
                'ok' => [
                    '/index.html',
                    '/section/probleme-solution/compresser-css-html-js.html'
                ]
            ]
        ]
    ]
];
```
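An array of this shape can be flattened into a short summary. The sketch below is illustrative only: `summarizeRobotsSitemap()` is a hypothetical helper, not a library function, and it relies solely on the shape of the sample above. In particular, the `'error'` key inside a sitemap entry is an assumption made by analogy with the `'disallow'` entry.

```php
<?php
/**
 * Hypothetical helper (not part of php-linkchecker): flattens an array
 * shaped like the sample result of checkRobotsSitemap() into three lists.
 */
function summarizeRobotsSitemap(array $result): array
{
    $summary = [
        // Disallow rules reported under 'error' in the sample
        'disallowErrors'   => $result['disallow']['error'] ?? [],
        'sitemapUrlsOk'    => [],
        'sitemapUrlsError' => [],
    ];

    // Each reachable sitemap file maps to lists of URLs grouped by status
    foreach ($result['sitemap']['ok'] ?? [] as $sitemapFile => $byStatus) {
        if (!is_array($byStatus)) {
            continue; // skip anything that does not match the sample shape
        }
        foreach ($byStatus['ok'] ?? [] as $url) {
            $summary['sitemapUrlsOk'][] = $url;
        }
        // Assumption: broken URLs, if any, are listed under an 'error' key
        foreach ($byStatus['error'] ?? [] as $url) {
            $summary['sitemapUrlsError'][] = $url;
        }
    }

    return $summary;
}
```

Calling `summarizeRobotsSitemap($result)` on the sample yields the two disallow entries, the two sitemap URLs under `sitemapUrlsOk`, and an empty `sitemapUrlsError` list. Missing keys are treated as empty lists, so an empty or partial result does not raise notices.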
## Running Tests
Launch from the command line:
```console
vendor\bin\phpunit
```
## License MIT
## Contact
Authors: Emmanuel ROECKER & Rym BOUCHAGOUR
[Web Development Blog - http://dev.glicer.com](http://dev.glicer.com)