https://github.com/dr4g0nsr/sitemap-crawler

# ALPHA VERSION, DO NOT USE IN PRODUCTION

[![Test status](https://github.com/dr4g0nsr/sitemap-crawler/workflows/Composer/badge.svg)](https://github.com/dr4g0nsr/sitemap-crawler/actions)

## Sitemap Crawler

A crawler that reads a site's sitemap and requests every URL listed in it, for example to regenerate a cache.

Responses are not stored; the point is only to trigger each URL.
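To make the idea concrete, here is a minimal sketch of the two steps involved: pulling the page URLs out of a sitemap and requesting each one while discarding the response. It is not this package's implementation. The function names `extractSitemapUrls` and `triggerUrl` are hypothetical, and the sketch assumes PHP's bundled DOM and cURL extensions are available.

```php
<?php
// Illustrative sketch only, not the package's code.
// extractSitemapUrls() and triggerUrl() are hypothetical helper names.

/**
 * Return the text of every <loc> element in a sitemap XML string.
 * Returns an empty array for empty or malformed input.
 */
function extractSitemapUrls(string $xml): array
{
    if (trim($xml) === '') {
        return [];
    }

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return [];
    }

    $urls = [];
    foreach ($doc->getElementsByTagName('loc') as $node) {
        $urls[] = trim($node->textContent);
    }
    return $urls;
}

/**
 * Request a URL and throw the body away; only the HTTP status is returned.
 */
function triggerUrl(string $url): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // capture the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_TIMEOUT        => 30,   // seconds
    ]);
    curl_exec($ch); // the returned body is intentionally ignored
    return (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
}
```

A caller would fetch the sitemap body, pass it to `extractSitemapUrls()`, and loop over the result with `triggerUrl()`. Nothing is written to disk at any point.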

## Install with Composer

```
composer require dr4g0nsr/sitemap-crawler
```

## How to implement

Create config.php:

```
0,
"excluded" => []
];
```
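The `excluded` list is empty in the configuration above, and this README does not say how its entries are matched. Assuming each entry is a fragment of the URLs to skip, the filtering step could be sketched as follows. `filterExcluded` is a hypothetical helper, not part of the package, and substring matching is an assumption.

```php
<?php
// Hypothetical helper, not part of the package.
// Assumption: an "excluded" entry matches any URL that contains it.
function filterExcluded(array $urls, array $excluded): array
{
    $kept = array_filter($urls, function (string $url) use ($excluded): bool {
        foreach ($excluded as $fragment) {
            // Ignore empty fragments, which would otherwise match every URL.
            if ($fragment !== '' && strpos($url, $fragment) !== false) {
                return false;
            }
        }
        return true;
    });

    return array_values($kept); // reindex so keys run 0..n-1
}
```

With this sketch, an entry such as `'/admin/'` would skip every URL under that path while leaving the rest of the sitemap untouched.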

Use code like this:

```
0, 'verbose' => true]);
$crawler->loadConfig(__DIR__ . '/config.php');
$sitemap = $crawler->getSitemap($url);
$crawler->crawlURLS($sitemap);
```

This is the simplest usage. The same example is also in the test subdirectory, under vendor/dr4g0nsr/SitemapCrawler/test.