https://github.com/strugee/node-crawl-mf2
Crawl microformats2 data for h-entry and h-feeds
hacktoberfest indieweb javascript mf2 microformats2 nodejs small-modules
- Host: GitHub
- URL: https://github.com/strugee/node-crawl-mf2
- Owner: strugee
- License: lgpl-3.0
- Created: 2018-04-14T06:24:11.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2024-05-07T20:56:15.000Z (8 months ago)
- Last Synced: 2024-12-30T12:47:22.357Z (12 days ago)
- Topics: hacktoberfest, indieweb, javascript, mf2, microformats2, nodejs, small-modules
- Language: JavaScript
- Homepage:
- Size: 209 KB
- Stars: 1
- Watchers: 3
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: COPYING
README
# `crawl-mf2`
[![Build Status](https://travis-ci.org/strugee/node-crawl-mf2.svg?branch=master)](https://travis-ci.org/strugee/node-crawl-mf2)
[![Coverage Status](https://coveralls.io/repos/github/strugee/node-crawl-mf2/badge.svg?branch=master)](https://coveralls.io/github/strugee/node-crawl-mf2?branch=master)
[![Greenkeeper badge](https://badges.greenkeeper.io/strugee/node-crawl-mf2.svg)](https://greenkeeper.io/)
Crawl a [microformats2][] site to find things like canonical URLs for `h-entry`s.
Note: this module does not really handle pages with more than one top-level [microformats2][] node.
## Installation
    npm install crawl-mf2
## Example
Start a crawl and log canonical h-entry URLs found on `https://strugee.net/blog/`:
```js
var crawl = require('crawl-mf2');
// The returned crawler is an EventEmitter
var crawler = crawl('https://strugee.net/blog/');
// Log the URL of each h-entry page the crawl reports
crawler.on('h-entry', function(url, mf2node) {
	console.log(url);
});
```
## API
The module exports a single function, `crawlMf2`, which takes a single argument, the base URL to crawl from.
It returns an [`EventEmitter`](https://nodejs.org/api/events.html#events_class_eventemitter).
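Because the return value is a plain `EventEmitter`, the usual `.on()` and `.once()` methods apply. A minimal sketch, relying only on the `'h-entry'` event described in this README: `collectEntryUrls` is a hypothetical helper (not part of the module) that accumulates reported URLs into an array.

```js
// Hypothetical helper, not part of crawl-mf2: gathers the URLs reported by
// 'h-entry' events into an array that fills up as the crawl progresses.
function collectEntryUrls(crawler) {
	var urls = [];
	crawler.on('h-entry', function(url) {
		urls.push(url);
	});
	return urls;
}
```

With the real module this would be called as `collectEntryUrls(crawl('https://strugee.net/blog/'))`. This README documents no completion event, so the array is best treated as growing over time rather than as a final result.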
## Events
### `'error'`
Emitted when an error occurs. Currently this means either the [microformats2][] parser failed or an HTTP error occurred.
Note: [treated specially](https://nodejs.org/api/events.html#events_error_events) by Node.js.
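The special treatment is that Node.js throws when `'error'` is emitted on an emitter that has no `'error'` listener, so a crawl without a handler can crash the process on the first failure. A minimal sketch of a defensive handler follows; `logCrawlErrors` is a hypothetical helper name, and the emitted value is assumed (not confirmed by this README) to be an `Error` or something printable.

```js
// Hypothetical helper: records errors instead of letting Node.js throw.
// Assumption: the emitted value is an Error or another printable value.
function logCrawlErrors(crawler) {
	var errors = [];
	crawler.on('error', function(err) {
		errors.push(err);
		console.error('crawl error:', err && err.message ? err.message : err);
	});
	return errors;
}
```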
### `'urlDisco'`
* `String` The URL being discovered
Emitted when a new URL is discovered, including the initial base URL.
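One use for this event is progress reporting. A minimal sketch with a hypothetical helper, `trackDiscovered`, that keeps discovered URLs in a `Set`. Whether the crawler can report the same URL more than once is not stated here, so the `Set` is only a precaution.

```js
// Hypothetical helper: remembers each URL announced by 'urlDisco'.
// A Set is used so that a repeated URL (if that can happen) is stored once.
function trackDiscovered(crawler) {
	var seen = new Set();
	crawler.on('urlDisco', function(url) {
		seen.add(url);
	});
	return seen;
}
```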
### `'mf2Parse'`
* `String` The URL being parsed
* `Object` The parsed [microformats2][] node, returned by [`microformat-node`'s `.get()`](https://www.npmjs.com/package/microformat-node#get)
Emitted when a URL is parsed for [microformats2][] markup.
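A sketch of inspecting the parsed data, assuming the object follows the standard microformats2 parsed-JSON shape: an `items` array whose entries each carry a `type` array such as `['h-entry']`. That shape is an assumption about what `microformat-node` hands back, and `topLevelTypes` is a hypothetical helper.

```js
// Hypothetical helper. Assumption: `mf2` looks like
// { items: [{ type: ['h-entry'], properties: {...} }, ...] }.
// Returns the type names of all top-level items, or [] if the shape differs.
function topLevelTypes(mf2) {
	var items = (mf2 && Array.isArray(mf2.items)) ? mf2.items : [];
	return items.reduce(function(types, item) {
		return types.concat(item && Array.isArray(item.type) ? item.type : []);
	}, []);
}

// Intended use with a crawler instance:
// crawler.on('mf2Parse', function(url, mf2) {
// 	console.log(url, topLevelTypes(mf2));
// });
```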
### `'h-feed'`
* `String` The URL containing the `h-feed`
* `Object` The parsed [microformats2][] node, returned by [`microformat-node`'s `.get()`](https://www.npmjs.com/package/microformat-node#get)
Emitted when an `h-feed` page is discovered.
### `'h-entry'`
* `String` The URL containing the `h-entry`
* `Object` The parsed [microformats2][] node, returned by [`microformat-node`'s `.get()`](https://www.npmjs.com/package/microformat-node#get)
Emitted when an `h-entry` page is discovered.
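A sketch of pulling a title out of the node passed to `'h-entry'`, under two loudly stated assumptions: the node either is an mf2 item itself or wraps items in an `items` array, and property values follow the mf2 JSON convention of arrays (`properties.name[0]`). `entryName` is a hypothetical helper, not part of the module.

```js
// Hypothetical helper. Assumptions: `node` is either a single mf2 item or an
// object with an `items` array, and properties are arrays of values.
// Returns the first name of the first h-entry found, or undefined.
function entryName(node) {
	var candidates = (node && Array.isArray(node.items)) ? node.items : [node];
	for (var i = 0; i < candidates.length; i++) {
		var item = candidates[i];
		var isEntry = item && Array.isArray(item.type) &&
			item.type.indexOf('h-entry') !== -1;
		if (isEntry && item.properties && Array.isArray(item.properties.name)) {
			return item.properties.name[0];
		}
	}
	return undefined;
}

// Intended use with a crawler instance:
// crawler.on('h-entry', function(url, mf2node) {
// 	console.log(url, entryName(mf2node));
// });
```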
## License
LGPL 3.0+
## Author
AJ Jordan
[microformats2]: http://microformats.org/