https://github.com/velocityzen/meta-extractor

Super simple and fast html page meta data extractor with low memory footprint

atom extractor feed html meta metadata nodejs opengraph parser rss

Last synced: 4 months ago

Host: GitHub
URL: https://github.com/velocityzen/meta-extractor
Owner: velocityzen
License: mit
Created: 2016-03-16T21:37:45.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2023-01-04T21:36:36.000Z (over 2 years ago)
Last Synced: 2025-03-13T22:46:25.002Z (4 months ago)
Topics: atom, extractor, feed, html, meta, metadata, nodejs, opengraph, parser, rss
Language: JavaScript
Homepage:
Size: 860 KB
Stars: 36
Watchers: 3
Forks: 4
Open Issues: 7
Metadata Files:
- Readme: readme.md
- License: LICENSE

# meta-extractor

[![NPM Version](https://img.shields.io/npm/v/meta-extractor.svg?style=flat-square)](https://www.npmjs.com/package/meta-extractor)

[![NPM Downloads](https://img.shields.io/npm/dt/meta-extractor.svg?style=flat-square)](https://www.npmjs.com/package/meta-extractor)

Super simple and fast metadata extractor for HTML pages, with a low memory footprint.

Extracts:

- title

- description

- charset

- theme-color

- rss/atom feeds

- all opengraph meta data

- all twitter meta data

- all app links meta data

- all vk meta data

- all unique image urls (absolute)

- **returns mime and extension for binary files without downloading the whole file**
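One common way to identify a binary file without downloading all of it is to compare its first few bytes against known file signatures ("magic numbers"). The sketch below illustrates that general idea only; it is not taken from this library, whose detection may work differently. The `SIGNATURES` table and the `sniffFileType` function are hypothetical names introduced for the example.

```js
// Illustrative only: a tiny magic-number table, not this library's detection logic.
const SIGNATURES = [
  { mime: 'image/png', ext: 'png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: 'image/jpeg', ext: 'jpg', bytes: [0xff, 0xd8, 0xff] },
  { mime: 'image/gif', ext: 'gif', bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
  { mime: 'application/pdf', ext: 'pdf', bytes: [0x25, 0x50, 0x44, 0x46] }, // "%PDF"
];

// Returns { mime, ext } when the leading bytes match a known signature, else null.
// `head` is a Buffer (or byte array) holding only the start of the response body.
function sniffFileType(head) {
  const match = SIGNATURES.find(
    ({ bytes }) => head.length >= bytes.length && bytes.every((b, i) => head[i] === b)
  );
  return match ? { mime: match.mime, ext: match.ext } : null;
}
```

Because only the leading bytes are inspected, a client using this approach can stop reading the response as soon as it has enough data to decide.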

## install

`npm i meta-extractor`

## usage

```js
const extract = require('meta-extractor');

// Callback style
extract({ uri: 'http://www.newyorker.com' }, (err, res) =>
  console.log(err, res)
);

// Promise style (await must run inside an async function)
(async () => {
  const res = await extract({ uri: 'http://www.newyorker.com' });
  console.log(res);
})();
```

If no callback is provided, `extract` returns a Promise.

The first parameter, `opts`, accepts the same options as the [got](https://github.com/sindresorhus/got) module, plus:

- **uri**: the URI to extract metadata from.

- rxMeta: regexp, a custom regular expression for metadata.

- limit: number, response body size limit in bytes. Default: 2 MB.
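As a minimal sketch of how these options might be combined, the helper below builds an `opts` object and wraps the Promise form in error handling. `buildExtractOptions` and `fetchMeta` are hypothetical names, the 512 KiB limit is an arbitrary illustrative value, and `extract` is passed in as an argument so the snippet does not depend on the package being installed.

```js
// Hypothetical helper: builds the options object passed to extract().
// `uri` and `limit` are options named in this README; the values are assumptions.
function buildExtractOptions(uri, overrides = {}) {
  return {
    uri,
    limit: 512 * 1024, // cap the response body at 512 KiB (illustrative value)
    ...overrides, // caller-supplied options (for example rxMeta) take precedence
  };
}

// Hypothetical wrapper around the Promise form of extract().
// Resolves to the extracted metadata, or to null if extraction fails.
async function fetchMeta(extract, uri, overrides) {
  try {
    return await extract(buildExtractOptions(uri, overrides));
  } catch (err) {
    console.error(`extract failed for ${uri}: ${err.message}`);
    return null;
  }
}
```

With the package installed, this could be called as `fetchMeta(require('meta-extractor'), 'http://www.newyorker.com')`. A custom `rxMeta` can be supplied through `overrides`; how the library applies that pattern is not described here, so check its behavior before relying on it.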

License: MIT

© velocityzen
