A rehype plugin to inspect, validate, or rewrite URLs anywhere in an HTML document
- Host: GitHub
- URL: https://github.com/JS-DevTools/rehype-url-inspector
- Owner: JS-DevTools
- License: mit
- Created: 2019-08-23T00:56:54.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-07-18T19:57:40.000Z (over 4 years ago)
- Last Synced: 2024-09-21T13:13:44.102Z (3 months ago)
- Topics: broken-links, html, html5, javascript, nodejs, rehype, rehype-plugin, unified, url, url-rewrite, url-validation, urls, validate-url
- Language: JavaScript
- Homepage: https://jstools.dev/rehype-url-inspector
- Size: 597 KB
- Stars: 18
- Watchers: 3
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
- project-awesome - JS-DevTools/rehype-url-inspector - A rehype plugin to inspect, validate, or rewrite URLs anywhere in an HTML document (JavaScript)
README
Rehype URL Inspector
==============================
### A [rehype](https://github.com/rehypejs/rehype) plugin to inspect, validate, or rewrite URLs anywhere in an HTML document

[![Cross-Platform Compatibility](https://jstools.dev/img/badges/os-badges.svg)](https://github.com/JS-DevTools/rehype-url-inspector/actions)
[![Build Status](https://github.com/JS-DevTools/rehype-url-inspector/workflows/CI-CD/badge.svg)](https://github.com/JS-DevTools/rehype-url-inspector/actions)
[![Coverage Status](https://coveralls.io/repos/github/JS-DevTools/rehype-url-inspector/badge.svg?branch=master)](https://coveralls.io/github/JS-DevTools/rehype-url-inspector)
[![Dependencies](https://david-dm.org/JS-DevTools/rehype-url-inspector.svg)](https://david-dm.org/JS-DevTools/rehype-url-inspector)
[![npm](https://img.shields.io/npm/v/@jsdevtools/rehype-url-inspector.svg)](https://www.npmjs.com/package/@jsdevtools/rehype-url-inspector)
[![License](https://img.shields.io/npm/l/@jsdevtools/rehype-url-inspector.svg)](LICENSE)
[![Buy us a tree](https://img.shields.io/badge/Treeware-%F0%9F%8C%B3-lightgreen)](https://plant.treeware.earth/JS-DevTools/rehype-url-inspector)

Features
--------------------------
- Inspect every URL on an HTML page and do whatever you want with it, such as:
  - Normalize URLs
  - Check for broken links
  - Replace URLs with different URLs
  - Add attributes (like `target="_blank"`) to certain links

- Finds **all types of URLs** by default, such as:
  - `<script type="application/ld+json">{"url": "www.example.com"}`
  - `body { background: url("/img/background.png"); }`

- You can remove the built-in URL rules
- You can add your own **custom URL rules**
- You can abort the URL search at any time

Example
--------------------------

**example.html**
This HTML file contains many different types of URLs:

```html
<script type="application/ld+json">
  {
    "@context": "http://schema.org",
    "headline": "Hello, World!",
    "url": "http://example.com/some/page/",
    "image": "http://example.com/img/logo.png"
  }
</script>
<style>
  body {
    background: #ffffff url("img/background.png") center center no-repeat;
  }
</style>
Hello World
Lorem ipsum dolor sit amet,
non dignissim dolor. Sed diam tellus, malesuada, dictum nulla.
```
**example.js**
This script reads the `example.html` file above and finds all the URLs in it. The script uses [unified](https://unifiedjs.com/), [rehype-parse](https://github.com/rehypejs/rehype/tree/master/packages/rehype-parse), [rehype-stringify](https://github.com/rehypejs/rehype/tree/master/packages/rehype-stringify), and [to-vfile](https://github.com/vfile/to-vfile).

```javascript
const unified = require("unified");
const parse = require("rehype-parse");
const inspectUrls = require("@jsdevtools/rehype-url-inspector");
const stringify = require("rehype-stringify");
const toVFile = require("to-vfile");

async function example() {
  // Create a Rehype processor with the inspectUrls plugin
  const processor = unified()
    .use(parse)
    .use(inspectUrls, {
      inspectEach({ url }) {
        // Log each URL
        console.log(url);
      }
    })
    .use(stringify);

  // Read the example HTML file
  let file = await toVFile.read("example.html");

  // Crawl the HTML file and find all the URLs
  await processor.process(file);
}

example();
```

Running this script produces the following output:
```
http://example.com/some/page/
/site.webmanifest
/img/favicon.png
/css/main.css?v=5
http://example.com/some/page/
http://example.com/img/logo.png
http://schema.org
http://example.com/some/page/
http://example.com/img/logo.png
img/background.png
/
/img/logo.png
//external.com
some-page.html
//external.com/script.js
```

Installation
--------------------------
You can install Rehype URL Inspector via [npm](https://docs.npmjs.com/about-npm/).

```bash
npm install @jsdevtools/rehype-url-inspector
```

You'll probably want to install [unified](https://unifiedjs.com/), [rehype-parse](https://github.com/rehypejs/rehype/tree/master/packages/rehype-parse), [rehype-stringify](https://github.com/rehypejs/rehype/tree/master/packages/rehype-stringify), and [to-vfile](https://github.com/vfile/to-vfile) as well.

```bash
npm install unified rehype-parse rehype-stringify to-vfile
```

Usage
--------------------------
Using the URL Inspector plugin requires an understanding of how to use Unified and Rehype. [Here is an excellent guide](https://unifiedjs.com/using-unified.html) to learn the basics.

The URL Inspector plugin works just like any other Rehype plugin. Pass it to [the `.use()` method](https://github.com/unifiedjs/unified#processoruseplugin-options) with an [options object](#options).

```javascript
const unified = require("unified");
const inspectUrls = require("@jsdevtools/rehype-url-inspector");

// Use the Rehype URL Inspector plugin with custom options
unified().use(inspectUrls, {
  inspect(urls) { /* ... */ },      // This function is called once with ALL of the URLs
  inspectEach(url) { /* ... */ },   // This function is called for each URL as it's found
  selectors: [
    "a[href]",                      // Only search for links, not other types of URLs
    "div[data-image]"               // CSS selectors for custom URL attributes
  ]
});
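
// ---------------------------------------------------------------------------
// A hedged sketch of two `inspectEach` callbacks. The function names
// `openExternalLinksInNewTab` and `stopAtFirstInsecureUrl` are hypothetical
// and chosen only for illustration. They rely solely on the documented
// `UrlMatch` properties (`url`, `propertyName`, `node`) and on the documented
// rule that returning `false` aborts the search.

// Adds attributes to external <a> links by mutating the HAST node in place.
function openExternalLinksInNewTab({ url, propertyName, node }) {
  // Treat absolute http(s) and protocol-relative URLs as external (an assumption)
  const isExternal = /^(https?:)?\/\//i.test(url);
  if (isExternal && node.tagName === "a" && propertyName === "href") {
    node.properties.target = "_blank";
    // HAST represents space-separated attributes such as `rel` as arrays
    node.properties.rel = ["noopener", "noreferrer"];
  }
}

// Stops the search as soon as a plain-HTTP URL is found.
function stopAtFirstInsecureUrl({ url }) {
  if (url.startsWith("http://")) {
    console.log("Insecure URL found: " + url);
    return false; // Abort: skip the rest of the document
  }
}

// Either callback could then be passed as the `inspectEach` option, e.g.:
//   unified().use(inspectUrls, { inspectEach: openExternalLinksInNewTab });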
```

Options
--------------------------
Rehype URL Inspector supports the following options:

|Option |Type |Default |Description
|:---------------------|:-------------------|:----------------------|:-----------------------------------------
|`selectors` |array of strings, objects, and/or functions |[built-in selectors](src/selectors.ts) |Selectors indicate where to look for URLs in the document. Each selector can be a CSS attribute selector string, like `a[href]` or `img[src]`, or a function that accepts a [HAST node](https://github.com/syntax-tree/hast) and returns its URL(s). See [`extractors.ts`](src/extractors.ts) for examples.
|`keepDefaultSelectors`|boolean |`false` |Whether to keep the default selectors in addition to any custom ones.
|`inspect` |function |no-op |A function that is called _once_ and receives an array containing all the URLs in the document.
|`inspectEach` |function |no-op |A function that is called for _each_ URL in the document as it's found. Return `false` to abort the search and skip the rest of the document.

URL Objects
--------------------------
The `inspectEach()` function receives a [`UrlMatch` object](src/types.ts). The `inspect()` function receives an array of `UrlMatch` objects. Each object has the following properties:

|Property |Type |Description
|:----------------------|:--------------------|:------------------------------------
|`url` |string |The URL that was found
|`propertyName` |string or undefined |The name of the [HAST node property](https://github.com/syntax-tree/hast#properties) where the URL was found, such as `"src"` or `"href"`. If the URL was found in the text content of the node, then `propertyName` is `undefined`.
|`node` |object |The [HAST Element node](https://github.com/syntax-tree/hast#element) where the URL was found. **You can make changes to this node**, such as rewriting the URL or adding attributes.
|`root` |object |The [HAST Root node](https://github.com/syntax-tree/hast#root). This gives you access to the whole document if you need it.
|`file` |object |The [File object](https://github.com/vfile/vfile) that gives you information about the HTML file itself, such as the path and file name.

Contributing
--------------------------
Contributions, enhancements, and bug-fixes are welcome! [Open an issue](https://github.com/JS-DevTools/rehype-url-inspector/issues) on GitHub and [submit a pull request](https://github.com/JS-DevTools/rehype-url-inspector/pulls).

#### Building
To build the project locally on your computer:

1. __Clone this repo__
   `git clone https://github.com/JS-DevTools/rehype-url-inspector.git`

2. __Install dependencies__
   `npm install`

3. __Build the code__
   `npm run build`

4. __Run the tests__
   `npm test`

License
--------------------------
Rehype URL Inspector is 100% free and open-source, under the [MIT license](LICENSE). Use it however you want.

This package is [Treeware](http://treeware.earth). If you use it in production, then we ask that you [**buy the world a tree**](https://plant.treeware.earth/JS-DevTools/rehype-url-inspector) to thank us for our work. By contributing to the Treeware forest you’ll be creating employment for local families and restoring wildlife habitats.

Big Thanks To
--------------------------
Thanks to these awesome companies for their support of Open Source developers ❤

[![Travis CI](https://jstools.dev/img/badges/travis-ci.svg)](https://travis-ci.com)
[![SauceLabs](https://jstools.dev/img/badges/sauce-labs.svg)](https://saucelabs.com)
[![Coveralls](https://jstools.dev/img/badges/coveralls.svg)](https://coveralls.io)