Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/antoinegagne/robots
A parser for robots.txt with support for wildcards. See also RFC 9309.
crawling erlang erlang-library parser parsing parsing-library robots-parser robots-txt
- Host: GitHub
- URL: https://github.com/antoinegagne/robots
- Owner: AntoineGagne
- License: other
- Created: 2019-11-28T02:14:09.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2023-11-21T21:07:56.000Z (about 1 year ago)
- Last Synced: 2024-11-02T13:33:46.236Z (3 months ago)
- Topics: crawling, erlang, erlang-library, parser, parsing, parsing-library, robots-parser, robots-txt
- Language: Erlang
- Size: 30.3 KB
- Stars: 3
- Watchers: 1
- Forks: 2
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# robots
[![Build Status](https://github.com/AntoineGagne/robots/actions/workflows/erlang.yml/badge.svg)](https://github.com/AntoineGagne/robots/actions)
[![Hex Pm](http://img.shields.io/hexpm/v/robots.svg?style=flat)](https://hex.pm/packages/robots)
[![Docs](https://img.shields.io/badge/hex-docs-green.svg?style=flat)](https://hexdocs.pm/robots)
[![Releases](https://img.shields.io/github/release/AntoineGagne/robots?color=brightgreen)](https://github.com/AntoineGagne/robots/releases)
[![Coverage](https://coveralls.io/repos/github/AntoineGagne/robots/badge.svg?branch=master)](https://coveralls.io/github/AntoineGagne/robots?branch=master)

A library that parses and validates rules from `robots.txt`.

## Installation
This library is available on [hex.pm](https://hex.pm/packages/robots).
To install this library, add it to the `deps` list in your `rebar.config`:

```erlang
{deps, [{robots, "1.1.1"}]}.
```

## Usage

```erlang
Content = <<"User-Agent: bot\nAllow: /fish">>,
%% This will return an opaque type that contains all the rules and their agents
{ok, RulesIndex} = robots:parse(Content, 200),
true = robots:is_allowed(<<"bot/1.0.0">>, <<"/fish/salmon.html">>, RulesIndex),
true = robots:is_allowed(<<"bot/1.0.0">>, <<"/Fish.asp">>, RulesIndex).
```

## Development

### Running all the tests and linters
You can run all the tests and linters with the `rebar3` alias:
```sh
rebar3 check
```
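
`check` is not a built-in `rebar3` command; it is resolved through the `alias` entry in `rebar.config`, which expands one name into a sequence of tasks. As a rough sketch of how such an alias can be declared (the task list below is an assumption made up of built-in `rebar3` providers, not a copy of this project's configuration):

```erlang
%% rebar.config (illustrative fragment; the tasks listed here are assumed,
%% not taken from this repository)
{alias, [
    %% `rebar3 check` would run these tasks in the order given
    {check, [xref, dialyzer, eunit, ct]}
]}.
```

The `rebar.config` in the repository is the authoritative source for what `rebar3 check` actually runs.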