https://github.com/lainx86/latae


# Latae

> A pure Python library for parsing and reading robots.txt files

## 🛠 Note

Latae is currently in heavy development, so expect bugs. More features are planned.

## 💻 Usage

Via a file on your local system...

```python
import latae as lt

with open("robots.txt", "r") as f:
    rb_file = f.readlines()

# Get disallowed paths as a dict
lt.get_disallowed(rb_file)

# Get the XML sitemap
lt.get_sitemap(rb_file)
```

...or via the `requests` library

```python
import requests
import latae as lt

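response = requests.get("https://duckduckgo.com/robots.txt", timeout=10)
response.raise_for_status()  # Fail early on HTTP errors

# Split the body into lines once and reuse the list
rb_file = response.text.splitlines()

# Get disallowed paths as a dict
lt.get_disallowed(rb_file)

# Get the XML sitemap
lt.get_sitemap(rb_file)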
```
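
For a rough idea of what this kind of parsing involves, here is a minimal standalone sketch using only the standard library. It is not Latae's implementation: the function names `sketch_disallowed` and `sketch_sitemaps` are hypothetical, and the choice to key the dict by user-agent is an assumption, since the exact shape of Latae's return value is not documented above.

```python
def sketch_disallowed(lines):
    """Map each user-agent to its Disallow paths (hypothetical helper, not Latae's API)."""
    rules = {}
    agents = []        # user-agents of the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a rule was already seen, so a new group starts here
                agents, in_rules = [], False
            agents.append(value)
            rules.setdefault(value, [])
        elif field == "disallow":
            in_rules = True
            if value:  # an empty Disallow value restricts nothing
                for agent in agents:
                    rules[agent].append(value)
        elif field == "allow":
            in_rules = True
    return rules


def sketch_sitemaps(lines):
    """Collect every Sitemap URL (hypothetical helper, not Latae's API)."""
    sitemaps = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


# Made-up robots.txt content for illustration
sample = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/  # scratch space

User-agent: ExampleBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
""".splitlines()

print(sketch_disallowed(sample))
print(sketch_sitemaps(sample))
```

The sketch accepts a list of lines, the same input the usage examples above pass to Latae, so it works with both `f.readlines()` and `str.splitlines()`.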