https://github.com/innovativeinventor/mitmscrape

A tool to filter the network resources fetched by sites, with an emphasis on JS-heavy sites, built on mitmproxy and Selenium. Mostly used to scrape Secretaries of State websites for raw election data.

## mitmscrape
A little utility designed to scrape and find resources fetched on a JS-heavy site.

## Setup
To set this up, download the latest ChromeDriver release for your version of Google Chrome/Chromium from https://chromedriver.chromium.org/downloads and unzip it.
Then rename the driver binary to `chromedriver`.
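
As a quick sanity check after the setup steps, the sketch below looks for an executable named `chromedriver`. The helper name `find_chromedriver` is hypothetical and is not part of mitmscrape. It also assumes the driver lives either in the given directory or somewhere on `PATH`, since this README does not say which location the tool expects.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def find_chromedriver(search_dir: str = ".") -> Optional[str]:
    """Return a path to an executable named `chromedriver`, or None.

    Assumption: the driver sits either in `search_dir` or on PATH;
    the README does not state where mitmscrape looks for it.
    """
    local = Path(search_dir) / "chromedriver"
    # Prefer a driver placed directly in the search directory.
    if local.is_file() and os.access(local, os.X_OK):
        return str(local.resolve())
    # Otherwise fall back to whatever PATH provides (may be None).
    return shutil.which("chromedriver")


if __name__ == "__main__":
    path = find_chromedriver()
    print(path if path else "chromedriver not found; see the Setup section")
```

If this prints a path, the rename step most likely worked; if not, check the file name and its executable bit.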

## Usage
Environment setup
```bash
poetry shell
```

Running
```bash
python3 mitmscrape.py [url] [recursion_depth]
```

Filtering URLs (requires `ripgrep`)
```bash
cat urls.list | rg "\.json"
```
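
If `ripgrep` is not available, the same filter can be approximated in Python. This is a minimal sketch, not part of mitmscrape: the function name `filter_urls` is hypothetical, and it assumes `urls.list` holds one URL per line, which the `rg` command above suggests but the README does not state.

```python
import re
from pathlib import Path
from typing import Iterable, List


def filter_urls(lines: Iterable[str], pattern: str = r"\.json") -> List[str]:
    """Return the stripped lines that contain a match for `pattern`."""
    regex = re.compile(pattern)
    return [line.strip() for line in lines if regex.search(line)]


if __name__ == "__main__":
    # Assumption: urls.list is in the current directory, one URL per line.
    url_file = Path("urls.list")
    if url_file.exists():
        with url_file.open() as handle:
            for url in filter_urls(handle):
                print(url)
```

Passing a different `pattern`, for example `r"\.zip"`, selects other resource types in the same way.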

## Example usage
```bash
python3 mitmscrape.py https://results.enr.clarityelections.com/GA/105369 2
cat urls.list | rg "\.json"
```