https://github.com/audiodude/court-version-scraper

A simple web app which scrapes the PACER court listing for ECF versions and displays them

Host: GitHub
URL: https://github.com/audiodude/court-version-scraper
Owner: audiodude
License: mit
Created: 2015-06-26T20:39:38.000Z (almost 10 years ago)
Default Branch: main
Last Pushed: 2025-03-22T16:50:54.000Z (2 months ago)
Last Synced: 2025-05-07T03:45:05.301Z (26 days ago)
Language: Python
Homepage: http://court-version-scraper.herokuapp.com/
Size: 65.4 KB
Stars: 5
Watchers: 3
Forks: 4
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.md

# Court Version Scraper

A web app that scrapes information from the PACER court listings and displays it.

This app is currently deployed on Fly.io: https://court-version-scraper.fly.dev/

It currently only grabs the court's ECF version, but could easily be extended to grab more information.

The data used to generate the web page can also be obtained in JSON format by running `$ python scrape.py`, which takes about 3-4 minutes to download all of the relevant court pages.
The same JSON data is served by the web app at the path `/courts.json` (for the Heroku deploy, http://court-version-scraper.herokuapp.com/courts.json).
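
As a rough illustration of consuming the `/courts.json` endpoint from another program, here is a minimal sketch using only the Python standard library. The function name `fetch_courts` and its parameters are hypothetical and not part of this project. The structure of the JSON response is not documented here, so the sketch only decodes it and assumes no field names.

```python
import json
from urllib.request import urlopen

# URL quoted in this README; pass a different one if that deployment is unavailable.
COURTS_JSON_URL = "http://court-version-scraper.herokuapp.com/courts.json"


def fetch_courts(url=COURTS_JSON_URL, opener=urlopen, timeout=30):
    """Download the courts JSON and return the decoded Python object.

    `opener` is injectable so the function can be exercised without network
    access; by default it is urllib's `urlopen`.
    """
    with opener(url, timeout=timeout) as response:
        # The response body is assumed to be UTF-8 encoded JSON.
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_courts()` with no arguments requests the URL above. The result can then be inspected with, for example, `json.dumps(result, indent=2)`.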

Because it takes so long to download the pages, the production version of this application uses a hosted
[memcached](http://memcached.org/) instance to store the scraped data. This data is cached indefinitely. There is a
Heroku scheduled job (`$ python scrape.py -f`) which runs daily to refresh the contents of the cache.
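
The caching behaviour described above can be sketched roughly as follows. This is not the project's actual code: `InMemoryCache` is a hypothetical stand-in for a memcached client, `CACHE_KEY` and `get_courts` are made-up names, and the sketch assumes that the `-f` flag means "force a fresh scrape and overwrite the cache".

```python
import json


class InMemoryCache:
    """Hypothetical stand-in for a memcached client, with get/set only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Returns None on a cache miss, like most memcached clients.
        return self._data.get(key)

    def set(self, key, value):
        # No expiry is set, mirroring "cached indefinitely".
        self._data[key] = value


CACHE_KEY = "courts"  # hypothetical key name


def get_courts(cache, scrape, force=False):
    """Return court data, scraping only on a cache miss or when force=True."""
    cached = None if force else cache.get(CACHE_KEY)
    if cached is not None:
        return json.loads(cached)
    courts = scrape()  # the slow step (about 3-4 minutes per this README)
    cache.set(CACHE_KEY, json.dumps(courts))
    return courts
```

In this sketch the web server would call `get_courts(cache, scrape)`, while the daily scheduled job would call it with `force=True` so that readers always hit a warm cache.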

The web server part of the project is written in the Python [Flask](https://flask.palletsprojects.com/) web framework.

### Installation and development

First, install the dependencies:

```bash
pip install -r requirements.txt
```

Then start the development server:

```bash
FLASK_DEBUG=true FLASK_APP=app.py flask run
```

If you have MEMCACHIER credentials, you can provide them to let your development server connect
to the production dataset:

```bash
MEMCACHIER_PASSWORD=1234 MEMCACHIER_SERVERS=foobar.memcachier.com:11211 MEMCACHIER_USERNAME=1234 FLASK_DEBUG=true FLASK_APP=app.py flask run
```

### Deployment

First, install the [Fly.io CLI app](https://fly.io/docs/flyctl/). Then:

```bash
fly deploy
```

### Legal

Author: Travis Briggs

Licensed under the MIT License, see LICENSE.md
