Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dylanhogg/legaldata
Provides access to Australian legal data
- Host: GitHub
- URL: https://github.com/dylanhogg/legaldata
- Owner: dylanhogg
- License: mit
- Created: 2020-10-12T01:37:46.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-03-10T05:51:28.000Z (over 2 years ago)
- Last Synced: 2024-10-11T18:13:37.490Z (26 days ago)
- Topics: crawler, data, law, lawtech, legal, legaltech
- Language: Python
- Homepage:
- Size: 40 KB
- Stars: 3
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Legal Data
[![pypi Version](https://img.shields.io/pypi/v/legaldata.svg?logo=pypi)](https://pypi.org/project/legaldata/)
![Latest Tag](https://img.shields.io/github/v/tag/dylanhogg/legaldata)
![Dependencies](https://img.shields.io/librariesio/github/dylanhogg/legaldata)

A package for crawling Australian legal data from legislation.com.au and austlii.edu.au, with cache support.
Please be respectful of server host resources by using a reasonable crawl delay,
honouring robots.txt, and crawling at times when the server load is lighter. A generic politeness sketch (separate from legaldata itself) is shown below.
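For illustration only, and not part of the legaldata API: a standalone politeness check can combine Python's standard `urllib.robotparser` with a delay between requests. The user agent string, robots.txt URL and 5-second fallback delay below are assumptions, not values taken from this package.

```python
import time
from urllib import robotparser

# Hypothetical politeness helper (not part of legaldata): consult robots.txt
# and wait before each request.
USER_AGENT = "legaldata-example"  # assumption: any descriptive user agent string
ROBOTS_URL = "https://www.legislation.gov.au/robots.txt"

rp = robotparser.RobotFileParser()
rp.set_url(ROBOTS_URL)
rp.read()

def polite_fetch_allowed(url, fallback_delay=5):
    """Sleep for the site's declared crawl delay (or a fallback),
    then report whether robots.txt allows fetching the URL."""
    declared = rp.crawl_delay(USER_AGENT)
    time.sleep(declared if declared is not None else fallback_delay)
    return rp.can_fetch(USER_AGENT, url)

if polite_fetch_allowed("https://www.legislation.gov.au/"):
    print("OK to fetch this page")
```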
### Install from PyPI

```shell
pip install legaldata
```

### legislation.com.au example
This example will crawl Commonwealth Acts from [legislation.com.au](https://www.legislation.gov.au/) and copy
files (docx, pdf, zip) to the save path.

```python
from legaldata.legislation.crawler import ActCrawler

crawler = ActCrawler()
save_path = "./legislation.com.au/"

# Crawl each index page and save the linked act documents into save_path.
for index_url in crawler.get_index_pages():
    acts = crawler.get_acts_from_index(index_url, save_path)
```
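As a quick sanity check after running the example above (plain Python, not part of the legaldata API; the glob patterns and flat directory layout are assumptions based on the file types mentioned), you can list what was copied into the save path:

```python
from pathlib import Path

# Count the downloaded documents by extension (docx/pdf/zip, per the example above).
# rglob is used in case the crawler writes into subdirectories.
save_path = Path("./legislation.com.au/")
for pattern in ("*.docx", "*.pdf", "*.zip"):
    files = sorted(save_path.rglob(pattern))
    print(f"{pattern}: {len(files)} file(s)")
```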
### austlii.edu.au example

This example will crawl Commonwealth Acts from [austlii.edu.au](http://www.austlii.edu.au/) and copy
files (rtf, txt) to the save path.

```python
from legaldata.austlii.crawler import ActCrawler

crawler = ActCrawler()
save_path = "./austlii.edu.au/"

# Crawl each index page and save the linked act documents into save_path.
for index_url in crawler.get_index_pages():
    acts = crawler.get_acts_from_index(index_url, save_path)
```

Legal Data is distributed under the MIT license.