Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dylanhogg/legaldata
Provides access to Australian legal data
- Host: GitHub
- URL: https://github.com/dylanhogg/legaldata
- Owner: dylanhogg
- License: mit
- Created: 2020-10-12T01:37:46.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-03-10T05:51:28.000Z (over 2 years ago)
- Last Synced: 2024-10-11T18:13:37.490Z (26 days ago)
- Topics: crawler, data, law, lawtech, legal, legaltech
- Language: Python
- Homepage:
- Size: 40 KB
- Stars: 3
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Legal Data
[![pypi Version](https://img.shields.io/pypi/v/legaldata.svg?logo=pypi)](https://pypi.org/project/legaldata/)
![Latest Tag](https://img.shields.io/github/v/tag/dylanhogg/legaldata)
![Dependencies](https://img.shields.io/librariesio/github/dylanhogg/legaldata)

A package for crawling Australian legal data from legislation.com.au and austlii.edu.au, with cache support.
Please be respectful of server host resources by using a reasonable crawl delay,
honouring robots.txt, and crawling at times when the server load is lighter. A generic politeness sketch (separate from legaldata itself) is shown below.
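For illustration only, and not part of the legaldata API: a standalone politeness check can combine Python's standard `urllib.robotparser` with a delay between requests. The user agent string, robots.txt URL and 5-second fallback delay below are assumptions, not values taken from this package.

```python
import time
from urllib import robotparser

# Hypothetical politeness helper (not part of legaldata): consult robots.txt
# and wait before each request.
USER_AGENT = "legaldata-example"  # assumption: any descriptive user agent string
ROBOTS_URL = "https://www.legislation.gov.au/robots.txt"

rp = robotparser.RobotFileParser()
rp.set_url(ROBOTS_URL)
rp.read()

def polite_fetch_allowed(url, fallback_delay=5):
    """Sleep for the site's declared crawl delay (or a fallback),
    then report whether robots.txt allows fetching the URL."""
    declared = rp.crawl_delay(USER_AGENT)
    time.sleep(declared if declared is not None else fallback_delay)
    return rp.can_fetch(USER_AGENT, url)

if polite_fetch_allowed("https://www.legislation.gov.au/"):
    print("OK to fetch this page")
```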
### Install from PyPI

```shell
pip install legaldata
```

### legislation.com.au example
This example will crawl Commonwealth Acts from [legislation.com.au](https://www.legislation.gov.au/) and copy
files (docx, pdf, zip) to the save path.

```python
from legaldata.legislation.crawler import ActCrawler

crawler = ActCrawler()
save_path = "./legislation.com.au/"

# Crawl each index page and save the linked act documents into save_path.
for index_url in crawler.get_index_pages():
    acts = crawler.get_acts_from_index(index_url, save_path)
```
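As a quick sanity check after running the example above (plain Python, not part of the legaldata API; the glob patterns and flat directory layout are assumptions based on the file types mentioned), you can list what was copied into the save path:

```python
from pathlib import Path

# Count the downloaded documents by extension (docx/pdf/zip, per the example above).
# rglob is used in case the crawler writes into subdirectories.
save_path = Path("./legislation.com.au/")
for pattern in ("*.docx", "*.pdf", "*.zip"):
    files = sorted(save_path.rglob(pattern))
    print(f"{pattern}: {len(files)} file(s)")
```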
### austlii.edu.au example

This example will crawl Commonwealth Acts from [austlii.edu.au](http://www.austlii.edu.au/) and copy
files (rtf, txt) to the save path.

```python
from legaldata.austlii.crawler import ActCrawler

crawler = ActCrawler()
save_path = "./austlii.edu.au/"

# Crawl each index page and save the linked act documents into save_path.
for index_url in crawler.get_index_pages():
    acts = crawler.get_acts_from_index(index_url, save_path)
```

Legal Data is distributed under the MIT license.