https://github.com/growgraph/legal_ie
- Host: GitHub
- URL: https://github.com/growgraph/legal_ie
- Owner: growgraph
- License: other
- Created: 2024-11-01T12:10:38.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-10-01T13:50:02.000Z (9 months ago)
- Last Synced: 2025-10-01T15:29:54.428Z (9 months ago)
- Language: Jupyter Notebook
- Size: 26.2 MB
- Stars: 8
- Watchers: 0
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Information extraction from Legal Documents
This package contains:
- a criminal ontology for criminal appeals to the French Cassation Court (2023)
- a pipeline for fetching PDFs of appeals from [the French Cassation Court](https://www.courdecassation.fr/)
- a pipeline for deriving RDF triples from the appeal PDFs (based on GPT-4o mini)
- an upload pipeline to a Fuseki triple store
- SPARQL analysis scripts
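### Downloading court decisions using geckodriver
1. Download geckodriver and unpack it
```shell
wget https://github.com/mozilla/geckodriver/releases/download/v0.32.0/geckodriver-v0.32.0-linux64.tar.gz
# unpack the archive to get the geckodriver binary
tar -xzf geckodriver-v0.32.0-linux64.tar.gz
```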
2. Install `xvfb` and point `DISPLAY` at the virtual screen. Do not run geckodriver as the root user.
```shell
sudo apt install xvfb
Xvfb :99 -screen 0 1024x768x24 &
export DISPLAY=:99
```
3. Run geckodriver (the default port is 4444)
```shell
nohup geckodriver --port 4444 &
```
4. Run the scraping pipeline
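
Before scraping, it can help to confirm that geckodriver is actually listening. The sketch below is not part of this repository. It assumes geckodriver runs locally on port 4444 as in step 3 and queries the standard WebDriver `/status` endpoint using only the Python standard library. The names `GECKODRIVER_URL`, `is_ready` and `fetch_status` are made up for illustration.

```python
import json
import urllib.request

# Assumption: geckodriver was started locally with `--port 4444` as in step 3.
GECKODRIVER_URL = "http://localhost:4444"


def is_ready(status_payload: dict) -> bool:
    """Return True if a WebDriver /status payload reports the driver as ready."""
    return bool(status_payload.get("value", {}).get("ready", False))


def fetch_status(base_url: str = GECKODRIVER_URL, timeout: float = 5.0) -> dict:
    """Query the WebDriver status endpoint and decode its JSON body."""
    with urllib.request.urlopen(f"{base_url}/status", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    try:
        print("geckodriver ready:", is_ready(fetch_status()))
    except (OSError, ValueError) as exc:  # connection refused, timeout, bad JSON
        print("geckodriver not reachable:", exc)
```

A `ready` value of `false` does not necessarily mean a failure: the driver may simply have a session open already, so treat it as a hint rather than a hard error.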
### Setting up the triple store (Fuseki)
```shell
cd docker/fuseki
cp .example.env .env
```
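
Once the store is running, a quick SPARQL query is a simple way to check that uploaded triples are visible. The following is a minimal sketch, not the repository's own analysis code. It assumes Fuseki listens on its default port 3030 and that the dataset is named `appeals`, which is a hypothetical name; substitute whatever the Docker setup and `.env` actually configure. If the server requires authentication, the request would also need credentials, which are omitted here.

```python
import json
import urllib.parse
import urllib.request

# Assumptions: Fuseki on its default port 3030; "appeals" is a hypothetical
# dataset name, replace it with the one configured for this deployment.
FUSEKI_URL = "http://localhost:3030"
DATASET = "appeals"

# Counts all triples in the default graph of the dataset.
COUNT_TRIPLES = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def build_query_request(query: str, base_url: str = FUSEKI_URL,
                        dataset: str = DATASET) -> urllib.request.Request:
    """Build a SPARQL-protocol POST request for the dataset's query endpoint."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{dataset}/sparql",
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def run_query(query: str) -> dict:
    """Send the query and decode the SPARQL JSON results."""
    with urllib.request.urlopen(build_query_request(query), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `run_query(COUNT_TRIPLES)` against a running store should return a SPARQL JSON result whose single binding `n` holds the triple count.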
### Appeals Ontology
