https://github.com/divefish/compatibilitymodeling


Host: GitHub
URL: https://github.com/divefish/compatibilitymodeling
Owner: DiveFish
Created: 2022-01-06T15:25:51.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2022-01-06T15:37:06.000Z (over 4 years ago)
Last Synced: 2025-10-10T08:06:38.946Z (9 months ago)
Language: Python
Size: 2.19 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


README

# CompatibilityModeling

This is a fork of sfb833-a3/CompatibilityModeling that includes the parts that can be made publicly available. Code written by [janniss91](https://github.com/janniss91).

## Setup

For the setup of this repository simply type:

    make

This will

- set up a virtual environment for this repository,

- install all necessary project dependencies.

The minimum supported Python version is 3.7, because the code uses data classes, which were introduced in that release.

Python 3.8 is recommended.
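
To illustrate why Python 3.7 is the lower bound, here is a minimal sketch of the `dataclasses` feature the requirement refers to. The class names `HeadCandidate` and `AmbiguousPP` and their fields are hypothetical and are not taken from this repository.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HeadCandidate:
    """Hypothetical record for one possible head of a PP (illustration only)."""

    token: str
    position: int


@dataclass
class AmbiguousPP:
    """Hypothetical record pairing a preposition with its candidate heads."""

    preposition: str
    # default_factory gives every instance its own fresh list
    candidates: List[HeadCandidate] = field(default_factory=list)


pp = AmbiguousPP("with")
pp.candidates.append(HeadCandidate("saw", 1))
pp.candidates.append(HeadCandidate("man", 3))
```

The `@dataclass` decorator generates `__init__`, `__repr__` and `__eq__` automatically. On Python 3.6 and earlier the `dataclasses` import fails.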

## Clean and Re-install

To reset the repository to its initial state, type:

    make dist-clean

This will remove the virtual environment and all dependencies.  

With the `make` command you can re-install them.

To remove temporary files such as `.pyc` or `.pyo` files, type:

    make clean

## Packaging Info

The `src/` directory is a package that is installed when you run the `make` command.

Because of this, run its modules with the **Python module syntax** shown in the `Running` section below.

## Running

To extract all ambiguous PPs with their possible heads, type:

    python3 -m src.extract_ambiguous_pp your_input.conll your_output.tsv

**Todo**: The explanation for extract_pps must go here.
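
As an illustration of the task only, and not the repository's implementation, the sketch below reads CoNLL-style input and writes one TSV row per ambiguous PP. Everything specific in it is an assumption: the column layout (CoNLL-U style, with the part-of-speech tag in the fourth column), the tags `ADP`, `NOUN` and `VERB`, the rule that every preceding noun or verb counts as a possible head, and the helper names `read_sentences` and `ambiguous_pps`.

```python
import csv
import io
from typing import Dict, Iterator, List, Tuple

# Assumed column layout (CoNLL-U style): ID, FORM, LEMMA, UPOS, ...
# Assumed tags: ADP marks a preposition; NOUN and VERB are candidate heads.
SAMPLE = """\
1\tShe\tshe\tPRON
2\tsaw\tsee\tVERB
3\tthe\tthe\tDET
4\tman\tman\tNOUN
5\twith\twith\tADP
6\tbinoculars\tbinoculars\tNOUN

"""


def read_sentences(lines) -> Iterator[List[Dict[str, str]]]:
    """Yield one list of token dicts per blank-line-separated sentence."""
    sentence: List[Dict[str, str]] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):  # skip comment lines
            continue
        cols = line.split("\t")
        sentence.append({"id": cols[0], "form": cols[1], "upos": cols[3]})
    if sentence:
        yield sentence


def ambiguous_pps(sentence) -> List[Tuple[str, List[str]]]:
    """Return (preposition, candidate heads) pairs that have more than one candidate."""
    results = []
    for i, token in enumerate(sentence):
        if token["upos"] != "ADP":
            continue
        heads = [t["form"] for t in sentence[:i] if t["upos"] in {"NOUN", "VERB"}]
        if len(heads) > 1:
            results.append((token["form"], heads))
    return results


# Write one tab-separated row per ambiguous PP: the preposition, then its candidates.
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
for sent in read_sentences(io.StringIO(SAMPLE)):
    for prep, heads in ambiguous_pps(sent):
        writer.writerow([prep, *heads])
tsv_text = out.getvalue()
```

For the sample sentence, the preposition "with" has two preceding candidates, "saw" and "man", so it is reported as ambiguous. The actual tag set, candidate rules and output columns used by `src.extract_ambiguous_pp` may differ.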

## Testing

To run the tests, type:

    python -m pytest tests/test_extract_ambiguous_pp.py

## Stored Models

Three of the trained models have been stored:

1. Logistic Regression

2. NeuralCandidateScoringModel

3. NeuralCandidateScoringModel with averaged nouns and lemma inputs

The models are stored in the `trained-models` directory.

The models were saved with the standard PyTorch `state_dict` mechanism.

To load the model for inference, use:

```python
import torch

from src.pp_head_selection.models import LogisticRegression, NeuralCandidateScoringModel

# Option 1: NeuralCandidateScoringModel
model = NeuralCandidateScoringModel(input_dim=1076, output_dim=2)
model.load_state_dict(torch.load("trained-models/NeuralCandidateScoringModel"))

# OR option 2: NeuralCandidateScoringModel with averaged nouns and lemma inputs
model = NeuralCandidateScoringModel(input_dim=1076, output_dim=2)
model.load_state_dict(torch.load("trained-models/NeuralCandidateScoringModel-averaged-nouns"))

# OR option 3: LogisticRegression
model = LogisticRegression(input_dim=1076, output_dim=2)
model.load_state_dict(torch.load("trained-models/LogisticRegression"))

# Switch the loaded model to evaluation mode before running inference
model.eval()
```

You must provide the `input_dim` and `output_dim` to the models.

The input and output dimensions are `1076` and `2` for all models.

However, note that for other feature selection processes, the input dimension might change.
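
The save-and-load cycle, and the effect of a changed input dimension, can be sketched without the repository's code. In the example below, a plain `nn.Linear` layer is a hypothetical stand-in for the stored model classes, and the file name `StandInModel` is made up for the illustration. Only the dimensions `1076` and `2` come from this README.

```python
import os
import tempfile

import torch
from torch import nn

# Hypothetical stand-in for the repository's models: a single linear layer
# with the documented dimensions (1076 inputs, 2 outputs).
INPUT_DIM, OUTPUT_DIM = 1076, 2
trained = nn.Linear(INPUT_DIM, OUTPUT_DIM)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "StandInModel")
    # Save only the parameters, not the whole pickled module.
    torch.save(trained.state_dict(), path)

    # Rebuild the architecture first, then load the stored parameters into it.
    restored = nn.Linear(INPUT_DIM, OUTPUT_DIM)
    restored.load_state_dict(torch.load(path, map_location="cpu"))

restored.eval()  # evaluation mode (affects layers such as dropout and batch norm)

# Score a batch of three random feature vectors without tracking gradients.
with torch.no_grad():
    features = torch.randn(3, INPUT_DIM)
    probabilities = torch.softmax(restored(features), dim=1)

# A model built with a different input dimension cannot load the stored weights.
wrong = nn.Linear(INPUT_DIM + 1, OUTPUT_DIM)
try:
    wrong.load_state_dict(trained.state_dict())
    mismatch_detected = False
except RuntimeError:
    mismatch_detected = True
```

Because a `state_dict` stores only tensors keyed by parameter name, the model must be constructed with the same dimensions that were used for training. If the feature selection changes the input dimension, loading fails with a size-mismatch error, as the last step shows.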

You can find the training metrics and metadata for the three models in `train-results/logs.txt`.

For the stored models, the logged information includes a `Stored Model Path` entry.
