https://github.com/tikquuss/eulapp

Django application for https://github.com/Tikquuss/eulascript, a machine learning (ML) solution that reviews end-user license agreements (EULAs) for terms and conditions that are unacceptable to the government.

bert css django fine-tuning html linear-regression nlp-machine-learning nltk sklearn text-classification transformer xlnet

The application can be used directly at: https://whispering-cove-26674.herokuapp.com/

# Dependencies
* django
* torch
* numpy
* nltk
* transformers

# User's Guide
## Setting up dependencies
```
pip install -r requirements.txt
```

All pre-trained models, dictionaries, and helper methods have been serialized and stored in [production.pth](prediction/production.pth).

It is a dictionary containing the following elements:
- **WORDS_TO_INDEX**: dictionary mapping each word to its index in the vocabulary (bag of words)
- **DICT_SIZE**: size of the vocabulary (bag of words)
- **classifier_mybag**: logistic regression model based on the bag-of-words representation
- **tfidf_vectorizer**: vectorizer that transforms sentences into TF-IDF vectors
- **classifier_tfidf**: logistic regression model based on TF-IDF
- **classifier_bert**: logistic regression model based on BERT
- **max_input_length**: maximum sentence length accepted by BERT
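
As a quick sanity check, the keys listed above can be verified once the file has been loaded. The sketch below assumes the dictionary is already in memory (for instance via `torch.load`, which is an assumption here; the actual loading code is in `prediction/utils.py`). `missing_keys` and `example_bundle` are hypothetical names used only for illustration.

```python
# Key names taken from the list above.
EXPECTED_KEYS = (
    "WORDS_TO_INDEX",
    "DICT_SIZE",
    "classifier_mybag",
    "tfidf_vectorizer",
    "classifier_tfidf",
    "classifier_bert",
    "max_input_length",
)


def missing_keys(bundle):
    """Return the expected keys that are absent from the loaded dictionary.

    Hypothetical helper for illustration; not part of the project.
    """
    return [key for key in EXPECTED_KEYS if key not in bundle]


# Stand-in for the loaded dictionary (assumption: in the project it would
# come from prediction/production.pth).
example_bundle = {"WORDS_TO_INDEX": {"license": 0}, "DICT_SIZE": 1}
print(missing_keys(example_bundle))
```

An empty list means every documented element is present.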

More details can be found in [utils.py](prediction/utils.py).
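
To make the roles of `WORDS_TO_INDEX` and `DICT_SIZE` more concrete, here is a minimal bag-of-words sketch. It assumes a simple lowercase, whitespace-based tokenization and a toy vocabulary; the project's real preprocessing in `prediction/utils.py` may differ, and `bag_of_words` is a hypothetical name.

```python
def bag_of_words(text, words_to_index, dict_size):
    """Count how often each known word occurs in `text`.

    Illustrative only: the project's real featurization lives in
    prediction/utils.py and may tokenize or normalize differently.
    """
    vector = [0] * dict_size
    for word in text.lower().split():
        index = words_to_index.get(word)
        if index is not None:  # words outside the vocabulary are ignored
            vector[index] += 1
    return vector


# Toy vocabulary (assumption; the real one is stored in production.pth).
WORDS_TO_INDEX = {"license": 0, "terminate": 1, "warranty": 2}
DICT_SIZE = len(WORDS_TO_INDEX)

# prints [2, 1, 0]
print(bag_of_words("License may terminate the license", WORDS_TO_INDEX, DICT_SIZE))
```

A vector of this shape is the kind of input a bag-of-words classifier such as `classifier_mybag` would presumably receive.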

The first launch of the application takes a little while because the pre-trained BERT model and its tokenizer are loaded: see [utils.py](prediction/utils.py).
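
A common way to pay such a loading cost only once per process is to cache the loaded objects. The following sketches that pattern and is not the project's actual code: `load_bert` and the values it returns are hypothetical placeholders, and a short `time.sleep` stands in for the slow load.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=1)
def load_bert():
    """Hypothetical stand-in for the slow model/tokenizer loading step."""
    time.sleep(0.1)  # simulates the slow first load
    return {"model": "bert-placeholder", "tokenizer": "tokenizer-placeholder"}


start = time.perf_counter()
first = load_bert()  # slow: performs the load
first_duration = time.perf_counter() - start

start = time.perf_counter()
second = load_bert()  # fast: served from the cache
second_duration = time.perf_counter() - start

# Both calls return the very same cached object.
print(first is second)
```

With this pattern, only the first request after startup waits for the load; later calls reuse the cached result.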

You can also adapt the parameters listed above by following the steps in this [notebook](https://colab.research.google.com/drive/1Ptq1A27ENcqqtcq2WmB6aztBc-qzBq_7#scrollTo=3QFCqOZ9hCb1).

## Launch the application
```
python manage.py runserver
```
Then open http://localhost:8000/ in your browser.

## Home