https://github.com/Lukas-Justen/Law-OMNI-BERT-Project

Directly applying advancements in transfer learning from BERT results in poor accuracy in domain-specific areas like law because of a word distribution shift from general domain corpora to domain-specific corpora. In our project, we will demonstrate how the pre-trained language model BERT can be adapted to additional domains, such as contract law or court judgments.

bert bert-model contracts language-model law legal-texts statistical-linguistics

# Law-OMNI-BERT-Project

This is the repository for our "Law and Artificial Intelligence" project at Northwestern University. The team members for the project are Noah Caldwell-Gatsos __@ncaldwell17__, Rhett D'souza __@rhettdsouza13__, and Lukas Justen __@Lukas-Justen__.

#### Problem
Directly applying advancements in transfer learning from BERT results in poor accuracy in domain-specific areas like law because of a word distribution shift from general domain corpora to domain-specific corpora. In our project, we will demonstrate how the pre-trained language model BERT can be adapted to additional domains, such as contract law or court judgments.
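
One way to make the word distribution shift concrete is to compare the unigram distributions of a general-domain text and a legal text. The snippet below is a minimal sketch, not code from this repository: the function names, the simple regex tokenizer, the choice of Jensen-Shannon divergence, and the two toy sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def unigram_distribution(text):
    """Lowercase the text, keep alphabetic tokens, return relative frequencies."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two word distributions.

    Returns 0.0 for identical distributions and 1.0 for distributions
    that share no words.
    """
    divergence = 0.0
    for word in set(p) | set(q):
        pw, qw = p.get(word, 0.0), q.get(word, 0.0)
        mw = 0.5 * (pw + qw)  # mixture of the two distributions
        if pw > 0:
            divergence += 0.5 * pw * math.log2(pw / mw)
        if qw > 0:
            divergence += 0.5 * qw * math.log2(qw / mw)
    return divergence


# Toy stand-ins for a general-domain corpus and a contract corpus (hypothetical).
general = "the cat sat on the mat and the dog slept"
legal = "the lessee shall indemnify the lessor pursuant to the agreement"
shift = js_divergence(unigram_distribution(general), unigram_distribution(legal))
print(f"Jensen-Shannon divergence: {shift:.3f}")
```

A value close to 1 suggests that the two corpora use largely different vocabulary, which is the situation in which a general-domain model may struggle. On real data the corpora would need to be far larger than these toy sentences for the estimate to be meaningful.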

#### Goal
We did not create and train the model ourselves, because that requires resources beyond the scope of the project. Instead, we propose a framework for creating a domain-specific BERT, using legal contracts as a case study. This framework covers why this is necessary, what kind of data is required, how the model is trained, and how the model’s performance can be evaluated.

Finally, we built a small frontend that lets you visualize the complexity of a corpus. We hope this will help other people gain insights into their datasets and decide whether it makes sense to apply BERT to their domain.
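
As a rough illustration of what "complexity of a corpus" could mean in practice, the sketch below computes a few common surface statistics. These are assumptions chosen for illustration (type-token ratio and average sentence length), not necessarily the metrics the frontend displays, and `corpus_complexity` is a hypothetical helper name.

```python
import re


def corpus_complexity(text):
    """Return simple surface statistics for a text.

    Sentences are split naively on '.', '!' and '?'; tokens are runs of
    letters and apostrophes. Both rules are simplifications.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {"tokens": 0, "types": 0,
                "type_token_ratio": 0.0, "avg_sentence_length": 0.0}
    types = set(tokens)
    return {
        "tokens": len(tokens),                        # total word count
        "types": len(types),                          # distinct words
        "type_token_ratio": len(types) / len(tokens),
        "avg_sentence_length": len(tokens) / len(sentences),
    }


stats = corpus_complexity("The cat sat. The dog sat.")
print(stats)
```

Comparing these numbers for a general-domain sample and a domain-specific sample gives a first, coarse indication of how different the two are. Note that the type-token ratio depends on text length, so samples of similar size should be compared.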

## Repository Structure

## Case Study

## Frontend