https://github.com/nyu-mll/jiant

jiant is an nlp toolkit

bert multitask-learning nlp sentence-representation transfer-learning transformers


Host: GitHub
URL: https://github.com/nyu-mll/jiant
Owner: nyu-mll
License: mit
Created: 2018-06-18T18:12:47.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2023-07-06T22:00:38.000Z (11 months ago)
Last Synced: 2024-04-18T04:28:31.913Z (about 1 month ago)
Topics: bert, multitask-learning, nlp, sentence-representation, transfer-learning, transformers
Language: Python
Homepage: https://jiant.info
Size: 4.32 MB
Stars: 1,604
Watchers: 44
Forks: 292
Open Issues: 77
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Codeowners: CODEOWNERS

Lists

Awesome-pytorch-list - jiant
awesome-sentence-embedding - jiant
awesome-list - jiant - The multitask and transfer learning toolkit for natural language processing research. (Natural Language Processing / General Purpose NLP)

README

🚨**Update**🚨: As of 2021/10/17, the `jiant` project is no longer being actively maintained. There are no plans to add new models, tasks, or features, or to update support for new libraries.



# `jiant` is an NLP toolkit

**The multitask and transfer learning toolkit for natural language processing research**

[![Generic badge](https://img.shields.io/github/v/release/nyu-mll/jiant)](https://shields.io/)

[![codecov](https://codecov.io/gh/nyu-mll/jiant/branch/master/graph/badge.svg)](https://codecov.io/gh/nyu-mll/jiant)

[![CircleCI](https://circleci.com/gh/nyu-mll/jiant/tree/master.svg?style=shield)](https://circleci.com/gh/nyu-mll/jiant/tree/master)

[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)



**Why should I use `jiant`?**

- `jiant` supports [multitask learning](https://colab.research.google.com/github/nyu-mll/jiant/blob/master/examples/notebooks/jiant_Multi_Task_Example.ipynb)

- `jiant` supports [transfer learning](https://colab.research.google.com/github/nyu-mll/jiant/blob/master/examples/notebooks/jiant_STILTs_Example.ipynb)

- `jiant` supports [50+ natural language understanding tasks](./guides/tasks/supported_tasks.md)

- `jiant` supports the following benchmarks:

    - [GLUE](./guides/benchmarks/glue.md)

    - [SuperGLUE](./guides/benchmarks/superglue.md)

    - [XTREME](./guides/benchmarks/xtreme.md)

- `jiant` is a research library and users are encouraged to extend, change, and contribute to match their needs!

**A few additional things you might want to know about `jiant`:**

- `jiant` is configuration file driven

- `jiant` is built with [PyTorch](https://pytorch.org)

- `jiant` integrates with [`datasets`](https://github.com/huggingface/datasets) to manage task data

- `jiant` integrates with [`transformers`](https://github.com/huggingface/transformers) to manage models and tokenizers.

## Getting Started

* Get started with some simple [Examples](./examples)

* Learn more about `jiant` by reading our [Guides](./guides)

* See our [list of supported tasks](./guides/tasks/supported_tasks.md)

## Installation

To import `jiant` from source (recommended for researchers):

```bash
git clone https://github.com/nyu-mll/jiant.git
cd jiant
pip install -r requirements.txt

# Add the following to your .bashrc or .bash_profile
export PYTHONPATH=/path/to/jiant:$PYTHONPATH
```

If you plan to contribute to jiant, install additional dependencies with `pip install -r requirements-dev.txt`.
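In a notebook or other session where editing a shell profile is inconvenient, the effect of the `PYTHONPATH` export can be approximated from Python itself. The following is a minimal sketch, assuming the repository was cloned to a local directory; `add_repo_to_sys_path` is a hypothetical helper and not part of `jiant`.

```python
import sys
from pathlib import Path


def add_repo_to_sys_path(repo_dir: str) -> str:
    """Prepend a cloned repository directory to sys.path (hypothetical helper).

    Returns the resolved absolute path that was added (or was already present).
    """
    resolved = str(Path(repo_dir).expanduser().resolve())
    if resolved not in sys.path:
        # Insert at the front so this checkout takes precedence over
        # any other installed copy of the package.
        sys.path.insert(0, resolved)
    return resolved


if __name__ == "__main__":
    # "/path/to/jiant" is the same placeholder used in the export line above.
    print(add_repo_to_sys_path("/path/to/jiant"))
```

Unlike the shell export, this change only lasts for the current Python process.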

To install `jiant` from source (alternative for researchers):

```bash
git clone https://github.com/nyu-mll/jiant.git
cd jiant
pip install -e .
```

To install `jiant` from pip (recommended if you just want to train/use a model):

```bash
pip install jiant
```

We recommend that you install `jiant` in a virtual environment or a conda environment.

To check that `jiant` was correctly installed, run a [simple example](./examples/notebooks/simple_api_fine_tuning.ipynb).
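Before running the full notebook, a quicker first check is to confirm that Python can locate the package at all. The sketch below uses only the standard library; `is_importable` is a hypothetical helper name, and a positive result only shows that the package is discoverable, not that every dependency works.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found on the current path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("jiant importable:", is_importable("jiant"))
```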

## Quick Introduction

The following example fine-tunes a RoBERTa model on the MRPC dataset.

Python version:

```python
from jiant.proj.simple import runscript as run
import jiant.scripts.download_data.runscript as downloader

EXP_DIR = "/path/to/exp"

# Download the data
downloader.download_data(["mrpc"], f"{EXP_DIR}/tasks")

# Set up the arguments for the Simple API
args = run.RunConfiguration(
    run_name="simple",
    exp_dir=EXP_DIR,
    data_dir=f"{EXP_DIR}/tasks",
    hf_pretrained_model_name_or_path="roberta-base",
    tasks="mrpc",
    train_batch_size=16,
    num_train_epochs=3,
)

# Run!
run.run_simple(args)
```

Bash version:

```bash
EXP_DIR=/path/to/exp

python jiant/scripts/download_data/runscript.py \
    download \
    --tasks mrpc \
    --output_path ${EXP_DIR}/tasks

python jiant/proj/simple/runscript.py \
    run \
    --run_name simple \
    --exp_dir ${EXP_DIR}/ \
    --data_dir ${EXP_DIR}/tasks \
    --hf_pretrained_model_name_or_path roberta-base \
    --tasks mrpc \
    --train_batch_size 16 \
    --num_train_epochs 3
```
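The Python and Bash versions take the same settings, so a run can also be scripted by assembling the command line from Python values. The sketch below is illustrative only: `build_simple_run_command` is a hypothetical helper, and it uses just the flags shown in the Bash version above. It builds and prints the command without executing it.

```python
import shlex
from typing import List


def build_simple_run_command(
    exp_dir: str,
    tasks: str,
    model: str = "roberta-base",
    run_name: str = "simple",
    train_batch_size: int = 16,
    num_train_epochs: int = 3,
) -> List[str]:
    """Assemble the argument list for the Simple API script (hypothetical helper)."""
    return [
        "python", "jiant/proj/simple/runscript.py", "run",
        "--run_name", run_name,
        "--exp_dir", exp_dir,
        "--data_dir", f"{exp_dir}/tasks",
        "--hf_pretrained_model_name_or_path", model,
        "--tasks", tasks,
        "--train_batch_size", str(train_batch_size),
        "--num_train_epochs", str(num_train_epochs),
    ]


command = build_simple_run_command("/path/to/exp", "mrpc")
# shlex.join quotes arguments safely for display or for pasting into a shell.
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` from the repository root would launch the run. Keeping the arguments as a list avoids shell-quoting problems.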

Examples of more complex training workflows are found [here](./examples/).

## Contributing

The `jiant` project's contributing guidelines can be found [here](CONTRIBUTING.md).

## Looking for `jiant v1.3.2`?

`jiant v1.3.2` has been moved to [jiant-v1-legacy](https://github.com/nyu-mll/jiant-v1-legacy) to support ongoing research with the library. `jiant v2.x.x` is more modular and scalable than `jiant v1.3.2` and has been designed to reflect the needs of the current NLP research community. We strongly recommend that any new projects use `jiant v2.x.x`.

`jiant 1.x` has been used in several papers. For instructions on how to reproduce papers by `jiant` authors that refer readers to this site for documentation (including Tenney et al., Wang et al., Bowman et al., Kim et al., Warstadt et al.), refer to the [jiant-v1-legacy](https://github.com/nyu-mll/jiant-v1-legacy) README.

## Citation

If you use `jiant ≥ v2.0.0` in academic work, please cite it directly:

```
@misc{phang2020jiant,
    author = {Jason Phang and Phil Yeres and Jesse Swanson and Haokun Liu and Ian F. Tenney and Phu Mon Htut and Clara Vania and Alex Wang and Samuel R. Bowman},
    title = {\texttt{jiant} 2.0: A software toolkit for research on general-purpose text understanding models},
    howpublished = {\url{http://jiant.info/}},
    year = {2020}
}
```

If you use `jiant ≤ v1.3.2` in academic work, please use the citation found [here](https://github.com/nyu-mll/jiant-v1-legacy).

## Acknowledgments

- This work was made possible in part by a donation to NYU from Eric and Wendy Schmidt made by recommendation of the Schmidt Futures program, and by support from Intuit Inc.

- We gratefully acknowledge the support of NVIDIA Corporation with the donation of a Titan V GPU used at NYU in this work.

- Developer Jesse Swanson is supported by the Moore-Sloan Data Science Environment as part of the NYU Data Science Services initiative.

## License

`jiant` is released under the [MIT License](https://github.com/nyu-mll/jiant/blob/master/LICENSE).