https://github.com/cyberagentailab/japanese-nli-model
This repository provides the code for Japanese NLI model, a fine-tuned masked language model.
- Host: GitHub
- URL: https://github.com/cyberagentailab/japanese-nli-model
- Owner: CyberAgentAILab
- License: cc-by-4.0
- Created: 2022-10-24T06:59:08.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-10-26T08:42:51.000Z (almost 3 years ago)
- Last Synced: 2025-09-10T07:42:53.275Z (about 1 month ago)
- Topics: bert, japanese, natural-language-processing, natural-language-understanding, nli, nlp, roberta, sentence-transformers, transformers
- Language: Jupyter Notebook
- Homepage: https://huggingface.co/cyberagent/xlm-roberta-large-jnli-jsick
- Size: 33.2 KB
- Stars: 5
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Japanese Natural Language Inference Model
This repository provides the code for the [Japanese NLI model](https://huggingface.co/cyberagent/xlm-roberta-large-jnli-jsick), a fine-tuned masked language model.
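As a quick orientation (this snippet is not from the repository), here is a minimal sketch of querying the published checkpoint with Hugging Face `transformers`, assuming it exposes a standard sequence-classification head; the example sentences are made up, and the label order follows the mapping given in the appendix below.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "cyberagent/xlm-roberta-large-jnli-jsick"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "猫がソファの上で眠っている。"   # "A cat is sleeping on the sofa."
hypothesis = "動物が眠っている。"          # "An animal is sleeping."

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Label order taken from the appendix: {"contradiction": 0, "entailment": 1, "neutral": 2}
labels = ["contradiction", "entailment", "neutral"]
print(labels[logits.argmax(dim=-1).item()])
```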
## Performance

The model's overall accuracy is comparable to the results reported in the [JGLUE](https://github.com/yahoojapan/JGLUE) [Kurihara et al. 2022] and [JSICK](https://github.com/verypluming/JSICK) [Yanaka and Mineshima 2022] papers:

| Model | JGLUE-JNLI valid [%] | JSICK test [%] |
|:-------------------------------:|:----:|:-----:|
| [Kurihara et al. 2022] | 91.9 | N/A |
| [Yanaka and Mineshima 2022] | N/A | 89.1 |
| ours using both JNLI and JSICK | 90.9 | 89.0 |

## References
- Hitomi Yanaka and Koji Mineshima. [Compositional Evaluation on Japanese Textual Entailment and Similarity](https://arxiv.org/abs/2208.04826). TACL2022.
- Kentaro Kurihara, Daisuke Kawahara, and Tomohide Shibata. [JGLUE: Japanese General Language Understanding Evaluation](https://aclanthology.org/2022.lrec-1.317/). LREC2022.
- Nils Reimers and Iryna Gurevych. [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://aclanthology.org/D19-1410/). EMNLP-IJCNLP2019.
- Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. [Unsupervised Cross-lingual Representation Learning at Scale](https://aclanthology.org/2020.acl-main.747/). ACL2020.

## Appendix: Hyperparameters
### random seeds
Yes, we tested only a single run with the seeds below :(
```python
import random
import numpy as np
import torch

torch.manual_seed(0)
random.seed(0)
np.random.seed(0)
```

### dataset order
1. JSICK
1. JGLUE
### labels
We converted string labels into integers using the following mapping:
```python
label2int = {"contradiction": 0, "entailment": 1, "neutral": 2}
```
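As an illustration only (not code from this repository), the mapping plugs directly into `sentence_transformers` `InputExample` objects when preparing CrossEncoder training data; the premise/hypothesis pair below is made up.

```python
from sentence_transformers import InputExample

label2int = {"contradiction": 0, "entailment": 1, "neutral": 2}

# Hypothetical raw record in (premise, hypothesis, string label) form.
premise, hypothesis, gold = "男性がギターを弾いている。", "人が楽器を演奏している。", "entailment"
example = InputExample(texts=[premise, hypothesis], label=label2int[gold])
print(example.label)  # 1
```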
### CrossEncoder
We mimicked `batch_size=128` using gradient accumulation (`32 * 4 = 128`).
```python
# Training keyword arguments for the CrossEncoder
batch_size=32,
shuffle=True,
epochs=3,
accumulation_steps=4,
optimizer_params={'lr': 5e-5},
warmup_steps=math.ceil(0.1 * len(data)),
```
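For readers new to gradient accumulation, the following is a generic PyTorch sketch of the trick (a toy model standing in for the actual CrossEncoder, not the repository's training loop): gradients from 4 consecutive batches of 32 are accumulated before each optimizer step, which approximates a single step with a batch of 128.

```python
import torch
from torch import nn

# Toy stand-ins for the real CrossEncoder, optimizer, and NLI batches.
model = nn.Linear(16, 3)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss_fn = nn.CrossEntropyLoss()

batch_size, accumulation_steps = 32, 4   # 32 * 4 = 128 examples per parameter update
batches = [(torch.randn(batch_size, 16), torch.randint(0, 3, (batch_size,)))
           for _ in range(2 * accumulation_steps)]

optimizer.zero_grad()
for step, (x, y) in enumerate(batches):
    loss = loss_fn(model(x), y) / accumulation_steps  # scale so summed gradients average out
    loss.backward()                                    # gradients accumulate in .grad
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()                               # one update per 128 examples
        optimizer.zero_grad()
```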