https://github.com/moisestg/rare-lm
Understanding of broad discourse context for word prediction
deep-learning nlp python tensorflow
- Host: GitHub
- URL: https://github.com/moisestg/rare-lm
- Owner: moisestg
- License: gpl-3.0
- Created: 2017-05-03T08:23:45.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2020-10-31T18:25:34.000Z (over 5 years ago)
- Last Synced: 2025-04-01T12:49:27.833Z (about 1 year ago)
- Topics: deep-learning, nlp, python, tensorflow
- Language: Python
- Size: 20.7 MB
- Stars: 6
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Broad Discourse Context for language modeling
## About the code:
This repo features a simple implementation of a [pointer sentinel mixture](https://arxiv.org/abs/1609.07843) language model (PSMM) in TensorFlow v1.4 (`psmm` folder), which may serve as a starting point for your own projects. It also includes a vanilla RNNLM (`vanilla` folder).
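For readers new to the model, the sketch below shows how a pointer sentinel mixture combines a vocabulary softmax with a pointer distribution over the recent context, following the formulation in the linked paper. It is a minimal pure-Python illustration and not the code in the `psmm` folder: the function names (`softmax`, `psmm_distribution`), the argument layout, and the toy numbers are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def psmm_distribution(vocab_logits, pointer_scores, sentinel_score, context_ids):
    """Mix a vocabulary softmax with a pointer distribution (illustrative sketch).

    vocab_logits   -- one logit per vocabulary word (RNN softmax branch)
    pointer_scores -- one attention score per context position
    sentinel_score -- score of the sentinel competing with the context positions
    context_ids    -- vocabulary id of the word at each context position
    """
    p_vocab = softmax(vocab_logits)

    # A single softmax over [context positions..., sentinel]. The mass that
    # lands on the sentinel is the gate g given to the vocabulary branch.
    joint = softmax(list(pointer_scores) + [sentinel_score])
    attention, gate = joint[:-1], joint[-1]

    # p(w) = g * p_vocab(w) + sum of attention mass on positions holding w.
    mixed = [gate * p for p in p_vocab]
    for position, word_id in enumerate(context_ids):
        mixed[word_id] += attention[position]
    return mixed


# Toy example (made-up numbers): a 4-word vocabulary and a 2-word context
# in which word 3 receives a high pointer score.
probs = psmm_distribution(
    vocab_logits=[0.0, 0.0, 0.0, 0.0],
    pointer_scores=[2.0, 0.0],
    sentinel_score=0.0,
    context_ids=[3, 1],
)
```

Because the gate and the attention weights come from the same softmax, the mixture stays a valid probability distribution without a separate normalization step. A word that appears in the context can receive high probability even when the vocabulary softmax assigns it little mass, which is what makes the approach attractive for rare words.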
## About the thesis:
The work focuses on analyzing the nature of the [LAMBADA dataset](https://arxiv.org/abs/1606.06031) and exploring techniques that may improve performance on this task. Some key points of our work:
- LAMBADA focuses on probing the ability of language models to handle long-range dependencies. However, rare words (e.g. named entities) play an important role in the overall performance, a role that is usually overlooked when all results are averaged.
- Pointer-based models yield state-of-the-art performance on LAMBADA. PSMM produces comparable results while having a high degree of adaptability.
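
To illustrate the first point, the sketch below reports accuracy separately for rare and frequent target words next to the single averaged figure. It is only an illustration: the helper name `accuracy_by_rarity`, the `rare_threshold` cut-off, and the toy data are assumptions made here, and the thesis may define rare words differently.

```python
from collections import Counter


def accuracy_by_rarity(targets, predictions, train_counts, rare_threshold=5):
    """Return overall accuracy and accuracy per frequency bucket.

    A target counts as "rare" when it occurs fewer than `rare_threshold`
    times in the training data (a hypothetical cut-off for this sketch).
    """
    hits = Counter()
    totals = Counter()
    for target, prediction in zip(targets, predictions):
        is_rare = train_counts.get(target, 0) < rare_threshold
        bucket = "rare" if is_rare else "frequent"
        totals[bucket] += 1
        hits[bucket] += int(target == prediction)

    overall = sum(hits.values()) / sum(totals.values())
    per_bucket = {bucket: hits[bucket] / totals[bucket] for bucket in totals}
    return overall, per_bucket


# Made-up toy data: two frequent targets and two named entities.
targets = ["door", "Gabriel", "coffee", "Helen"]
predictions = ["door", "he", "coffee", "she"]
train_counts = {"door": 120, "coffee": 80, "Gabriel": 2}

overall, per_bucket = accuracy_by_rarity(targets, predictions, train_counts)
```

In this toy case the averaged accuracy is 0.5, while the per-bucket view shows that every miss falls on a rare word.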
For more details and results, check the full thesis [here](../master/report/MasterThesis_MTorres.pdf) or [there](https://doi.org/10.3929/ethz-b-000223923).