https://github.com/moisestg/rare-lm

Understanding of broad discourse context for word prediction

deep-learning nlp python tensorflow

Host: GitHub
URL: https://github.com/moisestg/rare-lm
Owner: moisestg
License: gpl-3.0
Created: 2017-05-03T08:23:45.000Z (about 9 years ago)
Default Branch: master
Last Pushed: 2020-10-31T18:25:34.000Z (over 5 years ago)
Last Synced: 2025-04-01T12:49:27.833Z (over 1 year ago)
Topics: deep-learning, nlp, python, tensorflow
Language: Python
Size: 20.7 MB
Stars: 6
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Broad Discourse Context for language modeling

## About the code:
This repo features a simple implementation of a [pointer sentinel mixture](https://arxiv.org/abs/1609.07843) model (PSMM) for language modeling in TensorFlow v1.4 (`psmm` folder), which may serve as a starting point for your own projects. It also includes a vanilla RNNLM (`vanilla` folder).
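
For orientation, here is a minimal sketch of the mixing step described in the pointer sentinel mixture paper, written in plain Python rather than TensorFlow. It is not taken from the `psmm` folder: the function names, argument names and toy numbers are all assumptions made for illustration. The pointer scores over the context are normalized jointly with a sentinel score; the attention mass on the sentinel acts as the gate on the vocabulary softmax, and the remaining mass is added to the words that appear in the context.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def psmm_mixture(vocab_probs, context_ids, pointer_scores, sentinel_score):
    """Combine a vocabulary distribution with pointer attention (illustrative).

    vocab_probs:    softmax output of the RNN over the vocabulary
    context_ids:    vocabulary id of each word in the context window
    pointer_scores: one unnormalized attention score per context position
    sentinel_score: unnormalized score of the sentinel
    """
    # Normalize pointer scores and the sentinel score together.
    attention = softmax(list(pointer_scores) + [sentinel_score])
    gate = attention[-1]  # mass on the sentinel = weight of the vocabulary softmax

    # Start from the gated vocabulary distribution ...
    mixed = [gate * p for p in vocab_probs]
    # ... and add the pointer mass to every word seen in the context.
    for word_id, weight in zip(context_ids, attention[:-1]):
        mixed[word_id] += weight
    return mixed, gate


# Toy example with a 3-word vocabulary (hypothetical numbers).
vocab_probs = [0.5, 0.3, 0.2]
context_ids = [2, 0, 2]          # word 2 occurs twice in the context
pointer_scores = [1.0, 0.0, 1.0]
sentinel_score = 0.0

probs, gate = psmm_mixture(vocab_probs, context_ids, pointer_scores, sentinel_score)
print(probs, gate)
```

Because the pointer weights and the gate come from a single softmax, the mixed distribution sums to one without any further normalization.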

## About the thesis:
The work focuses on analyzing the nature of the [LAMBADA dataset](https://arxiv.org/abs/1606.06031) and exploring techniques that may improve performance on this task. Some key points of our work:
- LAMBADA focuses on probing the ability of language models to handle long-range dependencies. However, rare words (e.g. named entities) play an important role in the overall performance, which is usually overlooked when all results are averaged.
- Pointer-based models yield state-of-the-art performance on LAMBADA. PSMM produces comparable results while having a high degree of adaptability.
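
To illustrate the first point, the sketch below reports accuracy separately for rare and frequent target words instead of a single average. It is only an illustration: the function name, the count-based threshold `rare_threshold` and the toy data are assumptions, and the thesis may define rare words differently.

```python
from collections import Counter


def accuracy_by_frequency(examples, train_counts, rare_threshold=5):
    """Split last-word prediction accuracy into rare and frequent targets.

    examples:       iterable of (target_word, predicted_word) pairs
    train_counts:   mapping from word to its training-set count
    rare_threshold: hypothetical cut-off; targets seen fewer times are "rare"
    """
    hits = Counter()
    totals = Counter()
    for target, predicted in examples:
        is_rare = train_counts.get(target, 0) < rare_threshold
        bucket = "rare" if is_rare else "frequent"
        totals[bucket] += 1
        hits[bucket] += int(target == predicted)

    result = {bucket: hits[bucket] / totals[bucket] for bucket in totals}
    n = sum(totals.values())
    result["overall"] = sum(hits.values()) / n if n else 0.0
    return result


# Toy data (made up for illustration only).
train_counts = {"the": 1000, "house": 40, "Gabriel": 2}
examples = [
    ("house", "house"),
    ("the", "the"),
    ("Gabriel", "he"),
    ("house", "house"),
]

report = accuracy_by_frequency(examples, train_counts)
print(report)
```

In this toy case the overall figure hides the fact that every error falls on the rare target, which is the kind of effect a per-bucket report makes visible.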

For more details and results, check the full thesis [here](../master/report/MasterThesis_MTorres.pdf) or [there](https://doi.org/10.3929/ethz-b-000223923).
