https://github.com/gibbsbravo/paraphrasee
Paraphrase Generation Using Deep Reinforcement Learning - MSc Thesis
- Host: GitHub
- URL: https://github.com/gibbsbravo/paraphrasee
- Owner: gibbsbravo
- Created: 2020-06-10T15:44:04.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2020-06-10T17:03:58.000Z (almost 5 years ago)
- Last Synced: 2025-04-12T19:09:01.383Z (about 1 month ago)
- Topics: deep-learning, deep-reinforcement-learning, natural-language-generation, natural-language-processing, paraphrase-detection, paraphrase-generation, paraphrase-identification, reinforcement-learning, reinforcement-learning-agent, reinforcement-learning-algorithms, reinforcement-learning-environments
- Language: Python
- Homepage:
- Size: 1.88 MB
- Stars: 18
- Watchers: 2
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ParaPhrasee - Paraphrase Generation Using Deep Reinforcement Learning
This is the thesis and repo associated with the article Paraphrase Generation Using Deep Reinforcement Learning.
The code is not intended to run end-to-end for new applications; it is meant to serve as starter code or as a source of code snippets.
Page 29 of the thesis gives a full list of the modules with a brief description of each. Please send me an email at [email protected] if you would like one of the modules that is not contained in the repo.
The key modules included are:
- data: Imports raw data from various sources, preprocesses it, and creates the train/validation/test sets and the vocab index. Also contains functions for saving and loading
- encoder_models: Defines classes for each encoder: GloVe, BERT, InferSent, Vanilla and GPT along with related code for retrieving embeddings
- MCTS: Monte Carlo Tree Search implementation for both FrozenLake and ParaPhrasee environments
- model_evaluation: Contains wrappers for different evaluation functions, used primarily as reward functions for the RL model
- paraphrasee_env: Defines the environment dynamics for the paraphrase generation task framed as an RL problem
- RL_model: Defines and trains RL models for ParaPhrasee environment
- supervised_model: Defines, trains, and evaluates the supervised model with MLE and beam search. Includes modifications for teacher-forcing and attention
- toy_RL_pipeline: Defines and trains RL models for either CartPole or FrozenLake environments
- train_ESIM: Defines and trains an ESIM model for use as the discriminator / adversary
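
To make the idea behind `paraphrasee_env` and `model_evaluation` more concrete, here is a minimal sketch of how paraphrase generation can be framed as an RL problem: the state is the source sentence plus the tokens generated so far, an action picks the next token, and a reward function scores the finished sentence. This is an illustration only and is not the code in this repo. The names `ToyParaphraseEnv`, `unigram_overlap_reward`, and `EOS` are hypothetical, and the unigram-overlap reward is a toy stand-in for the evaluation functions the repo wraps.

```python
from typing import Callable, List, Tuple

EOS = "<eos>"  # hypothetical end-of-sentence token

State = Tuple[List[str], List[str]]  # (source tokens, tokens generated so far)


def unigram_overlap_reward(candidate: List[str], reference: List[str]) -> float:
    """Toy reward: fraction of reference tokens that appear in the candidate."""
    if not reference:
        return 0.0
    candidate_set = set(candidate)
    return sum(tok in candidate_set for tok in reference) / len(reference)


class ToyParaphraseEnv:
    """Gym-style reset/step interface with no external dependencies."""

    def __init__(
        self,
        vocab: List[str],
        reward_fn: Callable[[List[str], List[str]], float] = unigram_overlap_reward,
        max_len: int = 10,
    ) -> None:
        self.vocab = vocab
        self.reward_fn = reward_fn
        self.max_len = max_len
        self.source: List[str] = []
        self.generated: List[str] = []

    def reset(self, source_sentence: str) -> State:
        """Start a new episode for one source sentence."""
        self.source = source_sentence.lower().split()
        self.generated = []
        return self.source, list(self.generated)

    def step(self, action: int) -> Tuple[State, float, bool]:
        """Append the chosen token; reward is only given when the episode ends."""
        token = self.vocab[action]
        done = token == EOS
        if not done:
            self.generated.append(token)
            done = len(self.generated) >= self.max_len
        reward = self.reward_fn(self.generated, self.source) if done else 0.0
        return (self.source, list(self.generated)), reward, done


# Usage: a fixed action sequence standing in for a trained policy.
vocab = ["a", "cat", "sat", "the", "feline", EOS]
env = ToyParaphraseEnv(vocab=vocab, max_len=5)
env.reset("The cat sat")
for action in (3, 1, 5):  # "the", "cat", EOS
    state, reward, done = env.step(action)
print(state[1], reward, done)
```

Giving the reward only at the end of the episode keeps the sketch simple and lets any sentence-level scoring function be passed in as `reward_fn`. A real setup would also need to penalise copying the source verbatim, which this toy overlap reward does not do.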