Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rbroc/contrastive-user-encoders
Code for: Rocca, R., & Yarkoni, T. (2022). Language models as user encoders: Self-supervised learning of user encodings using transformers. To appear in Findings of the Association for Computational Linguistics: EMNLP 2022.
contrastive-learning nlp python tensorflow transformers user-modeling
- Host: GitHub
- URL: https://github.com/rbroc/contrastive-user-encoders
- Owner: rbroc
- Created: 2022-10-19T10:16:38.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2022-10-24T08:32:16.000Z (about 2 years ago)
- Last Synced: 2024-10-24T11:52:23.890Z (about 2 months ago)
- Topics: contrastive-learning, nlp, python, tensorflow, transformers, user-modeling
- Language: Jupyter Notebook
- Homepage:
- Size: 45.2 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Contrastive training of author encoders using transformers
Includes code for the contrastive user encoder from the EMNLP Findings paper:
- Rocca, R., & Yarkoni, T. (2022). Language models as user encoders: Self-supervised learning of user encodings using transformers. To appear in *Findings of the Association for Computational Linguistics: EMNLP 2022* (link coming soon).

### Structure
- This repository does not include data, but the dataset can be recreated in full using the scripts under `reddit/preprocessing`;
- Model classes, the trainer, and other utilities can be found under `reddit`;
- `notebooks` includes the code needed to replicate the plots presented in the paper, as well as baseline fitting;
- `scripts` contains Python training scripts for both triplet-loss training and downstream tasks.

Note: triplet-loss training could be streamlined using HuggingFace's `transformers` library; future refactoring may simplify the current code in this direction (see the sketch below).
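As a rough illustration of that idea (not the repository's actual code), here is a minimal TensorFlow sketch of triplet-loss training of a user encoder built on HuggingFace's `transformers`: the anchor is the mean encoding of a user's posts, the positive a held-out post by the same user, and the negative a post by a different user. The model name, margin, mean pooling, and toy strings are all illustrative assumptions standing in for the paper's Reddit data and the classes under `reddit`.

```python
# Minimal sketch of triplet-loss training of a user encoder.
# Assumptions (not from the paper's code): distilbert-base-uncased as the
# encoder, mean pooling over tokens, margin = 1.0, toy strings as "posts".
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModel

MODEL = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
encoder = TFAutoModel.from_pretrained(MODEL)

def encode(texts):
    """Mean-pool token embeddings into one fixed-size vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="tf")
    hidden = encoder(**batch).last_hidden_state            # (batch, seq, dim)
    mask = tf.cast(batch["attention_mask"][..., None], hidden.dtype)
    return tf.reduce_sum(hidden * mask, axis=1) / tf.reduce_sum(mask, axis=1)

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on squared Euclidean distances."""
    d_pos = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    d_neg = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
    return tf.reduce_mean(tf.maximum(d_pos - d_neg + margin, 0.0))

optimizer = tf.keras.optimizers.Adam(2e-5)

# One illustrative training step: the anchor aggregates two posts by user A;
# the positive is a held-out post by A, the negative a post by user B.
with tf.GradientTape() as tape:
    anchor = tf.reduce_mean(encode(["post 1 by user A", "post 2 by user A"]),
                            axis=0, keepdims=True)
    loss = triplet_loss(anchor,
                        encode(["held-out post by user A"]),
                        encode(["a post by user B"]))
grads = tape.gradient(loss, encoder.trainable_variables)
optimizer.apply_gradients(zip(grads, encoder.trainable_variables))
```

Mean pooling is just one simple way to obtain a fixed-size post encoding; the pooling, batching, and aggregation choices actually used in the paper live in the model classes under `reddit`.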