Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lucidrains/distilled-retriever-pytorch
Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever"
- Host: GitHub
- URL: https://github.com/lucidrains/distilled-retriever-pytorch
- Owner: lucidrains
- License: mit
- Created: 2020-12-11T17:56:16.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2020-12-16T20:03:00.000Z (about 4 years ago)
- Last Synced: 2024-10-29T20:11:12.288Z (2 months ago)
- Topics: artificial-intelligence, attention-mechanism, deep-learning, question-answering, retrieval
- Homepage:
- Size: 3.91 KB
- Stars: 32
- Watchers: 4
- Forks: 6
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## Distilling Knowledge from Reader to Retriever
Implementation of the retriever distillation procedure outlined in the paper Distilling Knowledge from Reader to Retriever, in PyTorch. The authors propose training the retriever using the reader's cross-attention scores as pseudo-labels, reaching SOTA on open-domain question answering.
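Concretely, the distillation signal matches the retriever's score distribution over the retrieved passages to the reader's cross-attention scores aggregated per passage. The snippet below is a minimal sketch of that objective, not code from this repository; it assumes the per-passage attention aggregation and the retriever scoring happen upstream.

```python
import torch.nn.functional as F

def retriever_distillation_loss(retriever_scores, cross_attention_scores):
    # retriever_scores:       (batch, n_passages) similarity scores E(q) · E(p_i)
    # cross_attention_scores: (batch, n_passages) reader cross-attention mass,
    #                         pre-aggregated over layers, heads and tokens
    target = F.softmax(cross_attention_scores.detach(), dim = -1)  # pseudo-labels, no gradient to the reader
    log_pred = F.log_softmax(retriever_scores, dim = -1)           # retriever distribution in log-space
    # KL(pseudo-labels || retriever distribution), averaged over the batch
    return F.kl_div(log_pred, target, reduction = 'batchmean')
```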
Update: The BM25 gains actually do not look as impressive as the BERT gains. Also, it seems like distilling with BERT as the starting point never gets to the same level as BM25.
I am considering whether it makes more sense to modify Marge (https://github.com/lucidrains/marge-pytorch) so that, during training, a loss is minimized between an extra prediction head on top of the retriever and the cross-attention scores.
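A rough sketch of that idea, purely for illustration: attach a small prediction head to the retriever's passage embeddings and regress its output onto the reader's cross-attention scores as an auxiliary objective. The module, the pooled `encoder`, and the choice of MSE loss below are assumptions, not anything taken from Marge or this repository.

```python
from torch import nn
import torch.nn.functional as F

class RetrieverWithAttnHead(nn.Module):
    # hypothetical: `encoder` is assumed to return pooled passage embeddings
    # of shape (batch, n_passages, dim)
    def __init__(self, encoder, dim):
        super().__init__()
        self.encoder = encoder
        self.to_attn_score = nn.Linear(dim, 1)

    def forward(self, passages, cross_attention_scores = None):
        emb = self.encoder(passages)                  # (batch, n_passages, dim)
        pred = self.to_attn_score(emb).squeeze(-1)    # (batch, n_passages)

        if cross_attention_scores is None:
            return pred

        # auxiliary loss: regress the head's output onto the reader's
        # (detached) cross-attention scores
        return F.mse_loss(pred, cross_attention_scores.detach())
```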
## Citations
```bibtex
@misc{izacard2020distilling,
    title         = {Distilling Knowledge from Reader to Retriever for Question Answering},
    author        = {Gautier Izacard and Edouard Grave},
    year          = {2020},
    eprint        = {2012.04584},
    archivePrefix = {arXiv},
    primaryClass  = {cs.CL}
}
```