https://github.com/rish-16/ma4270-project

Source code for MA4270: Data Modelling and Computation on Transformers and Nadaraya-Watson Kernel Regression

Host: GitHub
URL: https://github.com/rish-16/ma4270-project
Owner: rish-16
License: MIT
Created: 2024-03-28T08:51:39.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-05-29T00:21:36.000Z (about 2 years ago)
Last Synced: 2024-05-29T14:36:16.798Z (about 2 years ago)
Language: Python
Size: 8.79 MB
Stars: 1
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Self-Attention and Nadaraya-Watson Kernel Regression

Here, we show connections between the **Transformer** and **kernel regression**. We show how the dot product between queries $\mathbf{q}_i$ and keys $\mathbf{k}_i$ can be replaced with other kernel operations $\alpha(\cdot, \cdot)$, chief among them the _Nadaraya-Watson kernel_ $K$. We also show empirically that self-attention variants can learn on sequential data such as periodic and aperiodic functions.
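
As a rough illustration of the idea above, the sketch below treats attention as a normalised weighted average of values, where the score function plays the role of $\alpha(\cdot, \cdot)$. It is not taken from this repository: the names `dot_score`, `gaussian_score`, `attend` and the `bandwidth` parameter are hypothetical, and the Gaussian kernel is assumed here as one common choice for Nadaraya-Watson regression.

```python
import math


def dot_score(q, k):
    # Scaled dot product; exponentiating and normalising gives softmax attention.
    return sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))


def gaussian_score(q, k, bandwidth=1.0):
    # Log of a Gaussian kernel, so exp(score) equals K(q, k).
    # `bandwidth` is an illustrative parameter, not a name from the report.
    sq_dist = sum((qi - ki) ** 2 for qi, ki in zip(q, k))
    return -sq_dist / (2.0 * bandwidth ** 2)


def attend(queries, keys, values, score=dot_score):
    # For each query, return sum_j w_j * v_j with
    # w_j = exp(score(q, k_j)) / sum_l exp(score(q, k_l)).
    outputs = []
    dim = len(values[0])
    for q in queries:
        scores = [score(q, k) for k in keys]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


# Toy 1-D regression data: keys are inputs x_j, values are targets y_j.
keys = [[0.0], [1.0], [2.0]]
values = [[0.0], [1.0], [4.0]]

softmax_out = attend([[1.0]], keys, values, score=dot_score)
kernel_out = attend([[1.0]], keys, values, score=gaussian_score)
```

Swapping `score` is the only change needed to move between dot-product attention and the kernel-regression estimate. With the Gaussian score, a smaller `bandwidth` concentrates the weights on the keys nearest to the query.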

> This is a class project for _MA4270: Data Modelling and Computation_ by Rishabh Anand (A0220603Y) and Ryan Chung Yi Sheng (A0219702J). [[`pdf`](https://github.com/rish-16/ma4270-project/blob/main/MA4270_Final_Report.pdf)]
