https://github.com/googleinterns/localizing-paragraph-memorization
- Host: GitHub
- URL: https://github.com/googleinterns/localizing-paragraph-memorization
- Owner: googleinterns
- License: apache-2.0
- Created: 2024-02-02T17:14:49.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-02-21T22:30:53.000Z (over 2 years ago)
- Last Synced: 2024-11-29T13:50:13.297Z (over 1 year ago)
- Language: Jupyter Notebook
- Size: 1.79 MB
- Stars: 13
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# Localizing and Controlling Paragraph Level Recitation
This repository explores how small open-source language models (LMs) such as GPT-Neo implement paragraph-level recitation of their training data. It includes helper scripts and exploratory Jupyter notebooks. Work done by [Niklas Stoehr](https://niklas-stoehr.com/) during his winter 2023 research internship.
**Not an official Google project**
## Project Structure
### utils
__________________________________________________
Helper scripts with basic functionality shared across the notebooks:
- patching
- evaluation
- dataLoaders
- gradient
- intervening
- localizing
- modelHandlers
### notebooks
__________________________________________________
Notebooks to reproduce the main experiments:
#### 1 descriptive
- explorative
- token perturbation
#### 2 localizing
- activation patching
- gradient-based attribution
- parameter gradients
- activation gradients
- attention head analysis
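
For readers new to the localization methods listed above, here is a minimal sketch of activation patching on a toy network. It is not the code from the notebooks. The two-layer model, its random weights, and the names `hidden_acts`, `output`, and `patching_effects` are all hypothetical stand-ins chosen for illustration. The idea is to run the model on a clean and a corrupted input, copy one clean hidden activation into the corrupted run at a time, and record how far the output moves.

```python
import math
import random

random.seed(0)
N_IN, N_HID = 4, 8

# Hypothetical toy weights: input -> hidden, hidden -> scalar output.
W1 = [[random.gauss(0, 1) for _ in range(N_HID)] for _ in range(N_IN)]
W2 = [random.gauss(0, 1) for _ in range(N_HID)]


def hidden_acts(x):
    """Hidden activations of the toy model for input vector x."""
    return [
        math.tanh(sum(x[i] * W1[i][j] for i in range(N_IN)))
        for j in range(N_HID)
    ]


def output(hidden):
    """Linear readout from the hidden activations to a scalar output."""
    return sum(h * w for h, w in zip(hidden, W2))


def patching_effects(x_clean, x_corrupt):
    """Patch each clean hidden unit into the corrupted run, one at a time.

    Returns the per-unit change in output and the total clean-vs-corrupt gap.
    """
    clean_h = hidden_acts(x_clean)
    corrupt_h = hidden_acts(x_corrupt)
    base = output(corrupt_h)
    effects = []
    for j in range(N_HID):
        patched = list(corrupt_h)
        patched[j] = clean_h[j]  # overwrite one unit with its clean value
        effects.append(output(patched) - base)
    return effects, output(clean_h) - base


x_clean = [random.gauss(0, 1) for _ in range(N_IN)]
x_corrupt = [v + random.gauss(0, 0.5) for v in x_clean]
effects, total_gap = patching_effects(x_clean, x_corrupt)
top_unit = max(range(N_HID), key=lambda j: abs(effects[j]))
print("hidden unit with the largest patching effect:", top_unit)
```

In this toy model the per-unit effects sum to the total gap because the readout is linear. In a real transformer, patched components interact through later layers, so no such additivity should be expected.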
#### 3 editing
### paragraphs
__________________________________________________
CSV file of some paragraphs that are memorized by GPT-Neo 125M.
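
One common way to decide whether a paragraph counts as memorized is to prompt the model with a prefix and check whether greedy decoding reproduces the rest exactly. The sketch below illustrates that check on plain token-id lists. The exact-match criterion, the prefix and continuation lengths, and the names `matching_prefix_length`, `is_memorized`, and `toy_generate` are assumptions made for illustration. The README does not state the CSV column layout or the criterion used in the notebooks.

```python
def matching_prefix_length(generated, reference):
    """Number of leading tokens on which the two sequences agree."""
    n = 0
    for g, r in zip(generated, reference):
        if g != r:
            break
        n += 1
    return n


def is_memorized(generate_fn, paragraph_tokens, prefix_len, continuation_len):
    """True if generate_fn reproduces the reference continuation exactly.

    generate_fn(prefix, n_tokens) is a hypothetical stand-in for greedy
    decoding with a real model; it must return a list of token ids.
    """
    prefix = paragraph_tokens[:prefix_len]
    reference = paragraph_tokens[prefix_len:prefix_len + continuation_len]
    if len(reference) < continuation_len:
        return False  # paragraph too short for the requested split
    generated = generate_fn(prefix, continuation_len)
    return matching_prefix_length(generated, reference) == continuation_len


# Toy "model" that has stored exactly one paragraph (made-up token ids).
memorized_paragraph = list(range(100, 120))


def toy_generate(prefix, n_tokens):
    """Recite the stored paragraph if the prefix matches, else emit zeros."""
    if memorized_paragraph[:len(prefix)] == list(prefix):
        return memorized_paragraph[len(prefix):len(prefix) + n_tokens]
    return [0] * n_tokens


seen = is_memorized(toy_generate, memorized_paragraph, 10, 10)
unseen = is_memorized(toy_generate, list(range(200, 220)), 10, 10)
print(seen, unseen)
```

With a real model, `toy_generate` would be replaced by a function that tokenizes the prefix, runs greedy decoding, and returns the generated token ids.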