https://github.com/jqhoogland/pattern-learning

Toy models of quanta learning

Host: GitHub
URL: https://github.com/jqhoogland/pattern-learning
Owner: jqhoogland
Created: 2023-03-28T15:34:13.000Z
Default Branch: main
Last Pushed: 2023-08-09T10:27:02.000Z
Last Synced: 2025-01-20T21:48:26.721Z
Language: Jupyter Notebook
Size: 112 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Delayed Generalization: Unifying Grokking and Double Descent

## Interpolating between grokking and double descent

## Running the sweeps

Replace `config.yml` in the following with the relevant config file:

```shell
wandb sweep --project grokking config.yml
```

This will initialize a sweep.

To run the sweep, run the following command:

```shell
wandb agent <sweep_id>
```

where `<sweep_id>` is the id of the sweep you want to run. You can find the sweep id in the output of the `wandb sweep` command above, or by running `wandb sweep ls`.

You can pass an optional `--count` flag to the `wandb agent` command to specify the number of runs you want to execute. If you don't pass this flag, the agent will run until all the runs in the sweep are complete (for a grid sweep).

On a multi-GPU machine, you can run multiple agents in parallel through the following:

```shell
CUDA_VISIBLE_DEVICES=0 wandb agent <sweep_id> &
CUDA_VISIBLE_DEVICES=1 wandb agent <sweep_id> &
...
```
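
The per-GPU pattern above can also be wrapped in a small helper. The following is a minimal sketch, not part of this repository: the function name `launch_agents` and the `AGENT_CMD` override (used for dry runs) are hypothetical, and it assumes a POSIX-style shell (`sh` or `bash`) with GPUs numbered from 0.

```shell
# Hypothetical helper: start one wandb agent per GPU and wait for all of them.
# Usage: launch_agents <sweep_id> <num_gpus> [count]
# AGENT_CMD is an assumed override for dry runs; it defaults to "wandb agent".
launch_agents() {
  sweep_id="$1"
  num_gpus="$2"
  count="${3:-}"   # optional: maximum number of runs per agent
  gpu=0
  while [ "$gpu" -lt "$num_gpus" ]; do
    # Each agent only sees the single GPU assigned to it.
    if [ -n "$count" ]; then
      CUDA_VISIBLE_DEVICES="$gpu" ${AGENT_CMD:-wandb agent} --count "$count" "$sweep_id" &
    else
      CUDA_VISIBLE_DEVICES="$gpu" ${AGENT_CMD:-wandb agent} "$sweep_id" &
    fi
    gpu=$((gpu + 1))
  done
  wait   # block until every background agent has exited
}
```

For example, `launch_agents <sweep_id> 2` would start agents on GPUs 0 and 1, and `launch_agents <sweep_id> 2 5` would additionally pass `--count 5` to each agent.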

## Toy Model

See the Jupyter notebooks in `toy_models` for more instructions.
