https://github.com/llm-random/llm-random

Host: GitHub
URL: https://github.com/llm-random/llm-random
Owner: llm-random
License: apache-2.0
Created: 2022-01-17T09:23:38.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2024-04-19T17:36:52.000Z (about 1 year ago)
Last Synced: 2024-04-20T14:42:56.123Z (about 1 year ago)
Language: Python
Homepage: https://llm-random.github.io/
Size: 6.67 MB
Stars: 145
Watchers: 7
Forks: 10
Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

Awesome-state-space-models - Mixture of Experts

README

# llm-random
We are LLM-Random, a research group at [IDEAS NCBR](https://ideas-ncbr.pl/en/) (Warsaw, Poland). We develop this repo and use it to conduct research. To learn more about us and our research, check out our blog, [llm-random.github.io](https://llm-random.github.io/).

## Publications, preprints and blogposts
- Scaling Laws for Fine-Grained Mixture of Experts ([arxiv](https://arxiv.org/abs/2402.07871))
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts ([arxiv](https://arxiv.org/abs/2401.04081), [blogpost](https://llm-random.github.io/posts/moe_mamba/))
- Mixture of Tokens: Efficient LLMs through Cross-Example Aggregation ([arxiv](https://arxiv.org/abs/2310.15961), [blogpost](https://llm-random.github.io/posts/mixture_of_tokens/))
- Decoupled Relative Learning Rate Schedules (implementation [tag](https://github.com/llm-random/llm-random/releases/tag/relative_learning_rates))

## Development (WIP)
### Getting started
In the root directory, run `./start-dev.sh`. This creates a virtual environment, installs the requirements, and sets up git hooks.

## Running Experiments (WIP)

### Experiments config
Use the baseline configuration in `configs/test/test_baseline.yaml` as a template. Based on it, create a new experiment config and put it in `lizrd/scripts/run_configs`.

### Running Locally
`python -m lizrd.grid path/to/config`

### Running Remotely
`bash scripts/run_exp_remotely.sh scripts/run_configs/`

#### Running on the Helios cluster
Since Helios uses an older Python version, you need to set up a conda environment with a newer Python. To do so, copy `setup_helios.sh` to your home directory and run it.

### Initializing New Project

```bash
cd research/
cp -r template new_project
cd new_project
find . -type f -exec sed -i 's/research\.template/research\.new_project/g' {} +
```
To use the runner of your new project, add `runner: ` to your YAML config.
If you move `train.py` or `argparse.py`, also add `argparse: ` to your YAML config.
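
The copy-and-rename steps in the shell snippet above can also be sketched in Python, which may be handy where `sed -i` behaves differently (BSD/macOS `sed` expects a suffix argument after `-i`). This is a minimal sketch, not part of the repo: the function name `init_project` and its parameters are hypothetical, and it assumes the project files are UTF-8 text.

```python
import shutil
from pathlib import Path


def init_project(research_dir: Path, new_name: str, template: str = "template") -> Path:
    """Copy research/<template> to research/<new_name> and rewrite module paths.

    Hypothetical helper mirroring the `cp -r` + `find ... sed` steps above.
    """
    src = research_dir / template
    dst = research_dir / new_name
    # Unlike `cp -r`, copytree raises FileExistsError if dst already exists.
    shutil.copytree(src, dst)

    old, new = f"research.{template}", f"research.{new_name}"
    for path in dst.rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip binary files instead of corrupting them
        if old in text:
            path.write_text(text.replace(old, new), encoding="utf-8")
    return dst


if __name__ == "__main__":
    # Run from the repository root; "new_project" is a placeholder name.
    init_project(Path("research"), "new_project")
```

Only the copied directory is modified; the original `template` directory is left untouched.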

### Troubleshooting

#### `TypeError: 'type' object is not subscriptable`
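This usually means you are running an older Python version: subscripting built-in types (for example `list[int]`) only works at runtime on Python 3.9 or newer. An easy way to get a newer Python is to install conda and create an environment with it.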
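
To check whether the interpreter is the likely cause, a quick version test can help. The sketch below is illustrative only: the helper names `supports_builtin_generics` and `describe_interpreter` are hypothetical and not part of the repo, and the 3.9 threshold reflects when built-in generics such as `list[int]` became subscriptable, not a documented requirement of this project.

```python
import sys
from typing import List


def supports_builtin_generics() -> bool:
    """Return True if expressions like list[int] work at runtime (Python 3.9+)."""
    return sys.version_info >= (3, 9)


def describe_interpreter() -> str:
    """Summarize the running interpreter and whether list[int] is usable."""
    version = ".".join(map(str, sys.version_info[:3]))
    if supports_builtin_generics():
        return f"Python {version}: list[int] is supported"
    return f"Python {version}: list[int] raises TypeError; upgrade or use typing.List"


# typing.List[int] is subscriptable on old and new interpreters alike.
LegacyAlias = List[int]

if __name__ == "__main__":
    print(describe_interpreter())
```

Run it with the same interpreter that produced the error, since a cluster may have several Python versions installed.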

#### Host key verification failed when running remotely
```
Host key verification failed.
fatal: Could not read from remote repository.
```
This might point to an issue with GitHub authentication when setting up a new cluster.
To fix it:
1. Add `ForwardAgent yes` to the host's entry in `~/.ssh/config`.
2. Run `ssh -T git@github.com` on the remote host.

# License

This project is licensed under the terms of the Apache License, Version 2.0.

Copyright 2023 LLM-Random Authors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
