https://github.com/replicate/cog-mamba
Cog wrapper for Mamba language models
- Host: GitHub
- URL: https://github.com/replicate/cog-mamba
- Owner: replicate
- License: apache-2.0
- Created: 2024-02-05T18:24:05.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-02-05T19:06:40.000Z (almost 2 years ago)
- Last Synced: 2025-02-25T18:15:46.112Z (11 months ago)
- Language: Python
- Size: 7.81 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# Cog wrapper for Mamba LLMs
This is a Cog wrapper for Mamba language models. See the original [repo](https://github.com/state-spaces/mamba), [paper](https://arxiv.org/abs/2312.00752), and Replicate [demo](https://replicate.com/adirik/mamba-130m) for details.
## Basic Usage
You will need to have [Cog](https://github.com/replicate/cog/blob/main/docs/getting-started-own-model.md) and Docker installed to serve your model as an API. Follow the [model pushing guide](https://replicate.com/docs/guides/push-a-model) to push your own fork of the model to [Replicate](https://replicate.com) with Cog. To run a prediction:
```bash
cog predict -i prompt="How are you doing today?"
```
To start your server and serve the model as an API:
```bash
cog run -p 5000 python -m cog.server.http
```
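Once the server is running, predictions can be requested over HTTP. The sketch below assumes the server is reachable at `localhost:5000` (the port from the command above) and uses Cog's `POST /predictions` endpoint, which expects the model inputs wrapped in an `"input"` object. The helper names `build_request` and `predict` are hypothetical, and the shape of the returned output depends on the predictor.

```python
import json
import urllib.request

# Assumption: the server started above is listening on localhost:5000.
SERVER_URL = "http://localhost:5000/predictions"


def build_request(prompt, url=SERVER_URL, **inputs):
    """Build a POST request whose JSON body wraps the inputs in an "input" object."""
    payload = {"input": {"prompt": prompt, **inputs}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(prompt, **inputs):
    """Send the request and return the "output" field of the JSON response."""
    with urllib.request.urlopen(build_request(prompt, **inputs)) as response:
        return json.loads(response.read())["output"]
```

With the server up, `predict("How are you doing today?", temperature=0.75)` sends one request and returns the generated output. Nothing is sent until `predict` is called.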
The API input arguments are as follows:
- **prompt:** The text prompt for Mamba.
- **max_length:** Maximum number of tokens to generate. As a rough guide, an English word is typically 1-2 tokens.
- **temperature:** Adjusts the randomness of outputs. Values above 1 make outputs more random, 0 is deterministic, and 0.75 is a good starting value.
- **top_p:** Samples from the smallest set of most likely tokens whose cumulative probability reaches p during text decoding. Lower it to ignore less likely tokens.
- **top_k:** Samples from the k most likely tokens during text decoding. Lower it to ignore less likely tokens.
- **repetition_penalty:** Penalty for repeated words in generated text; 1 is no penalty, values greater than 1 discourage repetition, less than 1 encourage it.
- **seed:** Random seed for reproducible text generation. Set a specific seed to reproduce results, or leave it blank for a random seed.
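To make the effect of `temperature`, `top_k`, and `top_p` concrete, here is a minimal pure-Python sketch of how these three settings commonly reshape a next-token distribution. It is an illustration only, not the wrapper's or Mamba's actual decoding code: the function name `sampling_distribution` is hypothetical, the order of the filtering steps is an assumption, and `repetition_penalty` and `seed` are left out.

```python
import math


def sampling_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return token probabilities after temperature, top-k and top-p filtering."""
    if temperature == 0:
        # Deterministic case: all probability goes to the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]

    # Temperature rescales the logits before the softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtracted for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank tokens from most to least likely; top-k keeps only the first k.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    mass = sum(probs[i] for i in order)

    # top-p keeps the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / mass
        if cumulative >= top_p:
            break

    # Renormalise over the surviving tokens; everything else gets zero.
    kept_mass = sum(probs[i] for i in kept)
    return [probs[i] / kept_mass if i in kept else 0.0 for i in range(len(probs))]
```

For example, with logits `[2.0, 1.0, 0.0]`, raising `temperature` from 1 to 2 lowers the probability of the most likely token, while `top_k=1` or `temperature=0` leaves only that token.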
## References
```bibtex
@article{mamba,
  title={Mamba: Linear-Time Sequence Modeling with Selective State Spaces},
  author={Gu, Albert and Dao, Tri},
  journal={arXiv preprint arXiv:2312.00752},
  year={2023}
}
```