Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Starter examples for using Next.js and the Vercel AI SDK with Llama.cpp and ModelFusion.
https://github.com/lgrammel/modelfusion-llamacpp-nextjs-starter
- Host: GitHub
- URL: https://github.com/lgrammel/modelfusion-llamacpp-nextjs-starter
- Owner: lgrammel
- License: MIT
- Created: 2023-11-19T16:46:24.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-01-13T12:23:26.000Z (9 months ago)
- Last Synced: 2024-09-18T18:51:14.559Z (15 days ago)
- Topics: ai, llama2, llamacpp, mistral, modelfusion, next, nextjs, vercel-ai, vercel-ai-sdk
- Language: TypeScript
- Homepage:
- Size: 281 KB
- Stars: 31
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Next.js, Vercel AI SDK, Llama.cpp & ModelFusion starter
This starter example shows how to use [Next.js](https://nextjs.org/), the [Vercel AI SDK](https://sdk.vercel.ai/docs), [Llama.cpp](https://github.com/ggerganov/llama.cpp), and [ModelFusion](https://modelfusion.dev) to create a ChatGPT-like, AI-powered streaming chatbot.
## Setup
1. Install [Llama.cpp](https://github.com/ggerganov/llama.cpp) on your machine.
2. Clone the repository: `git clone https://github.com/lgrammel/modelfusion-llamacpp-nextjs-starter.git`
3. Install dependencies: `npm install`
4. Start the development server: `npm run dev`

For each example, you also need to download the GGUF model and start the Llama.cpp server, as described below.
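The example routes assume the Llama.cpp server is reachable at its default address (the `server` binary listens on port 8080 by default). If yours listens elsewhere, ModelFusion lets you pass an explicit API configuration. A minimal sketch, assuming ModelFusion's `llamacpp.Api` helper and its `baseUrl` option:

```ts
import { llamacpp } from "modelfusion";

// Assumption: llamacpp.Api accepts a baseUrl split into host/port parts.
// Point this at wherever your llama.cpp server is listening.
const api = llamacpp.Api({
  baseUrl: { host: "localhost", port: "8080" },
});

const model = llamacpp
  .CompletionTextGenerator({
    api, // use the explicit server address instead of the default
    promptTemplate: llamacpp.prompt.Llama2,
  })
  .withChatPrompt();
```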
## Examples
### Llama 2
1. Model: [Llama-2-7B-Chat-GGUF](https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF)
2. Server start: `./server -m models/llama-2-7b-chat.Q4_K_M.gguf` (with the right model path)
3. Go to http://localhost:3000/llama2
4. Code: `app/api/llama/route.ts`

### Mistral Instruct
1. Model: [Mistral-7B-Instruct-v0.2-GGUF](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF)
2. Server start: `./server -m models/mistral-7b-instruct-v0.2.Q4_K_M.gguf` (with the right model path)
3. Go to http://localhost:3000/mistral
4. Code: `app/api/mistral/route.ts`

### Mixtral Instruct
1. Model: [Mixtral-8x7B-Instruct-v0.1-GGUF](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF)
2. Server start: `./server -m models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf` (with the right model path)
3. Go to http://localhost:3000/mixtral
4. Code: `app/api/mixtral/route.ts`

### OpenHermes 2.5
1. Model: [OpenHermes-2.5-Mistral-7B-GGUF](https://huggingface.co/TheBloke/OpenHermes-2.5-Mistral-7B-GGUF)
2. Server start: `./server -m models/openhermes-2.5-mistral-7b.Q4_K_M.gguf` (with the right model path)
3. Go to http://localhost:3000/openhermes
4. Code: `app/api/openhermes/route.ts`

## Example Route
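The route below is the Llama 2 variant. The other routes have the same shape; switching models should only require starting the Llama.cpp server with a different GGUF file and selecting the matching prompt template. A sketch of the one-line change (template names such as `llamacpp.prompt.Mistral` and `llamacpp.prompt.ChatML` are assumptions about ModelFusion's built-in templates):

```ts
import { llamacpp } from "modelfusion";

// Assumed template names for the other examples:
const mistral = llamacpp.CompletionTextGenerator({
  promptTemplate: llamacpp.prompt.Mistral, // Mistral/Mixtral Instruct format
  contextWindowSize: 4096,
});

const openhermes = llamacpp.CompletionTextGenerator({
  promptTemplate: llamacpp.prompt.ChatML, // OpenHermes 2.5 is ChatML-based
  contextWindowSize: 4096,
});
```

The full Llama 2 route: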
```ts
import { ModelFusionTextStream, asChatMessages } from "@modelfusion/vercel-ai";
import { Message, StreamingTextResponse } from "ai";
import { llamacpp, streamText, trimChatPrompt } from "modelfusion";

export const runtime = "edge";

export async function POST(req: Request) {
  const { messages }: { messages: Message[] } = await req.json();

  const model = llamacpp
    .CompletionTextGenerator({
      promptTemplate: llamacpp.prompt.Llama2, // choose the correct prompt template
      temperature: 0,
      cachePrompt: true,
      contextWindowSize: 4096, // Llama 2 context window size
      maxGenerationTokens: 512, // room for the answer
    })
    .withChatPrompt();

  // Use ModelFusion to call llama.cpp:
  const textStream = await streamText({
    model,
    // reduce the chat prompt length to fit the context window:
    prompt: await trimChatPrompt({
      model,
      prompt: {
        system:
          "You are an AI chat bot. " +
          "Follow the user's instructions carefully.",

        // map Vercel AI SDK Message to ModelFusion ChatMessage:
        messages: asChatMessages(messages),
      },
    }),
  });

  // Return the result using the Vercel AI SDK:
  return new StreamingTextResponse(
    ModelFusionTextStream(
      textStream,
      // optional callbacks:
      {
        onStart() {
          console.log("onStart");
        },
        onToken(token) {
          console.log("onToken", token);
        },
        onCompletion: () => {
          console.log("onCompletion");
        },
        onFinal(completion) {
          console.log("onFinal", completion);
        },
      }
    )
  );
}
```
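On the client side, the starter's pages (e.g. `app/llama2/page.tsx`, a path assumed from the route layout) would typically use the Vercel AI SDK's `useChat` hook, which posts the conversation to the route above and renders the streamed tokens as they arrive. A minimal sketch:

```tsx
"use client";

import { useChat } from "ai/react";

export default function Chat() {
  // useChat manages message history, the input field, and streaming updates.
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "/api/llama", // the route shown above
  });

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          {m.role === "user" ? "User: " : "AI: "}
          {m.content}
        </div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
  );
}
```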