Exploring Accelerated Compound AI Systems with SambaNova & Llama 3.3-70B
- Host: GitHub
- URL: https://github.com/metaskills/fast-llama-inference
- Owner: metaskills
- License: MIT
- Created: 2024-12-26T21:04:33.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2025-02-16T00:15:57.000Z (4 months ago)
- Last Synced: 2025-02-24T03:45:35.765Z (3 months ago)
- Topics: artificial-intelligence, evaluation, llama3, sambanova, sambanova-cloud, typescript, vercel, vercel-ai-sdk
- Language: TypeScript
- Homepage: https://www.unremarkable.ai/exploring-accelerated-compound-ai-systems-with-sambanova-llama-3-3/
- Size: 479 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# Fast Llama Inference with SambaNova
https://www.unremarkable.ai/exploring-accelerated-compound-ai-systems-with-sambanova-llama-3-3/

## Setup
Requires Node.js v23.6.0 or higher, which runs TypeScript files natively via built-in type stripping.
Install the dependencies:
```shell
npm install
```

Make sure the following environment variables are set:
- `SAMBANOVA_API_KEY` - Your [SambaNova](https://cloud.sambanova.ai/apis?ref=unremarkable.ai) API Key.
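
A startup guard can fail fast when the key is missing. This check is illustrative and not part of the repo:

```typescript
// Illustrative guard (an assumption, not in the repo): fail fast without the key.
if (!process.env.SAMBANOVA_API_KEY) {
  throw new Error(
    "SAMBANOVA_API_KEY is not set; create a key at https://cloud.sambanova.ai/apis"
  );
}
```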
All demos & experiments leverage the following tools:
1. [Inquirer.js](https://www.npmjs.com/package/inquirer?ref=unremarkable.ai) prompts for user questions on the CLI.
2. The [Vercel AI SDK](https://sdk.vercel.ai?ref=unremarkable.ai) is used to invoke and stream model output to the CLI.
3. The `Meta-Llama-3.3-70B-Instruct` model, served with hyper-fast inference by [SambaNova](https://sambanova.ai?ref=unremarkable.ai).
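
Items 1 and 2 combine into a small prompt-and-stream loop. Here is a minimal sketch of that pattern, assuming SambaNova's OpenAI-compatible endpoint (`https://api.sambanova.ai/v1`) and the `inquirer`, `ai`, and `@ai-sdk/openai` packages; it is not the repo's exact code.

```typescript
// demo-sketch.ts: illustrative only; the file name and prompt text are assumptions.
import inquirer from "inquirer";
import { createOpenAI } from "@ai-sdk/openai";
import { streamText } from "ai";

// SambaNova exposes an OpenAI-compatible API, so the OpenAI provider
// can point at it with a custom baseURL.
const sambanova = createOpenAI({
  baseURL: "https://api.sambanova.ai/v1",
  apiKey: process.env.SAMBANOVA_API_KEY,
});

// 1. Prompt for a question on the CLI with Inquirer.js.
const { question } = await inquirer.prompt([
  { type: "input", name: "question", message: "Question:" },
]);

// 2. Invoke the model and stream tokens to stdout as they arrive.
const { textStream } = await streamText({
  model: sambanova("Meta-Llama-3.3-70B-Instruct"),
  prompt: question,
});

for await (const chunk of textStream) {
  process.stdout.write(chunk);
}
process.stdout.write("\n");
```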
Run the demo or the claim-verification eval:

```shell
npm run demo
npm run evalVerifyClaims
```

If you are testing other providers such as OpenAI or Bedrock, make sure you have the supporting API keys in your environment, for example `OPENAI_API_KEY` or the standard AWS environment variables.
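
The `MODEL` environment variable selects the provider at run time. A hypothetical selector could look like the sketch below; the model IDs and the `@ai-sdk/amazon-bedrock` wiring are assumptions, not the repo's exact code.

```typescript
// model-picker.ts: illustrative provider selection keyed off MODEL.
import { createOpenAI, openai } from "@ai-sdk/openai";
import { bedrock } from "@ai-sdk/amazon-bedrock";
import type { LanguageModel } from "ai";

const sambanova = createOpenAI({
  baseURL: "https://api.sambanova.ai/v1",
  apiKey: process.env.SAMBANOVA_API_KEY,
});

export function pickModel(): LanguageModel {
  switch (process.env.MODEL) {
    case "openai":
      // Uses OPENAI_API_KEY from the environment.
      return openai("gpt-4o");
    case "bedrock":
      // Uses the standard AWS environment variables / credential chain.
      return bedrock("anthropic.claude-3-5-sonnet-20240620-v1:0");
    default:
      return sambanova("Meta-Llama-3.3-70B-Instruct");
  }
}
```

With a selector like that in place, the provider is chosen per run: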
```shell
MODEL=bedrock npm run evalVerifyClaims
```

## Notes
The `bedrock-cross-region.md` file contains the following claims, which the demo verifies:
```
✅ Amazon Bedrock launched cross-region inference on August 27, 2024
Summary: The claim is correct as Amazon Bedrock did launch cross-region inference on August 27, 2024, as stated in the provided sources.
✅ Cross-region inference provides up to 2x the allocated in-region quotas
Summary: The claim is correct as cross-region inference provides up to 2x the allocated in-region quotas according to the sources.
✅ Cross-region inference support was extended to Knowledge Bases on September 13, 2024
Summary: The claim is correct as cross-region inference support was indeed extended to Knowledge Bases on September 13, 2024, as stated in the provided sources.
```

But if you change the first claim in that file to August 25, 2024, you will see the following output:
```
❌ Amazon Bedrock launched cross-region inference on August 25, 2024
Summary: The claim is incorrect because Amazon Bedrock launched cross-region inference on August 27, 2024, not August 25, 2024.
Fixed Claim: Amazon Bedrock launched cross-region inference on August 27, 2024
```
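
Each verdict above (the ✅/❌ flag, the summary, and the optional fixed claim) maps naturally onto a structured object. Below is a sketch of how the verification step could be modeled with the Vercel AI SDK's `generateObject` and a Zod schema; the schema, prompt wording, and function name are assumptions, not the repo's exact code.

```typescript
// verify-claim.ts: illustrative claim verification with structured output.
import { createOpenAI } from "@ai-sdk/openai";
import { generateObject } from "ai";
import { z } from "zod";

const sambanova = createOpenAI({
  baseURL: "https://api.sambanova.ai/v1",
  apiKey: process.env.SAMBANOVA_API_KEY,
});

// One verdict per claim: whether it holds, a short summary, and a
// corrected claim when the original is wrong.
const Verdict = z.object({
  correct: z.boolean(),
  summary: z.string(),
  fixedClaim: z.string().optional(),
});

export async function verifyClaim(claim: string, sources: string) {
  const { object } = await generateObject({
    model: sambanova("Meta-Llama-3.3-70B-Instruct"),
    schema: Verdict,
    prompt: `Verify the claim against the sources.\n\nClaim: ${claim}\n\nSources:\n${sources}`,
  });
  return object; // { correct, summary, fixedClaim? }
}
```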