https://github.com/davzoku/cria

An end-to-end LLM app prototype based on Llama 2
https://github.com/davzoku/cria

artificial-intelligence chatbot llama2 llm nextjs transformers

Last synced: about 1 year ago
JSON representation

An end-to-end LLM app prototype based on Llama 2

Host: GitHub
URL: https://github.com/davzoku/cria
Owner: davzoku
License: mit
Created: 2023-08-06T13:49:29.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2024-02-14T13:12:05.000Z (over 2 years ago)
Last Synced: 2025-04-19T12:06:21.960Z (over 1 year ago)
Topics: artificial-intelligence, chatbot, llama2, llm, nextjs, transformers
Language: TypeScript
Homepage: https://chat.walterteng.com
Size: 8.4 MB
Stars: 6
Watchers: 3
Forks: 2
Open Issues: 4
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE

Awesome Lists containing this project

README

          # 🍼🦙 CRIA

[![Netlify Status](https://api.netlify.com/api/v1/badges/a9502b61-04e1-4202-be27-e5bd304321ae/deploy-status)](https://app.netlify.com/sites/cria-chat/deploys)

💡 [Article](https://walterteng.com/cria) | 💻 [HuggingFace](https://huggingface.co/davzoku/cria-llama2-7b-v1.3) | 📔 Colab [1](https://colab.research.google.com/drive/1rYTs3qWJerrYwihf1j0f00cnzzcpAfYe),[2](https://colab.research.google.com/drive/1Wjs2I1VHjs6zT_GE42iEXsLtYh6VqiJU)

Welcome to CRIA, a LLM model series based on [Llama 2-7B](https://github.com/facebookresearch/llama).

## What is CRIA?

> **Hint:** krē-ə plural crias; a baby llama, alpaca, vicuña, or guanaco.



   




With ChatGPT's help, CRIA also stands for **"Crafting a Rapid prototype of an Intelligent llm App using open source resources"**. This encapsulates the objective of this project perfectly.

Additionally, akin to a baby llama in nature, CRIA pays homage to its foundational model, Meta's Llama-2 7B Large Language Model.

## Features

- Demostration of instruction-tuning on latest open source LLM using a custom dataset on a _free colab instance_.

- Utilized FastAPI for efficient model serving and inference deployment.

  - Supports real-time with Server-Sent Events (SSE) for a seamless chat experience.

- Enjoy a modern front-end built with Next.js and Chakra UI.

- Supports both local deployment, and cloud deployment. (Coming Soon!)



  


  Demo: Leveraging on open source resources such as Horizon AI Template



## What You'll Find Here

In this repository, you'll find:

**Code:** Dive into the technical details of our chatbot implementation, including the training process, API server implementation, the integration of Next.js for the user interface, and more.

**Documentation:** Detailed documentation to help you understand and replicate the CRIA setup, from model selection to deployment considerations.

**Demo:** Access a live demo showcasing CRIA in action.

## Model History

| HuggingFace Model                                                                                                                                                   | Model Type    | Base Model                                                                                    | Dataset                                                                                                                | Colab                                                                                                                                                                                                                                                                                                           | Status       |

| ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- | --------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------ |

| [cria-llama2-7b-v1.3](https://huggingface.co/davzoku/cria-llama2-7b-v1.3), 
 [cria-llama2-7b-v1.3_peft](https://huggingface.co/davzoku/cria-llama2-7b-v1.3_peft) | Merged / PEFT | [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf)     | [mlabonne/CodeLlama-2-20k](https://huggingface.co/datasets/mlabonne/CodeLlama-2-20k)                                   | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1rYTs3qWJerrYwihf1j0f00cnzzcpAfYe) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Wjs2I1VHjs6zT_GE42iEXsLtYh6VqiJU) | Latest       |

| cria-llama2-7b-v1.1, cria-llama2-7b-v1.2                                                                                                                            | Merged / PEFT | [TinyPixel/Llama-2-7B-bf16-sharded](https://huggingface.co/TinyPixel/Llama-2-7B-bf16-sharded) | [n3rd0/DreamBook_Guanaco_Format](https://huggingface.co/datasets/n3rd0/DreamBook_Guanaco_Format)                       | N.A.                                                                                                                                                                                                                                                                                                            | Experimental |

| cria-llama2-7b-v1.0                                                                                                                                                 | PEFT          | [TinyPixel/Llama-2-7B-bf16-sharded](https://huggingface.co/TinyPixel/Llama-2-7B-bf16-sharded) | [Elliot4AI/dolly-15k-chinese-guanacoformat](https://huggingface.co/datasets/Elliot4AI/dolly-15k-chinese-guanacoformat) | N.A.                                                                                                                                                                                                                                                                                                            | Experimental |

## Documentation

### Setup

The instructions to run the various components, such as the API server and frontend interface, can be found at [/docs/setup.md](/docs/setup.md).

### Deployment

The instructions to deploy the API server and frontend on the cloud, can be found at [/docs/deployment.md](/docs/deployment.md).

### Slides

CRIA v1.3 was first presented in a private session on 18 Aug 2023. The slides is publicly available [here](https://docs.google.com/presentation/d/1HdHfl0XiGIvRd-R3AHTEZn8Ee9ibFTp_Dv-q1S5SgrQ/edit?usp=sharing).

### Architectural Overview

The overview of the project can be found at [/docs/architecture.md](/docs/architecture.md).

### Architectural Decision Records (ADR)

Please refer to the [/docs/adr/](/docs/adr/) folder for the detailed information on the list of design decisions made so far.

### Model Evaluation

The preliminary model evaluation can be be found at [/docs/model-eval/](/docs/model-eval/) folder.

## References

### Guides / Tutorials / Discussions

- [ML Blog - Fine-Tune Your Own Llama 2 Model in a Colab Notebook](https://mlabonne.github.io/blog/posts/Fine_Tune_Your_Own_Llama_2_Model_in_a_Colab_Notebook.html)

- [Fine-tune Llama 2 in Google Colab.ipynb - Colaboratory](https://colab.research.google.com/drive/1PEQyJO1-f6j0S_XJ8DV50NkpzasXkrzd?usp=sharing)

- [Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA](https://huggingface.co/blog/4bit-transformers-bitsandbytes)

- [bnb-4bit-training.ipynb - Colaboratory](https://colab.research.google.com/drive/1VoYNfYDKcKRQRor98Zbf2-9VQTtGJ24k?usp=sharing)

- [🐐Llama 2 Fine-Tune with QLoRA [Free Colab 👇🏽] - YouTube](https://www.youtube.com/watch?v=eeM6V5aPjhk)

- [Fine-Tune Large LLMs with QLoRA (Free Colab Tutorial) - YouTube](https://www.youtube.com/watch?v=NRVaRXDoI3g)

- [LLaMA2 for Multilingual Fine Tuning? - YouTube](https://www.youtube.com/watch?v=ThKWQcyQXF8)

- [How to Tune Falcon-7B With QLoRA on a Single GPU - YouTube](https://www.youtube.com/watch?v=AXG7TA7vIQ8)

- [🦙Llama 2 Fine-Tuning with 4-Bit QLoRA on Dolly-15k [Free Colab 🙌] - YouTube](https://www.youtube.com/watch?v=o5bU1H-6TqM)

- [Fine-Tune Your Own Llama 2 Model in a Colab Notebook | Towards Data Science](https://towardsdatascience.com/fine-tune-your-own-llama-2-model-in-a-colab-notebook-df9823a04a32)

### Datasets

- [mlabonne/CodeLlama-2-20k](https://huggingface.co/datasets/mlabonne/CodeLlama-2-20k)

- [n3rd0/DreamBook_Guanaco_Format](https://huggingface.co/datasets/n3rd0/DreamBook_Guanaco_Format)

- [Elliot4AI/dolly-15k-chinese-guanacoformat](https://huggingface.co/datasets/Elliot4AI/dolly-15k-chinese-guanacoformat)

### Models

- [NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf)

- [TinyPixel/Llama-2-7B-bf16-sharded](https://huggingface.co/TinyPixel/Llama-2-7B-bf16-sharded)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/davzoku/cria

Awesome Lists containing this project

README