# award-factory




🎓 Showcasing project for the 2024 Google Machine Learning Bootcamp 🎓

KOREAN | ENGLISH

Award-Factory: Awards crafted for you by a hilariously talented generative AI

Check out the prototypes in the badge below.

# 1. Introduction

> [!NOTE]
>
> - This project aims to develop a service where anyone can effortlessly create a customized certificate in just a few minutes, making it easy to celebrate and appreciate others.
> - Award Factory was conceived as a heartwarming project to spread happiness, inspired by the idea of creating special certificates for parents. Built with sustainability in mind, the service pairs the front end with a fine-tuned Google Gemma:2b model to deliver personalized award texts. The service is not fully active due to server operating costs, but a demo is available on Hugging Face.
> - QLoRA fine-tuning and llama.cpp quantization were employed to reduce model size and inference time, keeping the service efficient to run.

https://github.com/user-attachments/assets/2def17e0-46ea-4561-8b50-fc78d595b88b



App Design | Generated Awards

# 2. Implementation





**Google Gemma:2B Fine-tuning**
Implemented prompt engineering and QLoRA-based fine-tuning of the google/gemma-2b-it model using PEFT to optimize personalized award text generation tailored to user preferences.
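
For reference, the sketch below shows what a QLoRA setup for google/gemma-2b-it with Hugging Face PEFT typically looks like; the LoRA hyperparameters and target modules are illustrative assumptions, not this project's exact configuration.

```
# Minimal QLoRA sketch: 4-bit base model + trainable LoRA adapters.
# Hyperparameters below are placeholders, not the project's actual settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "google/gemma-2b-it"

# The "Q" in QLoRA: load the base weights quantized to 4-bit NF4,
# then train only the LoRA adapters in higher precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters on the attention projections; r/alpha values are illustrative.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# Training on prompt/award-text pairs (e.g. with an SFT trainer) follows from here.
```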


**llama-cpp Quantization**
Applied quantization with the Q5_K_M option in llama.cpp, achieving a 63.3% reduction in model size and an 83.4% decrease in inference time without compromising output quality, enabling faster and more efficient serving.


```
$ llama.cpp/llama-quantize gguf_model/gemma-2b-it-award-factory-v2.gguf gguf_model/gemma-2b-it-award-factory-v2.gguf-Q5_K_M.gguf Q5_K_M

...
llama_model_quantize_internal: model size = 4780.29 MB
llama_model_quantize_internal: quant size = 1748.67 MB

main: quantize time = 17999.81 ms
main: total time = 17999.81 ms
```

```
$ ollama list

NAME                   ID              SIZE      MODIFIED
award-factory:q5       8df06172b64b    1.8 GB    19 seconds ago
award-factory:latest   ae186115cc83    5.0 GB    28 minutes ago
```
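
Once quantized, the GGUF file can be loaded for inference from Python via llama-cpp-python. The sketch below is a minimal example assuming the Q5_K_M file produced above; the prompt and generation parameters are illustrative, not the service's actual ones.

```
# Minimal inference sketch with llama-cpp-python (pip install llama-cpp-python).
# Model path matches the quantized file above; prompt and parameters are illustrative.
from llama_cpp import Llama

llm = Llama(
    model_path="gguf_model/gemma-2b-it-award-factory-v2.gguf-Q5_K_M.gguf",
    n_ctx=2048,
)

output = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Write a short, playful award citation for the world's best mom."}
    ],
    max_tokens=256,
    temperature=0.7,
)
print(output["choices"][0]["message"]["content"])
```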


**Docker Compose**
Utilized Docker Compose to containerize the backend and frontend services, ensuring consistent deployment environments and supporting scalable, maintainable full-stack web development.
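
A compose file along the following lines could wire the two services together; the service names, build contexts, ports, and environment variables below are assumptions for illustration, not the repository's actual docker-compose.yml.

```
# Illustrative compose sketch: directory names, ports, and environment
# variables are assumptions, not the project's actual configuration.
version: "3.8"

services:
  backend:
    build: ./backend        # FastAPI app serving the fine-tuned model
    ports:
      - "8000:8000"
  frontend:
    build: ./frontend       # Next.js app
    ports:
      - "3000:3000"
    environment:
      - API_URL=http://backend:8000
    depends_on:
      - backend
```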




# 3. Contribution