# award-factory




🎓 Showcasing project for the 2024 Google Machine Learning Bootcamp 🎓

KOREAN | ENGLISH

Award-Factory: Awards crafted for you by a hilariously talented generative AI

Check out the prototypes in the badge below.

# 1. Introduction

> [!NOTE]
>
> - This project aims to develop a service where anyone can effortlessly create a customized certificate in just a few minutes, making it easy to celebrate and appreciate others.
> - Award Factory was conceived as a heartwarming project to spread happiness, inspired by the idea of creating special certificates for parents. Built with sustainability in mind, the service pairs the front end with a fine-tuned Google Gemma:2b model to deliver personalized award texts. The service is not fully active due to server operating costs, but a demo is available on Hugging Face.
> - QLoRA fine-tuning and llama.cpp quantization were employed to reduce model size and inference time, keeping the service efficient to run.

https://github.com/user-attachments/assets/2def17e0-46ea-4561-8b50-fc78d595b88b



App Design | Generated Awards

# 2. Implementation





**Google Gemma:2B Fine-tuning**
Implemented prompt engineering and QLoRA-based fine-tuning of the google/gemma-2b-it model using PEFT to optimize personalized award text generation tailored to user preferences.
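
For reference, the sketch below shows what a QLoRA setup for google/gemma-2b-it with Hugging Face PEFT typically looks like; the LoRA hyperparameters and target modules are illustrative assumptions, not this project's exact configuration.

```
# Minimal QLoRA sketch: 4-bit base model + trainable LoRA adapters.
# Hyperparameters below are placeholders, not the project's actual settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "google/gemma-2b-it"

# The "Q" in QLoRA: load the base weights quantized to 4-bit NF4,
# then train only the LoRA adapters in higher precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters on the attention projections; r/alpha values are illustrative.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# Training on prompt/award-text pairs (e.g. with an SFT trainer) follows from here.
```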


**llama-cpp Quantization**
Applied quantization with the Q5_K_M option in llama.cpp, achieving a 63.3% reduction in model size and an 83.4% decrease in inference time without compromising output quality, enabling faster and more efficient serving.


```
$ llama.cpp/llama-quantize gguf_model/gemma-2b-it-award-factory-v2.gguf gguf_model/gemma-2b-it-award-factory-v2.gguf-Q5_K_M.gguf Q5_K_M

...
llama_model_quantize_internal: model size = 4780.29 MB
llama_model_quantize_internal: quant size = 1748.67 MB

main: quantize time = 17999.81 ms
main: total time = 17999.81 ms
```

```
$ ollama list

NAME                   ID              SIZE      MODIFIED
award-factory:q5       8df06172b64b    1.8 GB    19 seconds ago
award-factory:latest   ae186115cc83    5.0 GB    28 minutes ago
```
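
Once quantized, the GGUF file can be loaded for inference from Python via llama-cpp-python. The sketch below is a minimal example assuming the Q5_K_M file produced above; the prompt and generation parameters are illustrative, not the service's actual ones.

```
# Minimal inference sketch with llama-cpp-python (pip install llama-cpp-python).
# Model path matches the quantized file above; prompt and parameters are illustrative.
from llama_cpp import Llama

llm = Llama(
    model_path="gguf_model/gemma-2b-it-award-factory-v2.gguf-Q5_K_M.gguf",
    n_ctx=2048,
)

output = llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Write a short, playful award citation for the world's best mom."}
    ],
    max_tokens=256,
    temperature=0.7,
)
print(output["choices"][0]["message"]["content"])
```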


**Docker Compose**
Utilized Docker Compose to containerize the backend and frontend services, ensuring consistent deployment environments and supporting scalable, maintainable full-stack web development.
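
A compose file along the following lines could wire the two services together; the service names, build contexts, ports, and environment variables below are assumptions for illustration, not the repository's actual docker-compose.yml.

```
# Illustrative compose sketch: directory names, ports, and environment
# variables are assumptions, not the project's actual configuration.
version: "3.8"

services:
  backend:
    build: ./backend        # FastAPI app serving the fine-tuned model
    ports:
      - "8000:8000"
  frontend:
    build: ./frontend       # Next.js app
    ports:
      - "3000:3000"
    environment:
      - API_URL=http://backend:8000
    depends_on:
      - backend
```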




# 3. Contribution