Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kreasof-ai/homunculus-project
A long-term project about a custom AI architecture. It combines cutting-edge machine-learning techniques such as Flash Attention, Grouped-Query Attention, ZeRO-Infinity, BitNet, etc.
- Host: GitHub
- URL: https://github.com/kreasof-ai/homunculus-project
- Owner: kreasof-ai
- License: agpl-3.0
- Created: 2024-07-23T10:13:53.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-10-15T12:36:36.000Z (3 months ago)
- Last Synced: 2024-12-03T09:43:52.333Z (about 2 months ago)
- Topics: bitnet, deep-learning, flash-attention, jupyter-notebook, large-language-models, low-rank-adaptation, machine-learning, python, pytorch, pytorch-lightning, transformer, vision-transformer
- Language: Python
- Homepage:
- Size: 4.63 MB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
README
![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54)
![PyTorch](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?style=for-the-badge&logo=PyTorch&logoColor=white)
![Jupyter Notebook](https://img.shields.io/badge/jupyter-%23FA0F00.svg?style=for-the-badge&logo=jupyter&logoColor=white)
[![Follow me on HF](https://huggingface.co/datasets/huggingface/badges/resolve/main/follow-me-on-HF-md.svg)](https://huggingface.co/ChavyvAkvar)
# Homunculus Project - Experimental Custom Transformer Architecture
By [Habibullah Akbar](https://chavyv.vercel.app).

Key features:
- Seamless integration with a vision encoder, along with selective RoPE for each image and text embedding sequence.
- Internal iteration, enabling deeper abstraction while keeping the same parameter count.
- GeGLU activation function, inspired by [Gemma 2 models](https://blog.google/technology/developers/google-gemma-2/) (see the sketch after this list).
- Custom KV-caching, making sure each internal iteration has an independent KV-cache.
- BPE tokenizer based on KBBI.
- Grouped Query Attention.
- PyTorch Lightning implementation.
- DeepSpeed and ZeRO-3 integration, automatically offloading memory overflow to CPU and NVMe.
- Example finetuning scripts with LoRA adapters, with and without quantization.
- BitNet implementation.
- Flash Attention implementation.
- Speech encoder.
- 2D and 3D RoPE.
- Diffusion Transformer for image detokenization.
- Influential token extraction from attention heatmap.
- Jupyter notebook example, both for training and finetuning.
- Dual license: open source for individuals, paid for commercial use.
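The GeGLU feed-forward block mentioned in the list above fits in a few lines of PyTorch. This is a minimal sketch under assumed names and dimensions (`GeGLUFeedForward`, `d_model`, `d_hidden` are illustrative, not the repository's actual classes or config):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeGLUFeedForward(nn.Module):
    """Feed-forward block with a GeGLU gate: GELU(x W_gate) * (x W_up), then W_down."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.gate_proj = nn.Linear(d_model, d_hidden, bias=False)
        self.up_proj = nn.Linear(d_model, d_hidden, bias=False)
        self.down_proj = nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The GELU-activated gate modulates the parallel up-projection element-wise.
        return self.down_proj(F.gelu(self.gate_proj(x)) * self.up_proj(x))

# Example with assumed dimensions (not the project's actual config):
ffn = GeGLUFeedForward(d_model=512, d_hidden=2048)
y = ffn(torch.randn(2, 16, 512))  # (batch, sequence, d_model)
```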
![Internal latent loop (9)](https://github.com/user-attachments/assets/fe74e8b8-2f74-4b20-9f36-6f61c6946f2a)

The iterable Transformer model can *rethink* its internal cognitive process with an internal confidence score as a guide, akin to a slow-thinking mechanism.
Here is a simple explanation of how it works (a minimal inference-time sketch follows the list):
- We expose an adjustable parameter that controls the internal looping; the default value is 1.
- If the loss value is high, this iteration is triggered, with max iterations set to 10.
- We train an independent layer to output a confidence score, supervised by the loss value from the main training process.
- At inference time, both the next token and the confidence score are output, and the confidence score determines how many iterations the current inference needs.
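Read as pseudocode, the inference-time side of this loop might look like the sketch below; `backbone`, `confidence_head`, `lm_head`, the threshold, and the stopping rule are illustrative assumptions rather than the project's exact modules:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def iterate_until_confident(
    backbone: nn.Module,         # internal iteration block: refines hidden states
    confidence_head: nn.Module,  # independent layer mapping hidden states to a confidence logit
    lm_head: nn.Module,          # maps hidden states to next-token logits
    hidden: torch.Tensor,        # (batch, seq, d_model), output of the first forward pass
    threshold: float = 0.9,
    max_iterations: int = 10,
):
    confidence = torch.sigmoid(confidence_head(hidden[:, -1])).mean()
    for _ in range(max_iterations):
        if confidence >= threshold:
            break
        # Re-run the internal block on its own output: deeper abstraction with the
        # same parameter count; each pass would keep its own independent KV-cache.
        hidden = backbone(hidden)
        confidence = torch.sigmoid(confidence_head(hidden[:, -1])).mean()
    return lm_head(hidden[:, -1]), confidence
```

During training, the confidence head would be supervised with the main loss, so that a low confidence score at inference flags the cases where extra iterations are worthwhile.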
YouTube progress documentation playlist:

- First short brief (27 July 2024): [https://youtu.be/NjK1BJyhrlI](https://youtu.be/NjK1BJyhrlI)

Soon:
- Short-term memory injection.
- [SageAttention](https://github.com/thu-ml/SageAttention) implementation.
- Speech generation integration.
- [Discrete Latent Representation](https://arxiv.org/abs/2312.01203).
- [Grokfast](https://arxiv.org/abs/2405.20233)
- Mamba2 block (?).
- Kolmogorov Arnold Network (KAN).
- Mixture of Experts block.
- Fast object detection integration, possibly YOLO or RT-DETR.
- OCR model integration.
- [MInference](https://github.com/microsoft/MInference).
- Pre-trained model integration, possibly Gemma 2 since it uses the same activation function.
- Citations for all of the papers used as references or inspirations.

> UPDATE LICENSE:
***This software is dual-licensed under the terms of the GNU Affero General Public License (AGPL) and a commercial license. For commercial use, please contact Habibullah Akbar at akbar2habibullah.gmail to obtain a commercial license. Commercial use is defined as any use of the software for financial gain, including, but not limited to, selling, licensing, or distributing the software as part of a product or service.***