https://github.com/prajeshshrestha/llama-2.0-architecture-and-inference-from-scratch-with-pytorch

Host: GitHub
URL: https://github.com/prajeshshrestha/llama-2.0-architecture-and-inference-from-scratch-with-pytorch
Owner: prajeshshrestha
Created: 2024-08-05T11:44:21.000Z (10 months ago)
Default Branch: master
Last Pushed: 2024-08-05T15:34:13.000Z (10 months ago)
Last Synced: 2025-03-11T11:16:51.144Z (3 months ago)
Topics: grouped-query-attention, kv-cache, llama2, pytorch, pytorch-implementation, rotary-positional-embedding
Language: Python
Homepage:
Size: 24.4 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
