https://github.com/ostad-ai/large-language-models
This repository includes topics related to Large Language Models (LLMs).
- Host: GitHub
- URL: https://github.com/ostad-ai/large-language-models
- Owner: ostad-ai
- License: mit
- Created: 2025-02-02T17:50:59.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-02-19T13:48:37.000Z (8 months ago)
- Last Synced: 2025-02-19T14:36:16.345Z (8 months ago)
- Topics: large-language-models, llm, self-attention
- Language: Jupyter Notebook
- Homepage:
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Large Language Models (LLMs)
(under construction)
1) What is a Large Language Model (LLM)?
2) What is the building block of an LLM?
3) **LLMs, self-attention mechanism:** The self-attention mechanism is the core concept of **transformer**-based LLMs. Here, we review the formulae of this mechanism and implement self-attention from scratch in Python (a minimal sketch is given after this list).
4) **LLMs, the softmax in self-attention:** We review the softmax function, which is widely used in *neural networks*, *deep learning*, and *machine learning*. The softmax function is implemented in Python with an example (see the standalone sketch after this list).
5) **LLMs: Layer normalization:** Layer normalization is a critical component of *Transformers* and *LLMs*, ensuring stable and efficient training by normalizing activations across the *feature dimension*. It is particularly well suited to sequence-based tasks and deep architectures. Here, we implement layer normalization with NumPy; moreover, we provide the corresponding *PyTorch* code so that you can compare the results (a sketch also follows after this list).
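The repository's notebooks are the reference implementations. As a quick illustration of item 3, the scaled dot-product self-attention formula softmax(Q Kᵀ / √d_k) V can be sketched in a few lines of NumPy. The function and variable names below (`self_attention`, `W_q`, `W_k`, `W_v`) are illustrative choices for this sketch, not taken from the notebooks:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax: subtract the row-wise maximum before exponentiating
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # X: (n_tokens, d_model); W_q, W_k: (d_model, d_k); W_v: (d_model, d_v)
    Q, K, V = X @ W_q, X @ W_k, X @ W_v          # queries, keys, values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)              # (n_tokens, n_tokens) scaled dot products
    weights = softmax(scores, axis=-1)           # attention weights; each row sums to 1
    return weights @ V                           # weighted sum of value vectors

# toy example with random token embeddings and random projection weights
rng = np.random.default_rng(0)
n_tokens, d_model, d_k = 4, 8, 8
X = rng.normal(size=(n_tokens, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)    # (4, 8)
```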
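For item 4, the softmax function softmax(z)_i = exp(z_i) / Σ_j exp(z_j) can also be shown on its own (it is the same function used inside the attention sketch above). Again, this is a minimal illustrative sketch rather than the repository's exact code:

```python
import numpy as np

def softmax(z):
    # softmax(z)_i = exp(z_i) / sum_j exp(z_j); subtracting max(z) avoids overflow
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)         # approximately [0.659 0.242 0.099]
print(probs.sum())   # 1.0
```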
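For item 5, a minimal layer-normalization sketch with NumPy, checked against PyTorch's built-in `torch.nn.LayerNorm` (whose affine parameters default to weight 1 and bias 0). The shapes and variable names are assumptions made for this illustration:

```python
import numpy as np
import torch

def layer_norm(x, gamma, beta, eps=1e-5):
    # normalize each token's feature vector to zero mean and unit variance,
    # then apply the learnable scale (gamma) and shift (beta)
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)          # biased variance, matching PyTorch
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# toy input: batch of 2 sequences, 3 tokens, 4 features each
x = np.random.default_rng(0).normal(size=(2, 3, 4)).astype(np.float32)
gamma = np.ones(4, dtype=np.float32)
beta = np.zeros(4, dtype=np.float32)
out_np = layer_norm(x, gamma, beta)

# the same computation with PyTorch's built-in LayerNorm
ln = torch.nn.LayerNorm(normalized_shape=4, eps=1e-5)
out_torch = ln(torch.from_numpy(x)).detach().numpy()
print(np.allclose(out_np, out_torch, atol=1e-5))  # True, up to float precision
```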