https://github.com/ostad-ai/large-language-models
This repository includes topics related to Large Language Models (LLMs).
- Host: GitHub
- URL: https://github.com/ostad-ai/large-language-models
- Owner: ostad-ai
- License: mit
- Created: 2025-02-02T17:50:59.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-02-19T13:48:37.000Z (8 months ago)
- Last Synced: 2025-02-19T14:36:16.345Z (8 months ago)
- Topics: large-language-models, llm, self-attention
- Language: Jupyter Notebook
- Homepage:
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Large Language Models (LLMs)
(under construction)
1) What is a Large Language Model (LLM)?
2) What is the building block of an LLM?
3) **LLMs, self-attention mechanism:** The self-attention mechanism is the core concept of **transformer**-based LLMs. Here, we review the formulae of this mechanism and implement self-attention from scratch in Python (a minimal sketch is given after this list).
4) **LLMs, the softmax in self-attention:** We review the softmax function, which is widely used in *neural networks*, *deep learning*, and *machine learning*. The softmax function is implemented in Python with an example (see the standalone sketch after this list).
5) **LLMs: Layer normalization:** Layer normalization is a critical component of *Transformers* and *LLMs*, ensuring stable and efficient training by normalizing activations across the *feature dimension*. It is particularly well suited to sequence-based tasks and deep architectures. Here, we implement layer normalization with NumPy; moreover, we provide the corresponding *PyTorch* code so that you can compare the results (a sketch also follows after this list).
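The repository's notebooks are the reference implementations. As a quick illustration of item 3, the scaled dot-product self-attention formula softmax(Q Kᵀ / √d_k) V can be sketched in a few lines of NumPy. The function and variable names below (`self_attention`, `W_q`, `W_k`, `W_v`) are illustrative choices for this sketch, not taken from the notebooks:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax: subtract the row-wise maximum before exponentiating
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # X: (n_tokens, d_model); W_q, W_k: (d_model, d_k); W_v: (d_model, d_v)
    Q, K, V = X @ W_q, X @ W_k, X @ W_v          # queries, keys, values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)              # (n_tokens, n_tokens) scaled dot products
    weights = softmax(scores, axis=-1)           # attention weights; each row sums to 1
    return weights @ V                           # weighted sum of value vectors

# toy example with random token embeddings and random projection weights
rng = np.random.default_rng(0)
n_tokens, d_model, d_k = 4, 8, 8
X = rng.normal(size=(n_tokens, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)    # (4, 8)
```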
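For item 4, the softmax function softmax(z)_i = exp(z_i) / Σ_j exp(z_j) can also be shown on its own (it is the same function used inside the attention sketch above). Again, this is a minimal illustrative sketch rather than the repository's exact code:

```python
import numpy as np

def softmax(z):
    # softmax(z)_i = exp(z_i) / sum_j exp(z_j); subtracting max(z) avoids overflow
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)         # approximately [0.659 0.242 0.099]
print(probs.sum())   # 1.0
```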
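For item 5, a minimal layer-normalization sketch with NumPy, checked against PyTorch's built-in `torch.nn.LayerNorm` (whose affine parameters default to weight 1 and bias 0). The shapes and variable names are assumptions made for this illustration:

```python
import numpy as np
import torch

def layer_norm(x, gamma, beta, eps=1e-5):
    # normalize each token's feature vector to zero mean and unit variance,
    # then apply the learnable scale (gamma) and shift (beta)
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)          # biased variance, matching PyTorch
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# toy input: batch of 2 sequences, 3 tokens, 4 features each
x = np.random.default_rng(0).normal(size=(2, 3, 4)).astype(np.float32)
gamma = np.ones(4, dtype=np.float32)
beta = np.zeros(4, dtype=np.float32)
out_np = layer_norm(x, gamma, beta)

# the same computation with PyTorch's built-in LayerNorm
ln = torch.nn.LayerNorm(normalized_shape=4, eps=1e-5)
out_torch = ln(torch.from_numpy(x)).detach().numpy()
print(np.allclose(out_np, out_torch, atol=1e-5))  # True, up to float precision
```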