https://github.com/jrschumacher/gophercon24-notes
- Host: GitHub
- URL: https://github.com/jrschumacher/gophercon24-notes
- Owner: jrschumacher
- Created: 2024-07-08T13:35:04.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-07-08T18:15:27.000Z (4 months ago)
- Last Synced: 2024-10-09T22:22:13.724Z (about 1 month ago)
- Size: 2.93 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Gophercon 24 notes
## Gen AI 1
https://hackmd.io/SoeISE6cQRGO8S3qZdXUvQ
https://github.com/dwhitena/go-genai-workshop
https://github.com/dwhitena/go-genai-workshop-build

### History
2012-2017:
```mermaid
graph LR
subgraph train
A --> B
AA --> B
B --> C
end
subgraph inference
AC --> C
AB --> C
C --> E
E --> X
end
A[ex inputs]
AA[known out]
AB[model arch]
AC[new model inputs]
B[model train code]
C[ideal params]
E[model infer code]
X[output]
```

BERT: 2017-2022
Fine-tuning / transfer learning
```mermaid
graph LR
subgraph google training
A-->B
AA-->B
AB-->B
B-->C
end
subgraph my code
C-->F
FA-->F
FB-->F
F-->P
end
P-->I
B[pre-training]
C[initial params]
F[fine tune code]
FA[my ex in]
FB[my ex out]
P[params]
I[infer]
```

The major change is pre-training a general model and then appending task-specific code to the end of the pipeline. The pre-trained output that the task-specific code consumes is often called an `embedding` or `feature representation`.
Training on meta tasks was intended to help with further model enhancements. However, as researchers trained autocomplete models on the vast data of the internet, they discovered the emergent behavior of gen AI.
> [!NOTE]
> Emergent just means unexpected in this context.

2022+: Gen AI
```mermaid
graph LR
subgraph others //compute heavy//
A-->PT
B-->PT
C-->PT
PT-->P1
PT-->P2
PT-->P3
P1-->FT
P2-->FT
P3-->FT
end
subgraph me //func call//
P-->I
FT-->I
I-->O
end
P[prompt]
FT[fine tuning]
I[infer]
O[out]
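```

The LLM generates a ranked list of candidate next tokens, each with a probability. One token is picked and appended to the prompt, and the extended text is passed back in as the new prompt. One of the candidate tokens might be the end-of-sentence token, which stops the generation.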
```mermaid
flowchart LR
A[prompt] -- "A gopher is..." --> LLM
LLM --> T1[1. a\n2. animal\n...\n30k. flea\n]
T1 --> A
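```

Temperature was added to give some variance in the output. Rather than always taking the top-ranked token, the next token is sampled from the ranked list, and a higher temperature makes lower-ranked tokens more likely to be picked.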
### Misc
OpenAI set the trend of streaming responses with SSEs (server-sent events) rather than websockets. Streaming the response over SSE as it is generated, instead of returning only the finished response, gives users a better UX. Using websockets would work, but it would be moving against the trend.
System prompts (`client.Roles.System`) are used to give context to the LLM.
Context is injected with every prompt. There are ways to reduce this burden, but they are not consistent. Remember the greatest burden is the generation.
With a long context window, it's hard to determine which aspect of the context caused an issue. If the context is large, you can use an LLM to summarize it and reduce its size.
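Re-sending context on every call, and summarizing it when it grows, might look like the sketch below. Everything here is hypothetical: `Message`, `Summarizer`, and `buildRequest` are illustrative names rather than types from the workshop client, the role strings are assumed, and the summarizer is a stub where a real one would prompt an LLM.

```go
package main

import "fmt"

// Message is a hypothetical chat message; real client libraries define their own.
type Message struct {
	Role    string // assumed values: "system", "user", "assistant"
	Content string
}

// Summarizer stands in for an LLM call that condenses text.
type Summarizer func(text string) string

// buildRequest assembles the full message list sent on every call: the
// system prompt, then the history, then the new user input. If the history
// holds more than maxHistory messages, the oldest ones are replaced by a
// single summary message. Assumes maxHistory >= 0.
func buildRequest(system string, history []Message, input string, maxHistory int, summarize Summarizer) []Message {
	msgs := []Message{{Role: "system", Content: system}}
	if len(history) > maxHistory {
		cut := len(history) - maxHistory
		old := ""
		for _, m := range history[:cut] {
			old += m.Role + ": " + m.Content + "\n"
		}
		msgs = append(msgs, Message{
			Role:    "system",
			Content: "Summary of earlier conversation: " + summarize(old),
		})
		history = history[cut:]
	}
	msgs = append(msgs, history...)
	return append(msgs, Message{Role: "user", Content: input})
}

func main() {
	// fakeSummarize is a placeholder; a real one would prompt an LLM.
	fakeSummarize := func(text string) string {
		return fmt.Sprintf("(%d characters condensed)", len(text))
	}
	history := []Message{
		{Role: "user", Content: "What is a gopher?"},
		{Role: "assistant", Content: "A burrowing rodent."},
		{Role: "user", Content: "Where do they live?"},
		{Role: "assistant", Content: "North and Central America."},
	}
	for _, m := range buildRequest("You are a concise assistant.", history, "What do they eat?", 2, fakeSummarize) {
		fmt.Printf("%-9s %s\n", m.Role, m.Content)
	}
}
```

The trade-off is that a summary loses detail, so this only helps when the older turns matter less than the recent ones.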
### Retrieval Augmented Generation
```mermaid
flowchart LR
D-->C1
D-->C2
D-->CN
D-->CX
C1-->DV
C2-->DV
CN-->DV
CX-->DV
DV-->PT
PT-->LLM
LLM-->O
D[Docs]
C1[Chunk 1]
C2[Chunk 2]
CN[...]
CX[Chunk N]
DV[Vector DB]
PT[Prompt Temp]
O[Out]
```

The vector DB finds related chunks by closeness between vectors, measured via cosine similarity, L2 distance, etc.
Multi-lingual models would theoretically share a similar vector space.
Advanced RAG
- hierarchical search finding most relevant chunk and related context
- Elasticsearch first, then vector search

If the context is 3 pages, a summary may not be sufficient, and the enrichment should be fine-tuned to the problem.
The BridgeTower model supports embedding both images and text.
Perf in enterprise:
- how long will it take to vectorize the docs
- how long will it take to search the docs

LLM as judge: use the LLM to rate whether we answered the user's question.