Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-generative-ai-data-scientist
A curated list of resources for building and deploying generative AI, focused on helping you become a GenAI developer with LLMs.
https://github.com/business-science/awesome-generative-ai-data-scientist
-
LLM Models and Providers
- Meta Llama Models - Fine-tune, distill and deploy anywhere.
- Google Gemini
- Ollama
- Grok
- Anthropic Claude
- OpenAI
- Hugging Face Models
-
AI LLM Frameworks
- LlamaIndex Workflows - A mechanism for orchestrating actions in the increasingly complex AI applications we see our users building.
- LangChain - [Cookbook](https://github.com/langchain-ai/langchain/tree/master/cookbook)
- LangGraph - A library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. [Documentation](https://langchain-ai.github.io/langgraph/) [Tutorials](https://github.com/langchain-ai/langgraph/tree/main/docs/docs/tutorials)
- LlamaIndex - A framework for building context-augmented generative AI applications with LLMs. [Documentation](https://docs.llamaindex.ai/) [Github](https://github.com/run-llama/llama_index)
- CrewAI
- AutoGen - A programming framework for agentic AI by Microsoft.
- LangFlow - A low-code tool that makes it easier to build powerful AI agents and workflows that can use any API, model, or database. [Documentation](https://docs.langflow.org/) [Github](https://github.com/langflow-ai/langflow)
- Pydantic AI
-
Vector Databases (RAG)
- ChromaDB
- FAISS
- Pinecone
- Milvus - An open-source vector database built to power embedding similarity search and AI applications.
- NVIDIA NIM - Self-host GPU-accelerated inferencing microservices for pretrained and customized AI models across clouds, data centers, and workstations.
- AWS Bedrock - High-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon.
- Microsoft Azure AI Services - Build cutting-edge, market-ready, and responsible applications with out-of-the-box, prebuilt, and customizable APIs and models.
- Google Vertex AI - A fully-managed, unified AI development platform for building and using generative AI.
- Qdrant - High-Performance Vector Search at Scale
-
Miscellaneous
- Microsoft Azure AI Services - Build cutting-edge, market-ready, and responsible applications with out-of-the-box, prebuilt, and customizable APIs and models.
- Google Vertex AI - A fully-managed, unified AI development platform for building and using generative AI.
- AdalFlow - The library to build & auto-optimize LLM applications, from Chatbot, RAG, to Agent by [SylphAI](https://www.sylph.ai/).
- dspy - DSPy: The framework for programming—not prompting—foundation models.
- AutoPrompt - Intent-based Prompt Calibration.
- Promptify
- LiteLLM
- LLMOps
- Jupyter Agent
- Jupyter AI
- Pyspur - Graph-Based Editor for LLM Workflows
- Browser-Use
- Agenta - Open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place. [Documentation](https://docs.agenta.ai/)
- AWS Bedrock - High-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon.
-
Building AI
- LangChain Cookbook - End-to-end examples.
- LangGraph Examples
- Llama Index Examples
- Streamlit LLM Examples
-
Deploying AI
-
Amazon Web Services (AWS)
-
Google Cloud Platform (GCP)
-
NVIDIA
- NVIDIA NIM Anywhere - Scales out to full-sized labs and up to production environments.
- NVIDIA NIM Deploy
- Python AI/ML Tips - Free newsletter on Generative AI and Data Science.
- unwind ai - Latest AI news, tools, and tutorials for AI Developers
-
Microsoft Azure
- Microsoft Generative AI for Beginners
- Microsoft Intro to Generative AI Course
-
-
LLM Models
-
Cookbooks and Examples:
- LangChain Cookbook - End-to-end examples.
- LangGraph Examples
- Llama Index Examples
- Streamlit LLM Examples
-
Cloud Examples:
- Azure Generative AI Examples
- Amazon Bedrock Workshop
- Google Vertex AI Examples
- NVIDIA NIM Anywhere - Scales out to full-sized labs and up to production environments.
- NVIDIA NIM Deploy
-
8-Week AI Bootcamp by Business Science
-
Contents:
- AI Data Science Team - An AI-powered data science team of copilots that uses agents to help you perform common data science tasks 10X faster.
- AI Hedge Fund - An AI-powered hedge fund
- AI Financial Agent
- Awesome LLM Apps - Step-By-Step Tutorials
-
Huggingface Platform
- Huggingface - An open-source platform for machine learning (ML) and artificial intelligence (AI) tools and models. [Documentation](https://huggingface.co/docs)
-
Pretraining
- PyTorch - PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing.
- TensorFlow - TensorFlow is an open source machine learning library developed by Google.
- JAX - Google’s library for high-performance computing and automatic differentiation.
- tinygrad - A minimalistic deep learning library with a focus on simplicity and educational use, created by George Hotz.
- micrograd - A simple, lightweight autograd engine for educational purposes, created by Andrej Karpathy.
-
Fine-tuning
- Transformers - Hugging Face Transformers is a popular library for Natural Language Processing (NLP) tasks, including fine-tuning large language models.
- Unsloth - Finetune Llama 3.2, Mistral, Phi-3.5 & Gemma 2-5x faster with 80% less memory!
- LitGPT - 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale.
- AutoTrain - No code fine-tuning of LLMs and other machine learning tasks.
-
Testing and Monitoring
- Opik - Opik is an open-source platform for evaluating, testing and monitoring LLM applications
-
Document Parsing
- Embedchain - [Github Repo](https://github.com/mem0ai/mem0/tree/main/embedchain)
- Docling by IBM
- Markitdown by Microsoft
- Gitingest
-
LLM Memory
-
Free Training
-
NVIDIA
- Generative AI Data Scientist Workshops
-
-
Paid Courses
-
NVIDIA
- 8-Week AI Bootcamp by Business Science - AI-Powered Data Science Solutions using LangChain, LangGraph, Pandas, Scikit Learn, Streamlit, AWS, Bedrock, and EC2.
-
Programming Languages