awesome-knowledge-infrastructure
A curated list of tools and practices for capturing, connecting, and operationalizing what your organization (and you) know.
https://github.com/onlydole/awesome-knowledge-infrastructure
-
AI Knowledge Assistants
- Onyx - An open source AI assistant and enterprise search over your team's documents and apps (formerly Danswer).
- Khoj - A self-hostable AI second brain that searches and chats across your notes and documents.
- Quivr - An open source framework for building a personal, RAG-powered knowledge assistant.
- Guru - An enterprise knowledge platform that surfaces verified answers in the tools where people work.
-
AI Memory and Context
- Mem0 - An open source memory layer that gives AI agents persistent, personalized recall.
- Letta - An open source framework (formerly MemGPT) for building stateful agents with long-term memory.
- Graphiti - An open source framework for building temporally-aware knowledge graphs for agent memory.
- cognee - An open source framework for building memory and knowledge graphs for AI agents from your data.
-
Architecture Decision Records
- adr-tools - Command-line tools for working with Architecture Decision Records.
- Architecture Decision Record - Templates and examples for capturing architectural decisions.
- MADR - Markdown Any Decision Records, a lean template for documenting decisions.
- Log4brains - A tool to log and publish ADRs as a searchable knowledge base.
-
Data Catalogs and Metadata
- DataHub - An open source metadata platform for data discovery, lineage, and governance.
- OpenMetadata - A unified, open source platform for metadata, data discovery, and observability.
- OpenLineage - An open standard for metadata and lineage collection across data pipelines.
- CKAN - An open source data management system for building open data portals and catalogs.
-
Diagramming and Visualization as Code
- Mermaid - A tool for generating diagrams and flowcharts from Markdown-like text.
- PlantUML - A tool for creating UML and many other diagrams from a simple text description.
- D2 - A modern, declarative language and engine for turning text into diagrams.
- Diagrams - A Python library for drawing cloud system architecture diagrams as code.
- Kroki - A unified API that renders many text-based diagram formats into images.
- Excalidraw - An open source virtual whiteboard for sketching hand-drawn-style diagrams.
-
Documentation as Code
- Docusaurus - A React-based static site generator purpose-built for documentation.
- MkDocs - A fast, simple static site generator geared toward project documentation.
- Material for MkDocs - A popular, feature-rich theme and toolkit on top of MkDocs.
- Sphinx - A powerful documentation generator with rich cross-referencing, widely used in Python.
- Starlight - A documentation theme for Astro with great defaults and performance.
- Read the Docs - A platform that automatically builds, versions, and hosts documentation from your repository.
- VitePress - A Vite-powered static site generator optimized for fast, content-focused documentation.
- Antora - A multi-repository documentation site generator built around AsciiDoc.
- Redoc - An open source tool that renders OpenAPI definitions into reference API documentation.
-
Foundations and Concepts
- DIKW Pyramid - A foundational model describing how data becomes information, knowledge, and ultimately wisdom.
- Communities of Practice - Wenger-Trayner's framework for groups that deepen shared expertise through ongoing practice.
- Knowledge-Centered Service - A methodology for capturing and reusing support knowledge as a by-product of solving problems.
- Docs as Code - Treating documentation with the same tools and workflows as software.
-
Internal Developer Portals
- Backstage - An open platform for building developer portals, with a software catalog and TechDocs.
- Clutch - Lyft's extensible platform for infrastructure tooling and developer self-service.
- Kratix - An open source framework for building platforms that offer self-service infrastructure as a product.
- Compass - Atlassian's developer experience platform for cataloging services and tracking software health.
-
Interoperability and Standards
- Pandoc - A universal document converter that translates between Markdown, HTML, LaTeX, and dozens of formats.
- JSON Canvas - An open file format for infinite-canvas notes, originating in Obsidian.
- Web Annotation Data Model - A W3C standard for representing annotations and highlights on web resources.
-
Knowledge Graphs
- Neo4j - A widely used graph database for connected data and knowledge graphs.
- Memgraph - An in-memory graph database compatible with the Cypher query language.
- Dgraph - A distributed, GraphQL-native graph database.
- Apache Jena - A Java framework for building semantic web and linked-data applications.
- TerminusDB - An open source graph database for collaborative, versioned knowledge.
- RDFLib - A Python library for working with RDF, including parsing, serializing, and SPARQL queries.
- FalkorDB - A low-latency, open source graph database designed for GraphRAG and AI workloads.
- Protégé - A free, open source ontology editor for building OWL ontologies and knowledge models.
- schema.org - A shared vocabulary for marking up structured data on the web.
-
Learning and Community
- r/PKMS - A community for discussing personal knowledge management systems.
-
Operational Knowledge
- PagerDuty Incident Response - PagerDuty's open documentation on running effective incident response.
- Dispatch - Netflix's tool for orchestrating incident response and capturing what was learned.
- Rundeck - Runbook automation that turns operational knowledge into safe, self-service actions.
- Google SRE Books - Google's freely available books on site reliability engineering, including incident response and postmortems.
- Grafana OnCall - An open source on-call and alert management tool for incident response.
- Cachet - An open source status page system for communicating incidents and uptime.
-
Orchestration and Durable Execution
-
Personal Knowledge Management
- Logseq - A privacy-first, open source outliner for networked note-taking and tasks.
- SiYuan - A self-hosted, block-based personal knowledge management system.
- Joplin - An open source note and to-do app with end-to-end encryption and sync.
- Anytype - A local-first, end-to-end encrypted workspace built on an object graph you fully own.
- AppFlowy - An open source Notion alternative for notes, wikis, and projects with local data ownership.
- Trilium Notes - A hierarchical note-taking application for building large personal knowledge bases.
- Org-roam - A plain-text knowledge management system for Emacs Org-mode built on the Zettelkasten method.
-
Read It Later and Annotation
- Hypothesis - An open source tool for annotating and discussing any web page or PDF.
- Wallabag - A self-hostable, open source read-it-later application that saves and archives articles.
- Karakeep - An open source, self-hostable bookmarking app (formerly Hoarder) with AI tagging and full-text search.
-
Retrieval-Augmented Generation
- LangChain - A framework for building applications with LLMs, including retrieval-augmented generation.
- LlamaIndex - A data framework for connecting custom data sources to LLMs.
- Haystack - An end-to-end framework for building search and RAG pipelines.
- txtai - An all-in-one embeddings database for semantic search, RAG, and LLM orchestration.
- RAGFlow - An open source RAG engine built on deep document understanding for grounded question answering.
- Unstructured - Open source libraries for ingesting and preprocessing documents into LLM-ready data.
- Docling - An open source toolkit that parses PDFs, Office files, and more into structured, LLM-ready formats.
- Sentence Transformers - A Python library for state-of-the-art text and image embeddings used in semantic search.
-
Search and Retrieval
- LanceDB - An embedded, open source vector database for multimodal AI built on the Lance format.
- Meilisearch - A fast, typo-tolerant, open source search engine.
- Elasticsearch - A distributed search and analytics engine for full-text, structured, and vector search.
- Apache Solr - A mature, Lucene-based open source search platform for large-scale deployments.
- Quickwit - A cloud-native, open source search engine optimized for logs and large append-only datasets.
- pgvector - Open source vector similarity search for PostgreSQL.
- Qdrant - A high-performance, open source vector database for similarity search.
- Weaviate - An open source vector database with built-in vectorization and hybrid search.
- Chroma - An open source embedding database designed for building AI applications quickly.
- Milvus - A cloud-native, open source vector database built for massive-scale similarity search.
-
Wikis and Team Knowledge Bases
- Wiki.js - A modern, open source wiki engine with a powerful editor and flexible storage backends.
- BookStack - A simple, open source platform for organizing documentation into books, chapters, and pages.
- DokuWiki - A lightweight, database-free wiki that stores pages as plain text files.
- Outline - A fast, open source team knowledge base and wiki with real-time collaboration.
- TiddlyWiki - A self-contained, non-linear personal wiki you can carry in a single HTML file.
- AFFiNE - An open source, local-first workspace that merges documents, whiteboards, and databases.
- Confluence - Atlassian's widely used team workspace for documentation and knowledge sharing.