awesome-knowledge-infrastructure

A curated list of tools and practices for capturing, connecting, and operationalizing what your organization (and you) know.
https://github.com/onlydole/awesome-knowledge-infrastructure

  • AI Knowledge Assistants

    • Onyx - An open source AI assistant and enterprise search over your team's documents and apps (formerly Danswer).
    • Khoj - A self-hostable AI second brain that searches and chats across your notes and documents.
    • Quivr - An open source framework for building a personal, RAG-powered knowledge assistant.
    • Guru - An enterprise knowledge platform that surfaces verified answers in the tools where people work.
  • AI Memory and Context

    • Mem0 - An open source memory layer that gives AI agents persistent, personalized recall.
    • Letta - An open source framework (formerly MemGPT) for building stateful agents with long-term memory.
    • Graphiti - An open source framework for building temporally-aware knowledge graphs for agent memory.
    • cognee - An open source framework for building memory and knowledge graphs for AI agents from your data.
  • Architecture Decision Records

    • adr-tools - Command-line tools for working with Architecture Decision Records.
    • Architecture Decision Record - Templates and examples for capturing architectural decisions.
    • MADR - Markdown Any Decision Records, a lean template for documenting decisions.
    • Log4brains - A tool to log and publish ADRs as a searchable knowledge base.
  • Data Catalogs and Metadata

    • DataHub - An open source metadata platform for data discovery, lineage, and governance.
    • OpenMetadata - A unified, open source platform for metadata, data discovery, and observability.
    • OpenLineage - An open standard for metadata and lineage collection across data pipelines.
    • CKAN - An open source data management system for building open data portals and catalogs.
  • Diagramming and Visualization as Code

    • Mermaid - A tool for generating diagrams and flowcharts from Markdown-like text.
    • PlantUML - A tool for creating UML and many other diagrams from a simple text description.
    • D2 - A modern, declarative language and engine for turning text into diagrams.
    • Diagrams - A Python library for drawing cloud system architecture diagrams as code.
    • Kroki - A unified API that renders many text-based diagram formats into images.
    • Excalidraw - An open source virtual whiteboard for sketching hand-drawn-style diagrams.
  • Documentation as Code

    • Docusaurus - A React-based static site generator purpose-built for documentation.
    • MkDocs - A fast, simple static site generator geared toward project documentation.
    • Material for MkDocs - A popular, feature-rich theme and toolkit on top of MkDocs.
    • Sphinx - A powerful documentation generator with rich cross-referencing, widely used in Python.
    • Starlight - A documentation theme for Astro with great defaults and performance.
    • Read the Docs - A platform that automatically builds, versions, and hosts documentation from your repository.
    • VitePress - A Vite-powered static site generator optimized for fast, content-focused documentation.
    • Antora - A multi-repository documentation site generator built around AsciiDoc.
    • Redoc - An open source tool that renders OpenAPI definitions into reference API documentation.
  • Foundations and Concepts

    • DIKW Pyramid - A foundational model describing how data becomes information, knowledge, and ultimately wisdom.
    • Communities of Practice - Wenger-Trayner's framework for groups that deepen shared expertise through ongoing practice.
    • Knowledge-Centered Service - A methodology for capturing and reusing support knowledge as a by-product of solving problems.
    • Docs as Code - Treating documentation with the same tools and workflows as software.
  • Internal Developer Portals

    • Backstage - An open platform for building developer portals, with a software catalog and TechDocs.
    • Clutch - Lyft's extensible platform for infrastructure tooling and developer self-service.
    • Kratix - An open source framework for building platforms that offer self-service infrastructure as a product.
    • Compass - Atlassian's developer experience platform for cataloging services and tracking software health.
  • Interoperability and Standards

    • Pandoc - A universal document converter that translates between Markdown, HTML, LaTeX, and dozens of formats.
    • JSON Canvas - An open file format for infinite-canvas notes, originating in Obsidian.
    • Web Annotation Data Model - A W3C standard for representing annotations and highlights on web resources.
  • Knowledge Graphs

    • Neo4j - A widely used graph database for connected data and knowledge graphs.
    • Memgraph - An in-memory graph database compatible with the Cypher query language.
    • Dgraph - A distributed, GraphQL-native graph database.
    • Apache Jena - A Java framework for building semantic web and linked-data applications.
    • TerminusDB - An open source graph database for collaborative, versioned knowledge.
    • RDFLib - A Python library for working with RDF, including parsing, serializing, and SPARQL queries.
    • FalkorDB - A low-latency, open source graph database designed for GraphRAG and AI workloads.
    • Protégé - A free, open source ontology editor for building OWL ontologies and knowledge models.
    • schema.org - A shared vocabulary for marking up structured data on the web.
  • Learning and Community

    • r/PKMS - A community for discussing personal knowledge management systems.
  • Operational Knowledge

    • PagerDuty Incident Response - PagerDuty's open documentation on running effective incident response.
    • Dispatch - Netflix's tool for orchestrating incident response and capturing what was learned.
    • Rundeck - Runbook automation that turns operational knowledge into safe, self-service actions.
    • Google SRE Books - Google's freely available books on site reliability engineering, including incident response and postmortems.
    • Grafana OnCall - An open source on-call and alert management tool for incident response.
    • Cachet - An open source status page system for communicating incidents and uptime.
  • Orchestration and Durable Execution

    • Temporal - A durable execution platform for orchestrating reliable, long-running workflows and services.
    • Restate - An open source durable execution engine for building resilient applications, workflows, and agents.
  • Personal Knowledge Management

    • Logseq - A privacy-first, open source outliner for networked note-taking and tasks.
    • SiYuan - A self-hosted, block-based personal knowledge management system.
    • Joplin - An open source note and to-do app with end-to-end encryption and sync.
    • Anytype - A local-first, end-to-end encrypted workspace built on an object graph you fully own.
    • AppFlowy - An open source Notion alternative for notes, wikis, and projects with local data ownership.
    • Trilium Notes - A hierarchical note-taking application for building large personal knowledge bases.
    • Org-roam - A plain-text knowledge management system for Emacs Org-mode built on the Zettelkasten method.
  • Read It Later and Annotation

    • Hypothesis - An open source tool for annotating and discussing any web page or PDF.
    • Wallabag - A self-hostable, open source read-it-later application that saves and archives articles.
    • Karakeep - An open source, self-hostable bookmarking app (formerly Hoarder) with AI tagging and full-text search.
  • Retrieval-Augmented Generation

    • LangChain - A framework for building applications with LLMs, including retrieval-augmented generation.
    • LlamaIndex - A data framework for connecting custom data sources to LLMs.
    • Haystack - An end-to-end framework for building search and RAG pipelines.
    • txtai - An all-in-one embeddings database for semantic search, RAG, and LLM orchestration.
    • RAGFlow - An open source RAG engine built on deep document understanding for grounded question answering.
    • Unstructured - Open source libraries for ingesting and preprocessing documents into LLM-ready data.
    • Docling - An open source toolkit that parses PDFs, Office files, and more into structured, LLM-ready formats.
    • Sentence Transformers - A Python library for state-of-the-art text and image embeddings used in semantic search.
    • LanceDB - An embedded, open source vector database for multimodal AI built on the Lance format.
    • Meilisearch - A fast, typo-tolerant, open source search engine.
    • Elasticsearch - A distributed search and analytics engine for full-text, structured, and vector search.
    • Apache Solr - A mature, Lucene-based open source search platform for large-scale deployments.
    • Quickwit - A cloud-native, open source search engine optimized for logs and large append-only datasets.
    • pgvector - Open source vector similarity search for PostgreSQL.
    • Qdrant - A high-performance, open source vector database for similarity search.
    • Weaviate - An open source vector database with built-in vectorization and hybrid search.
    • Chroma - An open source embedding database designed for building AI applications quickly.
    • Milvus - A cloud-native, open source vector database built for massive-scale similarity search.
  • Wikis and Team Knowledge Bases

    • Wiki.js - A modern, open source wiki engine with a powerful editor and flexible storage backends.
    • BookStack - A simple, open source platform for organizing documentation into books, chapters, and pages.
    • DokuWiki - A lightweight, database-free wiki that stores pages as plain text files.
    • Outline - A fast, open source team knowledge base and wiki with real-time collaboration.
    • TiddlyWiki - A self-contained, non-linear personal wiki you can carry in a single HTML file.
    • AFFiNE - An open source, local-first workspace that merges documents, whiteboards, and databases.
    • Confluence - Atlassian's widely used team workspace for documentation and knowledge sharing.