Projects in Awesome Lists tagged with knowledge-extraction
A curated list of projects in awesome lists tagged with knowledge-extraction.
https://github.com/microsoft/pike-rag
PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation
domain-specific industrial-ai knowledge-extraction rag
Last synced: 14 Apr 2025
https://github.com/microsoft/PIKE-RAG
PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation
domain-specific industrial-ai knowledge-extraction rag
Last synced: 25 Mar 2025
https://github.com/lemonhu/open-entity-relation-extraction
Knowledge triples extraction and knowledge base construction based on dependency syntax for open domain text.
entity-relation information-extraction knowledge-base knowledge-extraction open-domain paper-implementations python3 relation-extraction
Last synced: 05 Apr 2025
https://github.com/huangwl18/language-planner
Official Code for "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents"
artificial-intelligence codex deep-learning embodied-ai foundation-models gpt-3 in-context-learning knowledge-extraction language-model planning transformers
Last synced: 11 Apr 2025
https://github.com/ds4sd/deepsearch-toolkit
Interact with the Deep Search platform for new knowledge explorations and discoveries
accelerated-discovery deepsearch knowledge-extraction knowledge-graph nlp pdf-converter python rag semantic-retrieval
Last synced: 15 May 2025
https://github.com/zjunlp/low-resource-kepapers
A Paper List of Low-resource Information Extraction
artificial-intelligence awsome-list event-extraction few-shot-learning information-extraction knowledge-extraction knowledge-graph low-resource ner nlp paper paper-list relation-extraction survey
Last synced: 31 Jan 2026
https://github.com/zjunlp/oneke
[WWW 2025] A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System.
agent agents artificial-intelligence event-extraction information-extraction knowledge-extraction knowledge-graph large-language-models multi-agent named-entity-recognition natural-language-processing ner oneke openie relation-extraction schema
Last synced: 13 Jun 2025
https://github.com/cmungall/semantic-llama
A knowledge extraction tool that uses a large language model to extract semantic information from text
ai knowledge-extraction language-models linkml oaklib obofoundry
Last synced: 05 May 2025
https://github.com/yueyu1030/STEAM
[KDD 2020] This is the code repository for our KDD'20 paper STEAM: Self-Supervised Taxonomy Expansion with Mini-Paths.
gnn-propagated-embeddings hypernymy-detection kdd knowledge-extraction multi-view-learning self-supervised-learning taxonomy taxonomy-expansion
Last synced: 14 May 2025
https://github.com/kallemickelborg/stateful-ai-agent
Stateful AI Agent for Knowledge Extraction
agentic agentic-ai agentic-workflow ai-agents chain-of-thought dspy dspy-ai knowledge-extraction knowledge-retrieval state-workflow stateflow stateful
Last synced: 06 Aug 2025
https://github.com/kallemickelborg/agentic-ai
Stateful AI Agent for Knowledge Extraction
agentic agentic-ai agentic-workflow ai-agents chain-of-thought dspy dspy-ai knowledge-extraction knowledge-retrieval state-workflow stateflow stateful
Last synced: 13 May 2025
https://github.com/t-charura/yt-quick-insights
Python webapp for rapid extraction and analysis of YouTube content, enabling users to gain insights from videos and playlists without spending hours watching them.
insights knowledge-extraction langchain-python llm python transcript web-app web-application youtube
Last synced: 16 Feb 2026
https://github.com/diocrafts/ai-book-summarizer
📚 AI-Powered Book PDF Knowledge Extractor & Summarizer. Transform your PDF books into structured knowledge effortlessly! This tool leverages AI to analyze books page by page, extracting key insights, definitions, and concepts, and organizes them into Markdown summaries for easier study.
ai ai-powered-tools automation book-summary document-analysis educational-tools knowledge-extraction machine-learning markdown natural-language-processing openai pdf pdf-processing pdf-summarization pymupdf python study-materials text-analysis text-summarization
Last synced: 16 Apr 2026
https://github.com/zwh20081/bookdatamaker
A powerful CLI tool for extracting text from documents using DeepSeek OCR and generating high-quality datasets with LLM assistance.
dataset-generation knowledge-extraction llm-pipeline python-cli self-hosted-ocr
Last synced: 02 Mar 2026
https://github.com/pjaskulski/gpt_historical_text
Information extraction from historical texts using GPT-3/GPT-4
biographies gpt-4 gpt3 information-extraction knowledge-extraction
Last synced: 24 Mar 2025
https://github.com/chigwell/refactor-llm-analyzer
A new package designed to facilitate structured and reliable analysis of user input related to software refactoring in the context of LLM capabilities. It accepts a user's discussion or question about
automated-decision-making automatic-categorization concern-extraction consistent-interpretation free-form-discussion-analysis insights-generation knowledge-extraction llm-capabilities pattern-matching pattern-validation reliable-analysis software-refactoring strategy-extraction structured-analysis structured-summaries text-based-input theme-extraction user-input-processing
Last synced: 14 Jan 2026
https://github.com/fullscreen-triangle/helicopter
Iterative expert research system that extracts structured knowledge from images and converts it into trainable tokens for domain-specific language models
domain-llm image-to-text knowledge-extraction
Last synced: 19 Jul 2025
https://github.com/zevio/pcu
Plateforme de Connaissances Unifiées (PCU) project (i.e., Unified Knowledge Platform)
extraction json keyphrase-extraction kleis knowledge knowledge-extraction langdetect pcu pcu-io pcu-json pcu-keyphrase pcu-language pcu-nlp pcu-pdf pcu-relation pdf python spacy text workflow
Last synced: 13 Apr 2026
https://github.com/chigwell/text2structured-summary
text2structured-summary generates structured summaries from unstructured text using an LLM.
automated-summarization content-structuring educational-content-integration information-organization knowledge-extraction llm-response-variability-handling llm-summarization pattern-adherence retries-and-diagnostics structured-output-generation text-analysis unstructured-text-processing
Last synced: 14 Jan 2026
https://github.com/lucidprogrammer/youtube-vision-transcriber
AI-powered pipeline that converts YouTube videos into polished articles using vision-based transcription - captures code, terminal output, and on-screen text that subtitles miss
ai fast-agent gemini knowledge- knowledge-extraction llm mcp model-context-protocol openai python transcription video-to-text vision-ai youtube
Last synced: 13 Mar 2026
https://github.com/mariajbp/aec
Machine Learning and Knowledge Extraction: Salary prediction model
knowledge-extraction machine-learning python
Last synced: 09 Apr 2026
https://github.com/fabioc-aloha/youtube-mcp-server
🎬 Comprehensive YouTube MCP Server with 31 tools, AI intelligence layer, learning path generator, content repurposing, and watch history analysis
ai ai-agents ai-tools content-creation flashcards github-copilot knowledge-extraction learning llm-tools mcp model-context-protocol quiz-generator typescript video-analysis youtube youtube-api
Last synced: 30 May 2026
https://github.com/dross20/tuatara
Generates high-quality fine-tuning pairs for large language models (LLMs) from unstructured documents.
dataset-generation fine-tuning graph knowledge-extraction llm nlp ocr python sft synthetic-data
Last synced: 05 Jan 2026
https://github.com/nxgeo/id-svo-extractor
id-svo-extractor: Extract SVO triples from Indonesian text.
artificial-intelligence indonesian-language indonesian-linguistics indonesian-nlp information-extraction knowledge-extraction knowledge-representation natural-language-processing nlp python rdf-triples spacy spacy-stanza stanza text-analysis triple-extraction
Last synced: 23 Jan 2026
https://github.com/pjaskulski/gpt_psb
Extraction of information from biographies of historical figures with scripts using the GPT-4 model
gpt-4 information-extraction knowledge-extraction llm
Last synced: 09 Sep 2025
https://github.com/nextgenailabs/genaimindmapflowbuilder
GenAI Mind Map Flow Builder is a generative AI tool that creates intelligent mind maps from diverse data sources like PDFs, SQL, CSV, media files, and websites. It visualizes core ideas and relationships using LLMs from OpenAI and Google Gemini. Built with FastAPI and ReactJS.
fastapi flow-builder generative-ai google-gemini knowledge-extraction llm mindmap openai pdf-to-mindmap python reactjs second-brain
Last synced: 20 Apr 2026