https://github.com/zjunlp/OmniThink
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
artificial-intelligence generation gpt information-seeking knowledge-augmented-generation large-language-models machine-writing natural-language-processing ominithink qwen retrieval-augmented-generation slow-thinking
- Host: GitHub
- URL: https://github.com/zjunlp/OmniThink
- Owner: zjunlp
- License: mit
- Created: 2024-12-10T06:42:25.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-17T07:02:04.000Z (about 1 year ago)
- Last Synced: 2025-01-17T08:18:06.489Z (about 1 year ago)
- Topics: artificial-intelligence, generation, gpt, information-seeking, knowledge-augmented-generation, large-language-models, machine-writing, natural-language-processing, ominithink, qwen, retrieval-augmented-generation, slow-thinking
- Language: Python
- Homepage:
- Size: 12.7 MB
- Stars: 11
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-llm-agent - OmniThink
README
OmniThink
Expanding Knowledge Boundaries in Machine Writing
through Thinking
👏 Welcome to try OmniThink in our **[Modelscope online demo](https://www.modelscope.cn/studios/iic/OmniThink) and [🤗HuggingFace online demo](https://huggingface.co/spaces/zjunlp/OmniThink)**!
[🤖Project]
[📄Paper]
[📺Youtube]
## Table of Contents
- 🚩[Acknowledgement](#Acknowledgement)
- 🌻[Quick Start](#quick-start)
- 🌟[Introduction](#Introduction)
- 🔧[Dependencies](#Dependencies)
- 🔍[Local Search Support](#-local-search-support)
- 📉[Results](#Results)
- 🧐[Evaluation](#evaluation)
# 🔔News
- `2025-08-24`, We have added **offline local search** support using RAGFlow technology! Now you can search local documents without an internet connection.
- `2025-03-12`, We have optimized the Docker usage for OmniThink.
- `2025-02-20`, We have added the evaluation methods from the paper to OmniThink, and in the future, we will integrate more evaluation methods.
- `2025-01-28`, We have provided support for the deepseek-reasoner model. You can run `./examples/deepseekr1.py` to test OmniThink's performance with deepseek-reasoner.
Previous News
- `2025-01-18`, we open-sourced OmniThink, a machine writing framework.
# 🌻Acknowledgement
- This work is built on [DSPy](https://github.com/stanfordnlp/dspy) and [STORM](https://github.com/stanford-oval/storm). Sincere thanks for their efforts.
- We are also very grateful to [Zhangjiabao-nudt](https://github.com/Zhangjiabao-nudt) and [techshoww](https://github.com/techshoww) for their contributions to this repository.
- If you have any questions, please feel free to contact us via xizekun.xzk@alibaba-inc.com, 1786594371@qq.com, or xizekun2023@zju.edu.cn, or create an issue.
## 📖 Quick Start
- 🌏 The **Online Demo** is available at [ModelScope](https://www.modelscope.cn/studios/iic/OmniThink) now!

# 📌 Introduction
Welcome to **OmniThink**, an innovative machine writing framework designed to replicate the human cognitive process of iterative expansion and reflection in generating insightful long-form articles.
- **Iterative Expansion and Reflection**: OmniThink uses a unique mechanism that simulates human cognitive behaviors to deepen the understanding of complex topics.
- **Enhanced Knowledge Density**: OmniThink focuses on expanding knowledge boundaries, resulting in articles that are rich in information and insights.
- **Comprehensive Article Generation**: OmniThink constructs outlines and generates articles, delivering high-quality content that is both coherent and contextually robust.
# 🛠 Dependencies
## 📦 Conda
```bash
# Create and activate a dedicated environment
conda create -n OmniThink python=3.11
conda activate OmniThink
# Fetch the source
git clone https://github.com/zjunlp/OmniThink.git
cd OmniThink
# Install requirements
pip install -r requirements.txt
```
## 🔍 Local Search Support
OmniThink now supports **offline local search** using RAGFlow technology! This feature allows you to:
- **Search local documents** without an internet connection
- **Use vector embeddings** for semantic search
- **Index and retrieve** your own document collections
- **Maintain data privacy** with local-only processing
### Local Search Features
- **OfflineRAGFlow**: Core RAG engine with FAISS vector database
- **LocalSearch**: DSPy-compatible search interface
- **Sentence Transformers**: High-quality text embeddings
- **Smart Chunking**: Intelligent document segmentation
- **Semantic Retrieval**: Context-aware search results
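
The idea behind the semantic retrieval feature above can be illustrated with a toy example. This is a minimal sketch in pure Python, not the code in `src/tools/rm.py`: the function names `cosine` and `top_k` and the three-dimensional vectors are hypothetical, and a real setup would use Sentence Transformers embeddings stored in a FAISS index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=3):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical hand-written "embeddings" standing in for model output.
toy_index = {
    "doc_cats": [1.0, 0.0, 0.0],
    "doc_dogs": [0.0, 1.0, 0.0],
    "doc_pets": [0.7, 0.7, 0.0],
}
hits = top_k([1.0, 0.1, 0.0], toy_index, k=2)
print(hits)
```

A vector database such as FAISS performs the same ranking, but with index structures that stay fast for large collections.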
### Quick Local Search Setup
```python
from src.tools.rm import OfflineRAGFlow, LocalSearch

# Initialize the local RAG engine
rag_engine = OfflineRAGFlow(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    chunk_size=800,
    overlap=120,
    k=5,
)

# Add documents to your local index
rag_engine.ingest(
    text="Your document content here...",
    meta={"title": "Document Title", "doc_id": "doc1"},
)

# Create DSPy-compatible search interface
local_search = LocalSearch(search=rag_engine, k=3)

# Use in your DSPy pipeline
results = local_search.forward("your search query")
```
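
To give a feel for how `chunk_size` and `overlap` interact, here is a minimal sketch of fixed-size chunking with overlap. It assumes both values are measured in characters, which the README does not state, and `chunk_text` is a hypothetical helper, not the chunker that `OfflineRAGFlow` uses.

```python
def chunk_text(text, chunk_size=800, overlap=120):
    """Split text into windows of chunk_size characters that share
    `overlap` characters with the previous window (assumed unit: characters)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(2000))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

With the defaults, each window starts 680 characters after the previous one, so a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks as long as it is shorter than the overlap.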
## 🐳 Docker
```bash
git clone https://github.com/zjunlp/OmniThink.git
docker pull zjunlp/omnithink:latest
docker run -it zjunlp/omnithink:latest
```
🔑 Before running, please export the LM API key and the search key as environment variables:
```bash
export LM_KEY=YOUR_API_KEY
export SEARCHKEY=YOUR_SEARCHKEY
```
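
A quick pre-flight check can catch a missing key before a long run starts. The sketch below only reads the two variable names given above; the helper `missing_keys` is hypothetical and not part of the repository.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_VARS = ("LM_KEY", "SEARCHKEY")


def missing_keys(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required keys are set.")
```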
### Local Search Dependencies
For local search functionality, additional packages are required:
```bash
# Install local search dependencies
pip install sentence-transformers faiss-cpu numpy
# Or use the updated requirements.txt
pip install -r requirements.txt
```
> You can define your own [LM API](https://github.com/zjunlp/OmniThink/blob/main/src/tools/lm.py) and [SEARCH API](https://github.com/zjunlp/OmniThink/blob/main/src/tools/rm.py)
> Note that the output of the LM should be a LIST.
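
The list requirement can be met with a thin adapter around any text-generation function. The following is a sketch under assumptions: `ListOutputLM` is a hypothetical class, and it does not reproduce the interface in `src/tools/lm.py`, which you should consult for the method names a custom LM must provide.

```python
from typing import Callable, List, Sequence, Union


class ListOutputLM:
    """Hypothetical adapter that always returns a list of completions."""

    def __init__(self, generate: Callable[[str], Union[str, Sequence[str]]]):
        self._generate = generate

    def __call__(self, prompt: str) -> List[str]:
        output = self._generate(prompt)
        # Wrap a single string; copy any other sequence into a plain list.
        if isinstance(output, str):
            return [output]
        return list(output)


# Stand-in for a real model call.
echo_lm = ListOutputLM(lambda prompt: f"echo: {prompt}")
result = echo_lm("hello")
print(result)
```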
# Results in OmniThink
The performance of OmniThink is shown below:
# Generate Article in OmniThink
Only one command is required:
```bash
sh run.sh
```
You can find the generated article, outline, and mind map in `./results/`.
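
To see what a run produced, you can list the files under the results directory. This is a small sketch: `list_outputs` is a hypothetical helper, and it makes no assumption about the file names or subfolders that `run.sh` creates.

```python
from pathlib import Path


def list_outputs(results_dir="./results"):
    """Return the relative paths of all files under results_dir, sorted."""
    root = Path(results_dir)
    if not root.is_dir():
        return []  # nothing generated yet, or a different working directory
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    for name in list_outputs():
        print(name)
```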
# 🔍 Evaluation
We provide convenient scripts for evaluating your method. The evaluation is divided into three categories: **Rubric_Grading**, **Knowledge_Density**, and **Information_Diversity**.
We use the `factscore` library. Please run the following code before starting the evaluation.
```bash
cd eval
git clone https://github.com/shmsw25/FActScore.git
```
For Rubric Grading:
```bash
python Rubric_Grading.py \
--articlepath articlepath \
--modelpath modelpath
```
For Information Diversity:
```bash
python Information_Diversity.py \
--mappath mappath \
--model_path model_path
```
For Knowledge Density:
```bash
python Knowledge_Density.py \
--articlepath articlepath \
--api_path api_path \
--threads threads
```
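
If you run all three evaluations regularly, the commands can be assembled in one place. The sketch below only builds and prints the command lines, using the script names and flags shown above; `build_eval_commands` is a hypothetical helper, and passing the same model path to both Rubric Grading and Information Diversity is an assumption.

```python
import shlex


def build_eval_commands(article_path, map_path, model_path, api_path, threads):
    """Return the three evaluation commands as argument lists."""
    return [
        ["python", "Rubric_Grading.py",
         "--articlepath", article_path, "--modelpath", model_path],
        ["python", "Information_Diversity.py",
         "--mappath", map_path, "--model_path", model_path],
        ["python", "Knowledge_Density.py",
         "--articlepath", article_path, "--api_path", api_path,
         "--threads", str(threads)],
    ]


# Placeholder paths; replace them with your own files.
commands = build_eval_commands("article.txt", "map.json", "model_dir", "api.txt", 8)
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, cwd="eval", check=True)` once the paths point at real files.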
## Citation
If you find our repo useful in your research, please consider citing:
```bibtex
@misc{xi2025omnithinkexpandingknowledgeboundaries,
title={OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking},
author={Zekun Xi and Wenbiao Yin and Jizhan Fang and Jialong Wu and Runnan Fang and Ningyu Zhang and Jiang Yong and Pengjun Xie and Fei Huang and Huajun Chen},
year={2025},
eprint={2501.09751},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2501.09751},
}
```