https://github.com/node0/timbermill
OCR-powered chat session renderer that slices long conversations into paginated, searchable PDFs
- Host: GitHub
- URL: https://github.com/node0/timbermill
- Owner: Node0
- License: mit
- Created: 2025-04-15T09:01:25.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2025-04-15T09:01:40.000Z (about 1 month ago)
- Last Synced: 2025-04-15T10:22:20.548Z (about 1 month ago)
- Topics: chat-archive, chatgpt, cv2, document-processing, llm-tools, ocr, pdf-generation, python
- Homepage:
- Size: 3.91 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Project Timbermill
**OCR-powered chat session renderer that slices long conversations into paginated, searchable PDFs.**
Timbermill processes full-length ChatGPT-style conversation screenshots or PDFs by using OpenCV to cut them into pages of near-equal length, placing each cut at a logical whitespace break. Depending on the pipeline strategy, either each page is OCR’d individually before assembly, or the pages are assembled into a unified PDF with embedded searchable text. The project aims to streamline the archival and export of mobile LLM chat logs, especially on iOS, where traditional exports are cumbersome. It is designed for short-lived, high-utility deployment.
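
The whitespace-aware slicing step can be illustrated with a minimal sketch. This is not Timbermill's actual code: the function names `find_page_cuts` and `slice_pages`, the parameters `page_height`, `search_window`, and `white_threshold`, and the fallback to a hard cut are all assumptions made for illustration. The sketch works on a grayscale NumPy array, such as one loaded with `cv2.imread(path, cv2.IMREAD_GRAYSCALE)`.

```python
import numpy as np


def find_page_cuts(gray, page_height, search_window=200, white_threshold=250):
    """Return row indices at which to cut a tall grayscale image into pages.

    Hypothetical helper: each cut is placed at the blank row closest to
    (and not below) the ideal page boundary, within `search_window` rows.
    """
    height = gray.shape[0]
    # A row counts as whitespace when every pixel is at least white_threshold.
    blank_rows = (gray >= white_threshold).all(axis=1)

    cuts = []
    start = 0
    while height - start > page_height:
        target = start + page_height
        lo = max(start + 1, target - search_window)
        # Look for blank rows in the window that ends at the ideal boundary.
        candidates = np.flatnonzero(blank_rows[lo:target + 1])
        # Assumed fallback: cut at the ideal boundary if no blank row exists.
        cut = lo + int(candidates[-1]) if candidates.size else target
        cuts.append(cut)
        start = cut
    return cuts


def slice_pages(gray, cuts):
    """Split the image into page arrays at the given cut rows."""
    bounds = [0, *cuts, gray.shape[0]]
    return [gray[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
```

Searching only upward from the ideal boundary keeps every page at or below `page_height`, which is one way to reconcile "equal-length" pages with cuts that respect whitespace. Each resulting page array could then be passed to an OCR step or written into a PDF.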