https://github.com/thiswillbeyourgithub/logseqpdfimporter
Import PDFs into Logseq, including annotations made in other software
- Host: GitHub
- URL: https://github.com/thiswillbeyourgithub/logseqpdfimporter
- Owner: thiswillbeyourgithub
- License: gpl-3.0
- Created: 2023-07-28T13:22:39.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-10-22T08:12:27.000Z (over 1 year ago)
- Last Synced: 2025-08-16T09:40:24.122Z (7 months ago)
- Language: Python
- Size: 707 KB
- Stars: 30
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# LogseqPDFImporter
Import PDFs into [logseq](https://github.com/logseq/logseq/), including annotations made in other software.
## Status
* *Not feature complete, but I've used it successfully several times*
* The text highlights are correctly parsed.
* Other types of annotations (lines, shapes, rectangles, etc.) are parsed as "area highlights" (open an issue if something goes wrong). Each area is currently a single rectangle that surrounds the whole annotation; I have yet to code the exact rectangle geometry extraction (help welcome!)
* Colors are correctly matched to logseq's available colors.
* Creates both the .md and .edn files, as well as images of area highlights.
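
As an illustration of the color matching mentioned above, here is a minimal sketch of one way to snap an arbitrary annotation color to a small fixed palette. It is not taken from this project's code: the palette names assume Logseq offers yellow, red, green, blue and purple highlights, the RGB values are illustrative guesses, and `nearest_logseq_color` is a hypothetical helper.

```python
# Hypothetical palette: approximate RGB values for the five highlight colors
# assumed to exist in Logseq. The numbers are illustrative guesses, not values
# read from Logseq or from LogseqPDFImporter.
LOGSEQ_COLORS = {
    "yellow": (255, 235, 59),
    "red": (244, 67, 54),
    "green": (76, 175, 80),
    "blue": (33, 150, 243),
    "purple": (156, 39, 176),
}


def nearest_logseq_color(rgb):
    """Return the palette name closest to `rgb`.

    `rgb` holds three floats in the 0..1 range, the way colors are
    stored in a PDF annotation's color array.
    """
    r, g, b = (round(c * 255) for c in rgb)

    def squared_distance(name):
        pr, pg, pb = LOGSEQ_COLORS[name]
        return (r - pr) ** 2 + (g - pg) ** 2 + (b - pb) ** 2

    # Pick the palette entry with the smallest squared Euclidean distance.
    return min(LOGSEQ_COLORS, key=squared_distance)
```

Plain Euclidean distance in RGB is the simplest choice; a perceptual color space would likely match borderline colors better, at the cost of more code.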
### PDF reader compatibility
- I use [Okular](https://okular.kde.org/) from KDE on my computers and [Xodo](https://xodo.com/) on Android. Both use annotations that are fully compatible, by the way!
- I assume it works out of the box with other readers, minus some quirks, most likely around freehand drawing.
- **Tell me if you tested it on other software!**
### TODO (please help)
* Put it on PyPI and detail in the README how to use it as a uv tool
* Fix the text annotations by using small rectangles that cover exactly the text instead of one large overlapping area over the whole text
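
For anyone wanting to help with the last item, here is a rough sketch of the idea, not code from this repository. In the PDF specification, text-markup annotations (highlight, underline, etc.) carry a `/QuadPoints` array with 8 numbers per quadrilateral, typically one quadrilateral per line of text. The sketch assumes that array has already been read into a flat Python list; `quads_to_rects` and `bounding_rect` are hypothetical helper names.

```python
def quads_to_rects(quadpoints):
    """Split a flat QuadPoints list into one (x0, y0, x1, y1) box per quad."""
    if len(quadpoints) % 8 != 0:
        raise ValueError("QuadPoints length must be a multiple of 8")
    rects = []
    for i in range(0, len(quadpoints), 8):
        quad = quadpoints[i:i + 8]
        xs, ys = quad[0::2], quad[1::2]
        # Using min/max makes the result independent of the corner order,
        # which differs between PDF writers.
        rects.append((min(xs), min(ys), max(xs), max(ys)))
    return rects


def bounding_rect(rects):
    """Single rectangle surrounding all boxes (the 'one large area' approach)."""
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )
```

For a two-line highlight, `quads_to_rects` yields two tight boxes, whereas `bounding_rect` merges them into one box that also covers the text outside the highlighted span on the shorter line.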
## Usage
* `python -m pip install -r requirements.txt`
* `python LogseqPDFImporter.py path_to_pdf --md_path path_to_md --edn_path path_to_edn`
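
To import a whole folder of PDFs, the command above can be wrapped in a small script. This is only a sketch under assumptions: it assumes `--md_path` and `--edn_path` take file paths, that the script is run from the repository root, and that naming the outputs after the PDF stem is acceptable. `build_command` and `import_all` are hypothetical helpers, not part of the project.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pdf_path, out_dir):
    """Build the argument list for one PDF, mirroring the usage line above."""
    pdf = Path(pdf_path)
    out = Path(out_dir)
    return [
        sys.executable, "LogseqPDFImporter.py", str(pdf),
        # Assumption: outputs are named after the PDF file stem.
        "--md_path", str(out / f"{pdf.stem}.md"),
        "--edn_path", str(out / f"{pdf.stem}.edn"),
    ]


def import_all(pdf_dir, out_dir):
    """Run the importer once per PDF found directly inside `pdf_dir`."""
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        # check=True stops the batch at the first failing import.
        subprocess.run(build_command(pdf, out_dir), check=True)
```

Calling `import_all("my_pdfs", "my_output")` would then process each PDF in turn; both directory names are placeholders.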
## Example
### 1

### 2

## Credits
* [user e-zz who was indispensable in getting the annotation locations right](https://github.com/e-zz/logseq-pdf-extract/discussions/3#discussioncomment-7902471)
* [pdfannots](https://github.com/0xabu/pdfannots/)