https://github.com/preciz/pdf_info
Extract all metadata from a PDF binary
- Host: GitHub
- URL: https://github.com/preciz/pdf_info
- Owner: preciz
- License: mit
- Created: 2020-07-10T19:58:51.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2025-03-08T23:04:06.000Z (3 months ago)
- Last Synced: 2025-04-10T02:52:42.299Z (about 2 months ago)
- Topics: elixir, pdf
- Language: Elixir
- Homepage: https://hexdocs.pm/pdf_info
- Size: 91.8 KB
- Stars: 11
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# PDFInfo

Extracts all `/Info` and `/Metadata` objects from a PDF binary using Regex,
with zero dependencies.

Limitations:

If the PDF is encrypted or its metadata is compressed, you must decrypt and uncompress the file first, for example with `qpdf`:
```
qpdf --stream-data=uncompress --compress-streams=n --decrypt --password='' myfile.pdf myfile_out.pdf
```

## Installation
Add `pdf_info` to your list of dependencies in `mix.exs`:
```elixir
def deps do
  [
    {:pdf_info, "~> 0.1.0"}
  ]
end
```

## Usage
```elixir
iex(1)> pdf = File.read!("/Downloads/sample.pdf")
<<37, 80, 68, 70, 45, ...>>
iex(2)> PDFInfo.is_pdf?(pdf)
true # looks like it's a PDF!
iex(3)> PDFInfo.is_encrypted?(pdf)
false # it's not encrypted (this lib can't decrypt, if it's encrypted then decrypt first)
iex(4)> PDFInfo.info_objects(pdf)
# a map with info objects
%{"/Info 6 0 R" => [
  %{
    "Author" => "Barna Kovacs",
    "CreationDate" => "D:20200212212756Z",
    "Title" => "Can't come up with a title"
  }
]}
iex(5)> PDFInfo.metadata_objects(pdf)
# list of maps with metadata
[
  %{
    {"dc", "creator"} => "Barna Kovacs",
    {"dc", "format"} => "application/pdf",
    {"dc", "title"} => "Can't come up with a title",
    ...
  }
]
```

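The `"CreationDate"` value in the output above is a raw PDF date string, not an Elixir `DateTime`. If you need a real timestamp, a small helper along these lines may be enough. This is only a sketch: `PDFDateSketch` and `parse/1` are hypothetical names and are not part of `pdf_info`, and the helper assumes the full UTC form shown in the example (`D:YYYYMMDDHHmmSSZ`). PDF dates may also be truncated or carry an explicit offset such as `+01'00'`, which this sketch rejects rather than guesses at.

```elixir
defmodule PDFDateSketch do
  @moduledoc """
  Hypothetical helper (not part of pdf_info): converts a PDF date string
  such as "D:20200212212756Z" into a `DateTime`.
  """

  # Matches only the full UTC form: D:YYYYMMDDHHmmSSZ
  defp utc_date_regex do
    ~r/\AD:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})Z\z/
  end

  @doc """
  Returns `{:ok, datetime}` for a full UTC PDF date string,
  `{:error, :unsupported_format}` for any other shape, or the
  `{:error, reason}` from `DateTime.from_iso8601/1` for impossible dates.
  """
  def parse(value) when is_binary(value) do
    case Regex.run(utc_date_regex(), value, capture: :all_but_first) do
      [year, month, day, hour, minute, second] ->
        # Rebuild the digits as ISO 8601 and let the standard library validate them.
        iso8601 = "#{year}-#{month}-#{day}T#{hour}:#{minute}:#{second}Z"

        case DateTime.from_iso8601(iso8601) do
          {:ok, datetime, _utc_offset} -> {:ok, datetime}
          {:error, reason} -> {:error, reason}
        end

      nil ->
        {:error, :unsupported_format}
    end
  end
end
```

With the sample value from the session above, `PDFDateSketch.parse("D:20200212212756Z")` returns `{:ok, ~U[2020-02-12 21:27:56Z]}`. Returning tagged tuples instead of raising lets callers decide how to treat files whose date fields are malformed.
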
## Documentation
Documentation can be found at [https://hexdocs.pm/pdf_info](https://hexdocs.pm/pdf_info).
## License
PDFInfo is [MIT licensed](LICENSE).
## Credit
Inspired by [https://gitlab.com/nxl4/pdf-metadata](https://gitlab.com/nxl4/pdf-metadata)