https://github.com/preciz/pdf_info

Extract all metadata from a PDF binary
elixir pdf

Last synced: about 2 months ago

Host: GitHub
URL: https://github.com/preciz/pdf_info
Owner: preciz
License: mit
Created: 2020-07-10T19:58:51.000Z (almost 5 years ago)
Default Branch: master
Last Pushed: 2025-03-08T23:04:06.000Z (3 months ago)
Last Synced: 2025-04-10T02:52:42.299Z (about 2 months ago)
Topics: elixir, pdf
Language: Elixir
Homepage: https://hexdocs.pm/pdf_info
Size: 91.8 KB
Stars: 11
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# PDFInfo

![Actions Status](https://github.com/preciz/pdf_info/workflows/test/badge.svg)

Extracts all `/Info` and `/Metadata` objects from a PDF binary using Regex, with zero dependencies.
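To give a feel for what regex-based extraction over a raw binary can look like, here is a minimal sketch. It is not PDFInfo's actual implementation: the module `InfoRefSketch` and its function `info_refs/1` are hypothetical names, and the pattern only finds indirect references of the form `/Info 6 0 R`, not the objects they point to.

```elixir
defmodule InfoRefSketch do
  # Hypothetical helper, not part of PDFInfo.
  # Scans a raw PDF binary for indirect references such as "/Info 6 0 R"
  # and returns them as {object_number, generation_number} tuples.
  def info_refs(binary) when is_binary(binary) do
    # No unicode modifier: the regex matches on raw bytes, so it also
    # works on binaries that are not valid UTF-8.
    ~r{/Info\s+(\d+)\s+(\d+)\s+R}
    |> Regex.scan(binary)
    |> Enum.map(fn [_full, obj, gen] ->
      {String.to_integer(obj), String.to_integer(gen)}
    end)
    |> Enum.uniq()
  end
end

# A tiny hand-written trailer, used here only for illustration.
trailer = "%PDF-1.4\ntrailer\n<< /Root 1 0 R /Info 6 0 R >>\n%%EOF"
InfoRefSketch.info_refs(trailer)
# => [{6, 0}]
```

Because the scan works on plain bytes, it can only see references that are stored uncompressed and unencrypted, which is where the limitations below come from.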

Limitations:

If the PDF is encrypted or the metadata is compressed, you have to decrypt and uncompress it first, for example with `qpdf`:

```
qpdf --stream-data=uncompress --compress-streams=n --decrypt --password='' myfile.pdf myfile_out.pdf
```
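If you would rather run that step from Elixir, the command above can be wrapped with `System.cmd/3`. This is only a sketch: `QpdfSketch` and its functions are hypothetical names that are not part of PDFInfo, and it assumes a `qpdf` executable is available on the `PATH`.

```elixir
defmodule QpdfSketch do
  # Hypothetical wrapper, not part of PDFInfo. Assumes `qpdf` is on the PATH.

  # Builds the same argument list as the shell command shown above.
  # An empty password corresponds to --password=''.
  def args(input, output, password \\ "") do
    [
      "--stream-data=uncompress",
      "--compress-streams=n",
      "--decrypt",
      "--password=" <> password,
      input,
      output
    ]
  end

  # Runs qpdf and returns :ok or {:error, {exit_status, output}}.
  # Any non-zero exit status is reported as an error here, which may be
  # stricter than you need. System.cmd/3 raises if qpdf is not installed.
  def decrypt_and_uncompress(input, output, password \\ "") do
    case System.cmd("qpdf", args(input, output, password), stderr_to_stdout: true) do
      {_out, 0} -> :ok
      {out, status} -> {:error, {status, out}}
    end
  end
end
```

After a successful `QpdfSketch.decrypt_and_uncompress("myfile.pdf", "myfile_out.pdf")`, the output file can be read with `File.read!/1` and passed to PDFInfo as usual.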

## Installation

Add `pdf_info` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:pdf_info, "~> 0.1.0"}
  ]
end
```

## Usage

```elixir
iex(1)> pdf = File.read!("/Downloads/sample.pdf")
<<37, 80, 68, 70, 45, ...>>
iex(2)> PDFInfo.is_pdf?(pdf)
true # looks like it's a PDF!
iex(3)> PDFInfo.is_encrypted?(pdf)
false # it's not encrypted (this lib can't decrypt, if it's encrypted then decrypt first)
iex(4)> PDFInfo.info_objects(pdf)
# a map with info objects
%{"/Info 6 0 R" => [
  %{
    "Author" => "Barna Kovacs",
    "CreationDate" => "D:20200212212756Z",
    "Title" => "Can't come up with a title"
  }
]}
iex(5)> PDFInfo.metadata_objects(pdf)
# list of maps with metadata
[
  %{
    {"dc", "creator"} => "Barna Kovacs",
    {"dc", "format"} => "application/pdf",
    {"dc", "title"} => "Can't come up with a title",
    ...
  }
]
```
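The `"CreationDate"` value in the example above is a raw PDF date string. If you need it as a `DateTime`, something along these lines may help. `PdfDateSketch` is a hypothetical helper, not part of PDFInfo, and it only handles the `D:YYYYMMDDHHmmSSZ` form shown in the example. PDF date strings can also be shorter or carry an explicit offset, and those are rejected here.

```elixir
defmodule PdfDateSketch do
  # Hypothetical helper, not part of PDFInfo.
  # Parses only the full UTC form "D:YYYYMMDDHHmmSSZ".
  def parse(<<"D:", y::binary-size(4), mo::binary-size(2), d::binary-size(2),
              h::binary-size(2), mi::binary-size(2), s::binary-size(2), "Z">>) do
    # Rebuild the value as ISO 8601 and let the standard library validate it.
    case DateTime.from_iso8601("#{y}-#{mo}-#{d}T#{h}:#{mi}:#{s}Z") do
      {:ok, datetime, _offset} -> {:ok, datetime}
      {:error, reason} -> {:error, reason}
    end
  end

  # Anything else (shorter strings, explicit offsets, garbage) is not handled.
  def parse(_other), do: {:error, :unsupported_format}
end

{:ok, created} = PdfDateSketch.parse("D:20200212212756Z")
DateTime.to_iso8601(created)
# => "2020-02-12T21:27:56Z"
```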

## Documentation

Documentation can be found at [https://hexdocs.pm/pdf_info](https://hexdocs.pm/pdf_info).

## License

PDFInfo is [MIT licensed](LICENSE).

## Credit

Inspired by [https://gitlab.com/nxl4/pdf-metadata](https://gitlab.com/nxl4/pdf-metadata)
