{"id":13588952,"url":"https://github.com/jonmagic/grim","last_synced_at":"2025-05-16T14:04:45.769Z","repository":{"id":56875328,"uuid":"2330620","full_name":"jonmagic/grim","owner":"jonmagic","description":"Tool for extracting pages from pdf as images and text as strings.","archived":false,"fork":false,"pushed_at":"2023-09-21T18:36:55.000Z","size":810,"stargazers_count":218,"open_issues_count":0,"forks_count":52,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-05-14T16:58:20.732Z","etag":null,"topics":["ghostscript","imagemagick","pdf","ruby"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jonmagic.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2011-09-05T20:49:58.000Z","updated_at":"2025-04-13T08:25:19.000Z","dependencies_parsed_at":"2024-01-05T21:59:53.502Z","dependency_job_id":null,"html_url":"https://github.com/jonmagic/grim","commit_stats":{"total_commits":119,"total_committers":14,"mean_commits":8.5,"dds":0.6554621848739496,"last_synced_commit":"0ecc4d262e7a3a417c037d046b86e4c75511c515"},"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jonmagic%2Fgrim","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jonmagic%2Fgrim/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jonmagic%2Fgrim/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jonmagic%2Fgrim/manifests","owner_url":"https://repos
.ecosyste.ms/api/v1/hosts/GitHub/owners/jonmagic","download_url":"https://codeload.github.com/jonmagic/grim/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254544146,"owners_count":22088807,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ghostscript","imagemagick","pdf","ruby"],"created_at":"2024-08-01T16:00:15.402Z","updated_at":"2025-05-16T14:04:45.742Z","avatar_url":"https://github.com/jonmagic.png","language":"Ruby","funding_links":[],"categories":["RUBY","PDF","Libraries"],"sub_categories":["Ruby"],"readme":"```\n                    ,____\n                    |---.\\\n            ___     |    `\n           / .-\\  ./=)\n          |  |\"|_/\\/|\n          ;  |-;| /_|\n         / \\_| |/ \\ |\n        /      \\/\\( |\n        |   /  |` ) |\n        /   \\ _/    |\n       /--._/  \\    |\n       `/|)    |    /\n         /     |   |\n       .'      |   |\n      /         \\  |\n     (_.-.__.__./  /\n```\n\n# Grim\n\nGrim is a simple gem for extracting (reaping) a page from a PDF and converting it to an image, as well as extracting the text from the page as a string. It gives you an easy-to-use API for Ghostscript, ImageMagick, and pdftotext, specific to this use case.\n\n## Prerequisites\n\nYou will need Ghostscript, ImageMagick, and xpdf installed. 
On the Mac (OSX) I highly recommend using [Homebrew](http://mxcl.github.com/homebrew/) to get them installed.\n\n```bash\n$ brew install ghostscript imagemagick xpdf\n```\n\n## Installation\n\n```bash\n$ gem install grim\n```\n\n## Usage\n\n```ruby\npdf   = Grim.reap(\"/path/to/pdf\")         # returns Grim::Pdf instance for pdf\ncount = pdf.count                         # returns the number of pages in the pdf\npng   = pdf[3].save('/path/to/image.png') # will return true if page was saved or false if not\ntext  = pdf[3].text                       # returns text as a String\n\npdf.each do |page|\n  puts page.text\nend\n```\n\nWe also support using other processors (the default is whatever version of Imagemagick/Ghostscript is in your path).\n\n```ruby\n# specifying one processor with specific ImageMagick and GhostScript paths\nGrim.processor =  Grim::ImageMagickProcessor.new({:imagemagick_path =\u003e \"/path/to/convert\", :ghostscript_path =\u003e \"/path/to/gs\"})\n\n# multiple processors with fallback if first fails, useful if you need multiple versions of convert/gs\nGrim.processor = Grim::MultiProcessor.new([\n  Grim::ImageMagickProcessor.new({:imagemagick_path =\u003e \"/path/to/6.7/convert\", :ghostscript_path =\u003e \"/path/to/9.04/gs\"}),\n  Grim::ImageMagickProcessor.new({:imagemagick_path =\u003e \"/path/to/6.6/convert\", :ghostscript_path =\u003e \"/path/to/9.02/gs\"})\n])\n\npdf = Grim.reap('/path/to/pdf')\n```\n\nYou can even specify a Windows executable :zap:\n\n```ruby\n# specifying another ghostscript executable, win64 in this example\n# the ghostscript/bin folder still has to be in the PATH for this to work\nGrim.processor =  Grim::ImageMagickProcessor.new({:ghostscript_path =\u003e \"gswin64c.exe\"})\n\npdf = Grim.reap('/path/to/pdf')\n```\n\n`Grim::ImageMagickProcessor#save` supports several options as well:\n\n```ruby\npdf = Grim.reap(\"/path/to/pdf\")\npdf[0].save('/path/to/image.png', {\n  :width =\u003e 600,         # defaults to 1024\n  
:density =\u003e 72,        # defaults to 300\n  :quality =\u003e 60,        # defaults to 90\n  :colorspace =\u003e \"CMYK\", # defaults to \"RGB\"\n  :alpha =\u003e \"Activate\"   # not used when not set\n})\n```\n\nGrim has limited logging abilities. The default logger is `Grim::NullLogger` but you can also set your own logger.\n\n```ruby\nrequire \"logger\"\nGrim.logger = Logger.new($stdout).tap { |logger| logger.progname = 'Grim' }\nGrim.processor = Grim::ImageMagickProcessor.new({:ghostscript_path =\u003e \"/path/to/bin/gs\"})\npdf = Grim.reap(\"/path/to/pdf\")\npdf[3].save('/path/to/image.png')\n# D, [2016-06-09T22:43:07.046532 #69344] DEBUG -- grim: Running imagemagick command\n# D, [2016-06-09T22:43:07.046626 #69344] DEBUG -- grim: PATH=/path/to/bin:/usr/local/bin:/usr/bin\n# D, [2016-06-09T22:43:07.046787 #69344] DEBUG -- grim: convert -resize 1024 -antialias -render -quality 90 -colorspace RGB -interlace none -density 300 /path/to/pdf /path/to/image.png\n```\n\n## Reference\n\n* [jonmagic.com: Grim](http://theprogrammingbutler.com/blog/archives/2011/09/06/grim/)\n* [jonmagic.com: Grim MultiProcessor](http://theprogrammingbutler.com/blog/archives/2011/10/06/grim-multiprocessor-to-the-rescue/)\n\n## Contributors\n\n* [@jonmagic](https://github.com/jonmagic)\n* [@jnunemaker](https://github.com/jnunemaker)\n* [@bryckbost](https://github.com/bryckbost)\n* [@bkeepers](https://github.com/bkeepers)\n* [@BobaFaux](https://github.com/BobaFaux)\n* [@Rubikan](https://github.com/Rubikan)\n* [@victormier](https://github.com/victormier)\n* [@philgooch](https://github.com/philgooch)\n* [@adamcrown](https://github.com/adamcrown)\n* [@fujimura](https://github.com/fujimura)\n* [@JamesPaden](https://github.com/JamesPaden)\n* [@fgiannattasio](https://github.com/fgiannattasio)\n\n## License\n\nSee [LICENSE](LICENSE) for 
details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjonmagic%2Fgrim","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjonmagic%2Fgrim","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjonmagic%2Fgrim/lists"}