{"id":13484394,"url":"https://github.com/boazsegev/combine_pdf","last_synced_at":"2025-05-13T18:12:12.926Z","repository":{"id":20285522,"uuid":"23558888","full_name":"boazsegev/combine_pdf","owner":"boazsegev","description":"A Pure ruby library to merge PDF files, number pages and maybe more...","archived":false,"fork":false,"pushed_at":"2025-04-08T06:53:09.000Z","size":900,"stargazers_count":755,"open_issues_count":54,"forks_count":167,"subscribers_count":20,"default_branch":"master","last_synced_at":"2025-04-25T15:48:48.474Z","etag":null,"topics":["pdf","pdf-files","pdf-generation","pdf-merge","ruby"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/boazsegev.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2014-09-01T23:47:44.000Z","updated_at":"2025-04-16T05:28:19.000Z","dependencies_parsed_at":"2024-05-01T10:35:57.705Z","dependency_job_id":"3c12d1c9-087a-4f03-bc12-bc4461162d29","html_url":"https://github.com/boazsegev/combine_pdf","commit_stats":{"total_commits":391,"total_committers":33,"mean_commits":"11.848484848484848","dds":0.6521739130434783,"last_synced_commit":"d023427f5b1eaa80aa8be2900833b1091e750664"},"previous_names":[],"tags_count":92,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boazsegev%2Fcombine_pdf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boazsegev%2Fcombine_pdf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/repositories/boazsegev%2Fcombine_pdf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/boazsegev%2Fcombine_pdf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/boazsegev","download_url":"https://codeload.github.com/boazsegev/combine_pdf/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254000857,"owners_count":21997442,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pdf","pdf-files","pdf-generation","pdf-merge","ruby"],"created_at":"2024-07-31T17:01:23.595Z","updated_at":"2025-05-13T18:12:12.901Z","avatar_url":"https://github.com/boazsegev.png","language":"Ruby","funding_links":[],"categories":["Ruby","PDF"],"sub_categories":[],"readme":"# CombinePDF - the ruby way for merging PDF files\n[![Gem Version](https://badge.fury.io/rb/combine_pdf.svg)](http://badge.fury.io/rb/combine_pdf)\n[![GitHub](https://img.shields.io/badge/GitHub-Open%20Source-blue.svg)](https://github.com/boazsegev/combine_pdf)\n[![Documentation](http://inch-ci.org/github/boazsegev/combine_pdf.svg?branch=master)](https://www.rubydoc.info/github/boazsegev/combine_pdf)\n[![Maintainers Wanted](https://img.shields.io/badge/maintainers-wanted-red.svg)](https://github.com/pickhardt/maintainers-wanted)\n\n\nCombinePDF is a nifty model, written in pure Ruby, to parse PDF files and combine (merge) them with other PDF files, watermark them or stamp them (all using the PDF file format and pure Ruby code).\n\n## Unmaintained - Help Wanted(!)\n\nI decided to stop maintaining this gem and 
hope someone could take over the PR reviews and maintenance of this gem (or simply open a successful fork).\n\nI wrote this gem because I needed to solve an issue with bates-numbering existing PDF documents.\n\nHowever, since 2014 I have been maintaining the gem for free and for no reason at all, except that I enjoyed sharing it with the community.\n\nI love this gem, but I cannot keep maintaining it as I have my own projects to focus on and I need both the time and (more importantly) the mindspace.\n\n## Install\n\nInstall with ruby gems:\n\n```ruby\ngem install combine_pdf\n```\n\n## Known Limitations\n\nQuick rundown:\n\n* When reading PDF Forms, some form data might be lost. I tried fixing this to the best of my ability, but I'm not sure it all works just yet.\n\n* When combining PDF Forms, form data might be unified. I couldn't fix this because this is how PDF forms work (filling a field fills in the data in any field with the same name), but frankly, I kinda liked the issue... it's almost a feature.\n\n* When unifying the same TOC data more than once, one of the references will be unified with the other (meaning that if the pages look the same, both references will link to the same page instead of linking to two different pages). You can fix this by adding content to the pages before merging the PDF files (i.e. add empty text boxes to all the pages).\n\n* Some links and data (URL links and PDF \"Named Destinations\") are stored at the root of a PDF and they aren't linked back to from the page. Keeping this information requires merging the PDF objects rather than their pages.\n\n    Some links will be lost when ripping pages out of PDF files and merging them with another PDF.\n\n* Some encrypted PDF files (usually the ones you can't view without a password) will fail quietly instead of noisily. 
If you prefer to choose the noisy route, you can specify the `raise_on_encrypted` option using `CombinePDF.load(pdf_file, raise_on_encrypted: true)` which will raise a `CombinePDF::EncryptionError`.\n\n* Sometimes CombinePDF will raise an exception even if the PDF could be parsed (e.g., when PDF optional content exists)... I find it better to err on the side of caution, although for optional content PDFs an exception is avoidable using `CombinePDF.load(pdf_file, allow_optional_content: true)`.\n\n* The CombinePDF gem runs recursive code to both parse and format the PDF files. Hence, PDF files that have heavily nested objects, as well as those that were combined in a way that results in cyclic nesting, might explode the stack - resulting in an exception or program failure.\n\nCombinePDF is written natively in Ruby and should (presumably) work on all Ruby platforms that follow Ruby 2.0 compatibility.\n\nHowever, PDF files are quite complex creatures and no guarantee is provided.\n\nFor example, PDF Forms are known to have issues and form data might be lost when attempting to combine PDFs with filled form data (also, forms are global objects, not page specific, so one should combine the whole of the PDF for any data to have any chance of being preserved).\n\nThe same applies to PDF links and the table of contents, which all have global attributes and could be corrupted or lost when combining PDF data.\n\nIf this library causes loss of data or burns down your house, I'm not to blame - as pointed to by the MIT license. 
That being said, I'm using the library happily after testing against different solutions.\n\n## Combine/Merge PDF files or Pages\n\nTo combine PDF files (or data):\n\n```ruby\npdf = CombinePDF.new\npdf \u003c\u003c CombinePDF.load(\"file1.pdf\") # one way to combine, very fast.\npdf \u003c\u003c CombinePDF.load(\"file2.pdf\")\npdf.save \"combined.pdf\"\n```\n\nOr even a one-liner:\n\n```ruby\n(CombinePDF.load(\"file1.pdf\") \u003c\u003c CombinePDF.load(\"file2.pdf\") \u003c\u003c CombinePDF.load(\"file3.pdf\")).save(\"combined.pdf\")\n```\n\nYou can also add just odd or even pages:\n\n```ruby\npdf = CombinePDF.new\ni = 0\nCombinePDF.load(\"file.pdf\").pages.each do |page|\n  i += 1\n  pdf \u003c\u003c page if i.even?\nend\npdf.save \"even_pages.pdf\"\n```\n\nNotice that adding all the pages one by one is slower than adding the whole file.\n\n## Add content to existing pages (Stamp / Watermark)\n\nTo add content to existing PDF pages, first import the new content from an existing PDF file. After that, add the content to each of the pages in your existing PDF.\n\nIn this example, we will add a company logo to each page:\n\n```ruby\ncompany_logo = CombinePDF.load(\"company_logo.pdf\").pages[0]\npdf = CombinePDF.load \"content_file.pdf\"\npdf.pages.each {|page| page \u003c\u003c company_logo} # notice the \u003c\u003c operator is on a page and not a PDF object.\npdf.save \"content_with_logo.pdf\"\n```\n\nNotice the \u003c\u003c operator is on a page and not a PDF object. The \u003c\u003c operator acts differently on PDF objects and on Pages.\n\nThe \u003c\u003c operator defaults to secure injection by renaming references to avoid conflicts. 
For overlaying pages using compressed data that might not be editable (due to limited filter support), you can use:\n\n```ruby\npdf.pages(nil, false).each {|page| page \u003c\u003c stamp_page}\n```\n\n## Page Numbering\n\nAdding page numbers to a PDF object or file is as simple as can be:\n\n```ruby\npdf = CombinePDF.load \"file_to_number.pdf\"\npdf.number_pages\npdf.save \"file_with_numbering.pdf\"\n```\n\nNumbering can be done with many different options, with different formatting, with or without a box object, and even with opacity values - [see documentation](https://www.rubydoc.info/github/boazsegev/combine_pdf/CombinePDF/PDF#number_pages-instance_method).\n\nFor example, should you prefer to place the page number on the bottom right side of all PDF pages, do:\n\n```ruby\npdf.number_pages(location: [:bottom_right])\n```\n\nAs another example, the dashes around the number are removed and a box is placed around it. The numbering is semi-transparent and the first 3 pages are numbered using letters (a,b,c) rather than numbers:\n\n\n```ruby\n# number first 3 pages as \"a\", \"b\", \"c\"\npdf.number_pages(number_format: \" %s \",\n                 location: [:top, :bottom, :top_left, :top_right, :bottom_left, :bottom_right],\n                 start_at: \"a\",\n                 page_range: (0..2),\n                 box_color: [0.8,0.8,0.8],\n                 border_color: [0.4, 0.4, 0.4],\n                 border_width: 1,\n                 box_radius: 6,\n                 opacity: 0.75)\n# number the rest of the pages as 4, 5, ... 
etc'\npdf.number_pages(number_format: \" %s \",\n                 location: [:top, :bottom, :top_left, :top_right, :bottom_left, :bottom_right],\n                 start_at: 4,\n                 page_range: (3..-1),\n                 box_color: [0.8,0.8,0.8],\n                 border_color: [0.4, 0.4, 0.4],\n                 border_width: 1,\n                 box_radius: 6,\n                 opacity: 0.75)\n```\n\n    pdf.number_pages(number_format: \" %s \", location: :bottom_right, font_size: 44)\n\n\n## Loading and Parsing PDF data\n\nLoading PDF data can be done from the file system or directly from memory.\n\nLoading data from a file is easy:\n\n```ruby\npdf = CombinePDF.load(\"file.pdf\")\n```\n\nYou can also parse PDF files from memory. Loading from memory is especially effective for importing PDF data received through the internet or from a different authoring library such as Prawn:\n\n```ruby\npdf_data = prawn_pdf_document.render # Import PDF data from Prawn\npdf = CombinePDF.parse(pdf_data)\n```\n\nUsing `parse` is also effective when loading data from a remote location, circumventing the need for unnecessary temporary files. For example:\n\n```ruby\nrequire 'combine_pdf'\nrequire 'net/http'\n\nurl = \"https://example.com/my.pdf\"\npdf = CombinePDF.parse Net::HTTP.get_response(URI.parse(url)).body\n```\n\n## Rendering PDF data\n\nSimilarly to loading and parsing, rendering can also be performed either to memory or to a file.\n\nYou can output a string of PDF data using `.to_pdf`. 
For example, to let a user download the PDF from either a [Rails application](http://rubyonrails.org) or a [Plezi application](http://www.plezi.io):\n\n```ruby\n# in a controller action\nsend_data combined_file.to_pdf, filename: \"combined.pdf\", type: \"application/pdf\"\n```\n\nIn [Sinatra](http://www.sinatrarb.com):\n\n```ruby\n# in your path's block\nstatus 200\nbody combined_file.to_pdf\nheaders 'content-type' =\u003e \"application/pdf\"\n```\n\n\nIf you prefer to save the PDF data to a file, you can always use the `save` method as we did in our earlier examples.\n\nSome PDF files contain optional content sections which cannot always be merged reliably. By default, an exception is\nraised if one of these files is detected. You can optionally pass an `allow_optional_content` parameter to the\n`PDFParser.new`, `CombinePDF.load` and `CombinePDF.parse` methods:\n\n```ruby\nnew_pdf = CombinePDF.new\nnew_pdf \u003c\u003c CombinePDF.load(pdf_file, allow_optional_content: true)\nattachments.each { |att| new_pdf \u003c\u003c CombinePDF.load(att, allow_optional_content: true) }\n```\n\nDemo\n====\n\nYou can see a Demo for a [\"Bates stamping web-app\"](http://combine-pdf-demo.herokuapp.com/bates) and read through its [code](https://github.com/boazsegev/combine_pdf_demo/blob/c9914588e4116dcfdaa37f85727f442b064e2b04/pdf_controller.rb). Good luck :)\n\nDecryption \u0026 Filters\n====================\n\nSome PDF files are encrypted and some are compressed (the use of filters)...\n\nThere is very little support for encrypted files and very very basic and limited support for compressed files.\n\nI need help with that.\n\nComments and file structure\n===========================\n\nIf you want to help with the code, please be aware:\n\nI'm a self-taught hobbyist at heart. 
The documentation is lacking and the comments in the code are poor guidelines.\n\nThe code itself should be very straightforward, but feel free to ask whatever you want.\n\nCredit\n======\n\nStefan Leitner (@sLe1tner) wrote the outline merging code supporting PDFs which contain a ToC.\n\nCaige Nichols wrote an amazing RC4 gem which I used in my code.\n\nI wanted to install the gem, but I had issues with the internet and ended up copying the code itself into the combine_pdf_decrypt class file.\n\nCredit for his wonderful work is given here. Please respect his license and copyright... and mine.\n\nLicense\n=======\nMIT\n\nContributions\n=======\n\nYou can look at the [GitHub Issues Page](https://github.com/boazsegev/combine_pdf/issues) and see the [\"help wanted\"](https://github.com/boazsegev/combine_pdf/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) tags.\n\nIf you're thinking of donations or sending me money - no need. This project can sustain itself without your money.\n\nWhat this project needs is the time given by caring developers who keep it up to date and fix any documentation errors or issues they notice ... having said that, gifts (such as free coffee or iTunes gift cards) are always fun. But I think there are those in real need that will benefit more from your generosity.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fboazsegev%2Fcombine_pdf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fboazsegev%2Fcombine_pdf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fboazsegev%2Fcombine_pdf/lists"}