{"id":16717209,"url":"https://github.com/mseri/doi2bib","last_synced_at":"2026-02-28T16:04:46.044Z","repository":{"id":44627360,"uuid":"302752014","full_name":"mseri/doi2bib","owner":"mseri","description":"Smalls CLIs to get a bibtex entries from a DOI, an arXiv ID or a PubMed ID and to pretty print bibtex entries (or files)","archived":false,"fork":false,"pushed_at":"2026-01-06T11:00:54.000Z","size":327,"stargazers_count":64,"open_issues_count":4,"forks_count":7,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-01-14T18:28:51.802Z","etag":null,"topics":["arxiv","bibtex","doi","hacktoberfest","ocaml"],"latest_commit_sha":null,"homepage":"","language":"OCaml","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mseri.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-10-09T20:57:37.000Z","updated_at":"2026-01-12T22:18:18.000Z","dependencies_parsed_at":"2024-10-28T11:33:49.518Z","dependency_job_id":"73fe37c8-86eb-42ab-94db-fb70c32c129a","html_url":"https://github.com/mseri/doi2bib","commit_stats":null,"previous_names":[],"tags_count":38,"template":false,"template_full_name":null,"purl":"pkg:github/mseri/doi2bib","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mseri%2Fdoi2bib","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mseri%2Fdoi2bib/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mseri%2Fdoi2bib/releases","manifests_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mseri%2Fdoi2bib/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mseri","download_url":"https://codeload.github.com/mseri/doi2bib/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mseri%2Fdoi2bib/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29858488,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-26T08:51:08.701Z","status":"ssl_error","status_checked_at":"2026-02-26T08:50:19.607Z","response_time":89,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arxiv","bibtex","doi","hacktoberfest","ocaml"],"created_at":"2024-10-12T21:30:48.427Z","updated_at":"2026-02-26T12:09:40.756Z","avatar_url":"https://github.com/mseri.png","language":"OCaml","funding_links":[],"categories":[],"sub_categories":[],"readme":"# doi2bib ![Build status](https://github.com/mseri/doi2bib/workflows/Main%20workflow/badge.svg)\n\nSmall CLI tools to work with bibtex entries: get entries from DOI/arXiv/PubMed IDs and format bibtex files.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://raw.githubusercontent.com/mseri/doi2bib/refs/heads/main/logo.svg\"/\u003e\n\u003c/p\u003e\n\nJust so you know, there is now [Zotero BIB](https://zbib.org/) on the browser that can do this (and more). 
I will keep maintaining `doi2bib` though, since it is an integral part of my workflow.\n\n## Tools\n\nThis package provides three CLI tools:\n\n1. **doi2bib** - Get bibtex entries from DOI, arXiv ID, or PubMed ID (pretty printed with `bibfmt`)\n2. **bibfmt** - Pretty print and format bibtex files (using very few dependencies)\n3. **bibdedup** - Deduplicate BibTeX entries across multiple files\n\n## doi2bib Usage\n\n```\n$ doi2bib --help=plain\nNAME\n   doi2bib - A little CLI tool to get the bibtex entries for DOIs, arXiv\n   IDs, or PubMed IDs.\n\nSYNOPSIS\n   doi2bib [OPTION]... [FILES]...\n\nDESCRIPTION\n   doi2bib reads files containing identifiers (DOIs, arXiv IDs, or\n   PubMed IDs) with one identifier per line, and fetches the\n   corresponding BibTeX entries.\n\n   The tool automatically infers the type of identifier. You can force\n   the CLI to lookup a DOI by using the form 'doi:ID' or an arXiv ID by\n   using the form 'arXiv:ID'. PubMed IDs always start with 'PMC'.\n\n   Use '-' as a filename to read identifiers from stdin.\n\nARGUMENTS\n   FILES  Files containing DOIs, arXiv IDs, or PubMed IDs (one per\n          line). Use '-' to read from stdin. Multiple files can be\n          specified and will be processed sequentially.\n\nOPTIONS\n   -o OUTPUT, --output=OUTPUT (absent=stdout)\n       Append the bibtex output to the specified file. It will create the\n       file if it does not exist. If not specified, writes to stdout.\n\n   --help[=FMT] (default=auto)\n       Show this help in format FMT. The value FMT must be one of `auto',\n       `pager', `groff' or `plain'. 
With `auto', the format is `pager` or\n       `plain' whenever the TERM env var is `dumb' or undefined.\n\n   --version\n       Show version information.\n\nEXAMPLES\n   Process a file containing DOIs:\n     $ doi2bib dois.txt -o bibliography.bib\n\n   Process multiple files:\n     $ doi2bib dois.txt arxiv_ids.txt -o bibliography.bib\n\n   Read from stdin:\n     $ echo '10.1145/3357713.3384296' | doi2bib -\n\n   Combine stdin with files:\n     $ echo '10.1000/xyz123' | doi2bib - existing.txt -o output.bib\n\nEXIT STATUS\n   doi2bib exits with the following status:\n\n   0   on success.\n\n   124 on command line parsing errors.\n\n   125 on unexpected internal errors (bugs).\n\nBUGS\n   Report bugs to https://github.com/mseri/doi2bib/issues\n```\n\nThe tool retrieves the bibtex entry using published details when possible.\n\n## bibfmt Usage\n\n```\n$ bibfmt --help=plain\nNAME\n   bibfmt - A little CLI tool to pretty print bibtex files.\n\nSYNOPSIS\n   bibfmt [OPTION]... [FILES]...\n\nDESCRIPTION\n   bibfmt reads one or more BibTeX files, parses them, and outputs\n   formatted BibTeX entries.\n\n   Use '-' as a filename to read from stdin.\n\nARGUMENTS\n   FILES  BibTeX files to format. Use '-' to read from stdin. Multiple\n          files can be specified and will be combined.\n\nOPTIONS\n   --force\n       Force mode: ignore parsing errors and output only successfully\n       parsed entries.\n\n   -o OUTPUT, --output=OUTPUT (absent=stdout)\n       Saves the pretty printed bib to the specified file. If not\n       specified, writes to stdout.\n\n   -q, --quiet\n       Quiet mode: suppress all output except errors.\n\n   -s, --strict\n       Enable strict parsing mode that rejects BibTeX files with\n       duplicate fields.\n\n   --help[=FMT] (default=auto)\n       Show this help in format FMT. The value FMT must be one of `auto',\n       `pager', `groff' or `plain'. 
With `auto', the format is `pager` or\n       `plain' whenever the TERM env var is `dumb' or undefined.\n\n   --version\n       Show version information.\n\nEXAMPLES\n   Format a single file:\n     $ bibfmt bibliography.bib -o formatted.bib\n\n   Format multiple files:\n     $ bibfmt file1.bib file2.bib -o combined.bib\n\n   Read from stdin:\n     $ cat input.bib | bibfmt -\n\n   Combine stdin with files:\n     $ echo '@article{...}' | bibfmt - existing.bib -o output.bib\n\n   Format with strict mode:\n     $ bibfmt --strict bibliography.bib\n\nEXIT STATUS\n   bibfmt exits with the following status:\n\n   0   on success.\n\n   123 on indiscriminate errors reported on standard error.\n\n   124 on command line parsing errors.\n\n   125 on unexpected internal errors (bugs).\n\nBUGS\n   Report bugs to https://github.com/mseri/doi2bib/issues\n```\n\n## bibdedup Usage\n\n```\n$ bibdedup --help=plain\nNAME\n   bibdedup - Deduplicate BibTeX entries across multiple files.\n\nSYNOPSIS\n   bibdedup [OPTION]... FILES...\n\nDESCRIPTION\n   bibdedup reads one or more BibTeX files, combines all entries, and\n   removes duplicates based on specified key fields.\n\n   By default, entries are considered duplicates if they have the same\n   title, author, and year (after whitespace normalization and\n   case-insensitive comparison).\n\nARGUMENTS\n   FILES  BibTeX files to deduplicate. Use '-' to read from stdin.\n          Multiple files can be specified.\n\nOPTIONS\n   -i, --interactive\n       Enable interactive mode to resolve conflicts. If not set,\n       automatically keeps the first occurrence of conflicting fields.\n\n   -k KEYS, --keys=KEYS (absent=title,author,year)\n       Comma-separated list of field names to use for duplicate\n       detection. Special key 'citekey' matches on citation keys.\n       Default: title,author,year\n\n   -o OUTPUT, --output=OUTPUT (absent=stdout)\n       Output file for deduplicated BibTeX. 
If not specified, writes to\n       stdout.\n\n   -s, --strict\n       Enable strict mode that checks for and reports duplicate fields\n       in entries.\n\n   --help[=FMT] (default=auto)\n       Show this help in format FMT. The value FMT must be one of `auto',\n       `pager', `groff' or `plain'. With `auto', the format is `pager` or\n       `plain' whenever the TERM env var is `dumb' or undefined.\n\n   --version\n       Show version information.\n\nEXAMPLES\n   Deduplicate a single file:\n     $ bibdedup bibliography.bib -o clean.bib\n\n   Deduplicate multiple files using DOI:\n     $ bibdedup --keys doi file1.bib file2.bib -o merged.bib\n\n   Read from stdin:\n     $ cat input.bib | bibdedup -\n\n   Combine stdin with files:\n     $ cat extra.bib | bibdedup - existing.bib -o output.bib\n\n   Deduplicate using citekey with interactive conflict resolution:\n     $ bibdedup --keys citekey --interactive *.bib -o output.bib\n\n   Deduplicate with strict mode (reports duplicate fields):\n     $ bibdedup --strict --keys doi file1.bib file2.bib -o clean.bib\n\n   Deduplicate and output to stdout:\n     $ bibdedup --keys title,year file1.bib file2.bib\n\nEXIT STATUS\n   bibdedup exits with the following status:\n\n   0   on success.\n\n   124 on command line parsing errors.\n\n   125 on unexpected internal errors (bugs).\n\nBUGS\n   Report bugs to https://github.com/mseri/doi2bib/issues\n```\n\n## Examples\n\n### doi2bib Examples\n\nFetch bibtex entries for one or more identifiers and print them to standard output:\n\n```bash\n$ doi2bib 10.1007/s10569-019-9946-9\n$ doi2bib 1902.00436 arXiv:1609.01724 PMC2883744\n```\n\nSave bibtex entry to a file:\n\n```bash\n$ doi2bib doi:10.4171/JST/226 -o bibliography.bib\n```\nThis will create the file if it does not exist, or append the entry to the existing file.\n\nYou can batch-process lists of identifiers by listing them one per line in a file and passing the file as an argument. 
For instance,\n\n```bash\n$ cat dois.txt\n10.1007/s10569-019-9946-9\n1902.00436\narXiv:1609.01724\nPMC2883744\n\n$ doi2bib dois.txt\n```\n\n### bibfmt Examples\n\nFormat a bibtex file and print to stdout:\n\n```bash\n$ bibfmt bibliography.bib\n```\n\nFormat a bibtex file and save to a new file:\n\n```bash\n$ bibfmt messy.bib -o clean.bib\n```\n\nFormat bibtex content from stdin, using `-` as the filename:\n\n```bash\n$ echo \"@article{key, title={My Title}, author={John Doe}}\" | bibfmt -\n```\n\nFormat with strict mode to check for duplicate fields (these can be removed\nwith `bibdedup`):\n\n```bash\n$ bibfmt bibliography.bib --strict -q\n```\n\nYou can use quiet mode to suppress normal output and only see warnings/errors:\n\n```bash\n$ bibfmt messy.bib --quiet\n```\n\nForce formatting even with parsing errors, by removing all the problematic\nentries (_only do this after careful consideration_):\n\n```bash\n$ bibfmt problematic.bib --force -o partial.bib\n```\n\n### bibdedup Examples\n\nDeduplicate entries from multiple files:\n\n```bash\n$ bibdedup file1.bib file2.bib -o merged.bib\n```\n\nUse custom keys for duplicate detection:\n\n```bash\n$ bibdedup --keys doi papers1.bib papers2.bib -o output.bib\n$ bibdedup --keys title,year lib1.bib lib2.bib -o combined.bib\n```\n\nDeduplicate using citation keys:\n\n```bash\n$ bibdedup --keys citekey old.bib new.bib -o updated.bib\n```\n\nInteractive mode for conflict resolution:\n\n```bash\n$ bibdedup --interactive --keys title,author,year *.bib -o curated.bib\n```\n\nEnable strict mode to check for duplicate fields:\n\n```bash\n$ bibdedup --strict --keys doi papers.bib -o clean.bib\n```\n\n## Installation\n\nEach release comes with attached binaries for Windows, Mac, and Linux. 
You can simply unpack the binaries (`doi2bib` or `bibfmt`) and place them in a folder accessible by your terminal.\n\n### Building from Source\n\nTo build the package yourself, use [opam](https://opam.ocaml.org/):\n\n```bash\n$ opam install doi2bib    # or bibfmt if you only need the pretty printer\n```\n\nThis will install both `doi2bib` and `bibfmt` tools, since the latter is a dependency of `doi2bib`.\n\nTo run the tests, clone this repository and from the root of the project run:\n\n```bash\n$ opam install --deps-only .    # first time only\n$ dune runtest\n```\n\n## Troubleshooting\n\nIf on macOS you get a `Library not loaded: /usr/local/opt/gmp/lib/libgmp.10.dylib` failure, you will need to install `gmp`:\n\n- MacPorts users: `port install gmp`\n- Homebrew users: `brew install gmp`\n\n## Editor Integration\n\n### Zed Configuration\n\nUse the following to configure `bibfmt` as your bibtex formatter in [Zed](https://zed.dev):\n\n```json\n\"languages\": {\n  \"BibTeX\": {\n    \"formatter\": {\n      \"external\": {\n        \"command\": \"/path/to/bibfmt\",\n        \"arguments\": [\"-\"]\n      }\n    }\n  }\n}\n```\n\nReplace `/path/to/bibfmt` with the actual path to your `bibfmt` binary.\n\n### Other Editors\n\nSince `bibfmt` reads from stdin and writes to stdout by default, it can be easily integrated with other editors that support external formatters. The tool will preserve the content if parsing errors are encountered, making it safe to use in automated workflows.\n\n## API References\n\n- [DOI Content Negotiation](https://citation.crosscite.org/docs.html)\n- [arXiv API](https://arxiv.org/help/api/index)\n- [PubMed API](https://www.ncbi.nlm.nih.gov/home/develop/api/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmseri%2Fdoi2bib","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmseri%2Fdoi2bib","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmseri%2Fdoi2bib/lists"}