{"id":16776866,"url":"https://github.com/renanbr/bibtex-parser","last_synced_at":"2025-04-04T11:11:38.382Z","repository":{"id":48384792,"uuid":"53791744","full_name":"renanbr/bibtex-parser","owner":"renanbr","description":"BibTex Parser provides an API to read .bib files programmatically.","archived":false,"fork":false,"pushed_at":"2024-12-22T23:16:37.000Z","size":309,"stargazers_count":40,"open_issues_count":1,"forks_count":17,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-03-28T10:06:36.128Z","etag":null,"topics":["bib","bibtex","parser","php"],"latest_commit_sha":null,"homepage":"","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/renanbr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-03-13T15:19:26.000Z","updated_at":"2024-12-22T23:16:40.000Z","dependencies_parsed_at":"2024-06-18T15:29:44.729Z","dependency_job_id":"636b2fe4-3b68-4035-a7b3-f5547c9c9373","html_url":"https://github.com/renanbr/bibtex-parser","commit_stats":{"total_commits":227,"total_committers":16,"mean_commits":14.1875,"dds":0.2599118942731278,"last_synced_commit":"d02d2426822235f5179ecdf635ba710c9d6d2ddd"},"previous_names":[],"tags_count":19,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/renanbr%2Fbibtex-parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/renanbr%2Fbibtex-parser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/renanbr%2Fbibtex-parser/releases","manifes
ts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/renanbr%2Fbibtex-parser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/renanbr","download_url":"https://codeload.github.com/renanbr/bibtex-parser/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247166168,"owners_count":20894654,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bib","bibtex","parser","php"],"created_at":"2024-10-13T07:11:10.264Z","updated_at":"2025-04-04T11:11:38.359Z","avatar_url":"https://github.com/renanbr.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003ePHP BibTeX Parser 2.x\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\n    This is a\n    \u003ca href=\"https://tug.org/bibtex/\"\u003eBibTeX\u003c/a\u003e\n    parser written in\n    \u003ca href=\"https://php.net\"\u003ePHP\u003c/a\u003e.\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://tug.org/bibtex/\"\u003e\n        \u003cimg src=\"https://upload.wikimedia.org/wikipedia/commons/3/30/BibTeX_logo.svg\" height=\"83\" alt=\"BibTeX logo\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://php.net\"\u003e\n        \u003cimg src=\"https://upload.wikimedia.org/wikipedia/commons/2/27/PHP-logo.svg\" height=\"83\" alt=\"PHP logo\"\u003e\n    
\u003c/a\u003e\n\u003c/p\u003e\n\n![Tests](https://github.com/renanbr/bibtex-parser/workflows/Tests/badge.svg)\n[![codecov](https://codecov.io/gh/renanbr/bibtex-parser/branch/master/graph/badge.svg)](https://codecov.io/gh/renanbr/bibtex-parser)\n![Static Analysis](https://github.com/renanbr/bibtex-parser/workflows/Static%20Analysis/badge.svg)\n![Coding Standards](https://github.com/renanbr/bibtex-parser/workflows/Coding%20Standards/badge.svg)\n\nYou are browsing the documentation of **BibTeX Parser 2.x**, the latest version.\n\n## Table of contents\n\n* [Installing](#installing)\n* [Usage](#usage)\n* [Vocabulary](#vocabulary)\n* [Processors](#processors)\n   * [Tag name case](#tag-name-case)\n   * [Authors and editors](#authors-and-editors)\n   * [Keywords](#keywords)\n   * [Date](#date)\n   * [Fill missing tag](#fill-missing-tag)\n   * [Trim tags](#trim-tags)\n   * [Determine URL from the DOI](#determine-url-from-the-doi)\n   * [LaTeX to unicode](#latex-to-unicode)\n   * [Custom](#custom)\n* [Handling errors](#handling-errors)\n* [Advanced usage](#advanced-usage)\n\n## Installing\n\n```bash\ncomposer require renanbr/bibtex-parser\n```\n\n## Usage\n\n```php\nuse RenanBr\\BibTexParser\\Listener;\nuse RenanBr\\BibTexParser\\Parser;\nuse RenanBr\\BibTexParser\\Processor;\n\nrequire 'vendor/autoload.php';\n\n$bibtex = \u003c\u003c\u003cBIBTEX\n@article{einstein1916relativity,\n  title={Relativity: The Special and General Theory},\n  author={Einstein, Albert},\n  year={1916}\n}\nBIBTEX;\n\n// Create and configure a Listener\n$listener = new Listener();\n$listener-\u003eaddProcessor(new Processor\\TagNameCaseProcessor(CASE_LOWER));\n// $listener-\u003eaddProcessor(new Processor\\NamesProcessor());\n// $listener-\u003eaddProcessor(new Processor\\KeywordsProcessor());\n// $listener-\u003eaddProcessor(new Processor\\DateProcessor());\n// $listener-\u003eaddProcessor(new Processor\\FillMissingProcessor([/* ... 
*/]));\n// $listener-\u003eaddProcessor(new Processor\\TrimProcessor());\n// $listener-\u003eaddProcessor(new Processor\\UrlFromDoiProcessor());\n// $listener-\u003eaddProcessor(new Processor\\LatexToUnicodeProcessor());\n// ... you can append as many Processors as you want\n\n// Create a Parser and attach the listener\n$parser = new Parser();\n$parser-\u003eaddListener($listener);\n\n// Parse the content, then read processed data from the Listener\n$parser-\u003eparseString($bibtex); // or parseFile('/path/to/file.bib')\n$entries = $listener-\u003eexport();\n\nprint_r($entries);\n```\n\nThis will output:\n\n```\nArray\n(\n    [0] =\u003e Array\n        (\n            [_type] =\u003e article\n            [citation-key] =\u003e einstein1916relativity\n            [title] =\u003e Relativity: The Special and General Theory\n            [author] =\u003e Einstein, Albert\n            [year] =\u003e 1916\n        )\n)\n```\n\n## Vocabulary\n\n[BibTeX] is all about \"entry\", \"tag's name\" and \"tag's content\".\n\n\u003e A [BibTeX] **entry** consists of the type (the word after @), a citation-key and a number of tags which define various characteristics of the specific [BibTeX] entry.\n\u003e (...) 
A [BibTeX] **tag** is specified by its **name** followed by an equals sign, and the **content**.\n\nSource: http://www.bibtex.org/Format/\n\nNote:\nThis library considers \"type\" and \"citation-key\" as tags.\nThis behavior can be changed by [implementing your own Listener](#advanced-usage).\n\n## Processors\n\n`Processor` is a [callable] that receives an entry as an argument and returns a modified entry.\n\nThis library contains three main parts:\n\n- `Parser` class, responsible for detecting units inside a [BibTeX] input;\n- `Listener` class, responsible for gathering units and transforming them into a list of entries;\n- `Processor` classes, responsible for manipulating entries.\n\nAlthough you can't configure the `Parser`, you can append as many `Processor`s as you want to the `Listener` through `Listener::addProcessor()` before exporting the contents.\nBe aware that `Listener` provides, by default, these features:\n\n- Found entries are reachable through the `Listener::export()` method;\n- [Tag content concatenation](http://www.bibtex.org/Format/);\n    - e.g. `hello # \" world\"` tag's content will generate `hello world` [string]\n- [Tag content abbreviation handling](http://www.bibtex.org/Format/);\n    - e.g. 
`@string{foo=\"bar\"} @misc{bar=foo}` will make `$entries[1]['bar']` assume `bar` as value\n- Publication's type exposed as `_type` tag;\n- Citation key exposed as `citation-key` tag;\n- Original entry text exposed as `_original` tag.\n\nThis project ships some useful processors.\n\n### Tag name case\n\nIn [BibTeX] the tag's names aren't case-sensitive.\nThis library exposes entries as [array], in which keys are case-sensitive.\nTo avoid this misunderstanding, you can force the tags' name character case using `TagNameCaseProcessor`.\n\n\u003cdetails\u003e\u003csummary\u003eUsage\u003c/summary\u003e\n\n```php\nuse RenanBr\\BibTexParser\\Processor\\TagNameCaseProcessor;\n\n$listener-\u003eaddProcessor(new TagNameCaseProcessor(CASE_UPPER)); // or CASE_LOWER\n```\n\n```bib\n@article{\n  title={BibTeX rocks}\n}\n```\n\n```\nArray\n(\n    [0] =\u003e Array\n        (\n            [TYPE] =\u003e article\n            [TITLE] =\u003e BibTeX rocks\n        )\n)\n```\n\n\u003c/details\u003e\n\n### Authors and editors\n\n[BibTeX] recognizes four parts of an author's name: First Von Last Jr.\nIf you would like to parse the `author` and `editor` tags included in your entries, you can use the `NamesProcessor` class.\n\n\u003cdetails\u003e\u003csummary\u003eUsage\u003c/summary\u003e\n\n```php\nuse RenanBr\\BibTexParser\\Processor\\NamesProcessor;\n\n$listener-\u003eaddProcessor(new NamesProcessor());\n```\n\n```bib\n@article{\n  title={Relativity: The Special and General Theory},\n  author={Einstein, Albert}\n}\n```\n\n```\nArray\n(\n    [0] =\u003e Array\n        (\n            [type] =\u003e article\n            [title] =\u003e Relativity: The Special and General Theory\n            [author] =\u003e Array\n                (\n                    [0] =\u003e Array\n                        (\n                            [first] =\u003e Albert\n                            [von] =\u003e\n                            [last] =\u003e Einstein\n                            [jr] =\u003e\n   
                     )\n                )\n        )\n)\n```\n\n\u003c/details\u003e\n\n### Keywords\n\nThe `keywords` tag contains a list of expressions represented as a [string]; you might want to read them as an [array] instead.\n\n\u003cdetails\u003e\u003csummary\u003eUsage\u003c/summary\u003e\n\n```php\nuse RenanBr\\BibTexParser\\Processor\\KeywordsProcessor;\n\n$listener-\u003eaddProcessor(new KeywordsProcessor());\n```\n\n```bib\n@misc{\n  title={The End of Theory: The Data Deluge Makes the Scientific Method Obsolete},\n  keywords={big data, data deluge, scientific method}\n}\n```\n\n```\nArray\n(\n    [0] =\u003e Array\n        (\n            [type] =\u003e misc\n            [title] =\u003e The End of Theory: The Data Deluge Makes the Scientific Method Obsolete\n            [keywords] =\u003e Array\n                (\n                    [0] =\u003e big data\n                    [1] =\u003e data deluge\n                    [2] =\u003e scientific method\n                )\n        )\n)\n```\n\n\u003c/details\u003e\n\n### Date\n\nIt adds a new `_date` tag as a [DateTimeImmutable] object.\nThis processor adds the new tag **if and only if** both the `month` and `year` tags are present.\n\n\u003cdetails\u003e\u003csummary\u003eUsage\u003c/summary\u003e\n\n```php\nuse RenanBr\\BibTexParser\\Processor\\DateProcessor;\n\n$listener-\u003eaddProcessor(new DateProcessor());\n```\n\n```bib\n@misc{\n  month=\"1~oct\",\n  year=2000\n}\n```\n\n```\nArray\n(\n    [0] =\u003e Array\n        (\n            [type] =\u003e misc\n            [month] =\u003e 1~oct\n            [year] =\u003e 2000\n            [_date] =\u003e DateTimeImmutable Object\n                (\n                    [date] =\u003e 2000-10-01 00:00:00.000000\n                    [timezone_type] =\u003e 3\n                    [timezone] =\u003e UTC\n                )\n        )\n)\n```\n\n\u003c/details\u003e\n\n### Fill missing tag\n\nIt sets a default value for any missing 
field.\n\n\u003cdetails\u003e\u003csummary\u003eUsage\u003c/summary\u003e\n\n```php\nuse RenanBr\\BibTexParser\\Processor\\FillMissingProcessor;\n\n$listener-\u003eaddProcessor(new FillMissingProcessor([\n    'title' =\u003e 'This entry has no title',\n    'year' =\u003e 1970,\n]));\n```\n\n```bib\n@misc{\n}\n\n@misc{\n    title=\"I do exist\"\n}\n```\n\n```\nArray\n(\n    [0] =\u003e Array\n        (\n            [type] =\u003e misc\n            [title] =\u003e This entry has no title\n            [year] =\u003e 1970\n        )\n    [1] =\u003e Array\n        (\n            [type] =\u003e misc\n            [title] =\u003e I do exist\n            [year] =\u003e 1970\n        )\n)\n```\n\n\u003c/details\u003e\n\n### Trim tags\n\nApply [trim()] to all tags.\n\n\u003cdetails\u003e\u003csummary\u003eUsage\u003c/summary\u003e\n\n```php\nuse RenanBr\\BibTexParser\\Processor\\TrimProcessor;\n\n$listener-\u003eaddProcessor(new TrimProcessor());\n```\n\n```bib\n@misc{\n  title=\" too much space  \"\n}\n```\n\n```\nArray\n(\n    [0] =\u003e Array\n        (\n            [type] =\u003e misc\n            [title] =\u003e too much space\n        )\n\n)\n```\n\n\u003c/details\u003e\n\n### Determine URL from the DOI\n\nSets `url` tag with [DOI] if `doi` tag is present and `url` tag is missing.\n\n\u003cdetails\u003e\u003csummary\u003eUsage\u003c/summary\u003e\n\n```php\nuse RenanBr\\BibTexParser\\Processor\\UrlFromDoiProcessor;\n\n$listener-\u003eaddProcessor(new UrlFromDoiProcessor());\n```\n\n```bib\n@misc{\n  doi=\"qwerty\"\n}\n\n@misc{\n  doi=\"azerty\",\n  url=\"http://example.org\"\n}\n```\n\n```\nArray\n(\n    [0] =\u003e Array\n        (\n            [type] =\u003e misc\n            [doi] =\u003e qwerty\n            [url] =\u003e https://doi.org/qwerty\n        )\n\n    [1] =\u003e Array\n        (\n            [type] =\u003e misc\n            [doi] =\u003e azerty\n            [url] =\u003e http://example.org\n        )\n)\n```\n\n\u003c/details\u003e\n\n### LaTeX to 
unicode\n\n[BibTeX] files store [LaTeX] contents.\nYou might want to read them as Unicode instead.\nThe `LatexToUnicodeProcessor` class solves this problem, but before adding the processor to the listener you must:\n\n- [install Pandoc](http://pandoc.org/installing.html) on your system; and\n- add [ryakad/pandoc-php](https://github.com/ryakad/pandoc-php) or [ueberdosis/pandoc](https://github.com/ueberdosis/pandoc) as a dependency of your project.\n\n\u003cdetails\u003e\u003csummary\u003eUsage\u003c/summary\u003e\n\n```php\nuse RenanBr\\BibTexParser\\Processor\\LatexToUnicodeProcessor;\n\n$listener-\u003eaddProcessor(new LatexToUnicodeProcessor());\n```\n\n```bib\n@article{\n  title={Caf\\\\'{e}s and bars}\n}\n```\n\n```\nArray\n(\n    [0] =\u003e Array\n        (\n            [type] =\u003e article\n            [title] =\u003e Cafés and bars\n        )\n)\n```\n\n\u003c/details\u003e\n\nNote: Order matters; add this processor last.\n\n### Custom\n\nThe `Listener::addProcessor()` method expects a [callable] as argument.\nIn the example shown below, we append the text `with laser` to the `title` tags for all entries.\n\n\u003cdetails\u003e\u003csummary\u003eUsage\u003c/summary\u003e\n\n```php\n$listener-\u003eaddProcessor(static function (array $entry) {\n    $entry['title'] .= ' with laser';\n    return $entry;\n});\n```\n\n```\n@article{\n  title={BibTeX rocks}\n}\n```\n\n```\nArray\n(\n    [0] =\u003e Array\n        (\n            [type] =\u003e article\n            [title] =\u003e BibTeX rocks with laser\n        )\n)\n```\n\n\u003c/details\u003e\n\n## Handling errors\n\nThis library throws two types of exception: `ParserException` and `ProcessorException`.\nThe first one may happen during the data extraction.\nWhen it occurs, it probably means the parsed BibTeX isn't valid.\nThe second exception may happen during the data processing.\nWhen it occurs, it means the listener's processors can't properly handle the data found.\nBoth implement 
`ExceptionInterface`.\n\n```php\nuse RenanBr\\BibTexParser\\Exception\\ExceptionInterface;\nuse RenanBr\\BibTexParser\\Exception\\ParserException;\nuse RenanBr\\BibTexParser\\Exception\\ProcessorException;\n\ntry {\n    // ... parser and listener configuration\n\n    $parser-\u003eparseFile('/path/to/file.bib');\n    $entries = $listener-\u003eexport();\n} catch (ParserException $exception) {\n    // The BibTeX isn't valid\n} catch (ProcessorException $exception) {\n    // Listener's processors aren't able to handle data found\n} catch (ExceptionInterface $exception) {\n    // Alternatively, you can use this exception to catch all of them at once\n}\n```\n\n## Advanced usage\n\nThe core of this library contains these main classes:\n\n- `RenanBr\\BibTexParser\\Parser` responsible for detecting units inside a [BibTeX] input;\n- `RenanBr\\BibTexParser\\ListenerInterface` responsible for treating units found.\n\nYou can attach listeners to the parser through `Parser::addListener()`.\nThe parser is able to detect [BibTeX] units, such as \"type\", \"tag's name\", \"tag's content\".\nAs the parser finds a unit, it triggers the listeners attached to it.\n\nYou can code your own listener! 
All you have to do is handle units.\n\n```php\nnamespace RenanBr\\BibTexParser;\n\ninterface ListenerInterface\n{\n    /**\n     * Called when a unit is found.\n     *\n     * @param string $text    The original content of the unit found.\n     *                        Escape characters will not be sent.\n     * @param string $type    The type of unit found.\n     *                        It can assume one of the Parser's constant values.\n     * @param array  $context Contains details of the unit found.\n     */\n    public function bibTexUnitFound($text, $type, array $context);\n}\n```\n\n`$type` may assume one of these values:\n\n- `Parser::TYPE`\n- `Parser::CITATION_KEY`\n- `Parser::TAG_NAME`\n- `Parser::RAW_TAG_CONTENT`\n- `Parser::BRACED_TAG_CONTENT`\n- `Parser::QUOTED_TAG_CONTENT`\n- `Parser::ENTRY`\n\n`$context` is an [array] with these keys:\n\n- `offset` contains the `$text`'s beginning position.\n  It may be useful, for example, to [seek on a file pointer](https://php.net/fseek);\n- `length` contains the original `$text`'s length.\n  It may differ from the length of the [string] sent to the listener because there may be escaped characters.\n\n[BibTeX]: https://tug.org/bibtex/\n[DOI]: https://www.doi.org/\n[DateTimeImmutable]: https://www.php.net/manual/class.datetimeimmutable.php\n[LaTeX]: https://www.latex-project.org/\n[array]: https://php.net/manual/language.types.array.php\n[callable]: https://php.net/manual/en/language.types.callable.php\n[string]: https://php.net/manual/language.types.string.php\n[trim()]: https://www.php.net/trim\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frenanbr%2Fbibtex-parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frenanbr%2Fbibtex-parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frenanbr%2Fbibtex-parser/lists"}