{"id":15069337,"url":"https://github.com/gpakosz/unicodebominputstream","last_synced_at":"2025-04-10T17:40:57.681Z","repository":{"id":24880954,"uuid":"28296944","full_name":"gpakosz/UnicodeBOMInputStream","owner":"gpakosz","description":"Doing things right, in the name of Sun / Oracle","archived":false,"fork":false,"pushed_at":"2023-05-24T14:15:09.000Z","size":12,"stargazers_count":38,"open_issues_count":0,"forks_count":12,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-03-24T15:21:39.604Z","etag":null,"topics":["bom","inputstream","java","jdk","unicode","utf-8"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gpakosz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/funding.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null},"funding":{"github":"gpakosz"}},"created_at":"2014-12-21T11:09:05.000Z","updated_at":"2024-11-02T13:10:43.000Z","dependencies_parsed_at":"2022-08-23T03:50:10.909Z","dependency_job_id":"957aae9e-ac77-43d1-aa87-172f0c899b0f","html_url":"https://github.com/gpakosz/UnicodeBOMInputStream","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gpakosz%2FUnicodeBOMInputStream","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gpakosz%2FUnicodeBOMInputStream/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gpakosz%2FUnicodeBOMInputStream/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gpakosz%2FUnicodeBOMInputStream/manifests","owner_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/owners/gpakosz","download_url":"https://codeload.github.com/gpakosz/UnicodeBOMInputStream/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248262181,"owners_count":21074257,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bom","inputstream","java","jdk","unicode","utf-8"],"created_at":"2024-09-25T01:41:52.956Z","updated_at":"2025-04-10T17:40:57.655Z","avatar_url":"https://github.com/gpakosz.png","language":"Java","funding_links":["https://github.com/sponsors/gpakosz"],"categories":[],"sub_categories":[],"readme":"# UnicodeBOMInputStream\n\nA helper class to skip [Unicode BOMs] at the beginning of input streams.\n\nI initially released this class as a [Stack Overflow answer] and it apparently\ngot copy-pasted into several Java projects already. However, code put as answers\non Stack Overflow is licensed under [CC-BY-SA 3.0] which may not suit everybody.\n\n[Unicode BOMs]: http://www.unicode.org/faq/utf_bom.html#bom1\n[Stack Overflow answer]: http://stackoverflow.com/a/1835529/216063\n\n--------------------------------------------------------------------------------\n\n## Why?\n\nMany years have passed since I wrote this class and today Java still doesn't\nproperly deal with UTF-8 Unicode Byte Order Marks (BOMs) at the beginning of\ndata.\n\nIn 2001, someone opened bug [JDK-4508058] with the sound expectation that Java\nshould detect and skip UTF-8 BOMs at the beginning of UTF-8 streams, the same\nway it does for e.g. 
UTF-16.\n\nBug [JDK-4508058] remained open for a while, got fixed, and ultimately got\nreverted because existing Java code relied on UTF-8 BOMs not being skipped, see\n[JDK-6378911]:\n\n\u003e the Java EE 5 RI and SJSAS 9.0 has been relying on detecting a BOM, setting\n\u003e the appropriate encoding, and discarding the BOM bytes before reading the\n\u003e input\n\n\u003e The problem is that we cannot implement different behaviour depending on the\n\u003e JRE version we're running against.\n\nIn the end, instead of fixing [JDK-4508058] and accepting that this would be an\nannoyance only for Java EE 5 RI and SJSAS 9.0 users, people in charge at Sun\ncouldn't be bothered supporting 2 JDK versions and decided we're all living in a\nbetter world if [JDK-4508058] gets closed as \"won't fix\".\n\nIn 2010, someone opened bug [JDK-6959785] to reconsider the decision...\n\nFor more than 20 years now, every now and then, someone in the world edits an\nXML file with `Notepad.exe` which adds useless UTF-8 BOMs, and breaks their\nfavorite Java XML parser.\n\nMeanwhile, just skip the BOM yourself.\n\n[JDK-4508058]: https://bugs.java.com/bugdatabase/view_bug?bug_id=4508058\n[JDK-6378911]: https://bugs.java.com/bugdatabase/view_bug?bug_id=6378911\n[JDK-6959785]: https://bugs.java.com/bugdatabase/view_bug?bug_id=6959785\n[CC-BY-SA 3.0]: http://creativecommons.org/licenses/by-sa/3.0/legalcode\n\n--------------------------------------------------------------------------------\n\n## Usage\n\nWrap any `InputStream` with `UnicodeBOMInputStream` and use the `getBOM()`\nand/or `skipBOM()` methods. 
See [`UnicodeBOMInputStreamUsage.java`].\n\n[`UnicodeBOMInputStreamUsage.java`]: example/net/pempek/unicode/UnicodeBOMInputStreamUsage.java\n\n--------------------------------------------------------------------------------\n\nIf you find this library useful and decide to use it in your own projects please\ndrop me a line [@gpakosz].\n\n[@gpakosz]: https://twitter.com/gpakosz\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgpakosz%2Funicodebominputstream","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgpakosz%2Funicodebominputstream","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgpakosz%2Funicodebominputstream/lists"}