{"id":17126097,"url":"https://github.com/nigels-com/tutf8e","last_synced_at":"2025-07-05T10:40:30.615Z","repository":{"id":139545816,"uuid":"214592399","full_name":"nigels-com/tutf8e","owner":"nigels-com","description":"Tiny UTF-8 Encoder for C","archived":false,"fork":false,"pushed_at":"2023-04-24T12:11:02.000Z","size":129,"stargazers_count":12,"open_issues_count":2,"forks_count":8,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-10-15T18:46:44.516Z","etag":null,"topics":["c","cplusplus","iso-8859-1","unicode","utf8","windows-1252"],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nigels-com.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-10-12T06:35:21.000Z","updated_at":"2024-09-18T01:12:03.000Z","dependencies_parsed_at":null,"dependency_job_id":"5e2a40cd-11be-4e80-9427-db678838358a","html_url":"https://github.com/nigels-com/tutf8e","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nigels-com%2Ftutf8e","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nigels-com%2Ftutf8e/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nigels-com%2Ftutf8e/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nigels-com%2Ftutf8e/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nigels-com","download_url":"https://codeload.github.com/nigels-com/tutf8e/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248673213,"owners_count":21143453,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","cplusplus","iso-8859-1","unicode","utf8","windows-1252"],"created_at":"2024-10-14T18:46:46.885Z","updated_at":"2025-04-13T06:27:28.593Z","avatar_url":"https://github.com/nigels-com.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# tutf8e\n\n  *Tute Feighty*\n\n  A tiny UTF-8 encoder for C.\n\n## Goals\n\n  * As small and fast as possible\n  * Narrowly scoped to one-step UTF-8 encoding in C\n  * Link only what you need and use\n  * MIT licence\n\n## Supported Encodings\n\n  * [iso-8859-1](https://en.wikipedia.org/wiki/ISO/IEC_8859-1) Latin-1 Western European\n  * [iso-8859-2](https://en.wikipedia.org/wiki/ISO/IEC_8859-2) Latin-2 East European\n  * [iso-8859-3](https://en.wikipedia.org/wiki/ISO/IEC_8859-3) Latin-3 South European\n  * [iso-8859-4](https://en.wikipedia.org/wiki/ISO/IEC_8859-4) Latin-4 North European\n  * [iso-8859-5](https://en.wikipedia.org/wiki/ISO/IEC_8859-5) Part 5: Latin/Cyrillic\n  * [iso-8859-6](https://en.wikipedia.org/wiki/ISO/IEC_8859-6) Part 6: Latin/Arabic\n  * [iso-8859-7](https://en.wikipedia.org/wiki/ISO/IEC_8859-7) Part 7: Latin/Greek\n  * [iso-8859-8](https://en.wikipedia.org/wiki/ISO/IEC_8859-8) Part 8: Latin/Hebrew\n  * [iso-8859-9](https://en.wikipedia.org/wiki/ISO/IEC_8859-9) Latin-5 Turkish\n  * [iso-8859-10](https://en.wikipedia.org/wiki/ISO/IEC_8859-10) Latin-6 Nordic\n  * [iso-8859-11](https://en.wikipedia.org/wiki/ISO/IEC_8859-11) Part 11: Latin/Thai\n  * [iso-8859-13](https://en.wikipedia.org/wiki/ISO/IEC_8859-13) Latin-7 Baltic Rim\n  * [iso-8859-14](https://en.wikipedia.org/wiki/ISO/IEC_8859-14) Latin-8 Celtic\n  * [iso-8859-15](https://en.wikipedia.org/wiki/ISO/IEC_8859-15) Latin-9 Western European \n  * [iso-8859-16](https://en.wikipedia.org/wiki/ISO/IEC_8859-16) Latin-10 South-Eastern European\n  * [windows-1250](https://en.wikipedia.org/wiki/Windows-1250) Central European and Eastern European\n  * [windows-1251](https://en.wikipedia.org/wiki/Windows-1251) Cyrillic\n  * [windows-1252](https://en.wikipedia.org/wiki/Windows-1252) English\n  * [windows-1253](https://en.wikipedia.org/wiki/Windows-1253) Greek\n  * [windows-1254](https://en.wikipedia.org/wiki/Windows-1254) Turkish\n  * [windows-1255](https://en.wikipedia.org/wiki/Windows-1255) Hebrew\n  * [windows-1256](https://en.wikipedia.org/wiki/Windows-1256) Arabic\n  * [windows-1257](https://en.wikipedia.org/wiki/Windows-1257) Baltic\n  * [windows-1258](https://en.wikipedia.org/wiki/Windows-1258) Vietnamese\n\n## Test Procedure\n\n```\n$ ./codegen.py\n\n$ gcc src/* test/test.c -Iinclude\n\n$ ./a.out\nA quick brown fox jumps over the lazy dog\nNechť již hříšné saxofony ďáblů rozezvučí síň úděsnými tóny waltzu, tanga a quickstepu.\nPijamalı hasta yağız şoföre çabucak güvendi.\nPõdur Zagrebi tšellomängija-följetonist Ciqo külmetas kehvas garaažis\nВ чащах юга жил бы цитрус? Да, но фальшивый экземпляр!\nδιαφυλάξτε γενικά τη ζωή σας από βαθειά ψυχικά τραύματα\nעטלף אבק נס דרך מזגן שהתפוצץ כי חם\nPijamalı hasta yağız şoföre çabucak güvendi.\nFlygande bäckasiner söka hwila på mjuka tuvor.\nเป็นมนุษย์สุดประเสริฐเลิศคุณค่า กว่าบรรดาฝูงสัตว์เดรัจฉาน จงฝ่าฟันพัฒนาวิชาการ อย่าล้างผลาญฤๅเข่นฆ่าบีฑาใคร ไม่ถือโทษโกรธแช่งซัดฮึดฮัดด่า หัดอภัยเหมือนกีฬาอัชฌาสัย ปฏิบัติประพฤติกฎกำหนดใจ พูดจาให้จ๊ะๆ จ๋าๆ น่าฟังเอยฯ\nJeżu klątw, spłódź Finom część gry hańb!\n11 passed, 0 failed tests\n```\n\n## How small is it?\n\n512 bytes + overhead per encoding.\n\n```\n$ for i in src/*; do gcc -c $i -O1; done\n$ du -bhc *.o | grep total\n32K total\n\n$ for i in src/*; do gcc -c $i -O3; done\n$ du -bhc *.o | grep total\n32K total\n\n$ for i in src/*; do gcc -c $i -Os; done\n$ du -bhc *.o | grep total\n28K total\n```\n\n## Related\n\n  * [iconv](https://www.gnu.org/software/libiconv/)\n  * [icu](http://site.icu-project.org/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnigels-com%2Ftutf8e","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnigels-com%2Ftutf8e","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnigels-com%2Ftutf8e/lists"}