{"id":15636543,"url":"https://github.com/vanderlee/phpsyllable","last_synced_at":"2025-10-08T00:37:39.633Z","repository":{"id":60774717,"uuid":"1618158","full_name":"vanderlee/phpSyllable","owner":"vanderlee","description":"PHP Syllable splitter/counter and Hyphenator for text and HTML. Multi-language, customisable, cached and fast!","archived":false,"fork":false,"pushed_at":"2025-04-24T18:54:19.000Z","size":2579,"stargazers_count":119,"open_issues_count":8,"forks_count":33,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-10-02T15:23:38.410Z","etag":null,"topics":["hyphen-marker","hyphenate","hyphenation","hyphenation-algorithm","hyphenation-rules","hyphenator","hyphens","language","php","split","syllable","syllable-count","syllable-counts","syllablecounter","syllables","tex"],"latest_commit_sha":null,"homepage":"http://vanderlee.github.io/phpSyllable/","language":"TeX","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vanderlee.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2011-04-15T08:53:46.000Z","updated_at":"2025-06-22T22:35:33.000Z","dependencies_parsed_at":"2024-06-18T13:50:56.666Z","dependency_job_id":"5ea6b331-6482-44b8-9967-fbfa2422a7b6","html_url":"https://github.com/vanderlee/phpSyllable","commit_stats":{"total_commits":137,"total_committers":13,"mean_commits":"10.538461538461538","dds":"0.36496350364963503","last_synced_commit":"69c4c1b81f3d9e53b2f565560e236cfcc9373545"},"previous_names":[],"tags_count":23,"template":false,"template_full_name":null,"purl":"pkg:github/vanderlee/phpSyllable"
,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vanderlee%2FphpSyllable","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vanderlee%2FphpSyllable/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vanderlee%2FphpSyllable/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vanderlee%2FphpSyllable/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vanderlee","download_url":"https://codeload.github.com/vanderlee/phpSyllable/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vanderlee%2FphpSyllable/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278872054,"owners_count":26060525,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-07T02:00:06.786Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hyphen-marker","hyphenate","hyphenation","hyphenation-algorithm","hyphenation-rules","hyphenator","hyphens","language","php","split","syllable","syllable-count","syllable-counts","syllablecounter","syllables","tex"],"created_at":"2024-10-03T11:04:49.584Z","updated_at":"2025-10-08T00:37:39.616Z","avatar_url":"https://github.com/vanderlee.png","language":"TeX","readme":"# Syllable\n\nVersion 
1.9\n\n[![Tests](https://github.com/vanderlee/phpSyllable/actions/workflows/tests.yml/badge.svg)](https://github.com/vanderlee/phpSyllable/actions/workflows/tests.yml)\n\nCopyright \u0026copy; 2011-2025 Martijn van der Lee.\nMIT Open Source license applies.\n\n\n## Introduction\n\nPHP Syllable splitting and hyphenation.\nor rather...\nPHP Syl-la-ble split-ting and hy-phen-ation.\n\nBased on the work by Frank M. Liang (http://www.tug.org/docs/liang/)\nand the many volunteers in the TeX community.\n\nMany languages are supported, including English (US/UK), Spanish, German, French, Dutch,\nItalian, Romanian and Russian; 76 languages in total.\n\nLanguage sources: http://tug.org/tex-hyphen/#languages\n\nSupports PHP 7.1 and up, so you can use it on older servers.\n\n\n## Installation\n\nInstall phpSyllable via Composer\n\n```\ncomposer require vanderlee/syllable\n```\n\nor simply add phpSyllable to your project and set up the project's \nautoloader for phpSyllable's src/ directory.\n\n\n## Usage\n\nInstantiate a Syllable object and start hyphenation.\n\nMinimal example:\n\n```php\n$syllable = new \\Vanderlee\\Syllable\\Syllable('en-us');\necho $syllable-\u003ehyphenateText('Provide a plethora of paragraphs');\n```\n\nExtended example:\n\n```php\nuse Vanderlee\\Syllable\\Syllable;\nuse Vanderlee\\Syllable\\Hyphen;\n\n// Globally set the directory where Syllable can store cache files.\n// By default, this is the cache/ folder in this package, but usually\n// you want to have the folder outside the package. Note that the cache\n// folder must be created beforehand.\nSyllable::setCacheDir(__DIR__ . '/cache');\n\n// Globally set the directory where the .tex files are stored.\n// By default, this is the languages/ folder of this package and\n// usually does not need to be adapted.\nSyllable::setLanguageDir(__DIR__ . '/languages');\n\n// Create a new instance for the language.\n$syllable = new Syllable('en-us');\n\n// Set the style of the hyphen. 
In this case it is the \"-\" character.\n// By default, it is the soft hyphen \"\u0026shy;\".\n$syllable-\u003esetHyphen(new Hyphen\\Dash());\n\n// Set the minimum word length required for hyphenation.\n// By default, all words are hyphenated.\n$syllable-\u003esetMinWordLength(5);\n\n// Output hyphenated text ..\necho $syllable-\u003ehyphenateText('Provide your own paragraphs...');\n// .. or hyphenated HTML.\necho $syllable-\u003ehyphenateHtmlText('\u003cb\u003e... with highlighted text.\u003c/b\u003e');\n```\n\nSee the [demo.php](demo.php) file for a working example.\n\n\n## `Syllable` API reference\n\nThe following describes the API of the main Syllable class. In most cases, \nyou will not use any other functions. Browse the code under src/ for all \navailable functions.\n\n#### public __construct($language = 'en-us', string|Hyphen $hyphen = null)\n\nCreate a new Syllable instance, with defaults.\n\n#### public static setCacheDir(string $dir)\n\nSet the directory where compiled language files may be stored.\nDefaults to the `cache` subdirectory of the current directory.\n\n#### public static setEncoding(string|null $encoding = null)\n\nSet the character encoding to use.\nSpecify `null` encoding to not apply any encoding at all.\n\n#### public static setLanguageDir(string $dir)\n\nSet the directory where language source files can be found.\nDefaults to the `languages` subdirectory of the current directory.\n\n#### public setLanguage(string $language)\n\nSet the language whose rules will be used for hyphenation.\n\n#### public addHyphenations(array $hyphenations)\n\nAdd any number of custom hyphenation patterns, using '-' to specify where hyphens may occur.\nOmit the '-' from the pattern to add words that will not be hyphenated.\n\n#### public setHyphen(mixed $hyphen)\n\nSet the hyphen text or object to use as a hyphen marker.\n\n#### public getHyphen(): Hyphen\n\nGet the current hyphen object.\n\n#### public setCache(Cache $cache = null)\n\n#### public getCache(): 
Cache\n\n#### public setSource($source)\n\n#### public getSource(): Source\n\n#### public setMinWordLength(int $length = 0)\n\nWords need to contain at least this many characters to be hyphenated.\n\n#### public getMinWordLength(): int\n\n#### public setLibxmlOptions(int $libxmlOptions)\n\nOptions to use for HTML parsing by libxml.\n**See:** https://www.php.net/manual/de/libxml.constants.php.\n\n#### public excludeAll()\n\nExclude all elements.\n\n#### public excludeElement(string|string[] $elements)\n\nAdd one or more elements to exclude from HTML.\n\n#### public excludeAttribute(string|string[] $attributes, $value = null)\n\nAdd one or more elements with attributes to exclude from HTML.\n\n#### public excludeXpath(string|string[] $queries)\n\nAdd one or more XPath queries to exclude from HTML.\n\n#### public includeElement(string|string[] $elements)\n\nAdd one or more elements to include from HTML.\n\n#### public includeAttribute(string|string[] $attributes, $value = null)\n\nAdd one or more elements with attributes to include from HTML.\n\n#### public includeXpath(string|string[] $queries)\n\nAdd one or more XPath queries to include from HTML.\n\n#### public splitWord(string $word): array\n\nSplit a single word on where the hyphenation would go.\nPunctuation is not supported, only simple words. 
For parsing whole sentences\nplease use Syllable::splitWords() or Syllable::splitText().\n\n#### public splitWords(string $text): array\n\nSplit a text into an array of punctuation marks and words,\nsplitting each word on where the hyphenation would go.\n\n#### public splitText(string $text): array\n\nSplit a text on where the hyphenation would go.\n\n#### public hyphenateWord(string $word): string\n\nHyphenate a single word.\n\n#### public hyphenateText(string $text): string\n\nHyphenate all words in the plain text.\n\n#### public hyphenateHtml(string $html): string\n\nHyphenate all readable text in the HTML, excluding HTML tags and\nattributes.\n**Deprecated:** Use the UTF-8 capable hyphenateHtmlText() instead. This method is kept only for backward compatibility and will be removed in the next major version 2.0.\n\n#### public hyphenateHtmlText(string $html): string\n\nHyphenate all readable text in the HTML, excluding HTML tags and\nattributes.\nThis method is UTF-8 capable and should be preferred over hyphenateHtml().\n\n#### public histogramText(string $text): array\n\nCount the number of syllables in the text and return a map with\nsyllable count as key and number of words for that syllable count as\nthe value.\n\n#### public countWordsText(string $text): int\n\nCount the number of words in the text.\n\n#### public countSyllablesText(string $text): int\n\nCount the number of syllables in the text.\n\n#### public countPolysyllablesText(string $text): int\n\nCount the number of polysyllables in the text.\n\n\n## Development\n\n### Update language files\n\nRun\n```\ncomposer dump-autoload --dev\n./build/update-language-files\n```\nto fetch the latest language files remotely and optionally use environment variables to customize the update process:\n\n#### CONFIGURATION_FILE\nSpecify the absolute path of the configuration file where the language files to be downloaded are defined. 
The \nconfiguration file has the following format: \n```\n{\n\t\"files\": [\n\t\t{\n\t\t\t\"_comment\": \"\u003ccomment\u003e\",\n\t\t\t\"fromUrl\": \"\u003cabsolute-remote-file-url\u003e\",\n\t\t\t\"toPath\": \"\u003crelative-local-file-path\u003e\",\n\t\t\t\"disabled\": \u003ctrue|false\u003e\n\t\t}\n\t]\n}\n```\nwhere the attributes are self-explanatory and `_comment` and `disabled` are optional. See for example \n[build/update-language-files.json](build/update-language-files.json). \nDefault: The `build/update-language-files.json` file of this package.\n\n#### MAX_REDIRECTS\n\nSpecify the maximum number of URL redirects allowed when retrieving a language file.\nDefault: `1`.\n\n#### WITH_COMMIT\n\nCreate (1) or skip (0) a Git commit from the updated language files.\nDefault: `0`.\n\n#### LOG_LEVEL\n\nSet the verbosity of the script to verbose (6), warnings and errors (4), errors only (3) or silent (0).\nDefault: `6`.\n\nFor example use\n```\ncomposer dump-autoload --dev\nLOG_LEVEL=0 ./build/update-language-files\n```\nto silently run the script without outputting any logging.\n\n### Update API documentation\n\nRun\n```\ncomposer dump-autoload --dev\n./build/generate-docs\n```\nto update the API documentation in this README.md. 
This should be done when the Syllable class has been modified.\nOptionally, you can use environment variables to modify the documentation update process:\n\n#### WITH_COMMIT\n\nCreate (1) or skip (0) a Git commit from the adapted files.\nDefault: `0`.\n\n#### LOG_LEVEL\n\nSet the verbosity of the script to verbose (6), warnings and errors (4), errors only (3) or silent (0).\nDefault: `6`.\n\n### Create release\n\nRun\n```\ncomposer dump-autoload --dev\n./build/create-release\n```\nto create a local release of the project by adding a changelog to this README.md.\nOptionally, you can use environment variables to modify the release process:\n\n#### RELEASE_TYPE\n\nSet the release type to major (0), minor (1) or patch (2) release.\nDefault: `2`.\n\n#### WITH_COMMIT\n\nCreate (1) or skip (0) a Git commit from the adapted files and apply the release tag.\nDefault: `0`.\n\n#### LOG_LEVEL\n\nSet the verbosity of the script to verbose (6), warnings and errors (4), errors only (3) or silent (0).\nDefault: `6`.\n\n### Tests\n\nRun\n```\ncomposer install\n./vendor/bin/phpunit\n```\nto execute the tests.\n\n\n## Changes\n\n1.7\n-   Use \\hyphenations case-insensitive (like \\patterns)\n-   Correct handling of UTF-8 character sets when hyphenating HTML \n    using the new Syllable::hyphenateHtmlText()\n-   Replace invalid \"en\" with \"en-us\" as default language of Syllable\n-   Update of hyph-de.tex\n\n1.6\n-   Revert renaming of API method names\n-   Use cache version as string instead of number\n-   Cover caching with tests\n-   Reduce the PHP test matrix to the latest versions of PHP 5, 7 and 8\n-   Check via GitHub Action if the API documentation is up-to-date\n-   Update API reference\n-   Fix API documentation of an array as parameter default value\n-   Satisfy StyleCI\n-   Commit changed files of entire working tree in build context\n-   Support for generation of API documentation in README.md\n-   Add words with reduced hyphenation to collection from PR #26\n-   
Satisfy StyleCI\n-   Add test for collection of words with reduced hyphenation\n-   Refactor splitWord(), splitWords() and splitText() of Syllable class\n-   Remove @covers annotation in tests\n-   Added splitWords and various code quality improvements\n-   Update the README.md copyright claim on release\n-   Skip GitHub Action scheduler in forks and run tests only in PR context\n-   Allow GitHub Action \"Update languages\" workflow to bypass reviews\n-   Use German orthography from 2006 as standard orthography\n\n1.5.5\n-   Automatic update of 74 languages\n\n1.5.4\n-   Automatically run tests for every push and pull request\n-   Automatic monthly update and release of language files\n-   Fix small typo in README and add 'use' in example.\n-   Use same code format as in src/Source/File.php\n-   Fix opening brace\n-   Remove whitespace\n-   Fix closing brace\n-   Use PHP syntax highlighting\n\n1.5.3\n-   Fixed PHP 7.4 compatibility (#37) by @Dargmuesli.\n\n1.5.2\n-   Fixed bug reverted in refactoring (continue 3) by @Dargmuesli.\n\n1.5.1\n-   Fixed bug reverted in refactoring (continue 2).\n\n1.5\n-   Refactored for modern PHP and support for current PHP version.\n\n1.4.6\n-\tAdded `setMinWordLength($length)` and `getMinWordLength()` to limit\n\thyphenation to words with at least the specified number of characters.\n\n1.4.5\n-\tFixes for Composer.\n\n1.4.4\n-\tComposer autoloader added\n\n1.4.3\n-\tImproved documentation\n\n1.4.2\n-\tUpdated Spanish language files.\n-\tInitial PHPDoc.\n\n1.4.1\n-\tMore fixes for apostrophes in splitting.\n\n1.4\n-\tFix for French language handling\n-\tRefactor .tex loading into source class.\n-\tMassive cache performance increase (excessive writes).\n\n1.3.1\n-\tFix slow initial cache writing; too many writes (only one was needed).\n-\tRemoved min_hyphenation; mb_strlen takes more time than hashmap lookup.\n\n1.3\n-\tAdded `array histogramText($text)`, `integer countWordsText($text)` and\n\t`integer countPolysyllableText($text)` 
methods.\n-\tRefactored cache interface.\n-\tImproved unit tests.\n\n1.2\n-\tDeprecated threshold feature. Was based on misinterpretation of the\n\talgorithm. Methods, constants and constructor signature unchanged, although\n\tyou can now omit the threshold if you want (or leave it in, it's detected as\n\ta \"fake\" threshold).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvanderlee%2Fphpsyllable","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvanderlee%2Fphpsyllable","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvanderlee%2Fphpsyllable/lists"}