{"id":27146617,"url":"https://github.com/jkseppan/shyster","last_synced_at":"2025-04-08T10:56:44.626Z","repository":{"id":60064920,"uuid":"540847814","full_name":"jkseppan/shyster","owner":"jkseppan","description":"Add soft hyphens to HTML documents","archived":false,"fork":false,"pushed_at":"2022-09-30T17:10:15.000Z","size":370,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-02T00:30:11.309Z","etag":null,"topics":["hyphenation"],"latest_commit_sha":null,"homepage":"https://jkseppan.github.io/shyster/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jkseppan.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-09-24T14:00:39.000Z","updated_at":"2022-09-25T14:57:42.000Z","dependencies_parsed_at":"2022-09-25T19:20:19.578Z","dependency_job_id":null,"html_url":"https://github.com/jkseppan/shyster","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkseppan%2Fshyster","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkseppan%2Fshyster/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkseppan%2Fshyster/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkseppan%2Fshyster/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jkseppan","download_url":"https://codeload.github.com/jkseppan/shyster/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247829472,"owners_count":21002994,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hyphenation"],"created_at":"2025-04-08T10:56:44.036Z","updated_at":"2025-04-08T10:56:44.617Z","avatar_url":"https://github.com/jkseppan.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"shyster\n================\n\n\u003c!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! --\u003e\n\nThe problem this package is trying to solve is that while I can set\n`hyphens: auto;` in CSS, many browsers do a poor job of hyphenating\nFinnish. Even if they have Finnish hyphenation patterns, they often fail\nto recognise compound words, which should be hyphenated at compound\nboundaries (saippua-kauppias, not saip-pua-kaup-pias). One solution is\nto set `hyphens: manual;` and add soft hyphens at acceptable hyphenation\nspots.\n\n## Install\n\n``` sh\npip install shyster\n```\n\n## How to use\n\nOne top-level function does it all:\n\n``` python\nimport shyster\nshyster.hyphenate_html_file('input.html', 'output.html', 'patterns/hyphen.tex')\n```\n\nIf more control is needed:\n\n``` python\nhyph_fi = hyphenator('patterns/hyph-fi.tex', righthyphenmin=2)\n\n[hyph_fi(word) for word in \n 'Jukolan talo, eteläisessä Hämeessä, seisoo erään mäen pohjaisella rinteellä, liki Toukolan kylää'\\\n .replace(',','').split()]\n```\n\n    ['Ju-ko-lan',\n     'ta-lo',\n     'ete-läi-ses-sä',\n     'Hä-mees-sä',\n     'sei-soo',\n     'erään',\n     'mäen',\n     'poh-jai-sel-la',\n     'rin-teel-lä',\n     'li-ki',\n     'Tou-ko-lan',\n     'ky-lää']\n\n``` python\nhtml = \"\"\"\n\u003c!doctype html\u003e\u003ctitle\u003eSeitsemän veljestä\u003c/title\u003e\n\u003cscript\u003evar veljekset = 7;\u003c/script\u003e\n\u003cbody\u003e\n\u003cp style=\"margin-top: 2em\"\u003eJukolan talo, eteläisessä Hämeessä, seisoo erään mäen pohjaisella\nrinteellä, liki Toukolan kylää. Sen läheisin ympäristö on kivinen\ntanner, mutta alempana alkaa pellot, joissa, ennenkuin talo oli häviöön\nmennyt, aaltoili teräinen vilja.\u003c/p\u003e\n\u003c/body\u003e\n\"\"\"\nsoup = BeautifulSoup(html, 'lxml')\nhyphenate_soup(soup, hyph_fi)\nprint(str(soup))\n```\n\n    \u003c!DOCTYPE html\u003e\n    \u003chtml\u003e\u003chead\u003e\u003ctitle\u003eSeit-se-män vel-jes-tä\u003c/title\u003e\n    \u003cscript\u003evar veljekset = 7;\u003c/script\u003e\n    \u003c/head\u003e\u003cbody\u003e\n    \u003cp style=\"margin-top: 2em\"\u003eJu-ko-lan ta-lo, ete-läi-ses-sä Hä-mees-sä, sei-soo erään mäen poh-jai-sel-la\n    rin-teel-lä, li-ki Tou-ko-lan ky-lää. Sen lä-hei-sin ym-pä-ris-tö on ki-vi-nen\n    tan-ner, mut-ta alem-pa-na al-kaa pel-lot, jois-sa, en-nen-kuin ta-lo oli hä-vi-öön\n    men-nyt, aal-toi-li te-räi-nen vil-ja.\u003c/p\u003e\n    \u003c/body\u003e\n    \u003c/html\u003e\n\n``` python\npat, ex = read_patterns(open('patterns/hyphen.tex').readlines())\ntrie = convert_patterns(pat)\nex = convert_exceptions(ex)\ndel ex['present'] # remove an exception\nex['shyster'] = ('shy', 'ster')  # add or alter an exception\nex['lawyer'] = ('l', 'a', 'w', 'y', 'e', 'r')  # exceptions even override {left,right}hyphenmin\n\nhyph_en = hyphenator(None, hyphen='•')\nhyph_en.trie = trie\nhyph_en.exceptions = ex\n\nimport textwrap\ntextwrap.wrap(' '.join(hyph_en(match.group(0)) \n                       for match in re.finditer(r'[\\w]+', '''\nshyster: noun; 1. someone, possibly a lawyer, who behaves in an unscrupulous way;\n2. the present Python library\n''')))\n```\n\n    ['shy•ster noun 1 some•one pos•si•bly a l•a•w•y•e•r who be•haves in an',\n     'un•scrupu•lous way 2 the pre•sent Python li•brary']\n\n## Copying\n\nThis program is free software: you can redistribute it and/or modify it\nunder the terms of the GNU General Public License as published by the\nFree Software Foundation, either version 3 of the License, or (at your\noption) any later version.\n\nThis program is distributed in the hope that it will be useful, but\nWITHOUT ANY WARRANTY; without even the implied warranty of\nMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General\nPublic License for more details.\n\nYou should have received a copy of the GNU General Public License along\nwith this program. If not, see \u003chttps://www.gnu.org/licenses/\u003e.\n\nThe above does not apply to the files in `patterns/`, which are\ndistributed with this program as example input files. The Finnish\npatterns are covered by the terms “Patterns may be freely distributed”\nand the English ones by “Unlimited copying and redistribution of this\nfile are permitted as long as this file is not modified.”\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjkseppan%2Fshyster","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjkseppan%2Fshyster","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjkseppan%2Fshyster/lists"}