{"id":25678078,"url":"https://github.com/jongwony/python_hwp","last_synced_at":"2025-08-21T09:04:44.524Z","repository":{"id":124295895,"uuid":"306606332","full_name":"jongwony/python_hwp","owner":"jongwony","description":"An HWP file parser customized for preprocessing, working in a way similar to python-docx.","archived":false,"fork":false,"pushed_at":"2020-10-24T07:53:51.000Z","size":755,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-24T15:45:30.258Z","etag":null,"topics":["hwp5","parser","python3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jongwony.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-10-23T10:43:37.000Z","updated_at":"2024-08-29T01:13:00.000Z","dependencies_parsed_at":null,"dependency_job_id":"aca60450-b2d1-4ffa-b95c-3c0089647985","html_url":"https://github.com/jongwony/python_hwp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jongwony/python_hwp","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jongwony%2Fpython_hwp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jongwony%2Fpython_hwp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jongwony%2Fpython_hwp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jongwony%2Fpython_hwp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jongwony","downl
oad_url":"https://codeload.github.com/jongwony/python_hwp/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jongwony%2Fpython_hwp/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271454842,"owners_count":24762698,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-21T02:00:08.990Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hwp5","parser","python3"],"created_at":"2025-02-24T15:35:31.664Z","updated_at":"2025-08-21T09:04:44.519Z","avatar_url":"https://github.com/jongwony.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# python-hwp\n\nAn HWP file parser customized for preprocessing, working in a way similar to python-docx.\nIt is currently written for machine learning data purposes.\n\nIt follows the [HWP file format](https://www.hancom.com/etc/hwpDownload.do)\nspecification and currently supports only the HWP 5.0 format.\n\n- 2020-10-24 Paragraph parsed.\n\n## Example\n\n```python\nfrom main import extract\nextract('example.hwp')\n```\n\n```\ncontrol=2 info=b'dces\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=\ncontrol=2 info=b'dloc\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=\ncontrol=13 info=b'' text=HWP test file\ncontrol=13 info=b'' text=Purpose: Provide example of this file type\ncontrol=13 info=b'' text=Document file type: HWP\ncontrol=13 info=b'' text=Version: 1.0\ncontrol=13 info=b'' text=Remark:\ncontrol=13 info=b'' text=Example 
content:\ncontrol=11 info=b' osg\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=\ncontrol=13 info=b'' text=The names \"John Doe\" for males, \"Jane Doe\" or \"Jane Roe\" for females, or \"Jonnie Doe\" and \"Janie Doe\" f\nor children, or just \"Doe\" non-gender-specifically are used as placeholder names for a party whose true identity is unknown or m\nust be withheld in a legal action, case, or discussion. The names are also used to refer to acorpse or hospital patient whose id\nentity is unknown. This practice is widely used in the United States and Canada, but is rarely used in other English-speaking co\nuntries including the United Kingdom itself, from where the use of \"John Doe\" in a legal context originates. The names Joe Blogg\ns or John Smith are used in the UK instead, as well as in Australia and New Zealand.\ncontrol=13 info=b'' text=John Doe is sometimes used to refer to a typical male in other contexts as well, in a similar manner to\n John Q. Public, known in Great Britain as Joe Public, John Smith or Joe Bloggs. For example, the first name listed on a form is\n often John Doe, along with a fictional address or other fictional information to provide an example of how to fill in the form.\n The name is also used frequently in popular culture, for example in the Frank Capra film Meet John Doe. John Doe was also the n\name of a 2002 American television series.\ncontrol=13 info=b'' text=Similarly, a child or baby whose identity is unknown may be referred to as Baby Doe. A notorious murder\n case in Kansas City, Missouri, referred to the baby victim as Precious Doe. Other unidentified female murder victims are Cali D\noe and Princess Doe. Additional persons may be called James Doe, Judy Doe, etc. However, to avoid possible confusion, if two ano\nnymous or unknown parties are cited in a specific case or action, the surnames Doe and Roe may be used simultaneously; for examp\nle, \"John Doe v. Jane Roe\". 
If several anonymous parties are referenced, they may simply be labelled John Doe #1, John Doe #2, e\ntc. (the U.S. Operation Delego cited 21 (numbered) \"John Doe\"s) or labelled with other variants of Doe / Roe / Poe / etc. Other\nearly alternatives such as John Stiles and Richard Miles are now rarely used, and Mary Major has been used in some American fede\nral cases.\ncontrol=11 info=b' osg\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=\ncontrol=13 info=b'' text=\ncontrol=3 info=b'klh%\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=File created by\ncontrol=4 info=b'klh\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=http://www.online-convert.com\ncontrol=13 info=b'' text=\ncontrol=3 info=b'klh%\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=More example files:\ncontrol=4 info=b'klh\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=http://www.online-convert.com/file-type\ncontrol=13 info=b'' text=\ncontrol=3 info=b'klh%\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=Text of “Example content”:\ncontrol=4 info=b'klh\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=Wikipedia\ncontrol=13 info=b'' text=\ncontrol=11 info=b' osg\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=\ncontrol=3 info=b'klh%\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=License:\ncontrol=4 info=b'klh\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' text=Attribution-ShareAlike 3.0 Unported\ncontrol=13 info=b'' text=\ncontrol=13 info=b'' text=Feel free to use and share the file according to the license above.\n```\n\n![control_char](images/control_char.png)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjongwony%2Fpython_hwp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjongwony%2Fpython_hwp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjongwony%2Fpython_hwp/lists"}