{"id":25203441,"url":"https://github.com/lahter/htr-quality-classifier","last_synced_at":"2025-05-12T20:45:51.258Z","repository":{"id":151507320,"uuid":"617998039","full_name":"LAHTeR/htr-quality-classifier","owner":"LAHTeR","description":"Detect quality of (digitized) text.","archived":false,"fork":false,"pushed_at":"2023-11-16T14:22:26.000Z","size":3403,"stargazers_count":3,"open_issues_count":5,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-14T12:23:03.263Z","etag":null,"topics":["classification","htr","machine-learning","neural-networks","scikit-learn"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LAHTeR.png","metadata":{"files":{"readme":"README.dev.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-03-23T14:44:52.000Z","updated_at":"2024-11-05T09:02:58.000Z","dependencies_parsed_at":"2023-11-10T12:43:19.928Z","dependency_job_id":"d9576087-c6a0-4b09-9935-27de3c5ae19b","html_url":"https://github.com/LAHTeR/htr-quality-classifier","commit_stats":null,"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LAHTeR%2Fhtr-quality-classifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LAHTeR%2Fhtr-quality-classifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LAHTeR%2Fhtr-quality-classifier/releases","manifests_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/LAHTeR%2Fhtr-quality-classifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LAHTeR","download_url":"https://codeload.github.com/LAHTeR/htr-quality-classifier/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253818714,"owners_count":21969208,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","htr","machine-learning","neural-networks","scikit-learn"],"created_at":"2025-02-10T07:17:28.294Z","updated_at":"2025-05-12T20:45:51.210Z","avatar_url":"https://github.com/LAHTeR.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# `text_quality` developer documentation\n\nIf you're looking for user documentation, go [here](README.md).\n\n## Development install\n\n```shell\n# Create a virtual environment, e.g. 
with\npython3 -m venv env\n\n# activate virtual environment\nsource env/bin/activate\n\n# make sure to have a recent version of pip and setuptools\npython3 -m pip install --upgrade pip setuptools\n\n# (from the project root directory)\n# install text_quality as an editable package\npython3 -m pip install --no-cache-dir --editable .\n# install development dependencies\npython3 -m pip install --no-cache-dir --editable .[dev]\n```\n\nAfterwards check that the install directory is present in the `PATH` environment variable.\n\n## Running the tests\n\nThere are two ways to run tests.\n\nThe first way requires an activated virtual environment with the development tools installed:\n\n```shell\npytest -v\n```\n\nThe second is to use `tox`, which can be installed separately (e.g. with `pip install tox`), i.e. not necessarily inside the virtual environment you use for installing `text_quality`, but then builds the necessary virtual environments itself by simply running:\n\n```shell\ntox\n```\n\nTesting with `tox` allows for keeping the testing environment separate from your development environment.\nThe development environment will typically accumulate (old) packages during development that interfere with testing; this problem is avoided by testing with `tox`.\n\n### Test coverage\n\nIn addition to just running the tests to see if they pass, they can be used for coverage statistics, i.e. 
to determine how much of the package's code is actually executed during tests.\nIn an activated virtual environment with the development tools installed, inside the package directory, run:\n\n```shell\ncoverage run\n```\n\nThis runs tests and stores the result in a `.coverage` file.\nTo see the results on the command line, run\n\n```shell\ncoverage report\n```\n\n`coverage` can also generate output in HTML and other formats; see `coverage help` for more information.\n\n## Running linters locally\n\nFor linting we will use [prospector](https://pypi.org/project/prospector/) and to sort imports we will use\n[isort](https://pycqa.github.io/isort/). Running the linters requires an activated virtual environment with the\ndevelopment tools installed.\n\n```shell\n# linter\nprospector\n\n# recursively check import style for the text_quality module only\nisort --check-only text_quality\n\n# recursively check import style for the text_quality module only and show\n# any proposed changes as a diff\nisort --check-only --diff text_quality\n\n# recursively fix import style for the text_quality module only\nisort text_quality\n```\n\nTo fix readability of your code style you can use [yapf](https://github.com/google/yapf).\n\nYou can enable automatic linting with `prospector` and `isort` on commit by enabling the git hook from `.githooks/pre-commit`, like so:\n\n```shell\ngit config --local core.hooksPath .githooks\n```\n\n## Generating the Architecture Diagram\n\nThe architecture diagram is stored in the [classes_text_quality.svg](classes_text_quality.svg) file, and displayed in the [README.md](README.md) file.\nTo update it, use [pyreverse](https://pylint.readthedocs.io/en/latest/pyreverse.html) from the [pylint](https://pypi.org/project/pylint/) package:\n\n```shell\npyreverse --output svg --project text_quality text_quality\n```\n\n## Generating the API docs\n\n```shell\ncd docs\nmake html\n```\n\nThe documentation will be in `docs/_build/html`\n\nIf you do not have `make` 
use\n\n```shell\nsphinx-build -b html docs docs/_build/html\n```\n\nTo find undocumented Python objects run\n\n```shell\ncd docs\nmake coverage\ncat _build/coverage/python.txt\n```\n\nTo [test snippets](https://www.sphinx-doc.org/en/master/usage/extensions/doctest.html) in documentation run\n\n```shell\ncd docs\nmake doctest\n```\n\n## Versioning\n\nBumping the version across all files is done with [bumpversion](https://github.com/c4urself/bump2version), e.g.\n\n```shell\nbumpversion major\nbumpversion minor\nbumpversion patch\n```\n\n## Making a release\n\nThis section describes how to make a release in 3 parts:\n\n1. preparation\n1. making a release on PyPI\n1. making a release on GitHub\n\n### (1/3) Preparation\n\n1. Update the \u003cCHANGELOG.md\u003e (don't forget to update links at bottom of page)\n2. Verify that the information in `CITATION.cff` is correct, and that `.zenodo.json` contains equivalent data\n3. Make sure the [version has been updated](#versioning).\n4. Run the unit tests with `pytest -v`\n\n### SKIP: (2/3) PyPI Release\n\nPublishing an updated package on PyPI manually is not necessary for this project.\nInstead, the [Build and Publish Workflow](.github/workflows/publish-to-test-pypi.yml) is triggered automatically when a new release is created on GitHub in the [next step](#33-github-release).\n\n### (3/3) GitHub Release\n\nMake a [release on GitHub](https://github.com/laHTeR/htr-quality-classifier/releases/new).\nCreate a new tag in the form `v\u003cX.X.X\u003e`, where `\u003cX.X.X\u003e` is the version number as specified in the [versioning section](#versioning).\n\nThis will also trigger Zenodo into making a snapshot of your repository and sticking a DOI on it (see [Zenodo project 
page](https://zenodo.org/doi/10.5281/zenodo.8189892)).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flahter%2Fhtr-quality-classifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flahter%2Fhtr-quality-classifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flahter%2Fhtr-quality-classifier/lists"}