{"id":13605486,"url":"https://github.com/paucablop/chemotools","last_synced_at":"2026-03-11T00:05:50.020Z","repository":{"id":114318227,"uuid":"593739531","full_name":"paucablop/chemotools","owner":"paucablop","description":"Integrate your chemometric tools with the scikit-learn API 🧪 🤖 ","archived":false,"fork":false,"pushed_at":"2025-02-26T22:08:22.000Z","size":31704,"stargazers_count":51,"open_issues_count":20,"forks_count":6,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-03-30T06:04:20.106Z","etag":null,"topics":["artificial-intelligence","autoencoders","chemometrics","deep-learning","hacktoberfest","ir-spectroscopy","machine-learning","multivariate-analysis","nir-spectroscopy","python","raman-spectroscopy","scikit-learn","sklearn","spectra","spectroscopy"],"latest_commit_sha":null,"homepage":"https://paucablop.github.io/chemotools/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/paucablop.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-01-26T18:25:00.000Z","updated_at":"2025-03-11T16:00:17.000Z","dependencies_parsed_at":null,"dependency_job_id":"421fef3c-c65e-4b9e-a31b-e9ef57e8e5a4","html_url":"https://github.com/paucablop/chemotools","commit_stats":{"total_commits":437,"total_committers":3,"mean_commits":"145.66666666666666","dds":0.080091533180778,"last_synced_commit":"2fd7a4c0949afe450fb5d8aaf46746972fc836ec"},"previous_names":[],"tags_count":39,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/paucablop%2Fchemotools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paucablop%2Fchemotools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paucablop%2Fchemotools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paucablop%2Fchemotools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/paucablop","download_url":"https://codeload.github.com/paucablop/chemotools/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247445682,"owners_count":20939961,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","autoencoders","chemometrics","deep-learning","hacktoberfest","ir-spectroscopy","machine-learning","multivariate-analysis","nir-spectroscopy","python","raman-spectroscopy","scikit-learn","sklearn","spectra","spectroscopy"],"created_at":"2024-08-01T19:00:59.313Z","updated_at":"2026-03-11T00:05:50.011Z","avatar_url":"https://github.com/paucablop.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"![chemotools](assets/images/banner_dark.png)\n\n# chemotools\n\n\n[![PyPI](https://img.shields.io/pypi/v/chemotools)](https://pypi.org/project/chemotools)\n[![Python 
Versions](https://img.shields.io/pypi/pyversions/chemotools)](https://pypi.org/project/chemotools)\n[![License](https://img.shields.io/pypi/l/chemotools)](https://github.com/paucablop/chemotools/blob/main/LICENSE)\n[![Coverage](https://codecov.io/github/paucablop/chemotools/branch/main/graph/badge.svg?token=D7JUJM89LN)](https://codecov.io/github/paucablop/chemotools)\n[![Downloads](https://static.pepy.tech/badge/chemotools)](https://pepy.tech/project/chemotools)\n[![DOI](https://joss.theoj.org/papers/10.21105/joss.06802/status.svg)](https://doi.org/10.21105/joss.06802)\n[![CodeFactor](https://www.codefactor.io/repository/github/paucablop/chemotools/badge/main)](https://www.codefactor.io/repository/github/paucablop/chemotools/overview/main)\n\n---\n\n`chemotools` is a Python library that brings **chemometric preprocessing tools** into the [`scikit-learn`](https://scikit-learn.org/) ecosystem.  \n\nIt provides modular transformers for spectral data, designed to plug seamlessly into your ML workflows.\n\n## Features\n\n- Preprocessing for spectral data (baseline correction, smoothing, scaling, derivatization, scatter correction).  \n- Fully compatible with `scikit-learn` pipelines and transformers.  \n- Simple, modular API for flexible workflows.  \n- Open-source, actively maintained, and published on [PyPI](https://pypi.org/project/chemotools/) and [Conda](https://anaconda.org/conda-forge/chemotools).  
\n\n## Installation\n\nInstall from PyPI:\n\n```bash\npip install chemotools\n```\n\nInstall from Conda:\n\n```bash\nconda install -c conda-forge chemotools\n```\n\n## Usage\n\nExample: preprocessing pipeline with scikit-learn:\n\n```python\nfrom sklearn.preprocessing import StandardScaler\nfrom sklearn.pipeline import make_pipeline\n\nfrom chemotools.baseline import AirPls\nfrom chemotools.scatter import MultiplicativeScatterCorrection\n\npreprocessing = make_pipeline(\n    AirPls(),\n    MultiplicativeScatterCorrection(),\n    StandardScaler(with_std=False),\n)\n\nspectra_transformed = preprocessing.fit_transform(spectra)\n```\n\n➡️ See the [documentation](https://paucablop.github.io/chemotools/) for full details.\n\n## Development\n\nThis project uses [uv](https://github.com/astral-sh/uv) for dependency management and [Task](https://taskfile.dev) to simplify common development workflows.\nYou can get started quickly by using the predefined [Taskfile](./Taskfile.yml), which provides handy shortcuts such as:\n\n```bash\ntask install     # install all dependencies\ntask check       # run formatting, linting, typing, and tests\ntask test        # quick test run in the current environment\ntask test:matrix # run the nox compatibility matrix locally\ntask coverage    # run tests with coverage reporting\ntask build       # build the package for distribution\n```\n\nFor compatibility testing across supported Python versions, use [`nox`](https://nox.thea.codes/):\n\n```bash\nuv run nox --list               # show available sessions\nuv run nox -s tests-3.12       # run tests on a specific Python version\nuv run nox -s tests-min-sklearn-3.10\nuv run nox -s tests-min-sklearn-3.12\n```\n\n## Contributing\n\nContributions are welcome!\nCheck out the [contributing guide](CONTRIBUTING.md) and the [project board](https://github.com/users/paucablop/projects/4).\n\n## License\n\nReleased under the [MIT License](LICENSE).\n\n## Compliance and Software Supply Chain 
Management\n\nThis project embraces software supply chain transparency by generating an SBOM (Software Bill of Materials) for all dependencies. SBOMs help organizations, including those in regulated industries, track open-source components, ensure compliance, and manage security risks. \n\nThe SBOM file is made public as an asset attached to every release. It is generated using [CycloneDX SBOM generator for Python](https://github.com/CycloneDX/cyclonedx-python), and can be visualized in tools like [CycloneDX Sunshine](https://cyclonedx.github.io/Sunshine/).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaucablop%2Fchemotools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpaucablop%2Fchemotools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaucablop%2Fchemotools/lists"}