{"id":24805367,"url":"https://github.com/viadee/discretizer4j","last_synced_at":"2025-10-13T06:32:17.761Z","repository":{"id":57731045,"uuid":"203154312","full_name":"viadee/discretizer4j","owner":"viadee","description":"Discretize all the things!","archived":false,"fork":false,"pushed_at":"2019-11-04T07:37:39.000Z","size":111,"stargazers_count":7,"open_issues_count":2,"forks_count":0,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-07-06T11:17:48.209Z","etag":null,"topics":["discretizer","java","machinelearning","xai"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/viadee.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-08-19T10:52:22.000Z","updated_at":"2023-04-22T04:20:56.000Z","dependencies_parsed_at":"2022-09-26T22:01:42.993Z","dependency_job_id":null,"html_url":"https://github.com/viadee/discretizer4j","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/viadee/discretizer4j","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/viadee%2Fdiscretizer4j","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/viadee%2Fdiscretizer4j/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/viadee%2Fdiscretizer4j/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/viadee%2Fdiscretizer4j/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/viadee","download_url":"https://codeload.github.com/viadee/discretizer4j/tar.gz/refs/heads/master","sbom_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/viadee%2Fdiscretizer4j/sbom","scorecard":{"id":919817,"data":{"date":"2025-08-11","repo":{"name":"github.com/viadee/discretizer4j","commit":"2f113807f6c9af46b588d477c38e7dcfc4bd79c8"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Code-Review","score":0,"reason":"Found 0/26 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source 
repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: BSD 3-Clause \"New\" or \"Revised\" License: LICENSE:0"],"documentation":{"short":"Determines if the project has 
defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security 
policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}}]},"last_synced_at":"2025-08-25T00:44:16.734Z","repository_id":57731045,"created_at":"2025-08-25T00:44:16.748Z","updated_at":"2025-08-25T00:44:16.748Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279013969,"owners_count":26085429,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-13T02:00:06.723Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["discretizer","java","machinelearning","xai"],"created_at":"2025-01-30T07:18:09.134Z","updated_at":"2025-10-13T06:32:17.416Z","avatar_url":"https://github.com/viadee.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# discretizer4j\n[![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)\n[![Build Status](https://travis-ci.org/viadee/discretizer4j.svg?branch=master)](https://travis-ci.org/viadee/discretizer4j)\n[![Coverage](https://sonarcloud.io/api/project_badges/measure?project=de.viadee%3Adiscretizer4j\u0026metric=coverage)](https://sonarcloud.io/dashboard?id=de.viadee%3Adiscretizer4j)\n\nThis project provides a Java implementation of several discretization algorithms (aka binning).\n\nThis is often a useful step in order to cope with overfitting in machine 
learning models or overly specific explanations from XAI algorithms such as [Anchors](https://github.com/viadee/javaAnchorExplainer), when working with numerical data.\n\nWe concentrate on univariate algorithms, both supervised and unsupervised, to keep things simple and away from decision tree algorithms.\nWe chose the Java language to achieve reasonable performance, to easily integrate with AnchorsJ (and because we did not find any other suitable open-source Java package).\n\nCurrent implementations:\n* Unsupervised: \n    * Equal Frequency in ``PercentileMedianDiscretizer``\n    * [Equal Size](http://users.monash.edu/~webb/Files/YangWebb03b.pdf) in ``EqualSizeDiscretizer``\n    * [Proportional k-Interval Discretizer](http://users.monash.edu/~webb/Files/YangWebb03b.pdf) in ``EqualSizeDiscretizer``\n    * Manual Discretization in ``ManualDiscretizer``\n    * Random Discretization in ``RandomDiscretizer``\n* Supervised: \n    * [FUSINTER Discretizer](https://www.researchgate.net/publication/220354451_FUSINTER_A_Method_for_Discretization_of_Continuous_Attributes) in ``FUSINTERDiscretizer``\n    * [Minimum Description Length Principle Discretizer](https://www.ijcai.org/Proceedings/93-2/Papers/022.pdf) in ``MDLPDiscretizer``\n    * [Ameva Discretizer](https://sci2s.ugr.es/keel/pdf/algorithm/articulo/2009-Gonzalez-Abril-ESWA.pdf) in ``AmevaDiscretizer``\n\n## Getting Started\n\n\n### Prerequisites and Installation\n\nIn order to use the core project, no installation other than Java (version 8+) is required. The intended way to use the algorithms is as a Maven dependency. 
Our Maven coordinates are as follows:\n\n```xml\n  \u003cdependency\u003e\n    \u003cgroupId\u003ede.viadee\u003c/groupId\u003e\n    \u003cartifactId\u003ediscretizer4j\u003c/artifactId\u003e\n    \u003cversion\u003e1.0.0\u003c/version\u003e    \n  \u003c/dependency\u003e\n```\n    \nThere are no transitive dependencies.\n\n### Using the Algorithm\n\nTo discretize a continuous feature, one has to create a Discretizer (extending the ``AbstractDiscretizer``). The discretizer then has to be fitted to the data.\nThis may be done as follows: \n\n```Java\nDiscretizer discretizer = new Discretizer();\ndiscretizer.fit(values, labels);\n```\nThe fitted discretizer can then be used to get all ``DiscretizerTransitions`` that the algorithm has fitted. \nAlternatively, values can be passed to the discretizer: the ``apply`` function returns the discretized labels.\n\n```Java\ndiscretizer.getTransitions();\n// returns:\n// DiscretizationTransition From ]1, 14.5) to class 0.0\n// DiscretizationTransition From [14.5, 19.5) to class 1.0\n// DiscretizationTransition From [19.5, 22.5) to class 2.0\n// DiscretizationTransition From [22.5, 36.5) to class 3.0\n// DiscretizationTransition From [36.5, 40[ to class 4.0\n\ndiscretizer.apply(new Double[]{1.5, 17.0, 10.0});\n// returns:\n// Double[0.0, 1.0, 0.0]\n```\n\nThe fitting creates ``DiscretizerTransitions``. Each consists of a discretizedLabel (Double) and a discretizedOrigin. \nThe origin is either a unique value, if the ``UniqueValueDiscretizer`` was used, or a combination of a minValue and maxValue, which determine the interval limits of the transition. \n\n### Tutorials and Examples\n\nSmall examples for all implemented discretizers can be found in the unit tests. \n\nTo see these discretizers in a more complex project, please refer to the [XAI Examples](https://github.com/viadee/xai_examples). There, discretization is used in the context of explainable artificial intelligence. 
\n\n# Collaboration\n\nThe project is maintained and further developed by viadee Consulting AG in Münster, Westphalia. Results from theses at the WWU Münster and the FH Münster have been incorporated. The contact person is Dr. Frank Köhne from viadee.\n* Implementations of additional discretizers are planned.\n* Community contributions to the project are welcome: please open GitHub issues with suggestions (or pull requests), which the team can then work on.\n\n## Authors\n* **Marvin Gronhorst** - [Marvin Gronhorst](https://github.com/MarvinGronhorst)\n* **Tobias Goerke** - [Tobias Goerke](https://github.com/TobiasGoerke)\n* **Colin Juers** - [Colin Juers](https://github.com/cjuers)\n* **Dr. Frank Köhne** - [Dr. Frank Köhne](https://github.com/fkoehne)\n \n\n## License\n\nBSD 3-Clause License\n\n## Acknowledgments\n\n[Garcia et al.](https://sci2s.ugr.es/sites/default/files/files/ComplementaryMaterial/discretization/2013-Garcia-IEEETKDE.pdf) for their extensive research on discretization techniques. \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fviadee%2Fdiscretizer4j","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fviadee%2Fdiscretizer4j","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fviadee%2Fdiscretizer4j/lists"}