{"id":37027295,"url":"https://github.com/rzykov/fastml4j","last_synced_at":"2026-01-14T03:14:16.472Z","repository":{"id":57722562,"uuid":"95898376","full_name":"rzykov/fastml4j","owner":"rzykov","description":"Fast Scala and nd4j based machine learning framework","archived":false,"fork":false,"pushed_at":"2017-11-19T14:02:11.000Z","size":104,"stargazers_count":18,"open_issues_count":8,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-12-23T09:40:46.670Z","etag":null,"topics":["machine-learning","machine-learning-algorithms","nd4j","scala"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rzykov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-06-30T14:55:11.000Z","updated_at":"2022-03-08T11:00:01.000Z","dependencies_parsed_at":"2022-09-14T01:14:11.159Z","dependency_job_id":null,"html_url":"https://github.com/rzykov/fastml4j","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rzykov/fastml4j","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rzykov%2Ffastml4j","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rzykov%2Ffastml4j/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rzykov%2Ffastml4j/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rzykov%2Ffastml4j/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rzykov","download_url":"https://codeload.github.com/rzykov/fastml4j/tar.gz/refs/heads/master","sbom_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rzykov%2Ffastml4j/sbom","scorecard":{"id":793406,"data":{"date":"2025-08-11","repo":{"name":"github.com/rzykov/fastml4j","commit":"f8985eb3b48a3230fe00488d3ffa9231520e5925"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Code-Review","score":0,"reason":"Found 0/29 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build 
process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses 
fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 2 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-23T08:13:12.078Z","repository_id":57722562,"created_at":"2025-08-23T08:13:12.078Z","updated_at":"2025-08-23T08:13:12.078Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28408818,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T01:52:23.358Z","status":"online","status_checked_at":"2026-01-14T02:00:06.678Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","machine-learning-algorithms","nd4j","scala"],"created_at":"2026-01-14T03:14:15.783Z","updated_at":"2026-01-14T03:14:16.466Z","avatar_url":"https://github.com/rzykov.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Fastml4j\nAn experimental, fast machine learning framework for Scala. It's based on \n[ND4J](https://github.com/deeplearning4j/nd4j), a library for scientific computing on the JVM. \n\n## Goal\nMy goal is to create a simple and fast machine learning library for Scala and, if possible, \nfor the JVM.  \n\nI have worked heavily with Spark's MLlib, and my conclusion is: don't use distributed machine \nlearning algorithms until you really need them. We prefer to prepare datasets \nin a distributed manner but \nto train with non-distributed ML libraries for performance reasons. 
I saw a case where \nMLlib's random forest took 2 hours to train while the same method from the Smile library \ntook only 5 minutes. You always pay a price for distributed ML. \n\nUnfortunately, there is a lack of good machine learning libraries for Scala. I found one very \ngood library: [Smile](https://github.com/haifengl/smile). It is written in Java and it is fast, but it doesn't use a vectorized \napproach. A vectorized approach makes code more readable and faster: matrix operations can be \nexpressed with only a few symbols.\n\n## Features\n* Logistic regression\n* Linear SVM\n* Linear regression (classical OLS)\n* Binary classification metrics\n* Regression metrics\n\n## Roadmap\n* Decision trees with categorical variables and missing data\n* General ensembles\n* Random forests\n* Advanced optimisation algorithms\n* Gradient boosted trees. I'm going to use the existing library [CatBoost](https://github.com/catboost/catboost) \nvia [JavaCPP](https://github.com/bytedeco/javacpp-presets). 
It's hard to write a good GBRT implementation of your own.\n\n## Getting started\n* Build:\n```bash\ngit clone https://github.com/rzykov/fastml4j\ncd fastml4j\nsbt package\n```   \n* [Scala Doc for this package](https://rzykov.github.io/fastml4j/api/)\n* Link to Scala notebook\n* How to use it from Maven and other build tools:\n```\nApache Maven\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.github.rzykov\u003c/groupId\u003e\n    \u003cartifactId\u003efastml4j_2.11\u003c/artifactId\u003e\n    \u003cversion\u003e0.1\u003c/version\u003e\n\u003c/dependency\u003e\n    \nApache Buildr\n'com.github.rzykov:fastml4j_2.11:jar:0.1'\n   \nApache Ivy\n\u003cdependency org=\"com.github.rzykov\" name=\"fastml4j_2.11\" rev=\"0.1\" /\u003e\n    \nGroovy Grape\n@Grapes( \n@Grab(group='com.github.rzykov', module='fastml4j_2.11', version='0.1') \n)\n    \nGradle/Grails\ncompile 'com.github.rzykov:fastml4j_2.11:0.1'\n    \nScala SBT\nlibraryDependencies += \"com.github.rzykov\" % \"fastml4j_2.11\" % \"0.1\"\n    \nLeiningen\n[com.github.rzykov/fastml4j_2.11 \"0.1\"]\n```\n\n## FAQ\n* __Why did you choose ND4J?__\n  It supports many platforms and NVIDIA CUDA GPUs out of the box. \n* __Why did you write your own implicits instead of using the nd4s library?__\n  Nd4s looks too complicated for my purposes. I also found that it lacks some DSL \n  elements. Personally, I don't like implicits, but in this case they are in the right place.\n  My preferred way of using them is to import them explicitly in every source file that needs them. \n  This tells the reader that implicits are in use and where to find them.  \n* __Why did you choose Float rather than Double?__\n  ND4J uses Float as its default type. That seems reasonable because it saves memory. 
\n  I also want to run some experiments on a Raspberry Pi in the future, which has only 1 GB of RAM.\n \n## Contributions\nI'm interested in code reviews of this project and in feedback from users.\n\n## Why Scala for machine learning?\n\nThis section is biased towards Spark, but it also applies to data science in general.\n\nIt's very important to choose the right tool for data analysis. There are lots of questions like \"What Machine Learning tool is better?\" on Kaggle.com's forums. The top ranks are occupied by R and Python. Here I describe my migration to the Scala/Spark technology stack.\n\nWe do machine learning at a very large scale at Retailrocket.net. Previously we used IPython + Pyhs2 (a Hive driver for Python) + Pandas + Sklearn to build prototypes. We decided to move to Apache Spark at the end of summer 2014 because our experiments with it showed a 3-4x performance improvement on the same cluster.\n\nAt that time, four languages were in use simultaneously: Hive, Pig, Java and Python. This \"zoo\" sometimes caused very serious problems. With Spark, we could use a single programming language for both prototypes and production code. This is a great benefit for our small team.\n\nSpark supports Python, Scala and Java via its API. Spark itself is written in Scala, so our team chose Scala as its main development language. We can read the Spark source code and patch it. Also, Scala is a JVM-based language, which matters because Hadoop is written in Java.\n\nScala:\n* (+) functional; nice for data scientists\n* (+) native for Spark; important if you want to learn Spark internals\n* (+) based on the JVM, so it's native for Hadoop\n* (+) strong static types; errors are caught at compile time\n* (-) hard to learn; \n\nPython:\n* (+) popular;\n* (+) simple;\n* (-) dynamic typing;\n* (-) performance is worse than Scala.\n\nJava:\n* (+) popular;\n* (+) native for Hadoop;\n* (-) too many lines of code. 
Reading Java code reminds me of German, with its very long words like \"SchadschtoffFilteranlage\". :-)\n\nThis choice was hard because none of us knew Scala. But after a year and a half I can say that Scala is a mix of Java and Python: the conciseness of Python and the power of Java.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frzykov%2Ffastml4j","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frzykov%2Ffastml4j","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frzykov%2Ffastml4j/lists"}