https://github.com/haifengl/smile
Statistical Machine Intelligence & Learning Engine
classification clustering computer-algebra-system computer-vision data-science dataframe deep-learning genetic-algorithm interpolation linear-algebra llm machine-learning manifold-learning multidimensional-scaling nearest-neighbor-search nlp regression statistics visualization wavelet
- Host: GitHub
- URL: https://github.com/haifengl/smile
- Owner: haifengl
- License: other
- Created: 2014-11-20T16:28:12.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2025-01-02T22:54:26.000Z (about 1 month ago)
- Last Synced: 2025-01-03T19:30:49.980Z (about 1 month ago)
- Topics: classification, clustering, computer-algebra-system, computer-vision, data-science, dataframe, deep-learning, genetic-algorithm, interpolation, linear-algebra, llm, machine-learning, manifold-learning, multidimensional-scaling, nearest-neighbor-search, nlp, regression, statistics, visualization, wavelet
- Language: Java
- Homepage: https://haifengl.github.io
- Size: 244 MB
- Stars: 6,081
- Watchers: 268
- Forks: 1,135
- Open Issues: 9
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: COPYING
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- awesome-java - Smile - Statistical Machine Intelligence and Learning Engine provides a set of machine learning algorithms and a visualization library. (Projects / Machine Learning)
- StarryDivineSky - haifengl/smile
- awesome-vega - Smile - Scala (Vega-Lite). (Wrappers / Papers)
- awesome-scala - **smile** (Table of Contents / Science and Data Analysis)
- my-awesome - haifengl/smile - algebra-system,computer-vision,data-science,dataframe,deep-learning,genetic-algorithm,interpolation,linear-algebra,llm,machine-learning,manifold-learning,multidimensional-scaling,nearest-neighbor-search,nlp,regression,statistics,visualization,wavelet pushed_at:2025-02 star:6.1k fork:1.1k Statistical Machine Intelligence & Learning Engine (Java)
README
# Smile — Statistical Machine Intelligence and Learning Engine
[![Maven Central](https://maven-badges.herokuapp.com/maven-central/com.github.haifengl/smile-core/badge.svg)](https://maven-badges.herokuapp.com/maven-central/com.github.haifengl/smile-core)
## Goal ##
Smile is a fast and comprehensive machine learning framework in Java.
Smile also provides APIs in Scala, Kotlin, and Clojure with
corresponding language paradigms. With advanced data structures and
algorithms, Smile delivers state-of-the-art performance.
Smile covers every aspect of machine learning, including deep learning,
large language models, classification, regression, clustering, association
rule mining, feature selection and extraction, manifold learning,
multidimensional scaling, genetic algorithms, missing value imputation,
efficient nearest neighbor search, etc. Furthermore, Smile also provides
advanced algorithms for graph, linear algebra, numerical analysis,
interpolation, computer algebra system for symbolic manipulations,
and data visualization.

## Features ##
Smile implements the following major machine learning algorithms:

- **GenAI:**
Native Java implementation of Llama 3.1, tiktoken tokenizer, high performance
LLM inference server with OpenAI-compatible APIs and SSE-based chat streaming,
fully functional frontend. [A free service](https://smile-ai.org) is available
for personal or test usage. No registration is required.

- **Deep Learning:**
Deep learning with CPU and GPU. EfficientNet model for image classification.

- **Classification:**
Support Vector Machines, Decision Trees, AdaBoost, Gradient Boosting,
Random Forest, Logistic Regression, Neural Networks, RBF Networks,
Maximum Entropy Classifier, KNN, Naïve Bayesian,
Fisher/Linear/Quadratic/Regularized Discriminant Analysis.

- **Regression:**
Support Vector Regression, Gaussian Process, Regression Trees,
Gradient Boosting, Random Forest, RBF Networks, OLS, LASSO, ElasticNet,
Ridge Regression.

- **Feature Selection:**
Genetic Algorithm based Feature Selection, Ensemble Learning based Feature
Selection, TreeSHAP, Signal Noise ratio, Sum Squares ratio.

- **Clustering:**
BIRCH, CLARANS, DBSCAN, DENCLUE, Deterministic Annealing, K-Means,
X-Means, G-Means, Neural Gas, Growing Neural Gas, Hierarchical
Clustering, Sequential Information Bottleneck, Self-Organizing Maps,
Spectral Clustering, Minimum Entropy Clustering.

- **Association Rule & Frequent Itemset Mining:**
FP-growth mining algorithm.

- **Manifold Learning:**
IsoMap, LLE, Laplacian Eigenmap, t-SNE, UMAP, PCA, Kernel PCA,
Probabilistic PCA, GHA, Random Projection, ICA.

- **Multi-Dimensional Scaling:**
Classical MDS, Isotonic MDS, Sammon Mapping.

- **Nearest Neighbor Search:**
BK-Tree, Cover Tree, KD-Tree, SimHash, LSH.

- **Sequence Learning:**
Hidden Markov Model, Conditional Random Field.

- **Natural Language Processing:**
Sentence Splitter and Tokenizer, Bigram Statistical Test, Phrase Extractor,
Keyword Extractor, Stemmer, POS Tagging, Relevance Ranking.

## License ##
SMILE employs a dual license model designed to meet the development
and distribution needs of both commercial distributors (such as OEMs,
ISVs and VARs) and open source projects. For details, please see
[LICENSE](https://github.com/haifengl/smile/blob/master/LICENSE).
To acquire a commercial license, please contact [email protected].

## Issues/Discussions ##
* **Discussion/Questions**:
If you wish to ask questions about Smile, we're active on [GitHub Discussions](https://github.com/haifengl/smile/discussions) and [Stack Overflow](http://stackoverflow.com/questions/tagged/smile).

* **Docs**:
Smile is well documented and [our docs are available online](https://haifengl.github.io/), where you can find tutorials,
programming guides, and more information. If you'd like to help improve the docs, they're part of this repository
in the `web/src` directory. [Java Docs](https://haifengl.github.io/api/java/index.html),
[Scala Docs](https://haifengl.github.io/api/scala/index.html), [Kotlin Docs](https://haifengl.github.io/api/kotlin/index.html),
and [Clojure Docs](https://haifengl.github.io/api/clojure/index.html) are also available.

* **Issues/Feature Requests**:
Please report any bugs or feature requests to our [issue tracker](https://github.com/haifengl/smile/issues/new).

## Installation ##
You can use the libraries through Maven central repository by adding the
following to your project pom.xml file.
```
<dependency>
  <groupId>com.github.haifengl</groupId>
  <artifactId>smile-core</artifactId>
  <version>4.2.0</version>
</dependency>
```

For deep learning and NLP, use the artifactId smile-deep and smile-nlp, respectively.
For Scala API, please add the below into your sbt script.
```
libraryDependencies += "com.github.haifengl" %% "smile-scala" % "4.2.0"
```

For Kotlin API, add the below into the `dependencies` section
of Gradle build script.
```
implementation("com.github.haifengl:smile-kotlin:4.2.0")
```

For Clojure API, add the following dependency to your project file:
```
[org.clojars.haifengl/smile "4.2.0"]
```

Some algorithms rely on BLAS and LAPACK (e.g. manifold learning,
some clustering algorithms, Gaussian Process regression, MLP, etc.).
To use these algorithms, you should include OpenBLAS for optimized matrix
computation:
```
libraryDependencies ++= Seq(
  "org.bytedeco" % "javacpp" % "1.5.11" classifier "macosx-arm64" classifier "macosx-x86_64" classifier "windows-x86_64" classifier "linux-x86_64",
  "org.bytedeco" % "openblas" % "0.3.28-1.5.11" classifier "macosx-arm64" classifier "macosx-x86_64" classifier "windows-x86_64" classifier "linux-x86_64",
  "org.bytedeco" % "arpack-ng" % "3.9.1-1.5.11" classifier "macosx-x86_64" classifier "windows-x86_64" classifier "linux-x86_64"
)
```
In this example, we include all supported 64-bit platforms and filter out
32-bit platforms. You should include only the platforms you need to save
space.

If you prefer other BLAS implementations, you can use any library found on
the "java.library.path" or on the class path, by specifying it with the
"org.bytedeco.openblas.load" system property. For example, to use the BLAS
library from the Accelerate framework on Mac OS X, we can pass options such
as `-Dorg.bytedeco.openblas.load=blas`.

If you have a default installation of MKL or simply include the following
modules that include the full version of MKL binaries, Smile will automatically
switch to MKL.
```
libraryDependencies ++= {
  val version = "2025.0-1.5.11"
  Seq(
    "org.bytedeco" % "mkl-platform" % version,
    "org.bytedeco" % "mkl-platform-redist" % version
  )
}
```

## Shell ##
Smile comes with interactive shells for Java, Scala and Kotlin.
Download pre-packaged Smile from the
[releases page](https://github.com/haifengl/smile/releases).
After unzipping the package, change into the Smile home directory
in a terminal and type
```
./bin/jshell.sh
```
to enter the Smile shell in Java, which pre-imports all major Smile packages.
You can run any valid Java expression in the shell. In the simplest case,
you can use it as a calculator.

To enter the shell in Scala, type
```
./bin/smile
```
Similar to the shell in Java, all major Smile packages are pre-imported.
In addition, all high-level Smile operators are predefined in the shell.

By default, the shell uses up to 75% of memory. If you need more memory
to handle large data, use the option `-J-Xmx` or `-XX:MaxRAMPercentage`.
For example,
```
./bin/smile -J-Xmx30G
```
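To check how much heap the memory options actually granted, one option is to query `Runtime.maxMemory()` from inside the JVM. This is a minimal sketch using only the standard library; the class name `HeapCheck` is hypothetical and not part of Smile.

```java
// Hypothetical helper (not part of Smile): reports the JVM's maximum heap size.
final class HeapCheck {
    /** Returns the maximum amount of memory the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // The value reflects -Xmx or -XX:MaxRAMPercentage if either was set.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

In the Java shell, evaluating `Runtime.getRuntime().maxMemory()` directly gives the same figure in bytes.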
You can also modify the configuration file `./conf/smile.ini` for the
memory and other JVM settings.

To use Smile shell in Kotlin, type
```
./bin/kotlin.sh
```
Unfortunately, the Kotlin shell doesn't support pre-importing packages.

## Model Serialization ##
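A serialization round trip with standard Java object streams might look like the following minimal sketch. It is not Smile code: `ModelRoundTrip` and its nested `LinearModel` are hypothetical stand-ins for a trained model, and only `java.io` classes are used. Any model that implements `Serializable` could be passed through the same two helpers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ModelRoundTrip {
    // Hypothetical stand-in for a trained model (assumption, not a Smile class).
    static final class LinearModel implements Serializable {
        private static final long serialVersionUID = 1L;
        private final double[] weights;
        private final double bias;

        LinearModel(double[] weights, double bias) {
            this.weights = weights.clone();
            this.bias = bias;
        }

        double predict(double[] x) {
            double y = bias;
            for (int i = 0; i < weights.length; i++) {
                y += weights[i] * x[i];
            }
            return y;
        }
    }

    /** Serializes any Serializable object to a byte array. */
    static byte[] toBytes(Serializable model) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(model);
        }
        return buffer.toByteArray();
    }

    /** Restores an object from bytes; the caller chooses the expected type. */
    @SuppressWarnings("unchecked")
    static <T> T fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LinearModel model = new LinearModel(new double[] {2.0, -1.0}, 0.5);
        LinearModel restored = fromBytes(toBytes(model));
        // 0.5 + 2.0 * 1.0 + (-1.0) * 3.0 = -0.5
        System.out.println(restored.predict(new double[] {1.0, 3.0}));
    }
}
```

In practice the bytes would be written to a file or object store rather than kept in memory, and the class definitions must be on the class path of the JVM that reads the model back.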
Most models support the Java `Serializable` interface (all classifiers
do) so that you can serialize a model
and ship it to a production environment for inference. You may also
use serialized models in other systems such as Spark.

## Visualization ##
A picture is worth a thousand words. In machine learning, we usually handle
high-dimensional data, which is impossible to draw on display directly.
But a variety of statistical plots are tremendously valuable for us to grasp
the characteristics of many data points. Smile provides data visualization tools
such as plots and maps for researchers to understand information more easily and quickly.
To use smile-plot, add the following to your dependencies:
```
<dependency>
  <groupId>com.github.haifengl</groupId>
  <artifactId>smile-plot</artifactId>
  <version>4.2.0</version>
</dependency>
```

On Swing-based systems, you can use the `smile.plot.swing` package to
create a variety of plots such as scatter plot, line plot, staircase plot,
bar plot, box plot, histogram, 3D histogram, dendrogram, heatmap, hexmap,
QQ plot, contour plot, surface, and wireframe.

This library also supports data visualization with a declarative approach.
With `smile.plot.vega` package, we can create a specification
that describes visualizations as mappings from data to properties
of graphical marks (e.g., points or bars). The specification is
based on [Vega-Lite](https://vega.github.io/vega-lite/). In a web browser,
the Vega-Lite compiler automatically produces visualization components
including axes, legends, and scales. It then determines properties
of these components based on a set of carefully designed rules.

## Contributing ##
Please read the [contributing.md](CONTRIBUTING.md) on how to build and test Smile.

## Maintainers ##
- Haifeng Li (@haifengl)
- Karl Li (@kklioss)

## Gallery ##
- Scatterplot Matrix
- Scatter Plot
- Line Plot
- Surface Plot
- Bar Plot
- Box Plot
- Histogram Heatmap
- Rolling Average
- Geo Map
- UMAP
- Text Plot
- Heatmap with Contour
- Hexmap
- IsoMap
- LLE
- Kernel PCA
- Neural Network
- SVM
- Hierarchical Clustering
- SOM
- DBSCAN
- Neural Gas
- Wavelet
- Exponential Family Mixture
- Teapot Wireframe
- Grid Interpolation