https://github.com/oneapi-src/oneDAL

oneAPI Data Analytics Library (oneDAL)

ai-inference ai-machine-learning ai-training analytics big-data cpp data-analysis data-science hacktoberfest machine-learning machine-learning-algorithms oneapi onedal swrepo

# oneAPI Data Analytics Library

[Installation](#installation)   |   [Documentation](#documentation)   |   [Support](#support)   |   [Examples](#examples)   |   [How to Contribute](CONTRIBUTING.md)   

[![Build Status](https://dev.azure.com/daal/DAAL/_apis/build/status/oneapi-src.oneDAL?branchName=main)](https://dev.azure.com/daal/DAAL/_build/latest?definitionId=5&branchName=main) [![License](https://img.shields.io/github/license/oneapi-src/oneDAL.svg)](https://github.com/oneapi-src/oneDAL/blob/main/LICENSE) [![Join the community on GitHub Discussions](https://badgen.net/badge/join%20the%20discussion/on%20github/black?icon=github)](https://github.com/oneapi-src/oneDAL/discussions)

oneAPI Data Analytics Library (oneDAL) is a powerful machine learning library that helps you accelerate big data analysis at all stages: **preprocessing**, **transformation**, **analysis**, **modeling**, **validation**, and **decision making**.

The library implements classical machine learning algorithms and accelerates them by leveraging the capabilities of Intel® hardware.

oneDAL is part of [oneAPI](https://oneapi.io). The current branch implements version 1.1 of the oneAPI Specification.

## Usage

There are different ways for you to build high-performance data science applications that use the advantages of oneDAL:
- Use oneDAL C++ interfaces with or without SYCL support ([learn more](https://oneapi-src.github.io/oneDAL/#oneapi-vs-daal-interfaces))
- Use [Intel(R) Extension for Scikit-learn*](https://intel.github.io/scikit-learn-intelex/) to accelerate existing scikit-learn code without changing it
- Use [daal4py](https://github.com/intel/scikit-learn-intelex/tree/main/daal4py), a standalone package with Python API for oneDAL

Deprecation Notice: The Java interfaces have been removed from the oneDAL library.
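
As a minimal sketch of the scikit-learn route listed above, the snippet below patches scikit-learn with `patch_sklearn()` from the `sklearnex` package and then runs ordinary scikit-learn code. It assumes `scikit-learn-intelex` and `scikit-learn` are installed; the four-point dataset is purely illustrative, and the availability checks exist only so the snippet degrades gracefully when a package is missing.

```python
import importlib.util

# Assumption: the packages may or may not be installed, so check first.
HAVE_SKLEARNEX = importlib.util.find_spec("sklearnex") is not None
HAVE_SKLEARN = importlib.util.find_spec("sklearn") is not None

if HAVE_SKLEARNEX:
    from sklearnex import patch_sklearn

    # Call this before importing any scikit-learn estimators so that
    # supported estimators are replaced with oneDAL-backed versions.
    patch_sklearn()

labels = None
if HAVE_SKLEARN:
    # The scikit-learn code itself is unchanged.
    from sklearn.cluster import KMeans

    # Illustrative data: two well-separated groups of two points each.
    X = [[0.0, 0.0], [0.1, 0.2], [9.8, 10.0], [10.0, 10.1]]
    model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    labels = model.labels_.tolist()

print("patched:", HAVE_SKLEARNEX, "labels:", labels)
```

If `sklearnex` is absent, the same code runs on stock scikit-learn, which is what makes this route convenient for existing projects.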

## Installation

Check [System Requirements](https://oneapi-src.github.io/oneDAL/system-requirements.html) before installing oneDAL.

You can [download a specific version of oneDAL](https://github.com/oneapi-src/oneDAL/releases) or [install it from sources](INSTALL.md).

## Examples

C++ Examples:

- [oneAPI interfaces with SYCL support](https://github.com/oneapi-src/oneDAL/tree/main/examples/oneapi/dpc)
- [oneAPI interfaces without SYCL support](https://github.com/oneapi-src/oneDAL/tree/main/examples/oneapi/cpp)
- [DAAL interfaces](https://github.com/oneapi-src/oneDAL/tree/main/examples/daal/cpp)

Python Examples:
- [scikit-learn-intelex](https://github.com/intel/scikit-learn-intelex/tree/main/examples/notebooks)
- [daal4py](https://github.com/intel/scikit-learn-intelex/tree/main/examples/daal4py)

Other Examples:

- [MPI](https://github.com/oneapi-src/oneDAL/tree/main/samples/daal/cpp/mpi)
- [MySQL](https://github.com/oneapi-src/oneDAL/tree/main/samples/daal/cpp/mysql)

## Documentation

oneDAL documentation:

- [Release Notes](https://github.com/oneapi-src/oneDAL/releases)
- [Get Started Guide](https://oneapi-src.github.io/oneDAL/quick-start.html)
- [Developer Guide and Reference](http://oneapi-src.github.io/oneDAL/)

Other related documentation:

- [daal4py documentation](https://intelpython.github.io/daal4py/)
- [Intel(R) Extension for Scikit-learn* documentation](https://intel.github.io/scikit-learn-intelex/)
- [oneDAL Specifications](https://spec.oneapi.com/versions/latest/elements/oneDAL/source/index.html)

## Apache Spark MLlib

oneDAL is used to accelerate Spark MLlib as part of the [OAP MLlib](https://github.com/oap-project/oap-mllib) project, where it delivers a **3-18x** increase in performance compared to the default Apache Spark MLlib in the configuration described below.

>*Technical details: FPType: double; HW: 7 x m5.2xlarge AWS instances; SW: Intel DAAL 2020 Gold, Apache Spark 2.4.4, emr-5.27.0; Spark config num executors 12, executor cores 8, executor memory 19GB, task cpus 8*

## Scaling

oneDAL supports a distributed computation mode. The charts below show its strong and weak scaling results for K-Means fit:

oneDAL K-Means fit, strong scaling results | oneDAL K-Means fit, weak scaling results
:-------------------------:|:-------------------------:
![](docs/readme-charts/Intel%20oneDAL%20KMeans%20strong%20scaling.png) | ![](docs/readme-charts/intel%20oneDAL%20KMeans%20weak%20scaling.png)

>*Technical details: FPType: float32; HW: Intel Xeon Processor E5-2698 v3 @2.3GHz, 2 sockets, 16 cores per socket; SW: Intel® DAAL (2019.3), MPI4Py (3.0.0), Intel® Distribution Of Python (IDP) 3.6.8; Details available in the article https://arxiv.org/abs/1909.11822*
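
For readers less familiar with the two terms, the sketch below shows how strong and weak scaling efficiency are commonly computed from wall-clock timings. Strong scaling keeps the total problem size fixed as workers are added, while weak scaling keeps the problem size per worker fixed. The function names and the timing values are hypothetical placeholders, not measurements of oneDAL and not part of its API.

```python
def speedup(t_base: float, t_parallel: float) -> float:
    """Speedup relative to the single-worker run: T(1) / T(p)."""
    return t_base / t_parallel


def strong_scaling_efficiency(t_base: float, t_parallel: float, workers: int) -> float:
    """Fixed total problem size: the ideal time on p workers is T(1) / p."""
    return speedup(t_base, t_parallel) / workers


def weak_scaling_efficiency(t_base: float, t_parallel: float) -> float:
    """Fixed problem size per worker: the ideal time stays at T(1)."""
    return t_base / t_parallel


# Placeholder timings in seconds, keyed by worker count (NOT real measurements).
strong_runs = {1: 120.0, 2: 60.0, 4: 40.0}
weak_runs = {1: 10.0, 2: 10.0, 4: 12.5}

for p, t in strong_runs.items():
    eff = strong_scaling_efficiency(strong_runs[1], t, p)
    print(f"strong scaling, {p} workers: efficiency {eff:.2f}")

for p, t in weak_runs.items():
    eff = weak_scaling_efficiency(weak_runs[1], t)
    print(f"weak scaling, {p} workers: efficiency {eff:.2f}")
```

An efficiency of 1.0 corresponds to ideal scaling in both cases; values below 1.0 indicate overhead such as communication or load imbalance.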

## Support

Ask questions and engage in discussions with oneDAL developers, contributors, and other users through the following channels:

- [GitHub Discussions](https://github.com/oneapi-src/oneDAL/discussions)
- [Community Forum](https://community.intel.com/t5/Intel-oneAPI-Data-Analytics/bd-p/oneapi-data-analytics-library)

You may reach out to project maintainers privately at [email protected].

### Security

To report a vulnerability, refer to [Intel vulnerability reporting policy](https://www.intel.com/content/www/us/en/security-center/default.html).

### Contribute

We welcome community contributions. Check our [contributing guidelines](CONTRIBUTING.md) to learn more.

## License

oneDAL is distributed under the Apache License 2.0. See [LICENSE](LICENSE) for more information.

[oneMKL FPK microlibs](https://github.com/oneapi-src/oneDAL/releases/tag/Dependencies)
are distributed under the Intel Simplified Software License.
Refer to [third-party-programs-mkl.txt](third-party-programs-mkl.txt) for details.