# Machine Learning Benchmarks

[![Build Status](https://dev.azure.com/daal/scikit-learn_bench/_apis/build/status/IntelPython.scikit-learn_bench?branchName=main)](https://dev.azure.com/daal/scikit-learn_bench/_build/latest?definitionId=8&branchName=main)

**Scikit-learn_bench** is a benchmark tool for libraries and frameworks that implement scikit-learn-like APIs, as well as for other workloads.

Benefits:
- Full control of the benchmark suite through the CLI
- Flexible and powerful benchmark config structure
- Support for advanced profiling tools, such as Intel(R) VTune* Profiler
- Automated benchmark report generation

### 📜 Table of Contents

- [Machine Learning Benchmarks](#machine-learning-benchmarks)
- [🔧 Create a Python Environment](#-create-a-python-environment)
- [🚀 How To Use Scikit-learn\_bench](#-how-to-use-scikit-learn_bench)
- [Benchmarks Runner](#benchmarks-runner)
- [Report Generator](#report-generator)
- [Scikit-learn\_bench High-Level Workflow](#scikit-learn_bench-high-level-workflow)
- [📚 Benchmark Types](#-benchmark-types)
- [📑 Documentation](#-documentation)

## 🔧 Create a Python Environment

How to create a Python environment with the required frameworks:

- **sklearn, sklearnex, and gradient boosting frameworks**:

```bash
# with pip
pip install -r envs/requirements-sklearn.txt
# or with conda
conda env create -n sklearn -f envs/conda-env-sklearn.yml
```

- **RAPIDS**:

```bash
conda env create -n rapids --solver=libmamba -f envs/conda-env-rapids.yml
```
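
After creating and activating an environment, a quick sanity check confirms that the core packages import correctly (a minimal sketch, assuming the `sklearn` environment from above is active):

```bash
# Illustrative check, not part of scikit-learn_bench:
python -c "import sklearn, sklearnex; print('scikit-learn', sklearn.__version__)"
```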

## 🚀 How To Use Scikit-learn_bench

### Benchmarks Runner

How to run benchmarks using the `sklbench` module and a specific configuration:

```bash
python -m sklbench --config configs/sklearn_example.json
```

The default output is a file with JSON-formatted results of the benchmarking cases. To also generate a human-readable report, add the `--report` flag:

```bash
python -m sklbench --config configs/sklearn_example.json --report
```

By default, output and report file paths are `result.json` and `report.xlsx`. To specify custom file paths, run:

```bash
python -m sklbench --config configs/sklearn_example.json --report --result-file result_example.json --report-file report_example.xlsx
```

For a description of all benchmarks runner arguments, refer to the [documentation](sklbench/runner/README.md#arguments).

### Report Generator

To combine raw result files gathered from different environments, call the report generator:

```bash
python -m sklbench.report --result-files result_1.json result_2.json --report-file report_example.xlsx
```
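
For example, results collected on two machines can be merged into a single report (a sketch; the file names are illustrative):

```bash
# On machine A (e.g., a CPU environment):
python -m sklbench --config configs/sklearn_example.json --result-file result_cpu.json
# On machine B (e.g., a GPU environment):
python -m sklbench --config configs/sklearn_example.json --result-file result_gpu.json
# Combine the raw results into one report:
python -m sklbench.report --result-files result_cpu.json result_gpu.json --report-file combined_report.xlsx
```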

For a description of all report generator arguments, refer to the [documentation](sklbench/report/README.md#arguments).

### Scikit-learn_bench High-Level Workflow

```mermaid
flowchart TB
A[User] -- High-level arguments --> B[Benchmarks runner]
B -- Generated benchmarking cases --> C["Benchmarks collection"]
C -- Raw JSON-formatted results --> D[Report generator]
D -- Human-readable report --> A

classDef userStyle fill:#44b,color:white,stroke-width:2px,stroke:white;
class A userStyle
```

## 📚 Benchmark Types

**Scikit-learn_bench** supports the following types of benchmarks:

- **Scikit-learn estimator** - Measures performance and quality metrics of a [scikit-learn-like estimator](https://scikit-learn.org/stable/glossary.html#term-estimator).
- **Function** - Measures performance metrics of a specified function.
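
Conceptually, an estimator benchmark times the estimator's stages and records a quality metric alongside them. The standalone sketch below illustrates the idea; it is not part of scikit-learn_bench, which automates this kind of measurement across frameworks and configs:

```bash
# Illustration only: what an estimator benchmark measures conceptually
python - <<'EOF'
import time
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=10_000, random_state=42)
est = LogisticRegression(max_iter=1000)

t0 = time.perf_counter(); est.fit(X, y); fit_time = time.perf_counter() - t0
t0 = time.perf_counter(); pred = est.predict(X); predict_time = time.perf_counter() - t0

print(f"fit: {fit_time:.3f}s  predict: {predict_time:.3f}s  "
      f"accuracy: {accuracy_score(y, pred):.3f}")
EOF
```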

## 📑 Documentation
[Scikit-learn_bench](README.md):
- [Configs](configs/README.md)
- [Benchmarks Runner](sklbench/runner/README.md)
- [Report Generator](sklbench/report/README.md)
- [Benchmarks](sklbench/benchmarks/README.md)
- [Data Processing](sklbench/datasets/README.md)
- [Emulators](sklbench/emulators/README.md)
- [Developer Guide](docs/README.md)