https://github.com/tomkyle/binning
Determine optimal number of bins k for histogram creation and optimal bin width h using various statistical methods.
binning data-analysis distributions doanes-rule freedman-diaconis histogram histogram-binning math php-math rice-rule scotts-rule square-root statistics sturges-rule terrell-scotts-rule
- Host: GitHub
- URL: https://github.com/tomkyle/binning
- Owner: tomkyle
- License: mit
- Created: 2025-06-25T11:52:51.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-06-28T18:51:58.000Z (3 months ago)
- Last Synced: 2025-08-02T22:49:19.774Z (2 months ago)
- Topics: binning, data-analysis, distributions, doanes-rule, freedman-diaconis, histogram, histogram-binning, math, php-math, rice-rule, scotts-rule, square-root, statistics, sturges-rule, terrell-scotts-rule
- Language: PHP
- Homepage:
- Size: 66.4 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# tomkyle/binning
**Determine the optimal number of bins *k* for histogram creation and the optimal bin width *h* using various statistical methods. Its unified interface includes implementations of well-known binning rules such as:**
- Square Root Rule (1892)
- Sturges’ Rule (1926)
- Doane’s Rule (1976)
- Scott’s Rule (1979)
- Freedman-Diaconis Rule (1981)
- Terrell-Scott’s Rule (1985)
- Rice University Rule

## Requirements
This library requires PHP 8.3 or newer. Support for older PHP versions, such as the PHP 7.2+ support that [markrogoyski/math-php](https://github.com/markrogoyski/math-php) provides, is not planned.
## Installation
```bash
composer require tomkyle/binning
```

## Usage
The **BinSelection** class provides several methods for determining the optimal number of bins for histogram creation and optimal bin width. You can either use specific methods directly or the general `suggestBins()` and `suggestBinWidth()` methods with different strategies.
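For orientation, four of the rules listed above depend only on the sample size *n* and reduce to one-line textbook formulas. The following is a minimal standalone sketch of those formulas, not the library's code: `binCountsBySampleSize()` is a hypothetical helper, and the library's own rounding conventions may differ.

```php
<?php

/**
 * Textbook bin-count formulas that depend only on the sample size n.
 * Hypothetical helper for illustration; not part of tomkyle/binning.
 *
 * @return array<string,int>
 */
function binCountsBySampleSize(int $n): array
{
    if ($n < 1) {
        throw new InvalidArgumentException('Sample size must be at least 1.');
    }

    return [
        // Square Root Rule: k = ceil(sqrt(n))
        'square_root'   => (int) ceil(sqrt($n)),
        // Sturges' Rule: k = ceil(log2(n)) + 1
        'sturges'       => (int) ceil(log($n, 2)) + 1,
        // Rice University Rule: k = ceil(2 * n^(1/3))
        'rice'          => (int) ceil(2 * $n ** (1 / 3)),
        // Terrell-Scott's Rule: k = ceil((2n)^(1/3))
        'terrell_scott' => (int) ceil((2 * $n) ** (1 / 3)),
    ];
}

// Print the suggested bin counts for a sample of 100 values.
foreach (binCountsBySampleSize(100) as $rule => $k) {
    printf("%-14s: %2d bins\n", $rule, $k);
}
```

For *n* = 100 this gives 10, 8, 10 and 6 bins respectively, which shows how far the rules can diverge even on a moderately sized sample.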
### Determine Bin Width
Use the **suggestBinWidth** method to get the *optimal bin width* for the selected method. It returns the bin width, often denoted *h*, as a float value.
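To make the notion of a bin width concrete, here is a minimal standalone sketch of the two width-based rules, Scott's Rule (*h* = 3.49 · *s* · *n*^(−1/3)) and the Freedman-Diaconis Rule (*h* = 2 · IQR · *n*^(−1/3)). It does not use the library: all function names are hypothetical, and the quartile convention (linear interpolation between order statistics) is an assumption that may differ from what the library does.

```php
<?php

/** Sample standard deviation (n - 1 in the denominator). */
function sampleStdDev(array $data): float
{
    $n = count($data);
    if ($n < 2) {
        throw new InvalidArgumentException('Need at least two values.');
    }
    $mean = array_sum($data) / $n;
    $sumSq = 0.0;
    foreach ($data as $x) {
        $sumSq += ($x - $mean) ** 2;
    }
    return sqrt($sumSq / ($n - 1));
}

/** Quantile by linear interpolation between order statistics (assumed convention). */
function quantile(array $data, float $p): float
{
    if ($data === []) {
        throw new InvalidArgumentException('Dataset cannot be empty.');
    }
    sort($data); // sorts the local copy only
    $pos = $p * (count($data) - 1);
    $lo = (int) floor($pos);
    $hi = (int) ceil($pos);
    return $data[$lo] + ($pos - $lo) * ($data[$hi] - $data[$lo]);
}

/** Scott's Rule: h = 3.49 * s * n^(-1/3). */
function scottBinWidth(array $data): float
{
    return 3.49 * sampleStdDev($data) * count($data) ** (-1 / 3);
}

/** Freedman-Diaconis Rule: h = 2 * IQR * n^(-1/3). */
function freedmanDiaconisBinWidth(array $data): float
{
    $iqr = quantile($data, 0.75) - quantile($data, 0.25);
    return 2 * $iqr * count($data) ** (-1 / 3);
}

/** Convert a bin width h into a bin count over the data range. */
function binsFromWidth(array $data, float $h): int
{
    if ($h <= 0) {
        throw new InvalidArgumentException('Bin width must be positive.');
    }
    return max(1, (int) ceil((max($data) - min($data)) / $h));
}

// Illustrative data only.
$data = [12.5, 14.1, 13.8, 15.2, 12.9, 16.4, 13.3, 14.7, 15.9, 13.1];
printf("Scott:             h = %.3f\n", scottBinWidth($data));
printf("Freedman-Diaconis: h = %.3f\n", freedmanDiaconisBinWidth($data));
```

A width converts to a bin count by dividing the data range by *h* and rounding up, as `binsFromWidth()` does. Note that the Freedman-Diaconis width collapses to zero when the IQR is zero, which is why the conversion guards against non-positive widths.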
| Method | Strengths and weaknesses |
|--------|--------------------------|
| … | …<br>⚠️ May over-smooth heavily skewed or multi-modal data when IQR is small. |
| **Sturges’ Rule** | Very simple, works well for roughly normal, moderate-sized datasets.<br>⚠️ Ignores outliers and underestimates bin count for large or skewed samples. |
| **Rice Rule** | Independent of data shape and easy to compute.<br>⚠️ Prone to over- or under-smoothing when the distribution is heavy-tailed or skewed. |
| **Terrell-Scott** | Similar approach to the *Rice Rule* but with asymptotically optimal MISE properties; gives more bins than Sturges and adapts better at large *n*.<br>⚠️ Still ignores skewness and outliers. |
| **Square Root Rule** | Simply the square root of the sample size, so it requires no distributional estimates.<br>⚠️ May produce too few bins for complex distributions, or too many for very noisy data. |
| **Doane’s Rule** | Extends *Sturges’ Rule* by adding a skewness correction, improving performance on asymmetric data.<br>⚠️ Requires estimating the third moment (skewness), which can be unstable for small *n*. |
| **Scott’s Rule** | Uses standard deviation to minimize MISE, providing good balance for unimodal, symmetric data.<br>⚠️ Sensitive to outliers (inflated $\sigma$) and may underperform on skewed distributions. |

## Literature
Rubia, J.M.D.L. (2024):
**Rice University Rule to Determine the Number of Bins.**
Open Journal of Statistics, 14, 119-149.
DOI: [10.4236/ojs.2024.141006](https://doi.org/10.4236/ojs.2024.141006)

Wikipedia:
**Histogram / Number of bins and width**
https://en.wikipedia.org/wiki/Histogram#Number_of_bins_and_width

## Practical Example
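```php
<?php

// Namespace is an assumption; adjust it to match the installed package.
use tomkyle\Binning\BinSelection;

// Illustrative sample data; substitute your own measurements.
$measurements = [12.5, 14.1, 13.8, 15.2, 12.9, 16.4, 13.3, 14.7, 15.9, 13.1];

// Map a display name to each binning strategy constant.
$methods = [
    'Sturges’ Rule' => BinSelection::STURGES,
    'Rice University Rule' => BinSelection::RICE,
    'Terrell-Scott’s Rule' => BinSelection::TERRELL_SCOTT,
    'Square Root Rule' => BinSelection::SQUARE_ROOT,
    'Doane’s Rule' => BinSelection::DOANE,
    'Scott’s Rule' => BinSelection::SCOTT,
    'Freedman-Diaconis Rule' => BinSelection::FREEDMAN_DIACONIS,
];

// Print the suggested bin count for every strategy.
foreach ($methods as $name => $method) {
    $bins = BinSelection::suggestBins($measurements, $method);
    echo sprintf("%-18s: %2d bins\n", $name, $bins);
}
```

## Error Handling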
All methods will throw `InvalidArgumentException` for invalid inputs:
```php
try {
// This will throw an exception
$bins = BinSelection::sturges([]);
} catch (InvalidArgumentException $e) {
echo "Error: " . $e->getMessage();
// Output: "Dataset cannot be empty to apply the Sturges' Rule."
}

try {
// This will throw an exception
$bins = BinSelection::suggestBins($data, 'invalid-method');
} catch (InvalidArgumentException $e) {
echo "Error: " . $e->getMessage();
// Output: "Unknown binning method: invalid-method"
}
```

## Development
### Clone repo and install requirements
```bash
$ git clone git@github.com:tomkyle/binning.git
$ cd binning
$ composer install
$ pnpm install
```

### Watch source and run various tests
This watches for changes inside the **src/** and **tests/** directories and runs a series of checks:
1. Find and run the corresponding unit test with *PHPUnit*.
2. Find possible bugs and documentation issues using *PHPStan*.
3. Analyse code style and give hints on newer syntax using *Rector*.

```bash
$ npm run watch
```

**Run PHPUnit**
```bash
$ npm run phpunit
```