{"id":13893929,"url":"https://github.com/RobTillaart/Histogram","last_synced_at":"2025-07-17T08:31:44.262Z","repository":{"id":45346819,"uuid":"271550834","full_name":"RobTillaart/Histogram","owner":"RobTillaart","description":"Arduino library for creating histograms","archived":false,"fork":false,"pushed_at":"2024-05-26T09:04:51.000Z","size":70,"stargazers_count":10,"open_issues_count":0,"forks_count":3,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-11-19T06:33:47.895Z","etag":null,"topics":["arduino","histogram","math","statistics"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RobTillaart.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"RobTillaart","custom":"https://www.paypal.me/robtillaart"}},"created_at":"2020-06-11T13:17:56.000Z","updated_at":"2024-10-01T17:45:58.000Z","dependencies_parsed_at":"2023-02-12T17:30:28.657Z","dependency_job_id":"1728ea01-0007-4a8e-b530-f3c80b5dab6f","html_url":"https://github.com/RobTillaart/Histogram","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RobTillaart%2FHistogram","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RobTillaart%2FHistogram/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RobTillaart%2FHistogram/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RobT
illaart%2FHistogram/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RobTillaart","download_url":"https://codeload.github.com/RobTillaart/Histogram/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":226243893,"owners_count":17594452,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arduino","histogram","math","statistics"],"created_at":"2024-08-06T18:01:20.099Z","updated_at":"2024-11-24T22:30:58.525Z","avatar_url":"https://github.com/RobTillaart.png","language":"C++","funding_links":["https://github.com/sponsors/RobTillaart","https://www.paypal.me/robtillaart"],"categories":["C++"],"sub_categories":[],"readme":"\n[![Arduino CI](https://github.com/RobTillaart/Histogram/workflows/Arduino%20CI/badge.svg)](https://github.com/marketplace/actions/arduino_ci)\n[![Arduino-lint](https://github.com/RobTillaart/Histogram/actions/workflows/arduino-lint.yml/badge.svg)](https://github.com/RobTillaart/Histogram/actions/workflows/arduino-lint.yml)\n[![JSON check](https://github.com/RobTillaart/Histogram/actions/workflows/jsoncheck.yml/badge.svg)](https://github.com/RobTillaart/Histogram/actions/workflows/jsoncheck.yml)\n[![GitHub issues](https://img.shields.io/github/issues/RobTillaart/Histogram.svg)](https://github.com/RobTillaart/Histogram/issues)\n\n[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/RobTillaart/Histogram/blob/master/LICENSE)\n[![GitHub 
release](https://img.shields.io/github/release/RobTillaart/Histogram.svg?maxAge=3600)](https://github.com/RobTillaart/Histogram/releases)\n[![PlatformIO Registry](https://badges.registry.platformio.org/packages/robtillaart/library/Histogram.svg)](https://registry.platformio.org/libraries/robtillaart/Histogram)\n\n\n# Histogram\n\nArduino library for creating histograms.\n\n\n## Description\n\nOne of the main applications for the Arduino board is reading and logging sensor data.\nWe often want to make a histogram of this data to gain insight into the distribution of the\nmeasurements. This is where this Histogram library comes in.\n\nThe Histogram distributes the values added to it into buckets and keeps a count per bucket.\n\nIf you need more quantitative analysis, you might need the Statistic library:\n- https://github.com/RobTillaart/Statistic\n\n\n#### Related\n\n- https://github.com/RobTillaart/Correlation\n- https://github.com/RobTillaart/GST - Golden standard test metrics\n- https://github.com/RobTillaart/Histogram\n- https://github.com/RobTillaart/Kurtosis\n- https://github.com/RobTillaart/RunningAngle\n- https://github.com/RobTillaart/RunningAverage\n- https://github.com/RobTillaart/RunningMedian\n- https://github.com/RobTillaart/statHelpers - combinations \u0026 permutations\n- https://github.com/RobTillaart/Statistic\n\n\n#### Working\n\nWhen the class is initialized, an array of boundaries that define the borders of the\nbuckets is passed to the constructor. This array should be declared global as the\nHistogram class does not copy the values, to keep memory usage low. 
This makes it possible to change\nthe boundaries at runtime, so after a **clear()**, a new Histogram can be created.\n\nThe values in the boundary array do not need to be equidistant (equal in size),\nbut they need to be in ascending order.\n\nInternally the library does not record the individual values, only the count per bucket.\nIf a new value is added - **add(value)** - the class checks in which bucket it \nbelongs and the bucket's counter is increased.\n\nThe **sub(value)** function is used to decrease the count of a bucket and it can \ncause the count to drop below zero. \nAlthough seldom used, it can be useful depending on the application. \nE.g. when you want to compare two value generating streams, you let \none stream **add()** and the other **sub()**. If the histograms of both streams are \nsimilar they should cancel each other out (more or less), and the value of all buckets \nshould be around 0. \\[not tried\\].\n\nThe **frequency()** function may be removed to reduce footprint as it can be calculated \nwith the formula **(1.0 \\* bucket(i))/count()**.\n\n\n#### Experimental: Histogram8 Histogram16\n\nHistogram8 and Histogram16 are derived classes with the same interface but smaller buckets. \nHistogram can count to ± 2^31 while often ± 2^15 or even ± 2^7 is sufficient. 
\nThis saves substantial memory.\n\n|  class name   |  length  |  count/bucket  |  max memory  |\n|:--------------|---------:|---------------:|-------------:|\n|  Histogram    |   65534  |  ± 2147483647  |      260 KB  |\n|  Histogram8   |   65534  |  ± 127         |       65 KB  |\n|  Histogram16  |   65534  |  ± 32767       |      130 KB  |\n\n\nThe difference is the element type of the **\\_data** array, chosen to reduce the memory footprint.\n\nNote: max memory is without the boundary array.\n\nPerformance optimizations are possible too, however they are not essential for \nthe experimental version.\n\n\n## Interface \n\n```cpp\n#include \"histogram.h\"\n```\n\n#### Constructor\n\n- **Histogram(uint16_t length, float \\*bounds)** constructor, takes an array of boundary values and the array length. \nLength should be less than 65534.\n- **Histogram8(uint16_t length, float \\*bounds)** same as above.\n- **Histogram16(uint16_t length, float \\*bounds)** same as above.\n- **~Histogram()** destructor.\n- **~Histogram8()** destructor.\n- **~Histogram16()** destructor.\n\n\n#### MaxBucket\n\nBy default the maxBucket size is defined as 255 (8 bit), 65535 (16 bit) or\n2147483647 (32 bit) depending on the class used.\nThe functions below set and get the maxBucket so that **add()** and\n**sub()** reach **FULL** faster.\nUseful in some applications e.g. games.\n\n- **void setMaxBucket(uint32_t value)** sets a user defined maxBucket level, e.g. 25.\n- **uint32_t getMaxBucket()** returns the current maxBucket.\n\nPlease note it makes no sense to set maxBucket to a value larger than\nthe histogram type can handle. \nSetting maxBucket to 300 for **Histogram8** will always fail as data can only \nhandle values between 0 .. 
255.\n\n\n#### Base\n\n- **uint8_t clear(float value = 0)** reset all bucket counters to value (default 0).\nReturns status, see below.\n- **uint8_t setBucket(const uint16_t index, int32_t value = 0)** store / overwrite the value of a bucket.\nReturns status, see below.\n- **uint8_t add(float value)** add a value, increase count of bucket.\nReturns status, see below.\n- **uint8_t sub(float value)** 'add' a value, decrease (subtract) count of bucket.\nThis is less used and has some side effects, see **frequency()**.\nReturns status, see below.\n\n\n|  Status            |  Value  |  Description  |\n|:------------------:|:-------:|:--------------|\n|  HISTO_OK          |  0x00   |  all is well  |\n|  HISTO_FULL        |  0x01   |  add() / sub() caused bucket full ( + or - )  |\n|  HISTO_ERR_FULL    |  0xFF   |  cannot add() / sub(), overflow / underflow  |\n|  HISTO_ERR_LENGTH  |  0xFE   |  length = 0 error (constructor)  |\n\n\n- **uint16_t size()** returns the number of buckets.\n- **uint32_t count()** returns the total number of values added (or subtracted).\n- **int32_t bucket(uint16_t index)** returns the count of a single bucket. 
\nCan be negative if one uses **sub()**.\n- **float frequency(uint16_t index)** returns the relative frequency of a bucket.\nThis is always between -1.0 and 1.0.\n\nSome notes about **frequency()**:\n- can return a negative value if an application uses **sub()**\n- sum of all buckets will not add up to 1.0 if one uses **sub()**\n- value (and thus sum) will deviate if **HISTO_ERR_FULL** has occurred.\n\n\n#### Helper functions\n\n- **uint16_t find(float value)** returns the index of the bucket for value.\n- **uint16_t findMin()** returns the (first) index of the bucket with the minimum value.\n- **uint16_t findMax()** returns the (first) index of the bucket with the maximum value.\n- **uint16_t countLevel(int32_t level)** returns the number of buckets with exactly that level (count).\n- **uint16_t countAbove(int32_t level)** returns the number of buckets above level.\n- **uint16_t countBelow(int32_t level)** returns the number of buckets below level.\n\n\n#### Probability Distribution Functions\n\nThere are three experimental functions:\n\n- **float PMF(float value)** Probability Mass Function. \nQuite similar to **frequency()**, but uses a value as parameter.\n- **float CDF(float value)** Cumulative Distribution Function. \nReturns the sum of frequencies \u003c= value. Always between 0.0 and 1.0.\n- **float VAL(float probability)** Value Function, is **CDF()** inverted. \nReturns the value of the original array for which the CDF is at least probability.\n- **int32_t sum()** returns the sum of all buckets. (not experimental).\nJust as with **frequency()** it is affected by the use of **sub()**,\nincluding returning a negative value.\n\nAs most Arduino sketches typically use a small number of buckets, these functions \nare quite coarse and/or inaccurate, so indicative at best.\nLinear interpolation within the \"last\" bucket needs to be investigated, however it\nintroduces its own uncertainty. 
An alternative is to add the last box for 50%.\n\nNote that **PDF()** is a continuous function and therefore not applicable in a discrete histogram.\n\n\n- https://en.wikipedia.org/wiki/Probability_mass_function  PMF()\n- https://en.wikipedia.org/wiki/Cumulative_distribution_function CDF() + VAL()\n- https://en.wikipedia.org/wiki/Probability_density_function  PDF()\n\n\n#### Experimental\n\nAn additional helper function.\n\n- **float saturation()** returns the **count()** / number of bins.\nIt is an indicator of how \"filled\" the histogram is.\n\nMight need to calculate the average level.\n\nNote: **findMax()** gives an indication for the topmost individual bucket.\n\n\n## Future\n\n\n#### Must\n\n- improve documentation\n\n#### Should\n\n- investigate performance - **find()** the right bucket. \n  - Binary search is faster (above 20)\n  - needs testing.\n  - mixed search, last part (\u003c 20) linear?\n- improve accuracy - linear interpolation for **PMF()**, **CDF()** and **VAL()**\n- performance - merge loops in **PMF()**\n- performance - reverse loops - compare to zero.\n\n\n#### Could\n\n- **saturation()** indication of the whole histogram\n  - count / number of bins?\n- percentage readOut == frequency()\n  - **float getBucketPercent(idx)**\n- template class \u003cbucketsizeType\u003e.\n\n\n#### Won't\n\n- merge bins\n- 2D histograms? e.g. positions on a grid.\n  - see SparseMatrix\n\n\n## Support\n\nIf you appreciate my libraries, you can support the development and maintenance.\nImprove the quality of the libraries by providing issues and Pull Requests, or\ndonate through PayPal or GitHub sponsors.\n\nThank you,\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FRobTillaart%2FHistogram","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FRobTillaart%2FHistogram","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FRobTillaart%2FHistogram/lists"}