https://github.com/oldcrow/libhmm
Modern C++17 Hidden Markov Model library with smart pointer memory management, comprehensive training algorithms, and critical bug fixes. Features Viterbi training, multiple probability distributions, and memory-safe implementation.
https://github.com/oldcrow/libhmm
baum-welch baum-welch-algorithm cmake cpp cpp17 hidden-markov-models machine-learning probability smart-pointers statistics viterbi-algorithm
Last synced: 6 months ago
JSON representation
Modern C++17 Hidden Markov Model library with smart pointer memory management, comprehensive training algorithms, and critical bug fixes. Features Viterbi training, multiple probability distributions, and memory-safe implementation.
- Host: GitHub
- URL: https://github.com/oldcrow/libhmm
- Owner: OldCrow
- License: mit
- Created: 2025-06-21T22:38:28.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-06-21T23:28:32.000Z (6 months ago)
- Last Synced: 2025-06-22T00:25:30.486Z (6 months ago)
- Topics: baum-welch, baum-welch-algorithm, cmake, cpp, cpp17, hidden-markov-models, machine-learning, probability, smart-pointers, statistics, viterbi-algorithm
- Language: C++
- Homepage:
- Size: 63.5 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
# libhmm - Modern C++17 Hidden Markov Model Library
[](https://isocpp.org/std/the-standard)
[](https://cmake.org/)
[](LICENSE)
[](https://github.com/OldCrow/libhmm/releases)
[](tests/)
[](src/performance/)
[](src/performance/thread_pool.cpp)
A modern, high-performance C++17 implementation of Hidden Markov Models with advanced statistical distributions, SIMD optimization, and parallel processing capabilities.
**๐ Latest Release v2.7.1**: Discrete distributions gold standard upgrade and HTK benchmarking enhancement release. Brings all 4 discrete distributions (Discrete, Binomial, Negative Binomial, Poisson) to gold standard with comprehensive exception handling, robust boundary value management, and consistent variable naming. Enhances benchmarking suite with clean discrete/continuous HTK separation and performance scaling analysis. Maintains 100% test pass rate while significantly improving discrete distribution reliability and numerical robustness.
## Major Achievements
โ
**Complete Boost dependency elimination** - Replaced all Boost.uBLAS and Boost.Serialization dependencies with custom C++17 implementations
โ
**Custom Matrix/Vector implementations** - SIMD-friendly contiguous memory layout with full API compatibility
โ
**Comprehensive 5-library benchmarking suite** - 100% numerical agreement across HMMLib, GHMM, StochHMM, HTK, and libhmm
โ
**Zero external dependencies** - C++17 standard library only, simplified build and deployment
โ
**Performance baseline established** - Validated numerical accuracy with comprehensive performance characterization across HMM libraries
## Features
### ๐ฏ **Training Algorithms**
- **Viterbi Training** - Segmented k-means with clustering
- **Segmented K-Means** - Alternative k-means implementation
- **Baum-Welch** - Standard expectation-maximization
- **Scaled Baum-Welch** - Numerically stable implementation
### ๐ **Probability Distributions**
**Discrete Distributions:**
- Discrete Distribution (categorical)
- Binomial Distribution (success/failure trials)
- Negative Binomial Distribution (success probability modeling)
- Poisson Distribution (count data and rare events)
**Continuous Distributions:**
- Gaussian (Normal) Distribution (symmetric, bell-curve)
- Beta Distribution (probabilities and proportions on [0,1])
- Gamma Distribution (positive continuous variables)
- Exponential Distribution (waiting times and reliability)
- Log-Normal Distribution (multiplicative processes)
- Pareto Distribution (power-law phenomena)
- Uniform Distribution (continuous uniform random variables)
- Weibull Distribution (reliability analysis and survival modeling)
- **Student's t-Distribution** (robust modeling with heavy tails)
- **Chi-squared Distribution** (goodness-of-fit and categorical analysis)
**Quality Standards:** All distributions are being progressively updated to meet the [Gold Standard Checklist](docs/GOLD_STANDARD_CHECKLIST.md) for consistency, robustness, and maintainability.
### ๐งฎ **Calculators**
**Forward-Backward Algorithms:**
- **Standard Forward-Backward** - Classic algorithm for probability computation
- **Scaled SIMD Forward-Backward** - Numerically stable with SIMD optimization and automatic CPU fallback
- **Log SIMD Forward-Backward** - Log-space computation with SIMD optimization and automatic CPU fallback
**Viterbi Algorithms:**
- **Standard Viterbi** - Most likely state sequence decoding
- **Scaled SIMD Viterbi** - Numerically stable Viterbi with SIMD optimization and automatic CPU fallback
- **Log SIMD Viterbi** - Log-space Viterbi with SIMD optimization and automatic CPU fallback
**Automatic Calculator Selection:**
- **AutoCalculator** - Intelligent algorithm selection based on problem characteristics
- **Performance Prediction** - CPU feature detection and optimal calculator selection
- **Traits-Based Selection** - Automatic fallback from SIMD to scalar implementations
### โก **Performance Optimizations**
- **SIMD Support**: AVX, SSE2, and ARM NEON vectorization
- **Thread Pool**: Modern C++17 work-stealing thread pool
- **Automatic Optimization**: CPU feature detection and algorithm selection
- **Memory Efficiency**: Aligned allocators and memory pools
- **Cache Optimization**: Blocked algorithms for large matrices
### ๐พ **I/O Support**
- XML file reading/writing
- Extensible file I/O manager
- Model serialization
### ๐งช **Testing Infrastructure**
- **Distribution Tests**: 14 comprehensive distribution tests in `tests/distributions/`
- **Calculator Tests**: 10 SIMD calculator and performance tests in `tests/calculators/`
- **Integration Tests**: Core HMM functionality testing
- **100% Test Coverage**: All distributions with complete functionality testing
- **CMake/CTest Integration**: Automated testing framework
- **Continuous Validation**: Parameter fitting, edge cases, and error handling
- **Code Quality Enforcement**: clang-tidy integration for automated style guide compliance
### ๐ **Benchmarking Suite**
- **Multi-Library Validation**: Integration with HMMLib, GHMM, StochHMM, HTK
- **Numerical Accuracy**: 100% agreement at machine precision across libraries
- **Performance Baseline**: Comprehensive performance characterization
- **Compatibility Documentation**: Complete integration guides and fixes
## Quick Start
### Building with CMake
```bash
# Clone and build
git clone
cd libhmm
mkdir build && cd build
cmake ..
make -j$(nproc)
# Run tests
ctest
# Install
sudo make install
```
**Cross-Platform Support**: For detailed cross-platform building instructions including macOS, Linux, and configuration options, see [Cross-Platform Build Guide](docs/CROSS_PLATFORM.md).
### Building with Make (Legacy)
```bash
make
```
### Basic Usage
```cpp
#include
// Create HMM with 2 states and discrete observations
auto hmm = std::make_unique(2, 6); // 2 states, 6 observation symbols
// Set up training data
ObservationLists trainingData = { /* your observation sequences */ };
// Train using Baum-Welch
BaumWelchTrainer trainer(hmm.get(), trainingData);
trainer.train();
// Decode most likely state sequence
ViterbiCalculator viterbi(hmm.get(), observations);
StateSequence states = viterbi.decode();
```
### Examples
See the [examples/](examples/) directory for comprehensive usage examples:
- **[basic_hmm_example.cpp](examples/basic_hmm_example.cpp)** - Basic HMM usage with modern C++17 features
- **[poisson_hmm_example.cpp](examples/poisson_hmm_example.cpp)** - Website traffic modeling using Poisson distribution
- **[financial_hmm_example.cpp](examples/financial_hmm_example.cpp)** - Market volatility modeling with Beta and Log-Normal distributions
- **[reliability_hmm_example.cpp](examples/reliability_hmm_example.cpp)** - Component lifetime analysis with Weibull and Exponential distributions
- **[quality_control_hmm_example.cpp](examples/quality_control_hmm_example.cpp)** - Manufacturing process monitoring with Binomial and Uniform distributions
- **[economics_hmm_example.cpp](examples/economics_hmm_example.cpp)** - Economic modeling with Negative Binomial and Pareto distributions
- **[queuing_theory_hmm_example.cpp](examples/queuing_theory_hmm_example.cpp)** - Service systems and queuing analysis with Poisson, Exponential, and Gamma distributions
- **[robust_financial_hmm_example.cpp](examples/robust_financial_hmm_example.cpp)** - Heavy-tailed financial modeling with Student's t-distribution
- **[statistical_process_control_hmm_example.cpp](examples/statistical_process_control_hmm_example.cpp)** - Advanced quality control with Chi-squared distribution
- **[swarm_coordination_example.cpp](examples/swarm_coordination_example.cpp)** - Discrete state drone swarm coordination and formation control
## Project Structure
```
libhmm/
โโโ include/libhmm/ # Public headers
โ โโโ calculators/ # Algorithm implementations
โ โโโ distributions/ # Probability distributions
โ โโโ training/ # Training algorithms
โ โโโ io/ # File I/O support
โ โโโ common/ # Utilities and types
โโโ src/ # Implementation files
โโโ tests/ # Test suite
โโโ examples/ # Usage examples
โโโ docs/ # Documentation
โโโ CMakeLists.txt # Modern build system
```
## Requirements
- **C++17** compatible compiler (GCC 7+, Clang 6+, MSVC 2017+)
- **CMake 3.15+** (for CMake builds)
- **Make** (for legacy Makefile builds)
**Zero External Dependencies** - libhmm now requires only the C++17 standard library!
## Documentation
- [API Documentation](docs/api/)
- [Tutorials](docs/tutorials/)
- [Examples](examples/)
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes with tests
4. Follow the [Gold Standard Checklist](docs/GOLD_STANDARD_CHECKLIST.md) for distribution implementations
5. Ensure code complies with the project style guide (enforced via clang-tidy)
6. Submit a pull request
## Acknowledgments
- Original implementation modernized to C++17 standards
- Inspired by JAHMM and other HMM libraries
- Mathematical foundations from Rabiner & Juang tutorials