https://github.com/statsim/confusionmatrix
Generate Confusion Matrix and Evaluation Metrics Online
classification confusion-matrix machine-learning
- Host: GitHub
- URL: https://github.com/statsim/confusionmatrix
- Owner: statsim
- Created: 2024-11-19T22:21:41.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-02T18:06:02.000Z (over 1 year ago)
- Last Synced: 2025-01-22T16:16:01.125Z (over 1 year ago)
- Topics: classification, confusion-matrix, machine-learning
- Language: HTML
- Homepage: https://statsim.com/confusionmatrix/
- Size: 11.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Generate Confusion Matrix and Evaluation Metrics Online
*A confusion matrix is a useful tool for evaluating the performance of classification models. It provides a breakdown of predicted versus actual outcomes, allowing for a deeper understanding of model performance beyond just accuracy.*
---
## **Why Is Accuracy Not Enough for Binary Classification?**
Accuracy is the proportion of correctly classified instances among all instances:
```latex
\text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{Total}}
```
Where $$\text{TP}$$ is the number of true positives, $$\text{TN}$$ is the number of true negatives, and $$\text{Total}$$ is the total number of instances.
### **Limitations of Accuracy as a Metric**
- **Class Imbalance.** In datasets with imbalanced classes (e.g., 95% negatives and 5% positives), a model that always predicts the majority class (negative) reaches 95% accuracy yet never identifies a single minority-class (positive) case.
- **No Insight into Error Types.** Accuracy does not distinguish between types of errors (e.g., false positives vs. false negatives), which can have vastly different implications in real-world scenarios.
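
The class-imbalance limitation can be checked with a few lines of Python. This is a minimal sketch on a made-up dataset (95 negatives, 5 positives) and a hypothetical "model" that always predicts the negative class. The variable names are illustrative and not part of the tool.

```python
# Hypothetical dataset: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A "model" that always predicts the majority (negative) class.
y_pred = [0] * 100

# Accuracy: share of predictions that match the actual label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Number of actual positives that were predicted as positive.
detected = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)

print(f"accuracy = {accuracy:.2f}")
print(f"positives detected = {detected} of {sum(y_true)}")
```

The accuracy looks strong even though none of the five positive cases is found, which is the gap the metrics below are meant to expose.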
---
## **Confusion Matrix Terminology**
| | **Predicted Positive** | **Predicted Negative** |
|----------------|-------------------------|-------------------------|
| **Actual Positive** | True Positive (TP): Correctly identified positive cases | False Negative (FN): Missed positive cases |
| **Actual Negative** | False Positive (FP): Incorrectly predicted positive cases | True Negative (TN): Correctly identified negative cases |
### **Key Terms**:
- **TP (True Positive)**: Correctly predicted positive instances.
- **TN (True Negative)**: Correctly predicted negative instances.
- **FP (False Positive)**: Negative instances incorrectly predicted as positive.
- **FN (False Negative)**: Positive instances incorrectly predicted as negative.
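
As an illustration of these four terms, the sketch below tallies them from paired lists of actual and predicted labels. The function name `confusion_counts`, the `positive` parameter, and the sample labels are assumptions made for this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally TP, TN, FP, FN for a binary problem.

    Any label not equal to `positive` is treated as negative.
    """
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Made-up labels for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
```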
---
## **Metrics and Their Meaning**
### **1. Sensitivity (True Positive Rate, TPR)**
```latex
\text{TPR} = \frac{\text{TP}}{\text{TP} + \text{FN}}
```
- **Definition**: Proportion of actual positives correctly identified.
- **Importance**: Measures the ability to detect positive cases (useful in medical diagnosis, fraud detection).
---
### **2. Specificity (SPC)**
```latex
\text{SPC} = \frac{\text{TN}}{\text{FP} + \text{TN}}
```
- **Definition**: Proportion of actual negatives correctly identified.
- **Importance**: Measures the ability to avoid false alarms (useful in spam filters, anomaly detection).
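
Sensitivity and specificity each use only one row of the confusion matrix. The sketch below computes both from hypothetical counts. The helper `safe_div` and its choice to return `None` for an empty row are conventions chosen for this example, not something the tool prescribes.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or None if the metric is undefined."""
    return numerator / denominator if denominator else None


def sensitivity(tp, fn):
    """TPR = TP / (TP + FN): share of actual positives that were found."""
    return safe_div(tp, tp + fn)


def specificity(tn, fp):
    """SPC = TN / (FP + TN): share of actual negatives that were found."""
    return safe_div(tn, fp + tn)


# Hypothetical counts: 50 actual positives, 50 actual negatives.
tpr = sensitivity(tp=40, fn=10)
spc = specificity(tn=45, fp=5)
print(f"TPR = {tpr:.2f}, SPC = {spc:.2f}")  # TPR = 0.80, SPC = 0.90
```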
---
### **3. Precision (Positive Predictive Value, PPV)**
```latex
\text{PPV} = \frac{\text{TP}}{\text{TP} + \text{FP}}
```
- **Definition**: Proportion of positive predictions that are correct.
- **Importance**: Indicates reliability of positive predictions.
---
### **4. Negative Predictive Value (NPV)**
```latex
\text{NPV} = \frac{\text{TN}}{\text{TN} + \text{FN}}
```
- **Definition**: Proportion of negative predictions that are correct.
- **Importance**: Indicates reliability of negative predictions.
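
Unlike sensitivity and specificity, the two predictive values use the columns of the confusion matrix, so they shift with the class balance of the data. The sketch below applies the PPV and NPV formulas to two made-up confusion matrices that share the same sensitivity (0.8) and specificity (0.9) but differ in the share of positives. All counts are hypothetical.

```python
def predictive_values(tp, fn, fp, tn):
    """Return (PPV, NPV); assumes both denominators are nonzero."""
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Hypothetical balanced data: 50 positives, 50 negatives.
balanced = predictive_values(tp=40, fn=10, fp=5, tn=45)
# Hypothetical imbalanced data: 50 positives, 950 negatives,
# with the same sensitivity (40/50) and specificity (855/950).
imbalanced = predictive_values(tp=40, fn=10, fp=95, tn=855)

print("balanced   PPV=%.3f NPV=%.3f" % balanced)    # PPV=0.889 NPV=0.818
print("imbalanced PPV=%.3f NPV=%.3f" % imbalanced)  # PPV=0.296 NPV=0.988
```

In this example precision drops sharply once negatives dominate, even though the classifier's per-class rates are unchanged.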
---
### **5. False Positive Rate (FPR)**
```latex
\text{FPR} = \frac{\text{FP}}{\text{FP} + \text{TN}}
```
- **Definition**: Proportion of actual negatives incorrectly predicted as positive.
- **Importance**: Highlights the rate of false alarms.
---
### **6. False Discovery Rate (FDR)**
```latex
\text{FDR} = \frac{\text{FP}}{\text{FP} + \text{TP}}
```
- **Definition**: Proportion of positive predictions that are incorrect.
- **Importance**: The complement of precision (FDR = 1 − PPV); it shows what share of positive predictions are false alarms.
---
### **7. False Negative Rate (FNR)**
```latex
\text{FNR} = \frac{\text{FN}}{\text{FN} + \text{TP}}
```
- **Definition**: Proportion of actual positives incorrectly predicted as negative.
- **Importance**: Highlights the rate of missed positive cases.
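
The three error rates in sections 5 to 7 are the complements of specificity, precision, and sensitivity, because each pair shares a denominator and its numerators add up to that denominator. A short derivation from the formulas above:

```latex
\text{FPR} + \text{SPC} = \frac{\text{FP}}{\text{FP} + \text{TN}} + \frac{\text{TN}}{\text{FP} + \text{TN}} = 1
\quad\Rightarrow\quad \text{FPR} = 1 - \text{SPC}

\text{FDR} + \text{PPV} = \frac{\text{FP}}{\text{FP} + \text{TP}} + \frac{\text{TP}}{\text{TP} + \text{FP}} = 1
\quad\Rightarrow\quad \text{FDR} = 1 - \text{PPV}

\text{FNR} + \text{TPR} = \frac{\text{FN}}{\text{FN} + \text{TP}} + \frac{\text{TP}}{\text{TP} + \text{FN}} = 1
\quad\Rightarrow\quad \text{FNR} = 1 - \text{TPR}
```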
---
### **8. F1 Score**
```latex
\text{F1} = \frac{2 \cdot \text{TP}}{2 \cdot \text{TP} + \text{FP} + \text{FN}}
```
- **Definition**: Harmonic mean of precision (PPV) and recall, where recall is another name for sensitivity (TPR).
- **Importance**: Balances precision and recall, especially useful in imbalanced datasets.
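
The count-based formula and the harmonic-mean definition give the same value. The sketch below checks this on hypothetical counts. The function names are illustrative.

```python
import math


def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the count-based form."""
    return 2 * tp / (2 * tp + fp + fn)


def f1_from_rates(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts.
tp, fp, fn = 40, 5, 10
precision = tp / (tp + fp)
recall = tp / (tp + fn)

f1_counts = f1_from_counts(tp, fp, fn)
f1_rates = f1_from_rates(precision, recall)
print(f"{f1_counts:.4f} {f1_rates:.4f}")  # 0.8421 0.8421
print(math.isclose(f1_counts, f1_rates))  # True
```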
---
### **9. Accuracy**
```latex
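\text{ACC} = \frac{\text{TP} + \text{TN}}{\text{P} + \text{N}} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}}
```
- **Definition**: Proportion of correct predictions, where $$\text{P} = \text{TP} + \text{FN}$$ is the number of actual positives and $$\text{N} = \text{FP} + \text{TN}$$ is the number of actual negatives.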
- **Importance**: Provides a general sense of model performance but is less reliable in imbalanced datasets.
---
### **10. Matthews Correlation Coefficient (MCC)**
```latex
\text{MCC} = \frac{\text{TP} \cdot \text{TN} - \text{FP} \cdot \text{FN}}{\sqrt{(\text{TP} + \text{FP})(\text{TP} + \text{FN})(\text{TN} + \text{FP})(\text{TN} + \text{FN})}}
```
- **Definition**: A balanced measure that accounts for TP, TN, FP, and FN.
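- **Importance**: Considered a robust metric for imbalanced datasets. Ranges from -1 (every prediction wrong) to +1 (perfect prediction), with 0 indicating performance no better than random guessing. It is undefined when any row or column of the confusion matrix sums to zero, because the denominator is then zero.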
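
A minimal sketch of the MCC formula follows. Returning `None` when the denominator is zero is a convention chosen for this example; other tools may report a different value in that case. All counts are hypothetical.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient, or None when the denominator is zero."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else None


typical = mcc(tp=40, tn=45, fp=5, fn=10)
perfect = mcc(tp=50, tn=50, fp=0, fn=0)
inverse = mcc(tp=0, tn=0, fp=50, fn=50)
# Always predicting negative on 95 negatives and 5 positives:
# no positive predictions, so (TP + FP) = 0 and MCC is undefined.
always_negative = mcc(tp=0, tn=95, fp=0, fn=5)

print(f"typical = {typical:.3f}")
print(perfect, inverse, always_negative)  # 1.0 -1.0 None
```

The always-negative classifier, which scores 95% accuracy on that data, gets no MCC credit at all, which illustrates why MCC is often preferred on imbalanced datasets.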
---
## **Key Takeaways**
1. **Class Imbalance Requires Caution**: Accuracy alone can be misleading when classes are imbalanced.
2. **Use Multiple Metrics**: Evaluate sensitivity, specificity, precision, and other metrics to understand the trade-offs in your model.
3. **MCC for Imbalanced Datasets**: Use the Matthews Correlation Coefficient when you need a single comprehensive measure.
4. **Domain-Specific Importance**: Choose metrics based on the problem domain (e.g., sensitivity for medical tests, precision for legal applications).
---
By understanding and applying these metrics, you can better assess and improve your model's performance, ensuring it meets the requirements of the task at hand.