https://github.com/rushhaabhhh/ml-learning

Python implementations of core ML algorithms like linear and logistic regression, gradient descent, and model evaluation metrics to deepen understanding of machine learning principles.

case-study evaluation-metrics linear-regression logistic-regression machine-learning

Host: GitHub
URL: https://github.com/rushhaabhhh/ml-learning
Owner: Rushhaabhhh
Created: 2024-12-14T12:06:31.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-01-21T16:39:18.000Z (over 1 year ago)
Last Synced: 2025-02-08T03:33:08.238Z (over 1 year ago)
Topics: case-study, evaluation-metrics, linear-regression, logistic-regression, machine-learning
Language: Jupyter Notebook
Homepage:
Size: 3.78 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Machine Learning Concepts

## 📚 Fundamental Concepts

### Supervised Learning
Supervised learning is a type of machine learning where the model is trained on labeled data, meaning that each training example is paired with an output label. The goal is to learn a mapping from inputs to outputs.

#### Types of Supervised Learning:
1. **Classification**: The task of predicting a discrete label (e.g., spam or not spam).
- Example: Predicting whether an email is spam or not.

2. **Regression**: The task of predicting a continuous value (e.g., predicting house prices).
- Example: Predicting the price of a car based on features like make, model, and year.

---

### Model Evaluation Metrics

#### Classification Metrics
When evaluating classification models, several metrics help measure their performance:

1. **Accuracy**:
- **Formula**:
```
Accuracy = (TP + TN) / (TP + TN + FP + FN)
```
- **Definition**: The proportion of correct predictions (both true positives and true negatives) out of all predictions.

2. **Precision**:
- **Formula**:
```
Precision = TP / (TP + FP)
```
- **Definition**: The proportion of positive predictions that are actually correct.

3. **Recall**:
- **Formula**:
```
Recall = TP / (TP + FN)
```
- **Definition**: The proportion of actual positives that are correctly identified by the model.

4. **F1-Score**:
- **Formula**:
```
F1-Score = 2 * (Precision * Recall) / (Precision + Recall)
```
- **Definition**: The harmonic mean of precision and recall, providing a balance between the two.

5. **F-beta Score**:
- **Formula**:
```
F-beta = (1 + β²) * (Precision * Recall) / ((β² * Precision) + Recall)
```
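- **Definition**: A generalization of the F1-score that controls the balance between precision and recall through the parameter β. A value of β > 1 weights recall more heavily, β < 1 weights precision more heavily, and β = 1 recovers the F1-score.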

6. **Confusion Matrix**:
- A table that describes the performance of a classification model by comparing the actual and predicted values:
- **True Positives (TP)**: Correctly predicted positive cases.
- **True Negatives (TN)**: Correctly predicted negative cases.
- **False Positives (FP)**: Incorrectly predicted positive cases.
- **False Negatives (FN)**: Incorrectly predicted negative cases.

**Confusion Matrix Structure**:
```
Predicted \ Actual | Positive | Negative
-------------------|----------|---------
Positive           | TP       | FP
Negative           | FN       | TN
```
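
The formulas above can be checked by hand on a small example. The following is a minimal sketch in plain Python, assuming binary labels where `1` is the positive class. The function names `confusion_counts` and `classification_metrics` and the sample labels are illustrative and do not come from the repository. Returning `0.0` when a denominator is zero is a convention chosen for this sketch.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary classification problem."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1  # predicted positive, actually negative
        elif actual == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred, beta=1.0):
    """Compute accuracy, precision, recall, and F-beta (F1 when beta=1)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    # Assumption for this sketch: a metric with a zero denominator is 0.0.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta**2 * precision + recall
    f_beta = (1 + beta**2) * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_beta": f_beta,
    }


# Illustrative labels: 3 TP, 3 TN, 1 FP, 1 FN.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(confusion_counts(y_true, y_pred))
print(metrics)
```

With these labels, precision and recall are both 3/4, so the F-beta score is 0.75 for any β.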

---

#### Regression Metrics

1. **Mean Squared Error (MSE)**:
- **Formula**:
```
MSE = (1/n) * Σ(y_i - ŷ_i)²
```
- **Definition**: The average of the squared differences between the actual values `y_i` and the predicted values `ŷ_i` over `n` samples.

2. **Root Mean Squared Error (RMSE)**:
- **Formula**:
```
RMSE = √(MSE)
```
- **Definition**: The square root of MSE, which gives an error metric in the same units as the target variable.

3. **Mean Absolute Error (MAE)**:
- **Formula**:
```
MAE = (1/n) * Σ|y_i - ŷ_i|
```
- **Definition**: The average of the absolute differences between the actual and predicted values.

4. **R² Score**:
- **Formula**:
```
R² = 1 - (Σ(y_i - ŷ_i)² / Σ(y_i - ȳ)²)
```
- **Definition**: Measures the proportion of the variation in the target variable that the model explains, where `ȳ` is the mean of the actual values. A value closer to 1 indicates a better model fit.

5. **Adjusted R² Score**:
- **Formula**:
```
R²_adj = 1 - ((1 - R²) * (n-1) / (n-p-1))
```
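- **Definition**: A modified version of R² that adjusts for the number of predictors in the model, where `n` is the number of samples and `p` is the number of predictors. Unlike R², it can decrease when a predictor that does not improve the fit is added, which helps flag models that are overfitting.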
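
As a worked example of the regression formulas above, here is a minimal sketch in plain Python. The function name `regression_metrics`, the parameter `n_predictors` (standing in for `p`), and the sample values are illustrative and not taken from the repository. The sketch assumes the actual values are not all identical and that `n > p + 1`, since otherwise a denominator is zero.

```python
import math


def regression_metrics(y_true, y_pred, n_predictors=1):
    """Compute MSE, RMSE, MAE, R^2, and adjusted R^2 for paired values."""
    n = len(y_true)
    residuals = [y - y_hat for y, y_hat in zip(y_true, y_pred)]

    ss_res = sum(r**2 for r in residuals)            # sum of squared residuals
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mae = sum(abs(r) for r in residuals) / n

    y_mean = sum(y_true) / n
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot                         # assumes ss_tot != 0
    # Assumes n > n_predictors + 1 so the denominator is positive.
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

    return {
        "mse": mse,
        "rmse": rmse,
        "mae": mae,
        "r2": r2,
        "adjusted_r2": adjusted_r2,
    }


# Illustrative data: residuals are 0.5, -0.5, 0.0, 1.0, -1.0.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.5, 5.5, 7.0, 8.0, 12.0]
metrics = regression_metrics(y_true, y_pred, n_predictors=1)
print(metrics)
```

Here the squared residuals sum to 2.5 and the total sum of squares is 40, so MSE is 0.5 and R² is 0.9375. The adjusted value is slightly lower because it accounts for the single predictor.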

---

### Linear Regression
Linear regression models the relationship between a target variable and one or more feature variables by fitting a linear equation to the observed data. With a single feature, the fitted equation is a straight line.

#### Formula for Linear Regression:
```
y = mx + b
```
Where:
- `y` is the target variable
- `x` is the feature variable
- `m` is the slope (coefficient)
- `b` is the y-intercept
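
The section above does not say how `m` and `b` are estimated. One common choice is ordinary least squares, which has a closed-form solution for a single feature. The sketch below assumes that method. The function name `fit_simple_linear_regression` and the data points are illustrative only.

```python
def fit_simple_linear_regression(xs, ys):
    """Estimate slope m and intercept b by ordinary least squares.

    Assumes xs contains at least two distinct values.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    m = s_xy / s_xx

    # The fitted line passes through the point (x_mean, y_mean).
    b = y_mean - m * x_mean
    return m, b


# Illustrative points that lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
m, b = fit_simple_linear_regression(xs, ys)
print(m, b)
```

Because the sample points lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1. Gradient descent, which the repository description mentions, is an iterative alternative that should approach the same values on this data.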

**Multiple Linear Regression**:
```
y = β₀ + β₁x₁ + β₂x₂ + ... + βₙxₙ + ε
```
Where:
- `β₀` is the intercept
- `β₁, β₂, ..., βₙ` are the coefficients
- `x₁, x₂, ..., xₙ` are the independent variables
- `ε` is the error term
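
Once the coefficients are known, a prediction is the intercept plus a weighted sum of the features. The sketch below shows only that step. The function name `predict_linear` and the coefficient values are hypothetical, and the error term `ε` is left out because it is unobserved noise rather than part of the prediction.

```python
def predict_linear(intercept, coefficients, features):
    """Return β₀ + β₁x₁ + ... + βₙxₙ for one sample."""
    if len(coefficients) != len(features):
        raise ValueError("coefficients and features must have the same length")
    return intercept + sum(beta * x for beta, x in zip(coefficients, features))


# Hypothetical model: y = 1.0 + 2.0*x1 - 0.5*x2
intercept = 1.0
coefficients = [2.0, -0.5]
prediction = predict_linear(intercept, coefficients, [3.0, 4.0])
print(prediction)  # 1.0 + 6.0 - 2.0 = 5.0
```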

---

### Logistic Regression
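Logistic regression is used for binary classification tasks. It passes a linear combination of the features through the sigmoid function, so the output lies between 0 and 1 and is interpreted as the probability that the sample belongs to the positive class (`y = 1`).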

#### Formula for Logistic Regression:
1. **Sigmoid Function**:
```
σ(z) = 1 / (1 + e^(-z))
```

2. **Logistic Regression Equation**:
```
P(y=1) = σ(β₀ + β₁x₁ + β₂x₂ + ... + βₙxₙ)
```
Where:
- `z = β₀ + β₁x₁ + β₂x₂ + ... + βₙxₙ`
- `σ` is the sigmoid function
- `β₀` is the intercept
- `β₁, β₂, ..., βₙ` are the coefficients
- `x₁, x₂, ..., xₙ` are the independent variables
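
The two formulas above translate directly into code. The following is a minimal sketch in plain Python. The function names `sigmoid`, `predict_proba`, and `predict_label` are illustrative, the coefficient values are hypothetical, and the 0.5 decision threshold is an assumption of this sketch rather than something the section specifies.

```python
import math


def sigmoid(z):
    """Compute 1 / (1 + e^(-z)) without overflowing for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z).
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_proba(intercept, coefficients, features):
    """Return P(y=1) = σ(β₀ + β₁x₁ + ... + βₙxₙ) for one sample."""
    z = intercept + sum(beta * x for beta, x in zip(coefficients, features))
    return sigmoid(z)


def predict_label(intercept, coefficients, features, threshold=0.5):
    """Turn the probability into a class label (threshold is an assumption)."""
    return 1 if predict_proba(intercept, coefficients, features) >= threshold else 0


# Hypothetical coefficients: z = 0.0 + 1.0*x1 - 1.0*x2
intercept = 0.0
coefficients = [1.0, -1.0]
print(predict_proba(intercept, coefficients, [2.0, 2.0]))  # z = 0, so 0.5
print(predict_label(intercept, coefficients, [3.0, 1.0]))  # z = 2, so class 1
```

The branch on the sign of `z` keeps `math.exp` from being called with a large positive argument, which would otherwise raise `OverflowError`.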

---
