https://github.com/duongnmanh/ibm_ai_issue
My learning of AI
artificial-intelligence learning-materials
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/duongnmanh/ibm_ai_issue
- Owner: DuongNManh
- Created: 2024-12-03T16:05:45.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-20T08:38:29.000Z (over 1 year ago)
- Last Synced: 2025-06-05T13:47:39.101Z (about 1 year ago)
- Topics: artificial-intelligence, learning-materials
- Homepage:
- Size: 26.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
# 🌟 **IBM AI Issue**
## 🚀 **Python Library for Machine Learning**
The `scikit-learn` package allows you to complete machine learning tasks with just a few lines of code.
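
As a minimal sketch of what "a few lines of code" can look like, the example below fits a classifier and makes predictions. The bundled iris dataset and the choice of `KNeighborsClassifier` are assumptions made for illustration; they do not come from these notes.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Load a small labeled dataset that ships with scikit-learn (illustrative choice).
X, y = load_iris(return_X_y=True)

# Create the model, fit it to the labeled data, then predict.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X, y)
predictions = model.predict(X[:3])  # predicted class labels for the first 3 samples
print(predictions)
```

Most scikit-learn estimators follow the same `fit` then `predict` pattern, so swapping in another model usually changes only the line that creates it.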

---
## 🤖 **Supervised and Unsupervised Algorithms**

---
### 📘 **Supervised Learning**
Supervised learning involves training a model using labeled data. For example, consider a **cancer dataset**:

#### 🧠 **Types of Supervised Learning**
1. **Classification**
Classification predicts discrete categories or classes for a given input.

2. **Regression**
Regression predicts continuous values based on input data.
It uses the **independent variables** to determine the continuous value of the **dependent variable**.
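
The difference between the two types can be sketched on the same input. The toy data below (hours studied, pass/fail labels, exam scores) is hypothetical, and `LogisticRegression` and `LinearRegression` are just one possible pair of models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical independent variable: hours studied.
hours = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])

# Classification: the target is a discrete class (0 = fail, 1 = pass).
passed = np.array([0, 0, 0, 1, 1, 1])
classifier = LogisticRegression()
classifier.fit(hours, passed)
predicted_class = classifier.predict([[5.5]])  # a class label

# Regression: the target (dependent variable) is a continuous value.
scores = np.array([52.0, 58.0, 61.0, 70.0, 74.0, 81.0])
regressor = LinearRegression()
regressor.fit(hours, scores)
predicted_score = regressor.predict([[5.5]])  # a real number

print(predicted_class, predicted_score)
```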



- Types of regression models:
  - Simple Regression: Simple Linear & Non-linear Regression
  - Multiple Regression: Multiple Linear & Non-linear Regression
  - Linear Regression: Simple Linear & Multiple Linear Regression

---
- Simple Linear Regression
  1. Simple linear regression representation
  2. Finding the best-fit line
  3. Estimating the parameters
  4. Predicting with linear regression

- Multiple Linear Regression
  1. Multiple linear regression representation
  2. Exposing the errors in the model
  3. Estimating the parameters
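
The steps listed above can be sketched with NumPy. The data and the parameter names (`theta0`, `theta1`, `theta`) are assumptions chosen for illustration, and ordinary least squares is used here as one common way to estimate the parameters.

```python
import numpy as np

# Hypothetical data: one independent variable x and one dependent variable y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

# Representation of simple linear regression: y_hat = theta0 + theta1 * x.
# Least-squares estimates of the two parameters:
theta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
theta0 = y.mean() - theta1 * x.mean()

# "Best fit" here means the line that minimizes the mean squared error (MSE).
y_hat = theta0 + theta1 * x
mse = np.mean((y - y_hat) ** 2)

# Predict the dependent variable for a new value of x.
prediction = theta0 + theta1 * 6.0

# Representation of multiple linear regression:
# y_hat = theta0 + theta1 * x1 + theta2 * x2.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]])
# Prepend a column of ones so the intercept is estimated with the other parameters.
X_design = np.column_stack([np.ones(len(X)), X])
theta, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# Errors (residuals) of the fitted multiple regression model.
residuals = y - X_design @ theta

print(theta0, theta1, mse, prediction, theta)
```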


---
**Question:**

- Application:

---
### 📙 **Unsupervised Learning**
Unsupervised learning works with unlabeled data to find hidden patterns or structures in the dataset.

#### 🧩 **Dataset for Unsupervised Learning**
Unsupervised learning uses **unlabeled data**:

---
#### 🔍 **Types of Unsupervised Learning**
1. **Clustering**
Group similar data points into clusters.

2. **Dimensionality Reduction**
Reduce the number of variables while retaining essential information.

---
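
Both types of unsupervised learning can be sketched on the same unlabeled data. The synthetic points below are hypothetical, and `KMeans` and `PCA` are used as one example of each type, not as the only options.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Hypothetical unlabeled data: two well-separated groups of points in 3-D.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.5, size=(20, 3))
group_b = rng.normal(loc=5.0, scale=0.5, size=(20, 3))
X = np.vstack([group_a, group_b])

# Clustering: assign each point to one of two clusters without using any labels.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Dimensionality reduction: project the 3 variables onto 2 principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(labels)
print(X_reduced.shape)
```
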
#### 🧠 **Model evaluation**
**1. Calculate the accuracy of the model (how well the model predicts on an unknown dataset)**
- **Using a portion of the dataset**: train the model on the entire labeled dataset, then test it on a portion of that same dataset, hiding the labels and comparing the model's predictions with the actual values.


- **Training & Out-of-Sample Accuracy**
  - **Training Accuracy**: percentage of correct predictions the model makes on the same data it was trained on.
    - Training and testing on the same dataset produces a high training accuracy, which can be a sign of overfitting rather than of a good model.

  - **Out-of-Sample Accuracy**: percentage of correct predictions the model makes on data it was not trained on.

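- **Train/test split evaluation approach**: split the dataset into separate training and test sets, train on the first and evaluate on the second. Because the test data is not used for training, this gives a more realistic estimate of the model's out-of-sample accuracy and helps expose overfitting.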


- **K-fold cross-validation**: split the dataset into K equally sized subsets (folds). In each of K rounds, the model is trained on K-1 folds and tested on the remaining fold, so every fold is used for testing exactly once.
  Result = average of the K test accuracies.
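
The three evaluation approaches above can be compared side by side. This is a minimal sketch: the iris dataset, the `DecisionTreeClassifier`, the 80/20 split and `K = 5` are all assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Train and test on the same dataset: this measures training accuracy.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)
training_accuracy = model.score(X, y)

# Train/test split: hold out 20% of the data that the model never sees in training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
split_model = DecisionTreeClassifier(random_state=0)
split_model.fit(X_train, y_train)
test_accuracy = split_model.score(X_test, y_test)  # estimate of out-of-sample accuracy

# K-fold cross-validation with K = 5: one accuracy per fold, then the average.
# Note: for classifiers, an integer cv makes scikit-learn use stratified folds.
fold_accuracies = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
cv_accuracy = fold_accuracies.mean()

print(training_accuracy, test_accuracy, cv_accuracy)
```

The training accuracy is typically the most optimistic of the three numbers, since the model is scored on data it has already seen.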
