https://github.com/emna-chebbi/iris-dataset
The Iris dataset contains four features, which I used to classify flowers into one of three species based on their measurements.
- Host: GitHub
- URL: https://github.com/emna-chebbi/iris-dataset
- Owner: Emna-chebbi
- Created: 2024-11-10T13:06:00.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-11-10T13:20:11.000Z (over 1 year ago)
- Last Synced: 2025-01-11T07:54:10.328Z (over 1 year ago)
- Topics: decision-tree-classifier, decision-trees, iris-dataset, iris-flower-classification, machine-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 208 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Iris-dataset
The Iris dataset consists of 150 samples from three species of iris flowers: Setosa, Versicolor, and Virginica.
For each sample, the dataset includes four features of the flowers:
- Sepal length
- Sepal width
- Petal length
- Petal width

Each feature is measured in centimeters.
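
As a quick check of the description above, the dataset can be loaded from scikit-learn and inspected. This is a minimal sketch, not the notebook's exact code; the variable names `iris`, `X`, and `y` are chosen here for illustration.

```python
from sklearn.datasets import load_iris

# Load the bundled Iris dataset (no download needed).
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)             # (150, 4): 150 samples, 4 features
print(iris.feature_names)  # the four measurements, each in cm
print(iris.target_names)   # ['setosa' 'versicolor' 'virginica']
```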
The tasks I completed:
- Loaded the Iris dataset from scikit-learn using load_iris() and extracted the features (X) and target labels (y).
- Split the dataset into training and testing sets using train_test_split() to evaluate the model's performance.
- Trained a Decision Tree Classifier using DecisionTreeClassifier() from sklearn, fitting it to the training data (X_train, y_train).
- Predicted the class of a new flower by providing input features (e.g., sepal length, sepal width, petal length, petal width) to the trained model using clf.predict().
- Output the predicted iris species by mapping the predicted class index to the species name from iris.target_names.
- Calculated and displayed the model's accuracy using accuracy_score() by comparing the predicted labels on the test set (X_test) to the actual labels (y_test).
- Visualized the decision tree using plot_tree() to understand how the model splits the data.
- Customized the visualization with matplotlib by setting a larger figure size (figsize=(15,12)).
- Enhanced the tree plot by adding feature names ("Sepal Length", "Sepal Width", "Petal Length", "Petal Width") and class names ("Setosa", "Versicolor", "Virginica") for better clarity.
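
The steps above can be sketched roughly as follows. This is an illustrative reconstruction, not the notebook's exact code. The split ratio (`test_size=0.2`), the seed (`random_state=42`), and the sample measurements in `new_flower` are assumptions added for this example.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Load features (X) and target labels (y).
iris = load_iris()
X, y = iris.data, iris.target

# Hold out part of the data for evaluation.
# test_size and random_state are assumed values, not taken from the notebook.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a decision tree to the training data.
clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# Predict the species of one new flower.
# Order: sepal length, sepal width, petal length, petal width (cm).
new_flower = [[5.1, 3.5, 1.4, 0.2]]  # hypothetical measurements
predicted_index = clf.predict(new_flower)[0]
species = iris.target_names[predicted_index]
print(f"Predicted species: {species}")

# Compare predictions on the test set with the true labels.
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Accuracy: {accuracy:.2f}")

# Draw the fitted tree with readable feature and class names.
fig, ax = plt.subplots(figsize=(15, 12))
plot_tree(
    clf,
    feature_names=["Sepal Length", "Sepal Width", "Petal Length", "Petal Width"],
    class_names=["Setosa", "Versicolor", "Virginica"],
    ax=ax,
)
# In a notebook the figure renders at the end of the cell;
# in a script, call plt.show() or fig.savefig(...) to see it.
```

`clf.predict()` expects a 2D array, so a single flower is passed as a list containing one row. Fixing `random_state` makes the split and the fitted tree reproducible between runs.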