Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/saro0307/predicting-stars-galaxies-quasars-with-ml-model
Python program to Predicting Stars Galaxies Quasars with ML Model
https://github.com/saro0307/predicting-stars-galaxies-quasars-with-ml-model
artificial-intelligence astro astronomy astrophysics classification-algorithm decision-tree decision-trees jypyternotebook knn machine-learning matplotlib python scikit-learn
Last synced: 11 days ago
JSON representation
Python program to Predicting Stars Galaxies Quasars with ML Model
- Host: GitHub
- URL: https://github.com/saro0307/predicting-stars-galaxies-quasars-with-ml-model
- Owner: saro0307
- License: mit
- Created: 2024-05-25T12:29:19.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-05-30T10:32:36.000Z (8 months ago)
- Last Synced: 2024-11-13T19:14:32.532Z (2 months ago)
- Topics: artificial-intelligence, astro, astronomy, astrophysics, classification-algorithm, decision-tree, decision-trees, jypyternotebook, knn, machine-learning, matplotlib, python, scikit-learn
- Language: Jupyter Notebook
- Homepage:
- Size: 408 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Predicting Stars, Galaxies & Quasars with ML Model
This project focuses on building machine learning models to classify celestial objects into stars, galaxies, and quasars using a provided dataset. The following sections detail the workflow and the steps taken in the notebook.
## Workflow of the Notebook
1. **Introducing Dataset**
2. **Importing Necessary Libraries and Modules**
3. **Exploring the Dataset**
4. **Preparing Data for the Model**
5. **Scaling the Data and Checking Distribution Plots**
6. **Building ML Models and Evaluating Results**## Dataset Introduction
The dataset used for this project contains information about celestial objects. The classification task involves predicting whether an object is a star, galaxy, or quasar based on various features provided in the dataset.
## Importing Necessary Libraries and Modules
We utilize several Python libraries for data manipulation, visualization, and building machine learning models. Here is the list of the libraries:
```python
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
import warnings
warnings.filterwarnings("ignore")%matplotlib inline
```## Loading and Exploring the Dataset
### Importing the Dataset
```python
for dirname, _, filenames in os.walk('/kaggle/input'):
for filename in filenames:
print(os.path.join(dirname, filename))
data = pd.read_csv("/content/Skyserver_SQL2_27_2018 6_51_39 PM.csv")
data.head()
```### Dataset Dimensions and Preliminary Analysis
```python
data.shape
data.describe()
```### Dropping Unnecessary Columns
```python
data.drop(['objid','specobjid'], axis=1, inplace=True)
data.head(10)
```### Checking for Null Values
```python
data.info()
```The dataset is complete with no missing values.
### Encoding Target Variable
```python
le = LabelEncoder().fit(data['class'])
data['class'] = le.transform(data['class'])
data.head(10)
```### Final Dataset Information
```python
data.info()
```## Data Preparation
### Splitting Data into Features and Target
```python
X = data.drop('class', axis=1)
y = data['class']
```## Data Scaling
Standardizing the dataset to have a mean of 0 and a standard deviation of 1.
```python
scaler = StandardScaler(copy=True, with_mean=True, with_std=True)
X = scaler.fit_transform(X)
```## Building Machine Learning Models
We employ several machine learning models to classify the data:
- **Decision Tree Classifier**
- **Logistic Regression**
- **Naive Bayes**
- **K-Nearest Neighbors**
- **Support Vector Machine**### Training and Evaluating the Models
```python
# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)# Example for Decision Tree Classifier
dt = DecisionTreeClassifier()
dt.fit(X_train, y_train)
y_pred = dt.predict(X_test)
print(f"Decision Tree Accuracy: {accuracy_score(y_test, y_pred)}")# Repeat similar steps for other models
```
## license
This program is under the [MIT License](LICENSE)## Conclusion
This notebook demonstrates the process of loading a dataset, performing exploratory data analysis, preparing the data, scaling it, and finally building and evaluating several machine learning models to classify celestial objects.
## References
- [NumPy Documentation](https://numpy.org/doc/)
- [Pandas Documentation](https://pandas.pydata.org/docs/)
- [Matplotlib Documentation](https://matplotlib.org/stable/contents.html)
- [Seaborn Documentation](https://seaborn.pydata.org/)
- [TensorFlow Documentation](https://www.tensorflow.org/api_docs/python/tf)
- [Scikit-Learn Documentation](https://scikit-learn.org/stable/documentation.html)---
Upload this notebook to a Kaggle session, link the dataset, and run the cells to reproduce the analysis and results.