https://github.com/saro0307/predicting-stars-galaxies-quasars-with-ml-model

A Python program for predicting stars, galaxies, and quasars with an ML model

artificial-intelligence astro astronomy astrophysics classification-algorithm decision-tree decision-trees jypyternotebook knn machine-learning matplotlib python scikit-learn

Host: GitHub
URL: https://github.com/saro0307/predicting-stars-galaxies-quasars-with-ml-model
Owner: saro0307
License: mit
Created: 2024-05-25T12:29:19.000Z (8 months ago)
Default Branch: main
Last Pushed: 2024-05-30T10:32:36.000Z (8 months ago)
Last Synced: 2024-11-13T19:14:32.532Z (2 months ago)
Topics: artificial-intelligence, astro, astronomy, astrophysics, classification-algorithm, decision-tree, decision-trees, jypyternotebook, knn, machine-learning, matplotlib, python, scikit-learn
Language: Jupyter Notebook
Homepage:
Size: 408 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Predicting Stars, Galaxies & Quasars with ML Model

This project focuses on building machine learning models to classify celestial objects into stars, galaxies, and quasars using a provided dataset. The following sections detail the workflow and the steps taken in the notebook.

## Workflow of the Notebook

1. **Introducing Dataset**

2. **Importing Necessary Libraries and Modules**

3. **Exploring the Dataset**

4. **Preparing Data for the Model**

5. **Scaling the Data and Checking Distribution Plots**

6. **Building ML Models and Evaluating Results**

## Dataset Introduction

The dataset used for this project contains information about celestial objects. The classification task involves predicting whether an object is a star, galaxy, or quasar based on various features provided in the dataset.

## Importing Necessary Libraries and Modules

We use several Python libraries for data manipulation, visualization, and building machine learning models. The notebook imports the following:

```python
import os
import warnings

# Data handling and plotting
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Deep learning libraries
import tensorflow as tf
from tensorflow import keras

# Preprocessing, splitting, and evaluation
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.metrics import accuracy_score

# Classifiers
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Silence library warnings in the notebook output
warnings.filterwarnings("ignore")

# Jupyter magic: render plots inline (works only inside a notebook)
%matplotlib inline
```

## Loading and Exploring the Dataset

### Importing the Dataset

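```python
# List every file available under the Kaggle input directory
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Load the CSV; adjust the path to match where the file lives in your session
data = pd.read_csv("/content/Skyserver_SQL2_27_2018 6_51_39 PM.csv")
data.head()
```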

### Dataset Dimensions and Preliminary Analysis

```python
data.shape
data.describe()
```

### Dropping Unnecessary Columns

```python
# Object identifiers carry no predictive information, so drop them
data.drop(['objid','specobjid'], axis=1, inplace=True)
data.head(10)
```

### Checking for Null Values

```python
data.info()
```

The dataset is complete with no missing values.
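`data.info()` reports the non-null count for each column, and an explicit per-column count can make the same check easier to read. The following is a minimal sketch on a small made-up frame: the `ra` and `dec` column names and all values are assumptions for illustration, not rows from the project's dataset.

```python
import numpy as np
import pandas as pd

# Tiny stand-in frame; column names other than 'class' and all values are invented.
demo = pd.DataFrame({
    "ra": [183.5, 183.6, np.nan],
    "dec": [0.09, 0.13, 0.12],
    "class": ["STAR", "GALAXY", "QSO"],
})

# Number of missing values in each column.
missing_per_column = demo.isnull().sum()
print(missing_per_column)

# True only when no column contains a missing value.
is_complete = not demo.isnull().values.any()
print("Complete:", is_complete)
```

Running the same two expressions on `data` should show zeros everywhere if the dataset is indeed complete.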

### Encoding Target Variable

```python
# Map the string class labels to integer codes
le = LabelEncoder().fit(data['class'])
data['class'] = le.transform(data['class'])
data.head(10)
```
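After encoding, it helps to know which integer stands for which class. The sketch below shows how the mapping can be read back from a fitted `LabelEncoder`; the label strings `STAR`, `GALAXY`, and `QSO` are assumed here for illustration and may differ from the values in the actual `class` column.

```python
from sklearn.preprocessing import LabelEncoder

# Invented labels standing in for the 'class' column.
labels = ["STAR", "GALAXY", "QSO", "GALAXY", "STAR"]

le = LabelEncoder().fit(labels)
encoded = le.transform(labels)

# classes_ is sorted, so the position of each entry is its integer code.
mapping = {name: int(code) for code, name in enumerate(le.classes_)}
print(mapping)

# inverse_transform recovers the original string labels.
decoded = le.inverse_transform(encoded)
print(list(decoded))
```

Keeping the fitted `le` object around lets predictions be translated back into readable class names with `le.inverse_transform`.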

### Final Dataset Information

```python
data.info()
```

## Data Preparation

### Splitting Data into Features and Target

```python
# Features: every column except the target
X = data.drop('class', axis=1)
# Target: the encoded class label
y = data['class']
```
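Before modelling, it can be worth checking how the target classes are distributed, since accuracy is harder to interpret when one class dominates. This is a small sketch on invented labels; `y_demo` is a hypothetical stand-in for `y`, and the proportions say nothing about the real dataset.

```python
import pandas as pd

# Invented encoded labels standing in for y = data['class'].
y_demo = pd.Series([0, 0, 0, 0, 1, 2, 2, 2], name="class")

# Fraction of samples in each class, indexed by class code.
class_share = y_demo.value_counts(normalize=True).sort_index()
print(class_share)
```

If the shares turn out to be uneven, `train_test_split` accepts a `stratify=y` argument that keeps the class proportions similar in the training and test sets.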

## Data Scaling

We standardize each feature so that it has a mean of 0 and a standard deviation of 1.

```python
scaler = StandardScaler(copy=True, with_mean=True, with_std=True)
X = scaler.fit_transform(X)
```
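The notebook fits the scaler on the full feature matrix before splitting it. A common alternative, shown here only as a sketch and not as the notebook's method, is to split first and fit the scaler on the training rows alone, so that test-set statistics do not influence preprocessing. The synthetic `X_demo` and `y_demo` arrays are assumptions standing in for the real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic features and labels standing in for the real table.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
y_demo = rng.integers(0, 3, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Learn the mean and standard deviation from the training rows only.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
# Reuse the training statistics on the test rows.
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.mean(axis=0).round(6))
print(X_train_scaled.std(axis=0).round(6))
```

The training columns come out with mean 0 and standard deviation 1 up to floating-point error, while the test columns will be close to, but not exactly, those values.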

## Building Machine Learning Models

We employ several machine learning models to classify the data:

- **Decision Tree Classifier**

- **Logistic Regression**

- **Naive Bayes**

- **K-Nearest Neighbors**

- **Support Vector Machine**

### Training and Evaluating the Models

```python
# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Example for Decision Tree Classifier
dt = DecisionTreeClassifier()
dt.fit(X_train, y_train)
y_pred = dt.predict(X_test)
print(f"Decision Tree Accuracy: {accuracy_score(y_test, y_pred)}")

# Repeat similar steps for other models
```
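The "repeat similar steps" instruction can be sketched as a loop over a dictionary of classifiers. This is one possible arrangement under stated assumptions, not the notebook's actual code: the data comes from `make_classification` as a synthetic stand-in, and `GaussianNB` is swapped in for the imported `MultinomialNB`, because `MultinomialNB` rejects negative feature values, which standardized data contains.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data standing in for the prepared features.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# One entry per model family listed above (GaussianNB replaces MultinomialNB).
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Support Vector Machine": SVC(),
}

# Fit each model on the training split and score it on the test split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name} Accuracy: {accuracies[name]:.3f}")
```

Collecting the scores in a dictionary makes it easy to compare the models afterwards, for example with `max(accuracies, key=accuracies.get)`.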

## License

This project is released under the [MIT License](LICENSE).

## Conclusion

This notebook demonstrates the process of loading a dataset, performing exploratory data analysis, preparing the data, scaling it, and finally building and evaluating several machine learning models to classify celestial objects.

## References

- [NumPy Documentation](https://numpy.org/doc/)

- [Pandas Documentation](https://pandas.pydata.org/docs/)

- [Matplotlib Documentation](https://matplotlib.org/stable/contents.html)

- [Seaborn Documentation](https://seaborn.pydata.org/)

- [TensorFlow Documentation](https://www.tensorflow.org/api_docs/python/tf)

- [Scikit-Learn Documentation](https://scikit-learn.org/stable/documentation.html)

---

Upload this notebook to a Kaggle session, link the dataset, and run the cells to reproduce the analysis and results.