Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/atharva309/asteroid-analysis
Asteroid data analysis using PostgreSQL, Liquibase, and R
analysis liquibase naive-bayes-classifier postgresql r
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/atharva309/asteroid-analysis
- Owner: Atharva309
- Created: 2024-11-07T00:30:05.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-11-27T01:11:06.000Z (about 1 month ago)
- Last Synced: 2024-11-27T02:18:56.641Z (about 1 month ago)
- Topics: analysis, liquibase, naive-bayes-classifier, postgresql, r
- Language: Jupyter Notebook
- Homepage:
- Size: 1010 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: changelogs/changelog-complete.xml
# Asteroid Classification Project
This project classifies Near-Earth Objects (NEO) and Potentially Hazardous Asteroids (PHA) using a Naive Bayes classifier on a PostgreSQL database. Data cleaning and feature engineering are done in PostgreSQL, with Liquibase for database versioning, and the classifier is implemented manually in R.
Dataset: [Kaggle](https://www.kaggle.com/datasets/basu369victor/prediction-of-asteroid-diameter)
## Project Overview
The project uses the Naive Bayes classifier to identify and categorize asteroids based on orbital and physical parameters. Key steps include:
- **Data Cleaning**: Processed in PostgreSQL for consistency and completeness.
- **Feature Engineering**: Binning key features to simplify classification.
- **Database Management**: Using Liquibase to track database changes.
- **Manual Naive Bayes in R**: Implemented in R to classify asteroids as NEO or PHA.

## Dataset
Asteroid data features include:
- `Moid` (Minimum Orbit Intersection Distance)
- `a` (Semi-major axis)
- `e` (Eccentricity)
- `i` (Inclination)
- `H` (Absolute magnitude)

These features are grouped into bins to create categorical data for classification.
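
As a rough illustration of the binning idea, the sketch below maps continuous feature values to integer bin indices. It is written in Python purely for illustration; in this project the binning is done in PostgreSQL. The bin edges, `BIN_EDGES`, `bin_value`, and `bin_record` are hypothetical and are not taken from the repository.

```python
from bisect import bisect_right

# Hypothetical bin edges, chosen only for illustration. The project's real
# cut points live in its PostgreSQL changelogs and may differ.
BIN_EDGES = {
    "Moid": [0.05, 0.3, 1.0],   # assumed edges, in astronomical units
    "H": [18.0, 22.0, 25.0],    # assumed edges, in magnitudes
}

def bin_value(feature, value):
    """Return the 0-based index of the bin that `value` falls into."""
    # bisect_right counts how many edges are <= value, which is the bin index.
    return bisect_right(BIN_EDGES[feature], value)

def bin_record(record):
    """Replace each known continuous feature in `record` with its bin index."""
    return {name: bin_value(name, record[name]) for name in BIN_EDGES}

# Example: one asteroid with a small MOID and a moderate absolute magnitude.
example = bin_record({"Moid": 0.1, "H": 20.0})
```

With three edges per feature, each value lands in one of four bins (indices 0 to 3), which gives the classifier a small, fixed set of categories per feature.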
## Project Setup
### Requirements
- **PostgreSQL**: For data storage and cleaning.
- **Liquibase**: For database migrations.
- **R and RStudio**: For implementing the Naive Bayes classifier.

### Installation
1. **Clone the repository** and navigate to the project directory.
2. **Database Setup**: Create a PostgreSQL database, then use the provided changelog scripts to set up tables and load data.
3. **Liquibase Migrations**: Run migrations using Liquibase for version control.
4. **Run Classifier in R**: Execute the Naive Bayes classification in R to predict NEO and PHA categories.

## Results
Classification results and additional analyses are saved and visualized in R, providing insights into asteroid characteristics.
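
For readers unfamiliar with the method, here is a minimal sketch of a Naive Bayes classifier over binned (categorical) features. It is a Python illustration only; the project's own classifier is written in R and may differ in detail. The add-one (Laplace) smoothing, the function names `train` and `predict`, the toy rows, and the `"PHA"` / `"non-PHA"` labels are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

def train(rows, labels):
    """Count class frequencies and per-class frequencies of each bin value."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (label, feature) -> counts per bin
    feature_values = defaultdict(set)    # feature -> distinct bins seen
    for row, label in zip(rows, labels):
        for feature, value in row.items():
            value_counts[(label, feature)][value] += 1
            feature_values[feature].add(value)
    return class_counts, value_counts, feature_values

def predict(model, row):
    """Return the label with the highest log-posterior score for `row`."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)  # log prior
        for feature, value in row.items():
            count = value_counts[(label, feature)][value]
            k = len(feature_values[feature])
            # Add-one smoothing (an assumption) avoids log(0) for unseen bins.
            score += math.log((count + 1) / (n_label + k))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy, made-up training data: bin indices for two features.
rows = [
    {"Moid": 0, "H": 0},
    {"Moid": 0, "H": 1},
    {"Moid": 2, "H": 2},
    {"Moid": 3, "H": 2},
]
labels = ["PHA", "PHA", "non-PHA", "non-PHA"]
model = train(rows, labels)
prediction = predict(model, {"Moid": 0, "H": 0})
```

Scores are summed in log space so that multiplying many small probabilities does not underflow; the predicted label is the same as it would be with plain products.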