{"id":25819021,"url":"https://github.com/pngo1997/astrophysical-objects-classification","last_synced_at":"2026-05-10T22:45:16.550Z","repository":{"id":275177797,"uuid":"925320955","full_name":"pngo1997/Astrophysical-Objects-Classification","owner":"pngo1997","description":"Project applies machine learning techniques to classify astrophysical objects using observational data from the Large Synoptic Survey Telescope (LSST).","archived":false,"fork":false,"pushed_at":"2025-01-31T17:38:05.000Z","size":18766,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-28T14:13:30.858Z","etag":null,"topics":["adaptive-boosting-algorithm","classification","down-sampling","gradient-boosting","keras","machine-learning","neural-network","python","random-forest","scikit-learn","supervised-learning","tensorflow","time-series"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pngo1997.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-31T16:50:22.000Z","updated_at":"2025-02-22T00:49:26.000Z","dependencies_parsed_at":"2025-01-31T17:41:10.907Z","dependency_job_id":"a6a0bcf3-93e1-4fa1-9b98-2c24d233719d","html_url":"https://github.com/pngo1997/Astrophysical-Objects-Classification","commit_stats":null,"previous_names":["pngo1997/astrophysical-objects-classification"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/pngo1997/Astrophysical-Objects-Classification","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pngo1997%2FAstrophysical-Objects-Classification","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pngo1997%2FAstrophysical-Objects-Classification/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pngo1997%2FAstrophysical-Objects-Classification/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pngo1997%2FAstrophysical-Objects-Classification/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pngo1997","download_url":"https://codeload.github.com/pngo1997/Astrophysical-Objects-Classification/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pngo1997%2FAstrophysical-Objects-Classification/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32874700,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-10T13:40:02.631Z","status":"ssl_error","status_checked_at":"2026-05-10T13:40:02.145Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adaptive-boosting-algorithm","classification","down-sampling","gradient-boosting","keras","machine-learning","neural-network","python","random-forest","scikit-learn","supervised-learning","tensorflow","time-series"],"created_at":"2025-02-28T08:14:24.419Z","updated_at":"2026-05-10T22:45:16.533Z","avatar_url":"https://github.com/pngo1997.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🌌 Astrophysical Objects Classification  \n\n## 📜 Overview  \nThis project applies **machine learning (ML) techniques** to classify **astrophysical objects** using observational data from the **Large Synoptic Survey Telescope (LSST)**. By leveraging **time-series and static numerical features**, we aim to develop an accurate model for classifying celestial objects, such as **comets and asteroids**, based on their spatial positions, light curves, and motion.\n\n## 🎯 Problem Explanation  \nThe **LSST dataset** consists of **astronomical time-series data** collected through **optical sky scanning**. The telescope tracks the **brightness and motion** of celestial objects to support cosmic mapping and asteroid threat assessments.\n\n### **Challenges in Classification**  \n1. **Complex Feature Representation**:  \n   - The dataset contains both **static** (e.g., position, redshift) and **time-series** (e.g., brightness over time) attributes.  \n   - Integrating these features for classification is **non-trivial**.  \n\n2. **Imbalanced Classes**:  \n   - Certain astrophysical objects are overrepresented, leading to **biased model predictions**.  \n\n3. **Selection of the Best ML Model**:  \n   - Decision trees, boosting methods, and neural networks have **different strengths** for time-series and tabular data.  \n   - We evaluate **multiple models** to determine the best approach.  \n\n📌 **Dataset**: LSST Photometric Time Series (PLAsTiCC Competition)  \n🔢 **Observations**: 4,925 total (2,459 train | 2,466 test)  \n📊 **Classes**: 16 astrophysical object types  \n\n## 📊 Data Preprocessing \u0026 Feature Engineering  \n\n### **1️⃣ Dataset Overview**  \nThe dataset includes **two types of data**:  \n| **Feature Type** | **Description** |\n|----------------|--------------------------------|\n| **Static Features** | Spatial attributes (RA, DEC, Galactic coordinates) |\n| **Time-Series Features** | Brightness variations over time |\n\n### **2️⃣ Handling Class Imbalance**  \n- **Applied class weighting** to ensure fair representation.  \n- **Downsampling techniques** adjusted sample distribution.  \n\n### **3️⃣ Feature Engineering**  \n- **Rolling time-window transformations** on flux values.  \n- **Feature selection** using correlation analysis.  \n- **Normalization \u0026 scaling** applied for neural network compatibility.  \n\n## 🔍 Exploratory Data Analysis (EDA)  \n\n### **1️⃣ Meta Data (Static Features)**  \n- **Right Ascension (RA) vs. Galactic Longitude (GL)** shows an **inverse parabolic trend**.  \n- **High variance in brightness motion data** across objects.  \n\n📊 **EDA Visualization**:  \n- **Feature correlation heatmaps**  \n- **Class distribution histograms**  \n- **Light curve plots of six sample objects**  \n\n## 🤖 Machine Learning Models  \n\n### **1️⃣ Random Forest (RF) 🌲**  \n✔️ **Strengths**:  \n- Handles **high-dimensional data** well.  \n- Provides **high accuracy \u0026 interpretability**.  \n\n📊 **Results**:  \n- **Train Accuracy**: **97%** | **Log Loss**: **1.05**  \n- **Downsampled Accuracy**: **91%** | **Log Loss**: **3.17**  \n\n### **2️⃣ Gradient Boosting (GB) 🚀**  \n✔️ **Strengths**:  \n- **Reduces bias** via sequential boosting.  \n- More effective in handling **complex relationships**.  \n\n📊 **Results**:  \n- **Train Accuracy**: **68%** | **Log Loss**: **11.64**  \n- **Downsampled Accuracy**: **65%** | **Log Loss**: **12.53**  \n\n### **3️⃣ Adaptive Boosting (AB) 🏋️**  \n✔️ **Strengths**:  \n- Assigns **weights to observations**, improving classification of hard-to-predict samples.  \n\n📊 **Results**:  \n- **Train Accuracy**: **46%** | **Log Loss**: **19.54**  \n- **Downsampled Accuracy**: **23%** | **Log Loss**: **7.78**  \n\n### **4️⃣ Neural Networks (NNs) 🧠**  \n✔️ **Strengths**:  \n- Captures **complex temporal dependencies** in time-series data.  \n- Uses **LSTM layers** for time-dependent pattern recognition.  \n\n📊 **Results**:  \n- **Train Accuracy**: **46%**  \n- **Weighted Accuracy**: **59%**  \n- **Log Loss**: **9.6**  \n\n## 🏆 Best Performing Model  \n📌 **Random Forest (RF) is the most effective model**:  \n- **Highest accuracy (97%)** on full train data.  \n- **Performs well with both static \u0026 time-series features**.  \n- **Faster training time** compared to Boosting \u0026 Neural Networks.  \n\n🚀 **Key Observations**:  \n- **Feature selection improved model performance** by **removing redundant features**.  \n- **Downsampling maintained class distribution** while improving computational efficiency.  \n- **Neural Networks struggled with time complexity and performance tuning**.\n  \n## 📢 Key Findings \u0026 Recommendations  \n\n✅ **Key Insights**:  \n- **Light curve patterns are highly predictive** of object classification.  \n- **Feature selection significantly improves classification accuracy**.  \n- **Class imbalance correction is essential** for fair model evaluation.  \n\n🔧 **Recommended Future Work**:  \n- 📊 **Test alternative deep learning architectures** (CNNs for feature extraction).  \n- ⚡ **Optimize feature engineering** for time-series processing.  \n- 🔬 **Use SMOTE for class balancing** instead of downsampling.  \n\n## 🚀 Technologies Used  \n🛠 **ML Frameworks**:  \n- **Python (Scikit-learn, TensorFlow, Keras)**  \n- **Pandas, NumPy** for preprocessing  \n- **Matplotlib, Seaborn** for visualization  \n\n📡 **Dataset**: [LSST Photometric Time Series Data (PLAsTiCC)](https://timeseriesclassification.com/description.php?Dataset=LSST)  \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpngo1997%2Fastrophysical-objects-classification","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpngo1997%2Fastrophysical-objects-classification","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpngo1997%2Fastrophysical-objects-classification/lists"}