{"id":28245927,"url":"https://github.com/burhanahmed1/cryptosynth","last_synced_at":"2025-10-28T20:10:04.379Z","repository":{"id":293782927,"uuid":"948957824","full_name":"burhanahmed1/CryptoSynth","owner":"burhanahmed1","description":"Bitcoin Sentiment Forecast is a Multimodal approach to Bitcoin price forecasting using NLP and Time Series Analysis","archived":false,"fork":false,"pushed_at":"2025-05-17T05:04:41.000Z","size":3710,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-14T19:35:19.087Z","etag":null,"topics":["datafusion","datapreprocessing","eda","explainable-ai","featureengineering","machinelearning","multimodal-deep-learning","pca","predictive-modeling"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/burhanahmed1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-15T10:55:29.000Z","updated_at":"2025-05-17T05:04:44.000Z","dependencies_parsed_at":"2025-05-17T06:29:03.355Z","dependency_job_id":null,"html_url":"https://github.com/burhanahmed1/CryptoSynth","commit_stats":null,"previous_names":["burhanahmed1/cryptosynth"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/burhanahmed1/CryptoSynth","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/burhanahmed1%2FCryptoSynth","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/burhanahmed1%2FCryptoSynth/tags","releases_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/burhanahmed1%2FCryptoSynth/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/burhanahmed1%2FCryptoSynth/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/burhanahmed1","download_url":"https://codeload.github.com/burhanahmed1/CryptoSynth/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/burhanahmed1%2FCryptoSynth/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":281504373,"owners_count":26512876,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-28T02:00:06.022Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["datafusion","datapreprocessing","eda","explainable-ai","featureengineering","machinelearning","multimodal-deep-learning","pca","predictive-modeling"],"created_at":"2025-05-19T09:13:29.666Z","updated_at":"2025-10-28T20:10:04.363Z","avatar_url":"https://github.com/burhanahmed1.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CryptoSynth: A Multimodal Generative Framework for Bitcoin Price Forecasting\n\n![Bitcoin Pipeline](architecture/bird-eye.png)\nA comprehensive pipeline integrating sentiment analysis of social media data with Bitcoin price prediction using BERT, DistilBERT, LSTM, and XGBoost 
models.\n\n## About\n\nCryptoSynth explores a multimodal approach to forecast Bitcoin prices by leveraging:\n\n- Sentiment analysis of approximately 150,000 Bitcoin-related tweets from the X platform.\n- Transformer-based models (BERT and DistilBERT) for generating embeddings.\n- Integration with historical Bitcoin prices and technical indicators.\n- Advanced machine learning models including LSTM and XGBoost for prediction.\n- Explainable AI (XAI) using SHAP for model interpretability.\n\nThe pipeline spans data collection, preprocessing, exploratory data analysis (EDA), sentiment classification, price prediction, and result visualization.\n\n### Data Flow\n- **Input**: Bitcoin Tweets Dataset and Tabular Data (open, close, etc.).\n- **Data Preprocessing \u0026 Feature Selection**: Cleans and normalizes tweets, labels sentiments, and processes tabular data.\n- **BERT/DistilBERT**: Generates embeddings from preprocessed tweets.\n- **PCA**: Reduces dimensionality of combined embeddings and tabular data.\n- **LSTM \u0026 Hybrid Fusion**: Applies LSTM with Random Forest and XGBoost for prediction.\n- **Output**: Metrics and predictions.\n\n## Key Features\n- End-to-end sentiment analysis and price prediction.\n- Integration of textual and numerical data for enhanced forecasting.\n- Use of PCA for dimensionality reduction.\n- Robust model evaluation with XGBoost as the best performer.\n- XAI integration with SHAP for feature importance analysis.\n- Comprehensive visualizations for insights.\n\n## 🖼️ XGBoost Evaluation Metrics and XAI\n### XGBoost Performance\n![XGBoost Metrics](images/xgb-metrics.png)\n\n### SHAP Value Impact\n![SHAP Value Impact](images/shap-1.png)\n\n### SHAP Model Final Output\n![SHAP Final Output](images/shap-2.png)\n\n## 📊 Performance Metrics\n\n### On BERT Embeddings\n| Model                | MSE      | RMSE    | MAE    |\n|----------------------|----------|---------|--------|\n| Random Forest        | 0.0001837| 0.01355 | -      |\n| Random 
Forest (Imp)  | 0.0001095| 0.01046 | -      |\n| LSTM                 | 0.007127 | -       | 0.0664 |\n| Tuned LSTM           | 0.00549  | -       | 0.0576 |\n| LSTM (Hybrid Fusion) | 0.0049   | -       | 0.0512 |\n| XGBoost              | 0.004678 | -       | -      |\n\n### On DistilBERT Embeddings\n| Model                | MSE      | RMSE    | MAE    |\n|----------------------|----------|---------|--------|\n| Random Forest        | 0.0002086| 0.01444 | -      |\n| Random Forest (Imp)  | 0.0001095| 0.0146  | -      |\n| LSTM                 | 0.01304  | -       | 0.0937 |\n| Tuned LSTM           | 0.00364  | -       | 0.0443 |\n| LSTM (Hybrid Fusion) | 0.0053   | -       | 0.0531 |\n| XGBoost              | 0.0047   | -       | -      |\n\n## Setup \u0026 Usage\n### Requirements\nInstall the required dependencies manually:\n- `torch`, `transformers`, `numpy`, `pandas`, `scikit-learn`\n- `matplotlib`, `tqdm`, `xgboost`, `shap`\n\n### Documentation\nRefer to the following:\n- [CryptoSynth_report.pdf](documentation/CryptoSynth_report.pdf): Project background, methodology, and results.\n\n\n## Contributions\nWe welcome feedback and contributions! Feel free to fork, star 🌟, and share.\n\n## License\nLicensed under the Apache License 2.0.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fburhanahmed1%2Fcryptosynth","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fburhanahmed1%2Fcryptosynth","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fburhanahmed1%2Fcryptosynth/lists"}