{"id":23585098,"url":"https://github.com/ansh-info/stockpulse","last_synced_at":"2026-04-10T13:32:11.154Z","repository":{"id":269832625,"uuid":"905783160","full_name":"ansh-info/StockPulse","owner":"ansh-info","description":"Real-time stock market analytics pipeline with live visualization dashboard. Built with Python and GCP, featuring automated data processing and interactive Streamlit analytics.","archived":false,"fork":false,"pushed_at":"2024-12-26T13:39:12.000Z","size":2708,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-12-26T14:22:19.236Z","etag":null,"topics":["api","big-data","bigquery","cloud","cloud-computing","cloud-native","data-engineering","data-pipeline","docker","docker-compose","gcp","gcp-automation-gitops","gcp-cloud-run","gcp-pubsub","google-cloud-platform","real-time","realtime","stock-market","stocks","streamlit"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ansh-info.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-19T14:10:29.000Z","updated_at":"2024-12-26T13:23:13.000Z","dependencies_parsed_at":"2024-12-26T14:22:23.124Z","dependency_job_id":"da7081e6-02f6-4d03-8e9c-3f1091f523ad","html_url":"https://github.com/ansh-info/StockPulse","commit_stats":null,"previous_names":["ansh-info/stockpulse"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ansh-info%2FStockPulse","tags_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ansh-info%2FStockPulse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ansh-info%2FStockPulse/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ansh-info%2FStockPulse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ansh-info","download_url":"https://codeload.github.com/ansh-info/StockPulse/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239412489,"owners_count":19634016,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","big-data","bigquery","cloud","cloud-computing","cloud-native","data-engineering","data-pipeline","docker","docker-compose","gcp","gcp-automation-gitops","gcp-cloud-run","gcp-pubsub","google-cloud-platform","real-time","realtime","stock-market","stocks","streamlit"],"created_at":"2024-12-27T03:13:31.045Z","updated_at":"2025-12-30T21:41:58.001Z","avatar_url":"https://github.com/ansh-info.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# StockPulse\n\n![GCP](https://img.shields.io/badge/GCP-Cloud%20Platform-blue)\n![Python](https://img.shields.io/badge/Python-3.9%2B-brightgreen)\n![License](https://img.shields.io/badge/License-MIT-yellow)\n\nA robust, production-ready stock market data pipeline built on Google Cloud Platform (GCP) that processes and analyzes high-frequency stock data in real-time. 
This project demonstrates advanced data engineering practices including parallel processing, data validation, and real-time analytics.\n\n![Candle Chart](images/candlechart.png)\n\n## 🎯 Key Features\n\n- **Real-time Processing**: Fetches and processes stock data at 5-minute intervals\n- **Scalable Architecture**: Built on GCP services for high availability and scalability\n- **Intelligent Rate Limiting**: Smart API key rotation system\n- **Robust Error Handling**: Comprehensive retry mechanisms and validation\n- **Advanced Analytics**: Real-time technical indicators and market analysis\n- **Interactive Dashboard**: Rich visualization powered by Streamlit\n- **Data Integrity**: Multi-layer deduplication and validation processes\n- **High Performance**: Processes ~4,000 data points per stock over 30 days\n\n### Data Flow\n\n1. **Data Collection**\n\n   - Alpha Vantage API integration\n   - Rate limit management\n   - Initial data validation\n\n2. **Message Queue**\n\n   - Google Pub/Sub implementation\n   - Asynchronous message processing\n   - Message persistence and retry logic\n\n3. **Data Processing**\n\n   ```\n   Raw Data -\u003e Validation -\u003e Transformation -\u003e Technical Analysis -\u003e Storage\n   ```\n\n   - Data cleaning and normalization\n   - Technical indicator calculation\n   - Real-time analytics processing\n\n4. 
**Storage Layer**\n   - BigQuery: Structured data storage\n   - Cloud Storage: Raw data archival\n   - Dual-write consistency patterns\n\n![Big Query](images/bigquery.png)\n\n## 📊 Monitored Stocks\n\n| Symbol | Company    | Sector     | Update Frequency |\n| ------ | ---------- | ---------- | ---------------- |\n| AMZN   | Amazon     | Technology | 5 min            |\n| TSLA   | Tesla      | Automotive | 5 min            |\n| AAPL   | Apple      | Technology | 5 min            |\n| GOOGL  | Google     | Technology | 5 min            |\n| MSFT   | Microsoft  | Technology | 5 min            |\n| IBM    | IBM        | Technology | 5 min            |\n| JPM    | JPMorgan   | Finance    | 5 min            |\n| PFE    | Pfizer     | Healthcare | 5 min            |\n| XOM    | ExxonMobil | Energy     | 5 min            |\n| KO     | Coca-Cola  | Consumer   | 5 min            |\n\n![Data Flow](images/dataflow.png)\n\n## 🛠️ Technical Stack\n\n### Core Technologies\n\n- Python 3.9+\n- Google Cloud Platform\n- Docker \u0026 Docker Compose\n- Alpha Vantage API\n\n### GCP Services\n\n- Cloud Pub/Sub\n- BigQuery\n- Cloud Storage\n- Cloud Functions (optional)\n\n## 🚀 Setup and Installation - Docker (Recommended)\n\n### Prerequisites\n\n- Python 3.9+\n- GCP account with billing enabled (download your service account key from GCP and place it in `keys/`)\n- Alpha Vantage API key\n- Docker\n\n### Local Development Setup\n\n1. **Clone \u0026 Configure Environment**\n\n   ```bash\n   # Clone repository\n   git clone https://github.com/ansh-info/StockPulse.git\n   cd StockPulse\n\n   # Create virtual environment\n   python -m venv venv\n   source venv/bin/activate  # Windows: venv\\Scripts\\activate\n\n   # Install dependencies\n   pip install -r requirements.txt\n   ```\n\n2. 
**GCP Configuration**\n\n   ```bash\n   # Download your service account key and place it in keys/\n\n   # Set up service account credentials\n   export GOOGLE_APPLICATION_CREDENTIALS=\"path/to/key.json\"\n\n   # Configure gcloud CLI\n   gcloud auth activate-service-account --key-file=\"$GOOGLE_APPLICATION_CREDENTIALS\"\n   gcloud config set project YOUR_PROJECT_ID\n   ```\n\n3. **Update Configuration**\n\n   ```python\n   # Set these values in config.py and .env\n   GCP_CONFIG = {\n       \"GCP_PROJECT_ID\": \"your-project-id\",\n       \"GCP_BUCKET_NAME\": \"your-bucket-name\",\n       \"GCP_TOPIC_NAME\": \"your-topic-name\",\n       \"GCP_DATASET_NAME\": \"your-dataset-name\"\n   }\n   ALPHA_VANTAGE_KEY = {\n       \"ALPHA_VANTAGE_KEY_1\": \"your-api-key-1\"\n   }\n   ```\n\n### Docker Deployment (Recommended)\n\n```bash\n# Build and run with Docker Compose\ndocker-compose up -d\n\n# Check container status\ndocker-compose ps\n\n# View logs\ndocker-compose logs -f\n\n# Open a shell in the gcloudsdk container\ndocker exec -it gcloudsdk /bin/bash\n\n# Open a shell in the python container\ndocker exec -it python /bin/bash\n```\n\n![Weekly Distribution](images/weekly_distribution.png)\n\n## 📋 Usage Guide\n\n### Starting the Pipeline\n\n1. **Initialize the Environment**\n\n   ```bash\n   source venv/bin/activate\n   export GOOGLE_APPLICATION_CREDENTIALS=\"path/to/key.json\"\n   ```\n\n2. 
**Run Core Components**\n\n   ```bash\n   # Start the data loader pipeline (wait for the tables to be created)\n   python bigquery_loader.py\n\n   # Start the data pipeline (wait for the data to be fetched and published)\n   python stocks_pipeline.py\n\n   # Run the deduplication process (start after bigquery_loader completes)\n   python dedup_pipeline.py\n\n   # Launch the dashboard (run from the app/ directory to get the white background)\n   streamlit run dashboard.py\n   ```\n\n### Dashboard Features\n\n- Real-time stock price visualization\n- Technical analysis indicators:\n  - Moving Averages (SMA, EMA)\n  - RSI (Relative Strength Index)\n  - MACD (Moving Average Convergence Divergence)\n- Volume analysis with VWAP\n- Customizable timeframes\n- Interactive candlestick charts\n\n![RSI Chart](images/rsi_chart.png)\n\n## 📁 Project Structure\n\n```\nStockPulse/\n│\n├── LICENSE\n├── README.md\n├── app\n│   ├── __init__.py\n│   └── dashboard.py\n├── docker-compose.yml\n├── docs\n│   └── docs.md\n├── keys\n│   ├── key.example.json\n│   └── key.json\n├── requirements.txt\n├── src\n│   ├── __init__.py\n│   ├── __pycache__\n│   │   └── __init__.cpython-39.pyc\n│   ├── config\n│   │   ├── __init__.py\n│   │   ├── __pycache__\n│   │   │   ├── __init__.cpython-39.pyc\n│   │   │   └── config.cpython-39.pyc\n│   │   └── config.py\n│   ├── ingestion\n│   │   ├── __init__.py\n│   │   └── stocks_pipeline.py\n│   ├── loader\n│   │   ├── __init__.py\n│   │   └── bigquery_loader.py\n│   └── preprocessing\n│       ├── __init__.py\n│       ├── data_preprocessor.py\n│       ├── dedup_pipeline.py\n│       └── preprocessing_pipeline.py\n└── tests\n    ├── __init__.py\n    └── check_gcs_buckets.py\n\n```\n\n## 🔄 Error Handling\n\n### Retry Mechanism\n\n```python\n@retry(\n    retry_on_exception=retry_if_exception_type(Exception),\n    wait_exponential_multiplier=1000,\n    wait_exponential_max=10000,\n    stop_max_attempt_number=3\n)\n```\n\n### Validation Rules\n\n- Timestamp format validation\n- Price 
range checks\n- Volume validation\n- Data completeness verification\n\n## 🔮 Future Roadmap\n\n- [ ] Machine Learning integration for price prediction\n- [ ] Real-time alerting system\n- [ ] Advanced technical indicators\n- [ ] Performance optimization\n- [ ] Enhanced monitoring and logging\n- [ ] API endpoint for data access\n\n## 📖 Documentation\n\nDetailed documentation is available in the `/docs` directory:\n\n- API Documentation\n- Setup Guide\n- Troubleshooting Guide\n- Best Practices\n\n## 📝 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## 💡 Citation\n\nIf you use this project in your research, please cite:\n\n```bibtex\n@software{StockPulse_2024,\n  author = {Ansh Kumar and Apoorva Gupta},\n  title = {StockPulse: GCP-powered platform for real-time stock market data processing and visualization},\n  year = {2024},\n  url = {https://github.com/ansh-info/StockPulse.git}\n}\n```\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fansh-info%2Fstockpulse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fansh-info%2Fstockpulse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fansh-info%2Fstockpulse/lists"}