{"id":28373137,"url":"https://github.com/ddihora1604/iitk_task","last_synced_at":"2025-10-29T14:42:31.386Z","repository":{"id":293913001,"uuid":"985484439","full_name":"ddihora1604/IITK_Task","owner":"ddihora1604","description":"A comprehensive financial data analysis system that collects, processes, and analyzes data from approximately 500 tickers in the S\u0026P Global Index. It provides detailed financial information, ESG metrics, and various financial statements for comprehensive market analysis.","archived":false,"fork":false,"pushed_at":"2025-05-17T21:43:49.000Z","size":71152,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-29T18:58:03.862Z","etag":null,"topics":["beautifulsoup4","data-analysis","data-visualization","datamodelling","dataset","esg","machine-learning","python","yahoo-finance"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ddihora1604.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-17T21:29:00.000Z","updated_at":"2025-05-17T21:43:52.000Z","dependencies_parsed_at":"2025-05-17T22:38:36.511Z","dependency_job_id":null,"html_url":"https://github.com/ddihora1604/IITK_Task","commit_stats":null,"previous_names":["ddihora1604/iitk_task"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ddihora1604/IITK_Task","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ddihora1604%2FIITK_Task","tags_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/ddihora1604%2FIITK_Task/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ddihora1604%2FIITK_Task/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ddihora1604%2FIITK_Task/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ddihora1604","download_url":"https://codeload.github.com/ddihora1604/IITK_Task/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ddihora1604%2FIITK_Task/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261874089,"owners_count":23223061,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup4","data-analysis","data-visualization","datamodelling","dataset","esg","machine-learning","python","yahoo-finance"],"created_at":"2025-05-29T18:39:34.023Z","updated_at":"2025-10-29T14:42:31.379Z","avatar_url":"https://github.com/ddihora1604.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Financial Data Analysis and ESG Metrics Project\n\n## Overview\nThis project is a comprehensive financial data analysis system that collects, processes, and analyzes data from approximately 500 tickers in the S\u0026P Global Index. It provides detailed financial information, ESG metrics, and various financial statements for comprehensive market analysis.\n\n## Key Features\n\n### 1. 
Historical Data Analysis\n- Historical price data collection over the past 5 years\n- Time-series data processing\n- Market trend analysis capabilities\n\n### 2. ESG (Environmental, Social, Governance) Data\n- Comprehensive ESG metrics collection\n- Environmental impact analysis\n- Social responsibility metrics\n- Corporate governance evaluation\n\n### 3. Company Information\n- Detailed company summaries\n- Key business metrics\n- Company overview and description\n\n### 4. Financial Statements\n- Income Statement analysis\n- Balance Sheet data\n- Cash Flow statement analysis\n- Key financial ratios and metrics\n\n### 5. Statistical Analysis\n- Key statistics and metrics\n- Market performance indicators\n- Financial health indicators\n\n## Technical Architecture\n\n### Core Components\n1. **Data Collection Module**\n   - `historical_data.py`: Historical price data collection\n   - `esg_data.py`: ESG metrics collection\n   - `company_summary.py`: Company information gathering\n   - `statistical_data.py`: Statistical data processing\n\n2. **Financial Analysis Module**\n   - `income_statement.py`: Income statement analysis\n   - `balance_sheet.py`: Balance sheet analysis\n   - `cash_flows.py`: Cash flow analysis\n   - `stocks.py`: Stock-specific data processing\n\n3. **Bot Management**\n   - `bot.py`: Handles web scraping and API interactions\n\n## Data Collection and Processing\n\n### Data Sources\n- Primary Data Source: Yahoo Finance\n- Secondary Data Source: Web scraping for additional metrics\n- S\u0026P Global Index tickers (approximately 500 companies)\n\n### Rate Limiting and Bot Handling\n- Custom user-agent headers implementation\n- Rate limiting management\n- Bot detection avoidance techniques\n- Request throttling and delay implementation\n\n### Data Storage\n- Processed data stored in the `Data/` directory\n- Raw datasets maintained in `Datasets/` directory\n- ESG-specific data in `files4esg/` directory\n\n## Data Pipeline\n1. 
Data Collection\n   - Fetch data from Yahoo Finance\n   - Web scraping for additional metrics\n   - ESG data collection\n\n2. Data Processing\n   - Clean and validate data\n   - Transform into required formats\n   - Calculate derived metrics\n\n3. Data Storage\n   - Store processed data\n   - Maintain data versioning\n   - Ensure data integrity\n\n## Installation and Setup\n\n### Prerequisites\n- Python 3.8 or higher\n- pip (Python package manager)\n\n### Installation Steps\n1. Clone the repository:\n   ```bash\n   git clone [repository-url]\n   cd [repository-name]\n   ```\n\n2. Install required dependencies:\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n### Running the Project\n1. Ensure all dependencies are installed\n2. Run the individual data-fetching and scraping scripts:\n   ```bash\n   python historical_data.py\n   python esg_data.py\n   python company_summary.py\n   python statistical_data.py\n   python income_statement.py\n   python balance_sheet.py\n   python cash_flows.py\n   ```\n   \n3. 
Run the main data collection script:\n   ```bash\n   python stocks.py\n   ```\n\n## Project Structure\n```\n├── Data/                  # Processed data storage\n├── Datasets/             # Raw datasets\n├── files4esg/           # ESG-specific data files\n├── balance_sheet.py     # Balance sheet analysis\n├── cash_flows.py        # Cash flow analysis\n├── company_summary.py   # Company information\n├── esg_data.py         # ESG metrics collection\n├── historical_data.py   # Historical data processing\n├── income_statement.py  # Income statement analysis\n├── statistical_data.py  # Statistical analysis\n├── stocks.py           # Stock data processing\n├── bot.py             # Web scraping and API handling\n└── requirements.txt    # Project dependencies\n```\n\n## Acknowledgments\n- Yahoo Finance for market data\n- S\u0026P Global for index data\n- Contributors and maintainers of the project ","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fddihora1604%2Fiitk_task","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fddihora1604%2Fiitk_task","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fddihora1604%2Fiitk_task/lists"}