{"id":20026326,"url":"https://github.com/liviuxyz-ctrl/financial-datawarehouse","last_synced_at":"2026-04-06T21:31:35.169Z","repository":{"id":242377150,"uuid":"809352213","full_name":"liviuxyz-ctrl/Financial-DataWarehouse","owner":"liviuxyz-ctrl","description":"The Financial DataWarehouse project aims to provide an efficient solution for storing and managing financial and commodities data using Cassandra. This project includes a REST API built with FastAPI for easy access and manipulation of data. Using Docker Compose, it deploys a multi-node Cassandra cluster to ensure data redundancy and fault tolerance","archived":false,"fork":false,"pushed_at":"2025-09-11T08:30:52.000Z","size":114,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-01-03T22:34:24.391Z","etag":null,"topics":["cassandra-database","docker","orm","python","rest-api"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/liviuxyz-ctrl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-06-02T13:00:03.000Z","updated_at":"2025-09-11T08:30:56.000Z","dependencies_parsed_at":"2024-06-09T10:58:04.615Z","dependency_job_id":"5a8a87cf-6c22-4427-a113-dbd2e2e4ad41","html_url":"https://github.com/liviuxyz-ctrl/Financial-DataWarehouse","commit_stats":null,"previous_names":["liviuxyz-ctrl/datawarehouse","liviuxyz-ctrl/financial-datawarehouse"],"tags_count":
0,"template":false,"template_full_name":null,"purl":"pkg:github/liviuxyz-ctrl/Financial-DataWarehouse","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/liviuxyz-ctrl%2FFinancial-DataWarehouse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/liviuxyz-ctrl%2FFinancial-DataWarehouse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/liviuxyz-ctrl%2FFinancial-DataWarehouse/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/liviuxyz-ctrl%2FFinancial-DataWarehouse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/liviuxyz-ctrl","download_url":"https://codeload.github.com/liviuxyz-ctrl/Financial-DataWarehouse/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/liviuxyz-ctrl%2FFinancial-DataWarehouse/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31491096,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-06T17:22:55.647Z","status":"ssl_error","status_checked_at":"2026-04-06T17:22:54.741Z","response_time":112,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cassandra-database","docker","orm","python","rest-api"],"created_at":"2024-11-13T09:06:19.220Z","updated_at":"2026-04-06T21:31:35.146Z","avatar_url":"https://github.com/liviuxyz-ctrl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Financial DataWarehouse Project\n\n\u003cdetails open\u003e \n  \u003csummary\u003eTable of Contents\u003c/summary\u003e\n  \u003col\u003e\n    \u003cli\u003e\u003ca href=\"#introduction\"\u003eIntroduction\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#system-architecture\"\u003eSystem Architecture\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#getting-started\"\u003eGetting Started\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#project-structure\"\u003eProject Structure\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#docker-architecture\"\u003eDocker Architecture\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#api-documentation\"\u003eAPI Documentation\u003c/a\u003e\u003c/li\u003e\n    \u003cli\u003e\u003ca href=\"#license\"\u003eLicense\u003c/a\u003e\u003c/li\u003e\n  \u003c/ol\u003e\n\u003c/details\u003e\n\n## Introduction\n\n\u003cdetails open\u003e\n  \u003csummary\u003eDetails\u003c/summary\u003e\n  This Data Warehouse project is engineered to facilitate extensive data handling capabilities for financial and commodities data. 
It uses a Python object mapper (ORM) for database access and exposes a RESTful API, built with FastAPI, for reading the stored data.\n\u003c/details\u003e\n\n## System Architecture\n\n\u003cdetails open\u003e\n  \u003csummary\u003eDetails\u003c/summary\u003e\n  The architecture is built around Python and Cassandra, with Docker Compose running the Cassandra cluster in containers. The Python ORM layer lets the application express CQL queries as Python code, which keeps database access easier to maintain.\n\u003c/details\u003e\n\n## Getting Started\n\n\u003cdetails open\u003e\n  \u003csummary\u003eDetails\u003c/summary\u003e\n\n  ### Prerequisites\n\n  - Python 3.10 or later\n  - Docker and Docker Compose\n  - Cassandra\n  - Virtualenv or any environment management tool\n\n  ### Installation\n\n  1. **Clone the repository:**\n     ```bash\n     git clone https://github.com/liviuxyz-ctrl/Financial-DataWarehouse.git\n     cd Financial-DataWarehouse\n     ```\n\n  2. **Set up the virtual environment:**\n     ```bash\n     python -m venv venv\n     source venv/bin/activate  # On Windows use `venv\\Scripts\\activate`\n     ```\n\n  3. **Install dependencies:**\n     ```bash\n     pip install -r requirements.txt\n     ```\n\n  4. **Launch Docker containers:**\n     ```bash\n     docker-compose up -d\n     ```\n\n  5. 
**Database Initialization:**\n     Execute scripts to configure the database schema and seed it with initial data.\n\n\u003c/details\u003e\n\n## Project Structure\n\n\u003cdetails\u003e\n  \u003csummary\u003eDetails\u003c/summary\u003e\n  \u003cul\u003e\n    \u003cli\u003e\u003ccode\u003esrc/\u003c/code\u003e: Contains all source files.\n      \u003cul\u003e\n        \u003cli\u003e\u003ccode\u003eclients/\u003c/code\u003e: API clients for data sources.\n          \u003cul\u003e\n            \u003cli\u003e\u003ccode\u003ecommodities_api_client.py\u003c/code\u003e: Retrieves commodities data.\u003c/li\u003e\n            \u003cli\u003e\u003ccode\u003enasdaq_api_client.py\u003c/code\u003e: Fetches NASDAQ data.\u003c/li\u003e\n          \u003c/ul\u003e\n        \u003c/li\u003e\n        \u003cli\u003e\u003ccode\u003econfig/\u003c/code\u003e: Application configurations.\n          \u003cul\u003e\n            \u003cli\u003e\u003ccode\u003esettings.py\u003c/code\u003e: Central config file.\u003c/li\u003e\n          \u003c/ul\u003e\n        \u003c/li\u003e\n        \u003cli\u003e\u003ccode\u003edata/\u003c/code\u003e: Handles database operations.\n          \u003cul\u003e\n            \u003cli\u003e\u003ccode\u003edatabase.py\u003c/code\u003e: Manages database connections.\u003c/li\u003e\n            \u003cli\u003e\u003ccode\u003emodels.py\u003c/code\u003e: Defines ORM models.\u003c/li\u003e\n          \u003c/ul\u003e\n        \u003c/li\u003e\n        \u003cli\u003e\u003ccode\u003eingestion/\u003c/code\u003e: Manages data loading and processing.\n          \u003cul\u003e\n            \u003cli\u003e\u003ccode\u003eload.py\u003c/code\u003e: Ingests data into the database.\u003c/li\u003e\n            \u003cli\u003e\u003ccode\u003etransform.py\u003c/code\u003e: Transforms data as needed.\u003c/li\u003e\n          \u003c/ul\u003e\n        \u003c/li\u003e\n        \u003cli\u003e\u003ccode\u003einit_scripts/\u003c/code\u003e: Database initialization scripts.\n          
\u003cul\u003e\n            \u003cli\u003e\u003ccode\u003epopulate_commodities_data.py\u003c/code\u003e: Seeds commodities data.\u003c/li\u003e\n            \u003cli\u003e\u003ccode\u003epopulate_sp500_data.py\u003c/code\u003e: Seeds S\u0026P 500 data.\u003c/li\u003e\n          \u003c/ul\u003e\n        \u003c/li\u003e\n        \u003cli\u003e\u003ccode\u003eutils/\u003c/code\u003e: Utility scripts.\n          \u003cul\u003e\n            \u003cli\u003e\u003ccode\u003elog_helper.py\u003c/code\u003e: Provides logging functions.\u003c/li\u003e\n          \u003c/ul\u003e\n        \u003c/li\u003e\n      \u003c/ul\u003e\n    \u003c/li\u003e\n  \u003c/ul\u003e\n\u003c/details\u003e\n\n\n## Docker Architecture\n\n\u003cdetails\u003e\n  \u003csummary\u003eDetails\u003c/summary\u003e\n  This project uses Docker to containerize and manage the Cassandra database cluster, ensuring consistency and scalability in the development and deployment environments. The Docker setup is defined in the `docker-compose.yml` file, which specifies the configuration for a multi-node Cassandra cluster along with Portainer for container management.\n\n  ### Docker Compose File\n\n  The `docker-compose.yml` file defines the services and their configurations as follows:\n\n  ```yaml\n  version: '3'\n\n  services:\n    # Node 1 Configuration\n    DC1N1:\n      image: cassandra:3.10\n      command: bash -c 'if [ -z \"$$(ls -A /var/lib/cassandra/)\" ] ; then sleep 0; fi \u0026\u0026 /docker-entrypoint.sh cassandra -f'\n      networks:\n        - dc1ring\n      volumes:\n        - ./n1data:/var/lib/cassandra\n      environment:\n        - CASSANDRA_CLUSTER_NAME=dev_cluster\n        - CASSANDRA_SEEDS=DC1N1\n      expose:\n        - 7000  # Cluster communication\n        - 7001  # SSL Cluster communication\n        - 7199  # JMX\n        - 9042  # CQL\n        - 9160  # Thrift service\n      ports:\n        - \"9042:9042\"\n      ulimits:\n        memlock: -1\n        nproc: 32768\n        nofile: 
100000\n\n    # Node 2 Configuration\n    DC1N2:\n      image: cassandra:3.10\n      command: bash -c 'if [ -z \"$$(ls -A /var/lib/cassandra/)\" ] ; then sleep 60; fi \u0026\u0026 /docker-entrypoint.sh cassandra -f'\n      networks:\n        - dc1ring\n      volumes:\n        - ./n2data:/var/lib/cassandra\n      environment:\n        - CASSANDRA_CLUSTER_NAME=dev_cluster\n        - CASSANDRA_SEEDS=DC1N1\n      depends_on:\n        - DC1N1\n      expose:\n        - 7000\n        - 7001\n        - 7199\n        - 9042\n        - 9160\n      ports:\n        - \"9043:9042\"\n      ulimits:\n        memlock: -1\n        nproc: 32768\n        nofile: 100000\n\n    # Node 3 Configuration\n    DC1N3:\n      image: cassandra:3.10\n      command: bash -c 'if [ -z \"$$(ls -A /var/lib/cassandra/)\" ] ; then sleep 120; fi \u0026\u0026 /docker-entrypoint.sh cassandra -f'\n      networks:\n        - dc1ring\n      volumes:\n        - ./n3data:/var/lib/cassandra\n      environment:\n        - CASSANDRA_CLUSTER_NAME=dev_cluster\n        - CASSANDRA_SEEDS=DC1N1\n      depends_on:\n        - DC1N1\n      expose:\n        - 7000\n        - 7001\n        - 7199\n        - 9042\n        - 9160\n      ports:\n        - \"9044:9042\"\n      ulimits:\n        memlock: -1\n        nproc: 32768\n        nofile: 100000\n\n    # Portainer Configuration\n    portainer:\n      image: portainer/portainer\n      networks:\n        - dc1ring\n      volumes:\n        - /var/run/docker.sock:/var/run/docker.sock\n        - ./portainer-data:/data\n      ports:\n        - \"9000:9000\"\n\n  networks:\n    dc1ring: { }\n  ```\n\n  ### Explanation\n\n  1. 
**Cassandra Nodes**:\n     - **DC1N1, DC1N2, DC1N3**:\n       - Each service represents a Cassandra node in the cluster.\n       - `image` specifies the Docker image used (`cassandra:3.10`).\n       - `command` delays the first start when the data directory is empty (0, 60, and 120 seconds for nodes 1, 2, and 3), which staggers the initial startup of the three nodes, then starts Cassandra.\n       - `networks` attaches the node to the internal network (`dc1ring`) shared by the cluster.\n       - `volumes` maps a host directory to `/var/lib/cassandra` in the container for persistent storage.\n       - `environment` variables set cluster configuration: `CASSANDRA_CLUSTER_NAME` names the cluster and `CASSANDRA_SEEDS` points every node at `DC1N1` as the seed.\n       - `expose` lists the ports used inside the network (cluster communication, JMX, CQL, Thrift), and `ports` publishes each node's CQL port 9042 on host ports 9042, 9043, and 9044.\n       - `ulimits` sets resource limits for the container (`memlock`, `nproc`, `nofile`).\n\n  2. **Portainer**:\n     - The **Portainer** service provides a web-based interface for managing Docker containers, published on port 9000.\n     - It is configured to use the same `dc1ring` network and has access to the Docker socket for control.\n\u003c/details\u003e\n\n## API Documentation\n\n\u003cdetails open\u003e\n  \u003csummary\u003eDetails\u003c/summary\u003e\n  The API is structured around resources representing financial data and commodities. It supports retrieving data by asset identifier, with `limit` and `offset` query parameters for pagination.\n\n  For easier use, a **Postman collection** is provided. 
You can download it [here](https://github.com/liviuxyz-ctrl/DataWarehouse/blob/master/Financial%20Data%20API.postman_collection.json).\n\n  ### Endpoint Details\n\n  #### Financial Data Endpoints\n\n  - **GET /api/v1/data/{asset_id}**\n    - Retrieves financial data for a specified asset.\n    - Parameters:\n      - `asset_id`: Identifier of the asset (the example below uses the ticker symbol `AAPL`).\n      - `limit`: Number of records to return.\n      - `offset`: Number of records to skip from the beginning.\n    - Example: `http://127.0.0.1:8000/api/v1/data/AAPL?limit=20\u0026offset=0`\n\n  #### Commodity Data Endpoints\n\n  - **GET /api/v1/commodities/{commodity_id}**\n    - Retrieves commodity data for a specified commodity.\n    - Parameters:\n      - `commodity_id`: Identifier of the commodity (the example below uses `brent`).\n      - `limit`: Number of records to return.\n      - `offset`: Number of records to skip from the beginning.\n    - Example: `http://127.0.0.1:8000/api/v1/commodities/brent?limit=20\u0026offset=0`\n\n  #### Asset Endpoints\n\n  - **GET /api/v1/assets**\n    - Retrieves a list of asset names.\n    - Parameters:\n      - `offset`: Number of records to skip from the beginning.\n      - `limit`: Number of records to return.\n    - Example: `http://127.0.0.1:8000/api/v1/assets?offset=0\u0026limit=20`\n\n  #### Data Source Endpoints\n\n  - **GET /api/v1/data_sources**\n    - Retrieves a list of all data sources.\n    - Example: `http://127.0.0.1:8000/api/v1/data_sources`\n\n  - **GET /api/v1/data_sources/{source_id}**\n    - Retrieves details of a specific data source.\n    - Parameters:\n      - `source_id`: UUID of the data source.\n    - Example: `http://127.0.0.1:8000/api/v1/data_sources/{source_id}`\n\n  ### Examples\n\n  ```bash\n  # Fetch financial data for a specific asset\n  curl -X GET \"http://localhost:8000/api/v1/data/AAPL?limit=10\u0026offset=0\"\n\n  # Retrieve commodity data\n  curl -X GET \"http://localhost:8000/api/v1/commodities/brent?limit=5\u0026offset=0\"\n\n  # Get a list of assets\n  curl -X GET 
\"http://localhost:8000/api/v1/assets?offset=0\u0026limit=20\"\n\n  # Get a list of data sources\n  curl -X GET \"http://localhost:8000/api/v1/data_sources\"\n\n  # Get details of a specific data source\n  curl -X GET \"http://localhost:8000/api/v1/data_sources/{source_id}\"\n  ```\n\u003c/details\u003e\n\n## Photos\n\n\u003cdetails open\u003e\n  \u003csummary\u003eDetails\u003c/summary\u003e\n  \n  ![image](https://github.com/liviuxyz-ctrl/DataWarehouse/assets/70070368/30ad1780-90cc-49cb-be44-45df3acffbeb)\n  \n  ![image](https://github.com/liviuxyz-ctrl/DataWarehouse/assets/70070368/d81a715e-cfc0-4b3e-b62c-87fb8dbeb35d)\n\n  ![image](https://github.com/liviuxyz-ctrl/Financial-DataWarehouse/assets/70070368/64c7257a-5ed7-4c5f-b835-c27c40fcc05c)\n\n\u003c/details\u003e\n\n## License\n\n\u003cdetails\u003e\n  \u003csummary\u003eDetails\u003c/summary\u003e\n  Licensed under the MIT License. See [LICENSE.md](LICENSE) for more details.\n\u003c/details\u003e\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fliviuxyz-ctrl%2Ffinancial-datawarehouse","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fliviuxyz-ctrl%2Ffinancial-datawarehouse","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fliviuxyz-ctrl%2Ffinancial-datawarehouse/lists"}