{"id":20768677,"url":"https://github.com/jmfeck/bigquery-local-framework","last_synced_at":"2026-05-06T19:35:51.584Z","repository":{"id":260792929,"uuid":"882335360","full_name":"jmfeck/bigquery-local-framework","owner":"jmfeck","description":"This repo provides tools to manage BigQuery operations locally, simplifying tasks like uploading flat files, running SQL queries, and downloading tables. It offers a unified interface for local BigQuery interactions, enabling more efficient interaction with it.","archived":false,"fork":false,"pushed_at":"2024-11-08T12:44:35.000Z","size":46,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-18T06:45:35.051Z","etag":null,"topics":["bigquery","data-engineering","ingestion","pandas","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jmfeck.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-02T14:54:34.000Z","updated_at":"2024-11-08T12:44:39.000Z","dependencies_parsed_at":"2024-11-02T16:23:19.922Z","dependency_job_id":"e0af862b-cd11-4329-ad3c-e74cf5ef6026","html_url":"https://github.com/jmfeck/bigquery-local-framework","commit_stats":null,"previous_names":["jmfeck/bigquery-local-framework"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmfeck%2Fbigquery-local-framework","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmfeck%2Fbigquery-loca
l-framework/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmfeck%2Fbigquery-local-framework/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmfeck%2Fbigquery-local-framework/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jmfeck","download_url":"https://codeload.github.com/jmfeck/bigquery-local-framework/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243098593,"owners_count":20236054,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bigquery","data-engineering","ingestion","pandas","python"],"created_at":"2024-11-17T11:40:20.923Z","updated_at":"2026-05-06T19:35:51.579Z","avatar_url":"https://github.com/jmfeck.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# BigQuery Local Framework\n\n## Overview\n\nThe **BigQuery Local Framework** is a Python-based toolkit for managing data workflows in Google BigQuery - locally. It includes scripts to automate data ingestion, query execution, and data extraction, making it easier to handle common tasks. 
Each script is customizable via YAML configuration files, providing flexibility for various data workflows.\nThis is a great tool for data people who don't have full access to all GCP resources, or for anyone looking for fast prototyping (which was my case).\n\n## Features\n\n- **Ingest Excel Files**: Load data from Excel files into BigQuery.\n- **Execute SQL Queries**: Run SQL queries stored in files directly within BigQuery.\n- **Extract Table Data**: Export data from a BigQuery table or view to a local file in formats such as CSV, Excel, and Parquet.\n- **Extract Query Data**: Export BigQuery query results to a local file in formats such as CSV, Excel, and Parquet.\n\n## Project Structure\n\n```plaintext\nbigquery-local-framework/\n│\n├── config/\n│   ├── sample_config_query_extractor.yaml      # Configuration for query extraction\n│   ├── sample_config_table_extractor.yaml      # Configuration for table extraction\n│   ├── sample_config_ingest.yaml               # Configuration for data ingestion\n│   └── sample_config_query_trigger.yaml        # Configuration for query execution\n│\n├── input/\n│   └── incoming_excel_filename.xlsx            # Sample input file for ingestion\n│\n├── output/\n│   ├── sample_query_export.csv                 # Sample output file for query extraction\n│   └── sample_table_export.csv                 # Sample output file for table extraction\n│\n├── queries/\n│   └── sample_query.sql                        # Sample SQL query for BigQuery\n│\n├── scripts/\n│   ├── bigquery_ingest_excel.py                # Script to ingest Excel files into BigQuery\n│   ├── bigquery_query_trigger.py               # Script to execute SQL queries in BigQuery\n│   ├── bigquery_query_extractor.py             # Script to extract query data from BigQuery\n│   └── bigquery_table_extractor.py             # Script to extract table data from BigQuery\n│\n├── sample_run_pipeline.bash                # Bash script to run the full 
pipeline\n├── sample_run_pipeline.bat                 # Batch script to run the full pipeline on Windows\n├── LICENSE\n└── README.md\n```\n\n## Setup\n\n### Prerequisites\n\n- **Python 3.9**\n- **Google Cloud SDK** with BigQuery API enabled\n- Required Python packages (install with `requirements.txt`):\n\n```bash\npip install -r requirements.txt\n```\n\n### Configuration\n\nConfiguration files for each task are located in the `config/` directory. Each YAML file contains settings specific to the task:\n\n- **sample_config_ingest.yaml**: Config for ingesting Excel files.\n- **sample_config_query_trigger.yaml**: Config for executing queries.\n- **sample_config_table_extractor.yaml**: Config for extracting complete data from view/table.\n- **sample_config_query_extractor.yaml**: Config for extracting results from query.\n\nCustomize these files with your project settings.\n\n## Usage\n\n### 1. Ingest Excel Files to BigQuery\n\nTo load data from an Excel file into BigQuery:\n\n```bash\npython scripts/bigquery_ingest_excel.py config/sample_config_ingest.yaml\n```\n\nThis script reads an Excel file from the `input/` directory and loads it into the specified BigQuery table.\n\n### 2. Execute SQL Queries in BigQuery\n\nTo run a SQL query from the `queries/` directory:\n\n```bash\npython scripts/bigquery_query_trigger.py config/sample_config_query_trigger.yaml queries/sample_query.sql\n```\n\nThis will execute the specified SQL query within BigQuery.\n\n### 3. 
Extract Table Data from BigQuery\n\nThere are two ways to extract data from BigQuery.\n\n#### 3.1 Extract Table/View Data\n\nTo export a table or view from BigQuery to a local file:\n\n```bash\npython scripts/bigquery_table_extractor.py config/sample_config_table_extractor.yaml\n```\n\nThe data will be saved in the `output/` directory in the format specified in the configuration file (CSV, Excel, or Parquet).\n\n#### 3.2 Extract Query Results\n\nTo export query results from BigQuery to a local file:\n\n```bash\npython scripts/bigquery_query_extractor.py config/sample_config_query_extractor.yaml queries/sample_query.sql\n```\n\nThe data will be saved in the `output/` directory in the format specified in the configuration file (CSV, Excel, or Parquet).\n\n## Sample Workflow\n\n- **sample_run_pipeline.bash**: Bash script to run the pipeline on Unix-based systems.\n- **sample_run_pipeline.bat**: Batch script to run the pipeline on Windows.\n\nThese scripts show how to automate ingestion, query execution, and extraction in sequence: a pipeline, if you will.\n\n## Example Workflow\n\n1. Place input files in the `input/` directory and set up configuration files in `config/`.\n2. Run `bigquery_ingest_excel.py` to load data into BigQuery.\n3. Use `bigquery_query_trigger.py` to run SQL transformations or analysis.\n4. Run `bigquery_table_extractor.py` to export the resulting table or view to a local file.\n\n## License\n\nThis project is licensed under the MIT License.\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmfeck%2Fbigquery-local-framework","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjmfeck%2Fbigquery-local-framework","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmfeck%2Fbigquery-local-framework/lists"}