Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/fulldecent/google-sheets-etl
Live import all your Google Sheets to your data warehouse
https://github.com/fulldecent/google-sheets-etl
data-vault data-warehouse etl
Last synced: 3 months ago
JSON representation
Live import all your Google Sheets to your data warehouse
- Host: GitHub
- URL: https://github.com/fulldecent/google-sheets-etl
- Owner: fulldecent
- License: mit
- Created: 2019-10-03T16:44:02.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2023-04-19T19:51:57.000Z (over 1 year ago)
- Last Synced: 2024-07-10T07:05:22.156Z (4 months ago)
- Topics: data-vault, data-warehouse, etl
- Language: PHP
- Homepage:
- Size: 2.25 MB
- Stars: 15
- Watchers: 3
- Forks: 1
- Open Issues: 3
-
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
Awesome Lists containing this project
- awesome-data-engineering - Google Sheets ETL - Live import all your Google Sheets to your data warehouse. (Data Ingestion)
README
[![PHP Composer](https://github.com/fulldecent/google-sheets-etl/actions/workflows/php.yml/badge.svg)](https://github.com/fulldecent/google-sheets-etl/actions/workflows/php.yml)
Google Sheets ETL
=================Import all your Google Sheets to your data warehouse, including periodic delta loads
See `example.php` how to use this library.
## Install
Via Composer
```sh
composer install
```Next, create a Google Service Account. This requires 20 steps so we made a [a step-by-step illustrated guide](GOOGLE-SETUP.md).
## Testing
```sh
composer test
```## Google Sheets limitations
We found several problems with using Google Sheets as a database, even though we continue to use it:
- Cannot restrict editing the first row (headers) to certain people
- If you try protecting the cells it will prevent everyone from using a filter which is unacceptable
- Sometimes the page will load slowly and your collaborators will accidentally overwrite the first row, which is default-selected, and it will cause your ETL to error until fixed
- Cannot restrict that any formatting must apply to the entire column (including new rows)
- Inevitably, any conditional formatting you try to set up will apply to a disjoint set of cells throughout your sheet over time
- Cannot restrict that formulas must apply to the entire column (including new rows)
- Inevitably, over time your calculated "status" column will turn into the text literal "DONE" as people copy-paste-values to new rows
- Cannot limit people from using formatting in cells (which comes by default when they paste into cells)
- Cannot enforce a unique column
- Creating a custom data validation formula is cumbersome and not reliable, plus other collaborators can defeat it
- Cannot create a sheet-level comment to document the purpose of the whole sheet
- Filters cannot be used, because they hide rows for everybody
- If using another mode "filter views", which is harder to find, it will create hundreds of saved "Filter 1", "Filter 2" ... files.## References
* PHP project layout from https://github.com/thephpleague/skeleton
* "You should never catch errors to report them" https://phpdelusions.net/pdo#errors