https://github.com/someengineering/cloud2sql
Read infrastructure data from your cloud ☁️ and export it to a SQL database 📋.
- Host: GitHub
- URL: https://github.com/someengineering/cloud2sql
- Owner: someengineering
- License: agpl-3.0
- Created: 2022-11-14T13:41:41.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-11-09T22:37:28.000Z (over 2 years ago)
- Last Synced: 2025-04-26T06:41:08.504Z (11 months ago)
- Language: Python
- Homepage: https://cloud2sql.com
- Size: 695 KB
- Stars: 33
- Watchers: 5
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
# Cloud2SQL 🤩
Read infrastructure data from your cloud ☁️ and export it to a SQL database 📋.

## Installation
### Install via homebrew
This is the easiest way to install Cloud2SQL.
Please note that the installation process will take a couple of minutes.
```bash
brew install someengineering/tap/cloud2sql
```
### Install via Python pip
Alternatively, you can install Cloud2SQL as a Python package; Python 3.9 or higher is required.
If you only need support for a specific database, you can choose `cloud2sql[snowflake]`, `cloud2sql[parquet]`, `cloud2sql[postgresql]`, or `cloud2sql[mysql]` instead of `cloud2sql[all]`.
```bash
pip3 install --user "cloud2sql[all]"
```
This will install the executable to the user install directory of your platform. Please make sure this installation directory is listed in `PATH`.
## Usage
The sources and destinations for `cloud2sql` are configured via a configuration file. Create your own configuration by adjusting the [config template file](config-template.yaml).
You can safely delete the sections that are not relevant to you (e.g. if you do not use AWS, you can delete the `aws` section).
All sections refer to cloud providers and are enabled if a configuration section is provided.
The next section describes the available configuration options. Once you have created your configuration file, you can run `cloud2sql` with the following command:
```bash
cloud2sql --config myconfig.yaml
```
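As an illustration, a minimal `myconfig.yaml` combining one source and one destination might look like this (a sketch assembled from the sections below; the AWS section loads credentials from the environment, and the SQLite path is an example):

```yaml
sources:
  aws:
    # null means: load credentials from the environment
    access_key_id: null
    secret_access_key: null
destinations:
  sqlite:
    database: ./cloud2sql.db
```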
## Configuration
Cloud2SQL uses a YAML configuration file to define the `sources` and `destinations`.
### Sources
#### AWS
```yaml
sources:
  aws:
    # AWS Access Key ID (null to load from env - recommended)
    access_key_id: null
    # AWS Secret Access Key (null to load from env - recommended)
    secret_access_key: null
    # IAM role name to assume
    role: null
    # List of AWS profiles to collect
    profiles: null
    # List of AWS Regions to collect (null for all)
    region: null
    # Scrape the entire AWS organization
    scrape_org: false
    # Assume given role in current account
    assume_current: false
    # Do not scrape current account
    do_not_scrape_current: false
```
#### Google Cloud
```yaml
sources:
  gcp:
    # GCP service account file(s)
    service_account: []
    # GCP project(s)
    project: []
```
#### Kubernetes
```yaml
sources:
  k8s:
    # Configure access via kubeconfig files.
    # Structure:
    # - path: "/path/to/kubeconfig"
    #   all_contexts: false
    #   contexts: ["context1", "context2"]
    config_files: []
    # Alternative: configure access to k8s clusters directly in the config.
    # Structure:
    # - name: 'k8s-cluster-name'
    #   certificate_authority_data: 'CERT'
    #   server: 'https://k8s-cluster-server.example.com'
    #   token: 'TOKEN'
    configs: []
```
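For instance, a kubeconfig-based setup following the structure above might look like this (the path and context name are placeholders for your own values):

```yaml
sources:
  k8s:
    config_files:
      - path: "~/.kube/config"
        all_contexts: false
        contexts: ["prod-cluster"]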
#### DigitalOcean
```yaml
sources:
  digitalocean:
    # DigitalOcean API tokens for the teams to be collected
    api_tokens: []
    # DigitalOcean Spaces access keys for the teams to be collected, separated by colons
    spaces_access_keys: []
```
### Destinations
#### SQLite
```yaml
destinations:
  sqlite:
    database: /path/to/database.db
```
#### PostgreSQL
```yaml
destinations:
  postgresql:
    host: 127.0.0.1
    port: 5432
    user: cloud2sql
    password: changeme
    database: cloud2sql
    args:
      key: value
```
#### MySQL
```yaml
destinations:
  mysql:
    host: 127.0.0.1
    port: 3306
    user: cloud2sql
    password: changeme
    database: cloud2sql
    args:
      key: value
```
#### MariaDB
```yaml
destinations:
  mariadb:
    host: 127.0.0.1
    port: 3306
    user: cloud2sql
    password: changeme
    database: cloud2sql
    args:
      key: value
```
#### Snowflake
```yaml
destinations:
  snowflake:
    host: myorg-myaccount
    user: cloud2sql
    password: changeme
    database: cloud2sql/public
    args:
      warehouse: compute_wh
      role: accountadmin
```
#### Apache Parquet
```yaml
destinations:
  file:
    path: /where/to/write/parquet/files/
    format: parquet
    batch_size: 100_000
```
#### CSV
```yaml
destinations:
  file:
    path: /where/to/write/to/csv/files/
    format: csv
    batch_size: 100_000
```
#### Upload to S3
```yaml
destinations:
  s3:
    uri: s3://bucket_name/
    region: eu-central-1
    format: csv
    batch_size: 100_000
```
#### Upload to Google Cloud Storage
```yaml
destinations:
  gcs:
    uri: gs://bucket_name/
    format: parquet
    batch_size: 100_000
```
#### My database is not listed here
Cloud2SQL uses SQLAlchemy to connect to the database. If your database is not listed here, you can check if it is supported in [SQLAlchemy Dialects](https://docs.sqlalchemy.org/en/20/dialects/index.html).
Install the relevant driver and use the connection string from the documentation.
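Assuming the destination section name maps to the SQLAlchemy dialect, as the examples above suggest, a destination for another supported database might follow the same shape. This is an unverified sketch for Microsoft SQL Server (dialect name, port, and driver support are assumptions; check the documentation for your driver):

```yaml
destinations:
  mssql:
    host: 127.0.0.1
    port: 1433
    user: cloud2sql
    password: changeme
    database: cloud2sql
```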
#### Example
We use a minimal configuration [example](config-example.yaml) and export the data to a SQLite database.
The example uses our AWS default credentials and the default Kubernetes config.
```bash
cloud2sql --config config-example.yaml
```
For a more in-depth example, check out our [blog post](https://resoto.com/blog/2022/12/21/installing-cloud2sql).
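After the run, you can inspect what was exported. Below is a small Python sketch that lists the tables in the resulting SQLite database; the path is whatever your `sqlite` destination points at (`cloud2sql.db` here is an assumption), and the table names depend on which resources were collected:

```python
import sqlite3


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in a SQLite database file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Example: point this at the path from your sqlite destination.
print(list_tables("cloud2sql.db"))
```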
## Local Development
Create a local development environment with the following command:
```bash
make setup
source venv/bin/activate
```