{"id":18752997,"url":"https://github.com/spotify/gcs-tools","last_synced_at":"2025-04-09T08:08:07.287Z","repository":{"id":11178473,"uuid":"68550428","full_name":"spotify/gcs-tools","owner":"spotify","description":"GCS support for avro-tools, parquet-tools and protobuf","archived":false,"fork":false,"pushed_at":"2025-01-30T15:39:59.000Z","size":197,"stargazers_count":74,"open_issues_count":13,"forks_count":15,"subscribers_count":16,"default_branch":"main","last_synced_at":"2025-04-05T17:13:40.575Z","etag":null,"topics":["avro","gcp","gcs","gcs-connector","google-storage","parquet","protobuf"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/spotify.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-09-18T22:21:46.000Z","updated_at":"2025-01-30T15:06:35.000Z","dependencies_parsed_at":"2023-02-16T21:31:24.155Z","dependency_job_id":"57693fe3-7988-47f6-a9d0-b33ed9fb6209","html_url":"https://github.com/spotify/gcs-tools","commit_stats":null,"previous_names":[],"tags_count":22,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fgcs-tools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fgcs-tools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fgcs-tools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fgcs-tools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/owners/spotify","download_url":"https://codeload.github.com/spotify/gcs-tools/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247999859,"owners_count":21031046,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["avro","gcp","gcs","gcs-connector","google-storage","parquet","protobuf"],"created_at":"2024-11-07T17:23:40.131Z","updated_at":"2025-04-09T08:08:07.269Z","avatar_url":"https://github.com/spotify.png","language":"Scala","readme":"GCS Tools\n=========\n\n[![Build Status](https://github.com/spotify/gcs-tools/actions/workflows/ci.yml/badge.svg)](https://github.com/spotify/gcs-tools/actions/workflows/ci.yml)\n[![GitHub license](https://img.shields.io/github/license/spotify/gcs-tools.svg)](./LICENSE)\n\n## Raison d'être:\n\nLightweight wrapper that adds [Google Cloud Storage](https://cloud.google.com/storage/) (GCS) support to common Hadoop tools, including [avro-tools](https://mvnrepository.com/artifact/org.apache.avro/avro-tools), [parquet-cli](https://mvnrepository.com/artifact/org.apache.parquet/parquet-cli), proto-tools for [Scio](https://github.com/spotify/scio)'s Protobuf-in-Avro files, and magnolify-tools for [Magnolify](https://github.com/spotify/magnolify) code generation, so that they can be used from regular workstations or laptops, outside of a [Google Compute Engine](https://cloud.google.com/compute/) (GCE) instance.\n\nIt uses your existing OAuth2 credentials and allows authentication via a browser.\n\n## Usage:\n\nYou can install the tools via our [Homebrew 
tap](https://github.com/spotify/homebrew-public) on Mac.\n\n```\nbrew tap spotify/public\nbrew install gcs-avro-tools gcs-parquet-cli gcs-proto-tools gcs-magnolify-tools\navro-tools tojson \u003cGCS_PATH\u003e\nparquet-cli cat \u003cGCS_PATH\u003e\nproto-tools tojson \u003cGCS_PATH\u003e\nmagnolify-tools \u003cavro|parquet\u003e \u003cGCS_PATH\u003e\n```\n\nOr build them yourself.\n\n```\nsbt assembly\njava -jar avro-tools/target/scala-2.13/avro-tools-*.jar tojson \u003cGCS_PATH\u003e\njava -jar parquet-cli/target/scala-2.13/parquet-cli-*.jar cat \u003cGCS_PATH\u003e\njava -jar proto-tools/target/scala-2.13/proto-tools-*.jar cat \u003cGCS_PATH\u003e\njava -jar magnolify-tools/target/scala-2.13/magnolify-tools-*.jar \u003cavro|parquet\u003e \u003cGCS_PATH\u003e\n```\n\n## How it works:\n\nTo make avro-tools and parquet-cli work with GCS, we need:\n- [GCS connector](https://github.com/GoogleCloudPlatform/bigdata-interop) and its dependencies\n- [GCS connector configuration](//github.com/spotify/gcs-tools/blob/master/shared/src/main/resources/core-site.xml)\n\nThe GCS connector won't pick up your local gcloud configuration and instead expects settings\nin [core-site.xml](https://github.com/spotify/gcs-tools/blob/master/shared/src/main/resources/core-site.xml).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspotify%2Fgcs-tools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspotify%2Fgcs-tools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspotify%2Fgcs-tools/lists"}