https://github.com/googleclouddataproc/hadoop-connectors
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
- Host: GitHub
- URL: https://github.com/googleclouddataproc/hadoop-connectors
- Owner: GoogleCloudDataproc
- License: apache-2.0
- Created: 2014-05-12T03:11:55.000Z (about 11 years ago)
- Default Branch: master
- Last Pushed: 2025-05-07T02:36:05.000Z (about 1 month ago)
- Last Synced: 2025-05-07T03:33:58.560Z (about 1 month ago)
- Topics: bigquery, google-cloud-dataproc, hadoop, hadoop-filesystem, hadoop-hcfs
- Language: Java
- Size: 11.1 MB
- Stars: 284
- Watchers: 96
- Forks: 250
- Open Issues: 92
Metadata Files:
- Readme: README-template.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# Apache Hadoop Connectors

Libraries and tools for interoperability between Apache Hadoop related
open-source software and Google Cloud Platform.

## Google Cloud Storage connector for Apache Hadoop (HCFS)

The Google Cloud Storage connector for Hadoop enables running MapReduce jobs
directly on data in Google Cloud Storage by implementing the Hadoop FileSystem
interface. For details, see [the README](gcs/README.md).

## Unreleased Changes
This README may include documentation for changes that haven't been released yet. The latest release's documentation and source code are found here:
https://github.com/GoogleCloudDataproc/hadoop-connectors/tree/master

## Building the Cloud Storage connector
> Note that the build requires Java 11+ and fails with older Java versions.

To build the connector for a specific Hadoop version, run the following command
from the main directory:

```bash
./mvnw clean package
```

To verify test coverage for a specific Hadoop version, run the following
command from the main directory:

```bash
./mvnw -P coverage clean verify
```

The Cloud Storage connector JAR can be found in the `gcs/target/` directory.

## Adding the Cloud Storage connector to your build
The Maven group ID is `com.google.cloud.bigdataoss` and the artifact ID for the
Cloud Storage connector is `gcs-connector`.

To add a dependency on the Cloud Storage connector using Maven, use the following:

```xml
<dependency>
  <groupId>com.google.cloud.bigdataoss</groupId>
  <artifactId>gcs-connector</artifactId>
  <version>${next-gcs-connector-release-tag}</version>
</dependency>
```

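With the connector on the classpath, Hadoop code addresses Cloud Storage data through `gs://` URIs, where the authority is the bucket and the path is the object name. The following is a minimal sketch of validating and splitting such a URI using only `java.net.URI`. `GcsPath` and the bucket name are hypothetical, chosen for illustration, and are not part of the connector's API.

```java
import java.net.URI;

/** Hypothetical helper (not provided by the connector): splits a gs:// URI. */
final class GcsPath {
    final String bucket;
    final String objectName;

    private GcsPath(String bucket, String objectName) {
        this.bucket = bucket;
        this.objectName = objectName;
    }

    /** Parses a URI of the form gs://bucket/object; rejects other schemes. */
    static GcsPath parse(String uri) {
        URI parsed = URI.create(uri);
        if (!"gs".equals(parsed.getScheme())) {
            throw new IllegalArgumentException("Expected a gs:// URI but got: " + uri);
        }
        // Use getAuthority() rather than getHost(): bucket names may contain
        // underscores, which java.net.URI does not accept as a host name.
        String bucket = parsed.getAuthority();
        if (bucket == null || bucket.isEmpty()) {
            throw new IllegalArgumentException("Missing bucket in: " + uri);
        }
        // The URI path carries a leading slash that is not part of the object name.
        String path = parsed.getPath();
        String objectName = path.startsWith("/") ? path.substring(1) : path;
        return new GcsPath(bucket, objectName);
    }

    public static void main(String[] args) {
        GcsPath p = parse("gs://example-bucket/data/part-00000");
        System.out.println("bucket=" + p.bucket + " object=" + p.objectName);
    }
}
```

A check like this can catch a mistyped scheme or a missing bucket before a job is submitted. Resolving the path against Cloud Storage itself is left to the connector's Hadoop FileSystem implementation.
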
## Resources
On **Stack Overflow**, use the tag
[`google-cloud-dataproc`](https://stackoverflow.com/tags/google-cloud-dataproc)
for questions about the connectors in this repository. This tag receives
responses from the Stack Overflow community and Google engineers, who monitor
the tag and offer unofficial support.