https://github.com/trk54ylmz/spark-bigquery
Google BigQuery support for Spark SQL
- Host: GitHub
- URL: https://github.com/trk54ylmz/spark-bigquery
- Owner: trK54Ylmz
- Created: 2018-03-28T16:34:56.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2018-04-02T23:08:13.000Z (about 7 years ago)
- Last Synced: 2025-04-01T09:11:09.271Z (3 months ago)
- Topics: bigquery, spark
- Language: Scala
- Homepage:
- Size: 5.86 KB
- Stars: 5
- Watchers: 1
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Spark JDBC Big Query Connector
Google BigQuery support for Spark SQL
## Version
| spark-bigquery | Spark | Scala |
| :-----: | ----- | ----- |
| 0.1 | 2.1 | 2.11 |

## Usage
### Scala 2.11
```scala
// JdbcDialects is part of Spark itself; BigQueryDialect comes from this package
import org.apache.spark.sql.jdbc.JdbcDialects
import com.metglobal.oss.spark.jdbc._

// Register BigQuery dialect
JdbcDialects.registerDialect(BigQueryDialect)

val projectId = "[PROJECT ID]"
val oAuthType = "[OAUTH TYPE, DEFAULT = 0]"
val serviceAccount = "[SERVICE ACCOUNT EMAIL FOR BIGQUERY]"
val localOAuth = "[LOCAL OAUTH FILE *.P12]"

val url = s"jdbc:bigquery://https://www.googleapis.com/$projectId:443;ProjectId=$projectId;OAuthType=$oAuthType;OAuthServiceAcctEmail=$serviceAccount;OAuthPvtKeyPath=$localOAuth"

// Read the result of a query through the Simba BigQuery JDBC driver
val df = spark.read
  .format("jdbc")
  .option("driver", "com.simba.googlebigquery.jdbc42.Driver")
  .option("url", url)
  .option("dbtable", "(SELECT a, SUM(b) AS c, CAST(d AS STRING) FROM test.records GROUP BY a) AS table")
  .load()

// Unregister dialect
JdbcDialects.unregisterDialect(BigQueryDialect)
```

### Python
```python
sc = spark.sparkContext

# Register BigQuery dialect through the JVM gateway
sc._jvm.com.metglobal.oss.spark.jdbc.BigQueryRegister.register()

df = spark.read \
    .format("jdbc") \
    .option("driver", "com.simba.googlebigquery.jdbc42.Driver") \
    .option("url", "jdbc:bigquery://https://www.googleapis.com/...") \
    .option("dbtable", "(SELECT a, SUM(b) AS c, CAST(d AS STRING) FROM test.records GROUP BY a) AS table") \
    .load()

# Unregister dialect
sc._jvm.com.metglobal.oss.spark.jdbc.BigQueryRegister.unregister()
```
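
The Python example abbreviates the JDBC URL. As a minimal sketch, the URL could be assembled the same way the Scala example does it. The helper name `build_bigquery_jdbc_url` and the sample project, account, and key path are hypothetical; the property names (`ProjectId`, `OAuthType`, `OAuthServiceAcctEmail`, `OAuthPvtKeyPath`) and the URL layout are copied from the Scala snippet above, not checked against the driver documentation.

```python
def build_bigquery_jdbc_url(project_id, service_account, key_path, oauth_type=0):
    """Assemble a JDBC URL in the same shape as the Scala example."""
    # Hypothetical helper: property names and layout mirror the Scala
    # snippet in this README and are assumptions about the driver.
    properties = [
        ("ProjectId", project_id),
        ("OAuthType", oauth_type),
        ("OAuthServiceAcctEmail", service_account),
        ("OAuthPvtKeyPath", key_path),
    ]
    suffix = ";".join(f"{key}={value}" for key, value in properties)
    return f"jdbc:bigquery://https://www.googleapis.com/{project_id}:443;{suffix}"


# Placeholder values for illustration only
url = build_bigquery_jdbc_url(
    project_id="my-project",
    service_account="reader@my-project.iam.gserviceaccount.com",
    key_path="/path/to/key.p12",
)
```

The resulting `url` can then be passed to `.option("url", url)` in place of the literal string.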