https://github.com/mc2-project/opaque-sql
An encrypted data analytics platform
- Host: GitHub
- URL: https://github.com/mc2-project/opaque-sql
- Owner: mc2-project
- License: apache-2.0
- Created: 2016-10-31T19:09:35.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2023-03-29T16:04:05.000Z (about 2 years ago)
- Last Synced: 2024-08-01T18:40:10.240Z (9 months ago)
- Topics: analytics, enclave, machine-learning, privacy, security, spark, spark-sql
- Language: Scala
- Homepage:
- Size: 18 MB
- Stars: 178
- Watchers: 16
- Forks: 73
- Open Issues: 22
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- Awesome-SGX-Open-Source - https://github.com/mc2-project/opaque-sql
README
## Secure Apache Spark SQL

Welcome to the landing page of Opaque SQL! Opaque SQL is a package for Apache Spark SQL that enables processing over encrypted DataFrames using the OpenEnclave framework.
### Quick start
Note that Opaque SQL requires the [MC2 Client](https://github.com/mc2-project/mc2) in order to securely run an encrypted query. For a quickstart on that end-to-end workflow, please see [the README](https://github.com/mc2-project/mc2#quickstart) in the MC2 Client repo.
### Usage
Similar to Apache Spark SQL, Opaque SQL offers an *encrypted DataFrame abstraction*. Users familiar with the Spark API can run queries on encrypted DataFrames using the same API. The main difference is that we support saving and loading of DataFrames, but not actions like `.collect` or `.show`. An example script is the following:
```scala
// Import hooks to Opaque SQL
import edu.berkeley.cs.rise.opaque.implicits._
import org.apache.spark.sql.types._

// Load an encrypted DataFrame (saved using the MC2 client)
val df_enc = spark.read.format("edu.berkeley.cs.rise.opaque.EncryptedSource").load("/tmp/opaquesql.csv.enc")
// Run a filter query on the encrypted DataFrame
val result = df_enc.filter($"Age" < lit(30))
// This will save the encrypted result to the result directory on the cloud
result.write.format("edu.berkeley.cs.rise.opaque.EncryptedSource").save("/tmp/opaque_sql_result")
```
For more details on how to use Opaque SQL, please refer to [this section](https://mc2-project.github.io/opaque-sql-docs/src/usage/usage.html).
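Because actions like `.collect` and `.show` are not supported, every query follows the same load, transform, save pattern. The sketch below wraps that pattern in a small helper. It is only an illustration: `runEncryptedQuery`, `EncryptedFormat`, and `under30` are hypothetical names that are not part of Opaque SQL, and it assumes a Spark session in which Opaque SQL is already set up as in the example above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

// Data source name copied from the example above.
val EncryptedFormat = "edu.berkeley.cs.rise.opaque.EncryptedSource"

// Hypothetical helper (not part of Opaque SQL): load an encrypted DataFrame,
// apply a query, and save the still-encrypted result instead of collecting it.
def runEncryptedQuery(spark: SparkSession, inputPath: String, outputPath: String)(
    query: DataFrame => DataFrame): Unit = {
  val encrypted = spark.read.format(EncryptedFormat).load(inputPath)
  query(encrypted).write.format(EncryptedFormat).save(outputPath)
}

// The filter from the example above, expressed as a reusable query.
val under30: DataFrame => DataFrame = _.filter(col("Age") < lit(30))
```

With these definitions, `runEncryptedQuery(spark, "/tmp/opaquesql.csv.enc", "/tmp/opaque_sql_result")(under30)` would reproduce the filter query shown earlier.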
### Documentation
For more details on building, using, and contributing, please see our [documentation](https://mc2-project.github.io/opaque-sql-docs/src/index.html).
### Paper
This open-source project is based on our NSDI 2017 [paper](https://www.usenix.org/system/files/conference/nsdi17/nsdi17-zheng.pdf).
### Contact
Join the discussion on [Slack](https://join.slack.com/t/mc2-project/shared_invite/zt-rt3kxyy8-GS4KA0A351Ysv~GKwy8NEQ) or email us at [email protected].