https://github.com/izhangzhihao/spark-security
- Host: GitHub
- URL: https://github.com/izhangzhihao/spark-security
- Owner: izhangzhihao
- License: apache-2.0
- Created: 2021-12-20T11:57:57.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2021-12-23T02:29:08.000Z (about 3 years ago)
- Last Synced: 2024-10-29T04:41:02.473Z (3 months ago)
- Topics: ranger, ranger-plugin, security, spark, spark-sql, sql
- Language: Scala
- Homepage:
- Size: 143 KB
- Stars: 4
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Submarine Spark Security Plugin is built using [Apache Maven](http://maven.apache.org). To build it, `cd` to the root directory of the Submarine project and run:
```bash
mvn clean package -Dmaven.javadoc.skip=true -DskipTests -pl :submarine-spark-security
```

By default, Submarine Spark Security Plugin is built against Apache Spark `2.3.x` and Apache Ranger `1.1.0`, which may be incompatible with other Apache Spark or Apache Ranger releases.
Currently, available profiles are:
**Spark**: `-Pspark-2.3`, `-Pspark-2.4`, `-Pspark-3.0`
**Ranger**: `-Pranger-1.2`, `-Pranger-2.0`, `-Pranger-2.1`
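
As a minimal sketch of how the profiles might be used, the snippet below adds one Spark profile and one Ranger profile to the build command shown earlier. Combining one profile from each group is an assumption, and the `SPARK_PROFILE`, `RANGER_PROFILE`, and `RUN_BUILD` variables are hypothetical names introduced only for this example.

```bash
# Assumed choices for illustration: one Spark profile and one Ranger profile.
SPARK_PROFILE="${SPARK_PROFILE:-spark-2.4}"
RANGER_PROFILE="${RANGER_PROFILE:-ranger-2.0}"

# Same flags as the default build, plus the two selected profiles.
BUILD_CMD="mvn clean package -Dmaven.javadoc.skip=true -DskipTests -pl :submarine-spark-security -P${SPARK_PROFILE} -P${RANGER_PROFILE}"

# Print the command for review; set RUN_BUILD=1 to execute it from the project root.
echo "$BUILD_CMD"
if [ "${RUN_BUILD:-0}" = "1" ]; then
  $BUILD_CMD
fi
```

Setting `SPARK_PROFILE=spark-3.0 RANGER_PROFILE=ranger-2.1` before running the snippet would select a different pair from the lists above.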
ACL Management for Apache Spark SQL with Apache Ranger, enabling:
- Table/Column level authorization
- Row level filtering
- Data masking

Security is one of the fundamental features for enterprise adoption. [Apache Ranger™](https://ranger.apache.org) offers security plugins for many Hadoop ecosystem components,
such as HDFS, Hive, HBase, Solr and Sqoop2. However, [Apache Spark™](http://spark.apache.org) is not yet among them.
When a secured HDFS cluster is used as a data warehouse accessed by various users and groups via different applications written with Spark and Hive,
it is very difficult to guarantee consistent data management. Apache Spark users access the data warehouse only
through the storage-based access controls offered by HDFS. This library enables Spark with SQL Standard Based Authorization.

## Build
Please refer to the online documentation - [Building submarine spark security plugin](build-submarine-spark-security-plugin.md)
## Quick Start
Three steps to integrate Apache Spark and Apache Ranger.
### Installation
Place the `submarine-spark-security-<version>.jar` into `$SPARK_HOME/jars`.
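
The installation step could be scripted roughly as follows. This is only a sketch: `install_plugin` is a hypothetical helper, not part of the project, and the directory holding the built jar is whatever your build produced.

```bash
# install_plugin <dir-containing-jar> <spark-home>
# Hypothetical helper: copies every submarine-spark-security-*.jar found in the
# first directory into <spark-home>/jars and fails if no jar is found.
install_plugin() {
  local jar_dir="$1" spark_home="$2" jar found=0
  for jar in "$jar_dir"/submarine-spark-security-*.jar; do
    # When the glob matches nothing it stays literal, so skip non-existent paths.
    [ -e "$jar" ] || continue
    cp "$jar" "$spark_home/jars/" && found=1
  done
  if [ "$found" -ne 1 ]; then
    echo "no submarine-spark-security jar found in $jar_dir" >&2
    return 1
  fi
}
```

A call such as `install_plugin ./target "$SPARK_HOME"` assumes the jar was built into `./target`; adjust the first argument to match your build output.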
### Configurations
#### Settings for Apache Ranger
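Create `ranger-spark-security.xml` in `$SPARK_HOME/conf` and add the following configurations
for pointing to the right Apache Ranger admin server.

```xml
<property>
    <name>ranger.plugin.spark.policy.rest.url</name>
    <value>https://10.102.13.36:6182</value>
</property>

<property>
    <name>ranger.plugin.spark.policy.rest.ssl.config.file</name>
    <value>/etc/spark/conf/ranger-spark-policymgr-ssl.xml</value>
</property>

<property>
    <name>ranger.plugin.spark.service.name</name>
    <value>cm_hive</value>
</property>

<property>
    <name>ranger.plugin.spark.policy.cache.dir</name>
    <value>/tmp</value>
</property>

<property>
    <name>ranger.plugin.spark.policy.pollIntervalMs</name>
    <value>5000</value>
</property>

<property>
    <name>ranger.plugin.spark.policy.source.impl</name>
    <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>
</property>
```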
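Create `ranger-spark-audit.xml` in `$SPARK_HOME/conf` and add the following configurations
to enable/disable auditing.

```xml
<property>
    <name>xasecure.audit.is.enabled</name>
    <value>true</value>
</property>

<!-- ranger-spark-policymgr-ssl.xml -->
<property>
    <name>xasecure.policymgr.clientssl.truststore</name>
    <value>/home/bigdatauser/cm-auto-global_truststore.jks</value>
</property>

<property>
    <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
    <value>jceks://file/home/bigdatauser/ranger-truststore.jceks</value>
</property>

<property>
    <name>xasecure.policymgr.clientssl.keystore</name>
    <value>/home/bigdatauser/cm-auto-host_keystore.jks</value>
</property>

<property>
    <name>xasecure.policymgr.clientssl.keystore.credential.file</name>
    <value>jceks://file/home/bigdatauser/ranger-keystore.jceks</value>
</property>

<property>
    <name>xasecure.policymgr.clientssl.keystore.type</name>
    <value>jks</value>
</property>

<property>
    <name>xasecure.policymgr.clientssl.truststore.type</name>
    <value>jks</value>
</property>
```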
#### Settings for Apache Spark
You can configure `spark.sql.extensions` with the `*Extension` we provided.
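
One possible way to apply this setting persistently is to add it to `spark-defaults.conf`. The sketch below assumes that approach; `enable_ranger_extension` is a hypothetical helper written for this example, and the default class is one of the two extensions this plugin provides.

```bash
# enable_ranger_extension <path-to-spark-defaults.conf> [extension-class]
# Hypothetical helper: appends spark.sql.extensions to the given file unless
# the key is already present, so repeated runs do not add duplicate lines.
enable_ranger_extension() {
  local conf_file="$1"
  local ext="${2:-org.apache.submarine.spark.security.api.RangerSparkAuthzExtension}"
  if [ -f "$conf_file" ] && grep -q '^spark\.sql\.extensions' "$conf_file"; then
    echo "spark.sql.extensions is already set in $conf_file" >&2
    return 0
  fi
  # spark-defaults.conf accepts "key value" pairs separated by whitespace.
  printf 'spark.sql.extensions %s\n' "$ext" >> "$conf_file"
}
```

For instance, `enable_ranger_extension "$SPARK_HOME/conf/spark-defaults.conf"` would record the setting for every session started from that Spark installation, while passing the property with `--conf` on the command line limits it to a single session.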
For example, `spark.sql.extensions=org.apache.submarine.spark.security.api.RangerSparkAuthzExtension`.

Currently, you can set `spark.sql.extensions` to one of the following options to choose authorization with or without
extra functions.

| option | authorization | row filtering | data masking |
|---|---|---|---|
|org.apache.submarine.spark.security.api.RangerSparkAuthzExtension| √ | × | × |
|org.apache.submarine.spark.security.api.RangerSparkSQLExtension| √ | √ | √ |

## Apache Submarine Community
Read the [Apache Submarine Community Guide](https://submarine.apache.org/docs/community/README)
How to contribute: [Contributing Guide](https://submarine.apache.org/docs/community/contributing)
Issue Tracking: https://issues.apache.org/jira/projects/SUBMARINE
## User Document
See [User Guide Home Page](https://submarine.apache.org/docs/)
## Developer Document
See [Developer Guide Home Page](https://submarine.apache.org/docs/devDocs/Development/)
## Roadmap
Want to know more about what's coming for Submarine? Please check out the roadmap: https://cwiki.apache.org/confluence/display/SUBMARINE/Roadmap
## License
The Apache Submarine project is licensed under the Apache 2.0 License. See the [LICENSE](./LICENSE) file for details.