
https://github.com/izhangzhihao/spark-security



ranger ranger-plugin security spark spark-sql sql


Submarine Spark Security Plugin is built using [Apache Maven](http://maven.apache.org). To build it, `cd` to the root directory of the Submarine project and run:

```bash
mvn clean package -Dmaven.javadoc.skip=true -DskipTests -pl :submarine-spark-security
```

By default, Submarine Spark Security Plugin is built against Apache Spark `2.3.x` and Apache Ranger `1.1.0`, which may be incompatible with other Apache Spark or Apache Ranger releases.

To build against other versions, use the Maven profiles currently available:

**Spark**: `-Pspark-2.3`, `-Pspark-2.4`, `-Pspark-3.0`

**Ranger**: `-Pranger-1.2`, `-Pranger-2.0`, `-Pranger-2.1`
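As an illustration of combining profiles, the sketch below assembles the build command for one Spark profile and one Ranger profile. The `build_cmd` helper is hypothetical and not part of the project. It passes the profiles in Maven's comma-separated `-P` form, and whether a particular Spark/Ranger pair builds cleanly is not confirmed here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the Maven command for a Spark/Ranger profile pair.
# Profile names are taken from the lists above; the function name is illustrative.
build_cmd() {
  local spark_profile="$1"    # e.g. spark-2.4
  local ranger_profile="$2"   # e.g. ranger-2.0
  printf 'mvn clean package -Dmaven.javadoc.skip=true -DskipTests -pl :submarine-spark-security -P%s,%s\n' \
    "$spark_profile" "$ranger_profile"
}

# Print the command for Spark 2.4 with Ranger 2.0 (an arbitrary example pair).
build_cmd spark-2.4 ranger-2.0
```

To run the build rather than print it, execute the printed command from the project root.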

ACL Management for Apache Spark SQL with Apache Ranger, enabling:

- Table/Column level authorization
- Row level filtering
- Data masking

Security is one of the fundamental features for enterprise adoption. [Apache Ranger™](https://ranger.apache.org) offers security plugins for many Hadoop ecosystem components,
such as HDFS, Hive, HBase, Solr and Sqoop2. However, [Apache Spark™](http://spark.apache.org) is not yet among them.
When a secured HDFS cluster serves as a data warehouse accessed by various users and groups through different applications written with Spark and Hive,
it is very difficult to manage data access in a consistent way. Without this plugin, Apache Spark users access the data warehouse only
with the storage-based access controls offered by HDFS. This library enables SQL Standard Based Authorization for Spark.

## Build

Please refer to the online documentation: [Building the Submarine Spark Security Plugin](build-submarine-spark-security-plugin.md)

## Quick Start

Three steps to integrate Apache Spark and Apache Ranger.

### Installation

Place the `submarine-spark-security-<version>.jar` into `$SPARK_HOME/jars`.
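The installation step can be sketched as a small shell function. `install_plugin` is a hypothetical helper, not something shipped with the plugin, and the jar path in the usage comment is a placeholder to fill in with the actual build output.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy the built plugin jar into Spark's jars directory.
install_plugin() {
  local jar="$1"                             # path to the built plugin jar
  local spark_home="${2:-${SPARK_HOME:-}}"   # defaults to $SPARK_HOME

  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar" >&2
    return 1
  fi
  if [ ! -d "$spark_home/jars" ]; then
    echo "no jars directory under: $spark_home" >&2
    return 1
  fi
  cp "$jar" "$spark_home/jars/"
}

# Example (placeholder path):
# install_plugin path/to/submarine-spark-security-<version>.jar
```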

### Configurations

#### Settings for Apache Ranger

Create `ranger-spark-security.xml` in `$SPARK_HOME/conf` and add the following configuration
to point the plugin at the right Apache Ranger admin server.

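```xml
<configuration>

    <property>
        <name>ranger.plugin.spark.policy.rest.url</name>
        <value>https://10.102.13.36:6182</value>
    </property>

    <property>
        <name>ranger.plugin.spark.policy.rest.ssl.config.file</name>
        <value>/etc/spark/conf/ranger-spark-policymgr-ssl.xml</value>
    </property>

    <property>
        <name>ranger.plugin.spark.service.name</name>
        <value>cm_hive</value>
    </property>

    <property>
        <name>ranger.plugin.spark.policy.cache.dir</name>
        <value>/tmp</value>
    </property>

    <property>
        <name>ranger.plugin.spark.policy.pollIntervalMs</name>
        <value>5000</value>
    </property>

    <property>
        <name>ranger.plugin.spark.policy.source.impl</name>
        <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>
    </property>

</configuration>
```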

Create `ranger-spark-audit.xml` in `$SPARK_HOME/conf` and add the following configurations
to enable/disable auditing.

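```xml
<!-- ranger-spark-audit.xml -->
<configuration>

    <property>
        <name>xasecure.audit.is.enabled</name>
        <value>true</value>
    </property>

</configuration>

<!-- ranger-spark-policymgr-ssl.xml -->
<configuration>

    <property>
        <name>xasecure.policymgr.clientssl.truststore</name>
        <value>/home/bigdatauser/cm-auto-global_truststore.jks</value>
    </property>

    <property>
        <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
        <value>jceks://file/home/bigdatauser/ranger-truststore.jceks</value>
    </property>

    <property>
        <name>xasecure.policymgr.clientssl.keystore</name>
        <value>/home/bigdatauser/cm-auto-host_keystore.jks</value>
    </property>

    <property>
        <name>xasecure.policymgr.clientssl.keystore.credential.file</name>
        <value>jceks://file/home/bigdatauser/ranger-keystore.jceks</value>
    </property>

    <property>
        <name>xasecure.policymgr.clientssl.keystore.type</name>
        <value>jks</value>
    </property>

    <property>
        <name>xasecure.policymgr.clientssl.truststore.type</name>
        <value>jks</value>
    </property>

</configuration>
```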
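Before starting Spark, it can help to confirm that the Ranger configuration files named above are in place. The following is a minimal sketch, with `check_ranger_conf` as a hypothetical helper name. It only checks that the files exist and does not validate their contents.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which Ranger config files are missing from a conf directory.
check_ranger_conf() {
  local conf_dir="$1"   # e.g. "$SPARK_HOME/conf"
  local missing=0
  local f
  for f in ranger-spark-security.xml ranger-spark-audit.xml; do
    if [ ! -f "$conf_dir/$f" ]; then
      echo "missing: $conf_dir/$f"
      missing=1
    fi
  done
  # Exit status 0 when both files exist, 1 otherwise.
  return "$missing"
}

# Example:
# check_ranger_conf "$SPARK_HOME/conf" && echo "Ranger config files found"
```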

#### Settings for Apache Spark

Set `spark.sql.extensions` to one of the `*Extension` classes this plugin provides.
For example, `spark.sql.extensions=org.apache.submarine.spark.security.api.RangerSparkAuthzExtension`.

Currently, the following values of `spark.sql.extensions` select authorization with or without
the extra row filtering and data masking functions.

| option | authorization | row filtering | data masking |
|---|---|---|---|
|org.apache.submarine.spark.security.api.RangerSparkAuthzExtension| √ | × | × |
|org.apache.submarine.spark.security.api.RangerSparkSQLExtension| √ | √ | √ |
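One way to apply the setting is through Spark's properties file. The sketch below assumes the setting is kept in `$SPARK_HOME/conf/spark-defaults.conf`, which uses whitespace-separated `key value` lines. `set_sql_extension` is a hypothetical helper, and it replaces any existing `spark.sql.extensions` line, so other extensions already configured there would need to be merged by hand.

```bash
#!/usr/bin/env bash
# Hypothetical helper: set spark.sql.extensions in a Spark properties file.
set_sql_extension() {
  local conf_file="$1"   # e.g. "$SPARK_HOME/conf/spark-defaults.conf"
  local extension="$2"   # one of the extension classes from the table above

  touch "$conf_file"
  # Drop any previous spark.sql.extensions line, keeping all other settings.
  grep -v '^spark\.sql\.extensions[[:space:]]' "$conf_file" > "$conf_file.tmp" || true
  # Append the new setting.
  printf 'spark.sql.extensions %s\n' "$extension" >> "$conf_file.tmp"
  mv "$conf_file.tmp" "$conf_file"
}

# Example: enable authorization plus row filtering and data masking.
# set_sql_extension "$SPARK_HOME/conf/spark-defaults.conf" \
#   org.apache.submarine.spark.security.api.RangerSparkSQLExtension
```

The same property can instead be passed per application with Spark's `--conf` option.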

## Apache Submarine Community

Read the [Apache Submarine Community Guide](https://submarine.apache.org/docs/community/README)

How to contribute: [Contributing Guide](https://submarine.apache.org/docs/community/contributing)

Issue Tracking: https://issues.apache.org/jira/projects/SUBMARINE

## User Document

See [User Guide Home Page](https://submarine.apache.org/docs/)

## Developer Document

See [Developer Guide Home Page](https://submarine.apache.org/docs/devDocs/Development/)

## Roadmap

Want to know more about what's coming for Submarine? Check out the roadmap: https://cwiki.apache.org/confluence/display/SUBMARINE/Roadmap

## License

The Apache Submarine project is licensed under the Apache 2.0 License. See the [LICENSE](./LICENSE) file for details.