https://github.com/data-platform-hq/terraform-databricks-databricks-runtime-premium
Terraform module for managing Databricks Premium Workspace
- Host: GitHub
- URL: https://github.com/data-platform-hq/terraform-databricks-databricks-runtime-premium
- Owner: data-platform-hq
- License: other
- Created: 2022-10-21T11:12:12.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-09-25T12:31:04.000Z (over 1 year ago)
- Last Synced: 2025-03-30T07:41:23.601Z (about 1 year ago)
- Topics: azure, databricks, databricks-workspace, terraform-modules
- Language: HCL
- Homepage: https://registry.terraform.io/modules/data-platform-hq/databricks-runtime-premium/databricks/latest
- Size: 182 KB
- Stars: 3
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# TODO - UPDATE DOCS
# Databricks Premium Workspace Terraform module
Terraform module used for management of Databricks Premium Resources
## Usage
### **Requires Workspace with "Premium" SKU**
This module deploys resources for a Databricks Workspace with the Premium SKU only.
The example below shows how to provision it with different options.
### The example below covers these features of the module:
1. Workspace admins assignment, custom Workspace group creation, group assignments, group entitlements
2. Clusters (i.e., for Unity Catalog and Shared Autoscaling)
3. Workspace IP Access list creation
4. ADLS Gen2 Mount
5. Create Secret Scope and assign permissions to custom groups
6. SQL Endpoint creation and configuration
7. Create Cluster policy
8. Create an Azure Key Vault-backed secret scope
9. Connect to already existing Unity Catalog Metastore
```hcl
# Prerequisite resources

# Databricks Workspace with Premium SKU
data "azurerm_databricks_workspace" "example" {
  name                = "example-workspace"
  resource_group_name = "example-rg"
}

# Databricks Provider configuration
provider "databricks" {
  alias                       = "main"
  host                        = data.azurerm_databricks_workspace.example.workspace_url
  azure_workspace_resource_id = data.azurerm_databricks_workspace.example.id
}

# Key Vault where Service Principal's secrets are stored. Used for mounting Storage Container
data "azurerm_key_vault" "example" {
  name                = "example-key-vault"
  resource_group_name = "example-rg"
}

# Storage Account referenced by the ADLS Gen2 mount below (placeholder name, assumed for this example)
data "azurerm_storage_account" "example" {
  name                = "examplestorageaccount"
  resource_group_name = "example-rg"
}

# Example usage of module for Runtime Premium resources.
module "databricks_runtime_premium" {
  source = "data-platform-hq/databricks-runtime-premium/databricks"

  project  = "datahq"
  env      = "example"
  location = "eastus"

  # Parameters of Service principal used for ADLS mount
  # Imports App ID and Secret of Service Principal from target Key Vault
  key_vault_id             = data.azurerm_key_vault.example.id
  sp_client_id_secret_name = "sp-client-id"        # secret's name that stores Service Principal App ID
  sp_key_secret_name       = "sp-key"              # secret's name that stores Service Principal Secret Key
  tenant_id_secret_name    = "infra-arm-tenant-id" # secret's name that stores tenant id value

  # 1.1 Workspace admins
  workspace_admins = {
    user              = ["user1@example.com"]
    service_principal = ["example-app-id"]
  }

  # 1.2 Custom Workspace group with assignments.
  # In addition, provides an ability to create group and entitlements.
  iam = [{
    group_name  = "DEVELOPERS"
    permissions = ["ADMIN"]
    entitlements = [
      "allow_instance_pool_create",
      "allow_cluster_create",
      "databricks_sql_access"
    ]
  }]

  # 2. Databricks clusters configuration, and assign permission to a custom group on clusters.
  databricks_cluster_configs = [
    {
      cluster_name       = "Unity Catalog"
      data_security_mode = "USER_ISOLATION"
      availability       = "ON_DEMAND_AZURE"
      spot_bid_max_price = 1
      permissions        = [{ group_name = "DEVELOPERS", permission_level = "CAN_RESTART" }]
    },
    {
      cluster_name       = "shared autoscaling"
      data_security_mode = "NONE"
      availability       = "SPOT_AZURE"
      spot_bid_max_price = -1
      permissions        = [{ group_name = "DEVELOPERS", permission_level = "CAN_MANAGE" }]
    }
  ]

  # 3. Workspace could be accessed only from these IP Addresses:
  ip_rules = {
    "ip_range_1" = "10.128.0.0/16",
    "ip_range_2" = "10.33.0.0/16",
  }

  # 4. ADLS Gen2 Mount
  mountpoints = {
    storage_account_name = data.azurerm_storage_account.example.name
    container_name       = "example_container"
  }

  # 5. Create Secret Scope and assign permissions to custom groups
  secret_scope = [{
    scope_name = "extra-scope"
    # Only custom workspace group names are allowed.
    # If left empty then only Workspace admins could access these keys
    acl     = [{ principal = "DEVELOPERS", permission = "READ" }]
    secrets = [{ key = "secret-name", string_value = "secret-value" }]
  }]

  # 6. SQL Warehouse Endpoint
  databricks_sql_endpoint = [{
    name                      = "default"
    enable_serverless_compute = true
    permissions               = [{ group_name = "DEVELOPERS", permission_level = "CAN_USE" }]
  }]

  # 7. Databricks cluster policies
  custom_cluster_policies = [{
    name    = "custom_policy_1"
    can_use = "DEVELOPERS" # custom workspace group name that is allowed to use this policy
    definition = {
      "autoscale.max_workers" : {
        "type" : "range",
        "maxValue" : 3,
        "defaultValue" : 2
      }
    }
  }]

  # 8. Azure Key Vault-backed secret scope
  key_vault_secret_scope = [{
    name         = "external"
    key_vault_id = data.azurerm_key_vault.example.id
    dns_name     = data.azurerm_key_vault.example.vault_uri
  }]

  providers = {
    databricks = databricks.main
  }
}

# 9. Assign an already existing Unity Catalog Metastore
module "metastore_assignment" {
  source  = "data-platform-hq/metastore-assignment/databricks"
  version = "1.0.0"

  workspace_id = data.azurerm_databricks_workspace.example.workspace_id
  metastore_id = "" # ID of the existing Unity Catalog Metastore

  providers = {
    # Uses the provider alias declared above
    databricks = databricks.main
  }
}
```
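The Inputs table lists `iam_workspace_groups`, `clusters` and `sql_endpoint` as the variable names for workspace groups, clusters and SQL endpoints. The following is a minimal sketch that uses those names and types, not a verified configuration. The module label `premium_workspace`, the group name, and the entitlement strings are assumptions carried over from the example above, and the `databricks.main` provider alias is the one declared there.

```hcl
# Sketch only: argument names and types are taken from the Inputs table.
module "premium_workspace" { # hypothetical module label
  source = "data-platform-hq/databricks-runtime-premium/databricks"

  # Workspace group with members and entitlements
  iam_workspace_groups = {
    DEVELOPERS = {
      user         = ["user1@example.com"]
      entitlements = ["allow_cluster_create", "databricks_sql_access"] # assumed values
    }
  }

  # Cluster with permissions for the group above; omitted attributes use the documented defaults
  clusters = [{
    cluster_name       = "shared autoscaling"
    data_security_mode = "USER_ISOLATION"
    min_workers        = 1
    max_workers        = 2
    permissions        = [{ group_name = "DEVELOPERS", permission_level = "CAN_RESTART" }]
  }]

  # SQL endpoint with the documented default size ("2X-Small")
  sql_endpoint = [{
    name        = "default"
    permissions = [{ group_name = "DEVELOPERS", permission_level = "CAN_USE" }]
  }]

  # Workspace IP access list
  ip_rules = {
    "ip_range_1" = "10.128.0.0/16"
  }

  providers = {
    databricks = databricks.main
  }
}
```

Attributes declared with `optional(...)` in the Inputs table can be left out, in which case the defaults shown there apply.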
## Requirements
| Name | Version |
|------|---------|
| [terraform](#requirement\_terraform) | >=1.0.0 |
| [azurerm](#requirement\_azurerm) | >= 4.0.1 |
| [databricks](#requirement\_databricks) | >=1.30.0 |
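
These constraints can be pinned in the calling configuration. A minimal sketch, assuming the provider source addresses `hashicorp/azurerm` and `databricks/databricks` that the registry links in the Resources table point to:

```hcl
terraform {
  required_version = ">= 1.0.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 4.0.1"
    }
    databricks = {
      source  = "databricks/databricks"
      version = ">= 1.30.0"
    }
  }
}
```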
## Providers
| Name | Version |
|------|---------|
| [azurerm](#provider\_azurerm) | >= 4.0.1 |
| [databricks](#provider\_databricks) | >=1.30.0 |
## Modules
No modules.
## Resources
| Name | Type |
|------|------|
| [azurerm_key_vault_access_policy.databricks](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/key_vault_access_policy) | resource |
| [databricks_cluster.cluster](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/cluster) | resource |
| [databricks_cluster_policy.overrides](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/cluster_policy) | resource |
| [databricks_cluster_policy.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/cluster_policy) | resource |
| [databricks_entitlements.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/entitlements) | resource |
| [databricks_group.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/group) | resource |
| [databricks_group_member.admin](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/group_member) | resource |
| [databricks_group_member.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/group_member) | resource |
| [databricks_ip_access_list.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/ip_access_list) | resource |
| [databricks_mount.adls](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mount) | resource |
| [databricks_permissions.clusters](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/permissions) | resource |
| [databricks_permissions.sql_endpoint](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/permissions) | resource |
| [databricks_permissions.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/permissions) | resource |
| [databricks_secret.main](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/secret) | resource |
| [databricks_secret.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/secret) | resource |
| [databricks_secret_acl.external](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/secret_acl) | resource |
| [databricks_secret_acl.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/secret_acl) | resource |
| [databricks_secret_scope.external](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/secret_scope) | resource |
| [databricks_secret_scope.main](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/secret_scope) | resource |
| [databricks_secret_scope.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/secret_scope) | resource |
| [databricks_service_principal.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/service_principal) | resource |
| [databricks_sql_endpoint.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/sql_endpoint) | resource |
| [databricks_system_schema.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/system_schema) | resource |
| [databricks_token.pat](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/token) | resource |
| [databricks_user.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/user) | resource |
| [databricks_workspace_conf.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/workspace_conf) | resource |
| [databricks_group.account_groups](https://registry.terraform.io/providers/databricks/databricks/latest/docs/data-sources/group) | data source |
| [databricks_group.admin](https://registry.terraform.io/providers/databricks/databricks/latest/docs/data-sources/group) | data source |
## Inputs
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| [clusters](#input\_clusters) | Set of objects with parameters to configure Databricks clusters and assign permissions to it for certain custom groups | <pre>set(object({<br>    cluster_name = string<br>    spark_version = optional(string, "13.3.x-scala2.12")<br>    spark_conf = optional(map(any), {})<br>    cluster_conf_passthrought = optional(bool, false)<br>    spark_env_vars = optional(map(any), {})<br>    data_security_mode = optional(string, "USER_ISOLATION")<br>    node_type_id = optional(string, "Standard_D3_v2")<br>    autotermination_minutes = optional(number, 30)<br>    min_workers = optional(number, 1)<br>    max_workers = optional(number, 2)<br>    availability = optional(string, "ON_DEMAND_AZURE")<br>    first_on_demand = optional(number, 0)<br>    spot_bid_max_price = optional(number, 1)<br>    cluster_log_conf_destination = optional(string, null)<br>    init_scripts_workspace = optional(set(string), [])<br>    init_scripts_volumes = optional(set(string), [])<br>    init_scripts_dbfs = optional(set(string), [])<br>    init_scripts_abfss = optional(set(string), [])<br>    single_user_name = optional(string, null)<br>    single_node_enable = optional(bool, false)<br>    custom_tags = optional(map(string), {})<br>    permissions = optional(set(object({<br>      group_name = string<br>      permission_level = string<br>    })), [])<br>    pypi_library_repository = optional(set(string), [])<br>    maven_library_repository = optional(set(object({<br>      coordinates = string<br>      exclusions = set(string)<br>    })), [])<br>  }))</pre> | `[]` | no |
| [create\_databricks\_access\_policy\_to\_key\_vault](#input\_create\_databricks\_access\_policy\_to\_key\_vault) | Boolean flag to enable creation of Key Vault Access Policy for Databricks Global Service Principal. | `bool` | `true` | no |
| [custom\_cluster\_policies](#input\_custom\_cluster\_policies) | Provides an ability to create a custom cluster policy, assign it to a cluster and grant CAN\_USE permissions on it to certain custom groups<br>name - name of the custom cluster policy to create;<br>can\_use - list of strings, where values are custom group names; these groups have to be created with Terraform;<br>definition - JSON document expressed in Databricks Policy Definition Language. No need to call 'jsonencode()' function on it when providing a value; | <pre>list(object({<br>    name = string<br>    can_use = list(string)<br>    definition = any<br>  }))</pre> | <pre>[<br>  {<br>    "can_use": null,<br>    "definition": null,<br>    "name": null<br>  }<br>]</pre> | no |
| [default\_cluster\_policies\_override](#input\_default\_cluster\_policies\_override) | Provides an ability to override a default cluster policy<br>name - name of the cluster policy to override;<br>family\_id - family id of the corresponding policy;<br>definition - JSON document expressed in Databricks Policy Definition Language. No need to call 'jsonencode()' function on it when providing a value; | <pre>list(object({<br>    name = string<br>    family_id = string<br>    definition = any<br>  }))</pre> | <pre>[<br>  {<br>    "definition": null,<br>    "family_id": null,<br>    "name": null<br>  }<br>]</pre> | no |
| [global\_databricks\_sp\_object\_id](#input\_global\_databricks\_sp\_object\_id) | Global 'AzureDatabricks' SP object id. Used to create Key Vault Access Policy for Secret Scope | `string` | `"9b38785a-6e08-4087-a0c4-20634343f21f"` | no |
| [iam\_account\_groups](#input\_iam\_account\_groups) | List of objects with group name and entitlements for this group | <pre>list(object({<br>    group_name = optional(string)<br>    entitlements = optional(list(string))<br>  }))</pre> | `[]` | no |
| [iam\_workspace\_groups](#input\_iam\_workspace\_groups) | Used to create workspace group. Map of group name and its parameters, such as users and service principals added to the group. Also possible to configure group entitlements. | <pre>map(object({<br>    user = optional(list(string))<br>    service_principal = optional(list(string))<br>    entitlements = optional(list(string))<br>  }))</pre> | `{}` | no |
| [ip\_rules](#input\_ip\_rules) | Map of IP addresses permitted for access to DB | `map(string)` | `{}` | no |
| [key\_vault\_secret\_scope](#input\_key\_vault\_secret\_scope) | Object with Azure Key Vault parameters required for creation of Azure-backed Databricks Secret scope | <pre>list(object({<br>    name = string<br>    key_vault_id = string<br>    dns_name = string<br>    tenant_id = string<br>  }))</pre> | `[]` | no |
| [mount\_adls\_passthrough](#input\_mount\_adls\_passthrough) | Boolean flag to use mount options for credentials passthrough. Should be used with mount\_cluster\_name, specified cluster should have option cluster\_conf\_passthrought == true | `bool` | `false` | no |
| [mount\_cluster\_name](#input\_mount\_cluster\_name) | Name of the cluster that will be used during storage mounting. If mount\_adls\_passthrough == true, cluster should also have option cluster\_conf\_passthrought == true | `string` | `null` | no |
| [mount\_enabled](#input\_mount\_enabled) | Boolean flag that determines whether mount point for storage account filesystem is created | `bool` | `false` | no |
| [mount\_service\_principal\_client\_id](#input\_mount\_service\_principal\_client\_id) | Application(client) Id of Service Principal used to perform storage account mounting | `string` | `null` | no |
| [mount\_service\_principal\_secret](#input\_mount\_service\_principal\_secret) | Service Principal Secret used to perform storage account mounting | `string` | `null` | no |
| [mount\_service\_principal\_tenant\_id](#input\_mount\_service\_principal\_tenant\_id) | Service Principal tenant id used to perform storage account mounting | `string` | `null` | no |
| [mountpoints](#input\_mountpoints) | Mountpoints for databricks | <pre>map(object({<br>    storage_account_name = string<br>    container_name = string<br>  }))</pre> | `{}` | no |
| [pat\_token\_lifetime\_seconds](#input\_pat\_token\_lifetime\_seconds) | The lifetime of the token, in seconds. If no lifetime is specified, the token remains valid indefinitely | `number` | `315569520` | no |
| [secret\_scope](#input\_secret\_scope) | Provides an ability to create a custom Secret Scope, store secrets in it and assign ACLs for access management<br>scope\_name - name of the Secret Scope to create;<br>acl - list of objects, where 'principal' is a custom group name (this group is created in the 'Premium' module) and 'permission' is one of "READ", "WRITE", "MANAGE";<br>secrets - list of objects, where the object's 'key' param is the created key name and 'string\_value' is a value for it; | <pre>list(object({<br>    scope_name = string<br>    acl = optional(list(object({<br>      principal = string<br>      permission = string<br>    })))<br>    secrets = optional(list(object({<br>      key = string<br>      string_value = string<br>    })))<br>  }))</pre> | <pre>[<br>  {<br>    "acl": null,<br>    "scope_name": null,<br>    "secrets": null<br>  }<br>]</pre> | no |
| [sql\_endpoint](#input\_sql\_endpoint) | Set of objects with parameters to configure SQL Endpoint and assign permissions to it for certain custom groups | <pre>set(object({<br>    name = string<br>    cluster_size = optional(string, "2X-Small")<br>    min_num_clusters = optional(number, 0)<br>    max_num_clusters = optional(number, 1)<br>    auto_stop_mins = optional(string, "30")<br>    enable_photon = optional(bool, false)<br>    enable_serverless_compute = optional(bool, false)<br>    spot_instance_policy = optional(string, "COST_OPTIMIZED")<br>    warehouse_type = optional(string, "PRO")<br>    permissions = optional(set(object({<br>      group_name = string<br>      permission_level = string<br>    })), [])<br>  }))</pre> | `[]` | no |
| [suffix](#input\_suffix) | Optional suffix that would be added to the end of resources names. | `string` | `""` | no |
| [system\_schemas](#input\_system\_schemas) | Set of strings with all possible System Schema names | `set(string)` | <pre>[<br>  "access",<br>  "billing",<br>  "compute",<br>  "marketplace",<br>  "storage"<br>]</pre> | no |
| [system\_schemas\_enabled](#input\_system\_schemas\_enabled) | System Schemas only work with an assigned Unity Catalog Metastore. Boolean flag to enable this feature | `bool` | `false` | no |
| [user\_object\_ids](#input\_user\_object\_ids) | Map of AD usernames and corresponding object IDs | `map(string)` | `{}` | no |
| [workspace\_admins](#input\_workspace\_admins) | Provide users or service principals to grant them Admin permissions in Workspace. | <pre>object({<br>    user = list(string)<br>    service_principal = list(string)<br>  })</pre> | <pre>{<br>  "service_principal": null,<br>  "user": null<br>}</pre> | no |
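
The mount and Key Vault scope inputs above can be combined as in the following sketch. It is illustrative only: the module label, the variable names, the Key Vault data source label `scope`, the storage account and container names, and the map key `example` are all hypothetical, and a default (unaliased) `databricks` provider is assumed to be configured in the calling module. The `mountpoints` value follows the `map(object(...))` type from the table, and `key_vault_secret_scope` includes the `tenant_id` attribute that its type declares.

```hcl
# Hypothetical variables holding the Service Principal credentials used for mounting
variable "sp_client_id" {
  type = string
}

variable "sp_secret" {
  type      = string
  sensitive = true
}

variable "sp_tenant_id" {
  type = string
}

# Key Vault backing the secret scope (placeholder names)
data "azurerm_key_vault" "scope" {
  name                = "example-key-vault"
  resource_group_name = "example-rg"
}

module "premium_workspace_mounts" { # hypothetical module label
  source = "data-platform-hq/databricks-runtime-premium/databricks"

  # Storage mounting is off by default (mount_enabled = false)
  mount_enabled                     = true
  mount_service_principal_client_id = var.sp_client_id
  mount_service_principal_secret    = var.sp_secret
  mount_service_principal_tenant_id = var.sp_tenant_id

  # One map entry per container to mount; the key "example" is an assumed label
  mountpoints = {
    example = {
      storage_account_name = "examplestorageaccount"
      container_name       = "example-container"
    }
  }

  # Azure Key Vault-backed secret scope
  key_vault_secret_scope = [{
    name         = "external"
    key_vault_id = data.azurerm_key_vault.scope.id
    dns_name     = data.azurerm_key_vault.scope.vault_uri
    tenant_id    = data.azurerm_key_vault.scope.tenant_id
  }]
}
```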
## Outputs
| Name | Description |
|------|-------------|
| [clusters](#output\_clusters) | Provides name and unique identifier for the clusters |
| [sql\_endpoint\_data\_source\_id](#output\_sql\_endpoint\_data\_source\_id) | ID of the data source for this endpoint |
| [sql\_endpoint\_jdbc\_url](#output\_sql\_endpoint\_jdbc\_url) | JDBC connection string of SQL Endpoint |
| [token](#output\_token) | Databricks Personal Access Token |
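
Module outputs can be re-exported from the calling configuration. A short sketch, assuming the module label `databricks_runtime_premium` from the usage example; the root output names are hypothetical. Marking the token output as `sensitive` keeps it out of plan and apply logs, and Terraform requires it if the module already marks the value sensitive.

```hcl
# JDBC connection string of the SQL endpoint created by the module
output "sql_endpoint_jdbc_url" {
  value = module.databricks_runtime_premium.sql_endpoint_jdbc_url
}

# Token created by the module; treated as a secret
output "databricks_token" {
  value     = module.databricks_runtime_premium.token
  sensitive = true
}
```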
## License
Apache 2 Licensed. For more information please see [LICENSE](https://github.com/data-platform-hq/terraform-databricks-databricks-runtime-premium/blob/main/LICENSE)