https://github.com/databricks/terraform-databricks-mlops-aws-project
This module creates and configures service principals with appropriate permissions and entitlements to run CI/CD for a project, and creates a workspace directory as a container for project-specific resources for the Databricks AWS staging and prod workspaces.
- Host: GitHub
- URL: https://github.com/databricks/terraform-databricks-mlops-aws-project
- Owner: databricks
- License: apache-2.0
- Created: 2022-06-23T20:54:31.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2023-02-15T22:23:25.000Z (about 3 years ago)
- Last Synced: 2025-04-11T21:49:42.584Z (12 months ago)
- Language: HCL
- Homepage:
- Size: 16.6 KB
- Stars: 5
- Watchers: 7
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# MLOps AWS Project Module
In both of the specified staging and prod workspaces, this module:
* Creates and configures a service principal with appropriate permissions and entitlements to run CI/CD for a project.
* Creates a workspace directory as a container for project-specific resources.
The service principals are granted `CAN_MANAGE` permissions on the created workspace directories.
**_NOTE:_**
1. This module is in preview so it is still experimental and subject to change. Feedback is welcome!
2. The [Databricks providers](https://registry.terraform.io/providers/databricks/databricks/latest/docs) that are passed into the module must be configured with workspace admin permissions.
3. The module assumes that the [MLOps AWS Infrastructure Module](https://registry.terraform.io/modules/databricks/mlops-aws-infrastructure/databricks/latest) has already been applied. That is, service principal groups with token usage permissions must already exist, either under the default name `"mlops-service-principals"` or under a custom name passed through the `service_principal_group_name` field.
4. The service principal tokens are created with a default expiration of 100 days (8640000 seconds), and the module will need to be re-applied after this time to refresh the tokens.
## Usage
```hcl
provider "databricks" {
  alias = "staging" # Authenticate using preferred method as described in Databricks provider
}

provider "databricks" {
  alias = "prod" # Authenticate using preferred method as described in Databricks provider
}

module "mlops_aws_project" {
  source = "databricks/mlops-aws-project/databricks"
  providers = {
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
  }
  service_principal_name = "example-name"
  project_directory_path = "/dir-name"
}
```
### Usage example with [MLOps AWS Infrastructure Module](https://registry.terraform.io/modules/databricks/mlops-aws-infrastructure/databricks/latest)
```hcl
provider "databricks" {
  alias = "dev" # Authenticate using preferred method as described in Databricks provider
}

provider "databricks" {
  alias = "staging" # Authenticate using preferred method as described in Databricks provider
}

provider "databricks" {
  alias = "prod" # Authenticate using preferred method as described in Databricks provider
}

module "mlops_aws_infrastructure" {
  source = "databricks/mlops-aws-infrastructure/databricks"
  providers = {
    databricks.dev     = databricks.dev
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
  }
  staging_workspace_id          = "123456789"
  prod_workspace_id             = "987654321"
  additional_token_usage_groups = ["users"] # This field is optional.
}

module "mlops_aws_project" {
  source = "databricks/mlops-aws-project/databricks"
  providers = {
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
  }
  service_principal_name       = "example-name"
  project_directory_path       = "/dir-name"
  service_principal_group_name = module.mlops_aws_infrastructure.service_principal_group_name
  # The above field is optional, especially since in this case service_principal_group_name will be mlops-service-principals either way,
  # but this also serves to create an implicit dependency. Can also be replaced with the following line to create an explicit dependency:
  # depends_on = [module.mlops_aws_infrastructure]
}
```
### Usage example with Git credentials for service principal
This can be helpful for common use cases such as Git authorization for [Remote Git Jobs](https://docs.databricks.com/repos/jobs-remote-notebook.html).
```hcl
data "databricks_current_user" "staging_user" {
  provider = databricks.staging
}

data "databricks_current_user" "prod_user" {
  provider = databricks.prod
}

provider "databricks" {
  alias = "staging_sp"
  host  = data.databricks_current_user.staging_user.workspace_url
  token = module.mlops_aws_project.staging_service_principal_token
}

provider "databricks" {
  alias = "prod_sp"
  host  = data.databricks_current_user.prod_user.workspace_url
  token = module.mlops_aws_project.prod_service_principal_token
}

resource "databricks_git_credential" "staging_git" {
  provider              = databricks.staging_sp
  git_username          = var.git_username
  git_provider          = var.git_provider
  personal_access_token = var.git_token # This should be configured with `repo` scope for Databricks Repos.
}

resource "databricks_git_credential" "prod_git" {
  provider              = databricks.prod_sp
  git_username          = var.git_username
  git_provider          = var.git_provider
  personal_access_token = var.git_token # This should be configured with `repo` scope for Databricks Repos.
}
```
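The example above refers to three input variables that it does not declare. A minimal sketch of matching declarations follows. The descriptions and the `sensitive` flag are assumptions added for illustration and are not part of the module.

```hcl
# Hypothetical variable declarations for the Git credential example.
variable "git_username" {
  type        = string
  description = "Username of the Git account used by the service principals."
}

variable "git_provider" {
  type        = string
  description = "Git provider name as expected by the databricks_git_credential resource."
}

variable "git_token" {
  type        = string
  description = "Git personal access token with `repo` scope."
  sensitive   = true # Keeps the token out of plan and apply output.
}
```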
## Requirements
| Name | Version |
|------|---------|
|[terraform](https://registry.terraform.io/)|\>=1.1.6|
|[databricks](https://registry.terraform.io/providers/databricks/databricks/0.5.8)|\>=0.5.8|
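
As a rough sketch, the constraints above could be pinned in the root module as follows. The `databricks/databricks` source address is inferred from the registry link in the table; adjust it if your configuration uses a different provider namespace.

```hcl
terraform {
  # Minimum Terraform version listed in the Requirements table.
  required_version = ">= 1.1.6"

  required_providers {
    databricks = {
      source  = "databricks/databricks" # Assumed from the registry link above.
      version = ">= 0.5.8"
    }
  }
}
```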
## Inputs
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
|service_principal_name|The display name for the service principals.|string|N/A|yes|
|project_directory_path|Path/Name of Databricks workspace directory to be created for the project. NOTE: The parent directories in the path must already be created.|string|N/A|yes|
|service_principal_group_name|The name of the service principal group in the staging and prod workspaces. The created service principals will be added to this group.|string|`"mlops-service-principals"`|no|
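
Because the parent directories of `project_directory_path` must already exist, one option is to create them in the same configuration. The sketch below assumes the `databricks.staging` and `databricks.prod` provider aliases from the Usage section are already declared. The `/mlops-projects` path and the resource names are hypothetical.

```hcl
# Hypothetical parent directory, created in both workspaces before the module runs.
resource "databricks_directory" "staging_parent" {
  provider = databricks.staging
  path     = "/mlops-projects"
}

resource "databricks_directory" "prod_parent" {
  provider = databricks.prod
  path     = "/mlops-projects"
}

module "mlops_aws_project" {
  source = "databricks/mlops-aws-project/databricks"
  providers = {
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
  }
  service_principal_name = "example-name"
  project_directory_path = "/mlops-projects/example-project"

  # Explicit dependency so both parent directories exist first.
  depends_on = [
    databricks_directory.staging_parent,
    databricks_directory.prod_parent,
  ]
}
```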
## Outputs
| Name | Description | Type | Sensitive |
|------|-------------|------|---------|
|project_directory_path|Path/Name of Databricks workspace directory created for the project.|string|no|
|staging_service_principal_application_id|Application ID of the created Databricks service principal in the staging workspace.|string|no|
|staging_service_principal_token|Sensitive personal access token (PAT) value of the created Databricks service principal in the staging workspace.|string|yes|
|prod_service_principal_application_id|Application ID of the created Databricks service principal in the prod workspace.|string|no|
|prod_service_principal_token|Sensitive personal access token (PAT) value of the created Databricks service principal in the prod workspace.|string|yes|
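
To pass these values on to a CI/CD system, a root module can re-export them. The sketch below uses hypothetical output names. Terraform requires `sensitive = true` on any root output that references a sensitive value such as the token.

```hcl
# Hypothetical root-module outputs that forward the staging values.
output "staging_sp_application_id" {
  value = module.mlops_aws_project.staging_service_principal_application_id
}

output "staging_sp_token" {
  value     = module.mlops_aws_project.staging_service_principal_token
  sensitive = true # Required because the module output is marked sensitive.
}
```

The token can then be read with `terraform output -raw staging_sp_token`. The same pattern applies to the prod outputs.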
## Providers
| Name | Authentication | Use |
|------|-------------|----|
|databricks.staging|Provided by the user.|Look up the service principal group, and create the directory and service principal in the staging workspace.|
|databricks.prod|Provided by the user.|Look up the service principal group, and create the directory and service principal in the prod workspace.|
## Resources
| Name | Type |
|------|------|
|databricks_group.staging_sp_group|data source|
|databricks_group.prod_sp_group|data source|
|databricks_directory.staging_directory|resource|
|databricks_permissions.staging_directory_usage|resource|
|databricks_directory.prod_directory|resource|
|databricks_permissions.prod_directory_usage|resource|
|aws-service-principal.staging_sp|module|
|aws-service-principal.prod_sp|module|