https://github.com/wandb/terraform-azurerm-wandb
- Host: GitHub
- URL: https://github.com/wandb/terraform-azurerm-wandb
- Owner: wandb
- License: apache-2.0
- Created: 2022-06-10T23:59:41.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2024-05-22T22:48:22.000Z (11 months ago)
- Last Synced: 2024-05-22T23:23:47.538Z (11 months ago)
- Language: HCL
- Size: 323 KB
- Stars: 5
- Watchers: 8
- Forks: 2
- Open Issues: 18
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
Awesome Lists containing this project
- jimsghstars - wandb/terraform-azurerm-wandb - (HCL)
README
# Weights & Biases Azure Module
This is a Terraform module for provisioning a Weights & Biases Cluster on Azure.
Weights & Biases Server is our self-hosted distribution of wandb.ai. It offers
enterprises a private instance of the Weights & Biases application, with no
resource limits and with additional enterprise-grade architectural features like
audit logging and single sign-on.

## About This Module
## Pre-requisites
This module is intended to run in an Azure account with minimal
preparation; however, it does have the following prerequisites:

### Terraform version >= 1
### Credentials / Permissions
## How to Use This Module
## Cluster Sizing
By default, the Kubernetes instance type, the number of instances, the Redis cluster size, and the database instance size are
standardized via the configurations in [./deployment-size.tf](deployment-size.tf) and are selected via the `size` input
variable.

Available sizes are `small`, `medium`, `large`, `xlarge`, and `xxlarge`. The default is `small`.
All the values set via `deployment-size.tf` can be overridden by setting the appropriate input variables.
- `kubernetes_instance_type` - The instance type for the AKS nodes
- `kubernetes_min_node_per_az` - The minimum number of nodes per availability zone in the AKS cluster
- `kubernetes_max_node_per_az` - The maximum number of nodes per availability zone in the AKS cluster
- `redis_capacity` - The capacity (size) of the Redis instance
- `database_sku_name` - The SKU name for the database

## Examples
We have included documentation and reference examples for additional common
installation scenarios for Weights & Biases, as well as examples for supporting
resources that lack official modules.

- Route
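
As a rough illustration of the sizing behavior described under Cluster Sizing, the sketch below calls the module with its three required inputs, picks a `size`, and overrides a single value from `deployment-size.tf`. The registry address `wandb/wandb/azurerm` is an assumption inferred from the repository name, and the namespace, region, VM size, and the `wandb_license` variable are hypothetical placeholders rather than recommended values.

```hcl
# Hypothetical variable used to pass the license without hard-coding it.
variable "wandb_license" {
  type      = string
  sensitive = true
}

module "wandb" {
  # Assumed registry address; confirm it (and pin a version) before use.
  source = "wandb/wandb/azurerm"

  # Required inputs according to the Inputs table.
  namespace = "wandb-demo" # hypothetical resource prefix
  location  = "eastus2"    # hypothetical Azure region
  license   = var.wandb_license

  # Select a standardized deployment size from deployment-size.tf.
  size = "medium"

  # Override one value from the chosen size; every other value
  # still comes from the "medium" configuration.
  kubernetes_instance_type = "Standard_D4s_v3" # illustrative VM size
}
```

Inputs left unset here, such as `domain_name` and `subdomain`, fall back to the defaults listed in the Inputs table; a real deployment will likely need to set them.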
## Requirements
| Name | Version |
|------|---------|
| [terraform](#requirement\_terraform) | ~> 1.0 |
| [azapi](#requirement\_azapi) | ~> 1.0 |
| [azurerm](#requirement\_azurerm) | ~> 3.17 |
| [helm](#requirement\_helm) | ~> 2.6 |
| [kubernetes](#requirement\_kubernetes) | ~> 2.23 |

## Providers
| Name | Version |
|------|---------|
| [azapi](#provider\_azapi) | ~> 1.0 |
| [azurerm](#provider\_azurerm) | ~> 3.17 |

## Modules
| Name | Source | Version |
|------|--------|---------|
| [app\_aks](#module\_app\_aks) | ./modules/app_aks | n/a |
| [app\_lb](#module\_app\_lb) | ./modules/app_lb | n/a |
| [cert\_manager](#module\_cert\_manager) | ./modules/cert_manager | n/a |
| [clickhouse](#module\_clickhouse) | ./modules/clickhouse | n/a |
| [cron\_job](#module\_cron\_job) | ./modules/cron_job | n/a |
| [database](#module\_database) | ./modules/database | n/a |
| [identity](#module\_identity) | ./modules/identity | n/a |
| [networking](#module\_networking) | ./modules/networking | n/a |
| [pod\_identity](#module\_pod\_identity) | ./modules/identity | n/a |
| [redis](#module\_redis) | ./modules/redis | n/a |
| [storage](#module\_storage) | ./modules/storage | n/a |
| [vault](#module\_vault) | ./modules/vault | n/a |
| [wandb](#module\_wandb) | wandb/wandb/helm | 1.2.0 |

## Resources
| Name | Type |
|------|------|
| [azapi_resource_list.az_zones](https://registry.terraform.io/providers/azure/azapi/latest/docs/data-sources/resource_list) | data source |
| [azurerm_subscription.current](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/subscription) | data source |

## Inputs
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| [allowed\_ip\_ranges](#input\_allowed\_ip\_ranges) | Allowed public IP addresses or CIDR ranges. | `list(string)` | `[]` | no |
| [allowed\_subscriptions](#input\_allowed\_subscriptions) | List of allowed customer subscriptions, as comma-separated values | `string` | `""` | no |
| [app\_wandb\_env](#input\_app\_wandb\_env) | Extra environment variables for W&B | `map(string)` | `{}` | no |
| [azuremonitor](#input\_azuremonitor) | Support for OTel Azure Monitor SQL and Redis metrics. Requires operator-wandb chart version 0.14.0 or later. | `bool` | `false` | no |
| [blob\_container](#input\_blob\_container) | Use an existing bucket. | `string` | `""` | no |
| [bucket\_path](#input\_bucket\_path) | path of where to store data for the instance-level bucket | `string` | `""` | no |
| [clickhouse\_private\_endpoint\_service\_name](#input\_clickhouse\_private\_endpoint\_service\_name) | ClickHouse private endpoint 'Service name' (ends in .azure.privatelinkservice). | `string` | `""` | no |
| [clickhouse\_region](#input\_clickhouse\_region) | ClickHouse region (eastus2, westus3, etc). | `string` | `""` | no |
| [cluster\_sku\_tier](#input\_cluster\_sku\_tier) | The Azure AKS SKU Tier to use for this cluster (https://learn.microsoft.com/en-us/azure/aks/free-standard-pricing-tiers) | `string` | `"Free"` | no |
| [controller\_image\_tag](#input\_controller\_image\_tag) | Tag of the controller image to deploy | `string` | `"1.14.0"` | no |
| [create\_private\_link](#input\_create\_private\_link) | Use for the azure private link. | `bool` | `false` | no |
| [create\_redis](#input\_create\_redis) | Boolean indicating whether to provision a Redis instance (true) or not (false). | `bool` | `false` | no |
| [database\_availability\_mode](#input\_database\_availability\_mode) | n/a | `string` | `"SameZone"` | no |
| [database\_sku\_name](#input\_database\_sku\_name) | Specifies the SKU Name for this MySQL Server. Defaults to null and value from deployment-size.tf is used | `string` | `null` | no |
| [database\_version](#input\_database\_version) | Version for MySQL | `string` | `"5.7"` | no |
| [deletion\_protection](#input\_deletion\_protection) | If the instance should have deletion protection enabled. The database / Bucket can't be deleted when this value is set to `true`. | `bool` | `true` | no |
| [disable\_storage\_vault\_key\_id](#input\_disable\_storage\_vault\_key\_id) | Flag to disable the `customer_managed_key` block, the properties 'encryption.identity, encryption.keyvaultproperties' cannot be updated in a single operation. | `bool` | `false` | no |
| [domain\_name](#input\_domain\_name) | Domain for accessing the Weights & Biases UI. | `string` | `null` | no |
| [enable\_database\_vault\_key](#input\_enable\_database\_vault\_key) | Flag to enable managed key encryption for the database. Once enabled, cannot be disabled. | `bool` | `false` | no |
| [enable\_storage\_vault\_key](#input\_enable\_storage\_vault\_key) | Flag to enable managed key encryption for the storage account. | `bool` | `false` | no |
| [external\_bucket](#input\_external\_bucket) | config an external bucket | `any` | `null` | no |
| [kubernetes\_instance\_type](#input\_kubernetes\_instance\_type) | Instance type for primary node group. Defaults to null and value from deployment-size.tf is used | `string` | `null` | no |
| [kubernetes\_max\_node\_per\_az](#input\_kubernetes\_max\_node\_per\_az) | Maximum number of nodes for the AKS cluster. Defaults to null and value from deployment-size.tf is used | `number` | `null` | no |
| [kubernetes\_min\_node\_per\_az](#input\_kubernetes\_min\_node\_per\_az) | Minimum number of nodes for the AKS cluster. Defaults to null and value from deployment-size.tf is used | `number` | `null` | no |
| [license](#input\_license) | Your wandb/local license | `string` | n/a | yes |
| [location](#input\_location) | n/a | `string` | n/a | yes |
| [namespace](#input\_namespace) | String used for prefix resources. | `string` | n/a | yes |
| [node\_max\_pods](#input\_node\_max\_pods) | Maximum number of pods per node | `number` | `30` | no |
| [node\_pool\_num\_zones](#input\_node\_pool\_num\_zones) | Number of availability zones to use for the node pool when node\_pool\_zones is not set. If neither are set, 3 zones will be used | `number` | `2` | no |
| [node\_pool\_zones](#input\_node\_pool\_zones) | Availability zones for the node pool | `list(string)` | `null` | no |
| [oidc\_auth\_method](#input\_oidc\_auth\_method) | OIDC auth method | `string` | `"implicit"` | no |
| [oidc\_client\_id](#input\_oidc\_client\_id) | The Client ID of application in your identity provider | `string` | `""` | no |
| [oidc\_issuer](#input\_oidc\_issuer) | A url to your Open ID Connect identity provider, i.e. https://cognito-idp.us-east-1.amazonaws.com/us-east-1_uiIFNdacd | `string` | `""` | no |
| [oidc\_secret](#input\_oidc\_secret) | The Client secret of application in your identity provider | `string` | `""` | no |
| [operator\_chart\_version](#input\_operator\_chart\_version) | Version of the operator chart to deploy | `string` | `"1.3.4"` | no |
| [other\_wandb\_env](#input\_other\_wandb\_env) | Extra environment variables for W&B | `map(any)` | `{}` | no |
| [parquet\_wandb\_env](#input\_parquet\_wandb\_env) | Extra environment variables for W&B | `map(string)` | `{}` | no |
| [redis\_capacity](#input\_redis\_capacity) | Number indicating the size of a Redis instance. Defaults to null and value from deployment-size.tf is used | `number` | `null` | no |
| [size](#input\_size) | Deployment size | `string` | `"small"` | no |
| [ssl](#input\_ssl) | Enable SSL certificate | `bool` | `true` | no |
| [storage\_account](#input\_storage\_account) | Azure storage account name | `string` | `""` | no |
| [storage\_key](#input\_storage\_key) | Azure primary storage access key | `string` | `""` | no |
| [subdomain](#input\_subdomain) | Subdomain for accessing the Weights & Biases UI. Default creates record at Route53 Route. | `string` | `null` | no |
| [tags](#input\_tags) | Map of tags for resource | `map(string)` | `{}` | no |
| [use\_internal\_queue](#input\_use\_internal\_queue) | Uses an internal redis queue instead of using azure queue. | `bool` | `false` | no |
| [wandb\_image](#input\_wandb\_image) | Docker repository to pull the wandb image from. | `string` | `"wandb/local"` | no |
| [wandb\_version](#input\_wandb\_version) | The version of Weights & Biases local to deploy. | `string` | `"latest"` | no |
| [weave\_wandb\_env](#input\_weave\_wandb\_env) | Extra environment variables for W&B | `map(string)` | `{}` | no |

## Outputs
| Name | Description |
|------|-------------|
| [address](#output\_address) | n/a |
| [aks\_max\_node\_count](#output\_aks\_max\_node\_count) | n/a |
| [aks\_min\_node\_count](#output\_aks\_min\_node\_count) | n/a |
| [aks\_node\_instance\_type](#output\_aks\_node\_instance\_type) | n/a |
| [client\_id](#output\_client\_id) | n/a |
| [cluster\_ca\_certificate](#output\_cluster\_ca\_certificate) | n/a |
| [cluster\_client\_certificate](#output\_cluster\_client\_certificate) | n/a |
| [cluster\_client\_key](#output\_cluster\_client\_key) | n/a |
| [cluster\_host](#output\_cluster\_host) | n/a |
| [database\_instance\_type](#output\_database\_instance\_type) | n/a |
| [fqdn](#output\_fqdn) | The FQDN to the W&B application |
| [oidc\_issuer\_url](#output\_oidc\_issuer\_url) | n/a |
| [private\_link\_resource\_id](#output\_private\_link\_resource\_id) | n/a |
| [private\_link\_sub\_resource\_name](#output\_private\_link\_sub\_resource\_name) | n/a |
| [standardized\_size](#output\_standardized\_size) | n/a |
| [tenant\_id](#output\_tenant\_id) | n/a |
| [url](#output\_url) | The URL to the W&B application |

## Upgrading from 3.x to 4.x
3.0.0 introduced autoscaling to the AKS cluster and made the `size` variable the preferred way to set the cluster size.
Previously, unless the `size` variable was set explicitly, there were default values for the following variables:
- `kubernetes_instance_type`
- `kubernetes_node_count`
- `redis_capacity`
- `database_sku_name`

The `size` variable now defaults to `small`, and the following variables can be used to partially override the values
set by the `size` variable:
- `kubernetes_instance_type`
- `kubernetes_min_node_per_az`
- `kubernetes_max_node_per_az`
- `redis_capacity`
- `database_sku_name`

For more information on the available sizes, see the [Cluster Sizing](#cluster-sizing) section.
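
A migration along these lines might look like the sketch below. It assumes a configuration that previously set `kubernetes_node_count`, which no longer appears among the module inputs, and replaces it with per-availability-zone bounds. The module source address, the `wandb_license` variable, and all values are hypothetical placeholders.

```hcl
# Hypothetical variable used to pass the license without hard-coding it.
variable "wandb_license" {
  type      = string
  sensitive = true
}

module "wandb" {
  source = "wandb/wandb/azurerm" # assumed registry address

  namespace = "wandb-demo" # hypothetical
  location  = "eastus2"    # hypothetical
  license   = var.wandb_license

  # Previously (illustrative): kubernetes_node_count = 2
  # Now: rely on `size` and override only the node bounds.
  size                       = "small"
  kubernetes_min_node_per_az = 1 # lower autoscaling bound per zone
  kubernetes_max_node_per_az = 2 # upper autoscaling bound per zone
}
```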
If scaling nodes in and out is not desired, set `kubernetes_min_node_per_az` and
`kubernetes_max_node_per_az` to the same value to prevent the cluster from scaling.

### Upgrading from 2.x to 3.x
When upgrading from 2.x to 3.x, the following changes are required:
1. Add the `azapi` provider to the `required_providers` block:
```hcl
terraform {
  required_providers {
    azapi = {
      source  = "azure/azapi"
      version = "~> 1.0"
    }
  }
}
}
```

2. Add the `azapi` provider to the `provider` block:
```hcl
provider "azapi" {
  # azapi provider configuration should be the same as azurerm provider configuration
}
```
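
One way to keep the two provider configurations aligned, as the comment above suggests, is to feed both from the same variables. This is only a sketch: the variable names `subscription_id` and `tenant_id` are hypothetical, and the authentication method (Azure CLI, service principal, managed identity) is left to the environment.

```hcl
# Hypothetical variables shared by both providers.
variable "subscription_id" {
  type = string
}

variable "tenant_id" {
  type = string
}

provider "azurerm" {
  # The azurerm 3.x provider requires a features block, even if empty.
  features {}

  subscription_id = var.subscription_id
  tenant_id       = var.tenant_id
}

provider "azapi" {
  # Mirror the azurerm settings so both providers target the same subscription.
  subscription_id = var.subscription_id
  tenant_id       = var.tenant_id
}
```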