Databricks Terraform Provider
Home Page: https://registry.terraform.io/providers/databricks/databricks/latest
License: Other
Our goal is to make sure that there are acceptance tests for all resources.
There should be an acceptance test for jobs.
Our goal is to make sure that there are acceptance tests for all resources.
There should be an acceptance test for Instance Profiles.
The docs suggest using the output of a resource in the provider definition:
provider "databricks" {
azure_auth = {
managed_resource_group = azurerm_databricks_workspace.demo_test_workspace.managed_resource_group_name
azure_region = azurerm_databricks_workspace.demo_test_workspace.location
workspace_name = azurerm_databricks_workspace.demo_test_workspace.name
resource_group = azurerm_databricks_workspace.demo_test_workspace.resource_group_name
client_id = var.client_id
client_secret = var.client_secret
tenant_id = var.tenant_id
subscription_id = var.subscription_id
}
}
The Terraform Kubernetes provider documentation warns against this. Presumably this would affect the Databricks provider too, although I have not encountered this issue.
This is a list of things to call out in the docs: running terraform apply twice, and using terraform outputs.
The Terraform documentation for the Kubernetes provider states that this should not be done:
IMPORTANT WARNING When using interpolation to pass credentials to the Kubernetes provider from other resources, these resources SHOULD NOT be created in the same apply operation where Kubernetes provider resources are also used. This will lead to intermittent and unpredictable errors which are hard to debug and diagnose. The root issue lies with the order in which Terraform itself evaluates the provider blocks vs. actual resources.
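One common mitigation, sketched here under the assumption that the workspace and the Databricks resources can live in separate root modules (the directory layout and output names are illustrative, not from this issue), is to create the workspace in one configuration and consume its outputs from a second one, so each apply only evaluates a provider whose credentials already exist:

```hcl
# Stage 1 (workspace/ directory): create the workspace and export what the
# Databricks provider will need.
output "workspace_name" {
  value = azurerm_databricks_workspace.demo_test_workspace.name
}

# Stage 2 (databricks/ directory): read stage 1's state instead of
# interpolating resource attributes directly into the provider block.
data "terraform_remote_state" "workspace" {
  backend = "local"
  config = {
    path = "../workspace/terraform.tfstate"
  }
}

provider "databricks" {
  azure_auth = {
    workspace_name = data.terraform_remote_state.workspace.outputs.workspace_name
    # ... remaining azure_auth fields, as in the example above ...
  }
}
```

With this split, the Kubernetes-provider caveat about provider blocks being evaluated before their input resources exist no longer applies, at the cost of running two applies.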
Is your feature request related to a problem? Please describe.
Currently unable to create interactive single-user clusters, as the cluster resource doesn't allow setting the single_user_name property.
Describe the solution you'd like
Adding the single_user_name property to the cluster resource would solve this.
Describe alternatives you've considered
None (other than falling back to CLI etc)
Additional context
This would support #63
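The requested attribute might look like the following sketch; single_user_name is hypothetical here, since the provider does not yet expose it:

```hcl
resource "databricks_cluster" "single_user" {
  cluster_name  = "interactive-single-user"
  spark_version = "6.4.x-scala2.11"
  node_type_id  = "Standard_DS13_v2"
  num_workers   = 1

  # Hypothetical attribute proposed by this issue; not yet supported.
  single_user_name = "someone@example.com"
}
```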
Is your feature request related to a problem? Please describe.
Configuring the Git integration of notebooks is not supported in Terraform (at least on Azure Databricks).
Describe the solution you'd like
The ability to define Git integration for a notebook in the databricks_notebook resource.
Describe alternatives you've considered
In Azure Databricks (not sure about other flavors), you need to manually configure the Git repo in the notebook UI.
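A hypothetical shape for the requested setting (the git block and its attribute names are purely illustrative, not provider syntax):

```hcl
resource "databricks_notebook" "notebook" {
  content  = base64encode("# notebook source")
  path     = "/mynotebook"
  language = "PYTHON"
  format   = "SOURCE"

  # Hypothetical block illustrating the request; not yet supported.
  git {
    url    = "https://example.com/org/repo.git"
    branch = "main"
  }
}
```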
0.12.26
provider "databricks" {
host = var.databricks_host
token = var.databricks_api_token
}
resource "databricks_scim_group" "privileged-user-group" {
display_name = "Privileged user group"
}
resource "databricks_secret_scope" "privileged-scope" {
name = "privileged-secret-scope"
}
resource "databricks_secret_acl" "privileged-acl" {
principal = "Privileged user group"
permission = "READ"
scope = databricks_secret_scope.privileged-scope.name
}
resource "databricks_cluster" "standard_cluster" {
cluster_name = "standard-cluster"
spark_version = "6.4.x-scala2.11"
node_type_id = "Standard_DS13_v2"
autoscale {
min_workers = 1
max_workers = 3
}
library_whl {
path = "dbfs:/custom-whls/my_custom_whl.whl"
}
}
# Create high concurrency cluster with AAD credential passthrough enabled
resource "databricks_cluster" "high_concurrency_cluster" {
cluster_name = "high-concurrency-cluster"
spark_version = "6.4.x-scala2.11"
node_type_id = "Standard_DS13_v2"
autoscale {
min_workers = 1
max_workers = 3
}
spark_conf = {
"spark.databricks.cluster.profile": "serverless"
"spark.databricks.repl.allowedLanguages": "python, sql"
"spark.databricks.passthrough.enabled": true
"spark.databricks.pyspark.enableProcessIsolation": true
}
}
resource "databricks_notebook" "notebook" {
content = base64encode("# Welcome to your Jupyter notebook")
path = "/mynotebook"
overwrite = false
mkdirs = true
language = "PYTHON"
format = "SOURCE"
}
https://gist.github.com/masoncusack/3806347b0ef5ed873ac77689c63a4ab6
It should be recognised that a user group has been destroyed.
TF seems to hold the user group as part of the present state even though it's been deleted, causing future plan/applies to fail with error "Error: status 400: err Response from server {"error_code":"INVALID_PARAMETER_VALUE","message":"User or Group Privileged user group does not exist."}"
If we look in tfstate, the associated acl resource seems to still exist. Perhaps this wasn't successfully deleted by terraform destroy?
"resources": [
{
"mode": "managed",
"type": "databricks_secret_acl",
"name": "privileged-acl",
"provider": "provider.databricks",
"instances": [
{
"schema_version": 0,
"attributes": {
"id": "privileged-secret-scope|||Privileged user group",
"permission": "READ",
"principal": "Privileged user group",
"scope": "privileged-secret-scope"
}
}
]
},
terraform apply (with associated databricks_scim_group, secret_scope, and secret_acl resources in main.tf)
terraform destroy
There is a DefaultFunc set up for username and password, but because the attributes are required, you need to specify a valid string, which defeats the environment-variable fallback.
"basic_auth": &schema.Schema{
Type: schema.TypeList,
Optional: true,
MaxItems: 1,
Elem: &schema.Resource{
Schema: map[string]*schema.Schema{
"username": &schema.Schema{
Type: schema.TypeString,
Required: true,
DefaultFunc: schema.EnvDefaultFunc("DATABRICKS_USERNAME", nil),
},
"password": &schema.Schema{
Type: schema.TypeString,
Sensitive: true,
Required: true,
DefaultFunc: schema.EnvDefaultFunc("DATABRICKS_PASSWORD", nil),
},
},
},
ConflictsWith: []string{"token"},
},
Our goal is to make sure that there are acceptance tests for all resources.
There should be an acceptance test for Azure Mounts (Blob (both SAS key and Access Key), ADLS Gen1, ADLS Gen2).
Hi folks,
If you create an Azure Blob mount with Terraform, delete it manually (via a Databricks notebook), then re-run terraform plan, a file-not-found error is thrown.
terraform -v == 0.12.19
Please list the resources as a list, for example:
variable group_name {}
variable "client_id" {
type = string
}
variable "client_secret" {
type = string
}
variable "tenant_id" {
type = string
}
variable "subscription_id" {
type = string
}
variable "dbws_name" {
type = string
}
provider "azurerm" {
version = "~> 2.3"
features {}
subscription_id = var.subscription_id
client_id = var.client_id
client_secret = var.client_secret
tenant_id = var.tenant_id
}
provider "random" {
version = "~> 2.2"
}
resource "random_string" "name_prefix" {
special = false
upper = false
length = 6
}
resource "azurerm_resource_group" "example" {
name = var.group_name
location = "eastus" # note: must be lowercase without spaces, not the verbose style
}
resource "azurerm_databricks_workspace" "example" {
name = var.dbws_name
resource_group_name = azurerm_resource_group.example.name
location = azurerm_resource_group.example.location
sku = "standard"
}
resource "azurerm_storage_account" "account" {
name = "${random_string.name_prefix.result}blob"
resource_group_name = azurerm_resource_group.example.name
location = azurerm_resource_group.example.location
account_tier = "Standard"
account_replication_type = "LRS"
account_kind = "StorageV2"
}
resource "azurerm_storage_container" "example" {
name = "dev"
storage_account_name = azurerm_storage_account.account.name
container_access_type = "private"
}
resource "databricks_secret_scope" "terraform" {
name = "terraform"
initial_manage_principal = "users"
}
resource "databricks_secret" "blob_account_key" {
key = "blob_account_key"
string_value = azurerm_storage_account.account.primary_access_key
scope = databricks_secret_scope.terraform.name
}
provider "databricks" {
azure_auth = {
managed_resource_group = azurerm_databricks_workspace.example.managed_resource_group_name
azure_region = azurerm_databricks_workspace.example.location
workspace_name = azurerm_databricks_workspace.example.name
resource_group = azurerm_databricks_workspace.example.resource_group_name
client_id = var.client_id
client_secret = var.client_secret
tenant_id = var.tenant_id
subscription_id = var.subscription_id
}
}
resource "databricks_cluster" "cluster" {
cluster_name = "cluster1"
num_workers = 1
spark_version = "6.4.x-scala2.11"
node_type_id = "Standard_D3_v2"
}
resource "databricks_azure_blob_mount" "mount" {
cluster_id = databricks_cluster.cluster.id
container_name = "dev"
storage_account_name = azurerm_storage_account.account.name
mount_name = "dev"
auth_type = "ACCESS_KEY"
token_secret_scope = databricks_secret_scope.terraform.name
token_secret_key = databricks_secret.blob_account_key.key
}
Terraform plan should list the deleted mount as a resource to add.
Terraform plan throws a file not found error and terminates.
Please list the steps required to reproduce the issue, for example:
Terraform v0.12.24
+ provider.azurerm v1.44.0
+ provider.databricks v0.1.0
Please list the resources as a list, for example:
resource "databricks_notebook" "notebook" {
content = filebase64("${path.module}/nb/notebook1.scala")
path = "/Shared/Notebooks/notebook1.scala"
overwrite = false
mkdirs = true
format = "SOURCE"
language = "SCALA"
}
On first run, the resource is seen as an add, correctly, and deploys.
On a subsequent terraform apply it sees the content as having changed:
# databricks_notebook.notebook must be replaced
-/+ resource "databricks_notebook" "notebook" {
~ content = "3750311991" -> "2327128740" # forces replacement
format = "SOURCE"
~ id = "/Shared/Notebooks/notebook1.scala" -> (known after apply)
language = "SCALA"
mkdirs = true
~ object_id = 4081355166030977 -> (known after apply)
~ object_type = "NOTEBOOK" -> (known after apply)
overwrite = false
path = "/Shared/Notebooks/notebook1.scala"
}
This also exacerbates the issue #41 when deploying multiple files as it's trying to re-create all of them every time.
The content should be seen as the same and no-op
The content is seen as different and a delete/create is needed
Use the above hcl to deploy a notebook, then run tf apply
again.
This can break the provider during a refresh operation if the library has messages and the provider is expecting a single string.
Terraform v0.12.26
Please list the resources as a list, for example:
resource "databricks_azure_adls_gen2_mount" "mount_wibble" {
cluster_id = databricks_cluster.cluster.id
container_name = "wibble"
storage_account_name = "storeageaccount"
directory = "wibble"
mount_name = "wibbledir"
tenant_id = "<tenant_id>"
client_secret_scope = databricks_secret_scope.terraform.name
client_secret_key = databricks_secret.client_secret.key
initialize_file_system = true
}
Error: Response from server (403) <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 403 Invalid access token.</title>
</head>
<body><h2>HTTP ERROR 403</h2>
<p>Problem accessing /api/1.2/commands/status. Reason:
<pre> Invalid access token.</pre></p>
</body>
</html>
N/A
The terraform plan should throw an error saying that the directory does not match the validation logic of starting with a /.
No validation error; instead you get an error about an invalid access token. It looks like the dbutils.fs.mount call being made behind the scenes does not have a timeout, and because it is looking up an invalid URI, everything just times out.
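For reference, a configuration that would pass the expected validation (assuming the rule is that directory must begin with a slash, as the issue describes) would look like:

```hcl
resource "databricks_azure_adls_gen2_mount" "mount_wibble" {
  cluster_id             = databricks_cluster.cluster.id
  container_name         = "wibble"
  storage_account_name   = "storeageaccount"
  directory              = "/wibble" # leading slash satisfies the documented validation
  mount_name             = "wibbledir"
  tenant_id              = "<tenant_id>"
  client_secret_scope    = databricks_secret_scope.terraform.name
  client_secret_key      = databricks_secret.client_secret.key
  initialize_file_system = true
}
```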
Please list the steps required to reproduce the issue, for example:
terraform plan
terraform apply
➜ terraform -v
Terraform v0.12.24
resource "databricks_job" "transform" {
existing_cluster_id = databricks_cluster.cluster.id
notebook_path = databricks_notebook.transform.path
name = "transform"
schedule {
quartz_cron_expression = "0 2 * * *"
timezone_id = "UTC"
}
}
2020-04-30T18:30:30.347+0200 [DEBUG] plugin.terraform-provider-databricks_v0.1.0: 2020/04/30 18:30:30 {"Method":"POST","URI":"https://eastus2.azuredatabricks.net/api/2.0/jobs/create","Payload":{"existing_cluster_id":"0430-125417-toed833","new_cluster":{},"notebook_task":{"notebook_path":"/workspace/sample/transform.scala"},"name":"transform","schedule":{"quartz_cron_expression":"0 2 * * *","timezone_id":"UTC"}}}
2020/04/30 18:30:30 [DEBUG] databricks_job.transform: apply errored, but we're indicating that via the Error pointer rather than returning it: status 400: err Response from server {"error_code":"INVALID_PARAMETER_VALUE","message":"Missing required field: settings.cluster_spec.new_cluster.size"}
Job should be created without errors.
An error message indicating that a parameter is missing.
I think you need to leave out the new_cluster parameter from the HTTP request when existing_cluster_id is not null, to avoid server-side validation of that block.
terraform apply
Reuse an existing cluster when creating a job.
➜ tf -v
Terraform v0.12.24
+ provider.azuread v0.8.0
+ provider.azurerm v2.6.0
+ provider.databricks (unversioned)
+ provider.http v1.2.0
+ provider.local v1.4.0
+ provider.null v2.1.2
+ provider.random v2.2.1
Provider is a custom build from commit 6bce373
I later updated my version to da0a178 but it kept crashing.
➜ go version
go version go1.14.2 darwin/amd64
See https://gist.github.com/sdebruyn/f97beefc8670f643a2ec3e8894ebe81f
Terraform creates all listed resources
Terraform crashed
terraform apply -auto-approve
terraform plan
It happens every time with my current state. I did git clean -dfx and ran terraform init again, but terraform kept crashing.
Terraform stops crashing when I remove the last resource in my config (the notebook).
I tried with other notebooks and they worked fine, except for one. That one also uses base64encode(templatefile(... for the content.
During configuration, there is a check for whether the host or token is empty; if either is, the provider will try to read the Databricks CLI config file.
if config.Host == "" || config.Token == "" {
if err := tryDatabricksCliConfigFile(d, &config); err != nil {
return nil, fmt.Errorf("failed to get credentials from config file; error msg: %w", err)
}
}
The problem here is that for MWS you need to set up host and basic_auth, but if you don't provide a token it will try to read your config file, leading to two possible scenarios:
You also can't use a placeholder token to avoid the usage of the config file, because token conflicts with basic_auth. This means you need to create a config file, or add a new profile to it, just to put the correct host that you already specified in the .tf file.
Workaround:
provider "databricks" {
host = "https://accounts.cloud.databricks.com"
profile = "ACCOUNT"
basic_auth {
username = "username"
password = "password"
}
}
[DEFAULT]
...
[ACCOUNT]
host = https://accounts.cloud.databricks.com
token = placeholder
Creating a workspace with a secret scope, cluster, or possibly other references, and then manually deleting the workspace after creation, results in an error on terraform plan/apply.
0.12.24
Please list the resources as a list, for example:
# Copy-paste your Terraform configurations here - for large Terraform configs,
# please use a service like Dropbox and share a link to the ZIP file. For
# security, you can also encrypt the files using our GPG public key.
Error: parse :///api/2.0/secrets/scopes/list?: missing protocol scheme
Error: parse :///api/2.0/clusters/get?cluster_id=0610-100720-loss540: missing protocol scheme
Deleting an existing workspace, previously created by Terraform, should cause the provider to wait for a new workspace to be created before querying for secret scopes, clusters, etc.
Deleting an existing workspace, previously created by Terraform, results in an error on terraform plan/apply.
Please list the steps required to reproduce the issue, for example:
terraform plan
Error output of a workspace with cluster, secrets and ADAL Gen2 mount, that was manually deleted:
variable "user" {
type = string
}
variable "password" {
type = string
}
variable "client_id" {
type = string
}
variable "client_secret" {
type = string
}
variable "tenant_id" {
type = string
}
variable "subscription_id" {
type = string
}
provider "azurerm" {
version = "~> 2.10"
client_id = var.client_id
client_secret = var.client_secret
tenant_id = var.tenant_id
subscription_id = var.subscription_id
features {}
skip_provider_registration = true
}
resource "azurerm_resource_group" "db" {
name = "db-labs-resources"
location = "West Europe"
}
resource "azurerm_databricks_workspace" "module" {
name = "db-labs-worspace"
resource_group_name = azurerm_resource_group.db.name
location = azurerm_resource_group.db.location
sku = "premium"
}
data "azurerm_client_config" "current" {}
provider "databricks" {
version = "~> 0.1"
azure_auth = {
managed_resource_group = azurerm_databricks_workspace.module.managed_resource_group_name
azure_region = azurerm_databricks_workspace.module.location
workspace_name = azurerm_databricks_workspace.module.name
resource_group = azurerm_databricks_workspace.module.resource_group_name
client_id = var.client_id
client_secret = var.client_secret
tenant_id = var.tenant_id
subscription_id = var.subscription_id
}
}
resource "databricks_secret_scope" "sandbox_storage" {
name = "sandbox-storage"
initial_manage_principal = "users"
}
resource "databricks_secret" "secret" {
key = "secret"
string_value = "I am a secret"
scope = databricks_secret_scope.sandbox_storage.name
}
Is your feature request related to a problem? Please describe.
The Databricks UI provides some validation, so the provider should do so as well. E.g.:
Error: status 400: err Response from server {"error_code":"INVALID_PARAMETER_VALUE","message":"At least one EBS volume must be attached for clusters created with node type m4.xlarge."}
Describe the solution you'd like
Cluster config validation before sending the request.
Our goal is to make sure that there are acceptance tests for all resources.
There should be an acceptance test for DBFS Files.
Our goal is to make sure that there are acceptance tests for all resources.
There should be an acceptance test for DBFS File Sync.
Is your feature request related to a problem? Please describe.
I would like to be able to create cluster policies and cluster policy permissions. Please read more about it here in terms of the features they enable: https://docs.databricks.com/dev-tools/api/latest/policies.html#cluster-policy-permissions-api, https://docs.databricks.com/administration-guide/clusters/policies.html
Describe the solution you'd like
This requires:
Describe alternatives you've considered
Other alternatives could be making the cluster policy and the cluster policy permissions separate objects to make them easier to manage, but the permissions objects themselves are not really reusable or creatable on their own. So from a CRUD standpoint it does not make much sense.
Additional context
Please read these docs for more information: https://docs.databricks.com/dev-tools/api/latest/policies.html#cluster-policy-permissions-api, https://docs.databricks.com/administration-guide/clusters/policies.html
$ terraform -v
Terraform v0.12.6
The issue is not present with databricks_secret_scope alone; a databricks_secret is required to reproduce it.
provider "databricks" {
host = "https://[redacted].azuredatabricks.net"
token = "[redacted]"
}
resource "databricks_secret_scope" "my-scope" {
name = "terraform-demo-scope"
initial_manage_principal = "users"
}
resource "databricks_secret" "my_secret" {
key = "test-secret-1"
string_value = "hello world 123"
scope = "${databricks_secret_scope.my-scope.name}"
}
Terraform should be able to plan when the secret has been deleted manually. The plan should notice the deletion and re-create the secret in the secret scope.
Error message: Error: status 404: err Response from server {"error_code":"RESOURCE_DOES_NOT_EXIST","message":"Scope terraform-demo-scope does not exist!"}
terraform apply
databricks secrets list --scope my-scope - use the CLI and see the secret exists
databricks secrets delete-scope --scope terraform-demo-scope - delete the secret scope
terraform plan - error shown
➜ terraform -v
Terraform v0.12.24
+ provider.azuread v0.8.0
+ provider.azurerm v2.6.0
+ provider.databricks v0.1.0
+ provider.http v1.2.0
+ provider.local v1.4.0
+ provider.null v2.1.2
+ provider.random v2.2.1
Same one as in #21
2020-05-04T10:45:06.969+0200 [DEBUG] plugin.terraform-provider-databricks_v0.1.0: 2020/05/04 10:45:06 {"Method":"GET","URI":"https://eastus2.azuredatabricks.net/api/2.0/clusters/list-node-types?"}
2020/05/04 10:45:07 [ERROR] eval: *terraform.EvalConfigProvider, err: status 400: err Response from server {"error_code":"INVALID_PARAMETER_VALUE","message":"Delegate unexpected exception during listing node types: com.databricks.backend.manager.util.UnknownWorkerEnvironmentException: Unknown worker environment WorkerEnvId(workerenv-3375316063940170)"}
Error: status 400: err Response from server {"error_code":"INVALID_PARAMETER_VALUE","message":"Delegate unexpected exception during listing node types: com.databricks.backend.manager.util.UnknownWorkerEnvironmentException: Unknown worker environment WorkerEnvId(workerenv-1918878560143470)"}
on databricks.tf line 10, in provider "databricks":
10: provider "databricks" {
After creating the workspace, we should be able to create the cluster during the same apply run.
When you create a workspace and terraform goes on to immediately create a cluster, you get the mentioned exception. It works when you apply a second time after a few seconds.
terraform apply
This third party databricks provider has the same issue
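A workaround sometimes used for this kind of eventual-consistency race (a sketch, assuming the hashicorp/time provider, which appears in the version listings elsewhere in these issues; the 120s duration is a guess) is to insert an explicit delay between workspace creation and the first Databricks resource:

```hcl
# Give the workspace's worker environment time to become available before the
# Databricks provider queries it.
resource "time_sleep" "wait_for_workspace" {
  depends_on      = [azurerm_databricks_workspace.dbks]
  create_duration = "120s"
}

resource "databricks_cluster" "cluster" {
  depends_on    = [time_sleep.wait_for_workspace]
  cluster_name  = "cluster"
  spark_version = "6.4.x-scala2.11"
  node_type_id  = "Standard_DS3_v2"
  num_workers   = 1
}
```

This only papers over the race rather than fixing it; a retry in the provider's list-node-types call would be the proper solution.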
In the databricks_cluster resource, it'd be nice to be able to enable Azure AD credential passthrough.
Hi,
A bug we've hit that I'd like to pick up and PR a fix for; it looks like a super easy fix. We'd use the integration tests to be added in #37 to validate that it behaves correctly.
When re-running the adls_gen2_mount resource, it will always detect a change due to an additional slash in the directory. I think this is likely a one- or two-character fix, plus adding the tests to validate it.
# tf -v
Terraform v0.12.16
+ provider.azuread v0.8.0
+ provider.azurerm v2.8.0
+ provider.databricks v0.1.0
+ provider.random v2.2.1
resource "databricks_azure_adls_gen2_mount" "mount" {
cluster_id = databricks_cluster.cluster.id
container_name = "dev" #todo: replace with env...
storage_account_name = azurerm_storage_account.account.name
directory = "/dir"
mount_name = "localdir"
tenant_id = data.azuread_client_config.current.tenant_id
client_id = azuread_application.datalake.application_id
client_secret_scope = databricks_secret_scope.terraform.name
client_secret_key = databricks_secret.client_secret.key
}
> TF_LOG=debug tf plan -var-file vars.tfvars 2>&1 >/dev/null | grep "plugin.terraform-provider-databricks_v0.1.0"
2020-05-05T15:10:42.867Z [DEBUG] plugin.terraform-provider-databricks_v0.1.0: 2020/05/05 15:10:42 {"Method":"POST","URI":"https://eastus.azuredatabricks.net/api/1.2/commands/execute","Payload":{"language":"python","clusterId":"0504-155102-pram660","contextId":"2178625329361652192","command":"\ntry:\n configs = {\"fs.azure.account.auth.type\": \"OAuth\",\n \"fs.azure.account.oauth.provider.type\": \"org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider\",\n \"fs.azure.account.oauth2.client.id\": \"REDACTED",\n \"fs.azure.account.oauth2.client.secret\": dbutils.secrets.get(scope = \"terraform\", key = \"datalake_sp_secret\"),\n \"fs.azure.account.oauth2.client.endpoint\": \"https://login.microsoftonline.com/REDACTED/oauth2/token\"}\n dbutils.fs.mount(\n source = \"abfss://[email protected]//dir\",\n mount_point = \"/mnt/localdir\",\n extra_configs = configs)\nexcept Exception as e:\n dbutils.fs.unmount(\"/mnt/localdir\")\n raise e\ndbutils.notebook.
No diff should be found and plan should be empty.
After the first apply, all plan/apply operations detect a diff and recreate the mount.
This is due to the added / character.
Please list the steps required to reproduce the issue, for example:
terraform apply
terraform plan
directory field
Is your feature request related to a problem? Please describe.
I work across a bunch of repos and each has their own requirements for tooling (and tooling versions). As part of working on the issues that @lawrencegripper recently raised, we will create a VS Code Devcontainer definition. This allows us to capture and share the requirements in a container definition and use that for any work on the project.
More information here: https://code.visualstudio.com/docs/remote/containers
Describe the solution you'd like
Contribute our .devcontainer
folder with the definition of the container to use when working with VS Code Devcontainers for this repo so that others can use it if they choose that workflow.
➜ terraform -v
Terraform v0.12.25
+ provider.azuread v0.9.0
+ provider.azurerm v2.11.0
+ provider.databricks (unversioned)
+ provider.http v1.2.0
+ provider.null v2.1.2
+ provider.random v2.2.1
+ provider.time v0.5.0
Current master branch
https://github.com/datarootsio/terraform-module-azure-datalake/runs/709963745?check_suite_focus=true
2020-05-26T16:40:31.5532525Z command.go:172: Error: Response from server (403) <html>
2020-05-26T16:40:31.5532687Z command.go:172: <head>
2020-05-26T16:40:31.5533058Z command.go:172: <meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
2020-05-26T16:40:31.5533244Z command.go:172: <title>Error 403 Invalid access token.</title>
2020-05-26T16:40:31.5533404Z command.go:172: </head>
2020-05-26T16:40:31.5533565Z command.go:172: <body><h2>HTTP ERROR 403</h2>
2020-05-26T16:40:31.5533748Z command.go:172: <p>Problem accessing /api/2.0/secrets/put. Reason:
2020-05-26T16:40:31.5533922Z command.go:172: <pre> Invalid access token.</pre></p>
2020-05-26T16:40:31.5534079Z command.go:172: </body>
2020-05-26T16:40:31.5534228Z command.go:172: </html>
2020-05-26T16:40:31.5534573Z command.go:172: : invalid character '<' looking for beginning of value
2020-05-26T16:40:31.5534739Z command.go:172:
2020-05-26T16:40:31.5534912Z command.go:172: on databricks.tf line 73, in resource "databricks_secret" "cmdb_master":
2020-05-26T16:40:31.5535168Z command.go:172: 73: resource "databricks_secret" "cmdb_master" {
Create a databricks_secret
See error output above
Please list the steps required to reproduce the issue, for example:
At first sight I thought it was an issue with databricks_token, but it does not seem to be directly related to that resource. It seems to be an issue with the token that this provider uses underneath to create resources, as this also happens with a databricks_secret that isn't using any access tokens explicitly.
Hi there,
variable "client_id" {}
variable "client_secret" {}
variable "tenant_id" {}
variable "subscription_id" {}
variable "resource_group" {}
variable "managed_resource_group_name" {}
provider "azurerm" {
version = ">= 2.3.0"
client_id = var.client_id
client_secret = var.client_secret
tenant_id = var.tenant_id
subscription_id = var.subscription_id
features {}
}
resource "azurerm_databricks_workspace" "demo" {
location = "westeurope"
name = "databricks-demo-workspace"
resource_group_name = var.resource_group
managed_resource_group_name = var.managed_resource_group_name
sku = "standard"
}
resource "databricks_cluster" "demo" {
autoscale {
min_workers = 2
max_workers = 8
}
cluster_name = "databricks-demo-cluster"
spark_version = "6.4.x-scala2.11"
node_type_id = "Standard_DS3_v2"
autotermination_minutes = 30
}
provider "databricks" {
version = ">= 0.1"
azure_auth = {
managed_resource_group = azurerm_databricks_workspace.demo.managed_resource_group_name
azure_region = azurerm_databricks_workspace.demo.location
workspace_name = azurerm_databricks_workspace.demo.name
resource_group = azurerm_databricks_workspace.demo.resource_group_name
client_id = var.client_id
client_secret = var.client_secret
tenant_id = var.tenant_id
subscription_id = var.subscription_id
}
}
https://gist.github.com/christophecremon/f839ac7b0f342d277f0ceaba2fbab165
Terraform generates an execution plan.
Terraform will perform the following actions only once the apply command has been executed:
An error is generated:
Error 404 The workspace with resource ID /subscriptions/<REDACTED_FOR_GITHUB>/resourceGroups/databricks-rg/providers/Microsoft.Databricks/workspaces/databricks-demo-workspace could not be found.
Terraform actually creates the Databricks workspace, with a Premium pricing tier, even though the terraform apply command has not been executed.
Please list the steps required to reproduce the issue, for example:
terraform plan
I had some trouble finding a valid spark_version string for a cluster resource.
It'd be nice if the generic format of the string, or a selection of valid options to use from which this could be inferred, were provided under the usage example in the cluster resource documentation.
I guess the choice of documenting a generic format or specific working examples should depend on whether all spark versions that a user can select in the Databricks UI will always be supported by the TF resource. If this is the case, users can just translate the details there into a valid string.
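Based on the working examples elsewhere in these issues, the string appears to follow the pattern <runtime-major>.<runtime-minor>.x-scala<scala-version>; this pattern is inferred from those examples rather than taken from any documentation:

```hcl
resource "databricks_cluster" "example" {
  cluster_name  = "docs-example"
  # Inferred format: "<runtime>.x-scala<scala version>",
  # e.g. Databricks Runtime 6.4 on Scala 2.11.
  spark_version = "6.4.x-scala2.11"
  node_type_id  = "Standard_DS3_v2"
  num_workers   = 1
}
```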
Hi there,
$ terraform version
Terraform v0.12.24
+ provider.azuread v0.8.0
+ provider.azurerm v2.5.0
+ provider.databricks v0.1.0
Please list the resources as a list, for example:
resource "databricks_notebook" "tamers-databricks-nb-handson-unsupervised-02" {
content = filebase64("notebooks/handson-unsupervised-02-end-to-end-machine-learning-project.py")
language = "PYTHON"
path = "/Shared/xxxx/handson-unsupervised-02-end-to-end-machine-learning-project.py"
overwrite = false
mkdirs = true
format = "SOURCE"
}
https://gist.github.com/mikemowgli/42c32bd11e21d926cadc7b05788d49d7
No error nor warning when applying
warning or errors
TF_LOG=DEBUG terraform apply -target databricks_notebook.tamers-databricks-nb-handson-unsupervised-02
In an Azure Databricks workspace, a terraform apply on a specific notebook (using the -target option) yields the log in the gist link: only a warning, and the content of the notebook is not updated but left as-is.
However, when applying all my terraform plan, I get the same debug log, with only this difference at the very end:
databricks_notebook.tamers-databricks-nb-handson-unsupervised-02: Creation complete after 3s [id=/Shared/xxxx/handson-unsupervised-02-end-to-end-machine-learning-project.py]
Error: Response from server (429)
2020-05-13T12:58:48.240+0200 [DEBUG] plugin: plugin process exited: path=/home/mvdborne/.terraform.d/plugins/linux_amd64/terraform-provider-databricks_v0.1.0 pid=8267
2020-05-13T12:58:48.240+0200 [DEBUG] plugin: plugin exited
$ echo $?
1
I'm impacted by the notebook content issue, so this one might be a consequence of issue 42.
Is your feature request related to a problem? Please describe.
Creating a secret scope backed by Azure Key Vault is not supported at the moment.
Describe the solution you'd like
An extra setting in a databricks_secret_scope resource to link it with an Azure Key Vault.
Describe alternatives you've considered
The only alternative so far is using a Databricks-backed secret scope.
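A hypothetical shape for the requested setting (the keyvault_metadata block and its attribute names are illustrative only; the azurerm_key_vault reference assumes a Key Vault managed elsewhere in the same configuration):

```hcl
resource "databricks_secret_scope" "kv" {
  name = "keyvault-backed-scope"

  # Hypothetical block illustrating the request; not yet supported by the provider.
  keyvault_metadata {
    resource_id = azurerm_key_vault.example.id
    dns_name    = azurerm_key_vault.example.vault_uri
  }
}
```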
Our goal is to make sure that there are acceptance tests for all resources.
There should be an acceptance test for instance pools.
➜ terraform -v
Terraform v0.12.24
terraform {
required_version = "~> 0.12"
}
provider "azurerm" {
version = "~> 2.6.0"
features {}
}
provider "azuread" {
version = "~> 0.8.0"
}
data "azurerm_client_config" "current" {
}
resource "azuread_application" "aadapp" {
name = "app"
required_resource_access {
resource_app_id = "e406a681-f3d4-42a8-90b6-c2b029497af1"
resource_access {
id = "03e0da56-190b-40ad-a80c-ea378c433f7f"
type = "Scope"
}
}
required_resource_access {
resource_app_id = "00000003-0000-0000-c000-000000000000"
resource_access {
id = "e1fe6dd8-ba31-4d61-89e7-88639da4683d"
type = "Scope"
}
}
}
resource "random_password" "aadapp_secret" {
length = 32
# special = false - this fixes the issue...
}
resource "azuread_service_principal" "sp" {
application_id = azuread_application.aadapp.application_id
}
resource "azuread_service_principal_password" "sppw" {
service_principal_id = azuread_service_principal.sp.id
value = random_password.aadapp_secret.result
end_date = "2030-01-01T00:00:00Z"
}
resource "azurerm_resource_group" "rg" {
name = "rg"
location = var.region
}
resource "azurerm_role_assignment" "sprg" {
scope = azurerm_resource_group.rg.id
role_definition_name = "Owner"
principal_id = azuread_service_principal.sp.object_id
}
resource "azurerm_databricks_workspace" "dbks" {
name = "dbks"
resource_group_name = azurerm_resource_group.rg.name
managed_resource_group_name = "rgdbks"
location = var.region
sku = "standard"
}
provider "databricks" {
azure_auth = {
managed_resource_group = azurerm_databricks_workspace.dbks.managed_resource_group_name
azure_region = azurerm_databricks_workspace.dbks.location
workspace_name = azurerm_databricks_workspace.dbks.name
resource_group = azurerm_databricks_workspace.dbks.resource_group_name
client_id = azuread_application.aadapp.application_id
client_secret = random_password.aadapp_secret.result
tenant_id = data.azurerm_client_config.current.tenant_id
subscription_id = data.azurerm_client_config.current.subscription_id
}
}
resource "databricks_cluster" "cluster" {
spark_version = var.databricks_cluster_version
cluster_name = "cluster"
node_type_id = var.databricks_cluster_node_type
autotermination_minutes = 30
autoscale {
min_workers = 2
max_workers = 4
}
}
2020-04-30T13:44:32.563+0200 [DEBUG] plugin.terraform-provider-databricks_v0.1.0: 2020/04/30 13:44:32 Creating db client via azure auth!
2020-04-30T13:44:32.563+0200 [DEBUG] plugin.terraform-provider-databricks_v0.1.0: 2020/04/30 13:44:32 Running Azure Auth
2020-04-30T13:44:32.563+0200 [DEBUG] plugin.terraform-provider-databricks_v0.1.0: 2020/04/30 13:44:32 [DEBUG] Creating Azure Databricks management OAuth token.
2020-04-30T13:44:32.563+0200 [DEBUG] plugin.terraform-provider-databricks_v0.1.0: 2020/04/30 13:44:32 {"Method":"POST","URI":"https://login.microsoftonline.com/TENANTID/oauth2/token","Payload":"grant_type=client_credentials\u0026client_id=0123456-1234-1234-1234-52ef7bbab4af\u0026client_secret=NcntUf_9vBruvi5v8l}$GWolISz+kyXy\u0026resource=https://management.core.windows.net/"}
2020/04/30 13:44:33 [ERROR] <root>: eval: *terraform.EvalConfigProvider, err: status 401: err Response from server {"error":"invalid_client","error_description":"AADSTS7000215: Invalid client secret is provided.\r\nTrace ID: 3386fd50-68af-4678-80e5-596f419e0d00\r\nCorrelation ID: 3ef4860f-7c98-41fb-8f63-ae37e9091033\r\nTimestamp: 2020-04-30 11:44:33Z","error_codes":[7000215],"timestamp":"2020-04-30 11:44:33Z","trace_id":"3386fd50-68af-4678-80e5-596f419e0d00","correlation_id":"3ef4860f-7c98-41fb-8f63-ae37e9091033","error_uri":"https://login.microsoftonline.com/error?code=7000215"}
The request to create an access token should work without issues.
The request fails because the client secret containing special characters isn't submitted correctly.
terraform apply
0.12.26
Please list the resources as a list, for example:
resource "databricks_cluster" "cluster" {
num_workers = 1
spark_version = "6.4.x-scala2.11"
node_type_id = "Standard_D3_v2"
autotermination_minutes = 15
}
resource "databricks_secret_scope" "terraform" {
name = "terraform${databricks_cluster.cluster.cluster_id}"
initial_manage_principal = "users"
}
resource "databricks_secret" "client_secret" {
key = "datalake_sp_secret"
string_value = "%[2]s"
scope = databricks_secret_scope.terraform.name
}
resource "databricks_azure_adls_gen2_mount" "mount" {
cluster_id = databricks_cluster.cluster.id
container_name = "dev" # Created by prereqs.tf
storage_account_name = "%[9]s"
directory = ""
mount_name = "localdir${databricks_cluster.cluster.cluster_id}"
tenant_id = "%[3]s"
client_id = "%[1]s"
client_secret_scope = databricks_secret_scope.terraform.name
client_secret_key = databricks_secret.client_secret.key
initialize_file_system = true
}
N/A
When the cluster that originally created the mount has been deleted inside Databricks, the tf plan should identify this. It should then identify that the mount most likely needs to be re-created, as there is a high likelihood that the cluster is also in the same Terraform configuration.
The provider throws an error during tf plan and renders the state file unusable unless you manually remove the mount from state.
Error: status 400: err Response from server {"error_code":"INVALID_PARAMETER_VALUE","message":"Cluster <some cluster id> does not exist"}
Please list the steps required to reproduce the issue, for example:
terraform apply
terraform plan
>>> Error occurs here during refresh of state
Is your feature request related to a problem? Please describe.
Support multiple workspaces api to be able to provision Databricks workspaces via terraform.
Describe the solution you'd like
Creation of new resources:
Additional context
This is a brand-new public preview API for the AWS cloud service provider.
Is your feature request related to a problem? Please describe.
Currently not a problem, but it's advised to start using the new unique URLs for each Databricks workspace as documented in https://docs.microsoft.com/en-us/azure/databricks/release-notes/product/2020/april#unique-urls-for-each-azure-databricks-workspace
Describe the solution you'd like
Replace the current code that uses https://.azuredatabricks.net/
Additional context
The current hostnames have not been deprecated (yet), so we still have time.
The method tries to check whether each library has a non-empty name, but the Pypi, Maven, and Cran libraries are pointers to structs, so a length check like the following can lead to a nil-pointer panic:
if len(library.Pypi.Package) > 0
Hi there,
When we try to create a Databricks job with the new_cluster field filled in, the payload sent to the API is empty.
provider "databricks" {
host = "https://xxxxxx.cloud.databricks.com/"
token = "xxxxxx"
}
resource "databricks_job" "my_job3" {
new_cluster {
autoscale {
min_workers = 2
max_workers = 3
}
spark_version = "6.4.x-scala2.11"
aws_attributes {
availability = "SPOT"
zone_id = "us-east-1a"
spot_bid_price_percent = "100"
}
node_type_id = "r3.xlarge"
}
notebook_path = "/Users/[email protected]/my-demo-notebook"
name = "my-demo-notebook"
timeout_seconds = 3600
max_retries = 1
max_concurrent_runs = 1
}
https://gist.github.com/Gnarik/21805a0ceb7e8fd26b67318d83ff80f6
We would expect the new_cluster field to be filled with the values specified in the Terraform script, and a job that spins up a new cluster when it starts to be created in the Databricks environment.
The new_cluster field is empty in the API HTTP payload and no job is created.
Please list the steps required to reproduce the issue, for example:
terraform init
terraform apply
None
None
Our goal is to make sure that there are acceptance tests for all resources.
There should be an acceptance test for Azure Mounts (Blob (both SAS key and Access Key), ADLS gen 1, ADLS gen 2).
Is your feature request related to a problem? Please describe.
Currently the resource_databricks_azure_* mounts don't have the "Sensitive" flag set on their secrets in the schema, making it possible to print secrets to standard output.
Describe the solution you'd like
Addition of the "Sensitive bool" field, as per the official documentation:
https://www.terraform.io/docs/extend/schemas/schema-methods.html
Describe alternatives you've considered
N/A
Additional context
Issue found in:
databricks/resource_databricks_auzre_adls_gen1_mount.go
databricks/resource_databricks_auzre_adls_gen2_mount.go
databricks/resource_databricks_auzre_blob_mount.go
The solution should look similar to the code below (note the added Sensitive field):
"token_secret_key": {
    Type:      schema.TypeString,
    Required:  true,
    ForceNew:  true,
    Sensitive: true,
},
The following data sources are missing documentation.
We expect all the attributes and types to be clearly documented. These data sources will be used in conjunction with other resources so it is important that they are documented.
Is your feature request related to a problem? Please describe.
Currently ADLS mounts allow mounts to be created using service principal details, but for some scenarios we want to be able to provision mounts using AAD Passthrough: https://docs.microsoft.com/en-us/azure/databricks/security/credential-passthrough/adls-passthrough#--mount-azure-data-lake-storage-to-dbfs-using-credential-passthrough
Current ADLS Gen 2 mount resource:
resource "databricks_azure_adls_gen2_mount" "mount" {
cluster_id = ""
container_name = ""
storage_account_name = ""
directory = ""
mount_name = ""
tenant_id = ""
client_id = ""
client_secret_scope = ""
client_secret_key = ""
initialize_file_system = true
}
Describe the solution you'd like
Would like to be able to specify the use of AAD Passthrough rather than passing client_id etc.
The proposed change to the resource is shown below
Service principal:
resource "databricks_azure_adls_gen2_mount" "mount" {
cluster_id = ""
container_name = ""
storage_account_name = ""
directory = ""
mount_name = ""
initialize_file_system = true
mount_type = "ServicePrincipal"
service_principal {
tenant_id = ""
client_id = ""
client_secret_scope = ""
client_secret_key = ""
}
}
AAD Passthrough:
resource "databricks_azure_adls_gen2_mount" "mount" {
cluster_id = ""
container_name = ""
storage_account_name = ""
directory = ""
mount_name = ""
initialize_file_system = true
mount_type = "AADPassthrough"
}
Our goal is to make sure that there are acceptance tests for all resources.
There should be an acceptance test for AWS Mounts (both IAM User & IAM Role mounts).
Is your feature request related to a problem? Please describe.
We are using the databricks-terraform provider in conjunction with the azurerm provider to deploy an Azure Databricks workspace and set up Databricks using tasks in Azure DevOps Pipelines.
When using the Terraform task in Azure DevOps Pipelines to target Azure, it sets up the ARM_* env vars that the azurerm provider expects. Since these are not used by the databricks-terraform provider, we cannot use the Terraform task. As an alternative we are creating a separate script task that sets the additional env vars for databricks-terraform and then kicking off the terraform apply.
Describe the solution you'd like
The azurerm provider allows the ARM_CLIENT_ID and ARM_CLIENT_SECRET env vars to be set. If we could have a way to opt in to configuring the databricks-terraform provider to use these values to get an authorization token for talking to Azure Databricks, it would simplify the deployment pipeline.
Describe alternatives you've considered
Current alternative is wrapping the terraform execution inside a separate task in Azure DevOps Pipelines
Is your feature request related to a problem? Please describe.
I would like the scim service principal resource to be implemented, with acceptance tests and documented in the website docs. https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/scim/scim-sp
Describe the solution you'd like
This requires:
Describe alternatives you've considered
The design is straightforward and follows the pattern of the SCIM user resource.
Additional context
For more information read here: https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/scim/scim-sp. It enables you to use Terraform to add service principals to the workspace via the SCIM API.
Is your feature request related to a problem? Please describe.
Currently this project only supports auth via a service principal and creates a token, but this should not be necessary; it should just use the AAD token with the right headers.
Describe the solution you'd like
Refactor the azure_ws_init file into an azure auth component for the dbclient.
Additional context
Azure customers would much rather use AAD tokens, rather than PAT tokens when interacting with the APIs.
Current Blockers
AAD token-based auth is blocked because API calls to the Secrets API via an AAD token are blocked.
Our goal is to make sure that there are acceptance tests for all resources.
There should be an acceptance test for Clusters.
Hi there,
When creating a new job with the new_cluster field filled in, the parameter ebs_volume_count disappears, which leads to an error response from the Databricks API.
Databricks provider version is master branch patched with #79
provider "databricks" {
host = "https://xxxxxx.cloud.databricks.com/"
token = "xxxxxx"
}
resource "databricks_job" "my_job3" {
new_cluster {
autoscale {
min_workers = 2
max_workers = 3
}
spark_version = "6.4.x-scala2.11"
aws_attributes {
availability = "SPOT"
zone_id = "us-east-1a"
spot_bid_price_percent = "100"
first_on_demand = 1
ebs_volume_type = "GENERAL_PURPOSE_SSD"
ebs_volume_count = 1
ebs_volume_size = 32
}
node_type_id = "r4.2xlarge"
}
notebook_path = "/Users/[email protected]/my-demo-notebook"
name = "my-demo-notebook"
timeout_seconds = 3600
max_retries = 1
max_concurrent_runs = 1
}
https://gist.github.com/Gnarik/dc16b034a1809011c7092897bc6326b9
Job is created with new_cluster behavior
No job is created and an error message is returned instead
terraform init
terraform apply
➜ terraform -v
Terraform v0.12.24
terraform {
required_version = "~> 0.12"
}
provider "azurerm" {
version = "~> 2.6.0"
features {}
}
provider "azuread" {
version = "~> 0.8.0"
}
data "azurerm_client_config" "current" {
}
resource "azuread_application" "aadapp" {
name = "app"
required_resource_access {
resource_app_id = "e406a681-f3d4-42a8-90b6-c2b029497af1"
resource_access {
id = "03e0da56-190b-40ad-a80c-ea378c433f7f"
type = "Scope"
}
}
required_resource_access {
resource_app_id = "00000003-0000-0000-c000-000000000000"
resource_access {
id = "e1fe6dd8-ba31-4d61-89e7-88639da4683d"
type = "Scope"
}
}
}
resource "random_password" "aadapp_secret" {
length = 32
special = false
}
resource "azuread_service_principal" "sp" {
application_id = azuread_application.aadapp.application_id
}
resource "azuread_service_principal_password" "sppw" {
service_principal_id = azuread_service_principal.sp.id
value = random_password.aadapp_secret.result
end_date = "2030-01-01T00:00:00Z"
}
resource "azurerm_resource_group" "rg" {
name = "rg"
location = var.region
}
resource "azurerm_role_assignment" "sprg" {
scope = azurerm_resource_group.rg.id
role_definition_name = "Owner"
principal_id = azuread_service_principal.sp.object_id
}
resource "azurerm_databricks_workspace" "dbks" {
name = "dbks"
resource_group_name = azurerm_resource_group.rg.name
managed_resource_group_name = "rgdbks"
location = var.region
sku = "standard"
}
provider "databricks" {
azure_auth = {
managed_resource_group = azurerm_databricks_workspace.dbks.managed_resource_group_name
azure_region = azurerm_databricks_workspace.dbks.location
workspace_name = azurerm_databricks_workspace.dbks.name
resource_group = azurerm_databricks_workspace.dbks.resource_group_name
client_id = azuread_application.aadapp.application_id
client_secret = random_password.aadapp_secret.result
tenant_id = data.azurerm_client_config.current.tenant_id
subscription_id = data.azurerm_client_config.current.subscription_id
}
}
resource "databricks_cluster" "cluster" {
spark_version = var.databricks_cluster_version
cluster_name = "cluster"
node_type_id = var.databricks_cluster_node_type
autotermination_minutes = 30
autoscale {
min_workers = 2
max_workers = 4
}
}
resource "databricks_azure_adls_gen2_mount" "mnt" {
cluster_id = databricks_cluster.cluster.cluster_id
container_name = "anything"
storage_account_name = "anything"
mount_name = "anything"
tenant_id = data.azurerm_client_config.current.tenant_id
client_id = azuread_application.aadapp.application_id
client_secret_scope = "anything"
client_secret_key = "anything"
}
Error: "cluster_id": required field is not set
You can see it in the state as well (I already deployed the cluster):
➜ terraform show terraform.tfstate | grep -A35 '# databricks_cluster.cluster'
# databricks_cluster.cluster:
resource "databricks_cluster" "cluster" {
autoscale = [
{
max_workers = 4
min_workers = 2
},
]
autotermination_minutes = 30
cluster_name = "cluster"
default_tags = {
"ClusterId" = "0430-125417-toed833"
"ClusterName" = "cluster"
"Creator" = "123456789-1234-1234-1234-52ef7bbab4af"
"Vendor" = "Databricks"
}
driver_node_type_id = "Standard_DS3_v2"
enable_elastic_disk = true
id = "0430-125417-toed833"
library_cran = []
library_egg = []
library_jar = []
library_maven = []
library_pypi = []
library_whl = []
node_type_id = "Standard_DS3_v2"
spark_version = "6.5.x-scala2.11"
state = "RUNNING"
}
According to the docs, cluster_id should be available; you can use both id and cluster_id. While id works as expected, cluster_id doesn't. I suggest either removing the attribute or filling it as expected.
terraform apply
Hi,
Great work on the provider. We've found a small bug we'd like to fix up and PR into the provider to make understanding a failure case easier.
@stuartleeks is looking at fixing this up by changing the behavior in the try-except block and adding an integration test to validate. Probably tackles #20 too as an added bonus 🎉
# tf -v
Terraform v0.12.16
+ provider.azuread v0.8.0
+ provider.azurerm v2.8.0
+ provider.databricks v0.1.0
+ provider.random v2.2.1
resource "databricks_azure_adls_gen2_mount" "mount" {
cluster_id = databricks_cluster.cluster.id
container_name = "dev" #todo: replace with env...
storage_account_name = azurerm_storage_account.account.name
directory = "/dir"
mount_name = "localdir"
tenant_id = data.azuread_client_config.current.tenant_id
client_id = azuread_application.datalake.application_id
client_secret_scope = databricks_secret_scope.terraform.name
client_secret_key = databricks_secret.client_secret.key
}
The error from dbutils.fs.mount should be returned by the provider. For example, if a secret is misconfigured or the client ID is wrong, the following should be returned:
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator$HttpException: AADToken: HTTP connection failed for getting token from AzureAD. Http response: 401 Unauthorized
If the mount operation fails, the exception details are swallowed by the provider and another exception is returned instead, as a result of dbutils.unmount failing.
For example if you misconfigure the Service Principal details inputted into the resource the following output is received.
The actual error occurring during the mount is:
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.oauth2.AzureADAuthenticator$HttpException: AADToken: HTTP connection failed for getting token from AzureAD. Http response: 401 Unauthorized
This detail is lost as the except block triggers and attempts to call dbutils.fs.unmount. As the mount operation failed, this call throws an exception. This is not caught, and the throw e line is never reached, so the original error is never re-thrown.
See example repro'ing this manually in a notebook:
terraform apply
Directory not mounted error instead of the Not authorized error
Terraform v0.12.24
+ provider.azurerm v1.44.0
+ provider.databricks v0.1.0
Please list the resources as a list, for example:
resource "databricks_notebook" "notebook" {
for_each = fileset("${path.module}/notebooks", "*")
content = filebase64("${path.module}/notebooks/${each.value}")
path = "/Shared/Notebooks/${each.value}"
overwrite = false
mkdirs = true
format = "SOURCE"
language = "SCALA"
}
The above loops over a number of files in a local dir and deploys them to Databricks using the databricks_notebook resource. When more than about 3-4 files are present, I'm seeing pretty regular 429 errors returned when running tf apply. Not sure if there's a race condition whereby the notebook is still being deleted when tf is trying to re-create it.
On first create, all is successful. On further runs of tf apply it sees each notebook as needing to be recreated (will log a separate issue for this), and when trying to recreate we usually hit a 429 error.
databricks_notebook.notebook["notebook1.scala"]: Destroying... [id=/Shared/Notebooks/notebook1.scala]
databricks_notebook.notebook["notebook3.scala"]: Destroying... [id=/Shared/Notebooks/notebook3.scala]
databricks_notebook.notebook["notebook2.scala"]: Destroying... [id=/Shared/Notebooks/notebook2.scala]
databricks_notebook.notebook["notebook3.scala"]: Destruction complete after 0s
databricks_notebook.notebook["notebook3.scala"]: Creating...
databricks_notebook.notebook["notebook2.scala"]: Destruction complete after 0s
databricks_notebook.notebook["notebook2.scala"]: Creating...
databricks_notebook.notebook["notebook3.scala"]: Creation complete after 1s [id=/Shared/Notebooks/notebook3.scala]
databricks_notebook.notebook["notebook2.scala"]: Creation complete after 1s [id=/Shared/Notebooks/notebook2.scala]
Error: Response from server (429)
That the notebooks get re-created.
429 errors
tf apply to loop the directory and create all the notebooks
tf apply again.