Secrets and Key Management
TL;DR: Using `dbutils.secrets` and Secret Scopes for secure credential management.
1. Secret Scopes: The Core Mechanism
A **Secret Scope** is the primary organizational unit for secrets in Databricks. It defines a namespace where secrets are stored. You access a secret using a combination of the scope name and the secret key.
Types of Secret Scopes
Databricks-Backed Scope
Secrets are encrypted and stored within the Databricks control plane. This is typically easier for quick setup or non-critical environments.
- **Creation:** Use the Databricks Secrets CLI or API (see the sketch after this list).
- **Security:** Encryption keys are managed by Databricks.
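As a minimal sketch of that creation step, the Databricks SDK for Python (`databricks-sdk`) wraps the Secrets API; the scope and key names below match the examples later in this section, and the secret value is a placeholder.
# Python Example (admin script, run outside the notebook)
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # authenticates from the environment or ~/.databrickscfg
w.secrets.create_scope(scope="analytics_scope")
w.secrets.put_secret(
    scope="analytics_scope",
    key="postgres_password",
    string_value="...",  # placeholder -- never commit a real value
)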
Azure Key Vault / AWS Secrets Manager-Backed Scope
Secrets are stored in an external cloud vault, and Databricks simply provides a reference (a pointer) to them. This is the **standard for production security and compliance.**
- **Creation:** Requires specifying the external vault's DNS name and resource ID (see the sketch after this list).
- **Security:** Encryption keys and access policies are fully controlled by the cloud provider/user.
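A similar sketch for the Azure-backed case, assuming the `databricks-sdk` Python package (the class and field names mirror the Create Secret Scope REST API; the vault identifiers are placeholders):
# Python Example (admin script; vault identifiers are placeholders)
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import (
    AzureKeyVaultSecretScopeMetadata,
    ScopeBackendType,
)

w = WorkspaceClient()
w.secrets.create_scope(
    scope="akv_scope",
    scope_backend_type=ScopeBackendType.AZURE_KEYVAULT,
    backend_azure_keyvault=AzureKeyVaultSecretScopeMetadata(
        resource_id="/subscriptions/.../vaults/my-vault",  # placeholder resource ID
        dns_name="https://my-vault.vault.azure.net/",      # placeholder DNS name
    ),
)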
Access Control (ACLs)
Access to a Secret Scope is managed via ACLs (Access Control Lists), granting users or groups READ, WRITE, or MANAGE permissions.
- **READ:** Required to retrieve secrets via `dbutils.secrets.get()`.
- **WRITE:** Required to add or modify secrets in a Databricks-backed scope.
- **MANAGE:** Required to change ACLs on the scope.
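Granting a permission can be scripted the same way; a minimal sketch with the Databricks SDK for Python, where `data-engineers` is a hypothetical group name:
# Python Example (admin script)
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import AclPermission

w = WorkspaceClient()
w.secrets.put_acl(
    scope="analytics_scope",
    principal="data-engineers",     # hypothetical group
    permission=AclPermission.READ,  # or WRITE / MANAGE
)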
2. `dbutils.secrets`: Accessing Credentials
From a Databricks notebook, secrets are accessed through the built-in `dbutils.secrets` utility.
Getting a Secret Value
You need the scope name and the secret key to retrieve the credential.
# Python Example
db_password = dbutils.secrets.get(scope="analytics_scope", key="postgres_password")
# Scala Example
val dbPassword = dbutils.secrets.get("analytics_scope", "postgres_password")
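A typical next step is to feed the retrieved value into a connection; a sketch of a JDBC read, where the host, database, table, and user are hypothetical:
# Python Example (host, database, table, and user are hypothetical)
jdbc_url = "jdbc:postgresql://db.example.com:5432/analytics"

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "public.orders")
    .option("user", "analytics_user")
    .option("password", db_password)  # fetched via dbutils.secrets.get above
    .load()
)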
Listing Available Secrets/Scopes
Use these methods for discovery and debugging.
# List all scopes you have access to
dbutils.secrets.listScopes()
# List all secrets within a specific scope
dbutils.secrets.list("analytics_scope")
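For example, a short loop can inventory every key you can see; these calls return metadata only, never the secret values:
# Python Example
for scope in dbutils.secrets.listScopes():
    for secret in dbutils.secrets.list(scope.name):
        print(f"{scope.name}/{secret.key}")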
**Security Feature:** When you display a secret (e.g., printing `db_password`), Databricks automatically redacts the notebook output, replacing the value with `[REDACTED]`.
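You can verify this directly in a notebook cell:
# Python Example
print(db_password)  # cell output shows [REDACTED], not the actual value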
3. Cloud-Backed Key Vaults
Integrating with cloud vaults is the best practice. Databricks assumes an identity (e.g., a service principal or IAM role) to securely fetch the secret from the vault on behalf of the user.
Azure Key Vault (AKV) Integration
- **Prerequisite:** A Service Principal (SPN) with **Get** and **List** permissions on the AKV Secrets.
- **Mapping:** When creating the scope, you provide the AKV's DNS Name and Resource ID.
- **Access:** A secret named `storage-key` in AKV is accessed in Databricks as `dbutils.secrets.get(scope="akv_scope", key="storage-key")`.
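As a sketch of putting that secret to work, the account-key pattern for ADLS Gen2 wires it into the Spark configuration; the storage account name is hypothetical:
# Python Example (storage account name is hypothetical)
storage_account = "mystorageacct"
spark.conf.set(
    f"fs.azure.account.key.{storage_account}.dfs.core.windows.net",
    dbutils.secrets.get(scope="akv_scope", key="storage-key"),
)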
AWS Secrets Manager Integration
- **Prerequisite:** An IAM role that Databricks can assume, with read access to the relevant Secrets Manager secrets.
- **Mapping:** You link the scope to an AWS Secrets Manager ARN.
- **Access:** Similar to Azure, the secret name in Secrets Manager is the key name in the Databricks scope.
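Assuming such a mapping, retrieval has the same shape as the Azure example; the scope and key names here are hypothetical:
# Python Example (scope and key names are hypothetical)
rds_password = dbutils.secrets.get(scope="aws_scope", key="rds-password")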
Never hardcode credentials. Always use cloud-backed Secret Scopes for production workloads.