Use secret scopes in Databricks to protect your sensitive credentials
Disclaimer: the following steps are written primarily for Windows users. On other operating systems such as Linux or macOS, you can follow them as a guideline.
Secrets
In Databricks, secrets are used to guard sensitive credentials against unauthorized access. These can include database credentials, SSH keys, API keys, passwords, and private certificates, among others. All of these can be stored as Databricks secrets.
Secret scope
Think of a secret scope as a safety box for storing secrets. It's used for organising your secrets and managing access to them based on the permissions configured by the scope admin.
In other words, a secret scope is a centralized storage unit for storing access tokens, API keys and other security-related information in Databricks.
Steps
1. Check if Databricks CLI is installed in your terminal
- Open the Run dialog by pressing Windows key + R on your keyboard
- Type cmd in the Run dialog box, then press Enter
- Type the following to check if you have the Databricks CLI installed:
databricks --version
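If you get an error message along the lines of “'databricks' is not recognized as an internal or external command”, the CLI is probably not installed on your machine. Use the following command to install it (this installs the legacy, Python-based Databricks CLI, whose command syntax is used throughout this guide):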
pip install databricks-cli
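The same check can be scripted. The following is a minimal Python sketch, not part of the original walkthrough: it looks for a databricks executable on your PATH and, if one is found, asks it for its version. The function names find_cli and cli_version are illustrative assumptions.

```python
import shutil
import subprocess
from typing import Optional


def find_cli(executable: str = "databricks") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


def cli_version(executable: str = "databricks") -> Optional[str]:
    """Run `<executable> --version`; return its output, or None if not installed."""
    path = find_cli(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some tools print the version to stderr, so fall back to that stream.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = cli_version()
    if version is None:
        print("Databricks CLI not found on PATH; try: pip install databricks-cli")
    else:
        print(f"Databricks CLI found: {version}")
```

If the script reports that the CLI is missing right after installation, the Python Scripts folder may not be on your PATH yet; opening a new terminal is usually worth trying first.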
2. Configure Databricks CLI
To configure the Databricks CLI, you need the following:
Databricks workspace URL
Personal access token
Get your Databricks workspace URL
Navigate to the homepage of your Databricks workspace and copy the URL from your browser's address bar. This is your Databricks workspace URL, which has the format "xxxxxx.azuredatabricks.net/?o=xxxxxxxxxxxxx..".
Get your personal access token
- Enter your Databricks workspace
- Click on your email on the top right, then select User Settings from the drop-down menu
- Under the Access Tokens tab, click on Generate new token and enter a brief reason for creating the token. This will generate the token.
- Keep this window open until the next steps are completed, because the token value is only displayed once
Assuming you're still in the terminal and the Databricks CLI is installed:
- Type the following command into the terminal:
databricks configure --token
- When prompted, enter the Databricks Host (this is the Databricks workspace URL in the format "xxxxxx.azuredatabricks.net/?o=xxxxxxxxxxxxx.."; the prompt expects it to begin with https://)
- When prompted, paste the personal access token generated in the previous step
Workaround for entering personal access token
Sometimes the terminal will not let you paste the token directly when you're prompted for it.
The alternative route is to manually enter the token into the Databricks config file. Here's how:
- Open File Explorer, go to the C:\Users\[your-username] folder, then search for the .databrickscfg file
- If you can't find the file, create a new one with the same name
- Open the .databrickscfg file in an IDE or a text editor such as Notepad
- Copy and paste the personal access token generated in the previous steps
Your .databrickscfg file should contain a [DEFAULT] section with a host entry (your workspace URL) and a token entry (your personal access token).
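As a rough illustration of that file layout, here is a small Python sketch that renders a config with a single [DEFAULT] profile containing host and token entries, which is the layout the legacy CLI is understood to read. The helper name render_databrickscfg and the placeholder values are assumptions for illustration only; substitute your own workspace URL and token.

```python
import configparser
import io


def render_databrickscfg(host: str, token: str) -> str:
    """Return the text of a config file with a single DEFAULT profile."""
    # interpolation=None keeps any '%' characters in the values literal.
    config = configparser.ConfigParser(interpolation=None)
    config["DEFAULT"] = {"host": host, "token": token}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values only; these are not real credentials.
    print(render_databrickscfg(
        host="https://xxxxxx.azuredatabricks.net",
        token="<your-personal-access-token>",
    ))
```

Printing the result rather than writing it to disk is deliberate: you can compare it with your existing file before changing anything.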
3. Create secret scope
Assuming you're still in the terminal and the Databricks CLI is installed:
- Create a secret scope with this command:
databricks secrets create-scope --scope [scope-name]
Replace [scope-name] with a name of your choice.
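If you would rather script this step, the sketch below builds the scope commands used in this section as argument lists and runs them with Python's subprocess module. It assumes the legacy CLI syntax shown in this guide; scope_command, run, and the scope name my-scope are hypothetical names chosen for the example.

```python
import subprocess
from typing import List


def scope_command(action: str, scope: str = "") -> List[str]:
    """Build the argument list for a scope-level `databricks secrets` command."""
    if action == "list-scopes":
        return ["databricks", "secrets", "list-scopes"]
    if action == "create-scope":
        if not scope:
            raise ValueError("a scope name is required")
        return ["databricks", "secrets", "create-scope", "--scope", scope]
    raise ValueError(f"unsupported action: {action}")


def run(command: List[str]) -> int:
    """Run the command and return its exit code (0 means success)."""
    return subprocess.run(command).returncode


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    print(" ".join(scope_command("create-scope", "my-scope")))
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems if a scope name contains unusual characters.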
You can check if the scope was created successfully using:
databricks secrets list-scopes
If the scope was created successfully, its name appears in the list.
4. Create secret
Assuming you've just created the secret scope from the previous step:
- Use this command to create a new secret:
databricks secrets put --scope [scope-name] --key [key-name]
Replace [scope-name] with the name of the scope you created in the previous step, and replace [key-name] with the name you want to give your secret.
- A text editor should open. Enter the value of your secret above the marked line in the file, then save and exit.
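For automation, where opening an editor is impractical, the legacy CLI also accepts the value inline through a --string-value option. The sketch below is one possible approach, not the method described above: it reads the value from an environment variable so it never appears in source code. The helper names and the DB_PASSWORD variable are assumptions, and note that a value passed on the command line may be visible to other processes on the same machine.

```python
import os
from typing import List


def secret_from_env(variable: str) -> str:
    """Read the secret value from an environment variable instead of source code."""
    value = os.environ.get(variable, "")
    if not value:
        raise KeyError(f"environment variable {variable} is not set")
    return value


def put_secret_command(scope: str, key: str, value: str) -> List[str]:
    """Build a `databricks secrets put` command that passes the value inline."""
    if not value:
        raise ValueError("refusing to store an empty secret value")
    return [
        "databricks", "secrets", "put",
        "--scope", scope,
        "--key", key,
        "--string-value", value,
    ]


# Example usage (requires the CLI to be installed and configured):
#   import subprocess
#   command = put_secret_command("my-scope", "db-password", secret_from_env("DB_PASSWORD"))
#   subprocess.run(command, check=True)
```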
You can verify whether the secret was created successfully by using the following command:
databricks secrets list --scope [scope-name]
If the secret was created successfully, its key name appears in the output. The secret value itself is not displayed.
Delete secret scope (optional)
You can also delete secret scopes by using the following command:
databricks secrets delete-scope --scope [scope-name]
Delete secrets (optional)
To delete secrets, here’s the terminal command:
databricks secrets delete --scope [scope-name] --key [key-name]
Conclusion
Avoid the temptation to hard-code your passwords and access tokens into your notebooks or into the advanced settings of your Databricks clusters. Use secrets and secret scopes instead.
This method reduces the risk of exposing security credentials to unauthorized users and also gives you a scalable way of managing sensitive information across your enterprise.
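To make the contrast with hard-coding concrete, here is a minimal sketch of reading a stored secret from a notebook through the Databricks utilities call dbutils.secrets.get. The wrapper read_secret and the names my-scope and db-password are illustrative assumptions; dbutils itself is only predefined inside a Databricks notebook, which is why the sketch takes it as a parameter.

```python
def read_secret(dbutils, scope: str, key: str) -> str:
    """Fetch a secret value through the notebook's `dbutils` object."""
    return dbutils.secrets.get(scope=scope, key=key)


# In a Databricks notebook, where `dbutils` is already defined:
#   password = read_secret(dbutils, "my-scope", "db-password")
# The value can then be passed to a database connector instead of a literal string.
```

Databricks redacts secret values that are printed in notebook output, so treat the returned string as something to pass along to a connector rather than something to display.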
Feel free to reach out via my handles: LinkedIn | Email | Twitter