ABFS
Azure BlobFS Data Connector Documentation
The Azure BlobFS (ABFS) Data Connector enables federated SQL queries on files stored in Azure Blob-compatible endpoints. This includes Azure BlobFS (abfss://) and Azure Data Lake (adl://) endpoints.
When a folder path is provided, all the contained files will be loaded.
File formats are specified using the file_format parameter, as described in Object Store File Formats.
Configuration
from
Defines the ABFS-compatible URI to a folder or object:
from: abfs://<container>/<path> with the account name configured using the abfs_account parameter, or
from: abfs://<container>@<account_name>.dfs.core.windows.net/<path>
name
Defines the dataset name, which is used as the table name within Spice.
Example:
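As a minimal sketch, a dataset entry might look like the following. The container, account, path, and dataset names are placeholders, and the top-level datasets list is an assumed layout built from the from, name, and params fields described here. No authentication parameter is set, so this sketch relies on the managed identity default noted under the authentication parameters.

```yaml
datasets:
  # Hypothetical container, account, and path; replace with real values.
  - from: abfs://my_container@myaccount.dfs.core.windows.net/data/trips.parquet
    name: trips # used as the table name within Spice
    params:
      file_format: parquet
```

With a definition like this, the file would be queried in SQL under the table name trips.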
params
Basic parameters
file_format
abfs_account
Azure storage account name
abfs_sas_string
SAS (Shared Access Signature) Token to use for authorization
abfs_endpoint
Storage endpoint, default: https://{account}.blob.core.windows.net
abfs_use_emulator
Use true or false to connect to a local emulator
abfs_authority_host
Alternative authority host, default: https://login.microsoftonline.com
abfs_proxy_url
Proxy URL
abfs_proxy_ca_certificate
CA certificate for the proxy
abfs_proxy_excludes
A list of hosts to exclude from proxy connections
abfs_disable_tagging
Disable tagging objects. Use this if your backing store doesn't support tags
allow_http
Allow insecure HTTP connections
hive_partitioning_enabled
Enable hive-style partitioning based on the folder structure. Defaults to false
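To illustrate how the basic parameters combine, the sketch below points a dataset at a folder rather than a single file and turns on hive-style partitioning. The container, account, and folder names are hypothetical, and the key=value folder layout in the comment is an assumed example of a hive-style structure.

```yaml
datasets:
  # Hypothetical folder laid out as events/year=2024/month=01/part-0.parquet
  - from: abfs://my_container/events/
    name: events
    params:
      abfs_account: myaccount # placeholder storage account name
      file_format: parquet
      hive_partitioning_enabled: true # defaults to false when omitted
```

Because a folder path is given, all files it contains are loaded into the one dataset.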
Authentication parameters
The following parameters are used when authenticating with Azure. Only one of these parameters can be used at a time:
abfs_access_key
abfs_bearer_token
abfs_client_secret
abfs_skip_signature
If none of these are set, the connector defaults to using a managed identity.
abfs_access_key
Secret access key
abfs_bearer_token
abfs_client_id
Client ID for client authentication flow
abfs_client_secret
Client Secret to use for client authentication flow
abfs_tenant_id
Tenant ID to use for client authentication flow
abfs_skip_signature
Skip credentials and request signing for public containers
abfs_msi_endpoint
Endpoint for managed identity tokens
abfs_federated_token_file
File path for federated identity token in Kubernetes
abfs_use_cli
Set to true to use the Azure CLI to acquire access tokens
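For local development, Azure CLI based authentication can be sketched as follows. This assumes the Azure CLI is installed and already signed in on the machine running Spice; the container, account, and file names are placeholders.

```yaml
datasets:
  - from: abfs://my_container/reports/summary.csv # hypothetical path
    name: summary
    params:
      abfs_account: myaccount # placeholder storage account name
      abfs_use_cli: true # acquire access tokens through the Azure CLI
      file_format: csv
```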
Retry parameters
abfs_max_retries
Maximum retries
abfs_retry_timeout
Total timeout for retries (e.g., 5s, 1m)
abfs_backoff_initial_duration
Initial retry delay (e.g., 5s)
abfs_backoff_max_duration
Maximum retry delay (e.g., 1m)
abfs_backoff_base
Exponential backoff base (e.g., 0.1)
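The retry parameters are set alongside the other params. In the sketch below, the durations and backoff base reuse the example values listed above, while the retry count of 3 and the dataset names are assumptions chosen only for illustration, not recommended settings.

```yaml
datasets:
  - from: abfs://my_container/logs/ # hypothetical folder
    name: logs
    params:
      abfs_account: myaccount # placeholder storage account name
      file_format: parquet
      abfs_max_retries: 3 # assumed value for illustration
      abfs_retry_timeout: 1m # total time budget across all retries
      abfs_backoff_initial_duration: 5s # delay before the first retry
      abfs_backoff_max_duration: 1m # upper bound on any single delay
      abfs_backoff_base: 0.1 # example value from the parameter list
```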
Authentication
The ABFS connector supports three types of authentication, as detailed in the authentication parameters.
Service principal authentication
Configure service principal authentication by setting the abfs_client_secret parameter.
Create a new Azure AD application in the Azure portal and generate a client secret under Certificates & secrets.
Grant the Azure AD application read access to the storage account under Access Control (IAM); this can typically be done using the Storage Blob Data Reader built-in role.
Access key authentication
Configure access key authentication by setting the abfs_access_key parameter to the Azure Storage account access key.
Supported file formats
Specify the file format using the file_format parameter. More details are available in Object Store File Formats.
Examples
Reading a CSV file with an Access Key
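A minimal sketch of this case, assuming the access key is kept in a secret store. The container, account, file, and secret names are placeholders, and the ${ secrets:... } reference syntax is an assumption about how secrets are injected.

```yaml
datasets:
  - from: abfs://my_container/taxi_sample.csv # hypothetical container and file
    name: taxi_sample
    params:
      abfs_account: myaccount # placeholder storage account name
      abfs_access_key: ${ secrets:ACCESS_KEY } # hypothetical secret name
      file_format: csv
```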
Using Public Containers
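For a publicly readable container, credentials and request signing can be skipped. The sketch below assumes a hypothetical public container and account name.

```yaml
datasets:
  - from: abfs://public_container/taxi_sample.csv # hypothetical public container
    name: public_data
    params:
      abfs_account: myaccount # placeholder storage account name
      abfs_skip_signature: true # no credentials or request signing
      file_format: csv
```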
Connecting to the Storage Emulator
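A sketch for local testing against a storage emulator such as Azurite. The container and file names are hypothetical, and depending on how the emulator is set up, other parameters (for example allow_http) may also be needed.

```yaml
datasets:
  - from: abfs://test_container/test_data.csv # hypothetical emulator container
    name: test_data
    params:
      abfs_use_emulator: true # connect to a local emulator instead of Azure
      file_format: csv
```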
Using secrets for Account name
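The account name can itself come from a secret, which keeps environment-specific values out of the dataset definition. The secret name and the ${ secrets:... } syntax below are assumptions for illustration; no authentication parameter is set, so this sketch relies on the managed identity default.

```yaml
datasets:
  - from: abfs://my_container/my_data.csv # hypothetical container and file
    name: prod_data
    params:
      abfs_account: ${ secrets:PROD_ACCOUNT } # hypothetical secret name
      file_format: csv
```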
Authenticating using Client Authentication
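A sketch of the client (service principal) flow, combining the tenant ID, client ID, and client secret parameters. All secret names, the account name, and the dataset path are placeholders, and the ${ secrets:... } syntax is assumed.

```yaml
datasets:
  - from: abfs://my_container/input.parquet # hypothetical container and file
    name: my_data
    params:
      abfs_account: myaccount # placeholder storage account name
      abfs_tenant_id: ${ secrets:MY_TENANT_ID } # hypothetical secret names
      abfs_client_id: ${ secrets:MY_CLIENT_ID }
      abfs_client_secret: ${ secrets:MY_CLIENT_SECRET }
      file_format: parquet
```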