ABFS
Azure BlobFS Data Connector Documentation
The Azure BlobFS (ABFS) Data Connector enables federated SQL queries on files stored in Azure Blob-compatible endpoints. This includes Azure BlobFS (`abfss://`) and Azure Data Lake (`adl://`) endpoints.
When a folder path is provided, all the contained files will be loaded.
File formats are specified using the `file_format` parameter.
from
Defines the ABFS-compatible URI to a folder or object:
from: abfs://<container>/<path>
with the account name configured using the `abfs_account` parameter, or
from: abfs://<container>@<account_name>.dfs.core.windows.net/<path>
name
Defines the dataset name, which is used as the table name within Spice.
Example:
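A minimal sketch of a dataset definition covering both URI forms. The top-level `datasets` list is assumed from the usual Spicepod layout rather than shown on this page, and the container, account, path, and dataset names are hypothetical placeholders.

```yaml
# Sketch only: my_container, my_account, the paths, and the dataset names are hypothetical.
datasets:
  # Form 1: account name supplied through the abfs_account parameter.
  - from: abfs://my_container/path/to/folder/
    name: folder_dataset
    params:
      abfs_account: my_account
      file_format: parquet   # set explicitly, since it may not be inferrable from a folder path

  # Form 2: account name embedded in the URI.
  - from: abfs://my_container@my_account.dfs.core.windows.net/path/to/data.csv
    name: csv_dataset
    params:
      file_format: csv
```

With this definition, `folder_dataset` and `csv_dataset` would be the table names used in SQL queries.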
params
| Parameter name | Description |
| --- | --- |
| `file_format` | Data format of the files: `parquet` or `csv` |
| `abfs_account` | Azure storage account name |
| `abfs_sas_string` | SAS (Shared Access Signature) token to use for authorization |
| `abfs_endpoint` | Storage endpoint, default: `https://{account}.blob.core.windows.net` |
| `abfs_use_emulator` | Use `true` or `false` to connect to a local emulator |
| `abfs_authority_host` | Alternative authority host, default: `https://login.microsoftonline.com` |
| `abfs_proxy_url` | Proxy URL |
| `abfs_proxy_ca_certificate` | CA certificate for the proxy |
| `abfs_proxy_excludes` | A list of hosts to exclude from proxy connections |
| `abfs_disable_tagging` | Disable tagging objects. Use this if your backing store doesn't support tags |
| `allow_http` | Allow insecure HTTP connections |
| `hive_partitioning_enabled` | Enable hive-style partitioning based on the folder structure. Defaults to `false` |
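As an illustration of the general parameters, the sketch below points a dataset at a local storage emulator and enables hive-style partitioning. All names and values are hypothetical, the `datasets` wrapper is assumed from the usual Spicepod layout, and whether an emulator also needs `allow_http` depends on how it is served.

```yaml
# Sketch only: container, account, and dataset names are hypothetical.
datasets:
  # Hypothetical folder layout: events/year=2024/month=01/part-0.parquet
  - from: abfs://my_container/events/
    name: events
    params:
      abfs_account: my_account
      abfs_use_emulator: "true"          # connect to a local emulator instead of Azure
      allow_http: "true"                 # assumption: the emulator is served over plain HTTP
      hive_partitioning_enabled: "true"  # derive partition columns from the folder structure
      file_format: parquet
```

The boolean values are quoted as strings here as a conservative choice; unquoted booleans may also be accepted.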
The following parameters are used when authenticating with Azure. Only one of these parameters can be used at a time:
abfs_access_key
abfs_bearer_token
abfs_client_secret
abfs_skip_signature
| Parameter name | Description |
| --- | --- |
| `abfs_access_key` | Secret access key |
| `abfs_bearer_token` | `BEARER` access token for user authentication |
| `abfs_client_id` | Client ID for client authentication flow |
| `abfs_client_secret` | Client secret to use for client authentication flow |
| `abfs_tenant_id` | Tenant ID to use for client authentication flow |
| `abfs_skip_signature` | Skip credentials and request signing for public containers |
| `abfs_msi_endpoint` | Endpoint for managed identity tokens |
| `abfs_federated_token_file` | File path for federated identity token in Kubernetes |
| `abfs_use_cli` | Set to `true` to use the Azure CLI to acquire access tokens |
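Because only one of the credential parameters can be used at a time, each dataset in the sketch below uses a single method. The names are hypothetical, and the `${secrets:...}` placeholder syntax for keeping the key out of the file is assumed from Spice's secret handling rather than taken from this page.

```yaml
# Sketch only: containers, account, dataset names, and the secret name are hypothetical.
datasets:
  # Access key authentication.
  - from: abfs://private_container/reports/
    name: reports
    params:
      abfs_account: my_account
      abfs_access_key: ${secrets:ABFS_ACCESS_KEY}
      file_format: parquet

  # Public container: skip credentials and request signing.
  - from: abfs://public_container/open_data/
    name: open_data
    params:
      abfs_account: my_account
      abfs_skip_signature: "true"
      file_format: csv
```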
| Parameter name | Description |
| --- | --- |
| `abfs_max_retries` | Maximum retries |
| `abfs_retry_timeout` | Total timeout for retries (e.g., `5s`, `1m`) |
| `abfs_backoff_initial_duration` | Initial retry delay (e.g., `5s`) |
| `abfs_backoff_max_duration` | Maximum retry delay (e.g., `1m`) |
| `abfs_backoff_base` | Exponential backoff base (e.g., `0.1`) |
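The retry parameters can be combined as in the sketch below, which reuses the example values given for each parameter. The dataset itself and the retry count of 3 are hypothetical, and the `datasets` wrapper is assumed from the usual Spicepod layout.

```yaml
# Sketch only: container, account, dataset name, and the retry count are hypothetical.
datasets:
  - from: abfs://my_container/path/to/folder/
    name: my_dataset
    params:
      abfs_account: my_account
      file_format: parquet
      abfs_max_retries: "3"              # give up after three retries
      abfs_retry_timeout: 1m             # total time budget across all retries
      abfs_backoff_initial_duration: 5s  # delay before the first retry
      abfs_backoff_max_duration: 1m      # upper bound on any single delay
      abfs_backoff_base: "0.1"           # exponential backoff base
```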
Authentication

The ABFS connector supports three types of authentication, as detailed in the authentication parameters above. If none of `abfs_access_key`, `abfs_bearer_token`, `abfs_client_secret`, or `abfs_skip_signature` is set, the connector defaults to using a managed identity.

`abfs_bearer_token` takes a `BEARER` access token for user authentication. The token can be obtained from the OAuth2 flow.

Service principal authentication

Configure service principal authentication by setting the `abfs_client_secret` parameter.
1. Create a new Azure AD application in the Azure portal and generate a client secret under `Certificates & secrets`.
2. Grant the Azure AD application read access to the storage account under `Access Control (IAM)`. This can typically be done using the `Storage Blob Data Reader` built-in role.

Access key authentication

Configure access key authentication by setting the `abfs_access_key` parameter to the storage account access key.

File format

Specify the file format using the `file_format` parameter (options: `parquet`, `csv`). It is required if the format cannot be inferred from `from`.
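A sketch of service principal authentication in a dataset definition, for use after the Azure AD application has been created and granted read access. The names are hypothetical, the `datasets` wrapper and the `${secrets:...}` placeholder syntax are assumptions not shown on this page, and pairing `abfs_client_secret` with `abfs_client_id` and `abfs_tenant_id` follows the parameter descriptions for the client authentication flow.

```yaml
# Sketch only: container, account, dataset name, and secret names are hypothetical.
datasets:
  - from: abfs://my_container/path/to/folder/
    name: my_dataset
    params:
      abfs_account: my_account
      abfs_tenant_id: ${secrets:AZURE_TENANT_ID}          # tenant that owns the application
      abfs_client_id: ${secrets:AZURE_CLIENT_ID}          # application (client) ID
      abfs_client_secret: ${secrets:AZURE_CLIENT_SECRET}  # secret from Certificates & secrets
      file_format: parquet
```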