Dremio

Dremio Data Connector Documentation

Dremio is a data lake engine that enables high-performance SQL queries directly on data lake storage. It provides a unified interface for querying and analyzing data from various sources without the need for complex data movement or transformation.

This connector enables using Dremio as a data source for federated SQL queries.

- from: dremio:datasets.dremio_dataset
  name: dremio_dataset
  params:
    dremio_endpoint: grpc://127.0.0.1:32010
    dremio_username: demo
    dremio_password: ${secrets:my_dremio_pass}

Configuration

from

The from field takes the form dremio:dataset where dataset is the fully qualified name of the dataset to read from.
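To make the shape of the from value concrete, here is a minimal Python sketch that splits it into the connector prefix and the dataset path. The helper name parse_from is hypothetical and is not part of Spice; it only illustrates the dremio:dataset form described above.

```python
def parse_from(value: str) -> tuple[str, str]:
    """Split a `from` value such as 'dremio:datasets.dremio_dataset'.

    Returns (connector, dataset). Hypothetical helper for illustration;
    it does not reflect how Spice parses the field internally.
    """
    # Split on the first colon only, so the dataset path is kept whole.
    connector, sep, dataset = value.partition(":")
    if not sep or not connector or not dataset:
        raise ValueError(f"expected '<connector>:<dataset>', got {value!r}")
    return connector, dataset


connector, dataset = parse_from("dremio:datasets.dremio_dataset")
print(connector)  # dremio
print(dataset)    # datasets.dremio_dataset
```

The dataset path is left unsplit on purpose: the sketch makes no assumption about how dots inside quoted identifiers should be handled.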

name

The dataset name. This will be used as the table name within Spice.

Example:

datasets:
  - from: dremio:datasets.dremio_dataset
    name: cool_dataset
    params: ...

SELECT COUNT(*) FROM cool_dataset;
+----------+
| count(*) |
+----------+
| 6001215  |
+----------+

params

| Parameter Name | Description |
| --- | --- |
| dremio_endpoint | The endpoint used to connect to the Dremio server. |
| dremio_username | The username used to connect to the Dremio endpoint. |
| dremio_password | The password used to connect to the Dremio endpoint. Use the secret replacement syntax to load the password from a secret store, e.g. ${secrets:my_dremio_pass}. |
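The secret replacement syntax can be pictured as a simple placeholder substitution. The Python sketch below is only an illustration of that idea, not Spice's implementation: the function resolve_secrets, the in-memory store, and the set of characters allowed in a secret name are all assumptions.

```python
import re

# Matches placeholders of the form ${secrets:name}. The set of characters
# allowed in a secret name is an assumption made for this sketch.
SECRET_PATTERN = re.compile(r"\$\{secrets:([A-Za-z0-9_]+)\}")


def resolve_secrets(value: str, store: dict[str, str]) -> str:
    """Replace each ${secrets:name} placeholder in value with store[name]."""

    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in store:
            raise KeyError(f"secret {name!r} not found in store")
        return store[name]

    return SECRET_PATTERN.sub(lookup, value)


# Hypothetical in-memory secret store, for illustration only.
store = {"my_dremio_pass": "example-password"}

params = {
    "dremio_endpoint": "grpc://127.0.0.1:32010",
    "dremio_username": "demo",
    "dremio_password": "${secrets:my_dremio_pass}",
}

resolved = {key: resolve_secrets(value, store) for key, value in params.items()}
print(resolved["dremio_password"])  # example-password
```

Values without a placeholder, such as the endpoint and username, pass through unchanged, so the password never needs to appear in the configuration file itself.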

Examples

Connecting to a gRPC endpoint

- from: dremio:datasets.dremio_dataset
  name: dremio_dataset
  params:
    dremio_endpoint: grpc://127.0.0.1:32010
    dremio_username: demo
    dremio_password: ${secrets:my_dremio_pass}

Types

The table below lists the supported Dremio data types and the Apache Arrow types they map to in Spice.

| Dremio Type | Arrow Type |
| --- | --- |
| INT | Int32 |
| BIGINT | Int64 |
| FLOAT | Float32 |
| DOUBLE | Float64 |
| DECIMAL | Decimal128 |
| VARCHAR | Utf8 |
| VARBINARY | Binary |
| BOOL | Boolean |
| DATE | Date64 |
| TIME | Time32 |
| TIMESTAMP | Timestamp(Millisecond, None) |
| INTERVAL | Interval |
| LIST | List |
| STRUCT | Struct |
| MAP | Map |
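For checking a schema against this mapping, the same pairs can be kept as a lookup table. The following Python sketch uses plain strings for the Arrow type names; the helper expected_arrow_types and the column names in the usage example are hypothetical, and stripping type parameters such as DECIMAL(10, 2) before the lookup is an assumption of this sketch.

```python
# Dremio type name -> Arrow type name, copied from the mapping above.
DREMIO_TO_ARROW = {
    "INT": "Int32",
    "BIGINT": "Int64",
    "FLOAT": "Float32",
    "DOUBLE": "Float64",
    "DECIMAL": "Decimal128",
    "VARCHAR": "Utf8",
    "VARBINARY": "Binary",
    "BOOL": "Boolean",
    "DATE": "Date64",
    "TIME": "Time32",
    "TIMESTAMP": "Timestamp(Millisecond, None)",
    "INTERVAL": "Interval",
    "LIST": "List",
    "STRUCT": "Struct",
    "MAP": "Map",
}


def expected_arrow_types(columns: dict[str, str]) -> dict[str, str]:
    """Map each column's Dremio type to the Arrow type name listed above."""
    result = {}
    for name, dremio_type in columns.items():
        # Drop any parameters, e.g. 'DECIMAL(10, 2)' -> 'DECIMAL'.
        key = dremio_type.split("(")[0].strip().upper()
        if key not in DREMIO_TO_ARROW:
            raise KeyError(f"unsupported Dremio type for column {name!r}: {dremio_type}")
        result[name] = DREMIO_TO_ARROW[key]
    return result


# Hypothetical column definitions, for illustration only.
print(expected_arrow_types({"id": "BIGINT", "price": "DECIMAL(10, 2)"}))
# {'id': 'Int64', 'price': 'Decimal128'}
```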

Limitations
