LogoLogo
BlogTwitterDiscordTelegramSignup/Login
  • Getting Started
    • Welcome to Spice.ai Cloud
    • Getting Started
      • Sign in with GitHub
      • Create a Spice app
      • Add a Dataset and query data
      • Add AI Model and chat with your data
      • Next Steps
    • FAQ
  • Features
    • Federated SQL Query
    • Data Acceleration
      • In-Memory Arrow Data Accelerator
      • DuckDB Data Accelerator
      • PostgreSQL Data Accelerator
      • SQLite Data Accelerator
    • Search & Retrieval
    • AI Gateway
    • Semantic Models
    • ML Models
    • Observability
      • Task History
      • Zipkin
  • Building Blocks
    • Data Connectors
      • ABFS
      • ClickHouse
      • Databricks
      • Delta Lake
      • Dremio
      • DuckDB
      • DynamoDB
      • FlightSQL
      • FTP
      • GitHub
      • GraphQL
      • HTTPS
      • LocalPod
      • Memory
      • MSSQL
      • MySQL
      • ODBC
      • Postgres
      • S3
      • SharePoint
      • Snowflake
      • Spark
      • SpiceAI
    • Model Providers
      • Anthropic
      • Azure
      • Hugging Face
      • OpenAI
      • Perplexity
      • SpiceAI
      • XAI
  • API
    • SQL Query API
      • HTTP API
      • Apache Arrow Flight API
    • OpenAI API
    • Health API
  • Portal
    • Playground
      • SQL Query
      • AI Chat
    • Organizations
    • Apps
      • API keys
      • Secrets
      • Connect GitHub
      • Transfer
    • Public Apps
    • App Spicepod
      • Spicepod Configuration
      • Deployments
      • Spice Runtime Versions
    • Monitoring
    • Profile
      • Personal Access Tokens
  • Use-Cases
    • Agentic AI Apps
    • Database CDN
    • Data Lakehouse
    • Enterprise Search
    • Enterprise RAG
  • SDKs
    • Python SDK
      • Streaming
    • Node.js SDK
      • Streaming
      • API Reference
    • Go SDK
    • Rust SDK
    • Dotnet SDK
    • Java SDK
  • Integrations
    • GitHub Copilot
    • Grafana
  • REFERENCE
    • Core Concepts
      • Duration Literals
    • SQL Reference
      • Data Types
      • SQL Functions
        • Aggregate
          • APPROX_COUNT_DISTINCT
          • AVG
          • BIT_AND
          • BIT_OR
          • CORR
          • COUNT
          • COVAR_POP
          • COVAR_SAMP
          • HLL
          • LISTAGG
          • MAX
          • MIN
          • NDV
          • STDDEV
          • STDDEV_POP
          • STDDEV_SAMP
          • SUM
          • VAR_POP
          • VAR_SAMP
        • Binary
          • BASE64
          • BIT_LENGTH
          • FROM_HEX
          • HEX
          • TO_HEX
          • UNBASE64
          • UNHEX
        • Bitwise
          • BIT_AND
          • BIT_OR
          • LSHIFT
          • RSHIFT
          • XOR
        • Boolean
          • IS [NOT] DISTINCT FROM
          • ISFALSE
          • IS [NOT] NULL
          • ISNUMERIC
          • ISTRUE
          • IS_MEMBER
        • Conditional
          • BOOL_AND
          • BOOL_OR
          • CASE
          • COALESCE
          • GREATEST
          • LEAST
          • NULLIF
        • Conversion
          • BINARY_STRING
          • CAST
          • CONVERT_FROM
          • CONVERT_REPLACEUTF8
          • CONVERT_TIMEZONE
          • CONVERT_TO
          • FLATTEN
          • FROM_HEX
          • HASH
          • HEX
          • TOASCII
          • TO_CHAR
          • TO_DATE
          • TO_HEX
          • TO_NUMBER
          • TO_TIME
          • TO_TIMESTAMP
          • UNHEX
        • Cryptography
          • AES_DECRYPT
          • AES_ENCRYPT
          • MD5
          • SHA
          • SHA1
          • SHA256
          • SHA512
        • Data Generation
          • RANDOM
        • Datatype
          • IS_BIGINT
          • IS_DATE
          • IS_INT
          • IS_VARCHAR
          • SIZE
          • TYPEOF
        • Date/Time
          • CONVERT_TIMEZONE
          • CURRENT_DATE
          • CURRENT_DATE_UTC
          • CURRENT_TIME
          • CURRENT_TIMESTAMP
          • DATEDIFF
          • DATE_ADD
          • DATE_DIFF
          • DATE_PART
          • DATE_SUB
          • DATE_TRUNC
          • DAY
          • DAYOFMONTH
          • DAYOFWEEK
          • DAYOFYEAR
          • EXTRACT
          • HOUR
          • LAST_DAY
          • MINUTE
          • MONTH
          • MONTHS_BETWEEN
          • NEXT_DAY
          • QUARTER
          • SECOND
          • TIMESTAMPADD
          • TIMESTAMPDIFF
          • TO_DATE
          • TO_TIME
          • TO_TIMESTAMP
          • UNIX_TIMESTAMP
          • WEEK
          • WEEKOFYEAR
          • YEAR
        • Math
          • ABS
          • ACOS
          • ASIN
          • ATAN
          • CBRT
          • CEILING
          • COS
          • COSH
          • COT
          • DEGREES
          • E
          • EXP
          • FLOOR
          • LOG
          • LOG10
          • MOD
          • PI
          • POWER
          • RADIANS
          • ROUND
          • SIGN
          • SIN
          • SINH
          • SQRT
          • STDDEV
          • STDDEV_POP
          • STDDEV_SAMP
          • TAN
          • TANH
          • TRUNCATE
        • Percentile
          • MEDIAN
          • PERCENTILE_CONT
          • PERCENTILE_DISC
        • Regular Expressions
          • REGEXP_EXTRACT
          • REGEXP_LIKE
          • REGEXP_MATCHES
          • REGEXP_REPLACE
          • REGEXP_SPLIT
        • Semistructured Data
          • ARRAY_CONTAINS
          • MAP_KEYS
          • MAP_VALUES
        • String
          • ASCII
          • BASE64
          • BTRIM
          • CHARACTER_LENGTH
          • CHAR_LENGTH
          • CHR
          • COL_LIKE
          • CONCAT
          • CONCAT_WS
          • ENDS_WITH
          • FROM_HEX
          • HEX
          • ILIKE
          • INITCAP
          • INSTR
          • IS_UTF8
          • LCASE
          • LEFT
          • LENGTH
          • LEVENSHTEIN
          • LIKE
          • LOCATE
          • LOWER
          • LPAD
          • LTRIM
          • MASK
          • MASK_FIRST_N
          • MASK_HASH
          • MASK_LAST_N
          • MASK_SHOW_FIRST_N
          • MASK_SHOW_LAST_N
          • OCTET_LENGTH
          • POSITION
          • QUOTE
          • REGEXP_EXTRACT
          • REGEXP_LIKE
          • REGEXP_MATCHES
          • REGEXP_REPLACE
          • REGEXP_SPLIT
          • REPEAT
          • REPEATSTR
          • REPLACE
          • REVERSE
          • RIGHT
          • RPAD
          • RTRIM
          • SIMILAR_TO
          • SOUNDEX
          • SPLIT_PART
          • STARTS_WITH
          • STRPOS
          • SUBSTRING
          • SUBSTRING_INDEX
          • TOASCII
          • TO_HEX
          • TRANSLATE
          • TRIM
          • UCASE
          • UNBASE64
          • UNHEX
          • UPPER
        • Window
          • COUNT
          • COVAR_POP
          • COVAR_SAMP
          • CUME_DIST
          • DENSE_RANK
          • FIRST_VALUE
          • HLL
          • LAG
          • LEAD
          • MAX
          • MIN
          • NDV
          • NTILE
          • PERCENT_RANK
          • RANK
          • ROW_NUMBER
          • SUM
          • VAR_POP
          • VAR_SAMP
      • SQL Commands
        • SELECT
        • USE
        • SHOW
        • DESCRIBE
        • WITH
    • Release Notes
  • Pricing
    • Paid Plans
    • Community Plan
  • Support
    • Support
  • Security
    • Security at Spice AI
    • Report a vulnerability
  • Legal
    • Privacy Policy
    • Website Terms of Use
    • Terms of Service
    • End User License Agreement
Powered by GitBook
On this page
  • Configuration
  • from
  • name
  • params
  • Examples
  • Pagination
  • Example
  • Working with JSON Data
  • Accessing objects fields
  • Piping array into rows
  • Unnesting object properties

Was this helpful?

Edit on GitHub
Export as PDF
  1. Building Blocks
  2. Data Connectors

GraphQL

GraphQL Data Connector Documentation

Last updated 3 months ago

Was this helpful?

The Data Connector enables federated SQL queries on any GraphQL endpoint by specifying graphql as the selector in the from value for the dataset.

datasets:
  - from: graphql:your-graphql-endpoint
    name: my_dataset
    params:
      json_pointer: /data/some/nodes
      graphql_query: |
        {
          some {
            nodes {
              field1
              field2
            }
          }
        }

Limitations

  • The GraphQL data connector does not support variables in the query.

  • Filter pushdown, with the exclusion of LIMIT, is not currently supported. Using a LIMIT will reduce the amount of data requested from the GraphQL server.

Configuration

from

The from field takes the form of graphql:your-graphql-endpoint.

name

The dataset name. This will be used as the table name within Spice.

params

Parameter Name
Description

unnest_depth

Depth level to automatically unnest objects to. By default, disabled if unspecified or 0.

graphql_auth_token

The authentication token to use to connect to the GraphQL server. Uses bearer authentication.

graphql_auth_user

The username to use for basic auth. E.g. graphql_auth_user: my_user

graphql_auth_pass

The password to use for basic auth. E.g. graphql_auth_pass: ${secrets:my_graphql_auth_pass}

graphql_query

json_pointer

GraphQL Query Example

query: |
  {
    some {
      nodes {
        field1
        field2
      }
    }
  }

Examples

Example using the GitHub GraphQL API and Bearer Auth. The following will use json_pointer to retrieve all of the nodes in starredRepositories:

from: graphql:https://api.github.com/graphql
name: stars
params:
  graphql_auth_token: ${env:GITHUB_TOKEN}
  graphql_auth_user: ${env:GRAPHQL_USER}                                                                                            ...
  graphql_auth_pass: ${env:GRAPHQL_PASS}
  json_pointer: /data/viewer/starredRepositories/nodes
  graphql_query: |
    {
      viewer {
        starredRepositories {
          nodes {
            name
            stargazerCount
            languages (first: 10) {
              nodes {
                name
              }
            }
          }
        }
      }
    }

Pagination

The query must have the correct pagination arguments in the associated paginated field.

Example

Forward Pagination:

{
  something_paginated(first: 100) {
    nodes {
      foo
      bar
    }
    pageInfo {
      endCursor
      hasNextPage
    }
  }
}

Backward Pagination:

{
  something_paginated(last: 100) {
    nodes {
      foo
      bar
    }
    pageInfo {
      startCursor
      hasPreviousPage
    }
  }
}

Working with JSON Data

Accessing objects fields

You can access the fields of the object using the square bracket notation. Arrays are indexed from 1.

sql> select node['login'] as login, node['name'] as name from stargazers limit 5;
+--------------+----------------------+
| login        | name                 |
+--------------+----------------------+
| simsieg      | Simon Siegert        |
| davidmathers | David Mathers        |
| ahmedtadde   | Ahmed Tadde          |
| lordhamlet   | Shih-Fen Cheng       |
| thinmy       | Thinmy Patrick Alves |
+--------------+----------------------+

Piping array into rows

from: graphql:https://countries.trevorblades.com
name: countries
params:
  json_pointer: /data/continents
  graphql_query: |
    {
      continents {
        name
        countries {
          name
          capital
        }
      }
    }

description: countries
acceleration:
  enabled: true
  refresh_mode: full
  refresh_check_interval: 30m

Example query:

sql> select continent, country['name'] as country, country['capital'] as capital
from (select name as continent, unnest(countries) as country from countries)
where continent = 'North America' limit 5;
+---------------+---------------------+--------------+
| continent     | country             | capital      |
+---------------+---------------------+--------------+
| North America | Antigua and Barbuda | Saint John's |
| North America | Anguilla            | The Valley   |
| North America | Aruba               | Oranjestad   |
| North America | Barbados            | Bridgetown   |
| North America | Saint Barthélemy    | Gustavia     |
+---------------+---------------------+--------------+

Unnesting object properties

You can also use the unnest_depth parameter to control automatic unnesting of objects from GraphQL responses.

This examples uses the GitHub stargazers endpoint:

from: graphql:https://api.github.com/graphql
name: stargazers
params:
  graphql_auth_token: ${env:GITHUB_TOKEN}
  unnest_depth: 2
  json_pointer: /data/repository/stargazers/edges
  graphql_query: |
    {
      repository(name: "spiceai", owner: "spiceai") {
        id
        name
        stargazers(first: 100) {
          edges {
            node {
              id
              name
              login
            }
          }
          pageInfo {
            hasNextPage
            endCursor
          }
        }

      }
    }

If unnest_depth is set to 0, or unspecified, object unnesting is disabled. When enabled, unnesting automatically moves nested fields to the parent level.

Without unnesting, stargazers data looks like this in a query:

sql> select node from stargazers limit 1;
+------------------------------------------------------------+
| node                                                       |
+------------------------------------------------------------+
| {id: MDQ6VXNlcjcwNzIw, login: ashtom, name: Thomas Dohmke} |
+------------------------------------------------------------+

With unnesting, these properties are automatically placed into their own columns:

sql> select node from stargazers limit 1;
+------------------+--------+---------------+
| id               | login  | name          |
+------------------+--------+---------------+
| MDQ6VXNlcjcwNzIw | ashtom | Thomas Dohmke |
+------------------+--------+---------------+

Unnesting Duplicate Columns

By default, the Spice Runtime will error when a duplicate column is detected during unnesting.

For example, this example spicepod.yml query would fail due to name fields:

from: graphql:https://my-graphql-api.com
name: stargazers
params:
  unnest_depth: 2
  json_pointer: /data/users
  graphql_query: |
    query {
      users {
        name
        emergency_contact {
          name
        }
      }
    }

This example would fail with a runtime error:

WARN runtime: GraphQL Data Connector Error: Invalid object access. Column 'name' already exists in the object.

The example below uses a GraphQL alias to rename emergency_contact.name as emergencyContactName.

from: graphql:https://my-graphql-api.com
name: stargazers
params:
  unnest_depth: 2
  json_pointer: /data/people
  graphql_query: |
    query {
      users {
        name
        emergency_contact {
          emergencyContactName: name
        }
      }
    }

The GraphQL data connector can be configured by providing the following params. Use the to load the password from a secret store, e.g. ${secrets:my_graphql_auth_token}.

The username to use for basic auth. See for a sample GraphQL query

The into the response body. When graphql_query is , the json_pointer can be inferred.

The GraphQL Data Connector supports automatic pagination of the response for queries using .

The graphql_query must include the pageInfo field as per . The connector will parse the graphql_query, and when pageInfo is present, will retrieve data until pagination completes.

Tips for working with JSON data. For more information see .

Example for the stargazers query from :

You can use Datafusion unnest function to pipe values from array into rows. We'll be using as an example.

Avoid this error by where possible. In the example above, a duplicate error was introduced from emergency_contact { name }.

GraphQL
secret replacement syntax
cursor pagination
spec
Datafusion Docs
countries GraphQL api
using aliases in the query
pagination section
examples
JSON pointer
paginated