GraphQL
GraphQL Data Connector Documentation
The GraphQL Data Connector enables federated SQL queries on any GraphQL endpoint by specifying graphql
as the selector in the from
value for the dataset.
datasets:
- from: graphql:your-graphql-endpoint
name: my_dataset
params:
json_pointer: /data/some/nodes
graphql_query: |
{
some {
nodes {
field1
field2
}
}
}
Limitations
The GraphQL data connector does not support variables in the query.
Filter pushdown, with the exclusion of
LIMIT
, is not currently supported. Using aLIMIT
will reduce the amount of data requested from the GraphQL server.
Configuration
from
from
The from
field takes the form of graphql:your-graphql-endpoint
.
name
name
The dataset name. This will be used as the table name within Spice.
params
params
The GraphQL data connector can be configured by providing the following params
. Use the secret replacement syntax to load the password from a secret store, e.g. ${secrets:my_graphql_auth_token}
.
unnest_depth
Depth level to automatically unnest objects to. By default, disabled if unspecified or 0
.
graphql_auth_token
The authentication token to use to connect to the GraphQL server. Uses bearer authentication.
graphql_auth_user
The username to use for basic auth. E.g. graphql_auth_user: my_user
graphql_auth_pass
The password to use for basic auth. E.g. graphql_auth_pass: ${secrets:my_graphql_auth_pass}
graphql_query
The username to use for basic auth. See examples for a sample GraphQL query
json_pointer
The JSON pointer into the response body. When graphql_query
is paginated, the json_pointer
can be inferred.
GraphQL Query Example
query: |
{
some {
nodes {
field1
field2
}
}
}
Examples
Example using the GitHub GraphQL API and Bearer Auth. The following will use json_pointer
to retrieve all of the nodes in starredRepositories:
from: graphql:https://api.github.com/graphql
name: stars
params:
graphql_auth_token: ${env:GITHUB_TOKEN}
graphql_auth_user: ${env:GRAPHQL_USER} ...
graphql_auth_pass: ${env:GRAPHQL_PASS}
json_pointer: /data/viewer/starredRepositories/nodes
graphql_query: |
{
viewer {
starredRepositories {
nodes {
name
stargazerCount
languages (first: 10) {
nodes {
name
}
}
}
}
}
}
Pagination
The GraphQL Data Connector supports automatic pagination of the response for queries using cursor pagination.
The graphql_query
must include the pageInfo
field as per spec. The connector will parse the graphql_query
, and when pageInfo
is present, will retrieve data until pagination completes.
The query must have the correct pagination arguments in the associated paginated field.
Example
Forward Pagination:
{
something_paginated(first: 100) {
nodes {
foo
bar
}
pageInfo {
endCursor
hasNextPage
}
}
}
Backward Pagination:
{
something_paginated(last: 100) {
nodes {
foo
bar
}
pageInfo {
startCursor
hasPreviousPage
}
}
}
Working with JSON Data
Tips for working with JSON data. For more information see Datafusion Docs.
Accessing objects fields
You can access the fields of the object using the square bracket notation. Arrays are indexed from 1.
Example for the stargazers query from pagination section:
sql> select node['login'] as login, node['name'] as name from stargazers limit 5;
+--------------+----------------------+
| login | name |
+--------------+----------------------+
| simsieg | Simon Siegert |
| davidmathers | David Mathers |
| ahmedtadde | Ahmed Tadde |
| lordhamlet | Shih-Fen Cheng |
| thinmy | Thinmy Patrick Alves |
+--------------+----------------------+
Piping array into rows
You can use Datafusion unnest
function to pipe values from array into rows. We'll be using countries GraphQL api as an example.
from: graphql:https://countries.trevorblades.com
name: countries
params:
json_pointer: /data/continents
graphql_query: |
{
continents {
name
countries {
name
capital
}
}
}
description: countries
acceleration:
enabled: true
refresh_mode: full
refresh_check_interval: 30m
Example query:
sql> select continent, country['name'] as country, country['capital'] as capital
from (select name as continent, unnest(countries) as country from countries)
where continent = 'North America' limit 5;
+---------------+---------------------+--------------+
| continent | country | capital |
+---------------+---------------------+--------------+
| North America | Antigua and Barbuda | Saint John's |
| North America | Anguilla | The Valley |
| North America | Aruba | Oranjestad |
| North America | Barbados | Bridgetown |
| North America | Saint Barthélemy | Gustavia |
+---------------+---------------------+--------------+
Unnesting object properties
You can also use the unnest_depth
parameter to control automatic unnesting of objects from GraphQL responses.
This examples uses the GitHub stargazers endpoint:
from: graphql:https://api.github.com/graphql
name: stargazers
params:
graphql_auth_token: ${env:GITHUB_TOKEN}
unnest_depth: 2
json_pointer: /data/repository/stargazers/edges
graphql_query: |
{
repository(name: "spiceai", owner: "spiceai") {
id
name
stargazers(first: 100) {
edges {
node {
id
name
login
}
}
pageInfo {
hasNextPage
endCursor
}
}
}
}
If unnest_depth
is set to 0, or unspecified, object unnesting is disabled. When enabled, unnesting automatically moves nested fields to the parent level.
Without unnesting, stargazers data looks like this in a query:
sql> select node from stargazers limit 1;
+------------------------------------------------------------+
| node |
+------------------------------------------------------------+
| {id: MDQ6VXNlcjcwNzIw, login: ashtom, name: Thomas Dohmke} |
+------------------------------------------------------------+
With unnesting, these properties are automatically placed into their own columns:
sql> select node from stargazers limit 1;
+------------------+--------+---------------+
| id | login | name |
+------------------+--------+---------------+
| MDQ6VXNlcjcwNzIw | ashtom | Thomas Dohmke |
+------------------+--------+---------------+
Unnesting Duplicate Columns
By default, the Spice Runtime will error when a duplicate column is detected during unnesting.
For example, this example spicepod.yml
query would fail due to name
fields:
from: graphql:https://my-graphql-api.com
name: stargazers
params:
unnest_depth: 2
json_pointer: /data/users
graphql_query: |
query {
users {
name
emergency_contact {
name
}
}
}
This example would fail with a runtime error:
WARN runtime: GraphQL Data Connector Error: Invalid object access. Column 'name' already exists in the object.
Avoid this error by using aliases in the query where possible. In the example above, a duplicate error was introduced from emergency_contact { name }
.
The example below uses a GraphQL alias to rename emergency_contact.name
as emergencyContactName
.
from: graphql:https://my-graphql-api.com
name: stargazers
params:
unnest_depth: 2
json_pointer: /data/people
graphql_query: |
query {
users {
name
emergency_contact {
emergencyContactName: name
}
}
}
Last updated
Was this helpful?