GraphQL
GraphQL Data Connector Documentation
The GraphQL Data Connector enables federated SQL queries on any GraphQL endpoint by specifying graphql as the selector in the from value for the dataset.
datasets:
- from: graphql:your-graphql-endpoint
name: my_dataset
params:
json_pointer: /data/some/nodes
graphql_query: |
{
some {
nodes {
field1
field2
}
}
}Limitations
The GraphQL data connector does not support variables in the query.
Filter pushdown, with the exclusion of
LIMIT, is not currently supported. Using aLIMITwill reduce the amount of data requested from the GraphQL server.
Configuration
from
fromThe from field takes the form of graphql:your-graphql-endpoint.
name
nameThe dataset name. This will be used as the table name within Spice.
params
paramsThe GraphQL data connector can be configured by providing the following params. Use the secret replacement syntax to load the password from a secret store, e.g. ${secrets:my_graphql_auth_token}.
unnest_depth
Depth level to automatically unnest objects to. By default, disabled if unspecified or 0.
graphql_auth_token
The authentication token to use to connect to the GraphQL server. Uses bearer authentication.
graphql_auth_user
The username to use for basic auth. E.g. graphql_auth_user: my_user
graphql_auth_pass
The password to use for basic auth. E.g. graphql_auth_pass: ${secrets:my_graphql_auth_pass}
graphql_query
The username to use for basic auth. See examples for a sample GraphQL query
json_pointer
The JSON pointer into the response body. When graphql_query is paginated, the json_pointer can be inferred.
GraphQL Query Example
query: |
{
some {
nodes {
field1
field2
}
}
}Examples
Example using the GitHub GraphQL API and Bearer Auth. The following will use json_pointer to retrieve all of the nodes in starredRepositories:
from: graphql:https://api.github.com/graphql
name: stars
params:
graphql_auth_token: ${env:GITHUB_TOKEN}
graphql_auth_user: ${env:GRAPHQL_USER} ...
graphql_auth_pass: ${env:GRAPHQL_PASS}
json_pointer: /data/viewer/starredRepositories/nodes
graphql_query: |
{
viewer {
starredRepositories {
nodes {
name
stargazerCount
languages (first: 10) {
nodes {
name
}
}
}
}
}
}Pagination
The GraphQL Data Connector supports automatic pagination of the response for queries using cursor pagination.
The graphql_query must include the pageInfo field as per spec. The connector will parse the graphql_query, and when pageInfo is present, will retrieve data until pagination completes.
The query must have the correct pagination arguments in the associated paginated field.
Example
Forward Pagination:
{
something_paginated(first: 100) {
nodes {
foo
bar
}
pageInfo {
endCursor
hasNextPage
}
}
}Backward Pagination:
{
something_paginated(last: 100) {
nodes {
foo
bar
}
pageInfo {
startCursor
hasPreviousPage
}
}
}Working with JSON Data
Tips for working with JSON data. For more information see Datafusion Docs.
Accessing objects fields
You can access the fields of the object using the square bracket notation. Arrays are indexed from 1.
Example for the stargazers query from pagination section:
sql> select node['login'] as login, node['name'] as name from stargazers limit 5;
+--------------+----------------------+
| login | name |
+--------------+----------------------+
| simsieg | Simon Siegert |
| davidmathers | David Mathers |
| ahmedtadde | Ahmed Tadde |
| lordhamlet | Shih-Fen Cheng |
| thinmy | Thinmy Patrick Alves |
+--------------+----------------------+Piping array into rows
You can use Datafusion unnest function to pipe values from array into rows. We'll be using countries GraphQL api as an example.
from: graphql:https://countries.trevorblades.com
name: countries
params:
json_pointer: /data/continents
graphql_query: |
{
continents {
name
countries {
name
capital
}
}
}
description: countries
acceleration:
enabled: true
refresh_mode: full
refresh_check_interval: 30mExample query:
sql> select continent, country['name'] as country, country['capital'] as capital
from (select name as continent, unnest(countries) as country from countries)
where continent = 'North America' limit 5;
+---------------+---------------------+--------------+
| continent | country | capital |
+---------------+---------------------+--------------+
| North America | Antigua and Barbuda | Saint John's |
| North America | Anguilla | The Valley |
| North America | Aruba | Oranjestad |
| North America | Barbados | Bridgetown |
| North America | Saint Barthélemy | Gustavia |
+---------------+---------------------+--------------+Unnesting object properties
You can also use the unnest_depth parameter to control automatic unnesting of objects from GraphQL responses.
This examples uses the GitHub stargazers endpoint:
from: graphql:https://api.github.com/graphql
name: stargazers
params:
graphql_auth_token: ${env:GITHUB_TOKEN}
unnest_depth: 2
json_pointer: /data/repository/stargazers/edges
graphql_query: |
{
repository(name: "spiceai", owner: "spiceai") {
id
name
stargazers(first: 100) {
edges {
node {
id
name
login
}
}
pageInfo {
hasNextPage
endCursor
}
}
}
}If unnest_depth is set to 0, or unspecified, object unnesting is disabled. When enabled, unnesting automatically moves nested fields to the parent level.
Without unnesting, stargazers data looks like this in a query:
sql> select node from stargazers limit 1;
+------------------------------------------------------------+
| node |
+------------------------------------------------------------+
| {id: MDQ6VXNlcjcwNzIw, login: ashtom, name: Thomas Dohmke} |
+------------------------------------------------------------+With unnesting, these properties are automatically placed into their own columns:
sql> select node from stargazers limit 1;
+------------------+--------+---------------+
| id | login | name |
+------------------+--------+---------------+
| MDQ6VXNlcjcwNzIw | ashtom | Thomas Dohmke |
+------------------+--------+---------------+Unnesting Duplicate Columns
By default, the Spice Runtime will error when a duplicate column is detected during unnesting.
For example, this example spicepod.yml query would fail due to name fields:
from: graphql:https://my-graphql-api.com
name: stargazers
params:
unnest_depth: 2
json_pointer: /data/users
graphql_query: |
query {
users {
name
emergency_contact {
name
}
}
}This example would fail with a runtime error:
WARN runtime: GraphQL Data Connector Error: Invalid object access. Column 'name' already exists in the object.Avoid this error by using aliases in the query where possible. In the example above, a duplicate error was introduced from emergency_contact { name }.
The example below uses a GraphQL alias to rename emergency_contact.name as emergencyContactName.
from: graphql:https://my-graphql-api.com
name: stargazers
params:
unnest_depth: 2
json_pointer: /data/people
graphql_query: |
query {
users {
name
emergency_contact {
emergencyContactName: name
}
}
}Last updated
Was this helpful?