githubEdit

robotAI & Models

Use the AI Gateway for LLM inference, embeddings, and search across multiple model providers.

Overview

Spice.ai provides an OpenAI-compatible AI Gateway that lets you access multiple model providers through a unified API. This enables LLM inference, embeddings, vector search, and RAG workflows.

See AI Gatewayarrow-up-right for full feature details.

Supported Model Providers

See Model Providersarrow-up-right for the complete list.

Setting Up AI in Your App

1. Add a model provider secret

Store your model provider API key as a secretarrow-up-right in your app (e.g., OPENAI_API_KEY).

2. Configure a model in your Spicepod

3. Deploy

Deploy your app to make the model available.

4. Use the API

Send requests to the LLM APIarrow-up-right:

Key Features

OpenAI-compatible API

The AI Gateway exposes an OpenAI-compatible interface at https://data.spiceai.io/v1/chat/completions. You can use any OpenAI-compatible client library — just point it at the Spice.ai endpoint and use your app's API key.

Custom tools & system prompts

Configure custom tools and system prompts in your model configuration to tailor AI behavior. See AI Gatewayarrow-up-right for configuration options.

Vector search & RAG

Spice supports vector and hybrid searcharrow-up-right for retrieval-augmented generation (RAG) workflows:

Observability

All AI requests include full OpenTelemetry observabilityarrow-up-right for tracing request flows, latency, and errors.

Common Issues

AI chat returns errors

  1. Model not configured — Ensure a model is defined in your Spicepod and the app is deployed.

  2. Missing secret — Verify the model provider API key is stored as a secret and referenced correctly with ${secrets:SECRET_NAME}.

  3. Secret changes require redeployment — After adding or updating secrets, redeploy the app.

Model not found

  • Check that the model name in your API request matches the name field in your Spicepod model configuration.

  • Verify the from field uses a valid provider and model identifier.

Rate limits or quota errors

These typically come from the upstream model provider (e.g., OpenAI). Check your provider's usage dashboard and rate limits.

Further Reading

Last updated

Was this helpful?