Model Providers
Overview of supported model providers for ML and LLMs in Spice.
Spice supports various model providers for traditional machine learning (ML) models and large language models (LLMs):

| Source | Description | ML Format(s) | LLM Format(s) |
| ------ | ----------- | ------------ | ------------- |
| `openai` | OpenAI (or compatible) LLM endpoint | - | OpenAI-compatible HTTP endpoint |
| `huggingface` | Models hosted on HuggingFace | ONNX | GGUF, GGML, SafeTensor |
| `spice.ai` | Models hosted on the Spice.ai Cloud Platform | ONNX | OpenAI-compatible HTTP endpoint |
| `azure` | Azure OpenAI | - | OpenAI-compatible HTTP endpoint |
| `anthropic` | Models hosted on Anthropic | - | OpenAI-compatible HTTP endpoint |
| `xai` | Models hosted on xAI | - | OpenAI-compatible HTTP endpoint |
LLM Format(s) may require additional files (e.g. `tokenizer_config.json`).
The model type is inferred from the model source and files. For more detail, refer to the model reference documentation.
Spice supports a variety of features for large language models (LLMs):
- **Custom Tools**: Provide models with tools to interact with the Spice runtime.
- **System Prompts**: Customize system prompts and override the defaults.
- **Memory**: Provide LLMs with memory persistence tools to store and retrieve information across conversations.
- **Vector Search**: Perform advanced vector-based searches using embeddings.
- **Evals**: Evaluate, track, compare, and improve language model performance for specific tasks.
- **Local Models**: Load and serve models locally from various sources, including local filesystems and Hugging Face.
The following examples demonstrate how to configure and use various models or model features with Spice. Each example provides a specific use case to help you understand the configuration options available.
Example `spicepod.yml`:

This example demonstrates how to pull GitHub issue data from the last 14 days, accelerate the data, create a chat model with memory and tools to access the accelerated data, and use Spice to ask the chat model about the general themes of new issues.
First, configure a dataset to pull GitHub issue data from the last 14 days.
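A minimal sketch of such a dataset, assuming Spice's GitHub data connector; the repository, secret name, and `refresh_sql` filter are illustrative:

```yaml
datasets:
  - from: github:github.com/spiceai/spiceai/issues # hypothetical repository
    name: spiceai.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN} # secret name is an assumption
    acceleration:
      enabled: true
      # Keep only issues created in the last 14 days in the accelerated table
      refresh_sql: SELECT * FROM spiceai.issues WHERE created_at >= now() - INTERVAL '14 days'
```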
Next, create a chat model that includes memory and tools to access the accelerated GitHub issue data.
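A sketch of the model definition, assuming an OpenAI-hosted model; the model name, tool list, and secret name are illustrative. The `memory` tools also require a memory `store` dataset, included in the combined file below:

```yaml
models:
  - name: issues-chat # hypothetical model name
    from: openai:gpt-4o
    params:
      openai_api_key: ${secrets:SPICE_OPENAI_API_KEY}
      # Memory tools plus SQL access to the accelerated GitHub issues
      tools: memory, sql, list_datasets, table_schema
```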
At this step, the `spicepod.yml` should look like:
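Putting the pieces together, a sketch of the complete file (the `version`, `kind`, and component names are illustrative):

```yaml
version: v1beta1
kind: Spicepod
name: github-issues-chat

datasets:
  # Accelerated GitHub issues from the last 14 days
  - from: github:github.com/spiceai/spiceai/issues
    name: spiceai.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_sql: SELECT * FROM spiceai.issues WHERE created_at >= now() - INTERVAL '14 days'

  # Backing store for the model's memory tools
  - from: memory:store
    name: llm_memory
    mode: read_write

models:
  - name: issues-chat
    from: openai:gpt-4o
    params:
      openai_api_key: ${secrets:SPICE_OPENAI_API_KEY}
      tools: memory, sql, list_datasets, table_schema
```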
Finally, use Spice to ask the chat model about the general themes of new issues in the last 14 days. The following `curl` command demonstrates how to make this request using the OpenAI-compatible API.
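A sketch of the request, assuming the runtime's default HTTP port (8090) and the model name used above:

```bash
curl http://localhost:8090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "issues-chat",
    "messages": [
      {
        "role": "user",
        "content": "What are the general themes of the issues created in the last 14 days?"
      }
    ]
  }'
```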
To use a language model hosted on OpenAI (or compatible), specify the `openai` path and model ID in `from`.
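For example (the model ID and secret name are illustrative):

```yaml
models:
  - name: chat-model
    from: openai:gpt-4o # `openai:` prefix followed by the hosted model ID
    params:
      openai_api_key: ${secrets:SPICE_OPENAI_API_KEY}
```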
To specify tools for an OpenAI model, include them in the `params.tools` field.
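For example, granting a model a subset of the runtime's tools; the specific tool names shown are an assumption:

```yaml
models:
  - name: tooled-model
    from: openai:gpt-4o
    params:
      # Comma-separated list of Spice runtime tools exposed to the model
      tools: sql, list_datasets, table_schema
```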
To enable memory tools for a model, define a `store` memory dataset and specify `memory` in the model's `tools` parameter.
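A sketch, assuming the `memory` data connector and tool group:

```yaml
datasets:
  # In-memory store backing the model's memory tools
  - from: memory:store
    name: llm_memory
    mode: read_write

models:
  - name: memory-model
    from: openai:gpt-4o
    params:
      tools: memory, sql
```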
To set default overrides for parameters, use the `openai_` prefix followed by the parameter name.
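For example, overriding the sampling temperature; whether other OpenAI parameters pass through the same way is an assumption:

```yaml
models:
  - name: tuned-model
    from: openai:gpt-4o
    params:
      # Passed to the OpenAI API as `temperature`
      openai_temperature: 0.1
```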
To configure an additional system prompt, use the `system_prompt` parameter.
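For example (the prompt text is illustrative):

```yaml
models:
  - name: prompted-model
    from: openai:gpt-4o
    params:
      system_prompt: |
        Answer concisely. When referencing data, include the dataset name.
```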
To serve a model from the local filesystem, specify `file` as the `from` path and provide the local path.
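A sketch, assuming a GGUF model file on disk; the path is a placeholder. Depending on the format, additional files (e.g. `tokenizer_config.json`) may be required alongside the model:

```yaml
models:
  - name: local-model
    from: file:models/llama-3.2-1b-instruct.gguf # hypothetical local path
```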
For more details on making chat completion requests, refer to the API documentation.