Instructions for using machine learning models hosted on HuggingFace with Spice.
To use a model hosted on HuggingFace, specify the `huggingface.co` path in the `from` field and, when needed, the files to include.
from
The `from` key takes the form `huggingface:model_path`. Below are two common examples of `from` key configuration:

- `huggingface:username/modelname`: implies the latest version of `modelname` hosted by `username`.
- `huggingface:huggingface.co/username/modelname:revision`: specifies a particular `revision` of `modelname` by `username`, including the optional domain.
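For example, a model could be referenced in `spicepod.yaml` as follows (the model path, revision, and name here are illustrative placeholders, not a real repository):

```yaml
models:
  - from: huggingface:username/modelname:main
    name: my-model
```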
The `from` key consists of five components:
- Prefix: the value must start with `huggingface:`.
- Domain (optional): `huggingface.co/` may follow immediately after the prefix. Currently no other HuggingFace-compatible services are supported.
- Organization/User: the HuggingFace organization or user (`org`).
- Model Name: after a `/`, the model name (`model`).
- Revision (optional): a colon (`:`) followed by the git-like revision identifier (`revision`).
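The component breakdown above can be sketched as a regular expression. The pattern below is an illustrative approximation written for this guide, not necessarily the exact pattern Spice uses internally:

```python
import re

# Illustrative approximation of the `from` key structure described above;
# not Spice's actual validation pattern.
FROM_KEY = re.compile(
    r"^huggingface:"                  # required prefix
    r"(?:huggingface\.co/)?"          # optional domain
    r"(?P<org>[\w\-]+)/"              # organization or user
    r"(?P<model>[\w\-.]+?)"           # model name
    r"(?::(?P<revision>[\w\-.]+))?$"  # optional revision
)

def parse_from_key(value: str) -> dict:
    """Split a `from` key into org, model, and revision (None if absent)."""
    match = FROM_KEY.match(value)
    if match is None:
        raise ValueError(f"invalid from key: {value!r}")
    return match.groupdict()
```

For instance, `parse_from_key("huggingface:username/modelname")` yields the org and model with no revision, while a trailing `:main` is captured as the revision.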
name
The model name. This is used as the model ID within Spice and in Spice's endpoints (e.g. https://data.spiceai.io/v1/models). It can be set to the same value as the model ID in the `from` field.
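A sketch of how `name` relates to the `from` field (values are placeholders):

```yaml
models:
  - from: huggingface:username/modelname
    # `name` becomes the model ID Spice exposes through its endpoints,
    # e.g. https://data.spiceai.io/v1/models
    name: modelname
```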
files
The specific file path for a HuggingFace model. For example, GGUF model formats require a specific file path; other formats (e.g. `.safetensors`) are inferred.
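For instance, a GGUF model needs its file listed explicitly. The repository and file name below are placeholders, and the `path` key is the assumed shape of a `files` entry:

```yaml
models:
  - from: huggingface:username/modelname-GGUF
    name: modelname
    files:
      # GGUF repositories often contain several quantizations,
      # so the exact file must be named.
      - path: modelname.Q4_K_M.gguf
```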
Access tokens can be provided for HuggingFace models in two ways:

- In the HuggingFace token cache (i.e. `~/.cache/huggingface/token`). This is the default.
- Via the `hf_token` model parameter.
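A sketch of passing the token via `params`. The `${secrets:...}` reference syntax is an assumption here; consult the Spice secrets documentation for the exact form used by your setup:

```yaml
models:
  - from: huggingface:username/modelname
    name: modelname
    params:
      # Assumes an HF_TOKEN secret has been configured.
      hf_token: ${secrets:HF_TOKEN}
```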
Limitations
ML models currently support only the ONNX file format.
For more details on authentication, see the access token notes above.
The throughput, concurrency, and latency of a locally hosted model will vary based on the underlying hardware and model size. Spice supports hardware acceleration for faster inference.
params

| Param | Description | Default |
| --- | --- | --- |
| `hf_token` | The HuggingFace access token. | - |
| `model_type` | The architecture to load the model as. Supported values: `mistral`, `gemma`, `mixtral`, `llama`, `phi2`, `phi3`, `qwen2`, `gemma2`, `starcoder2`, `phi3.5moe`, `deepseekv2`, `deepseekv3`. | - |
| `tools` | Which tools should be made available to the model. Set to `auto` to use all available tools. | - |
| `system_prompt` | An additional system prompt used for all chat completions to this model. | - |
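Putting several params together in one sketch (the model path is a placeholder, and the `model_type` and `system_prompt` values are illustrative):

```yaml
models:
  - from: huggingface:username/modelname
    name: modelname
    params:
      model_type: llama
      tools: auto
      system_prompt: You are a helpful assistant.
```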