# Hugging Face

To use a model hosted on Hugging Face, specify its `huggingface.co` path in the `from` field and, when needed, the files to include.

## Configuration

### `from`

The `from` key takes the form `huggingface:model_path`. Two common forms are:

* `huggingface:username/modelname`: Implies the latest version of `modelname` hosted by `username`.
* `huggingface:huggingface.co/username/modelname:revision`: Specifies a particular `revision` of `modelname` by `username`, including the optional domain.

The `from` key must match the following regular expression:

```regex
\A(huggingface:)(huggingface\.co\/)?(?<org>[\w\-]+)\/(?<model>[\w\-]+)(:(?<revision>[\w\d\-\.]+))?\z
```

The `from` key consists of five components:

1. **Prefix:** The value must start with `huggingface:`.
2. **Domain (Optional):** Optionally includes `huggingface.co/` immediately after the prefix. No other Hugging Face-compatible services are currently supported.
3. **Organization/User:** The Hugging Face organization or user name (`org`).
4. **Model Name:** After a `/`, the model name (`model`).
5. **Revision (Optional):** A colon (`:`) followed by the git-like revision identifier (`revision`).
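
To make the five components concrete, the following is a minimal sketch that splits a `from` value into its named parts. It assumes a direct Python translation of the pattern above: Python's `re` module writes named groups as `(?P<name>...)` and the end-of-string anchor as `\Z`. The helper name `parse_from` is hypothetical and is not part of Spice.

```python
import re

# Assumed Python translation of the documented pattern:
# (?<name>...) becomes (?P<name>...) and \z becomes \Z.
FROM_PATTERN = re.compile(
    r"\A(huggingface:)(huggingface\.co/)?"
    r"(?P<org>[\w\-]+)/(?P<model>[\w\-]+)"
    r"(:(?P<revision>[\w\d\-.]+))?\Z"
)


def parse_from(value):
    """Return a dict with org, model, and revision, or None if invalid.

    Hypothetical helper for illustration only. The revision is None
    when the optional ":revision" suffix is absent.
    """
    match = FROM_PATTERN.match(value)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    # Full form: optional domain plus an explicit revision.
    print(parse_from("huggingface:huggingface.co/spiceai/darts:latest"))
    # Short form: no domain, no revision.
    print(parse_from("huggingface:spiceai/darts"))
    # Missing prefix, so the value is rejected.
    print(parse_from("spiceai/darts"))
```

Values without the `huggingface:` prefix, or with a domain other than `huggingface.co/`, do not match and yield `None` in this sketch.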

### `name`

The model name. This is used as the model ID within Spice and in Spice's endpoints (e.g. `https://data.spiceai.io/v1/models`). It can be set to the same value as the model ID in the `from` field.

### `params`

| Param           | Description                                                                                                                                                                               | Default |
| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| `hf_token`      | The Hugging Face access token.                                                                                                                                                            | -       |
| `model_type`    | The architecture to load the model as. Supported values: `mistral`, `gemma`, `mixtral`, `llama`, `phi2`, `phi3`, `qwen2`, `gemma2`, `starcoder2`, `phi3.5moe`, `deepseekv2`, `deepseekv3` | -       |
| `tools`         | Which tools should be made available to the model. Set to `auto` to use all available tools.                                                                                              | -       |
| `system_prompt` | An additional system prompt used for all chat completions to this model.                                                                                                                  | -       |

### `files`

The specific files to load from the Hugging Face model repository. For example, GGUF models require an explicit file path, whereas other formats (e.g. `.safetensors`) are inferred.

#### Example

```yaml
models:
  - from: huggingface:huggingface.co/lmstudio-community/Qwen2.5-Coder-3B-Instruct-GGUF
    name: sloth-gguf
    files:
      - path: Qwen2.5-Coder-3B-Instruct-Q3_K_L.gguf
```

## Access Tokens

Access tokens can be provided for Hugging Face models in two ways:

1. In the Hugging Face token cache (`~/.cache/huggingface/token`). This is the default.
2. Via [model params](#params).

```yaml
models:
  - name: llama_3.2_1B
    from: huggingface:huggingface.co/meta-llama/Llama-3.2-1B
    params:
      hf_token: ${ secrets:HF_TOKEN }
```

## Examples

### Load an ML model to predict taxi trip outcomes

```yaml
models:
  - from: huggingface:huggingface.co/spiceai/darts:latest
    name: hf_model
    files:
      - path: model.onnx
    datasets:
      - taxi_trips
```

### Load an LLM to generate text

```yaml
models:
  - from: huggingface:huggingface.co/microsoft/Phi-3.5-mini-instruct
    name: phi
```

### Load a private model

```yaml
models:
  - name: llama_3.2_1B
    from: huggingface:huggingface.co/meta-llama/Llama-3.2-1B
    params:
      hf_token: ${ secrets:HF_TOKEN }
```

For more details on authentication, see [access tokens](#access-tokens).

{% hint style="warning" %}
**Limitations**

* The throughput, concurrency, and latency of a locally hosted model vary with the underlying hardware and model size. Spice supports [Apple Metal](https://github.com/spicehq/docs/blob/trunk/installation.md#metal-support) and [CUDA](https://github.com/spicehq/docs/blob/trunk/installation.md#cuda-support) for accelerated inference.
* ML models currently support only the ONNX file format.
  {% endhint %}


