# Semantic Models

A semantic model is a structured representation of data that captures the meaning and relationships between elements in a dataset.

In Spice, semantic models transform raw data into meaningful business concepts by defining metadata, descriptions, and relationships at both the dataset and column level. This makes the data more interpretable for both AI language models and human analysis.

### Use Cases

#### Large Language Models (LLMs)

The semantic model is automatically used by [Spice Models](/features/spice-models.md) as context to produce more accurate and context-aware AI responses.

### Defining a Semantic Model

Semantic models are defined in the `spicepod.yaml` file, under the `datasets` section. Each dataset supports a `description`, a `metadata` block, and a `columns` list in which each column can be given its own description and embeddings configuration.

#### Example Configuration

Example `spicepod.yaml`:

```yaml
datasets:
  - name: taxi_trips
    description: NYC taxi trip rides
    metadata:
      instructions: Always provide citations with reference URLs.
      reference_url_template: https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_<YYYY-MM>.parquet
    columns:
      - name: tpep_pickup_time
        description: 'The time the passenger was picked up by the taxi'
      - name: notes
        description: 'Optional notes about the trip'
        embeddings:
          - from: hf_minilm # A defined Spice Model
            chunking:
              enabled: true
              target_chunk_size: 512
              overlap_size: 128
              trim_whitespace: true
```
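The `from: hf_minilm` entry above refers to an embedding model by name, so that model has to be defined somewhere in the same Spicepod. A minimal sketch of such a definition, assuming a top-level `embeddings` section and a Hugging Face sentence-transformers model; the model path shown is an assumption for illustration and is not taken from this page:

```yaml
# Assumed definition of the embedding model referenced by `from: hf_minilm`.
# The Hugging Face path below is illustrative; substitute the model you use.
embeddings:
  - name: hf_minilm
    from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2
```

The `name` here is what column-level `embeddings` entries reference in their `from` field.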

### Dataset Metadata

Datasets can be defined with the following metadata:

* `instructions`: Optional. Instructions to provide to a language model when using this dataset.
* `reference_url_template`: Optional. A URL template for citation links.

For detailed `metadata` configuration, see the Spice OSS [Dataset Reference](https://docs.spiceai.org/reference/spicepod/datasets#metadata).

### Column Definitions

Each column in the dataset can be defined with the following attributes:

* `description`: Optional. A description of the column's contents and purpose.
* `embeddings`: Optional. Vector embeddings configuration for this column.

For detailed `columns` configuration, see the Spice OSS [Dataset Reference](https://docs.spiceai.org/reference/spicepod/datasets#columns).
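Because both column attributes are optional, a column entry can be as small as a name and a description. A minimal sketch, reusing the `taxi_trips` dataset and the `hf_minilm` model name from the example configuration, and assuming that the `chunking` block may be omitted from an `embeddings` entry (this page does not confirm that; check the Dataset Reference):

```yaml
datasets:
  - name: taxi_trips
    columns:
      # Description only: gives a language model the meaning of the column.
      - name: tpep_pickup_time
        description: 'The time the passenger was picked up by the taxi'
      # Description plus embeddings, with no chunking block (assumed optional).
      - name: notes
        description: 'Optional notes about the trip'
        embeddings:
          - from: hf_minilm
```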

