Acceleration Snapshots

Object-store-backed snapshot, bootstrap, and recovery for accelerated datasets in Spice.ai Enterprise.

Acceleration snapshots persist accelerated dataset state to an object store so that a Spice runtime can bootstrap a new replica or executor without re-fetching from the federated source. Because startup loads a snapshot instead of repeating the full source fetch, snapshots reduce cold-start time, support fast scale-out, and provide a simple disaster-recovery path for accelerated state.

Acceleration snapshots are a Spice.ai Enterprise feature. They are not available in Spice.ai OSS.

How it works

  1. The runtime writes the accelerated dataset to an object store (S3-compatible, Azure Blob, GCS, or any object store with conditional writes).

  2. On startup, an accelerated dataset that opts into snapshots reads the most recent snapshot from the configured location and hydrates its local accelerator before serving queries.

  3. While running, snapshots are produced according to the dataset's configured trigger and creation policy. Older snapshots can be compacted to bound storage cost.

  4. In a SpicepodCluster, each executor reads the snapshot for the partitions it owns and writes new snapshots after refreshes complete on the partitions it is responsible for.

| Engine | Snapshot support | Notes |
| --- | --- | --- |
| DuckDB | ✓ Bootstrap + create | File-mode; snapshot is the DuckDB file uploaded to object storage. |
| SQLite | ✓ Bootstrap + create | File-mode; snapshot is the SQLite file uploaded to object storage. |
| Cayenne | ✓ Bootstrap + create (recommended for distributed) | Vortex storage with SQLite metadata; integrates natively with SpicepodCluster partitioned executors. |
| Arrow (in-memory) | ✗ Not supported | Re-fetched from source on restart. |
| Postgres (external) | ✗ Not supported | External database is itself the durable store. |

Configuration

Snapshots are configured in two places:

  1. Top-level snapshots — declares the snapshot store location, credentials, and global behavior.

  2. Per-dataset acceleration.snapshots* — opts the dataset into snapshots and chooses behavior, trigger, and compaction.

Top-level snapshots

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | bool | true | Global on/off switch. When false, no dataset can use snapshots regardless of per-dataset config. |
| location | string | | Object-store URL of the snapshot folder (e.g. s3://bucket/spice/snapshots/, abfs://…, gs://…). |
| bootstrap_on_failure_behavior | warn, retry, fallback | warn | What to do if snapshot load fails. warn continues with empty acceleration; retry retries indefinitely; fallback tries older snapshots. |
| params | object | | Object-store auth/configuration. For S3, defaults to s3_auth: iam_role; explicit keys may be sourced from secrets. |
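
As a rough illustration, the top-level block might look like the following in a Spicepod. This is a sketch, not a verified configuration: the bucket, path, and Spicepod name are placeholders, and the surrounding version, kind, and name keys are assumed from the usual Spicepod layout rather than taken from this page.

```yaml
# Sketch only: bucket, path, and Spicepod name are placeholders.
version: v1            # assumed Spicepod header, not documented on this page
kind: Spicepod
name: snapshots-demo   # hypothetical name

snapshots:
  enabled: true                            # global switch (default: true)
  location: s3://bucket/spice/snapshots/   # snapshot folder in the object store
  bootstrap_on_failure_behavior: warn      # warn, retry, or fallback (default: warn)
  params:
    s3_auth: iam_role                      # S3 default; explicit keys may come from secrets
```

Setting enabled to false here turns snapshots off for every dataset, whatever the per-dataset settings say.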

Per-dataset acceleration snapshot fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| snapshots | disabled, enabled, bootstrap_only, create_only | disabled | Per-dataset opt-in. enabled does both bootstrap and create. bootstrap_only only loads. create_only only writes. |
| snapshots_trigger | refresh_complete, time_interval, stream_batches | refresh_complete | When to attempt to write a snapshot. |
| snapshots_trigger_threshold | duration / count string | | For time_interval (e.g. "10m") and stream_batches (e.g. "1000") triggers. |
| snapshots_compaction | disabled, enabled | disabled | When enabled, older snapshots are compacted/garbage-collected. |
| snapshots_creation_policy | on_change, always | on_change | on_change skips creating a new snapshot when the dataset hasn't changed since the last snapshot. |
| snapshots_reset_expiry_on_load | disabled, enabled | disabled | When enabled, loading a snapshot resets the dataset's expiry/retention clock, which is useful for cold standby replicas. |
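
A per-dataset opt-in could be sketched as follows. The dataset name and source path are hypothetical, and the from, name, engine, and mode keys are assumed from the general Spicepod dataset layout; only the snapshots* fields come from the table above. Source connector parameters are omitted for brevity.

```yaml
# Sketch only: dataset name and source path are placeholders.
datasets:
  - from: s3://bucket/path/to/orders/     # hypothetical federated source
    name: orders                          # hypothetical dataset name
    acceleration:
      enabled: true
      engine: duckdb
      mode: file                          # file-mode, so the DuckDB file can be snapshotted
      snapshots: enabled                  # bootstrap from and write snapshots
      snapshots_trigger: refresh_complete # default trigger
      snapshots_creation_policy: on_change
      snapshots_compaction: enabled       # compact older snapshots to bound storage
```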

Behavior modes

| Mode | Bootstrap from snapshot? | Write new snapshots? | Typical use |
| --- | --- | --- | --- |
| disabled | No | No | Snapshots off for this dataset (default). |
| enabled | Yes | Yes | Standard production setting: bootstrap fast, keep the snapshot up to date. |
| bootstrap_only | Yes | No | Read replicas / scaled-out executors that should hydrate from a shared snapshot but never overwrite it. |
| create_only | No | Yes | A "primary" replica that produces snapshots for other replicas to consume. |

A common topology in SpicepodCluster is one create_only (or enabled) executor per partition writing snapshots, with newly added executors using bootstrap_only to come online quickly.
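
The writer/reader split described above might be expressed with two dataset definitions like these. This is a sketch under assumptions: the dataset name and source path are placeholders, the engine identifier cayenne and the mode key are assumed rather than confirmed by this page, and how each definition is assigned to a given executor is not shown.

```yaml
# Sketch only: dataset name, source path, and engine identifier are assumptions.
# Writer: the executor that owns the partition produces snapshots.
datasets:
  - from: s3://bucket/path/to/events/
    name: events
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      snapshots: create_only      # write snapshots, never bootstrap from them
---
# Reader: newly added executors hydrate from the shared snapshot
# and never write to it.
datasets:
  - from: s3://bucket/path/to/events/
    name: events
    acceleration:
      enabled: true
      engine: cayenne
      mode: file
      snapshots: bootstrap_only   # load snapshots, never overwrite them
```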

Triggers

| Trigger | Threshold required | Description |
| --- | --- | --- |
| refresh_complete | No | Default. A snapshot is attempted after each successful dataset refresh. |
| time_interval | Yes | A snapshot is attempted on a timer (snapshots_trigger_threshold: "10m"). |
| stream_batches | Yes | For streaming sources, a snapshot is attempted every N consumed batches (snapshots_trigger_threshold: "1000"). |

snapshots_creation_policy: on_change (default) suppresses snapshots when there is no change to commit; pair it with time_interval or stream_batches to bound snapshot frequency.
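
The two threshold-based triggers could be configured along these lines. The fragments show only the acceleration block of a dataset; the engine and mode values are assumptions carried over from the file-mode engines listed earlier, and the threshold values are the examples from the table, not recommendations.

```yaml
# Sketch only: two alternative acceleration blocks, one per trigger.
# Timer-based: attempt a snapshot every 10 minutes, skipped when nothing changed.
acceleration:
  enabled: true
  engine: duckdb
  mode: file
  snapshots: enabled
  snapshots_trigger: time_interval
  snapshots_trigger_threshold: "10m"
  snapshots_creation_policy: on_change
---
# Batch-based (streaming sources): attempt a snapshot every 1000 consumed batches.
acceleration:
  enabled: true
  engine: duckdb
  mode: file
  snapshots: enabled
  snapshots_trigger: stream_batches
  snapshots_trigger_threshold: "1000"
  snapshots_creation_policy: on_change
```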

Failure handling on bootstrap

bootstrap_on_failure_behavior controls what happens if the runtime cannot read the most recent snapshot:

  • warn: Log a warning and continue with an empty acceleration. The next refresh will repopulate it from the federated source. Suitable when the source is fast and snapshots are an optimization.

  • retry: Retry loading the newest snapshot indefinitely. Suitable for replicas that should not serve queries from an empty acceleration.

  • fallback: Try progressively older snapshots until one loads successfully. Suitable for resilience against a corrupt or partially written latest snapshot.
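
For example, a deployment that prefers an older snapshot over an empty acceleration might set the top-level block roughly as follows. The location is a placeholder; the field names are the ones documented above.

```yaml
# Sketch only: location is a placeholder.
snapshots:
  location: s3://bucket/spice/snapshots/
  # warn (default): continue with an empty acceleration
  # retry: keep retrying the newest snapshot
  # fallback: try progressively older snapshots
  bootstrap_on_failure_behavior: fallback
```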

Distributed clusters

In a SpicepodCluster:

  • The snapshot location should point to the same object store used for cluster shared state (or a dedicated bucket) so all schedulers and executors share a single view.

  • Each executor writes snapshots only for the partitions it owns; reads target the same partition layout (Hive-style key1=v1/key2=v2/...).

  • New executors joining the cluster receive their partition assignments via AllocateInitialPartitions (see Distributed Query) and then bootstrap each owned partition from the snapshot store, avoiding a federated re-scan.

  • Setting snapshots: bootstrap_only on executors prevents accidental dual-writers when more than one executor temporarily believes it owns a partition during a transition; the scheduler-confirmed owner uses enabled or create_only.

Observability

Acceleration snapshots emit Prometheus / OTLP metrics on every node:

| Metric | Type | Description |
| --- | --- | --- |
| dataset_acceleration_snapshot_bootstrap_bytes | gauge | Bytes downloaded when bootstrapping the acceleration from a snapshot. |
| dataset_acceleration_snapshot_bootstrap_checksum | gauge | Checksum of the snapshot downloaded during bootstrap (emitted with checksum attribute). |
| dataset_acceleration_snapshot_bootstrap_duration_ms | counter | Time in ms taken to download the snapshot used to bootstrap acceleration. |
| dataset_acceleration_snapshot_failure_count | counter | Number of failures encountered while writing snapshots. |
| dataset_acceleration_snapshot_write_bytes | gauge | Bytes written for the most recent snapshot. |
| dataset_acceleration_snapshot_write_checksum | gauge | Checksum of the most recent snapshot write (emitted with checksum attribute). |
| dataset_acceleration_snapshot_write_duration_ms | histogram | Time in ms taken to write the latest snapshot to object storage. |
| dataset_acceleration_snapshot_write_timestamp | gauge | Unix timestamp (seconds) when the most recent snapshot write completed. |
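
These metrics lend themselves to alerting. The Prometheus rule file below is a sketch under assumptions: the group name, alert names, thresholds, and severities are illustrative choices, and the exported metric names may differ from the table depending on the exporter (for example, counters are sometimes exposed with a _total suffix), so check the names on your /metrics endpoint before using the expressions.

```yaml
# Sketch only: names, thresholds, and severities are illustrative.
groups:
  - name: spice-acceleration-snapshots
    rules:
      # Fires when any snapshot write failure was recorded in the last 15 minutes.
      - alert: SnapshotWriteFailures
        expr: increase(dataset_acceleration_snapshot_failure_count[15m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Acceleration snapshot writes are failing"

      # Fires when the most recent successful snapshot write is over an hour old.
      - alert: SnapshotStale
        expr: time() - dataset_acceleration_snapshot_write_timestamp > 3600
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "No acceleration snapshot written in the last hour"
```

The staleness threshold should be chosen relative to the configured trigger; with on_change, an unchanged dataset legitimately produces no new snapshots.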
