Spice.ai OSS Cookbook

Welcome to the Spice.ai OSS Cookbook, a collection of recipes for building and deploying data and AI applications with Spice.ai. Each recipe is a self-contained example that demonstrates a specific use case, integration, or feature of Spice.ai.

Recipes

Guides

Real-time Data Access Pattern Analysis - Use AI to analyze query patterns and detect potential security risks.

Core scenarios

Federated SQL Query - Query data from S3, PostgreSQL, and Dremio in a single query.
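To illustrate what "a single query across sources" means, here is a minimal sketch that does not use Spice at all. It assumes Python's built-in sqlite3 module as a stand-in and attaches two independent in-memory databases, then joins across them in one SQL statement. The table names (orders, customers), the alias crm, and the function name are hypothetical; the actual recipe federates S3, PostgreSQL, and Dremio through Spice.

```python
import sqlite3


def federated_join_demo():
    """Join tables that live in two separate databases with one SQL statement."""
    # The primary connection stands in for the first source.
    conn = sqlite3.connect(":memory:")
    # ATTACH adds a second, independent in-memory database named "crm".
    conn.execute("ATTACH DATABASE ':memory:' AS crm")

    # Hypothetical sample tables, one per "source".
    conn.execute("CREATE TABLE main.orders (id INTEGER, customer_id INTEGER, total REAL)")
    conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO main.orders VALUES (?, ?, ?)",
        [(1, 10, 25.0), (2, 11, 40.0), (3, 10, 5.0)],
    )
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(10, "Ada"), (11, "Grace")],
    )

    # One query spanning both databases.
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.total) AS spend
        FROM main.orders AS o
        JOIN crm.customers AS c ON c.id = o.customer_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows
```

Calling federated_join_demo() returns one row per customer with the summed order total. The point of the sketch is only the shape of the query: the caller writes ordinary SQL and the engine resolves each table to its own source.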

Sample Applications

Command Query Responsibility Segregation (CQRS) - Sample application implementing the CQRS pattern with Spice.

Models & AI - Connect data to hosted or local AI models

AI SQL Function - Use the ai() SQL function to invoke LLMs directly in SQL queries for text generation, sentiment analysis, and data enrichment.
Azure OpenAI Models
Running Llama3 Locally - Run Llama family models from Hugging Face locally with Spice.
OpenAI Models - Use OpenAI LLM and embedding models.
OpenAI SDK - Use the OpenAI SDK to connect to models hosted on Spice.
LLM Memory - Persistent memory for language models.
Text to SQL (Tools)
Nvidia NIM on Kubernetes - Deploy Nvidia NIM infrastructure with GPUs on Kubernetes, connected to Spice.
Nvidia NIM on AWS EC2 - Deploy Nvidia NIM on AWS GPU-optimized EC2 instances connected to Spice.
Searching GitHub Files - Search GitHub files with embeddings and vector similarity search.
xAI Models - Use xAI models such as Grok.
DeepSeek Model - Use DeepSeek models through Spice.
Filesystem Hosted Model - Use models hosted directly on filesystems.
Web Search Tools using Perplexity - Provide LLMs with web search access for more informed answers.
Language Model Evaluations - Use Spice to evaluate language models.
LLM as a Judge - Define LLM judge models to evaluate the performance of other language models.
OpenAI Responses API - Use OpenAI's Responses API with Spice.

Data Acceleration - Materializing & accelerating data locally with Data Accelerators

Consuming and visualizing data with clients

Sales BI (Apache Superset) - Visualize data in Spice with Apache Superset.
Grafana Datasource - Add Spice as a Grafana datasource.
Python ADBC Client - Query Spice using ADBC and parameterized queries with Python.
Java JDBC Client - Query Spice using JDBC and parameterized queries with Java.
Scala JDBC Client - Query Spice using JDBC and parameterized queries with Scala.
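The client recipes above mention parameterized queries. As a rough sketch of the idea, the function below binds a value through a DB-API 2.0 cursor instead of formatting it into the SQL string. It is exercised here against Python's built-in sqlite3 as a stand-in, not against Spice. The trips table, its columns, and both function names are hypothetical. ADBC's Python packages also expose a DB-API style interface, but the placeholder syntax (the "?" used here) depends on the driver and engine, so treat it as an assumption.

```python
import sqlite3


def trips_over_fare(conn, min_fare):
    """Return (id, fare) rows above min_fare, passing the value as a bound parameter."""
    cur = conn.cursor()
    try:
        # The value travels separately from the SQL text, so it is never
        # interpreted as SQL. "?" is sqlite3's placeholder style.
        cur.execute(
            "SELECT id, fare FROM trips WHERE fare > ? ORDER BY id",
            (min_fare,),
        )
        return cur.fetchall()
    finally:
        cur.close()


def make_demo_connection():
    """Build a small in-memory table to run the query against (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trips (id INTEGER, fare REAL)")
    conn.executemany(
        "INSERT INTO trips VALUES (?, ?)",
        [(1, 8.5), (2, 22.0), (3, 41.25)],
    )
    return conn
```

With conn = make_demo_connection(), calling trips_over_fare(conn, 20.0) returns only the trips whose fare exceeds 20.0. The same function shape should carry over to any DB-API connection once the placeholder style is adjusted.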

Connecting to Data Sources with Data Connectors

Postgres Data Connector
- AWS RDS PostgreSQL
- Supabase
MySQL Data Connector
- AWS RDS Aurora (MySQL Compatible)
- PlanetScale
ClickHouse Data Connector
Databricks Connector - Delta Lake and Spark Connect.
Delta Lake Connector - Query data from Delta Lake tables.
Debezium Change Data Capture (CDC) Data Connector from Postgres - Stream changes from a Postgres database to Spice.
- Debezium CDC SASL/SCRAM Authentication from MySQL - Stream changes from a MySQL database to Spice using SASL/SCRAM authentication.
Dremio Data Connector
DuckDB Data Connector - Use a DuckDB database with sample TPCH data.
File Data Connector - Query data from local files.
FTP Data Connector - Query data from an FTP server.
Glue Data Connector
GitHub Data Connector
GraphQL Data Connector
MSSQL (Microsoft SQL Server) Data Connector
ODBC Data Connector
Amazon Redshift - Read and write TPC-H data with Amazon Redshift.
Oracle Data Connector
S3 Data Connector
SharePoint/OneDrive for Business Data Connector
Snowflake Data Connector
Spice.ai Cloud Platform Data Connector
Apache Spark Data Connector
Apache Kafka Data Connector
IMAP Data Connector
- Connecting to an Outlook mailbox

Connecting to Data Sources with Catalog Connectors

Using Vector Engines

Amazon S3 Vectors - Use Amazon S3 as a vector engine for embeddings and similarity search.

Search

Hybrid-Search - Combine keyword and vector search for improved retrieval.
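One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below is a plain-Python illustration of that technique under stated assumptions: the recipe itself may fuse results differently, the function name is hypothetical, and k=60 is just the conventional default for RRF.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1); documents are returned by descending total score,
    with ties broken by id for a deterministic order.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword search and a vector search.
keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that appear in both lists ("b" and "c") rise above documents found by only one method, which is the improvement in retrieval that hybrid search aims for.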

Deployment and Installation

Performance

Acceleration Data Configuration

Client SDKs - Recipes for querying data from Spice with language-specific SDKs

Rust SDK
Python SDK
Go SDK
JavaScript SDK (Node.js) - Query NYC taxi trips data using the @spiceai/spice npm package.
Java SDK

Security

Advanced Topics

Local dataset replication - Link datasets in a parent/child relationship within the current Spicepod.

