Welcome to the Spice.ai OSS Cookbook—a comprehensive collection of recipes for building and deploying data & AI applications using Spice.ai. Each recipe is a self-contained example that demonstrates a specific use case, integration, or feature of Spice.ai, helping you accelerate your data and AI projects.
- Real-time Data Access Pattern Analysis - Use AI to analyze query patterns and detect potential security risks.
- Federated SQL Query - Query data from S3, PostgreSQL, and Dremio in a single query.
- Command Query Responsibility Segregation (CQRS) - Sample application implementing the CQRS pattern with Spice.
- AI SQL Function - Use the `ai()` SQL function to invoke LLMs directly in SQL queries for text generation, sentiment analysis, and data enrichment.
- Azure OpenAI Models
- Running Llama3 Locally - Use the Llama family of models locally from HuggingFace using Spice.
- OpenAI Models - Use OpenAI LLM and embedding models.
- OpenAI SDK - Use the OpenAI SDK to connect to models hosted on Spice.
- LLM Memory - Persistent memory for language models.
- Text to SQL (Tools)
- Nvidia NIM on Kubernetes - Deploy Nvidia NIM infrastructure on Kubernetes with GPUs, connected to Spice.
- Nvidia NIM on AWS EC2 - Deploy Nvidia NIM on AWS GPU-optimized EC2 instances connected to Spice.
- Searching GitHub Files - Search GitHub files with embeddings and vector similarity search.
- xAI Models - Use xAI models such as Grok.
- DeepSeek Model - Use the DeepSeek model through Spice.
- Filesystem Hosted Model - Use models hosted directly on filesystems.
- Web Search Tools using Perplexity - Provide LLMs with web search access for more informed answers.
- Language Model Evaluations - Use Spice to evaluate language models.
- LLM as a Judge - Define LLM judge models to evaluate the performance of other language models.
- OpenAI Responses API - Use OpenAI's Responses API with Spice.
- DuckDB Data Accelerator
- Hashed Partitioning with DuckDB
- PostgreSQL Data Accelerator
- SQLite Data Accelerator
- Apache Arrow Data Accelerator
- Accelerated Views
- Sales BI (Apache Superset) - Visualize data in Spice with Apache Superset.
- Grafana Datasource - Add Spice as a Grafana datasource.
- Python ADBC Client - Query Spice using ADBC and Parameterized Queries with Python.
- Java JDBC Client - Query Spice using JDBC and Parameterized Queries with Java.
- Scala JDBC Client - Query Spice using JDBC and Parameterized Queries with Scala.
- Postgres Data Connector
- MySQL Data Connector
- ClickHouse Data Connector
- Databricks Connector - Delta Lake and Spark Connect.
- Delta Lake Connector - Query data from Delta Lake tables.
- Debezium Change Data Capture (CDC) Data Connector from Postgres - Stream changes from a Postgres database to Spice.
- Debezium CDC SASL/SCRAM Authentication from MySQL - Stream changes from a MySQL database to Spice using SASL/SCRAM authentication.
- Dremio Data Connector
- DuckDB Data Connector - Use a DuckDB database with sample TPCH data.
- File Data Connector - Query data from local files.
- FTP Data Connector - Query data from an FTP server.
- Glue Data Connector
- GitHub Data Connector
- GraphQL Data Connector
- MSSQL (Microsoft SQL Server) Data Connector
- ODBC Data Connector
- Amazon Redshift - Read and write TPC-H data with Amazon Redshift.
- Oracle Data Connector
- S3 Data Connector
- SharePoint/OneDrive for Business Data Connector
- Snowflake Data Connector
- Spice.ai Cloud Platform Data Connector
- Apache Spark Data Connector
- Apache Kafka Data Connector
- IMAP Data Connector
- Spice.ai Cloud Platform Catalog Connector
- Databricks Unity Catalog Connector
- Unity Catalog Connector
- Iceberg Catalog Connector
- Glue Catalog Connector
- Amazon S3 Vectors - Use Amazon S3 as a vector engine for embeddings and similarity search.
- Hybrid-Search - Combine keyword and vector search for improved retrieval.
- Deploying to Kubernetes
- Running in Docker
- Sidecar Deployment Architecture
- Microservice Deployment Architecture
- Rust SDK
- Python SDK
- Go SDK
- JavaScript SDK (Node.js) - Query NYC taxi trips data using the `@spiceai/spice` npm package.
- Java SDK
- Local dataset replication - Link datasets in a parent/child relationship within the current Spicepod.