TinyEmbed lets you create text embeddings, perform semantic searches, and analyze text similarity without any backend server. All processing happens locally, right within your browser.
- Single Text Embedding: Generate embeddings for individual texts
- Batch Embedding: Process multiple texts at once
- Semantic Search: Create a corpus of texts and search by meaning
- Similarity Analysis: Compare two texts and analyze their semantic similarity
- Local Processing: All computation runs in your browser
- No API Keys: No external API services or keys required
- Next.js with App Router and TypeScript
- TinyLM for embedding generation
- shadcn/ui for components
- Tailwind CSS for styling
- WebGPU (optional) for hardware acceleration
- Modern browser with WebGPU support for full performance (Chrome 113+, Edge 113+, or other Chromium-based browsers)
- Falls back to WebGL when WebGPU is unavailable (a quick detection check is sketched below)
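If you want to know up front which path a visitor will get, WebGPU can be feature-detected with the standard `navigator.gpu` entry point. A minimal sketch (the cast is only there in case the WebGPU types are not in your TS lib config):

```ts
// Detect WebGPU support; TinyEmbed uses the fallback path when it is absent.
async function hasWebGPU(): Promise<boolean> {
  // Cast because the WebGPU types may not be in the default TS lib set.
  const gpu = (navigator as any).gpu;
  if (!gpu) return false;
  const adapter = await gpu.requestAdapter();
  return adapter !== null;
}

hasWebGPU().then((ok) => console.log(ok ? "WebGPU available" : "Using fallback"));
```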
```bash
# Clone the repository
git clone https://github.com/wizenheimer/tinyembed.git
cd tinyembed

# Install dependencies
npm install

# Start the development server
npm run dev

# Build for production
npm run build

# Start the production server
npm start
```
- Select the embedding model from the dropdown (currently only Nomic Embed is supported)
- Choose the encoding format (Float for raw vectors or Base64 for compact storage)
- Click "Load Model" and wait for the loading process to complete
- The status indicators show the current state of the model and system (the load step is sketched below)
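In code, the load step corresponds to creating and initializing a TinyLM client. A minimal sketch, assuming TinyLM exposes an init call along these lines; the option name and model ID below are assumptions, so check the TinyLM docs for the exact signature:

```ts
import { TinyLM } from "tinylm";

// One client instance for the app; progress events (if exposed) can feed
// the status indicators.
const tiny = new TinyLM();

// Download and prepare the embedding model. The `models` option name and
// the model ID are assumptions; the first call downloads the weights,
// later runs reuse the browser cache.
await tiny.init({ models: ["nomic-ai/nomic-embed-text-v1.5"] });
```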
- Enter text in the input field
- Click "Generate Embedding"
- View the resulting embedding vector and metadata (the underlying call is sketched below)
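Programmatically this is a single embeddings call. A sketch assuming TinyLM mirrors the OpenAI embeddings API, including the `encoding_format` parameter; the field names and response shape are assumptions:

```ts
// Assumes `tiny` was created and initialized as in the load-model sketch.
const res = await tiny.embeddings.create({
  model: "nomic-ai/nomic-embed-text-v1.5",
  input: "TinyEmbed runs entirely in the browser",
  encoding_format: "float", // "base64" would return a compact string instead
});

const vector = res.data[0].embedding as number[];
console.log(`Got a ${vector.length}-dimensional embedding`);
```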
- Enter multiple texts, one per line
- Click "Generate Batch Embeddings"
- View the results for the entire batch (see the batch sketch below)
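Under the same API assumptions, batch mode is the identical call with an array input:

```ts
// Same call with an array input; results come back in input order.
const batch = await tiny.embeddings.create({
  model: "nomic-ai/nomic-embed-text-v1.5",
  input: [
    "first line of text",
    "second line of text",
    "third line of text",
  ],
});

// Each data entry carries the embedding for the matching input line.
const vectors = batch.data.map((d) => d.embedding as number[]);
```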
- Add documents one by one to create a searchable corpus
- Enter a search query
- Click "Search" to find semantically similar documents
- Results will be ranked by similarity (the ranking step is sketched below)
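Behind the UI, search reduces to embedding the query and ranking the stored corpus vectors. The sketch below implements the ranking step with cosine similarity, the usual measure for text embeddings (the app's exact scoring may differ):

```ts
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface Doc {
  text: string;
  embedding: number[];
}

// Rank corpus documents against a query embedding, most similar first.
function search(queryEmbedding: number[], corpus: Doc[], topK = 5) {
  return corpus
    .map((doc) => ({
      text: doc.text,
      score: cosineSimilarity(queryEmbedding, doc.embedding),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```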
- Enter two texts to compare
- Click "Compare Texts"
- View the similarity score and interpretation (a comparison sketch follows)
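The comparison view applies the same measure to exactly two vectors; scores near 1.0 indicate near-identical meaning, while scores near 0 indicate unrelated texts. A sketch reusing the `cosineSimilarity` helper and the TinyLM client from the sketches above (the same API assumptions apply):

```ts
// Embed both texts in one request, then score the pair with the
// cosineSimilarity helper defined in the search sketch.
const pair = await tiny.embeddings.create({
  model: "nomic-ai/nomic-embed-text-v1.5",
  input: ["I love programming", "Coding is my passion"],
});

const [a, b] = pair.data.map((d) => d.embedding as number[]);
const score = cosineSimilarity(a, b);
console.log(`Similarity: ${score.toFixed(3)}`); // near 1.0 = very similar meaning
```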
- WebGPU acceleration provides the best performance, especially for larger embedding models
- Loading a model can take some time depending on your internet connection (models are downloaded the first time they are used)
- Processing times improve significantly after the initial runs
- For vector search over an HNSW index, try the tinkerbird library
This project uses:
- TinyLM for browser-based ML inference
- Nomic Embed as the text embedding model
- shadcn/ui for UI components