Vector Embeddings for E-commerce: A Practical Guide
A technical guide to vector embeddings for e-commerce—how they power semantic search, product recommendations, and AI customer support at scale.
The Problem with Traditional E-commerce Search
A customer searches for "lightweight summer footwear" on your store. Traditional keyword search returns nothing—because your product descriptions say "mesh sandals" and "cork-sole flats," not "lightweight summer footwear."
Same customer asks support: "Do you have shoes for hot weather?" Your chatbot doesn't understand. It's looking for keywords: "shoes," "hot," "weather." Those exact words aren't in your documentation.
This is the fundamental limitation of keyword-based systems. They match words, not meaning.
Vector embeddings solve this problem. They represent meaning mathematically, enabling systems to understand that "lightweight summer footwear" and "mesh sandals" are semantically related—even when they share no common words.
This isn't theoretical. Modern e-commerce platforms are replacing keyword search with vector-based semantic search, achieving measurably better results: higher conversion rates, fewer "no results" pages, better product discovery.
What Are Vector Embeddings?
A vector embedding is a numerical representation of text, images, or other data in a high-dimensional space. Similar concepts cluster together; dissimilar concepts stay apart.
Think of it as mapping language to coordinates. Instead of storing "running shoes" as text, you store it as something like:
```
[0.23, -0.45, 0.67, -0.12, 0.89, ...] // 768 or 1536 dimensions
```
The crucial property: semantically similar items have similar vectors. "Running shoes," "jogging sneakers," and "athletic footwear" all map to nearby points in this space—despite different words.
The Mathematics (Simplified)
Similarity is measured using cosine similarity, the cosine of the angle between two vectors. The formula:

```
similarity(A, B) = (A · B) / (||A|| × ||B||)
```
This produces a score from -1 (opposite) to +1 (identical). In practice, most semantic similarities range from 0.3 to 0.9.
Example:
- "summer sandals" vs "beach footwear": 0.82 (very similar)
- "summer sandals" vs "winter boots": 0.34 (related but different)
- "summer sandals" vs "laptop charger": 0.05 (unrelated)
The system wasn't told these items are semantically related. It learned the relationships from training on massive text corpora.
E-commerce Use Cases
Vector embeddings enable three critical e-commerce capabilities:
1. Semantic Product Search
Traditional approach: Match keywords in search query against product titles/descriptions.
Problem: "affordable wireless earbuds" doesn't match products titled "Budget Bluetooth Headphones"—even though they're exactly what the customer wants.
Vector approach: Convert query and all products to embeddings. Find products with highest similarity scores.
Result: "affordable wireless earbuds" retrieves "Budget Bluetooth Headphones" (similarity: 0.87) because the embedding model understands semantic equivalence.
According to [Zilliz's research on e-commerce embeddings](https://zilliz.com/ai-faq/how-do-i-choose-embedding-models-for-ecommerce-product-search), semantic search using embeddings can significantly improve product discovery compared to keyword-only approaches.
2. Product Recommendations
Traditional approach: Collaborative filtering ("customers who bought X also bought Y") or rule-based ("same category").
Vector approach: Represent products as embeddings based on descriptions, attributes, and customer behavior. Find similar products in vector space.
Advantage: Discovers non-obvious relationships. A customer viewing hiking boots might get recommended water bottles, trail mix, and moisture-wicking socks—items that cluster near "hiking" in embedding space, even if they're different product categories.
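As a sketch, a "more like this" query can be a single pgvector statement against a products table like the one built in the implementation section below:

```sql
-- Sketch: products nearest to a source product in embedding space
SELECT p.id, p.name, 1 - (p.embedding <=> src.embedding) AS similarity
FROM products p,
     (SELECT embedding FROM products WHERE id = $1) AS src
WHERE p.id <> $1
ORDER BY p.embedding <=> src.embedding
LIMIT 5;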
3. AI Customer Support
Traditional approach: Rule-based chatbots or keyword matching in FAQs.
Vector approach: Convert customer questions and all documentation to embeddings. Retrieve most relevant content, generate contextual answers.
This is Retrieval-Augmented Generation (RAG)—covered in depth in our [RAG explainer](/blog/rag-explained-ai-chatbots). The embedding layer ensures the chatbot finds relevant information even when customers phrase questions differently than your documentation.
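A minimal sketch of that flow, assuming an OpenAI client and a hypothetical searchDocs() retrieval helper (analogous to the searchProducts() function built later in this guide):

```typescript
// Hypothetical retrieval helper: top-k documentation chunks by embedding similarity
declare function searchDocs(query: string, k: number): Promise<{ content: string }[]>;

async function answerSupportQuestion(question: string): Promise<string> {
  const docs = await searchDocs(question, 3);
  const context = docs.map(d => d.content).join('\n---\n');

  // Generate an answer grounded in the retrieved documentation
  const completion = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      { role: 'system', content: `Answer using only this documentation:\n\n${context}` },
      { role: 'user', content: question },
    ],
  });
  return completion.choices[0].message.content ?? '';
}
```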
Real-world impact: Customer service teams using RAG systems report [95% accuracy rates in AI responses](https://mehmetozkaya.medium.com/designing-e-shop-customer-support-using-rag-2f2ba8a760d6), compared to just 60% with standard keyword-based chatbots.
Choosing an Embedding Model
The embedding model determines how well your system understands semantic similarity. Wrong choice means poor search results and bad recommendations.
Text Embedding Models: Key Players
OpenAI text-embedding-3-large
- Dimensions: 3072 (truncatable via Matryoshka, e.g. to 1536 or 1024)
- Strengths: Strong general-purpose accuracy, handles typos well
- Cost: $0.13 per 1M tokens
- Best for: General e-commerce search, broad product catalogs
- [Performance details from AIMMultiple research](https://research.aimultiple.com/embedding-models/)
OpenAI text-embedding-3-small
- Dimensions: 1536 (truncatable to 512)
- Strengths: Fast inference, low cost
- Cost: $0.02 per 1M tokens
- Best for: Real-time applications, budget-conscious deployments
- Trade-off: Slightly lower accuracy than -large variant
Cohere embed-english-v3.0
- Dimensions: 1024 or 384 (configurable)
- Strengths: Excellent fine-grained differentiation, works with rerankers
- Cost: $0.50 per 1M tokens (1024-dim)
- Best for: Precise product matching, when combined with reranking
- Limitation: Lower typo tolerance than OpenAI
- [Comparison from MyScale](https://www.myscale.com/blog/best-embedding-models-semantic-search-comparison/)
Mistral embed
- Dimensions: Variable
- Strengths: Highest accuracy in some benchmarks (77.8%)
- Best for: Maximum retrieval precision
- Note: Emerging model; less widely deployed than OpenAI/Cohere
Multimodal Models: Text + Images
E-commerce often requires understanding both product descriptions AND images. Multimodal models handle both.
Amazon Titan Multimodal Embeddings G1
- Strengths: Processes text and images into unified embedding space
- Use case: Search query "red heels" matches both product descriptions containing "red heels" AND actual images of red heels
- Advantage: Single query retrieves text-based and visual matches
According to [Norah Sakal's practical guide to AWS Titan](https://norahsakal.com/blog/vectorizing-ecommerce-product-data-with-aws-titan-a-practical-guide/), multimodal embeddings enable customers to search by uploading images or combining visual and textual criteria.
Model Selection Criteria
Accuracy vs Speed
- 768+ dimensions: Better accuracy, slower search
- 384 dimensions: Faster search, slight accuracy loss
- Solution: Use approximate nearest neighbor libraries (FAISS, Annoy) to optimize high-dimensional search
Domain Specificity
- Generic models (trained on general text): Work for most e-commerce
- Domain-specific models (trained on e-commerce data): Better for specialized catalogs
- Practical: Fine-tuning on your product catalog improves results but requires ML expertise
Cost at Scale

Processing 1 million product descriptions (average 100 tokens each):
- OpenAI text-embedding-3-small: $2
- OpenAI text-embedding-3-large: $13
- Cohere embed-english-v3.0: $50
For catalogs with millions of SKUs, these costs add up. Budget accordingly.
Vector Database Options
Once you have embeddings, you need to store and search them efficiently. This is where vector databases come in.
The Core Challenge
Traditional databases use exact matching: `WHERE product_name = 'running shoes'`. Vector databases use similarity search: "find the 10 products with embeddings most similar to this query embedding."
This requires specialized data structures (HNSW, IVF) that can search billions of high-dimensional vectors in milliseconds.
pgvector: PostgreSQL Extension
What it is: Adds vector search to standard PostgreSQL.
Strengths:
- Use existing Postgres infrastructure—no new systems
- Familiar SQL syntax: `ORDER BY embedding <=> query_embedding`
- Hybrid queries combining vector search with traditional filters
Limitations:
- Realistically handles ~1-10 million vectors before performance degrades
- Sub-10ms latency difficult at scale
- Not optimized for billions of vectors
Best for: Small to medium catalogs (<5M products), teams already using Postgres, hybrid queries with complex filters
Cost: Free (open-source extension), runs on your Postgres instance
[PostgreSQL vector search guide from Northflank](https://northflank.com/blog/postgresql-vector-search-guide-with-pgvector) provides implementation details.
Pinecone: Managed Vector Database
What it is: Cloud-native, purpose-built vector database.
Strengths:
- Handles billions of vectors
- Consistent low latency (sub-50ms at scale)
- Zero infrastructure management
- Built-in filtering and metadata support
Limitations:
- Vendor lock-in (proprietary)
- Higher cost than self-hosted alternatives
- Less flexibility than open-source options
Best for: Fast time-to-market, large-scale catalogs (10M+ products), teams prioritizing reliability over cost
Cost: Starts ~$70/month, scales with usage
Weaviate: Open-Source Vector Database
What it is: Schema-based vector database with hybrid search capabilities.
Strengths:
- Combines vector search with keyword search (hybrid retrieval)
- GraphQL interface (developer-friendly)
- Self-hostable (avoid vendor lock-in)
- Built-in classification and object relationships
Use case: An e-commerce marketplace with 15M SKUs reported that Weaviate delivered 22% lower monthly cost than Pinecone at steady traffic, with hybrid filtering and an open-source fallback as added advantages.
Best for: Hybrid search requirements, teams wanting open-source flexibility, complex product relationships
Cost: Free (self-hosted) or managed cloud ($25+/month)
[Vector database comparison from wearemicro.co](https://wearemicro.co/vector-database-comparison/) benchmarks Pinecone vs Weaviate vs others.
Decision Matrix
| Database | Best For | Scale Limit |
|----------|----------|-------------|
| pgvector | SQL familiarity, hybrid queries | ~10M vectors |
| Pinecone | Managed reliability, fast deployment | Billions |
| Weaviate | OSS flexibility, hybrid search | Billions |
Common strategy: Start with pgvector. Migrate to Weaviate or Pinecone when scale or latency demands it.
Implementation: Semantic Search Example
Here's how to build semantic product search with embeddings. This example uses OpenAI embeddings and pgvector, but the pattern applies to other stacks.
Step 1: Install Dependencies
```bash
npm install openai pg
```
Enable pgvector extension in your PostgreSQL database:
```sql
CREATE EXTENSION vector;
```
Step 2: Create Products Table with Vector Column
```sql
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL,
  description TEXT NOT NULL,
  price DECIMAL(10,2),
  embedding vector(1536) -- text-embedding-3-large, truncated to 1536 dimensions
);

-- Index for fast similarity search
CREATE INDEX ON products USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
```
Step 3: Generate Embeddings for Products
```typescript
import OpenAI from 'openai';
import { Pool } from 'pg';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const db = new Pool({ connectionString: process.env.DATABASE_URL });

async function embedProducts() {
  const { rows: products } = await db.query(
    'SELECT id, name, description FROM products WHERE embedding IS NULL'
  );

  for (const product of products) {
    // Combine name and description for richer embedding
    const text = `${product.name}. ${product.description}`;

    const response = await openai.embeddings.create({
      model: 'text-embedding-3-large',
      input: text,
      dimensions: 1536, // match the vector(1536) column
    });

    const embedding = response.data[0].embedding;

    await db.query(
      'UPDATE products SET embedding = $1 WHERE id = $2',
      [JSON.stringify(embedding), product.id]
    );
  }
}
```
Step 4: Semantic Search Function
```typescript
async function searchProducts(query: string, limit: number = 10) {
  // Generate embedding for search query
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-large',
    input: query,
    dimensions: 1536, // must match the stored embedding dimensions
  });

  const queryEmbedding = response.data[0].embedding;

  // Find products with most similar embeddings
  const { rows } = await db.query(
    `
    SELECT
      id, name, description, price,
      1 - (embedding <=> $1) AS similarity
    FROM products
    WHERE 1 - (embedding <=> $1) > 0.7 -- Minimum similarity threshold
    ORDER BY embedding <=> $1
    LIMIT $2
    `,
    [JSON.stringify(queryEmbedding), limit]
  );

  return rows;
}
```
Step 5: Use in Your Application
```typescript
const results = await searchProducts('lightweight summer footwear');

// Results might include:
// - Cork Sandals (similarity: 0.84)
// - Mesh Beach Flats (similarity: 0.81)
// - Espadrilles (similarity: 0.78)
// Even though none contain exact query words
```
Key Implementation Details
Similarity threshold (0.7): Filters out low-quality matches. Tune based on your data:
- Too low (0.5): Retrieves marginally relevant products
- Too high (0.9): Misses good matches
- Sweet spot typically: 0.65-0.75
Operator (<=>): pgvector's cosine distance operator. Lower distance = higher similarity. That's why `1 - (embedding <=> $1)` converts distance to similarity score.
Batch processing: The example processes products one at a time. In production, batch embed requests (up to 2,048 texts per OpenAI API call) to reduce latency and cost.
Hybrid Search: Best of Both Worlds
Pure semantic search has a weakness: it can miss exact matches. Search for "SKU-12345" and vector search might return similar products instead of that exact SKU.
Solution: combine vector search with keyword search.
Hybrid Search Implementation
```sql
SELECT
  id, name, description, price,
  -- Combine semantic similarity and keyword matching
  (1 - (embedding <=> $1)) * 0.7 +  -- 70% weight to semantic
  ts_rank(
    to_tsvector(name || ' ' || description),
    plainto_tsquery($2)
  ) * 0.3                           -- 30% weight to keywords
  AS combined_score
FROM products
WHERE
  -- Semantic threshold
  1 - (embedding <=> $1) > 0.6
  OR
  -- Keyword match
  to_tsvector(name || ' ' || description) @@ plainto_tsquery($2)
ORDER BY combined_score DESC
LIMIT 10;
```
This approach ensures:
- Exact SKU/model numbers get retrieved (keyword component)
- Semantically similar items rank highly (embedding component)
- Best results when query contains both specific terms and natural language
One practitioner quoted in [wearemicro.co's vector database comparison](https://wearemicro.co/vector-database-comparison/) put it plainly: "Weaviate's hybrid search saved us from building a separate Elasticsearch cluster. That alone justified the switch for our e-commerce search."
Fine-Tuning vs Top-Tuning
Generic embedding models work well, but fine-tuning on your specific product catalog can improve accuracy.
Fine-Tuning: Maximum Accuracy, Maximum Cost
What it is: Retrain the entire embedding model on your e-commerce dataset.
Process:
1. Collect training data (product pairs, search queries → clicked products)
2. Update all model parameters
3. Deploy custom model
Advantages:
- Learns domain-specific terminology ("pump" in hardware vs cosmetics)
- Captures your specific product relationships
- Best possible accuracy for your catalog
Disadvantages:
- Requires ML expertise
- Computationally expensive (GPU clusters, days of training)
- Ongoing maintenance (retrain as catalog changes)
When worth it: Large catalogs (100k+ SKUs) with specialized terminology or unique product relationships.
Top-Tuning: Middle Ground
What it is: Freeze the base embedding model, train only a lightweight classifier on top.
Advantages:
- Much faster than full fine-tuning
- Requires fewer resources (can run on CPUs)
- Easier to deploy
Disadvantages:
- Less improvement than full fine-tuning
- Still requires labeled training data
When worth it: Medium catalogs (10k-100k SKUs) where generic embeddings underperform but full fine-tuning is overkill.
Practical Recommendation
Start with generic models. OpenAI text-embedding-3-large or Cohere embed-english-v3.0 work well for most e-commerce use cases.
Fine-tune only if:
- You have >50k products with rich metadata
- Generic models demonstrably fail on your domain
- You have ML engineering resources
- You can maintain ongoing retraining
According to [Zilliz's e-commerce personalization guide](https://zilliz.com/learn/leveraging-vector-databases-for-next-level-ecommerce-personalization), most e-commerce platforms see sufficient improvements from generic models with proper chunking and retrieval strategies.
Performance Optimization
Vector search at scale requires optimization. Here's what matters:
Approximate Nearest Neighbor (ANN) Algorithms
Exact nearest neighbor search is O(n)—compare query to every vector. For 10 million products, that's too slow.
ANN algorithms trade slight accuracy for massive speed gains:
HNSW (Hierarchical Navigable Small World)
- Builds a multi-layer graph structure
- Search complexity: O(log n)
- Typically 95%+ recall with 10-100x speedup
- Used by: Weaviate, Pinecone, pgvector (built in since 0.5.0)
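With pgvector 0.5+, for example, building an HNSW index is one statement (the parameter values shown are pgvector's defaults):

```sql
-- HNSW index: m = graph connectivity, ef_construction = build-time search width
CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
```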
IVF (Inverted File Index)
- Clusters vectors, searches only relevant clusters
- Faster than HNSW for very large datasets (100M+ vectors)
- Used by: FAISS, Milvus
Dimensionality Reduction
Smaller embeddings mean faster search. Options:
Matryoshka Representation Learning (OpenAI text-embedding-3):
- Embeddings front-load important information
- Truncate 1536-dim to 512-dim with minimal accuracy loss
- 3x faster search, 3x less storage
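With OpenAI's text-embedding-3 models, truncation is a single request parameter; a minimal sketch:

```typescript
// Request a truncated Matryoshka embedding directly from the API
const response = await openai.embeddings.create({
  model: 'text-embedding-3-large',
  input: 'mesh sandals',
  dimensions: 512, // truncate from the native 3072 dimensions
});
```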
PCA (Principal Component Analysis):
- Reduce dimensions mathematically
- Requires recomputing for new embeddings
- Works with any embedding model
Trade-off: 768-dim to 384-dim typically costs 2-5% recall. Measure on your data.
Caching
Embed query once, reuse for similar queries:
- "running shoes" and "running shoe" produce nearly identical embeddings
- Cache query embeddings for ~5 minutes
- Reduces API costs and latency
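A minimal in-memory sketch of this caching (a production deployment would more likely use Redis or similar; the openai client is the one set up in the implementation section):

```typescript
// Cache query embeddings for 5 minutes, keyed on the normalized query text
const queryCache = new Map<string, { embedding: number[]; expires: number }>();
const TTL_MS = 5 * 60 * 1000;

async function embedQueryCached(query: string): Promise<number[]> {
  const key = query.trim().toLowerCase();
  const hit = queryCache.get(key);
  if (hit && hit.expires > Date.now()) return hit.embedding;

  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: key,
  });
  const embedding = response.data[0].embedding;
  queryCache.set(key, { embedding, expires: Date.now() + TTL_MS });
  return embedding;
}
```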
Pre-compute product embeddings:
- Generate embeddings when products are added/updated
- Never embed at search time
- Store in vector database for instant retrieval
Batch Processing
OpenAI allows up to 2,048 texts per embedding API call:
```typescript
// Inefficient: 1000 API calls
for (const product of products) {
  await embed(product.description);
}

// Efficient: 1 API call (if products.length <= 2048)
const embeddings = await openai.embeddings.create({
  model: 'text-embedding-3-large',
  input: products.map(p => p.description),
});
```
This is orders of magnitude faster and eliminates per-request overhead (token costs are the same either way).
Real-World Considerations
Cost at Scale
Embedding 1 million products (avg 200 tokens each):
- Initial embedding: $26 (text-embedding-3-large)
- Re-embedding on updates: Incremental based on change rate
If 10% of products update weekly:
- Weekly cost: $2.60
- Annual: ~$135
Search queries (assuming 100k queries/day, avg 10 tokens):
- Daily: $0.13
- Annual: ~$47
Total annual embedding cost for this example: ~$180. Usually negligible compared to infrastructure costs.
Latency Budget
Typical semantic search latency breakdown:
1. Embed query: 50-150ms (OpenAI API)
2. Vector search: 10-50ms (pgvector/Pinecone)
3. Total: 60-200ms
For user-facing search, aim for <200ms total. This usually means:
- Fast embedding model (text-embedding-3-small)
- Optimized vector database (indexed, ANN)
- Geographic proximity (API and DB in same region)
Handling Updates
Products change constantly. Embedding strategy:
Full re-embed: When description/title changes significantly
Skip re-embedding: Minor changes (price, stock status)
Batch updates: Re-embed changed products nightly

Don't re-embed the entire catalog on every change; it's expensive and unnecessary.
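One way to implement this, as a sketch: store a hash of the text that feeds each embedding and re-embed only when that hash changes (the Product shape here is illustrative):

```typescript
import { createHash } from 'crypto';

interface Product { name: string; description: string; category: string; }

// Hash only the fields that feed the embedding; price and stock changes won't alter it
function embeddingHash(p: Product): string {
  const text = `${p.name}. ${p.description}. ${p.category}`;
  return createHash('sha256').update(text).digest('hex');
}

// Re-embed only when the semantic content actually changed
function needsReembedding(p: Product, storedHash: string): boolean {
  return embeddingHash(p) !== storedHash;
}
```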
Multi-Language Support
E-commerce often spans languages. Options:
Multilingual embedding models:
- Cohere embed-multilingual-v3.0 (100+ languages)
- OpenAI text-embedding-3 (trained on multilingual data)
Advantage: Single embedding space. "chaussures de course" (French) retrieves "running shoes" (English) products.
Limitation: Slightly lower accuracy than language-specific models.
For multi-region stores, multilingual embeddings eliminate the need to maintain separate search indices per language.
Common Pitfalls
Pitfall 1: Embedding the Wrong Text
Wrong:

```typescript
const embedding = await embed(product.id); // "SKU-12345"
```
IDs have no semantic meaning. The embedding is useless.
Right:

```typescript
const text = `${product.name}. ${product.description}. ${product.category}`;
const embedding = await embed(text);
```
Combine semantically rich fields.
Pitfall 2: Not Normalizing Embeddings
Some models return normalized embeddings (length=1), others don't. If using dot product similarity, normalization matters.
Check:

```typescript
const length = Math.sqrt(embedding.reduce((sum, val) => sum + val * val, 0));
console.log(length); // Should be ~1.0 for normalized
```
If not normalized, normalize before storing:

```typescript
const normalized = embedding.map(val => val / length);
```
Note: Cosine similarity doesn't require normalization (it's built into the formula). But dot product does.
Pitfall 3: Ignoring Retrieval Quality
High similarity scores don't guarantee good results. Monitor:
- Precision: Are retrieved products actually relevant?
- Recall: Do relevant products get retrieved?
Use clickthrough data to evaluate:
- Customers search "summer dress"
- Do they click on retrieved products?
- If not, your embeddings or retrieval strategy need tuning
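A sketch of scoring retrieval from click logs (the data shapes are assumptions):

```typescript
// Precision@k: what fraction of the top-k retrieved products did the customer click?
function precisionAtK(retrievedIds: number[], clickedIds: Set<number>, k: number): number {
  const topK = retrievedIds.slice(0, k);
  if (topK.length === 0) return 0;
  const clicked = topK.filter(id => clickedIds.has(id)).length;
  return clicked / topK.length;
}

// Example: 2 of the top 5 results were clicked
console.log(precisionAtK([12, 7, 33, 91, 5], new Set([7, 5]), 5)); // 0.4
```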
Pitfall 4: Over-Relying on Embeddings
Vector search excels at semantic similarity but fails at:
- Exact matches (SKU numbers, model codes)
- Filtering (price ranges, categories, availability)
- Sorting (price, popularity, recency)
Use hybrid approaches: embeddings for relevance, traditional queries for filters and exact matches.
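In pgvector terms, that means a vector ORDER BY combined with ordinary WHERE clauses; a sketch (the in_stock column is hypothetical):

```sql
-- Semantic relevance for ranking; traditional predicates for hard constraints
SELECT id, name, price, 1 - (embedding <=> $1) AS similarity
FROM products
WHERE price BETWEEN $2 AND $3
  AND in_stock = TRUE  -- hypothetical availability column
ORDER BY embedding <=> $1
LIMIT 10;
```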
Measuring Success
Track these metrics to evaluate embedding-based search:
Search Metrics:
- Zero-results rate: % of queries returning no results (should decrease)
- Click-through rate: % of searches followed by clicks (should increase)
- Conversion rate: % of searches leading to purchases (ultimate goal)
Technical Metrics:
- Latency: p50, p95, p99 search response times
- Retrieval accuracy: Are top-10 results actually relevant?
- Embedding coverage: % of products with embeddings (should be 100%)
Cost Metrics:
- API costs (embedding requests)
- Infrastructure costs (vector database)
- Total cost per 1000 searches
A/B test semantic search against existing keyword search. Measure impact on conversion and revenue, not just technical metrics.
The Bottom Line
Vector embeddings enable e-commerce systems to understand meaning, not just keywords. This translates directly to better customer experiences:
Better search: "lightweight summer footwear" finds mesh sandals, even with zero keyword overlap.
Better recommendations: Hiking boots suggest water bottles and trail mix—semantically related, not just same category.
Better support: Customer asks "Do you ship overseas?" Chatbot finds "international shipping" documentation—same meaning, different words.
The technology is proven. Costs are manageable. Implementation is increasingly straightforward.
Start with generic embedding models (OpenAI or Cohere), standard vector database (pgvector or Weaviate), and basic semantic search. Measure results. Optimize from there.
Most e-commerce platforms see measurable improvements in search conversion rates within weeks of implementing vector-based semantic search. The question isn't whether embeddings work—it's when you'll deploy them.
Sources
- [Zilliz: How to Choose Embedding Models for E-commerce Product Search](https://zilliz.com/ai-faq/how-do-i-choose-embedding-models-for-ecommerce-product-search)
- [Meilisearch: What Are Vector Embeddings? A Complete Guide](https://www.meilisearch.com/blog/what-are-vector-embeddings)
- [Datos: Product Discovery in E-Commerce Powered by Vector Embeddings](https://datos.live/blog/product-discovery-in-e-commerce-powered-by-vector-embeddings/)
- [Norah Sakal: Vectorizing E-commerce Product Data with AWS Titan, A Practical Guide](https://norahsakal.com/blog/vectorizing-ecommerce-product-data-with-aws-titan-a-practical-guide/)
- [AIMMultiple: Embedding Models Comparison - OpenAI vs Gemini vs Cohere](https://research.aimultiple.com/embedding-models/)
- [MyScale: Best Embedding Models for Semantic Search Comparison](https://www.myscale.com/blog/best-embedding-models-semantic-search-comparison/)
- [Medium: Comparing Cohere, Amazon Titan, and OpenAI Embedding Models](https://medium.com/@aniketpatil8451/comparing-cohere-amazon-titan-and-openai-embedding-models-a-deep-dive-b7a5c116b6e3)
- [wearemicro.co: Vector Database Comparison - Pinecone vs Weaviate vs Qdrant vs pgvector](https://wearemicro.co/vector-database-comparison/)
- [Northflank: PostgreSQL Vector Search Guide with pgvector](https://northflank.com/blog/postgresql-vector-search-guide-with-pgvector)
- [DataCamp: The 7 Best Vector Databases in 2025](https://www.datacamp.com/blog/the-top-5-vector-databases)
- [Pinecone: Vector Similarity Explained](https://www.pinecone.io/learn/vector-similarity/)
- [GeeksforGeeks: Cosine Similarity](https://www.geeksforgeeks.org/dbms/cosine-similarity/)
- [Voiceflow: Why Semantic Search Matters For Enterprises](https://www.voiceflow.com/blog/semantic-search)
- [SayOne: How Semantic Search is Revolutionizing eCommerce in 2025](https://www.sayonetech.com/blog/how-semantic-search-revolutionizing-ecommerce-2025/)
- [Medium: Designing E-Shop Customer Support Using RAG](https://mehmetozkaya.medium.com/designing-e-shop-customer-support-using-rag-2f2ba8a760d6)
- [Zilliz: Leveraging Vector Databases for E-commerce Personalization](https://zilliz.com/learn/leveraging-vector-databases-for-next-level-ecommerce-personalization)