URL: https://crazyrouter.com/en/blog/text-embedding-3-small-api-guide

Text-Embedding-3-Small API Tutorial - OpenAI Embedding Model Guide - Crazyrouter

Building a semantic search engine or a RAG (Retrieval-Augmented Generation) system? Text-embedding-3-small is OpenAI's compact embedding model. It converts text into numerical vectors, which enables similarity search and content retrieval.

In this guide, you'll learn:

  • What text embeddings are and why they matter
  • How to use the text-embedding-3-small API
  • Complete code examples in Python and Node.js
  • Custom dimensions for optimized storage
  • Pricing comparison and cost optimization

What is Text-Embedding-3-Small?

Text-embedding-3-small is OpenAI's compact embedding model, released in January 2024. By default it converts text into 1536-dimensional vectors that capture semantic meaning, enabling:

  • Semantic Search: Find relevant documents based on meaning, not just keywords
  • RAG Systems: Retrieve context for LLM responses
  • Similarity Matching: Compare text similarity for recommendations
  • Clustering: Group similar documents together
  • Classification: Categorize text based on content

Model Specifications

| Specification | Value |
|---|---|
| Model Name | text-embedding-3-small |
| Default Dimensions | 1536 |
| Custom Dimensions | 256, 512, 1024, 1536 |
| Max Input Tokens | 8,191 |
| Output | Normalized vector |

Quick Start

Prerequisites

  1. Sign up at Crazyrouter
  2. Get your API key from the dashboard
  3. Python 3.8+ or Node.js 16+

Python Example

python
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-api-key",
    base_url="https://crazyrouter.com/v1"
)

# Generate embedding for a single text
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Machine learning is transforming industries worldwide."
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}") # Output: 1536
print(f"First 5 values: {embedding[:5]}")

Node.js Example

javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-crazyrouter-api-key',
  baseURL: 'https://crazyrouter.com/v1'
});

async function getEmbedding(text) {
  const response = await client.embeddings.create({
    model: 'text-embedding-3-small',
    input: text
  });

  return response.data[0].embedding;
}

// Usage
const embedding = await getEmbedding('Machine learning is amazing');
console.log(`Dimensions: ${embedding.length}`); // Output: 1536

cURL Example

bash
curl -X POST https://crazyrouter.com/v1/embeddings \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Hello world"
  }'

Response:

json
{
  "object": "list",
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 2,
    "total_tokens": 2
  },
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [-0.0020785425, -0.049085874, 0.02094679, ...]
    }
  ]
}
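
If you call the endpoint without the SDK (as in the cURL example), you have to pull the vectors out of the JSON yourself. The sketch below parses a shortened, made-up payload in the same shape as the response above. The helper name `extract_embeddings` is hypothetical, and sorting by `index` is a defensive assumption rather than something this guide says is required.

```python
import json

# Shortened sample payload in the same shape as the response above.
# The vectors are cut to 3 values purely for illustration.
raw = """
{
  "object": "list",
  "model": "text-embedding-3-small",
  "usage": {"prompt_tokens": 4, "total_tokens": 4},
  "data": [
    {"object": "embedding", "index": 1, "embedding": [0.1, 0.2, 0.3]},
    {"object": "embedding", "index": 0, "embedding": [-0.002, -0.049, 0.021]}
  ]
}
"""


def extract_embeddings(payload):
    """Return the embedding vectors ordered by their `index` field."""
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


payload = json.loads(raw)
vectors = extract_embeddings(payload)
print(f"{len(vectors)} vectors, {payload['usage']['total_tokens']} tokens used")
```

Each entry in `data` corresponds to one input text, and `usage.total_tokens` is the figure to track when estimating cost.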

Batch Embedding

Process multiple texts in a single API call for better efficiency:

python
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-api-key",
    base_url="https://crazyrouter.com/v1"
)

# Batch embedding - multiple texts at once
texts = [
    "Python is a programming language",
    "JavaScript runs in browsers",
    "Machine learning uses neural networks"
]

response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)

# Access each embedding
for i, data in enumerate(response.data):
    print(f"Text {i}: {len(data.embedding)} dimensions")

# Output:
# Text 0: 1536 dimensions
# Text 1: 1536 dimensions
# Text 2: 1536 dimensions

Custom Dimensions

Reduce storage costs by requesting smaller vectors. The dimensions parameter shortens the output, trading some accuracy for lower storage and faster search:

python
# Use 512 dimensions instead of 1536
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Your text here",
    dimensions=512  # Options: 256, 512, 1024, 1536
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}") # Output: 512

Dimension Comparison

| Dimensions | Storage (per vector, float32) | Use Case |
|---|---|---|
| 256 | 1 KB | Mobile apps, limited storage |
| 512 | 2 KB | Balanced performance |
| 1024 | 4 KB | High accuracy needs |
| 1536 | 6 KB | Maximum accuracy |
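
The per-vector sizes in the table are consistent with 4 bytes per value, which is what 32-bit floats need. As a rough sketch under that assumption (the function name `storage_bytes` is hypothetical, and index or metadata overhead in a real vector database is ignored), you can estimate raw storage for a whole corpus like this:

```python
def storage_bytes(dimensions, num_vectors=1, bytes_per_value=4):
    """Raw size of the vectors alone, assuming float32 (4 bytes per value)."""
    return dimensions * bytes_per_value * num_vectors


for dims in (256, 512, 1024, 1536):
    per_vector_kb = storage_bytes(dims) / 1024
    per_million_gb = storage_bytes(dims, num_vectors=1_000_000) / 1024**3
    print(f"{dims:>4} dims: {per_vector_kb:.0f} KB per vector, "
          f"{per_million_gb:.2f} GB per 1M vectors")
```

Storing values as float16 or quantized integers would shrink these numbers further, at some cost in precision.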

Building a Semantic Search System

Here's a complete example of building a semantic search system:

python
import numpy as np
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-api-key",
    base_url="https://crazyrouter.com/v1"
)

def get_embedding(text):
    """Get embedding for a single text"""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

def cosine_similarity(a, b):
    """Calculate cosine similarity between two vectors"""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Document database
documents = [
    "Python is great for data science and machine learning",
    "JavaScript is essential for web development",
    "Docker containers simplify deployment",
    "Kubernetes orchestrates container workloads",
    "PostgreSQL is a powerful relational database"
]

# Pre-compute embeddings for all documents
doc_embeddings = [get_embedding(doc) for doc in documents]

# Search function
def search(query, top_k=3):
    query_embedding = get_embedding(query)

    # Calculate similarities
    similarities = [
        cosine_similarity(query_embedding, doc_emb)
        for doc_emb in doc_embeddings
    ]

    # Get top results
    results = sorted(
        zip(documents, similarities),
        key=lambda x: x[1],
        reverse=True
    )[:top_k]

    return results

# Example search
results = search("How to deploy applications?")
for doc, score in results:
    print(f"Score: {score:.4f} - {doc}")

# Example output (scores are illustrative and will vary):
# Score: 0.8234 - Docker containers simplify deployment
# Score: 0.7891 - Kubernetes orchestrates container workloads
# Score: 0.6543 - PostgreSQL is a powerful relational database

Integration with Vector Databases

Pinecone Integration

python
from openai import OpenAI
from pinecone import Pinecone

# Initialize clients
client = OpenAI(
    api_key="your-crazyrouter-api-key",
    base_url="https://crazyrouter.com/v1"
)

# Pinecone client v3+ replaced pinecone.init() with the Pinecone class
pc = Pinecone(api_key="your-pinecone-key")
index = pc.Index("your-index")

def embed_and_upsert(texts, ids):
    """Embed texts and store in Pinecone"""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts
    )

    # Pair each ID with its vector; doc_id avoids shadowing the built-in id()
    vectors = [
        (doc_id, data.embedding)
        for doc_id, data in zip(ids, response.data)
    ]

    index.upsert(vectors=vectors)

def search_pinecone(query, top_k=5):
    """Search Pinecone with query embedding"""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )

    results = index.query(
        vector=response.data[0].embedding,
        top_k=top_k
    )

    return results

ChromaDB Integration

python
import chromadb
from openai import OpenAI

client = OpenAI(
    api_key="your-crazyrouter-api-key",
    base_url="https://crazyrouter.com/v1"
)

# Initialize ChromaDB
chroma_client = chromadb.Client()
collection = chroma_client.create_collection("documents")

def get_embeddings(texts):
    """Get embeddings for multiple texts"""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts
    )
    return [data.embedding for data in response.data]

# Add documents
documents = ["doc1 content", "doc2 content", "doc3 content"]
embeddings = get_embeddings(documents)

collection.add(
    embeddings=embeddings,
    documents=documents,
    ids=["doc1", "doc2", "doc3"]
)

# Query
query_embedding = get_embeddings(["search query"])[0]
results = collection.query(
    query_embeddings=[query_embedding],
    n_results=3
)

Available Embedding Models

Crazyrouter provides access to multiple OpenAI embedding models:

| Model | Dimensions | Price Ratio | Best For |
|---|---|---|---|
| text-embedding-3-small | 1536 | 0.01 | General use, best value |
| text-embedding-3-large | 3072 | 0.065 | High precision needs |
| text-embedding-ada-002 | 1536 | 0.05 | Legacy compatibility |

Pricing Comparison

| Provider | Model | Price per 1M tokens |
|---|---|---|
| OpenAI Official | text-embedding-3-small | $0.020 |
| Crazyrouter | text-embedding-3-small | $0.002 |
| OpenAI Official | text-embedding-3-large | $0.130 |
| Crazyrouter | text-embedding-3-large | $0.013 |

Pricing Disclaimer: Prices shown are for demonstration and may change. Actual billing is based on real-time prices at request time.

Cost Savings Example:

For a RAG system processing 10M tokens/month:

  • OpenAI Official: 10 × $0.020 = $0.20/month
  • Crazyrouter: 10 × $0.002 = $0.02/month
  • Savings: 90%
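
To estimate cost at other volumes, multiply the number of millions of tokens by the per-1M price from the table. The sketch below does that for text-embedding-3-small. The function name `monthly_cost` and the 1B-token volume are hypothetical, and the prices are copied from the table above, so they carry the same disclaimer.

```python
# USD per 1M tokens for text-embedding-3-small, copied from the pricing table.
PRICE_PER_MILLION = {
    "OpenAI Official": 0.020,
    "Crazyrouter": 0.002,
}


def monthly_cost(tokens, price_per_million):
    """Cost in USD for a token volume at a given per-1M-token price."""
    return tokens / 1_000_000 * price_per_million


tokens_per_month = 1_000_000_000  # hypothetical volume: 1B tokens per month

costs = {
    provider: monthly_cost(tokens_per_month, price)
    for provider, price in PRICE_PER_MILLION.items()
}
savings = 1 - costs["Crazyrouter"] / costs["OpenAI Official"]

for provider, cost in costs.items():
    print(f"{provider}: ${cost:.2f}/month")
print(f"Savings: {savings:.0%}")
```

The savings percentage depends only on the price ratio, so it stays the same at any volume.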

Best Practices

1. Batch Your Requests

python
# Good - single API call for multiple texts
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=["text1", "text2", "text3"]  # Up to 2048 texts
)

# Bad - multiple API calls
for text in texts:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )

2. Cache Embeddings

python
import hashlib

# In-memory cache: maps a hash of the text to its embedding
embedding_cache = {}

def get_embedding_cached(text):
    # Create cache key from the text content
    cache_key = hashlib.md5(text.encode()).hexdigest()

    # Return the stored vector on a cache hit
    if cache_key in embedding_cache:
        return embedding_cache[cache_key]

    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )

    embedding = response.data[0].embedding
    embedding_cache[cache_key] = embedding

    return embedding
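
The cache above keys on the text alone. If the same process ever embeds with more than one model or dimensions setting, entries for the same text would collide. One possible way around that is to fold those settings into the key, as in this sketch. The helper name `embedding_cache_key` and the use of SHA-256 are illustrative choices, not part of the original example.

```python
import hashlib


def embedding_cache_key(text, model="text-embedding-3-small", dimensions=None):
    """Build a cache key that changes when the text, model, or dimensions change."""
    payload = f"{model}|{dimensions}|{text}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_default = embedding_cache_key("Hello world")
key_512 = embedding_cache_key("Hello world", dimensions=512)
print(key_default != key_512)  # True: different settings give different keys
```

The same key also works for a persistent store such as a file or key-value database if the cache needs to survive restarts.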

3. Use Appropriate Dimensions

  • 256 dimensions: Mobile apps, IoT devices
  • 512 dimensions: Web applications with storage constraints
  • 1024 dimensions: Standard applications
  • 1536 dimensions: Maximum accuracy requirements

Frequently Asked Questions

What's the difference between text-embedding-3-small and text-embedding-3-large?

Text-embedding-3-small produces 1536-dimensional vectors and is optimized for cost-efficiency. Text-embedding-3-large produces 3072-dimensional vectors with higher accuracy but at 6.5x the cost. For most applications, text-embedding-3-small provides excellent results.

Can I reduce dimensions after generating embeddings?

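Yes. If you have not generated the embeddings yet, pass the dimensions parameter to get smaller vectors directly, which avoids transferring and storing full-size vectors. If you already have full-size vectors, you can truncate them and re-normalize the result to unit length.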
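
If full-size vectors are already stored, one option is to shorten them locally: keep the first N values and rescale the result back to unit length. The sketch below shows that step with a toy 4-value vector. The function name `shorten_embedding` is hypothetical, and this guide does not state whether the result matches the output of the dimensions parameter exactly, so treat it as an approximation to validate on your own data.

```python
import math


def shorten_embedding(embedding, dimensions):
    """Keep the first `dimensions` values and re-normalize to unit length."""
    if not 0 < dimensions <= len(embedding):
        raise ValueError("dimensions must be between 1 and len(embedding)")
    truncated = list(embedding[:dimensions])
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0:
        return truncated  # an all-zero vector cannot be normalized
    return [x / norm for x in truncated]


full = [0.5, 0.5, 0.5, 0.5]  # toy unit-length vector standing in for 1536 values
short = shorten_embedding(full, 2)
length = math.sqrt(sum(x * x for x in short))
print(len(short), round(length, 6))
```

Re-normalizing matters because dot-product similarity only equals cosine similarity when both vectors have unit length.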

How many texts can I embed in one request?

You can embed up to 2048 texts in a single API request. For large datasets, batch your requests in groups of 2048.
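
A minimal sketch of that grouping step, assuming the texts are already in a list (the helper name `chunked` and the 5,000 sample documents are made up for illustration):

```python
def chunked(items, size=2048):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


texts = [f"document {i}" for i in range(5000)]
batch_sizes = [len(batch) for batch in chunked(texts)]
print(batch_sizes)  # [2048, 2048, 904]

# Each batch would then be sent as one request, for example:
#   client.embeddings.create(model="text-embedding-3-small", input=batch)
```

Because the slices are taken in order, concatenating the per-batch results keeps the embeddings aligned with the original list.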

Are the embeddings normalized?

Yes, text-embedding-3-small returns normalized vectors (unit length), so you can use dot product instead of cosine similarity for faster computation.
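
To see why the shortcut is safe, here is a small check with two toy unit-length vectors (made-up 3-value vectors, not real embeddings). Cosine similarity divides the dot product by both vector lengths, and when each length is 1 the division changes nothing.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy unit-length vectors: 0.6**2 + 0.8**2 == 1
a = [0.6, 0.8, 0.0]
b = [0.0, 0.6, 0.8]

print(f"dot:    {dot(a, b):.4f}")
print(f"cosine: {cosine_similarity(a, b):.4f}")
```

Skipping the two norm computations per comparison adds up when a query is scored against many stored vectors.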

Getting Started

  1. Sign up at Crazyrouter
  2. Get your API key from the dashboard
  3. Install the SDK: pip install openai or npm install openai
  4. Start embedding with the code examples above

For questions, contact support@crazyrouter.com
