ai.matey.backend
Backend adapters connect to AI provider APIs. To switch providers, swap the backend adapter; your application code stays the same.
Installation
```bash
npm install ai.matey.backend
```
Overview
Backend adapters translate ai.matey’s Intermediate Representation (IR) into provider-specific API calls. This allows you to switch AI providers without changing your application code.
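As a rough illustration of that translation step, the sketch below maps one provider-neutral request onto two different payload shapes. Every type and adapter name in it (`IRRequest`, `BackendAdapterSketch`, `openAISketch`, `anthropicSketch`) is hypothetical and is not ai.matey's actual IR or adapter interface. Only the payload differences reflect the real provider formats: OpenAI-style chat keeps the system message inline, while Anthropic's Messages API takes a top-level `system` field and requires `max_tokens`.

```typescript
// Hypothetical, simplified IR for illustration only (not ai.matey's real types).
type IRMessage = { role: 'system' | 'user' | 'assistant'; content: string };
type IRRequest = { model: string; messages: IRMessage[]; maxTokens?: number };

// Hypothetical adapter contract: turn an IR request into a provider payload.
interface BackendAdapterSketch {
  readonly name: string;
  toProviderRequest(ir: IRRequest): Record<string, unknown>;
}

// OpenAI-style chat payloads keep system messages inline in `messages`.
const openAISketch: BackendAdapterSketch = {
  name: 'openai',
  toProviderRequest: (ir) => ({
    model: ir.model,
    messages: ir.messages,
    ...(ir.maxTokens !== undefined ? { max_tokens: ir.maxTokens } : {}),
  }),
};

// Anthropic's Messages API expects the system prompt as a top-level field
// and requires `max_tokens`, so the adapter reshapes the same request.
const anthropicSketch: BackendAdapterSketch = {
  name: 'anthropic',
  toProviderRequest: (ir) => ({
    model: ir.model, // model-name mapping is omitted in this sketch
    max_tokens: ir.maxTokens ?? 1024, // 1024 is an arbitrary default chosen here
    system: ir.messages
      .filter((m) => m.role === 'system')
      .map((m) => m.content)
      .join('\n'),
    messages: ir.messages.filter((m) => m.role !== 'system'),
  }),
};

const sampleRequest: IRRequest = {
  model: 'example-model',
  messages: [
    { role: 'system', content: 'Be brief.' },
    { role: 'user', content: 'Hello!' },
  ],
};

// The application builds one request; only the adapter choice changes.
for (const adapter of [openAISketch, anthropicSketch]) {
  console.log(adapter.name, JSON.stringify(adapter.toProviderRequest(sampleRequest)));
}
```

The application-facing request stays identical in both cases, which is the property that makes swapping backends cheap.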
Supported Providers (24+):
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude 3.5 Sonnet, Opus, Haiku)
- Google (Gemini 1.5 Pro, Flash)
- Groq (Llama 3, Mixtral)
- DeepSeek (V3, Chat)
- Ollama (Local models)
- Cohere, Mistral, Perplexity, Together AI, and more!
Quick Start
```typescript
import { Bridge } from 'ai.matey.core';
import { OpenAIFrontendAdapter } from 'ai.matey.frontend/openai';
import { AnthropicBackendAdapter } from 'ai.matey.backend/anthropic';

const bridge = new Bridge(
  new OpenAIFrontendAdapter(),
  new AnthropicBackendAdapter({ apiKey: process.env.ANTHROPIC_API_KEY })
);

// Write in OpenAI format, execute with Claude
const response = await bridge.chat({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }]
});
```
OpenAI Backend
Use OpenAI’s GPT models.
Installation
```typescript
import { OpenAIBackendAdapter } from 'ai.matey.backend/openai';
```
Configuration
```typescript
const backend = new OpenAIBackendAdapter({
  apiKey: process.env.OPENAI_API_KEY,    // Required
  baseURL: 'https://api.openai.com/v1',  // Optional
  organization: 'org-xxx',               // Optional
  timeout: 60000,                        // Optional (default: 60s)
  maxRetries: 3                          // Optional (default: 2)
});
```
Available Models
- GPT-4 Turbo: `gpt-4-turbo`, `gpt-4-turbo-preview`
- GPT-4: `gpt-4`, `gpt-4-0613`
- GPT-3.5 Turbo: `gpt-3.5-turbo`, `gpt-3.5-turbo-16k`
- GPT-4 Vision: `gpt-4-vision-preview`
Features
- ✅ Chat completions
- ✅ Streaming
- ✅ Function calling
- ✅ Vision (GPT-4 Vision)
- ✅ JSON mode
- ✅ Seed for reproducibility
Pricing (per 1M tokens)
| Model | Input | Output |
|---|---|---|
| GPT-4 Turbo | $10 | $30 |
| GPT-4 | $30 | $60 |
| GPT-3.5 Turbo | $0.50 | $1.50 |
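To turn the table into a per-request estimate, multiply each token count by its per-million price. A minimal sketch, assuming the prices above are still current; the `OPENAI_PRICING` map and `estimateCostUSD` helper are illustrative names, not part of ai.matey:

```typescript
// USD per 1M tokens, copied from the table above (may be out of date).
const OPENAI_PRICING = {
  'GPT-4 Turbo': { input: 10, output: 30 },
  'GPT-4': { input: 30, output: 60 },
  'GPT-3.5 Turbo': { input: 0.5, output: 1.5 },
} as const;

type PricedModel = keyof typeof OPENAI_PRICING;

// Estimate the cost of one request from its input and output token counts.
function estimateCostUSD(
  model: PricedModel,
  inputTokens: number,
  outputTokens: number
): number {
  const price = OPENAI_PRICING[model];
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// 2,000 prompt tokens and 500 completion tokens on GPT-4 Turbo:
// (2000 * 10 + 500 * 30) / 1,000,000
console.log(estimateCostUSD('GPT-4 Turbo', 2_000, 500)); // 0.035
```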
Anthropic Backend
Use Anthropic’s Claude models.
Installation
```typescript
import { AnthropicBackendAdapter } from 'ai.matey.backend/anthropic';
```
Configuration
```typescript
const backend = new AnthropicBackendAdapter({
  apiKey: process.env.ANTHROPIC_API_KEY,  // Required
  baseURL: 'https://api.anthropic.com',   // Optional
  timeout: 60000                          // Optional
});
```
Available Models
- Claude 3.5 Sonnet: `claude-3-5-sonnet-20241022` (Latest, most capable)
- Claude 3 Opus: `claude-3-opus-20240229` (Highest intelligence)
- Claude 3 Sonnet: `claude-3-sonnet-20240229` (Balanced)
- Claude 3 Haiku: `claude-3-haiku-20240307` (Fastest, cheapest)
Features
- ✅ Chat completions
- ✅ Streaming
- ✅ Tool use
- ✅ Vision
- ✅ 200K context window
- ✅ System prompts
Pricing (per 1M tokens)
| Model | Input | Output |
|---|---|---|
| Claude 3.5 Sonnet | $3 | $15 |
| Claude 3 Opus | $15 | $75 |
| Claude 3 Sonnet | $3 | $15 |
| Claude 3 Haiku | $0.25 | $1.25 |
Google Gemini Backend
Use Google’s Gemini models.
Installation
```typescript
import { GeminiBackendAdapter } from 'ai.matey.backend/gemini';
```
Configuration
```typescript
const backend = new GeminiBackendAdapter({
  apiKey: process.env.GEMINI_API_KEY,                     // Required
  baseURL: 'https://generativelanguage.googleapis.com',   // Optional
});
```
Available Models
- Gemini 1.5 Pro: `gemini-1.5-pro-latest` (Most capable)
- Gemini 1.5 Flash: `gemini-1.5-flash-latest` (Fast, efficient)
- Gemini 1.0 Pro: `gemini-1.0-pro` (Previous generation)
Features
- ✅ Chat completions
- ✅ Streaming
- ✅ Function calling
- ✅ Native multi-modal (vision, audio)
- ✅ 2M token context (Pro)
- ✅ Grounding with Google Search
Pricing (per 1M tokens)
| Model | Input | Output |
|---|---|---|
| Gemini 1.5 Pro | $3.50 | $10.50 |
| Gemini 1.5 Flash | $0.35 | $1.05 |
Groq Backend
Use Groq’s ultra-fast inference.
Installation
```typescript
import { GroqBackendAdapter } from 'ai.matey.backend/groq';
```
Configuration
```typescript
const backend = new GroqBackendAdapter({
  apiKey: process.env.GROQ_API_KEY  // Required
});
```
Available Models
- Llama 3: `llama3-70b-8192`, `llama3-8b-8192`
- Mixtral: `mixtral-8x7b-32768`
- Gemma: `gemma-7b-it`
Features
- ✅ Chat completions
- ✅ Streaming
- ✅ Ultra-fast (500+ tokens/sec)
- ⚠️ Limited tool support
Pricing
Free tier available! Very cost-effective for high-throughput use cases.
DeepSeek Backend
Use DeepSeek’s cost-effective models.
Installation
```typescript
import { DeepSeekBackendAdapter } from 'ai.matey.backend/deepseek';
```
Configuration
```typescript
const backend = new DeepSeekBackendAdapter({
  apiKey: process.env.DEEPSEEK_API_KEY  // Required
});
```
Available Models
- DeepSeek V3: `deepseek-chat` (Latest)
- DeepSeek Coder: `deepseek-coder` (Code specialist)
Features
- ✅ Chat completions
- ✅ Streaming
- ✅ Competitive quality
- ✅ Very low cost
Pricing (per 1M tokens)
| Model | Input | Output |
|---|---|---|
| DeepSeek Chat | $0.14 | $0.28 |
| DeepSeek Coder | $0.14 | $0.28 |
Up to 95% cheaper than GPT-4!
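For reference, the relative saving can be worked out from the list prices on this page. A small sketch, assuming the GPT-4 and DeepSeek Chat prices from the pricing tables and ignoring token mix, caching discounts, and later price changes; `savingsPercent` is an illustrative helper, not an ai.matey function:

```typescript
// USD per 1M tokens, taken from the pricing tables on this page.
const GPT4_PRICE = { input: 30, output: 60 };
const DEEPSEEK_CHAT_PRICE = { input: 0.14, output: 0.28 };

// Percentage saved by paying `alternative` instead of `baseline`.
function savingsPercent(baseline: number, alternative: number): number {
  if (baseline <= 0) throw new RangeError('baseline price must be positive');
  return (1 - alternative / baseline) * 100;
}

// Input tokens: (1 - 0.14 / 30) * 100
console.log(savingsPercent(GPT4_PRICE.input, DEEPSEEK_CHAT_PRICE.input).toFixed(1)); // prints 99.5
// Output tokens: (1 - 0.28 / 60) * 100
console.log(savingsPercent(GPT4_PRICE.output, DEEPSEEK_CHAT_PRICE.output).toFixed(1)); // prints 99.5
```

At these list prices the per-token saving comes out above 99%, so the 95% figure is on the conservative side.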
Ollama Backend
Use local open-source models.
Installation
```typescript
import { OllamaBackendAdapter } from 'ai.matey.backend/ollama';
```
Configuration
```typescript
const backend = new OllamaBackendAdapter({
  baseURL: 'http://localhost:11434',  // Optional (default)
  timeout: 120000                     // Optional (2 min for local inference)
});
```
Available Models
Any model supported by Ollama:
- Llama 3.2: `llama3.2`, `llama3.2:1b`
- Mistral: `mistral`, `mistral-nemo`
- Qwen: `qwen2.5:72b`
- Gemma: `gemma2`
- Phi: `phi3`
Features
- ✅ Chat completions
- ✅ Streaming
- ✅ 100% local (no API costs)
- ✅ Privacy (data never leaves your machine)
- ⚠️ Slower than cloud providers
- ⚠️ Limited tool support (model-dependent)

1. Install Ollama: https://ollama.ai
2. Pull a model:
   ```bash
   ollama pull llama3.2
   ```
3. Use with ai.matey:
   ```typescript
   const bridge = new Bridge(
     new OpenAIFrontendAdapter(),
     new OllamaBackendAdapter()
   );

   const response = await bridge.chat({
     model: 'llama3.2',
     messages: [{ role: 'user', content: 'Hello!' }]
   });
   ```
More Providers
Cohere
```typescript
import { CohereBackendAdapter } from 'ai.matey.backend/cohere';

const backend = new CohereBackendAdapter({
  apiKey: process.env.COHERE_API_KEY
});
```
Models: `command-r-plus`, `command-r`, `command`
Mistral
```typescript
import { MistralBackendAdapter } from 'ai.matey.backend/mistral';

const backend = new MistralBackendAdapter({
  apiKey: process.env.MISTRAL_API_KEY
});
```
Models: `mistral-large-latest`, `mistral-medium-latest`, `mistral-small-latest`
Perplexity
```typescript
import { PerplexityBackendAdapter } from 'ai.matey.backend/perplexity';

const backend = new PerplexityBackendAdapter({
  apiKey: process.env.PERPLEXITY_API_KEY
});
```
Models: `llama-3.1-sonar-large`, `llama-3.1-sonar-small`
Together AI
```typescript
import { TogetherBackendAdapter } from 'ai.matey.backend/together';

const backend = new TogetherBackendAdapter({
  apiKey: process.env.TOGETHER_API_KEY
});
```
Models: Wide selection of open-source models
Provider Comparison
By Use Case
Best for Production
- OpenAI - Most reliable, widely tested
- Anthropic - Excellent quality, large context
- Google Gemini - Strong multi-modal capabilities
Best for Cost Optimization
- DeepSeek - Cheapest cloud option
- Ollama - Free (local)
- Groq - Generous free tier
Best for Speed
- Groq - 500+ tokens/sec
- Gemini Flash - Very fast
- Claude Haiku - Fast cloud model
Best for Privacy
- Ollama - 100% local
- LM Studio - Local with GUI
- Self-hosted options
Feature Matrix
| Provider | Streaming | Tools | Vision | Context | Speed |
|---|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | 128K | Fast |
| Anthropic | ✅ | ✅ | ✅ | 200K | Fast |
| Gemini | ✅ | ✅ | ✅ | 2M | Fast |
| Groq | ✅ | ⚠️ | ❌ | 32K | Very Fast |
| DeepSeek | ✅ | ✅ | ❌ | 64K | Medium |
| Ollama | ✅ | ⚠️ | ⚠️ | Varies | Slow |
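The matrix can also be encoded as data and filtered by requirement, which is handy when choosing a backend programmatically. A minimal sketch, assuming the values in the table above (with "K" read as 1,000 tokens and "Varies" stored as `null`); `FEATURE_MATRIX` and `findProviders` are illustrative names, not part of ai.matey:

```typescript
type Support = 'yes' | 'partial' | 'no'; // ✅, ⚠️, ❌ in the table

interface ProviderRow {
  provider: string;
  streaming: Support;
  tools: Support;
  vision: Support;
  contextTokens: number | null; // null means "Varies"
  speed: 'Very Fast' | 'Fast' | 'Medium' | 'Slow';
}

const FEATURE_MATRIX: ProviderRow[] = [
  { provider: 'OpenAI', streaming: 'yes', tools: 'yes', vision: 'yes', contextTokens: 128_000, speed: 'Fast' },
  { provider: 'Anthropic', streaming: 'yes', tools: 'yes', vision: 'yes', contextTokens: 200_000, speed: 'Fast' },
  { provider: 'Gemini', streaming: 'yes', tools: 'yes', vision: 'yes', contextTokens: 2_000_000, speed: 'Fast' },
  { provider: 'Groq', streaming: 'yes', tools: 'partial', vision: 'no', contextTokens: 32_000, speed: 'Very Fast' },
  { provider: 'DeepSeek', streaming: 'yes', tools: 'yes', vision: 'no', contextTokens: 64_000, speed: 'Medium' },
  { provider: 'Ollama', streaming: 'yes', tools: 'partial', vision: 'partial', contextTokens: null, speed: 'Slow' },
];

interface Requirements {
  tools?: boolean;
  vision?: boolean;
  minContextTokens?: number;
}

// Return providers with full (not partial) support for every requested feature.
function findProviders(req: Requirements): string[] {
  return FEATURE_MATRIX.filter((row) => {
    if (req.tools && row.tools !== 'yes') return false;
    if (req.vision && row.vision !== 'yes') return false;
    if (req.minContextTokens !== undefined) {
      // Unknown ("Varies") context sizes are excluded rather than guessed.
      if (row.contextTokens === null || row.contextTokens < req.minContextTokens) return false;
    }
    return true;
  }).map((row) => row.provider);
}

// Vision plus at least 150K tokens of context: Anthropic and Gemini.
console.log(findProviders({ vision: true, minContextTokens: 150_000 }));
```

Treating partial support as a non-match is a deliberately strict choice for this sketch; a looser filter could accept `'partial'` for model-dependent providers such as Ollama.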
Switching Providers
Simple Switch
Switch providers by changing only the backend:
```typescript
// Before: OpenAI
const bridge = new Bridge(
  new OpenAIFrontendAdapter(),
  new OpenAIBackendAdapter({ apiKey: openaiKey })
);
```
```typescript
// After: Anthropic (only change backend!)
const bridge = new Bridge(
  new OpenAIFrontendAdapter(), // Same frontend
  new AnthropicBackendAdapter({ apiKey: anthropicKey })
);
```
Environment-Based
Use different providers for dev/prod:
```typescript
const backend = process.env.NODE_ENV === 'production'
  ? new AnthropicBackendAdapter({ apiKey: process.env.ANTHROPIC_API_KEY })
  : new OllamaBackendAdapter({ baseURL: 'http://localhost:11434' });

const bridge = new Bridge(new OpenAIFrontendAdapter(), backend);
```
Multi-Provider Fallback
Use Router for automatic failover:
```typescript
import { Router } from 'ai.matey.core';
import { OpenAIFrontendAdapter } from 'ai.matey.frontend/openai';
import { AnthropicBackendAdapter } from 'ai.matey.backend/anthropic';
import { OpenAIBackendAdapter } from 'ai.matey.backend/openai';
import { GroqBackendAdapter } from 'ai.matey.backend/groq';

const router = new Router(new OpenAIFrontendAdapter(), {
  backends: [
    new AnthropicBackendAdapter({ apiKey: process.env.ANTHROPIC_API_KEY }),
    new OpenAIBackendAdapter({ apiKey: process.env.OPENAI_API_KEY }),
    new GroqBackendAdapter({ apiKey: process.env.GROQ_API_KEY })
  ],
  strategy: 'priority',
  fallbackOnError: true
});

// Automatically tries Anthropic, then OpenAI, then Groq
const response = await router.chat({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }]
});
```
Cost Optimization
Route by Complexity
```typescript
const router = new Router(new OpenAIFrontendAdapter(), {
  backends: [
    new DeepSeekBackendAdapter({ apiKey: process.env.DEEPSEEK_API_KEY }),   // Cheap
    new GroqBackendAdapter({ apiKey: process.env.GROQ_API_KEY }),           // Fast
    new OpenAIBackendAdapter({ apiKey: process.env.OPENAI_API_KEY }),       // Powerful
    new AnthropicBackendAdapter({ apiKey: process.env.ANTHROPIC_API_KEY })  // Most capable
  ],
  strategy: 'custom',
  customStrategy: (request) => {
    // Use the serialized size of the conversation as a rough complexity proxy
    const messageLength = JSON.stringify(request.messages).length;

    if (messageLength < 100) return 0;   // DeepSeek: simple queries
    if (messageLength < 500) return 1;   // Groq: moderate queries
    if (messageLength < 2000) return 2;  // OpenAI: complex queries
    return 3;                            // Anthropic: very complex queries
  }
});
```
Potential savings: up to 90% compared to always using GPT-4, depending on how much of your traffic the cheaper backends can serve.
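Because the strategy depends only on the request, it can be pulled out and checked without calling any provider. A minimal sketch, assuming the same thresholds and backend order as above; `ChatRequest`, `BACKEND_LABELS`, and `pickBackendIndex` are hypothetical names, and the exact signature the Router expects for `customStrategy` is not shown here:

```typescript
// Hypothetical request shape for this sketch (not ai.matey's real type).
type ChatMessage = { role: string; content: string };
type ChatRequest = { messages: ChatMessage[] };

// Same order as the `backends` array in the router configuration above.
const BACKEND_LABELS = ['DeepSeek', 'Groq', 'OpenAI', 'Anthropic'] as const;

// Pick a backend index from the serialized size of the conversation.
function pickBackendIndex(request: ChatRequest): number {
  const messageLength = JSON.stringify(request.messages).length;
  if (messageLength < 100) return 0;
  if (messageLength < 500) return 1;
  if (messageLength < 2000) return 2;
  return 3;
}

// Build a single-message request whose content is `contentLength` characters.
function sampleRequestOf(contentLength: number): ChatRequest {
  return { messages: [{ role: 'user', content: 'x'.repeat(contentLength) }] };
}

for (const contentLength of [2, 300, 1_000, 5_000]) {
  const index = pickBackendIndex(sampleRequestOf(contentLength));
  console.log(`${contentLength} chars -> ${BACKEND_LABELS[index]}`);
}
```

Note that `JSON.stringify` also counts keys, quotes, and brackets (30 characters for a single `user` message), so the thresholds measure serialized size rather than prompt text length.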
Provider-Specific Features
OpenAI: JSON Mode
```typescript
const bridge = new Bridge(
  new OpenAIFrontendAdapter(),
  new OpenAIBackendAdapter({ apiKey: process.env.OPENAI_API_KEY })
);

const response = await bridge.chat({
  model: 'gpt-4-turbo', // JSON mode needs a model that supports response_format
  // OpenAI rejects json_object requests unless the prompt mentions JSON
  messages: [{ role: 'user', content: 'Return a user object as JSON' }],
  response_format: { type: 'json_object' }
});
```
Anthropic: Extended Context
```typescript
import * as fs from 'node:fs';

const bridge = new Bridge(
  new OpenAIFrontendAdapter(),
  new AnthropicBackendAdapter({ apiKey: process.env.ANTHROPIC_API_KEY })
);

// Claude supports up to 200K tokens!
const longDocument = fs.readFileSync('long-doc.txt', 'utf-8');

const response = await bridge.chat({
  model: 'claude-3-5-sonnet-20241022',
  messages: [
    { role: 'user', content: `Summarize this:\n\n${longDocument}` }
  ]
});
```
Gemini: Grounding
```typescript
const bridge = new Bridge(
  new GeminiFrontendAdapter(),
  new GeminiBackendAdapter({ apiKey: process.env.GEMINI_API_KEY })
);

const response = await bridge.chat({
  model: 'gemini-1.5-pro',
  contents: [{ role: 'user', parts: [{ text: 'Latest AI news?' }] }],
  tools: [{ google_search_retrieval: {} }] // Enable grounding
});
```
Best Practices
1. Use Environment Variables
```typescript
const backend = new AnthropicBackendAdapter({
  apiKey: process.env.ANTHROPIC_API_KEY // Don't hardcode!
});
```
2. Set Timeouts
```typescript
const backend = new OpenAIBackendAdapter({
  apiKey: process.env.OPENAI_API_KEY,
  timeout: 30000 // 30 seconds
});
```
3. Handle Errors
```typescript
// Small helper: resolve after `ms` milliseconds
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

try {
  const response = await bridge.chat(request);
} catch (error) {
  if (error.code === 'RATE_LIMIT_ERROR') {
    console.log('Rate limited, waiting...');
    await sleep(1000); // pause before the caller retries
  } else if (error.code === 'AUTH_ERROR') {
    console.error('Invalid API key');
  } else {
    console.error('Error:', error.message);
  }
}
```
4. Monitor Costs
```typescript
import { createCostTrackingMiddleware } from 'ai.matey.middleware';

bridge.use(createCostTrackingMiddleware({
  budgetLimit: 100,
  onBudgetExceeded: () => {
    console.error('Daily budget exceeded!');
  }
}));
```
See Also
- Frontend Adapters - Available input formats
- Core Package - Bridge and Router
- Middleware - Add logging, caching, etc.
- Integration Patterns - Production patterns
- Examples on GitHub - Provider examples