NodeLlamaCppConfig

Defined in: native-node-llamacpp/src/index.ts:55

readonly optional apiKey: string

Defined in: ai.matey.types/dist/types/adapters.d.ts:103

API key for authentication. Should be injected from environment or secure config.

Inherited from: Partial.apiKey


readonly optional baseURL: string

Defined in: ai.matey.types/dist/types/adapters.d.ts:108

Base URL for API endpoint. Useful for proxies or alternative endpoints.

Inherited from: Partial.baseURL


optional batchSize: number

Defined in: native-node-llamacpp/src/index.ts:69

Batch size for prompt processing. Default: 512
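As a rough illustration of what a batch size means for prompt processing, the sketch below splits a tokenized prompt into chunks of at most batchSize tokens. The helper name splitIntoBatches and the token array are hypothetical; the actual batching happens inside llama.cpp and is not described on this page.

```typescript
// Hypothetical helper: split prompt tokens into batches of at most `batchSize`.
function splitIntoBatches(tokens: number[], batchSize = 512): number[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: number[][] = [];
  for (let start = 0; start < tokens.length; start += batchSize) {
    batches.push(tokens.slice(start, start + batchSize));
  }
  return batches;
}

// A 1300-token prompt with the documented default of 512.
const promptTokens = Array.from({ length: 1300 }, (_, i) => i);
console.log(splitIntoBatches(promptTokens).map((b) => b.length)); // [ 512, 512, 276 ]
```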


readonly optional browserMode: boolean

Defined in: ai.matey.types/dist/types/adapters.d.ts:155

Enable browser-compatible mode.

⚠️ SECURITY WARNING: Enabling browser mode may expose API keys in client-side code. This option should ONLY be used for development and testing. Production applications should always use proxy servers to protect API keys.

Each provider implements browser compatibility differently:

  • Anthropic: Adds anthropic-dangerous-direct-browser-access: true header
  • Gemini: Already browser-compatible (API key in URL), this flag has no effect
  • OpenAI: Already browser-compatible, this flag has no effect
  • Other providers: May have provider-specific implementations
Default: false

Example:

    // Development only - DO NOT use in production!
    const backend = new AnthropicBackendAdapter({
      apiKey: process.env.ANTHROPIC_API_KEY,
      browserMode: true // ⚠️ Exposes API key in browser
    });

Inherited from: Partial.browserMode


readonly optional cacheModels: boolean

Defined in: ai.matey.types/dist/types/adapters.d.ts:181

Enable model list caching.

Default: true

Inherited from: Partial.cacheModels


optional contextSize: number

Defined in: native-node-llamacpp/src/index.ts:59

Context window size. Default: 2048


readonly optional custom: Record<string, unknown>

Defined in: ai.matey.types/dist/types/adapters.d.ts:131

Provider-specific configuration options.

Inherited from: Partial.custom


readonly optional debug: boolean

Defined in: ai.matey.types/dist/types/adapters.d.ts:123

Enable debug logging.

Default: false

Inherited from: Partial.debug


readonly optional defaultModel: string

Defined in: ai.matey.types/dist/types/adapters.d.ts:162

Default model to use when a request does not specify one.

Default: 'gpt-4o' for OpenAI, 'claude-3-5-sonnet-20241022' for Anthropic

Inherited from: Partial.defaultModel


optional gpuLayers: number

Defined in: native-node-llamacpp/src/index.ts:61

Number of layers to offload to GPU. 0 = CPU only. Default: 0


readonly optional headers: Record<string, string>

Defined in: ai.matey.types/dist/types/adapters.d.ts:127

Custom HTTP headers to include in requests.

Inherited from: Partial.headers


readonly optional maxRetries: number

Defined in: ai.matey.types/dist/types/adapters.d.ts:118

Maximum number of retries for transient failures.

Default: 0

Inherited from: Partial.maxRetries
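To make the retry budget concrete, here is a minimal sketch of a retry wrapper. The name withRetries and the isTransient predicate are hypothetical; which failures the adapter treats as transient, and whether it waits between attempts, is not documented here.

```typescript
// Hypothetical retry helper; the adapter's real retry policy may differ.
async function withRetries<T>(
  operation: () => Promise<T>,
  maxRetries = 0,
  isTransient: (error: unknown) => boolean = () => true,
): Promise<T> {
  let attempt = 0;
  for (;;) {
    try {
      return await operation();
    } catch (error) {
      // Give up once the retry budget is spent or the error is not retryable.
      if (attempt >= maxRetries || !isTransient(error)) throw error;
      attempt += 1;
    }
  }
}

// An operation that fails twice and then succeeds.
let calls = 0;
const flaky = async (): Promise<string> => {
  calls += 1;
  if (calls < 3) throw new Error("temporary failure");
  return "ok";
};

withRetries(flaky, 2).then((result) => console.log(result, calls)); // ok 3
```

With the documented default of 0, the operation runs exactly once and any failure is surfaced immediately.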


modelPath: string

Defined in: native-node-llamacpp/src/index.ts:57

Path to the GGUF model file. Can be relative (resolved from cwd) or absolute.
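modelPath is the only required field; the other llama.cpp-specific options on this page have documented defaults. The sketch below collects those defaults in one place. The interface LlamaCppOptions and the helper withDocumentedDefaults are hypothetical stand-ins, not the adapter's actual types, and how the adapter applies defaults internally is an assumption.

```typescript
// Hypothetical mirror of the llama.cpp-specific fields documented on this page.
interface LlamaCppOptions {
  modelPath: string;
  contextSize?: number;
  gpuLayers?: number;
  temperature?: number;
  topP?: number;
  topK?: number;
  batchSize?: number;
  threads?: number;
}

// Defaults as stated on this page. `threads` is left unset because the page
// only says it "defaults to optimal value".
const DOCUMENTED_DEFAULTS = {
  contextSize: 2048,
  gpuLayers: 0,
  temperature: 0.7,
  topP: 0.9,
  topK: 40,
  batchSize: 512,
};

// Caller-supplied values win over the documented defaults.
function withDocumentedDefaults(options: LlamaCppOptions) {
  return { ...DOCUMENTED_DEFAULTS, ...options };
}

const resolved = withDocumentedDefaults({
  modelPath: "./models/example.gguf", // placeholder path
  gpuLayers: 32,
});
console.log(resolved.contextSize, resolved.gpuLayers); // 2048 32
```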


readonly optional models: readonly (string | AIModel)[]

Defined in: ai.matey.types/dist/types/adapters.d.ts:171

Static model list, used when the provider has no model-listing endpoint or to override the remote list.

Can be either:

  • Array of model IDs (strings) - will be normalized to AIModel objects
  • Array of full AIModel objects with capabilities

Inherited from: Partial.models
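The two accepted forms can be mixed in one array. The following sketch shows one way string IDs might be normalized into objects. SketchModel is a hypothetical stand-in for AIModel (the id and capabilities fields are assumptions), and normalizeModels is not a function exported by the library.

```typescript
// Hypothetical stand-in for AIModel; the real type may carry other fields.
interface SketchModel {
  id: string;
  capabilities?: Record<string, unknown>;
}

// Strings become minimal model objects; full objects pass through unchanged.
function normalizeModels(models: readonly (string | SketchModel)[]): SketchModel[] {
  return models.map((entry) => (typeof entry === "string" ? { id: entry } : entry));
}

const normalized = normalizeModels([
  "local-model-a",
  { id: "local-model-b", capabilities: { streaming: true } },
]);
console.log(normalized.map((m) => m.id)); // [ 'local-model-a', 'local-model-b' ]
```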


readonly optional modelsCacheScope: "global" | "instance"

Defined in: ai.matey.types/dist/types/adapters.d.ts:193

Cache scope strategy.

  • 'global': Share cache across all adapter instances (default)
  • 'instance': Each adapter instance has its own cache

Default: 'global'

Inherited from: Partial.modelsCacheScope


readonly optional modelsCacheTTL: number

Defined in: ai.matey.types/dist/types/adapters.d.ts:186

Cache TTL in milliseconds.

Default: 3600000 (1 hour)

Inherited from: Partial.modelsCacheTTL
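The interaction between cache scope and TTL can be sketched as follows. ModelListCache is a hypothetical class written for illustration; the library's real cache implementation is not shown on this page and may be structured differently.

```typescript
type CacheScope = "global" | "instance";

interface CacheEntry {
  models: string[];
  expiresAt: number;
}

// Shared store used by every cache created with scope "global".
const globalStore = new Map<string, CacheEntry>();

class ModelListCache {
  private readonly store: Map<string, CacheEntry>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(scope: CacheScope = "global", ttlMs = 3600000, now: () => number = Date.now) {
    this.store = scope === "global" ? globalStore : new Map<string, CacheEntry>();
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached list, or undefined if it is missing or expired.
  get(key: string): string[] | undefined {
    const entry = this.store.get(key);
    if (!entry || entry.expiresAt <= this.now()) {
      this.store.delete(key);
      return undefined;
    }
    return entry.models;
  }

  set(key: string, models: string[]): void {
    this.store.set(key, { models, expiresAt: this.now() + this.ttlMs });
  }
}

const first = new ModelListCache("global");
const second = new ModelListCache("global");
const isolated = new ModelListCache("instance");
first.set("provider", ["model-x"]);
console.log(second.get("provider")); // [ 'model-x' ]
console.log(isolated.get("provider")); // undefined
```

Passing the clock as a parameter is a design choice made here so that expiry can be exercised without waiting for real time to pass.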


readonly optional modelsEndpoint: string

Defined in: ai.matey.types/dist/types/adapters.d.ts:176

URL endpoint for fetching models (overrides default). Used for custom model endpoints or proxies.

Inherited from: Partial.modelsEndpoint


readonly optional streaming: StreamingConfig

Defined in: ai.matey.types/dist/types/adapters.d.ts:204

Streaming configuration for this backend.

Controls how streaming responses are delivered:

  • mode: 'delta' (incremental only) or 'accumulated' (full text each chunk)
  • includeBoth: Whether to provide both delta and accumulated in chunks
  • bufferStrategy: How to buffer for accumulated mode

Default: { mode: 'delta', includeBoth: false, bufferStrategy: 'memory' }

Inherited from: Partial.streaming
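The difference between the two modes is easiest to see on a small example. The sketch below is illustrative only: SketchStreamingConfig, SketchChunk, and toChunks are hypothetical names, and the chunk field names delta and accumulated are assumptions rather than the library's actual chunk shape.

```typescript
// Hypothetical shapes; the real StreamingConfig and chunk types may differ.
interface SketchStreamingConfig {
  mode: "delta" | "accumulated";
  includeBoth: boolean;
}

interface SketchChunk {
  delta?: string;
  accumulated?: string;
}

function toChunks(deltas: string[], config: SketchStreamingConfig): SketchChunk[] {
  let buffer = ""; // simple in-memory buffer for the accumulated text
  return deltas.map((delta) => {
    buffer += delta;
    if (config.includeBoth) return { delta, accumulated: buffer };
    return config.mode === "delta" ? { delta } : { accumulated: buffer };
  });
}

const pieces = ["Hel", "lo", "!"];
console.log(toChunks(pieces, { mode: "delta", includeBoth: false }));
// [ { delta: 'Hel' }, { delta: 'lo' }, { delta: '!' } ]
console.log(toChunks(pieces, { mode: "accumulated", includeBoth: false }));
// [ { accumulated: 'Hel' }, { accumulated: 'Hello' }, { accumulated: 'Hello!' } ]
```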


optional temperature: number

Defined in: native-node-llamacpp/src/index.ts:63

Sampling temperature. Default: 0.7
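For readers unfamiliar with the term, temperature is commonly applied by dividing the logits by it before the softmax, so lower values concentrate probability on the most likely token. The sketch below shows that standard formulation; softmaxWithTemperature is a hypothetical helper, and the exact way llama.cpp applies temperature is not documented on this page.

```typescript
// Standard temperature-scaled softmax (a sketch, not the adapter's code).
function softmaxWithTemperature(logits: number[], temperature = 0.7): number[] {
  if (temperature <= 0) {
    throw new RangeError("temperature must be > 0 in this sketch");
  }
  const scaled = logits.map((logit) => logit / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((value) => Math.exp(value - max));
  const total = exps.reduce((sum, value) => sum + value, 0);
  return exps.map((value) => value / total);
}

const logits = [2, 1, 0];
const sharp = softmaxWithTemperature(logits, 0.7);
const flat = softmaxWithTemperature(logits, 1.5);
// The top token receives more probability mass at the lower temperature.
console.log(sharp[0] > flat[0]); // true
```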


optional threads: number

Defined in: native-node-llamacpp/src/index.ts:71

Number of CPU threads to use. Defaults to optimal value.


readonly optional timeout: number

Defined in: ai.matey.types/dist/types/adapters.d.ts:113

Request timeout in milliseconds.

Default: 30000

Inherited from: Partial.timeout


optional topK: number

Defined in: native-node-llamacpp/src/index.ts:67

Top-k sampling. Default: 40


optional topP: number

Defined in: native-node-llamacpp/src/index.ts:65

Top-p sampling. Default: 0.9
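Top-k and top-p both restrict which tokens are eligible for sampling: top-k keeps the k most probable tokens, and top-p keeps the smallest set whose cumulative probability reaches p. The sketch below applies top-k first and then top-p, renormalizing after each step. That ordering, the renormalization, and the names Candidate and filterTopKTopP are assumptions made for illustration; llama.cpp's own sampler chain may differ.

```typescript
interface Candidate {
  token: number; // index into the probability array
  probability: number;
}

// Rescale candidate probabilities so they sum to 1.
function renormalize(candidates: Candidate[]): Candidate[] {
  const total = candidates.reduce((sum, c) => sum + c.probability, 0);
  return candidates.map((c) => ({ ...c, probability: c.probability / total }));
}

function filterTopKTopP(probabilities: number[], topK = 40, topP = 0.9): Candidate[] {
  // Top-k: keep the k most probable tokens.
  const sorted = probabilities
    .map((probability, token) => ({ token, probability }))
    .sort((a, b) => b.probability - a.probability);
  const topKCandidates = renormalize(sorted.slice(0, topK));

  // Top-p: keep the smallest prefix whose cumulative probability reaches topP.
  const kept: Candidate[] = [];
  let cumulative = 0;
  for (const candidate of topKCandidates) {
    kept.push(candidate);
    cumulative += candidate.probability;
    if (cumulative >= topP) break;
  }
  return renormalize(kept);
}

const kept = filterTopKTopP([0.5, 0.3, 0.15, 0.05], 3, 0.8);
console.log(kept.map((c) => c.token)); // [ 0, 1 ]
```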