RuntimeConfig

Runtime configuration for model execution.

Properties

optional batchSize: number

Batch size for prompt processing.

optional contextSize: number

Context size: the maximum number of tokens in the context window.

optional gpuLayers: number

Number of model layers to offload to the GPU. Use -1 to offload all layers, or 0 to run on the CPU only.
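The gpuLayers convention can be illustrated with a small sketch. The helper `resolveGpuLayers` and its `totalLayers` parameter are hypothetical and not part of this API. Clamping other values to the model's layer count is also an assumption, not documented behavior.

```typescript
// Hypothetical helper, not part of the documented API: maps a gpuLayers
// setting to the number of layers that would be offloaded to the GPU.
function resolveGpuLayers(gpuLayers: number, totalLayers: number): number {
  if (gpuLayers === -1) return totalLayers; // -1 = all layers
  if (gpuLayers === 0) return 0;            // 0 = CPU only
  // Assumption: any other value is clamped to the range [0, totalLayers].
  return Math.max(0, Math.min(gpuLayers, totalLayers));
}
```

For a model with 32 layers, this sketch would treat -1 as 32 offloaded layers and 0 as none.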

optional keepAlive: boolean

Keep the model loaded in memory.

Default: true

optional mlock: boolean

Lock the model in memory to prevent it from being swapped out.

Default: false

optional mmap: boolean

Memory-map the model file.

Default: true

optional threads: number

Number of threads to use.

Default: the number of CPU cores
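A minimal sketch of how these properties might fit together, assuming the interface shape implied by the property list and reading the values listed under keepAlive, mlock, mmap, and threads as defaults. The helper `withDefaults`, its `cpuCores` parameter, and the example values (4096, 8) are hypothetical and chosen only for illustration.

```typescript
// Shape reconstructed from the property list above; every field is optional.
interface RuntimeConfig {
  batchSize?: number;
  contextSize?: number;
  gpuLayers?: number;
  keepAlive?: boolean;
  mlock?: boolean;
  mmap?: boolean;
  threads?: number;
}

// Hypothetical helper: fills in the defaults listed above. Fields with no
// listed default (batchSize, contextSize, gpuLayers) are left as given.
function withDefaults(config: RuntimeConfig, cpuCores: number): RuntimeConfig {
  return {
    ...config,
    keepAlive: config.keepAlive ?? true,
    mlock: config.mlock ?? false,
    mmap: config.mmap ?? true,
    threads: config.threads ?? cpuCores,
  };
}

// Example: offload every layer to the GPU with a 4096-token context.
const config: RuntimeConfig = { contextSize: 4096, gpuLayers: -1 };
const resolved = withDefaults(config, 8);
```

Using `??` rather than `||` keeps explicit `false` and `0` values intact, which matters for settings such as `mmap: false`.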