Model

Overview

Llumen allows you to configure individual models with specific capabilities and parameters. This enables fine-tuned control over how each AI model behaves.

Configuration Format

Model configurations are stored in TOML format. Each model has:

  • Display name - Human-readable name shown in the UI
  • Model ID - Identifier used by the API provider
  • Capabilities - What the model supports (optional)
  • Parameters - Inference settings (optional)

Basic Configuration

Minimal Example

display_name = "GPT-OSS 20B"
# OpenRouter suffixes are supported
model_id = "openai/gpt-oss-20b:nitro"

Complete Example

display_name = "Claude 4.5 Sonnet"
model_id = "anthropic/claude-4.5-sonnet"

[capability]
image = true # upload image
audio = false # upload audio
ocr = "Mistral"
tool = true
json = true
reasoning = true

[parameter]
temperature = 0.7
top_p = 0.9
top_k = 40
repeat_penalty = 1.1

Model Capabilities

Configure which features the model supports. Image generation is always auto-detected (OpenRouter only).

note

You don't need this section if you are using OpenRouter. The OpenRouter API allows Llumen to detect capabilities for you.

Input/Upload

Set image or audio to true to override auto-detection.
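
For example, to force image upload on and audio upload off regardless of auto-detection:

[capability]
image = true # force-enable image upload
audio = false # disable audio upload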

Structured output

  • When set to true: deep research mode is more accurate. This eliminates errors like "Here is research..." is not a valid plan.
  • When set to false: deep research mode retries once on error.
  • When not set: Llumen will guess whether the model supports it.
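
To declare structured output support explicitly rather than letting Llumen guess:

[capability]
json = true # model reliably produces structured (JSON) output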

OCR Engine

[capability]
ocr = "Mistral" # Options: "Native", "Text", "Mistral", "Disabled"

Tool Use

  • When set to true: search and deep research modes are enabled.
  • When set to false: normal mode only.
  • When not set: Llumen will guess whether the model supports it.
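
For example, to declare tool support explicitly and unlock search and deep research modes:

[capability]
tool = true # enables search and deep research modes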

Reasoning

  • When set to true: reasoning is enabled; the model generates nothing if it doesn't support reasoning.
  • When set to false: reasoning is disabled.
  • When not set: Llumen will guess whether the model supports it, enabling reasoning if supported.
note

Llumen supports interleaved thinking (if the model supports it).
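
For example, to request reasoning explicitly (keeping in mind that a model without reasoning support will generate nothing):

[capability]
reasoning = true # request reasoning output from the model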

Model Parameters

Fine-tune inference behavior:

Temperature

[parameter]
temperature = 0.7 # Range: 0.0 - 2.0

Examples:

# For coding assistance
temperature = 0.2

# For creative writing
temperature = 0.9

# For balanced chat
temperature = 0.7

Top P (Nucleus Sampling)

[parameter]
top_p = 0.9 # Range: 0.0 - 1.0

Controls diversity by limiting token selection:

  • 0.1 - Very focused, predictable
  • 0.5 - Moderate diversity
  • 0.9 - High diversity (recommended)
  • 1.0 - All tokens considered
note

Use either temperature OR top_p, not both. OpenRouter recommends top_p.

Top K

[parameter]
top_k = 40 # Range: 1 - 100+

Limits sampling to the K most likely tokens; lower values produce more focused, predictable output.

Repeat Penalty

[parameter]
repeat_penalty = 1.1 # Range: 1.0 - 2.0

Reduces repetition in responses:

  • 1.0 - No penalty (repetitive)
  • 1.1 - Light penalty (recommended)
  • 1.2-1.3 - Moderate penalty
  • 1.5+ - Strong penalty (may affect quality)

Configuring Models in Llumen

Via Web Interface

  1. Log in to Llumen
  2. Go to Settings -> OpenRouter
  3. Add or edit model configurations
  4. Save changes