ContextBudgetConfig

ContextBudgetConfig defines the token allocation strategy for the LLM context window. It governs how that finite window is partitioned across the content sources that make up each LLM request. All values are token counts.

Properties

maxAutoRetrieveTokens

maxAutoRetrieveTokens: number;

Maximum tokens allocated to auto-retrieve (RAG) context injection. Default: 4,000.
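One way such a cap might be applied is sketched below: retrieved chunks are kept in ranked order until the next one would exceed the budget. This is a minimal illustration, not the library's actual behavior. The names fitChunksToBudget and approxTokens are hypothetical, and the word-count tokenizer is a crude stand-in for a real one.

```typescript
// Hypothetical helper: keep ranked chunks, in order, until the budget is spent.
// `countTokens` is an assumed caller-supplied tokenizer, not part of ContextBudgetConfig.
function fitChunksToBudget(
  chunks: string[],
  maxAutoRetrieveTokens: number,
  countTokens: (text: string) => number,
): string[] {
  const kept: string[] = [];
  let used = 0;
  for (const chunk of chunks) {
    const cost = countTokens(chunk);
    // Stop at the first chunk that would push usage past the cap.
    if (used + cost > maxAutoRetrieveTokens) break;
    kept.push(chunk);
    used += cost;
  }
  return kept;
}

// Illustration only: one "token" per whitespace-separated word.
const approxTokens = (text: string): number =>
  text.split(/\s+/).filter(Boolean).length;
```

Stopping at the first overflow, rather than skipping ahead to smaller chunks, preserves the ranking order of the retrieved results; that is a design choice of this sketch.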

maxBasePromptTokens

maxBasePromptTokens: number;

Maximum tokens allocated to the base system prompt. A value of 0 means no limit: the base prompt's token count is measured but not capped. Default: 0.
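The "0 means no limit" convention can be expressed as a small check. The helper below is a hypothetical sketch of that rule as documented for maxBasePromptTokens; the name exceedsCap is an assumption and is not part of the API.

```typescript
// Hypothetical helper: treats a cap of 0 as "no limit", as documented for
// maxBasePromptTokens. Not part of ContextBudgetConfig itself.
function exceedsCap(measuredTokens: number, maxTokens: number): boolean {
  if (maxTokens === 0) return false; // 0 disables the cap; usage is only measured
  return measuredTokens > maxTokens;
}
```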

maxExtractionTokens

maxExtractionTokens: number;

Maximum tokens allocated to the extraction snapshot section. Default: 2,000.

maxLongTermMemoryTokens

maxLongTermMemoryTokens: number;

Maximum tokens allocated to cross-session long-term memory. Default: 2,000.

maxWorkingMemoryTokens

maxWorkingMemoryTokens: number;

Maximum tokens allocated to the working memory section. Default: 2,000.

modelContextWindow

modelContextWindow: number;

Total context window size of the target model, in tokens. Default: 128,000.

responseReserve

responseReserve: number;

Tokens reserved for the LLM’s response generation. Default: 4,096.
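To see how the properties relate, the sketch below collects the documented defaults into one object and computes the worst-case number of tokens left over once the response reserve and every capped section are subtracted from the model window. It assumes the caps simply add up, which this page does not confirm. The names defaultBudget and worstCaseRemainder are hypothetical. Because the default maxBasePromptTokens is 0 (uncapped), the measured base prompt size is passed in separately.

```typescript
interface ContextBudgetConfig {
  maxAutoRetrieveTokens: number;
  maxBasePromptTokens: number;
  maxExtractionTokens: number;
  maxLongTermMemoryTokens: number;
  maxWorkingMemoryTokens: number;
  modelContextWindow: number;
  responseReserve: number;
}

// Defaults as listed on this page.
const defaultBudget: ContextBudgetConfig = {
  maxAutoRetrieveTokens: 4000,
  maxBasePromptTokens: 0, // 0 = measured, not capped
  maxExtractionTokens: 2000,
  maxLongTermMemoryTokens: 2000,
  maxWorkingMemoryTokens: 2000,
  modelContextWindow: 128000,
  responseReserve: 4096,
};

// Hypothetical helper: tokens left if every capped section is completely full.
// Assumes section caps are additive; the real allocator may differ.
function worstCaseRemainder(
  config: ContextBudgetConfig,
  measuredBasePromptTokens: number,
): number {
  const basePrompt =
    config.maxBasePromptTokens === 0
      ? measuredBasePromptTokens
      : Math.min(measuredBasePromptTokens, config.maxBasePromptTokens);
  const cappedSections =
    config.maxAutoRetrieveTokens +
    config.maxExtractionTokens +
    config.maxLongTermMemoryTokens +
    config.maxWorkingMemoryTokens;
  return (
    config.modelContextWindow -
    config.responseReserve -
    basePrompt -
    cappedSections
  );
}
```

With the defaults and a 1,200-token base prompt, worstCaseRemainder(defaultBudget, 1200) evaluates to 114,704: 128,000 minus 4,096 for the response, 1,200 for the base prompt, and 10,000 for the four capped sections. A negative result would indicate that the configured caps cannot all fit in the model window.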