ContextBudgetConfig

ContextBudgetConfig defines the token allocation strategy for the LLM context window. It governs how the finite context window is partitioned across the multiple content sources that compose each LLM request.

maxAutoRetrieveTokens: number;

Maximum tokens allocated to auto-retrieve (RAG) context injection. Default: 4,000.


maxBasePromptTokens: number;

Maximum tokens allocated to the base system prompt. A value of 0 means no limit: the base prompt is measured but not capped. Default: 0.
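
One way to read the "0 means no limit" rule is sketched below. The helper `exceedsCap` is hypothetical and is not part of the documented API; it only illustrates that a cap of 0 never rejects a measured token count, while a positive cap does.

```typescript
// Hypothetical helper (not part of the documented API).
// Returns true only when a positive cap is set and the measured size is above it.
// A cap of 0 is treated as "no limit", so it never reports an overflow.
function exceedsCap(measuredTokens: number, cap: number): boolean {
  return cap > 0 && measuredTokens > cap;
}

// With the default maxBasePromptTokens of 0, any base prompt size is accepted.
const uncapped = exceedsCap(50_000, 0);
// With an explicit cap of 1,000 tokens, a 1,500-token prompt is over budget.
const overBudget = exceedsCap(1_500, 1_000);
```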


maxExtractionTokens: number;

Maximum tokens allocated to the extraction snapshot section. Default: 2,000.


maxLongTermMemoryTokens: number;

Maximum tokens allocated to cross-session long-term memory. Default: 2,000.


maxWorkingMemoryTokens: number;

Maximum tokens allocated to the working memory section. Default: 2,000.


modelContextWindow: number;

Total context window size of the target model, in tokens. Default: 128,000.


responseReserve: number;

Tokens reserved for the LLM’s response generation. Default: 4,096.
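
The fields and defaults above can be put together as in the following minimal sketch. The interface shape is reconstructed from this page. The constant `DEFAULT_BUDGET` and the helper `remainingTokens` are hypothetical names introduced for illustration, and the arithmetic assumes the worst case in which every capped section is filled to its maximum.

```typescript
// Shape reconstructed from the fields documented on this page.
interface ContextBudgetConfig {
  maxAutoRetrieveTokens: number;
  maxBasePromptTokens: number;
  maxExtractionTokens: number;
  maxLongTermMemoryTokens: number;
  maxWorkingMemoryTokens: number;
  modelContextWindow: number;
  responseReserve: number;
}

// Hypothetical constant holding the documented default values.
const DEFAULT_BUDGET: ContextBudgetConfig = {
  maxAutoRetrieveTokens: 4000,
  maxBasePromptTokens: 0, // 0 = measured, not capped
  maxExtractionTokens: 2000,
  maxLongTermMemoryTokens: 2000,
  maxWorkingMemoryTokens: 2000,
  modelContextWindow: 128000,
  responseReserve: 4096,
};

// Hypothetical helper: tokens left over for everything else in the request
// once the response reserve, the measured base prompt, and all capped
// sections (assumed fully used) are subtracted from the context window.
function remainingTokens(cfg: ContextBudgetConfig, basePromptTokens: number): number {
  const cappedSections =
    cfg.maxAutoRetrieveTokens +
    cfg.maxExtractionTokens +
    cfg.maxLongTermMemoryTokens +
    cfg.maxWorkingMemoryTokens;
  return cfg.modelContextWindow - cfg.responseReserve - cappedSections - basePromptTokens;
}

// Example: defaults with a base prompt measured at 1,500 tokens.
const leftover = remainingTokens(DEFAULT_BUDGET, 1500);
```

With the defaults, the four capped sections sum to 10,000 tokens, so a 128,000-token window minus the 4,096-token reserve and a 1,500-token base prompt leaves 112,404 tokens.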