plugins/hooks/context_compressor/context_compressor
Context compression: summarizes the middle of a conversation so the history stays within the model's context window. The head (system message and early context) and the tail (recent messages) are protected; the middle is replaced with a conversation summary.
Types
pub type CompressionPlan {
  Compress(
    head: List(completions.Message),
    middle: List(completions.Message),
    tail: List(completions.Message),
  )
  NoCompression(history: List(completions.Message))
}
Constructors
- Compress(head: List(completions.Message), middle: List(completions.Message), tail: List(completions.Message))
- NoCompression(history: List(completions.Message))
Cooldown tracker for compression failures. Prevents retry spam when the summarizer LLM is unavailable.
pub type CooldownState {
  CoolingDown(next_retry_at: Int)
  Ready
}
Constructors
- CoolingDown(next_retry_at: Int)
- Ready
Values
pub fn apply_compression(
  history: List(completions.Message),
  protect_head: Int,
  protect_tail: Int,
  summary: String,
) -> List(completions.Message)
Apply a summary to replace middle messages. Returns a new history: head + summary SystemMessage + tail, with orphaned tool_results removed.
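The head + summary + tail shape described above can be sketched as follows. This is a minimal illustration, not the module's implementation: `naive_apply` is a hypothetical name, the message type is left generic so the block compiles on its own, the caller supplies an already-built summary message, and removal of orphaned tool_results is omitted.

```gleam
import gleam/int
import gleam/list

// Hypothetical helper: keeps the head and tail, puts one summary message
// between them. Generic over `msg` instead of completions.Message.
pub fn naive_apply(
  history: List(msg),
  protect_head: Int,
  protect_tail: Int,
  summary_message: msg,
) -> List(msg) {
  // The system message plus `protect_head` further messages form the head.
  let head_size = 1 + protect_head
  let head = list.take(history, head_size)
  // The tail never starts inside the head (assumption for short histories).
  let tail_start = int.max(list.length(history) - protect_tail, head_size)
  let tail = list.drop(history, tail_start)
  list.append(head, [summary_message, ..tail])
}
```

With a six-message history, `protect_head` of 1 and `protect_tail` of 2, the sketch keeps the first two and last two messages and places the summary between them.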
pub fn build_summary_prompt(
  middle: List(completions.Message),
) -> String
Build a summarization prompt from middle messages. Tool results are pruned to 1-line summaries to drastically reduce tokens sent to the summarizer LLM.
pub fn estimate_middle_chars(
  middle: List(completions.Message),
) -> Int
Estimate total character count of middle messages.
pub fn is_cooling_down(
  state: CooldownState,
  now_unix: Int,
) -> Bool
Check whether compression is currently in cooldown.
now_unix should be the current Unix timestamp in seconds.
pub fn record_failure(
  state: CooldownState,
  now_unix: Int,
) -> CooldownState
Record a compression failure. Enters a 10-minute cooldown.
pub fn record_success(state: CooldownState) -> CooldownState
Record a successful compression. Clears the cooldown.
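The three cooldown functions form a small state machine. The sketch below shows one way they could behave, under stated assumptions: the type is mirrored locally so the block compiles on its own, the 10-minute cooldown is taken as 600 seconds, a new failure restarts the cooldown from `now_unix`, and the cooldown ends exactly when `now_unix` reaches `next_retry_at`. The real module may differ on any of these points.

```gleam
// Local mirror of the documented type.
pub type CooldownState {
  CoolingDown(next_retry_at: Int)
  Ready
}

// Assumed: 10 minutes expressed in seconds.
const cooldown_seconds = 600

pub fn is_cooling_down(state: CooldownState, now_unix: Int) -> Bool {
  case state {
    // Assumed comparison: cooling down until the retry time is reached.
    CoolingDown(next_retry_at) -> now_unix < next_retry_at
    Ready -> False
  }
}

pub fn record_failure(_state: CooldownState, now_unix: Int) -> CooldownState {
  // Assumed: any failure (re)starts the cooldown from the current time.
  CoolingDown(next_retry_at: now_unix + cooldown_seconds)
}

pub fn record_success(_state: CooldownState) -> CooldownState {
  Ready
}
```

Under these assumptions, a failure recorded at time 1000 yields `CoolingDown(next_retry_at: 1600)`, and a later success returns the tracker to `Ready`.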
pub fn should_compress(
  estimated_tokens: Int,
  model_window: Int,
  trigger_percent: Int,
) -> Bool
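Check whether compression should be triggered, given the estimated token count, the model's context window size, and the trigger threshold expressed as a percentage of that window.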
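A percentage-of-window check like this one could be written as below. This is a sketch, not the module's code: whether the real function uses `>=` or `>` at the threshold, and how it rounds, is not stated in the documentation, so both are assumptions here.

```gleam
pub fn should_compress(
  estimated_tokens: Int,
  model_window: Int,
  trigger_percent: Int,
) -> Bool {
  // Cross-multiply rather than divide, so integer division cannot
  // round the threshold down.
  // Assumed: reaching the threshold exactly counts as a trigger (>=).
  estimated_tokens * 100 >= model_window * trigger_percent
}
```

For a 100_000-token window and a trigger of 80 percent, this sketch fires once the estimate reaches 80_000 tokens.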
pub fn split_history(
  history: List(completions.Message),
  protect_head: Int,
  protect_tail: Int,
) -> CompressionPlan
Split history into head (protected), middle (to compress), and tail (protected).
- The first message is treated as the system message and is always kept.
- The next protect_head messages after the system message are kept.
- The last protect_tail messages are kept.
After the initial split, boundary alignment ensures that tool_call / tool_result pairs are not broken across the compression boundary.
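The initial split, before any boundary alignment, can be sketched as follows. `naive_split` and `Plan` are hypothetical stand-ins: the plan type is mirrored locally and made generic over the message type so the block compiles on its own, the tool_call / tool_result alignment step is left out, and the choice to return `NoCompression` whenever the middle would be empty is an assumption.

```gleam
import gleam/list

// Local, generic mirror of the documented CompressionPlan type.
pub type Plan(msg) {
  Compress(head: List(msg), middle: List(msg), tail: List(msg))
  NoCompression(history: List(msg))
}

pub fn naive_split(
  history: List(msg),
  protect_head: Int,
  protect_tail: Int,
) -> Plan(msg) {
  // System message plus `protect_head` further messages.
  let head_size = 1 + protect_head
  let middle_size = list.length(history) - head_size - protect_tail
  case middle_size > 0 {
    // Assumed: nothing to compress when head and tail cover everything.
    False -> NoCompression(history)
    True -> {
      let #(head, rest) = list.split(history, head_size)
      let #(middle, tail) = list.split(rest, middle_size)
      Compress(head: head, middle: middle, tail: tail)
    }
  }
}
```

The real function additionally moves the boundaries so that a tool call and its result always land on the same side; that step needs knowledge of the message variants and is not shown.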
pub fn summary_budget(
  middle_char_count: Int,
  config_max_tokens: Int,
) -> Int
Calculate a summary budget: 20% of the middle character count, capped at 12_000 and config_max_tokens, with a floor of 200.
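The arithmetic described above could look like the sketch below. Two details are assumptions rather than documented behavior: the 20% is computed with integer division, and the floor of 200 is applied last, so it wins even when config_max_tokens is smaller than 200.

```gleam
import gleam/int

pub fn summary_budget(middle_char_count: Int, config_max_tokens: Int) -> Int {
  // 20% of the character count, using integer division.
  let proportional = middle_char_count / 5
  // Cap at both the fixed limit and the configured maximum.
  let capped =
    proportional
    |> int.min(12_000)
    |> int.min(config_max_tokens)
  // Assumed: the floor is applied after the caps.
  int.max(capped, 200)
}
```

For example, 10_000 middle characters with a configured maximum of 4_000 gives a budget of 2_000, while 500 characters falls below the floor and gives 200.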