plugins/hooks/context_compressor/context_compressor

Context compression: summarizes the middle of the message history so the conversation stays within the model's context window. The head (system message plus early context) and the tail (most recent messages) are protected; the messages in between are replaced with a conversation summary.

Types

pub type CompressionPlan {
  Compress(
    head: List(completions.Message),
    middle: List(completions.Message),
    tail: List(completions.Message),
  )
  NoCompression(history: List(completions.Message))
}

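Constructors

  • Compress(head: List(completions.Message), middle: List(completions.Message), tail: List(completions.Message))
  • NoCompression(history: List(completions.Message))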

Cooldown tracker for compression failures. Prevents retry spam when the summarizer LLM is unavailable.

pub type CooldownState {
  CoolingDown(next_retry_at: Int)
  Ready
}

Constructors

  • CoolingDown(next_retry_at: Int)
  • Ready

Values

pub fn apply_compression(
  history: List(completions.Message),
  protect_head: Int,
  protect_tail: Int,
  summary: String,
) -> List(completions.Message)

Apply a summary to replace middle messages. Returns a new history: head + summary SystemMessage + tail, with orphaned tool_results removed.
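
The reassembly step described above can be sketched as follows. This is an illustration only, not the module's source: the Msg type, the "Conversation summary: " prefix, and the functions assemble_sketch and drop_orphaned_results are hypothetical stand-ins, and the sketch takes an already split head and tail rather than the full history.

```gleam
import gleam/list

// Hypothetical stand-in for completions.Message.
pub type Msg {
  System(content: String)
  User(content: String)
  Assistant(content: String, tool_call_ids: List(String))
  ToolResult(tool_call_id: String, content: String)
}

// Drop tool results whose originating tool call is no longer in the history.
pub fn drop_orphaned_results(history: List(Msg)) -> List(Msg) {
  // Collect every tool call id still referenced by an assistant message.
  let known_ids =
    list.flat_map(history, fn(msg) {
      case msg {
        Assistant(_, tool_call_ids) -> tool_call_ids
        _ -> []
      }
    })
  list.filter(history, fn(msg) {
    case msg {
      ToolResult(tool_call_id, _) -> list.contains(known_ids, tool_call_id)
      _ -> True
    }
  })
}

// head + summary system message + tail, then remove orphaned tool results.
pub fn assemble_sketch(
  head: List(Msg),
  tail: List(Msg),
  summary: String,
) -> List(Msg) {
  // The summary prefix is an assumption made for this sketch.
  let summary_msg = System("Conversation summary: " <> summary)
  list.append(head, [summary_msg, ..tail])
  |> drop_orphaned_results
}
```

In this sketch a tool result survives only if an assistant message carrying the same tool call id is still present after the middle has been replaced.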

pub fn build_summary_prompt(
  middle: List(completions.Message),
) -> String

Build a summarization prompt from middle messages. Tool results are pruned to one-line summaries, which substantially reduces the number of tokens sent to the summarizer LLM.
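
One way to prune a tool result to a single line is sketched below. The documentation does not say how the module does it, so the heuristic here (keep the first line, truncate to max_chars) and the function name one_line are assumptions made for illustration.

```gleam
import gleam/string

// Hypothetical helper: reduce a multi-line tool result to one short line.
pub fn one_line(content: String, max_chars: Int) -> String {
  // Keep only the first line of the trimmed content.
  let first_line = case string.split(string.trim(content), "\n") {
    [line, ..] -> line
    [] -> ""
  }
  // Truncate long lines and mark the cut with an ellipsis.
  case string.length(first_line) > max_chars {
    True -> string.slice(first_line, 0, max_chars) <> "..."
    False -> first_line
  }
}
```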

pub fn estimate_middle_chars(
  middle: List(completions.Message),
) -> Int

Estimate total character count of middle messages.

pub fn is_cooling_down(
  state: CooldownState,
  now_unix: Int,
) -> Bool

Check whether compression is currently in cooldown. now_unix should be the current Unix timestamp in seconds.

pub fn record_failure(
  state: CooldownState,
  now_unix: Int,
) -> CooldownState

Record a compression failure. Enters a 10-minute cooldown.

pub fn record_success(state: CooldownState) -> CooldownState

Record a successful compression. Clears the cooldown.
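
The three cooldown functions form a small state machine, which might look like the sketch below. The type is redeclared locally so the example is self-contained, and the _sketch functions are hypothetical re-implementations. Treating the cooldown as over once now_unix reaches next_retry_at, and expressing 10 minutes as 600 seconds, are assumptions.

```gleam
// Local copy of the documented type so the sketch compiles on its own.
pub type CooldownState {
  CoolingDown(next_retry_at: Int)
  Ready
}

// 10 minutes in seconds, matching a now_unix expressed in seconds.
const cooldown_seconds = 600

pub fn is_cooling_down_sketch(state: CooldownState, now_unix: Int) -> Bool {
  case state {
    // Assumption: the cooldown ends exactly at next_retry_at.
    CoolingDown(next_retry_at) -> now_unix < next_retry_at
    Ready -> False
  }
}

pub fn record_failure_sketch(
  _state: CooldownState,
  now_unix: Int,
) -> CooldownState {
  // Any failure (re)starts the cooldown from the current time.
  CoolingDown(now_unix + cooldown_seconds)
}

pub fn record_success_sketch(_state: CooldownState) -> CooldownState {
  Ready
}
```

A caller would typically check is_cooling_down before invoking the summarizer, then record the outcome to obtain the next state.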

pub fn should_compress(
  estimated_tokens: Int,
  model_window: Int,
  trigger_percent: Int,
) -> Bool

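Check whether compression should be triggered, given the estimated token count, the size of the model's context window (model_window), and the trigger threshold expressed as a percentage of that window (trigger_percent).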
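
A minimal sketch of such a threshold check follows. It assumes that trigger_percent is a percentage of model_window and that reaching the threshold exactly counts as a trigger; neither detail is confirmed by the documentation, and should_compress_sketch is a hypothetical name.

```gleam
// Hypothetical re-implementation of the trigger check.
pub fn should_compress_sketch(
  estimated_tokens: Int,
  model_window: Int,
  trigger_percent: Int,
) -> Bool {
  // Cross-multiplying avoids the truncation of integer division.
  estimated_tokens * 100 >= model_window * trigger_percent
}
```

For example, with a 128_000-token window and a 70 percent trigger, 90_000 estimated tokens would trigger compression while 50_000 would not.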

pub fn split_history(
  history: List(completions.Message),
  protect_head: Int,
  protect_tail: Int,
) -> CompressionPlan

Split history into head (protected), middle (to compress), and tail (protected). The first message is treated as the system message and is always kept. The next protect_head messages after the system message are also kept, as are the last protect_tail messages at the end of the history.

After the initial split, boundary alignment ensures tool_call / tool_result pairs are not broken across the compression boundary.
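
The initial split, before any boundary alignment, can be sketched as below. The sketch is generic over the message type so that it stands alone, redeclares the plan type locally, and assumes that NoCompression is returned when no messages remain for the middle. The tool_call / tool_result alignment step is deliberately left out, and naive_split is a hypothetical name.

```gleam
import gleam/list

// Local, generic copy of the documented plan type.
pub type PlanSketch(msg) {
  Compress(head: List(msg), middle: List(msg), tail: List(msg))
  NoCompression(history: List(msg))
}

pub fn naive_split(
  history: List(msg),
  protect_head: Int,
  protect_tail: Int,
) -> PlanSketch(msg) {
  // The system message plus protect_head further messages form the head.
  let head_len = 1 + protect_head
  let middle_len = list.length(history) - head_len - protect_tail
  case middle_len > 0 {
    // Assumption: nothing to compress when the middle would be empty.
    False -> NoCompression(history)
    True -> {
      let #(head, rest) = list.split(history, head_len)
      let #(middle, tail) = list.split(rest, middle_len)
      Compress(head: head, middle: middle, tail: tail)
    }
  }
}
```

With eight messages, protect_head = 1, and protect_tail = 2, the head holds the first two messages, the tail the last two, and the middle the four in between.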

pub fn summary_budget(
  middle_char_count: Int,
  config_max_tokens: Int,
) -> Int

Calculate a summary budget: 20% of middle character count, capped at 12_000 and config_max_tokens, with a floor of 200.
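
The arithmetic can be sketched as below. The order of operations is an assumption: both caps are applied first and the floor of 200 last, so the floor wins if config_max_tokens is below 200. The name summary_budget_sketch is hypothetical.

```gleam
import gleam/int

// Hypothetical re-implementation of the documented budget rule.
pub fn summary_budget_sketch(
  middle_char_count: Int,
  config_max_tokens: Int,
) -> Int {
  // 20% of the middle's character count (integer division truncates).
  let proportional = middle_char_count * 20 / 100
  proportional
  |> int.min(12_000)
  |> int.min(config_max_tokens)
  // Assumption: the floor is applied after both caps.
  |> int.max(200)
}
```

For instance, a 10_000-character middle with config_max_tokens = 4000 yields 2000, a 100_000-character middle with the same configuration is capped at 4000, and a 500-character middle is raised to the floor of 200.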
