agent
A Gleam AI agent runtime on the BEAM/Erlang VM. Multi-gateway, SQLite-backed persistence, tool-augmented chat with any OpenAI-compatible API.
Quick start
# Set your API key
echo 'DEEPSEEK_API_KEY=sk-...' > .env
# Run the CLI REPL
gleam run
# Run the daemon (Telegram bot + admin TCP listener)
gleam run -m agent_app
# Admin CLI (connect to running daemon)
gleam run -m agent_admin sessions list
gleam run -m agent_admin db stats
# Run tests
gleam test
Features
- CLI REPL — interactive chat with approval-gated bash execution, admin slash commands
- Daemon mode — headless agent with Telegram bot and TCP admin interface
- Telegram gateway — real Telegram bot using telega’s OTP supervision tree
- Tool system — bash (sandboxed), web fetch (SSRF-hardened), memory (SQLite-backed), code executor
- SQLite persistence — sessions, messages, and memories stored in WAL-mode SQLite with FTS5 full-text search
- Admin interface — inspect sessions, DB stats, cost tracking, session search, gateways, and models via REPL slash commands or TCP CLI
- Config-driven — agent.toml for settings, .env for secrets
- Guardrails — hardline command blocks, dangerous-pattern detection, approval flow
- Multi-user — gateway-agnostic session type, per-user isolation via SQLite
- Context discovery — auto-discovers CLAUDE.md, AGENTS.md, and other project context files from the working directory
- Token estimation — CJK-aware heuristic token counting with context window overflow warnings
- API hardening — jittered retries on transient failures, truncation continuation, finish_reason extraction
- Memory hardening — content validation (injection/exfiltration scanning), deduplication, frozen snapshots, transaction safety
- Loop hardening — tool call validation with error recovery, graceful round-limit handling, context window awareness
- Tool loop guardrails — circuit breaker detects stuck tool loops, escalates warn → block → halt
- Context compression — summarizes middle conversation messages when the context window fills up, preserving head (system + early context) and tail (recent messages)
- Session management — REPL persists conversations to SQLite, /resume and /continue reload past sessions, auto-titling via LLM on first exchange, session lineage (parent/child chains), crash recovery on daemon restart, auto-prune of old ended sessions
- Session search tool — AI agent can search past conversations via Discovery (FTS5 query), Scroll (message window), or Browse (recent sessions) modes
- Parallel tool execution — concurrent execution of parallel-safe tools via Gleam processes (spawn_unlinked + Subject message passing), ToolGroup grouping (Sequential/Parallel)
- Browser tools — Playwright-based browser automation (6 tools: navigate, snapshot, click, type, screenshot, scroll) via agent-browser CLI, URL safety reuses SSRF protection
- OTP supervision — structured supervisor tree; Service and Plugin shapes self-declare supervised: Bool for generic lifecycle management
- Structured autonomy — Pulse (time-driven PULSE.md task execution), Reflection (post-turn memory consolidation), Cron (5-field scheduler with runtime-managed jobs via agent tools), Harness (deterministic safety gating for autonomous actions)
- Notifications + DND — agent-callable gateway tools (telegram/send_message) with runtime-managed Do Not Disturb rules via /dnd admin commands
- Timezone support — top-level [agent] timezone config consumed by DND, cron, and all time-aware code
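The context compression feature above keeps the head and tail of a conversation and summarizes the middle. A minimal sketch of that split follows. The Message and Split types, the function names, and the keep_head/keep_tail parameters are all hypothetical and are not taken from the project's source; the summary text would come from an LLM call that is not shown.

```gleam
import gleam/list

/// Hypothetical message type for this sketch only.
pub type Message {
  Message(role: String, content: String)
}

/// The three regions of a conversation considered for compression.
pub type Split {
  Split(head: List(Message), middle: List(Message), tail: List(Message))
}

/// Keep the first `keep_head` and last `keep_tail` messages intact.
/// Everything in between is the span a summarizer would replace.
pub fn split_for_compression(
  messages: List(Message),
  keep_head: Int,
  keep_tail: Int,
) -> Split {
  let total = list.length(messages)
  case total <= keep_head + keep_tail {
    // Too short to compress: nothing falls in the middle.
    True -> Split(head: messages, middle: [], tail: [])
    False -> {
      let head = list.take(messages, keep_head)
      let rest = list.drop(messages, keep_head)
      let middle_len = total - keep_head - keep_tail
      let middle = list.take(rest, middle_len)
      let tail = list.drop(rest, middle_len)
      Split(head: head, middle: middle, tail: tail)
    }
  }
}

/// Rebuild the conversation with the middle replaced by one summary message.
/// Using the "system" role for the summary is an assumption.
pub fn compress(split: Split, summary: String) -> List(Message) {
  case split.middle {
    [] -> list.append(split.head, split.tail)
    _ -> list.append(split.head, [Message("system", summary), ..split.tail])
  }
}
```

With five messages, keep_head = 1 and keep_tail = 2, the two middle messages collapse into one summary and the rebuilt conversation has four entries.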
Architecture
The project follows a three-category architecture:
| Category | Role | Extensible? |
|---|---|---|
| Core | Lightweight wireframe connecting components — event loop, registry, config, session type | No |
| Services | Fixed branches the core depends on, each with a Service shape (name, supervised, start, stop, health). | No (built-in) |
| Plugins | Swappable, shape-conforming components. Each has a top-level Plugin shape plus a sub-type shape (Tool, Gateway, Hook, MemoryPlugin). Users can add their own in ~/.agent/. | Yes |
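To make the Service shape in the table concrete, here is one way it might look as a Gleam record. The field names (name, supervised, start, stop, health) come from the table; the field types, the Health type, and both helper functions are assumptions made for illustration, not the project's actual definitions.

```gleam
import gleam/list

/// Health states are an assumption for this sketch.
pub type Health {
  Healthy
  Unhealthy(reason: String)
}

/// Field names follow the README; field types are guesses.
pub type Service {
  Service(
    name: String,
    supervised: Bool,
    start: fn() -> Result(Nil, String),
    stop: fn() -> Result(Nil, String),
    health: fn() -> Health,
  )
}

/// A do-nothing service, useful only to show the shape filled in.
pub fn noop_service(name: String) -> Service {
  Service(
    name: name,
    supervised: False,
    start: fn() { Ok(Nil) },
    stop: fn() { Ok(Nil) },
    health: fn() { Healthy },
  )
}

/// Generic lifecycle code can branch on `supervised` without knowing
/// anything else about the service.
pub fn supervised_names(services: List(Service)) -> List(String) {
  services
  |> list.filter(fn(service) { service.supervised })
  |> list.map(fn(service) { service.name })
}
```

Because every service carries its own supervised flag, a supervisor could select its children with a single filter instead of a hard-coded list.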
src/
├── core/ # Wireframe (loop, registry, config, session, tool shape)
├── services/ # Fixed branches (api, storage, admin, tokens, guardrails, persona, context, titler, pulse, cron, harness, notifications)
│ ├── shapes.gleam # Service shape (name, supervised, start, stop, health)
│ └── supervisor.gleam # Service supervisor
├── plugins/ # Pluggable, shape-conforming — each module in its own folder
│ ├── shapes.gleam # Plugin shape (name, description, plugin_type, supervised, start, stop, health)
│ ├── tools/ # bash/, browser/, code/, cron/, memory/, session_search/, web/, gateways/telegram/
│ ├── gateways/ # telegram/, tui/ + supervisor.gleam
│ ├── hooks/ # context_compressor/, reflection/, tool_guardrails/
│ └── memory/ # file_memory/
├── agent.gleam # CLI REPL entry point
├── agent_app.gleam # Daemon entry point (uses agent_supervisor)
├── agent_admin.gleam # Admin CLI
└── agent_supervisor.gleam # Root supervisor (coordinates service + gateway supervisors)
Admin Commands
Available in the CLI REPL (prefix with /) and via gleam run -m agent_admin:
Database:
/db stats Show row counts and DB file size
/db cost Show total cost across all sessions, per-model breakdown
/db wipe memories Delete all memory entries
/db wipe sessions Delete all sessions and messages
/db prune sessions <days> Delete ended sessions older than N days
Sessions:
/sessions list List all sessions (key, source, model, tokens, cost)
/sessions show <key> Show session detail (persona, model, token breakdown)
/sessions delete <key> Delete a session and its messages
/sessions search <query> Full-text search across all message content
/sessions rename <k> <t> Rename a session
/sessions export <key> Export a session as JSON
/sessions export all Export all sessions as JSONL (deferred)
REPL-only:
/resume <id|title> Switch to a previous session
/continue Resume the most recent CLI session
/clear End current session, start new one with parent linkage
/title <text> Set title for current session
Gateways:
/gateways list List configured gateways and their status
/gateways status Show detailed status for all active gateways
DND:
/dnd status Show active DND rules
/dnd set <HH:MM> <HH:MM> Add a scheduled quiet window (UTC)
/dnd indefinite Toggle indefinite DND on/off
/dnd clear Remove all DND rules
Models:
/models list List all configured models
/models primary Show the primary model (name, base_url)
TCP protocol is line-delimited JSON: {"cmd":"sessions","action":"list"} → {"ok":"..."}
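A request line for this protocol could be built with gleam_json as sketched below. Only the "cmd" and "action" keys are taken from the example above; the encode_command function is hypothetical, and opening the socket and decoding the response are left out.

```gleam
import gleam/json

/// Encode an admin command as one newline-terminated line of JSON.
/// Only the "cmd" and "action" keys from the README example are covered.
pub fn encode_command(cmd: String, action: String) -> String {
  let line =
    json.object([
      #("cmd", json.string(cmd)),
      #("action", json.string(action)),
    ])
    |> json.to_string
  // The protocol is line-delimited, so each request ends with a newline.
  line <> "\n"
}
```

Calling encode_command("sessions", "list") yields the request shown above, followed by a newline.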
Configuration
- .env — secrets: DEEPSEEK_API_KEY, TELEGRAM_BOT_TOKEN
- agent.toml — everything else: model, persona, tool settings, gateway config, admin port
Dependencies
- gleam_stdlib, gleam_httpc, gleam_json — core
- gleam_erlang — Erlang interop
- envoy — env var loading
- sqlight — SQLite
- telega — Telegram Bot API
- gleeunit — test framework
Comparison with hermes-agent
Our agent is modeled after hermes-agent, a production Python AI agent. Below is a summary of where we match, where we diverge by design, and where gaps remain.
Matched capabilities
| Area | Status |
|---|---|
| CLI REPL + daemon mode | Same two-gateway architecture |
| SQLite session persistence (WAL, FTS5) | Equivalent to hermes_state.py |
| Token/cost tracking per session | Same per-message accumulator pattern |
| Session lifecycle (end, fork, prune, resume) | Full lineage, crash recovery |
| Auto-titling via LLM | Same fire-and-forget approach |
| Session search tool (Discovery/Scroll/Browse) | Direct equivalent |
| Guardrails (hard blocks, approval patterns) | 46 tests, covers same patterns |
| SSRF protection (DNS-resolved, 2-tier, redirect re-validation) | Actually exceeds hermes in redirect safety |
| Memory validation (injection/exfiltration/unicode scanning) | Same scan patterns |
| Context file discovery (CLAUDE.md, AGENTS.md, etc.) | Same walk-up algorithm |
| CJK-aware token estimation | Same heuristic approach |
| API hardening (jittered retries, truncation continuation) | Same retry policy |
| Config-driven (TOML + env vars) | Same layered config model |
| Admin interface (TCP + slash commands) | Own implementation, similar feature set |
| Fuzzy command suggestions | Unique to our agent |
| Parallel tool execution | Gleam/erlang/process concurrency (spawn_unlinked + Subject message passing) |
| Browser automation tools | Playwright via agent-browser CLI (6 tools, URL safety reuses SSRF) |
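As an illustration of the CJK-aware token estimation row, the sketch below counts CJK characters separately from everything else. The ratios (one token per CJK character, one per four other characters) and the code point ranges are assumptions for this example; the project's actual heuristic may use different numbers.

```gleam
import gleam/list
import gleam/string

/// True for code points in a few common CJK blocks (not exhaustive):
/// CJK Unified Ideographs, Hiragana and Katakana, Hangul Syllables.
fn is_cjk(cp: Int) -> Bool {
  { cp >= 0x4E00 && cp <= 0x9FFF }
  || { cp >= 0x3040 && cp <= 0x30FF }
  || { cp >= 0xAC00 && cp <= 0xD7AF }
}

/// Rough estimate: one token per CJK character, one per four others.
/// Both ratios are assumptions chosen for illustration.
pub fn estimate_tokens(text: String) -> Int {
  let codepoints =
    text
    |> string.to_utf_codepoints
    |> list.map(string.utf_codepoint_to_int)
  let cjk =
    codepoints
    |> list.filter(is_cjk)
    |> list.length
  let other = list.length(codepoints) - cjk
  // Round the non-CJK share up so a short non-empty string is never zero.
  cjk + { other + 3 } / 4
}
```

Under these assumed ratios, an eleven-character ASCII string estimates to three tokens, while two CJK characters estimate to two.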
Deliberately deferred (not gaps — design choices)
- Streaming responses — v1 buffers full responses; SSE streaming needs httpp + gleam_otp/process
- Multi-provider abstraction — single OpenAI-compatible endpoint suffices
- Full actor-based OTP services — supervisor tree coordinates startup/stop/health using shapes; components manage their own processes
- ETS for approval cache — in-memory list fine for current scale
Key gaps vs hermes-agent
| Gap | hermes approach | Priority / status |
|---|---|---|
| Conversation loop tests | 18 tests covering pure functions, tool execution, and loop logic with mock chat functions | ✅ Done |
| Tool loop guardrails | Circuit breaker: detect stuck tool loops, warn → block → halt. 16 unit + 6 integration tests. | ✅ Done |
| Context compression | Summarize middle messages, protect head/tail; triggered at token threshold. 12 unit + 5 integration tests. | ✅ Done |
| Guardrails + compression integration | Both modules wired into loop.gleam. 23 integration tests verify correctness end-to-end. | ✅ Done |
| Multi-platform gateway | 20+ chat platforms (Slack, Discord, Signal, WhatsApp, etc.) | Medium |
| Delegate/sub-agent | delegate_task tool spawns child agent for multi-step subtasks | Low |
| Provider fallback chain | Rotate credentials, chain through backup providers on failure | Low |
| Vision/image tools | Image analysis, generation, video generation | Low |
| Skill system | SKILL.md knowledge packages with execution scripts | Low |
| Background memory review | Post-turn daemon thread auto-saves to memory/skills | Done (Reflection hook) |
| ACP (Agent Communication Protocol) | IDE integration protocol | Low |
| LSP integration | Language Server Protocol client | Low |
| Voice/TTS/transcription | Two-way voice conversation | Low |
| Cron job management | Scheduled task tool + runtime CRUD | Done (Cron service + cron_create/list/update/delete tools) |
| Notifications + DND | Gateway-specific notify + quiet hours | Done (telegram/send_message tool + DND service with /dnd admin commands) |
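The tool loop guardrails row describes a circuit breaker that escalates warn → block → halt. One simple way to model that escalation is to count consecutive identical tool calls, as sketched below. The Breaker and Verdict types, the call-signature string, and the thresholds (3, 5, 7) are hypothetical; how the real module detects a stuck loop is not described in this README.

```gleam
/// What the loop should do with the next tool call.
pub type Verdict {
  Allow
  Warn
  Block
  Halt
}

/// Tracks the last tool call signature and how often it repeated in a row.
pub type Breaker {
  Breaker(last_call: String, repeats: Int)
}

pub fn new() -> Breaker {
  Breaker(last_call: "", repeats: 0)
}

/// Record a call and decide what to do.
/// The thresholds 3, 5, and 7 are assumptions for this sketch.
pub fn record(breaker: Breaker, call: String) -> #(Breaker, Verdict) {
  let repeats = case call == breaker.last_call {
    True -> breaker.repeats + 1
    // A different call resets the streak.
    False -> 1
  }
  let verdict = case repeats {
    n if n >= 7 -> Halt
    n if n >= 5 -> Block
    n if n >= 3 -> Warn
    _ -> Allow
  }
  #(Breaker(last_call: call, repeats: repeats), verdict)
}
```

The breaker is a plain value threaded through the loop, so it needs no process or shared state and is easy to unit test.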
Test coverage highlights
Our agent has ~590 tests. Guardrails (46), web/SSRF (42), admin (34), browser (14), and the full conversation loop (53 tests across unit, guardrails, compression, and integration suites) are well covered. The only untested paths are those that require real API keys or external services.
Development
gleam run # Run the CLI REPL
gleam run -m agent_app # Run the daemon (Telegram + admin)
gleam run -m agent_admin ... # Connect to daemon's admin port
gleam test # Run all tests
gleam format # Format source files