A single-binary AI agent runtime
June 10, 2026
Eva is a high-performance AI agent runtime written in Zig, built on top of the Zero framework. The pitch is simple: one binary, any LLM, any channel, with the reliability bits you’d otherwise have to write yourself.
Vendor-independent
Eva talks to any OpenAI-compatible API — OpenAI, OpenRouter, Groq, Together, llama.cpp, vLLM, Ollama. You point EVA_API_BASE at your endpoint and it just works. There is no vendor SDK baked in.
Multi-channel
Out of the box it speaks WebSocket and Telegram. Discord and more sit behind a stable vtable so adding a channel doesn’t destabilize the agent loop.
Workspace-scoped tools
The agent can call a small, sharp set of tools: read_file, write_file, list_dir, shell, web_fetch. All filesystem operations are sandboxed to a configured workspace directory, and web_fetch has SSRF protection that blocks loopback and RFC1918 ranges.
Hybrid memory
This is the part I’m most happy with. Eva keeps:
- SQLite for queryable session and message history
- Human-editable flat files —
MEMORY.md,SOUL.md,USER.md— you can open them in your editor and the agent picks up the changes
A background dream thread periodically pulls insights out of recent history, and a consolidator summarizes long conversations so the context window never silently overflows.
Reliability
A circuit breaker opens after configurable failures, a token-bucket rate limiter caps LLM calls per minute, and a loop detector stops the agent from bouncing the same tool call forever. When the LLM provider has a bad day, Eva degrades gracefully instead of melting.
A note on how it was built
This project was built completely using free-tier LLMs (Qwen 3.6 / MiniMax M3) with very little manual coding. Just for fun and personal use. It helped me connect how agentic engineering works on a loop, and how it achieves results. It’s serving well with local models like Qwen 3.5 9B and Gemma 4 12B.
The stripped binary is about 7 MB. The source is at github.com/im-ng/eva.
Written as part of eva.