Why MONO is built in Go instead of Python
The Python argument: the AI ecosystem
Python won AI for historical reasons: PyTorch and TensorFlow were written there, the academic community publishes in Python, and OpenAI, Anthropic et al. ship Python SDKs first.
For prototyping an agent, Python is unbeatable: import `anthropic`, write 20 lines, have a demo. LangChain, LlamaIndex, CrewAI — all Python.
The problem is that a prototype and a production platform are different problems. For the first MONO, Python would have been enough. For a thousand MONOs running 24/7, each waiting on a WhatsApp message at any hour, it's not.
What breaks Python under real load
1. GIL. The Global Interpreter Lock prevents a single Python process from truly using multiple cores. For an agent that must call LLMs, databases, webhooks, and external APIs (all IO-bound work), this is mitigated with async, but Python async is famously fragile: mixing sync and async libraries blocks the event loop and fails silently.
2. Memory per goroutine vs. per thread. A Python thread reserves about 8 MB of stack by default; a goroutine starts at about 2 KB and grows on demand. When your MONO listens to WhatsApp, Telegram, Gmail, Stripe webhooks, and cron jobs while writing to SurrealDB concurrently, you want thousands of lightweight "workers". Go gives you that for free.
3. Startup time. The Go binary starts in ~20ms. A Python process with AI deps takes 2-5 seconds. This matters when the VPS reboots (updates, kernel patches) — we want 100ms downtime, not 5 seconds.
4. Deploy. Go compiles to a ~20MB static binary. `scp binary vps:/opt/mono/`, restart, done. Python requires virtualenv, pip-install, compatible C-library versions, and half the time some package needs `gcc`. Deploying to 100 VPSs simultaneously with Go is a for-loop. With Python, it's a DevOps nightmare.
What Go does well for agents
- Concurrency with goroutines. Every incoming message spawns a goroutine. Schedule tasks with `time.AfterFunc`. Fan out webhooks with `sync.WaitGroup`. All standard library.
- Channels. The motto "don't communicate by sharing memory; share memory by communicating" fits agents that process event streams (webhook → parse → route → respond) naturally.
- First-class tooling. `go test -race` finds race conditions automatically; `pprof` gives CPU/memory profiles without instrumenting. Nothing to install.
- Type system. Structural interfaces make component testing trivial. Errors are values, not exceptions; they force you to think about the failure flow.
- Static binary. The VPS has no Python, no virtualenv, no system deps. Just `/opt/mono/mono` and `systemctl start`.
What Go costs
It's not all wins. The real trade-offs:
- Less mature official SDKs. Anthropic has a Go SDK, but LangChain is Python/JS only. For MONO we wrote our own wrapper (~500 lines) because we don't need a massive framework.
- Fewer ML libraries. If your agent includes local fine-tuning, reranking, custom embeddings — Python gives you more tools. MONO uses external APIs for this, so it doesn't affect us.
- Hiring is harder. There are 10× more Python devs than Go devs. But the Go devs you find tend to be senior, because Go attracts people who've already felt the pain of Python in production.
- Verbosity. `if err != nil` on every call. After 10k lines it becomes mechanical and actually helps rigor.
When to use Python
Honestly: use Python if you're prototyping, validating product-market fit, or building an internal tool with fewer than 100 users. It will let you iterate 3× faster at first.
The time to migrate (or rewrite) is when you start seeing: (a) concurrency bottlenecks, (b) deploys >5 min, (c) memory leaks requiring hourly worker restarts, (d) teams spending more time on ops than features.
MONO started in Go directly because the architecture (1 VPS per user) makes deploy and concurrency the main problem, not iteration speed.
Numbers
MONO binary: 24MB. Base RAM: 35MB. Startup: 18ms. Median response latency (excluding LLM calls): 3ms. A thousand idle goroutines: 5MB extra. Equivalent Python with FastAPI + uvicorn + asyncio: 180MB base RAM, 2.1s startup, 12ms median latency. For a product that runs on the user's VPS, that delta compounds.