Running 8 AI Agents on One Server — The Real Architecture

How OpenClaw coordinates eight specialized agents on a single Hetzner VPS — one gateway, silent crons, and zero chaos. MasterClaw Suite handles the install; OpenClaw runs the show.

When people first hear we're running eight AI agents on a single 8GB Hetzner server, the reaction is usually the same: "Doesn't everything just crash?"

The honest answer is: it did, at first. We made every mistake in the book — multiple gateways fighting each other, crons spamming the terminal, agents stepping on each other's memory. It took months to arrive at an architecture that's actually stable. This article is the one we wish existed when we started.

If you want to run AI agents on a server without descending into chaos, here's exactly how we do it — and why OpenClaw is the piece that makes it work.

The Stack at a Glance

Everything runs on a single Hetzner VPS in Helsinki: Ubuntu 22.04, 8GB RAM, 4 vCPUs. The orchestration layer is OpenClaw 4.1.2 — the agent runtime that manages routing, model selection, memory, and inter-agent communication through one central gateway on port 18789. MasterClaw Suite installed and configured it; OpenClaw runs it.

OpenClaw Architecture — Single-Server, 8 Agents
[Diagram: all eight agents (Cobalt, Helix, Surge, Vega, Lyra, Prism, Kova, Optimum) connect through the single OpenClaw Gateway on port 18789, which routes outward to the model providers: Anthropic Haiku, Groq, GPT-4.1-mini, Gemini, and fal.ai.]

All eight agents communicate exclusively through this single gateway. There are no isolated gateways, no direct API calls from individual agents, and no exceptions. This single rule prevents more problems than any other.

The Eight Agents — Roles and Responsibilities

Agent     Role                          Primary Model   Trigger
Cobalt    Orchestrator / Commander      Haiku           All incoming tasks
Helix     Research & Data Gathering     Gemini          On demand via Cobalt
Surge     Content & Blog Writing        GPT-4.1-mini    Weekly cron
Vega      Social Media Automation       Haiku           Daily cron
Lyra      SEO & Analytics               Groq            Weekly cron
Prism     Image & Visual Generation     fal.ai          On demand
Kova      Client Comms & Reports        GPT-4.1-mini    Daily cron
Optimum   System Monitoring & Cleanup   Groq            Hourly cron

The key insight is specialization with central routing. Every agent has a narrow job description and writes its outputs to a specific file path. Cobalt reads those outputs to decide next steps. No agent calls another directly.
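To make the "narrow lanes, one router" idea concrete, here is a hypothetical sketch of the kind of routing decision Cobalt makes. The agent names come from the table above; the keyword matching is purely illustrative (the real routing lives inside OpenClaw's prompt logic, not a shell script):

```shell
#!/bin/bash
# Illustrative routing table: map a task description to the agent whose
# lane it belongs in. Unknown work stays with the orchestrator.
route_task() {
  case "$1" in
    research*) echo "helix"   ;;
    blog*)     echo "surge"   ;;
    social*)   echo "vega"    ;;
    seo*)      echo "lyra"    ;;
    image*)    echo "prism"   ;;
    report*)   echo "kova"    ;;
    cleanup*)  echo "optimum" ;;
    *)         echo "cobalt"  ;;
  esac
}

route_task "blog: write this week's article"   # → surge
route_task "anything unrecognised"             # → cobalt
```

The point of centralising this: when a lane changes, you edit one routing decision in one place instead of updating every agent's idea of who its neighbours are.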

Cobalt: The Orchestrator

Cobalt is the only agent with a system prompt that references all the others. It receives tasks — from the Telegram bot, from cron triggers, or from human input — and decides which agent handles them. It doesn't do the work itself; it routes.

Here's the basic structure of how Cobalt receives a cron-triggered job:

#!/bin/bash
# /opt/openclaw/crons/trigger-surge.sh
TASK="Write this week's blog article using cobalt-blog-prompt.md"
RESULT_FILE="/opt/openclaw/outputs/surge-$(date +%Y%m%d).txt"

curl -s -X POST http://localhost:18789/v1/chat \
  -H "Content-Type: application/json" \
  -d "{
    \"agent\": \"cobalt\",
    \"message\": \"$TASK\",
    \"context_file\": \"/root/.openclaw/workspace/memory/COBALT-BRIEFING-blog.md\"
  }" >> "$RESULT_FILE" 2>&1
💡
Silent Crons Rule

Every cron job appends to a result file. No stdout to terminal. Kova aggregates all result files into one daily summary report. This keeps crontab -e clean and your inbox sane.
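The silent-cron pattern generalises to a small wrapper. This is a sketch under our conventions, not OpenClaw tooling; `run_silent` and the `/tmp` log directory are hypothetical stand-ins for your own trigger scripts and output paths:

```shell
#!/bin/bash
# Hypothetical silent-cron wrapper: run any command, append all of its
# stdout and stderr to a dated per-job log, and print only the log path
# so the caller (or Kova) knows where to look.
run_silent() {
  local name="$1"; shift
  local log="/tmp/openclaw-demo/${name}-$(date +%Y%m%d).log"
  mkdir -p "$(dirname "$log")"
  "$@" >> "$log" 2>&1    # the job itself never reaches the terminal
  echo "$log"
}

LOG=$(run_silent surge echo "weekly article draft complete")
```

Every cron entry then becomes one uniform line, and the daily report has a predictable set of files to sweep.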

Port 18789: One Gateway, No Exceptions

Early on, we made the mistake of spinning up isolated gateways for testing — one on 18789, another on 18790 for experiments. Within a week, agents were hitting the wrong gateway, rate limits weren't being properly tracked, and one agent overwrote another's memory because it was pointed at an old gateway that had stale config.

The fix was simple and permanent: one gateway, always on 18789, managed as a systemd service.

# /etc/systemd/system/openclaw.service
[Unit]
Description=OpenClaw Gateway
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/openclaw
ExecStart=/opt/openclaw/openclaw start --port 18789
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable openclaw
systemctl start openclaw
systemctl status openclaw
🚫
Never Do This

Don't run openclaw start manually in a terminal session. If that session dies, the gateway dies. Always use systemd or a proper process manager. We learned this the hard way at 2am.
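On top of `Restart=always`, a cheap belt-and-braces check can live in an hourly cron slot. This is a sketch: the `/health` endpoint is an assumption (substitute whatever your gateway actually exposes), and the logic is factored into a function so the check and the recovery command are both pluggable:

```shell
#!/bin/bash
# Generic watchdog: run a health-check command; if it fails, run a
# recovery command. In production the check might be
#   curl -sf http://localhost:18789/health
# and the recovery
#   systemctl restart openclaw
# (the /health path is hypothetical, not a documented OpenClaw route).
check_and_restart() {
  if eval "$1" > /dev/null 2>&1; then
    echo "healthy"
  else
    eval "$2" > /dev/null 2>&1
    echo "restarted"
  fi
}

STATUS=$(check_and_restart "true" "true")   # demo check that passes
```

systemd already restarts a crashed process; the watchdog catches the subtler case where the process is alive but the port has stopped answering.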

Memory: File-Based, Not Database-Based

All memory lives under OpenClaw's workspace directory. The structure splits into shared workspace memory and per-agent memory:

/root/.openclaw/workspace/
├── memory/                        ← Main shared memory
│   ├── MEMORY.md                  ← Long-term memory
│   ├── 2026-04-08.md              ← Daily logs
│   ├── COBALT-BRIEFING-*.md       ← Agent briefings
│   ├── COMMUNICATION-PROTOCOL-*.md
│   ├── GROQ-AUTO-ROTATION-*.md
│   ├── agents-replies/
│   ├── agents-inbox/
│   └── discussions/
└── agents/
    ├── cobalt/agent/memory/       ← Cobalt's private memory
    ├── surge/agent/memory/        ← Surge's private memory
    ├── helix/agent/memory/
    └── ...                        ← One per agent

These are plain markdown files that OpenClaw loads into agent context at the start of each task. No database, no vector store — just files. They can be edited by hand to course-correct agent behaviour instantly, and they add zero operational overhead. The shared memory/ folder handles briefings, logs, and inter-agent communication; each agent's private memory folder holds its own persistent context.
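Because memory is just files, "loading context" reduces to concatenation. A sketch of how a shared briefing plus an agent's private memory could be assembled (the directory layout mirrors the tree above; the `/tmp` workspace and `build_context` helper are demo inventions):

```shell
#!/bin/bash
# Demo workspace mimicking the article's layout, built under /tmp.
WS=$(mktemp -d)
mkdir -p "$WS/memory" "$WS/agents/surge/agent/memory"
echo "# Shared briefing"   > "$WS/memory/COBALT-BRIEFING-blog.md"
echo "# Surge private notes" > "$WS/agents/surge/agent/memory/MEMORY.md"

# Assemble one context blob: shared briefings first, then the agent's
# own persistent memory.
build_context() {
  cat "$WS/memory/COBALT-BRIEFING-"*.md \
      "$WS/agents/$1/agent/memory/"*.md 2>/dev/null
}

CONTEXT=$(build_context surge)
```

Editing either markdown file changes what the agent sees on its very next task, with no migration, reindexing, or restart.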

⚠️
Don't Edit openclaw.json Directly

Never let an agent write to openclaw.json. We route all config changes through Cobalt with human confirmation before any changes are applied. One malformed JSON write and the gateway won't start.
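The guard we use amounts to "validate, then atomically move into place". A minimal sketch, assuming `python3` is available as a portable JSON validator (paths here are demo stand-ins, not the real config location):

```shell
#!/bin/bash
# Proposed config changes land in a temp file first. Only if the file
# parses as valid JSON does it replace the live config; otherwise it is
# discarded and the gateway keeps its last known-good state.
CONFIG=/tmp/openclaw-demo.json
PROPOSED=$(mktemp)
echo '{"gateway_port": 18789}' > "$PROPOSED"

if python3 -m json.tool "$PROPOSED" > /dev/null 2>&1; then
  mv "$PROPOSED" "$CONFIG"
  echo "applied"
else
  rm -f "$PROPOSED"
  echo "rejected: malformed JSON"
fi
```

The `mv` is the important part: the live config is never half-written, so a crash mid-change cannot leave the gateway unable to start.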

The Telegram Interface

Our @AgencyCommandbot is the human interface into Cobalt. Commands arrive as messages, Cobalt parses intent and routes to the appropriate agent, and results come back via the bot. For anything that modifies config or writes to production, we've built a confirmation step — Cobalt always asks before acting.

# Example Telegram → Cobalt command
/task Surge write Article 3 on free OpenClaw models

# Cobalt response
→ Routing to Surge with cobalt-blog-prompt.md context
→ Output: /opt/openclaw/outputs/surge-article3-20251207.txt
→ Estimated completion: 4 minutes
→ Daily report will include result.

RAM: How 8 Agents Fit in 8GB

This surprises people the most. The agents themselves consume almost no RAM — they're stateless processes that wake up, make an API call, write output, and sleep. The model inference happens remotely on Anthropic, Groq, OpenAI, Google, and fal.ai infrastructure.

What does consume RAM:

  • OpenClaw gateway process: ~400MB
  • Nginx + web server: ~80MB
  • Telegram bot process: ~120MB
  • Ubuntu OS overhead: ~600MB
  • Active cron processes (peak): ~300MB

Total peak usage rarely exceeds 1.8GB. The 8GB is comfortable headroom, not a tight constraint. You could run this architecture on a 4GB server for most workloads.

Outputs and the Daily Report

Every agent writes its output to /opt/openclaw/outputs/ with a dated filename. Kova runs at 11:30pm daily, reads all that day's output files, and compiles a single summary that lands in Telegram. This is the only notification we get per day — and it's enough.

# Kova daily report cron (in crontab)
30 23 * * * /opt/openclaw/crons/kova-daily-report.sh >> /opt/openclaw/logs/kova.log 2>&1
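The aggregation itself is trivial because of the dated filenames. A sketch of the core of a Kova-style report script (the demo directory stands in for /opt/openclaw/outputs/, and the summary format is illustrative):

```shell
#!/bin/bash
# Sweep every output file stamped with today's date into one summary.
OUT=$(mktemp -d)
TODAY=$(date +%Y%m%d)
echo "article draft done" > "$OUT/surge-article-$TODAY.txt"
echo "3 posts scheduled"  > "$OUT/vega-social-$TODAY.txt"

SUMMARY=""
for f in "$OUT"/*-"$TODAY".txt; do
  SUMMARY+="## $(basename "$f")"$'\n'"$(cat "$f")"$'\n'
done
```

From here the real script would pipe `$SUMMARY` to the Telegram bot; nothing else in the pipeline needs to know which agents ran that day.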
💡
Naming Convention

All output files follow {agent}-{task}-{YYYYMMDD}.txt. This makes Kova's aggregation trivially simple and gives you a clean audit trail for every action every agent took.
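Because the convention is strict, filenames are machine-parseable with plain bash parameter expansion — no regex engine needed. A sketch (the helper name is ours, not OpenClaw's):

```shell
#!/bin/bash
# Split {agent}-{task}-{YYYYMMDD}.txt into its three fields. The task
# segment may itself contain hyphens; only the first and last fields
# are anchored.
parse_output_name() {
  local base="${1%.txt}"     # strip the extension
  local date="${base##*-}"   # last hyphen-separated field
  local rest="${base%-*}"
  local agent="${rest%%-*}"  # first field
  local task="${rest#*-}"    # everything in between
  echo "$agent $task $date"
}

parse_output_name "surge-article3-20251207.txt"   # → surge article3 20251207
```

This is why the audit trail stays cheap: any script can answer "what did Vega do on the 7th?" with a glob and a one-line parser.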

What This Architecture Is Not

It's not a distributed system. It's not Kubernetes. It's not a message queue. It's a lean, well-organised single-server setup where OpenClaw handles all agent routing and memory, each agent knows its lane, and one orchestrator makes all routing decisions. MasterClaw Suite got it all installed in under ten minutes. Complexity lives in the logic, not the infrastructure.

If you're running a solo AI agency or a small team, this is the architecture we'd recommend. Scale infrastructure only when the business demands it — not because the architecture looks impressive.

For a deep dive on the model routing that makes free operation possible, see our guide on running OpenClaw for free using Groq and Gemini. And if you're still fighting gateway errors before you even get to agents, start with our OpenClaw gateway troubleshooting guide.


MasterClaw Team

// AI Agency Operations

We build and operate AI agent systems for real clients. Everything published here comes from running these systems in production — not from theory.