Architecture Overview

Deployment Perspective: What Runs Where?

Before diving into technical details, let's clarify where each component runs. Agent Network uses a Server-Client architecture -- one central Server connects to multiple distributed Agent clients.

Deployment Topology

Component Deployment Quick Reference

| Component | Runs On | Port | Purpose | npm Package |
|---|---|---|---|---|
| CommHub Server | Server (1 machine) | 9200 | Message routing, task management, auth, database | @sleep2agi/commhub-server |
| Dashboard | Local or standalone server | 3000 default | Web UI (Overview / Nodes / Tasks / Messages / Chat / Admin / Settings; see the Dashboard doc for per-page detail) | @sleep2agi/agent-network-dashboard |
| anet CLI | Each client machine | -- | Command-line management tool (full command list: CLI reference) | @sleep2agi/agent-network |
| Agent Node | Each client machine | -- | AI worker (receives tasks, calls AI, reports results) | @sleep2agi/agent-node |
| Claude Code | Client machine | -- | Interactive AI development (joins network via MCP) | Anthropic official |
| Channel Plugins | Client machine | -- | Telegram (v0.8 stable); WeChat / Feishu via external MCP plugins (see channels.md) | channel/ |

Port Reference

| Port | Component | Protocol | Description |
|---|---|---|---|
| 9200 | CommHub Server | HTTP | MCP (POST /mcp), SSE (GET /events/:alias), REST (/api/*) |
| 3000 | Dashboard | HTTP | Default port for anet hub dashboard |

Local vs Production

| | Local Development | Production Deployment |
|---|---|---|
| CommHub Server | Local localhost:9200 | Server YOUR_IP:9200 |
| Agent Node | Local, --hub localhost:9200 | Client machine, --hub YOUR_IP:9200 |
| Dashboard | localhost:3000 | YOUR_IP:3000 or standalone deploy |
| Database | Local SQLite file | Server SQLite file |
| Communication | All via localhost | Via internal network / public IP |

System Architecture

Agent Network uses a centralized message routing architecture where all agents communicate through the CommHub Server.

Four npm Packages

Agent Network ships as four npm packages with clear responsibilities:

| Package | Purpose | How to install / run |
|---|---|---|
| @sleep2agi/agent-network | anet CLI -- config management, service launcher, status monitoring | npm i -g @sleep2agi/agent-network |
| @sleep2agi/agent-node | Agent runtime -- AI model + tool calls + task handling | anet node create + anet node start |
| @sleep2agi/commhub-server | Communication hub -- message routing + SSE push + task management | anet hub start |
| @sleep2agi/agent-network-dashboard | Web Dashboard -- visual monitoring + task management (Overview / Nodes / Tasks / Messages / Chat / Admin / Settings) | anet hub dashboard (CLI auto-fetches) |

They can be used independently or composed:

  • Just need CLI control: install @sleep2agi/agent-network
  • Just need the agent runtime: anet node create + anet node start
  • Just need the comm hub: bunx @sleep2agi/commhub-server
  • Just need the Web UI: anet hub dashboard

Full version scheme (independent semver per npm package vs the v0.10.x bundle-release anchor) is documented in Versioning.

CommHub Server

CommHub Server is the core of the entire system, responsible for message routing, state management, and task tracking.

Runs on: Server (1 machine). All client Agents connect to it.

Triple Protocol

| Protocol | Endpoint | Purpose | Auth |
|---|---|---|---|
| MCP Streamable HTTP | POST /mcp | Agent tool calls (send_task, report_status, etc.) | Bearer Token |
| SSE | GET /events/:alias | Real-time push of tasks/messages to agents | Bearer Token |
| REST | GET/POST /api/* | Dashboard / CLI / external integrations | Bearer Token |
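
As a rough illustration of the MCP row above, the sketch below builds (without sending) the HTTP request for a single tools/call against POST /mcp. The JSON-RPC envelope follows the MCP tools/call shape. The helper name buildToolCall, the token value, and the { status: "idle" } argument are assumptions for illustration, not taken from CommHub's source. A real client would also perform an initialize handshake before its first call.

```typescript
// JSON-RPC 2.0 request envelope as used by MCP's tools/call method.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: { name: string; arguments: Record<string, unknown> };
}

interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describes a tools/call request for POST /mcp.
function buildToolCall(
  hubUrl: string,
  token: string,
  tool: string,
  args: Record<string, unknown>,
  id = 1,
): HttpRequestSpec {
  const payload: JsonRpcRequest = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
  return {
    // Strip trailing slashes so the path is always exactly /mcp.
    url: `${hubUrl.replace(/\/+$/, "")}/mcp`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Streamable HTTP clients accept either a JSON body or an SSE stream.
      Accept: "application/json, text/event-stream",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(payload),
  };
}

// Example: a heartbeat-style report_status call (argument shape is assumed).
const req = buildToolCall("http://localhost:9200/", "ntok_example", "report_status", {
  status: "idle",
});
```

The returned object can be passed to any HTTP client; keeping construction separate from sending makes the envelope easy to inspect when debugging routing problems.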

v0.10.0 new — per-server daemon observability endpoint family (#99 Phase 1 scaffold, commhub-server@0.8.2, default path needs agent-network@2.2.1+)

Two new REST endpoints expose single-host health + per-agent list, used by the dashboard ServersDrawer and any monitoring / external observability integration:

  • GET /api/server/:host/health — current health snapshot for a single host (CPU / mem / disk + 24h bucketed history 5m / 1h / 24h) plus alert_level
  • GET /api/server/:host/agents — agents on a single host + per-agent process_telemetry (rss / cpu_pct / uptime_seconds / in_flight_count, #142 shipped in agent-node@2.4.0 + server schema aligned in commhub-server@0.8.2)

Version requirement: to reach these two endpoints via the default anet hub start path you need agent-network ≥ 2.2.1 (the v0.10.1 hotfix bumped PINNED_SERVER_VERSION from 0.8.0 to 0.8.2).

v0.10.2 Hero A complement: agent-node ≥ 2.4.1 adds host disk telemetry: latest.disk_total_gb / disk_used_gb / disk_avail_gb, sampled via execFileSync('df', ['-k', '/']), which shares one POSIX path across Linux and macOS and is gracefully null on Windows or on parse failure. In addition, alert_level gains disk triggers (disk_avail < 1GB critical, < 5GB warn), and the 24h history buckets carry the disk_avail_min / disk_used_max extreme-aggregation fields. This closes the final 10% of #99 per-server daemon Phase 2 host metrics.
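
The disk sampling and thresholds described above can be sketched roughly as follows. This is a minimal illustration, not agent-node's actual code: the function names, the "ok" level, and the choice to raise no alert when telemetry is null are assumptions. The field names and the 1 GB / 5 GB thresholds come from the text above. The parser takes the text printed by df -k / as input so it can be exercised without spawning a process.

```typescript
interface DiskTelemetry {
  disk_total_gb: number;
  disk_used_gb: number;
  disk_avail_gb: number;
}

// "critical" and "warn" come from the text; "ok" is an assumed name.
type DiskAlert = "ok" | "warn" | "critical";

const KB_PER_GB = 1024 * 1024;

// Parses `df -k /` output. Returns null on any parse failure.
function parseDfOutput(stdout: string): DiskTelemetry | null {
  const lines = stdout.trim().split("\n");
  if (lines.length < 2) return null; // need a header plus one data row
  // Columns 2-4 are total / used / available 1K-blocks on Linux and macOS.
  const cols = lines[lines.length - 1].trim().split(/\s+/);
  const [total, used, avail] = cols.slice(1, 4).map(Number);
  if (![total, used, avail].every((n) => Number.isFinite(n))) return null;
  return {
    disk_total_gb: total / KB_PER_GB,
    disk_used_gb: used / KB_PER_GB,
    disk_avail_gb: avail / KB_PER_GB,
  };
}

function diskAlertLevel(t: DiskTelemetry | null): DiskAlert {
  if (t === null) return "ok"; // assumption: missing telemetry raises no disk alert
  if (t.disk_avail_gb < 1) return "critical";
  if (t.disk_avail_gb < 5) return "warn";
  return "ok";
}

// Sample Linux-style output: 100 GB total, 95 GB used, 3 GB available.
const sampleDf = [
  "Filesystem     1K-blocks      Used Available Use% Mounted on",
  "/dev/sda1      104857600  99614720   3145728  97% /",
].join("\n");

const telemetry = parseDfOutput(sampleDf);
const level = diskAlertLevel(telemetry); // 3 GB available falls in the warn band
```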

The control layer (kill / restart / redeploy) is deferred to v0.11.0. Details: REST API — server endpoint family.

MCP Tool Groups

CommHub provides 17 MCP Tools in two groups:

Agent-side tools (4) -- agents report status and fetch tasks:

| Tool | Description |
|---|---|
| report_status | Heartbeat + status reporting (idle/working/error) |
| report_completion | Task completion report + results |
| get_inbox | Fetch pending messages |
| ack_inbox | Acknowledge message receipt |

Hub-side tools (13) -- command center / Dashboard manages tasks:

| Tool | Description |
|---|---|
| send_task | Dispatch a task (with lifecycle) |
| send_message | Send a message (no processing triggered) |
| send_reply | Reply to a task |
| send_ack | Acknowledge task receipt |
| retry_task | Retry a failed task |
| cancel_task | Cancel a pending task |
| reassign_task | Reassign a task to another agent |
| get_task | Query task details |
| list_tasks | Query task list |
| get_all_status | Get all session statuses |
| get_session_status | Get single session details |
| broadcast | Broadcast a message to all agents |
| get_completions | Query completion records |

Database Design

SQLite with WAL mode, 14 tables:

Additional tables: completions (completion records), task_events (task event log), audit_log (audit trail), licenses (licensing), network_invites (invite codes), rename_txn (RFC-010 node-rename two-phase transaction state: prepared / committed / aborted).

SSE Push Mechanism

Agents receive tasks in real time over long-lived SSE connections, eliminating the need for polling.

Heartbeat and Timeout

  • Agents send heartbeats (report_status) every 3 minutes
  • Server updates last_seen_at on every request
  • After 10 minutes without a heartbeat, agents are automatically marked offline
  • SSE auto-reconnects on disconnect (#202: exponential backoff 1s → 30s cap + re-register on every successful (re)connect + give up after 1h continuous failure — see agent-node)
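
The timing rules in the list above can be expressed as a few pure functions. This is a sketch under stated assumptions: the function and constant names are hypothetical, and doubling the delay on each attempt is an assumed reading of "exponential backoff" (the source only gives the 1s start, the 30s cap, and the 1h give-up limit).

```typescript
const OFFLINE_AFTER_MS = 10 * 60_000; // no heartbeat for 10 minutes => offline
const BACKOFF_BASE_MS = 1_000;        // first reconnect delay
const BACKOFF_CAP_MS = 30_000;        // delay never exceeds 30s
const GIVE_UP_AFTER_MS = 60 * 60_000; // stop after 1h of continuous failure

// Server side: is an agent offline, given its last_seen_at timestamp?
function isOffline(lastSeenAtMs: number, nowMs: number): boolean {
  return nowMs - lastSeenAtMs >= OFFLINE_AFTER_MS;
}

// Client side: delay before reconnect attempt n (0-based).
// Doubling per attempt is an assumption.
function backoffDelayMs(attempt: number): number {
  return Math.min(BACKOFF_CAP_MS, BACKOFF_BASE_MS * 2 ** attempt);
}

// Client side: has the connection been failing continuously for too long?
function shouldGiveUp(firstFailureAtMs: number, nowMs: number): boolean {
  return nowMs - firstFailureAtMs >= GIVE_UP_AFTER_MS;
}

// Delays for the first seven attempts: 1s, 2s, 4s, 8s, 16s, then capped at 30s.
const schedule = [0, 1, 2, 3, 4, 5, 6].map(backoffDelayMs);
```

With a 3-minute heartbeat and a 10-minute timeout, an agent can miss up to three consecutive heartbeats before the fourth missed interval pushes it past the offline threshold.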

Agent Node

Agent Node is the working unit in the network, responsible for receiving tasks, invoking the AI model, and reporting results.

Runs on: Client machines (can be multiple). Connects to CommHub Server over the network.

Four Runtimes

| Runtime | AI Engine | Use Case | Models |
|---|---|---|---|
| claude-code-cli | spawn local claude process | Reuse Claude subscription / interactive tool use | Claude Sonnet/Opus (subscription) |
| claude-agent-sdk | Anthropic Claude Agent SDK | Programmatic access to any Anthropic-compatible API | Anthropic / MiniMax / DeepSeek / GLM / Kimi / InternLM / Xiaomi MiMo / OpenRouter (see Multi-model) |
| codex-sdk | OpenAI Codex SDK (v0.10.0+ can opt in to a direct stdio path; see below) | Code generation, tool use | OpenAI Codex |
| grok-build-acp | spawn local grok ACP server | xAI Grok Build ACP-protocol cross-agent collaboration | xAI Grok (grok-build series) (details on GitHub) |

v0.10.0 new — codex-direct-stdio opt-in path (#141)

Set ANET_CODEX_STDIO_DIRECT=1 to make agent-node switch the codex runtime from the @openai/codex-sdk wrapper to spawn('codex', ['app-server']) + a ~155 LOC direct stdio JSON-RPC client, getting the full 67-method v2 protocol surface (thread / turn / item / realtime) and bypassing the wrapper's --mcp-config HTTP-transport bug family (#102 hang root cause). v0.10.x (including the current stable) still defaults to the wrapper; v0.11.0 plans to flip the default and rename the toggle to ANET_CODEX_LEGACY_SDK=1 opt-out. The LLM-side tool surface is unchanged (the codex thread still uses only its baked-in tools; the commhub roundtrip is still handled by the agent-node parent process) — what changes is purely the transport protocol between agent-node and the codex process. Details: runtimes — codex-sdk § codex-direct-stdio + agent-node — env vars § ANET_CODEX_STDIO_DIRECT + v0.10.0 GitHub release notes.
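
The opt-in described above amounts to a single environment check. The sketch below is illustrative only: the type and function names are hypothetical, and the real selection logic in agent-node may differ. The environment variable, its "1" value, and the codex app-server spawn arguments come from the paragraph above.

```typescript
type CodexTransport = "sdk-wrapper" | "direct-stdio";

interface CodexLaunchPlan {
  transport: CodexTransport;
  // Only set for the direct path, where agent-node spawns the process itself.
  spawn: { command: string; args: string[] } | null;
}

// Takes the environment as a parameter so the decision is easy to test.
function planCodexLaunch(env: Record<string, string | undefined>): CodexLaunchPlan {
  if (env["ANET_CODEX_STDIO_DIRECT"] === "1") {
    return { transport: "direct-stdio", spawn: { command: "codex", args: ["app-server"] } };
  }
  // v0.10.x default: keep using the @openai/codex-sdk wrapper.
  return { transport: "sdk-wrapper", spawn: null };
}

const defaultPlan = planCodexLaunch({});
const directPlan = planCodexLaunch({ ANET_CODEX_STDIO_DIRECT: "1" });
```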

MCP integration paths (per runtime, v0.9.0+)

The four runtimes expose commhub tools to the LLM via different paths — this affects the tool names the LLM sees and how you debug routing problems:

claude-agent-sdk uses in-process SDK MCP (#102 Option A, agent-node 2.3.5-preview.0+):

  • agent-node creates an in-process McpServer via createSdkMcpServer({ name: "commhub" }) and registers the 7 agent-facing tools (send_task / send_message / send_reply / get_all_status / get_session_status / get_task / list_tasks)
  • Each tool handler forwards the call from inside agent-node to CommHub's POST /mcp via the JSON-RPC initialize → tools/call chain
  • The LLM sees the SDK-namespaced tool name mcp__commhub__send_task (single commhub prefix) — not mcp__commhub__commhub__send_task or other double-prefix variants
  • To verify, see createCommhubSdkMcpServer() in agent-node/src/commhub-mcp.ts
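
When debugging tool routing, it can help to compute the names the LLM should see. The sketch below reproduces the mcp__<server>__<tool> pattern from the list above for the seven agent-facing tools. The helper names are hypothetical, and the double-prefix check is only a debugging aid, not part of agent-node.

```typescript
// The seven agent-facing tools registered on the in-process server.
const COMMHUB_AGENT_TOOLS = [
  "send_task",
  "send_message",
  "send_reply",
  "get_all_status",
  "get_session_status",
  "get_task",
  "list_tasks",
] as const;

// SDK-namespaced name as seen by the LLM: mcp__<server>__<tool>.
function sdkToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

// Flags names such as mcp__commhub__commhub__send_task.
function hasDoublePrefix(name: string, server: string): boolean {
  return name.startsWith(`mcp__${server}__${server}__`);
}

const visibleNames = COMMHUB_AGENT_TOOLS.map((tool) => sdkToolName("commhub", tool));
```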

Why doesn't claude-agent-sdk use HTTP MCP directly? Claude Agent SDK 0.2.x forwards mcpServers={commhub:{type:"http", url:.../mcp}} verbatim to the claude binary's --mcp-config, but the binary's HTTP MCP path does not issue initialize / tools/list against the endpoint — commhub never sees the binary subprocess's requests, so the tool list is empty for the LLM (#102 root cause). Option A hosts the MCP server inside agent-node's own process to bypass this SDK limitation.

claude-code-cli uses stdio + local .anet/node-server.js proxy: the anet CLI writes a .mcp.json in the project cwd that registers commhub as { "type": "stdio", "command": "bun", "args": [".anet/node-server.js"] } (agent-network/bin/cli.ts ensureMcpJson). The claude binary spawns that local bun script as a stdio MCP server, and node-server.ts forwards tool calls to CommHub's /mcp over HTTP internally (agent-network/src/node-server.ts StdioServerTransport). Tool names live in the node-server.ts namespace.
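
For reference, the commhub entry described above can be sketched as a small builder. The entry itself (type, command, args) is taken from the paragraph above, and the top-level mcpServers key follows the usual project-level .mcp.json layout. The function name and the merge behaviour (preserving other servers already present) are assumptions, not a description of what ensureMcpJson actually does.

```typescript
interface StdioMcpServer {
  type: "stdio";
  command: string;
  args: string[];
}

interface McpJson {
  mcpServers: Record<string, StdioMcpServer>;
}

// Hypothetical helper: returns a new config with the commhub entry set,
// keeping any other servers from the existing file (assumed behaviour).
function withCommhubEntry(existing: McpJson | null): McpJson {
  const servers: Record<string, StdioMcpServer> = { ...(existing?.mcpServers ?? {}) };
  servers["commhub"] = { type: "stdio", command: "bun", args: [".anet/node-server.js"] };
  return { mcpServers: servers };
}

const fresh = withCommhubEntry(null);
const merged = withCommhubEntry({
  mcpServers: { other: { type: "stdio", command: "node", args: ["other.js"] } },
});
const mcpJsonText = JSON.stringify(fresh, null, 2); // what would be written to .mcp.json
```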

codex-sdk does not expose commhub tools to the LLM: codexOpts does not pass mcpServers (agent-node/src/cli.ts). The codex thread only sees its baked-in tools (Read / Write / Edit / Bash / Glob / Grep / WebSearch). Multi-agent dispatch happens outside the LLM in agent-node's parent process: agent-node maintains the SSE connection plus report_status / get_inbox / send_reply calls back to CommHub, feeds the task text into the codex thread, and posts the codex reply back via CommHub. The codex thread itself does not know commhub exists — it is just an LLM worker.

grok-build-acp uses explicit per-session mcpServers injection + HTTP transport (v0.10.11 preview #204):

agent-node explicitly passes an mcpServers list to the Grok ACP server on every session/new / session/load. The preview chain went through three phases:

  • preview.2 (4b5a657): Stdio variant — mcpServers: [{ name: "commhub", command: "bun", args: ["<abs-path>/.anet/node-server.js"], env: { COMMHUB_ALIAS, COMMHUB_TOKEN, COMMHUB_URL, ... } }]; Grok spawns .anet/node-server.js as a stdio MCP subprocess. Structurally fixes the shared-.mcp.json identity bug, but still subject to stdout-pollution / bun-PATH / framing risks.
  • preview.6 (abefbe8): transport switched to HTTP, with mcpServers: [{ type: "http", name: "commhub", url: "${COMMHUB_URL}/mcp", headers: [{ name: "Authorization", value: "Bearer ${AUTH_TOKEN}" }, ...] }]; Grok calls commhub /mcp directly over HTTP (Grok ACP init reports mcpCapabilities = {http: true, sse: true}). commhub-server /mcp already derives from_session from the bearer ntok_ (server/src/index.ts:446-448 d1d867e #194 hub-side), so attribution is automatic. Bypasses the subprocess + bun PATH + framing + stdout-pollution risk surface entirely. Tool names come back from commhub /mcp JSON-RPC.
  • preview.7 (72e28fd): per-node isolated cwd. Vincent's UAT still showed the wrong from= alias — root cause: Grok CLI also reads cwd .mcp.json alongside the ACP session/new mcpServers injection, so two commhub MCP servers coexist and the stale stdio one wins the LLM's hello. Fix: ACP session/new now explicitly passes cwd: <home>/.anet/nodes/<node-id>/grok-cwd/. That dir symlinks the top-level user files (so the LLM's Read('./*') still works) but omits .mcp.json — Grok CLI's cwd discovery finds nothing and there's no stdio fallback. Multi-node concurrent-spawn safe by construction.
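
Combining the preview.6 and preview.7 steps, the parameters passed on session/new might look roughly like the sketch below. The entry shape and the cwd layout are taken from the bullets above. The function names are hypothetical, and only the Authorization header is shown because the source elides the remaining headers.

```typescript
interface HttpHeader {
  name: string;
  value: string;
}

interface HttpMcpServerEntry {
  type: "http";
  name: string;
  url: string;
  headers: HttpHeader[];
}

interface SessionNewParams {
  cwd: string;
  mcpServers: HttpMcpServerEntry[];
}

// preview.6: HTTP transport straight to commhub's /mcp endpoint.
function buildGrokMcpServers(commhubUrl: string, token: string): HttpMcpServerEntry[] {
  return [
    {
      type: "http",
      name: "commhub",
      url: `${commhubUrl.replace(/\/+$/, "")}/mcp`,
      // Further headers exist in the real entry but are elided in the source.
      headers: [{ name: "Authorization", value: `Bearer ${token}` }],
    },
  ];
}

// preview.7: per-node isolated cwd that contains no .mcp.json.
function grokCwd(home: string, nodeId: string): string {
  return `${home}/.anet/nodes/${nodeId}/grok-cwd/`;
}

function buildSessionNewParams(
  home: string,
  nodeId: string,
  commhubUrl: string,
  token: string,
): SessionNewParams {
  return { cwd: grokCwd(home, nodeId), mcpServers: buildGrokMcpServers(commhubUrl, token) };
}

const sessionParams = buildSessionNewParams("/home/u", "n_a1b2c3d4", "http://hub:9200", "ntok_x");
```

Because each node gets its own cwd and its own bearer token, two nodes spawned concurrently never share an MCP identity.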

⚠ Debug tip: if the LLM can't call a commhub tool, check the runtime first:

  • claude-agent-sdk nodes: confirm commhub-mcp.ts is in dist (agent-node ≥ 2.3.5-preview.0).
  • claude-code-cli nodes: check that .mcp.json has type: stdio and that the .anet/node-server.js path is correct.
  • codex-sdk nodes: look at the agent-node parent process logs (the codex thread never calls commhub).
  • grok-build-acp nodes (current stable, agent-node@2.4.9+, #204 per-node isolated cwd): look for [grok] commhub MCP server resolved: <abs-path> in the agent-node log, plus the per-node isolated cwd under .anet/nodes/<alias>/.
  • grok-build-acp nodes on v0.10.10 and earlier (agent-node@2.4.8): these follow the legacy shared-cwd path, which is susceptible to stale .mcp.json identity pollution (fixed by #204); see grok-build-runtime.md.

Task Processing Flow

Key rule: Only task type messages trigger AI processing (think). message and reply are logged but not processed, preventing infinite loops.
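
The key rule can be sketched as a small dispatcher. This is a minimal illustration, not agent-node's implementation: the three type names come from the rule above, while the field names (from, text), the handler signature, and the in-memory log are assumptions.

```typescript
type MessageType = "task" | "message" | "reply";

interface InboundMessage {
  type: MessageType;
  from: string; // assumed field name
  text: string; // assumed field name
}

interface HandleResult {
  logged: boolean;
  processed: boolean;
}

// Every inbound item is logged; only "task" items reach the AI (think).
function handleInbound(
  msg: InboundMessage,
  think: (text: string) => void,
  log: string[],
): HandleResult {
  log.push(`${msg.type} from ${msg.from}`);
  if (msg.type !== "task") return { logged: true, processed: false };
  think(msg.text);
  return { logged: true, processed: true };
}

// Usage: three inbound items, only one of which triggers processing.
const inboundLog: string[] = [];
let thinkCalls = 0;
const countingThink = (_text: string): void => {
  thinkCalls += 1;
};

const inbound: InboundMessage[] = [
  { type: "task", from: "commander", text: "summarize the report" },
  { type: "message", from: "commander", text: "fyi" },
  { type: "reply", from: "worker-1", text: "done" },
];
const handled = inbound.map((m) => handleInbound(m, countingThink, inboundLog));
```

Because a reply never triggers think, two agents that answer each other cannot bounce work back and forth indefinitely.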

Isolation Strategy

Each Agent Node instance is fully isolated and does not read host machine global config — it passes settingSources: [] to claude-agent-sdk's query() (the SDK entry point is the query() function, not a new Agent({...}) class):

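```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

// MODEL and prompt are supplied by the surrounding agent-node code.
const options = {
  model: MODEL || undefined,
  settingSources: [],  // Fully isolated: does not read ~/.claude/ etc.
  // permissionMode / mcpServers / env ...
};
for await (const message of query({ prompt, options })) { /* ... */ }
```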

anet CLI

anet CLI is the management tool for Agent Network, covering Hub / account / network / node / monitoring / demo operations (full command list: CLI reference).

Runs on: Each client machine. Points to CommHub Server via --hub parameter or config file.

Configuration Priority

Configuration Files

Global config ~/.anet/config.json:

```json
{
  "hub": "http://YOUR_IP:9200",
  "token": "utok_xxxxx"
}
```

Project node config {cwd}/.anet/nodes/<alias>/config.json (v0.8 per-node subdirectory schema; the old .anet/config.json {alias, type} 2-field format was the early V2 layout — see Agent Node for the full field list):

```json
{
  "anet_version": "0.1.0",
  "node_id": "n_a1b2c3d4",
  "node_name": "commander",
  "alias": "commander",
  "runtime": "claude-code-cli",
  "network_id": "net_a1b2c3d4",
  "channels": ["server:commhub"],
  "env": {},
  "flags": { "dangerouslySkipPermissions": true, "teammateMode": "in-process" },
  "session": "550e8400-e29b-41d4-a716-446655440000"
}
```
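
A loader for this file could validate a few fields before trusting it. The sketch below is hypothetical: the path layout, field names, and the four runtime values come from this document, but the choice of which fields to require and the function names are assumptions. See the Agent Node doc for the authoritative field list.

```typescript
const RUNTIMES = ["claude-code-cli", "claude-agent-sdk", "codex-sdk", "grok-build-acp"] as const;
type Runtime = (typeof RUNTIMES)[number];

// Subset of the node config that this sketch checks.
interface NodeConfig {
  node_id: string;
  alias: string;
  runtime: Runtime;
  network_id: string;
  channels: string[];
}

function isRuntime(value: unknown): value is Runtime {
  return typeof value === "string" && (RUNTIMES as readonly string[]).indexOf(value) !== -1;
}

// Per-node config location under the project directory.
function nodeConfigPath(cwd: string, alias: string): string {
  return `${cwd}/.anet/nodes/${alias}/config.json`;
}

// Returns the validated subset, or null if the JSON is malformed or incomplete.
function parseNodeConfig(json: string): NodeConfig | null {
  let raw: unknown;
  try {
    raw = JSON.parse(json);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const { node_id, alias, runtime, network_id, channels } = raw as Record<string, unknown>;
  if (typeof node_id !== "string" || typeof alias !== "string") return null;
  if (typeof network_id !== "string" || !isRuntime(runtime)) return null;
  if (!Array.isArray(channels) || !channels.every((c) => typeof c === "string")) return null;
  return { node_id, alias, runtime, network_id, channels: channels as string[] };
}

const parsedConfig = parseNodeConfig(
  '{"node_id":"n_a1b2c3d4","alias":"commander","runtime":"claude-code-cli",' +
    '"network_id":"net_a1b2c3d4","channels":["server:commhub"]}',
);
```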

Dashboard

Dashboard is a separate Web process that talks to CommHub over REST:

| Type | Tech Stack | Runs On | Port | Features |
|---|---|---|---|---|
| Dashboard | Next.js 16 | Local, Vercel, or standalone server | 3000 default | Chat, Nodes, Tasks, Messages, Networks, Logs, Admin |

Channel Plugins

Channel plugins enable agents to integrate with external communication platforms.

  • Telegram -- via Bot API (v0.8 stable, anet channel add telegram)
  • WeChat / Feishu -- via external MCP plugins (not part of @sleep2agi/commhub-server); see Channel plugin docs

Runs on: Client machines, mounted as MCP Servers on Claude Code.

Channel message format:

```xml
<channel source="telegram" chat_id="123" user="alice">
  User's message
</channel>
```
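
A plugin producing this envelope might be sketched as follows. The element and attribute names come from the format above. The function names are hypothetical, and escaping the attribute values and body is an assumption about safe handling of user text, not confirmed plugin behaviour.

```typescript
interface ChannelMessage {
  source: string;
  chatId: string;
  user: string;
  text: string;
}

// Escapes the characters that would otherwise break the envelope.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function formatChannelMessage(m: ChannelMessage): string {
  const attrs =
    `source="${escapeXml(m.source)}" ` +
    `chat_id="${escapeXml(m.chatId)}" ` +
    `user="${escapeXml(m.user)}"`;
  return `<channel ${attrs}>\n  ${escapeXml(m.text)}\n</channel>`;
}

const envelope = formatChannelMessage({
  source: "telegram",
  chatId: "123",
  user: "alice",
  text: "User's message",
});
```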

Code Structure

agent-network/        # repo root (github.com/sleep2agi/agent-network) — monorepo
├── server/            # CommHub Server (Bun + SQLite) → runs on Server
│   └── src/
│       ├── index.ts          # HTTP routing + MCP + SSE
│       ├── tools.ts          # 17 MCP Tools
│       ├── auth.ts           # Auth + permissions + network management
│       ├── db.ts             # Database + table definitions
│       ├── db-adapter.ts     # DB adapter layer (SQLite + abstract interface)
│       ├── push.ts           # SSE push management
│       └── password-dict.ts  # Weak password dictionary (v0.8 admin bootstrap)
├── agent-network/     # anet CLI + CommHub SDK → runs on Client
│   ├── bin/cli.ts            # CLI entry (full command list: [CLI docs](/en/guide/cli))
│   └── src/
│       ├── index.ts          # default export
│       ├── client.ts         # CommHub SDK client
│       ├── server.ts         # Server programmatic entry
│       └── node-server.ts    # Agent Node long-running server entry
├── agent-node/        # Agent runtime → runs on Client
│   └── src/cli.ts     # Four runtimes + task processing
├── channel/           # Claude Code Channel plugins → runs on Client
│   └── commhub-channel.ts
├── demos/             # Demo orchestrations
│   └── codex-telegram-squad/
└── docs/              # Design docs

Security Architecture

See Security Design for details. Key security measures:

  • Dual token authentication: utok_ (user-level) + ntok_ (network-level)
  • Network isolation: Server-side enforced network_id, clients cannot cross networks
  • RBAC with four permission levels: owner / admin / member / viewer
  • SQL injection protection: All queries are parameterized
  • Rate limiting: Registration 30/min, login 10/min per IP
  • Audit logging: All operations recorded
  • v0.8 RFC-001 Phase 2: COMMHUB_AUTH_TOKEN master token soft-deprecated (only /api/* read + deprecation warning); first anet hub start auto-bootstraps admin utok_ (~/.anet/server/admin-utok.json chmod 600) with default account admin / anethub; password strength ≥ 8 + weak-password dictionary; anet passwd / anet hub admin reset-user tools; anet doctor --fix probes and reissues expired ntok_. See RFC-001.
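
To make the per-IP rate limits above concrete, here is a minimal fixed-window limiter. The limits (30 registrations and 10 logins per minute) come from the list. The fixed-window algorithm, the class name, and the in-memory Map are assumptions, since the source does not say how CommHub counts requests.

```typescript
type LimitedAction = "register" | "login";

const LIMITS_PER_MINUTE: Record<LimitedAction, number> = { register: 30, login: 10 };
const WINDOW_MS = 60_000;

class FixedWindowLimiter {
  // Key is "<action>:<ip>"; each entry tracks one one-minute window.
  private windows = new Map<string, { start: number; count: number }>();

  allow(action: LimitedAction, ip: string, nowMs: number): boolean {
    const key = `${action}:${ip}`;
    const current = this.windows.get(key);
    // Start a new window on first use or once the old one has expired.
    if (!current || nowMs - current.start >= WINDOW_MS) {
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count >= LIMITS_PER_MINUTE[action]) return false;
    current.count += 1;
    return true;
  }
}

// Usage: twelve login attempts from one IP within the same minute.
const limiter = new FixedWindowLimiter();
let allowedLogins = 0;
for (let i = 0; i < 12; i++) {
  if (limiter.allow("login", "10.0.0.1", 1_000 + i)) allowedLogins += 1;
}
```

A fixed window is the simplest option but allows short bursts across a window boundary; a sliding window or token bucket would smooth that out at the cost of more state.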
