Some checks failed
Dev Build / build-test (pull_request) Failing after 1s
Build (Dev) / trigger-deploy (pull_request) Has been skipped
openclaw/grimm-review REJECTED — 6 blocking issues
Build (Dev) / build-go-backend (pull_request) Failing after 0s
Build (Dev) / build-frontend (pull_request) Failing after 1s
- Rename GatewayURL/GatewayPollInterval → GatewayRestURL/GatewayRestPollInterval
- Change Docker-aware defaults (host.docker.internal instead of localhost)
- Client.Start() waits for WS readiness (30s timeout), falls back to REST
- Client.SetWSClient()/MarkWSReady() for WS→REST coordination
- WSClient.SetRESTClient() so WS notifies REST on successful handshake
- main.go wires both clients: WS primary, REST fallback with cross-references
- .env.example documents WS_GATEWAY_URL, GATEWAY_TOKEN, REST fallback vars
- docker-compose.yml adds WS_GATEWAY_URL and GATEWAY_TOKEN env vars
- reference/CONTROL_CENTER_CONTEXT.md documents architecture and startup sequence
2.5 KiB
Control Center — Architecture Context
Current State
The Control Center backend uses a dual-path gateway client architecture:
- Primary path: WebSocket client (gateway.WSClient) connects to the OpenClaw gateway using WS protocol v3. It handles handshake, initial sync (agents.list + sessions.list RPCs), live event routing (sessions.changed, presence, agent.config), and automatic reconnection with exponential backoff.
- Fallback path: REST poller (gateway.Client) polls the gateway /api/agents endpoint on an interval. It only activates if the WS client fails to connect within 30 seconds of startup.
Live Gateway Connection
Startup Sequence
- Both WS client and REST client start concurrently
- REST client waits 30s for WS readiness signal (wsReady channel)
- If WS connects successfully → REST client stands down (logs "using WS — REST poller standing down")
- If WS is not ready within 30s → REST client falls back to polling (logs "WS not ready — falling back to REST polling")
- If no WS client configured → REST client polls immediately
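The wait-then-decide step in this sequence can be sketched as a single select over the readiness channel, a timer, and a shutdown signal. This is a minimal illustration, not the code in this PR: only the wsReady channel and the 30s window come from the text above, while awaitWS, startupMode, and the mode constants are hypothetical names. The done channel stands in for ctx.Done().

```go
package main

import (
	"fmt"
	"time"
)

// wsReadyTimeout mirrors the 30-second window described in the startup sequence.
const wsReadyTimeout = 30 * time.Second

// startupMode is a hypothetical type naming the outcome of the wait.
type startupMode string

const (
	modeStandDown startupMode = "stand-down" // WS became ready in time
	modePoll      startupMode = "poll"       // timeout, or no WS client configured
	modeCancelled startupMode = "cancelled"  // shutdown requested while waiting
)

// awaitWS blocks until wsReady is closed, the timeout elapses, or done is closed.
// A nil wsReady models "no WS client configured" and selects polling immediately.
func awaitWS(done, wsReady <-chan struct{}, timeout time.Duration) startupMode {
	if wsReady == nil {
		return modePoll
	}
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case <-wsReady:
		return modeStandDown
	case <-timer.C:
		return modePoll
	case <-done:
		return modeCancelled
	}
}

func main() {
	ready := make(chan struct{})
	close(ready) // simulate a successful WS handshake

	fmt.Println(awaitWS(nil, ready, wsReadyTimeout))
	// A short timeout keeps the demo fast; the real window is wsReadyTimeout.
	fmt.Println(awaitWS(nil, make(chan struct{}), 10*time.Millisecond))
	fmt.Println(awaitWS(nil, nil, wsReadyTimeout))
}
```

Using a closed channel as the readiness signal means any number of waiters see it, and a late waiter returns immediately.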
WebSocket Client (Primary)
- Config: WS_GATEWAY_URL (default: ws://host.docker.internal:18789/), OPENCLAW_GATEWAY_TOKEN
- Protocol: v3 handshake (challenge → connect → hello-ok)
- Initial sync: agents.list + sessions.list RPCs → persist → merge → broadcast fleet.update
- Live events: sessions.changed, presence, agent.config
- Reconnection: exponential backoff (1s → 2s → 4s → ... → 30s max)
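The reconnection schedule (1s, doubling, capped at 30s) can be expressed as a small pure function. This is a sketch under assumptions: nextBackoff, initialBackoff, and maxBackoff are illustrative names, and the actual client may add jitter or reset rules that are not described here.

```go
package main

import (
	"fmt"
	"time"
)

const (
	initialBackoff = 1 * time.Second  // first retry delay from the schedule above
	maxBackoff     = 30 * time.Second // cap from the schedule above
)

// nextBackoff doubles the current delay and clamps it to maxBackoff.
// A non-positive input restarts the schedule at initialBackoff.
func nextBackoff(current time.Duration) time.Duration {
	if current <= 0 {
		return initialBackoff
	}
	next := current * 2
	if next > maxBackoff {
		return maxBackoff
	}
	return next
}

func main() {
	delay := initialBackoff
	for attempt := 1; attempt <= 7; attempt++ {
		fmt.Printf("attempt %d: wait %v\n", attempt, delay)
		delay = nextBackoff(delay)
	}
}
```

A caller would typically reset the delay to initialBackoff after a successful handshake so that a later disconnect starts the schedule over.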
REST Poller (Fallback)
- Config: GATEWAY_URL (default: http://host.docker.internal:18789/api/agents), GATEWAY_POLL_INTERVAL (default: 5s)
- Only used when WS is unavailable
- Polls the /api/agents endpoint and syncs agent status changes
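As an illustration of how the documented defaults might be applied, the sketch below reads the variables named above and falls back to the listed defaults. The variable names and defaults come from this document. gatewayConfig and loadGatewayConfig are hypothetical, and the sketch assumes GATEWAY_POLL_INTERVAL uses Go duration syntax such as "5s", which the document does not confirm.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// Defaults as listed in the WebSocket Client and REST Poller sections.
const (
	defaultWSURL        = "ws://host.docker.internal:18789/"
	defaultRESTURL      = "http://host.docker.internal:18789/api/agents"
	defaultPollInterval = 5 * time.Second
)

// gatewayConfig is a hypothetical holder for the gateway settings.
type gatewayConfig struct {
	WSURL        string
	RESTURL      string
	PollInterval time.Duration
}

// loadGatewayConfig reads settings through lookup (os.LookupEnv in production)
// and keeps the default for any value that is missing, empty, or unparsable.
func loadGatewayConfig(lookup func(string) (string, bool)) gatewayConfig {
	cfg := gatewayConfig{
		WSURL:        defaultWSURL,
		RESTURL:      defaultRESTURL,
		PollInterval: defaultPollInterval,
	}
	if v, ok := lookup("WS_GATEWAY_URL"); ok && v != "" {
		cfg.WSURL = v
	}
	if v, ok := lookup("GATEWAY_URL"); ok && v != "" {
		cfg.RESTURL = v
	}
	// Assumption: the interval is written in Go duration syntax, e.g. "5s".
	if v, ok := lookup("GATEWAY_POLL_INTERVAL"); ok {
		if d, err := time.ParseDuration(v); err == nil && d > 0 {
			cfg.PollInterval = d
		}
	}
	return cfg
}

func main() {
	cfg := loadGatewayConfig(os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the loader testable without touching the process environment.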
Wiring
main.go
├── wsClient = NewWSClient(...)
├── restClient = NewClient(...)
├── wsClient.SetRESTClient(restClient) // WS notifies REST on ready
├── restClient.SetWSClient(wsClient) // REST defers to WS
├── go wsClient.Start(ctx) // primary
└── go restClient.Start(ctx) // fallback (waits for WS)
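The cross-reference in this wiring implies that the WS side signals readiness to the REST side. One way this could work is sketched below with stand-in types. SetRESTClient and MarkWSReady are named in the PR description, but their bodies here are assumptions, and restClient, wsClient, onHelloOK, and isWSReady are hypothetical names. The sync.Once guard is a design choice of this sketch: it makes the signal safe to send again after a reconnect.

```go
package main

import (
	"fmt"
	"sync"
)

// restClient is a stand-in for the REST poller; only the readiness signal is modelled.
type restClient struct {
	wsReady chan struct{}
	once    sync.Once
}

func newRESTClient() *restClient {
	return &restClient{wsReady: make(chan struct{})}
}

// MarkWSReady closes wsReady exactly once. Closing a channel twice panics,
// so the sync.Once guard keeps repeated handshakes harmless.
func (c *restClient) MarkWSReady() {
	c.once.Do(func() { close(c.wsReady) })
}

// isWSReady reports, without blocking, whether the signal has been sent.
func (c *restClient) isWSReady() bool {
	select {
	case <-c.wsReady:
		return true
	default:
		return false
	}
}

// wsClient is a stand-in for the WebSocket client.
type wsClient struct {
	rest *restClient
}

// SetRESTClient stores the REST client so the WS side can notify it.
func (w *wsClient) SetRESTClient(r *restClient) { w.rest = r }

// onHelloOK is a hypothetical hook invoked when the handshake reaches hello-ok.
func (w *wsClient) onHelloOK() {
	if w.rest != nil {
		w.rest.MarkWSReady()
	}
}

func main() {
	rest := newRESTClient()
	ws := &wsClient{}
	ws.SetRESTClient(rest)

	fmt.Println("ready before handshake:", rest.isWSReady())
	ws.onHelloOK()
	ws.onHelloOK() // a second handshake after reconnect must not panic
	fmt.Println("ready after handshake:", rest.isWSReady())
}
```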
Key Design Decisions
- Push over poll: WS is preferred for real-time updates; REST is a safety net
- DB first, then SSE: All event handlers persist to DB before broadcasting
- Graceful degradation: System works without WS; REST provides basic functionality
- No hard dependency on REST /api/agents: If WS is connected, REST endpoint is never called
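The "DB first, then SSE" rule can be illustrated with a handler that broadcasts only after a successful write. This is a sketch with hypothetical names (event, store, broadcaster, handleEvent, recorder), not the handlers from this PR. The recorder type exists only to make the call order observable.

```go
package main

import (
	"errors"
	"fmt"
)

// event is a minimal stand-in for a gateway event such as sessions.changed.
type event struct {
	Type string
}

// store and broadcaster are hypothetical interfaces for the DB and SSE layers.
type store interface {
	Save(e event) error
}

type broadcaster interface {
	Broadcast(e event)
}

// handleEvent persists the event first and broadcasts only if the write succeeded,
// so subscribers never see state that is missing from the database.
func handleEvent(s store, b broadcaster, e event) error {
	if err := s.Save(e); err != nil {
		return fmt.Errorf("persist %s: %w", e.Type, err)
	}
	b.Broadcast(e)
	return nil
}

// recorder implements both interfaces and records the order of calls.
type recorder struct {
	failSave bool
	calls    []string
}

func (r *recorder) Save(e event) error {
	if r.failSave {
		return errors.New("db unavailable")
	}
	r.calls = append(r.calls, "save:"+e.Type)
	return nil
}

func (r *recorder) Broadcast(e event) {
	r.calls = append(r.calls, "broadcast:"+e.Type)
}

func main() {
	r := &recorder{}
	if err := handleEvent(r, r, event{Type: "sessions.changed"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.calls)
}
```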