- Rename GatewayURL/GatewayPollInterval → GatewayRestURL/GatewayRestPollInterval
- Change Docker-aware defaults (host.docker.internal instead of localhost)
- Client.Start() waits for WS readiness (30s timeout), falls back to REST
- Client.SetWSClient()/MarkWSReady() for WS→REST coordination
- WSClient.SetRESTClient() so WS notifies REST on successful handshake
- main.go wires both clients: WS primary, REST fallback with cross-references
- .env.example documents WS_GATEWAY_URL, GATEWAY_TOKEN, REST fallback vars
- docker-compose.yml adds WS_GATEWAY_URL and GATEWAY_TOKEN env vars
- reference/CONTROL_CENTER_CONTEXT.md documents architecture and startup sequence
46 lines
2.5 KiB
Markdown
# Control Center — Architecture Context
## Current State
The Control Center backend uses a **dual-path gateway client** architecture:
- **Primary path**: WebSocket client (`gateway.WSClient`) connects to the OpenClaw gateway using WS protocol v3. It handles the handshake, initial sync (`agents.list` + `sessions.list` RPCs), live event routing (`sessions.changed`, `presence`, `agent.config`), and automatic reconnection with exponential backoff.
- **Fallback path**: REST poller (`gateway.Client`) polls the gateway `/api/agents` endpoint on an interval. It only activates if the WS client fails to connect within 30 seconds of startup.
## Live Gateway Connection
### Startup Sequence
1. Both WS client and REST client start concurrently
2. REST client waits up to 30s for the WS readiness signal (`wsReady` channel)
3. If WS connects successfully → REST client stands down (logs "using WS — REST poller standing down")
4. If WS is not ready within 30s → REST client falls back to polling (logs "WS not ready — falling back to REST polling")
5. If no WS client is configured → REST client polls immediately
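
The startup sequence above can be sketched as a single `select` over the readiness channel and a timer. This is a minimal illustration, not the project's code: `decideStartMode`, `simulate`, and the `startMode` values are hypothetical names, and representing "no WS client configured" as a nil channel is an assumption of this sketch.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// startMode describes which path the REST client ends up on (hypothetical type).
type startMode string

const (
	modeStandDown startMode = "stand-down" // WS is ready; REST poller stays idle
	modePoll      startMode = "poll"       // REST polls the gateway
	modeCancelled startMode = "cancelled"  // context ended before a decision
)

// decideStartMode mirrors the five startup steps: a nil channel (assumed to
// mean "no WS client configured") polls immediately; otherwise the caller
// waits up to timeout for the readiness signal.
func decideStartMode(ctx context.Context, wsReady <-chan struct{}, timeout time.Duration) startMode {
	if wsReady == nil {
		return modePoll
	}
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case <-wsReady:
		return modeStandDown
	case <-timer.C:
		return modePoll
	case <-ctx.Done():
		return modeCancelled
	}
}

// simulate closes the readiness channel after wsReadyAfter (never, if the
// value is negative) and reports the resulting mode.
func simulate(wsReadyAfter, timeout time.Duration) startMode {
	ready := make(chan struct{})
	if wsReadyAfter >= 0 {
		time.AfterFunc(wsReadyAfter, func() { close(ready) })
	}
	return decideStartMode(context.Background(), ready, timeout)
}

func main() {
	// WS handshake completes well inside the window.
	fmt.Println(simulate(10*time.Millisecond, 500*time.Millisecond))
	// WS never becomes ready, so the timer wins.
	fmt.Println(simulate(-1, 50*time.Millisecond))
	// No WS client configured at all.
	fmt.Println(decideStartMode(context.Background(), nil, 30*time.Second))
}
```

In the real client the timeout would be the 30s described above; the short durations here only keep the demonstration fast.
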
### WebSocket Client (Primary)
- Config: `WS_GATEWAY_URL` (default: `ws://host.docker.internal:18789/`), `OPENCLAW_GATEWAY_TOKEN`
- Protocol: v3 handshake (challenge → connect → hello-ok)
- Initial sync: `agents.list` + `sessions.list` RPCs → persist → merge → broadcast `fleet.update`
- Live events: `sessions.changed`, `presence`, `agent.config`
- Reconnection: exponential backoff (1s → 2s → 4s → ... → 30s max)
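
The reconnection schedule listed above (1s → 2s → 4s → ... → 30s max) amounts to doubling with a cap. The sketch below shows only that arithmetic; `nextBackoff` and the constant names are hypothetical, and whether the real client adds jitter or resets the delay after a successful connection is not stated in this document.

```go
package main

import (
	"fmt"
	"time"
)

const (
	initialBackoff = 1 * time.Second  // first retry delay, from the schedule above
	maxBackoff     = 30 * time.Second // cap, from the schedule above
)

// nextBackoff doubles the previous delay and caps it at maxBackoff.
// A zero or negative previous delay starts the schedule over.
func nextBackoff(prev time.Duration) time.Duration {
	if prev <= 0 {
		return initialBackoff
	}
	next := prev * 2
	if next > maxBackoff {
		return maxBackoff
	}
	return next
}

func main() {
	d := time.Duration(0)
	for attempt := 1; attempt <= 7; attempt++ {
		d = nextBackoff(d)
		fmt.Printf("attempt %d: wait %s\n", attempt, d)
	}
}
```

Running it prints delays of 1s, 2s, 4s, 8s, 16s, 30s, 30s: the sixth doubling would give 32s, so the cap takes over from there.
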
### REST Poller (Fallback)
- Config: `GATEWAY_URL` (default: `http://host.docker.internal:18789/api/agents`), `GATEWAY_POLL_INTERVAL` (default: 5s)
- Only used when WS is unavailable
- Polls the `/api/agents` endpoint and syncs agent status changes
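
As an illustration of the config line above, the following sketch reads the two variables and falls back to the documented defaults. `restConfig` and `loadRESTConfig` are hypothetical names, and parsing the interval with `time.ParseDuration` (so that `5s` is a valid value) is an assumption about the format, not something this document confirms.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

const (
	defaultGatewayURL   = "http://host.docker.internal:18789/api/agents"
	defaultPollInterval = "5s"
)

// restConfig holds the REST poller settings (hypothetical type).
type restConfig struct {
	URL          string
	PollInterval time.Duration
}

// loadRESTConfig reads GATEWAY_URL and GATEWAY_POLL_INTERVAL through getenv
// and applies the documented defaults when a variable is unset or empty.
func loadRESTConfig(getenv func(string) string) (restConfig, error) {
	url := getenv("GATEWAY_URL")
	if url == "" {
		url = defaultGatewayURL
	}
	raw := getenv("GATEWAY_POLL_INTERVAL")
	if raw == "" {
		raw = defaultPollInterval
	}
	// Assumption: the interval is written as a Go duration string such as "5s".
	interval, err := time.ParseDuration(raw)
	if err != nil {
		return restConfig{}, fmt.Errorf("GATEWAY_POLL_INTERVAL %q: %w", raw, err)
	}
	if interval <= 0 {
		return restConfig{}, fmt.Errorf("GATEWAY_POLL_INTERVAL must be positive, got %s", interval)
	}
	return restConfig{URL: url, PollInterval: interval}, nil
}

func main() {
	cfg, err := loadRESTConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("polling %s every %s\n", cfg.URL, cfg.PollInterval)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the defaults easy to exercise in tests without touching the process environment.
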
### Wiring
```
main.go
├── wsClient = NewWSClient(...)
├── restClient = NewClient(...)
├── wsClient.SetRESTClient(restClient)  // WS notifies REST on ready
├── restClient.SetWSClient(wsClient)    // REST defers to WS
├── go wsClient.Start(ctx)              // primary
└── go restClient.Start(ctx)            // fallback (waits for WS)
```
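
The cross-references in the diagram can be sketched with stand-in types. The method names `SetRESTClient`, `SetWSClient`, and `MarkWSReady` come from this change; everything else is assumed for illustration: the argument-free constructors, the `WSReady` helper, the `onHandshakeOK` hook, and the use of `sync.Once` to make repeated readiness signals safe.

```go
package main

import (
	"fmt"
	"sync"
)

// Client stands in for the REST poller (gateway.Client in the text).
type Client struct {
	ws      *WSClient
	wsReady chan struct{}
	once    sync.Once
}

// NewClient takes no arguments here; the real constructor's parameters are elided in the diagram.
func NewClient() *Client { return &Client{wsReady: make(chan struct{})} }

// SetWSClient records the primary client so the poller knows it must defer to it.
func (c *Client) SetWSClient(ws *WSClient) { c.ws = ws }

// MarkWSReady closes the readiness channel. sync.Once (an assumption of this
// sketch) keeps a second call, e.g. after a reconnect, from panicking.
func (c *Client) MarkWSReady() { c.once.Do(func() { close(c.wsReady) }) }

// WSReady reports, without blocking, whether readiness was signalled (hypothetical helper).
func (c *Client) WSReady() bool {
	select {
	case <-c.wsReady:
		return true
	default:
		return false
	}
}

// WSClient stands in for the WebSocket client (gateway.WSClient in the text).
type WSClient struct{ rest *Client }

func NewWSClient() *WSClient { return &WSClient{} }

// SetRESTClient lets the WS client notify the poller after a successful handshake.
func (w *WSClient) SetRESTClient(c *Client) { w.rest = c }

// onHandshakeOK is a hypothetical hook that would run once the gateway answers hello-ok.
func (w *WSClient) onHandshakeOK() {
	if w.rest != nil {
		w.rest.MarkWSReady()
	}
}

func main() {
	ws, rest := NewWSClient(), NewClient()
	ws.SetRESTClient(rest) // WS notifies REST on ready
	rest.SetWSClient(ws)   // REST defers to WS

	fmt.Println("ready before handshake:", rest.WSReady())
	ws.onHandshakeOK()
	ws.onHandshakeOK() // a later reconnect signals again; harmless
	fmt.Println("ready after handshake:", rest.WSReady())
}
```

Closing a channel, rather than sending on it, means any number of waiters observe readiness and late readers never block.
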
## Key Design Decisions
- **Push over poll**: WS is preferred for real-time updates; REST is a safety net
- **DB first, then SSE**: All event handlers persist to DB before broadcasting
- **Graceful degradation**: System works without WS; REST provides basic functionality
- **No hard dependency on REST /api/agents**: If WS is connected, REST endpoint is never called
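
The "DB first, then SSE" rule can be shown as a handler that broadcasts only after the write succeeds. This is a sketch under assumptions: the `Store` and `Broadcaster` interfaces, the `Event` type, and the in-memory fakes are all hypothetical, and skipping the broadcast when persistence fails is one plausible reading of the rule rather than documented behaviour.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a simplified gateway event (hypothetical shape).
type Event struct {
	Type    string
	Payload string
}

// Store and Broadcaster are hypothetical seams for the DB and the SSE hub.
type Store interface{ Save(Event) error }
type Broadcaster interface{ Broadcast(Event) }

// handleEvent persists first and broadcasts second, so subscribers never
// see an event the database does not have.
func handleEvent(s Store, b Broadcaster, ev Event) error {
	if err := s.Save(ev); err != nil {
		return fmt.Errorf("persist %s: %w", ev.Type, err)
	}
	b.Broadcast(ev)
	return nil
}

// memStore is an in-memory fake that can be told to fail.
type memStore struct {
	saved []Event
	fail  bool
}

func (m *memStore) Save(ev Event) error {
	if m.fail {
		return errors.New("db unavailable")
	}
	m.saved = append(m.saved, ev)
	return nil
}

// memBroadcaster records what would have been sent over SSE.
type memBroadcaster struct{ sent []Event }

func (m *memBroadcaster) Broadcast(ev Event) { m.sent = append(m.sent, ev) }

func main() {
	store, sse := &memStore{}, &memBroadcaster{}

	err := handleEvent(store, sse, Event{Type: "sessions.changed", Payload: "{}"})
	fmt.Println("error:", err, "saved:", len(store.saved), "sent:", len(sse.sent))

	store.fail = true // simulate a database outage
	err = handleEvent(store, sse, Event{Type: "presence", Payload: "{}"})
	fmt.Println("failed:", err != nil, "sent:", len(sse.sent))
}
```
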