CUB-200: Implement WebSocket Gateway Client #42

Merged
overseer merged 6 commits from agent/dex/CUB-200-ws-gateway-client into dev 2026-05-21 06:54:55 -04:00

Summary

Replaces the REST poller as the primary gateway connection with a WebSocket client speaking the OpenClaw gateway protocol (v3). The REST poller is retained as a fallback.

Changes

  • wsclient.go: WebSocket client with v3 handshake (connect.challenge → connect → hello-ok), frame routing (req/res/event), JSON-RPC Send(), auto-reconnect with exponential backoff (1s → 30s max)
  • sync.go: Initial sync via agents.list + sessions.list RPCs — merges session runtime state into AgentCardData, broadcasts fleet.update
  • events.go: Real-time event handlers for sessions.changed, presence, and agent.config — DB update first, then SSE broadcast
  • client.go: REST poller retained as fallback (WS is primary path)
  • config.go: Added GATEWAY_WS_URL and OPENCLAW_GATEWAY_TOKEN env vars
  • main.go: Wires WS client as primary, REST as fallback
  • .env.example: Documents new WS config vars
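
The frame routing mentioned for wsclient.go can be pictured with a minimal sketch. The envelope field names (`type`, `id`, `event`, `payload`) and the `classify` helper are assumptions made for illustration; the PR only states that frames are routed as req/res/event.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// frame is a hypothetical envelope. The field names are assumptions and are
// not taken from the gateway protocol definition.
type frame struct {
	Type    string          `json:"type"`            // "req", "res", or "event"
	ID      string          `json:"id,omitempty"`    // correlates a res with its req
	Event   string          `json:"event,omitempty"` // event name for "event" frames
	Payload json.RawMessage `json:"payload,omitempty"`
}

// classify decodes one raw message and reports where it would be routed.
func classify(raw []byte) (string, error) {
	var f frame
	if err := json.Unmarshal(raw, &f); err != nil {
		return "", fmt.Errorf("decode frame: %w", err)
	}
	switch f.Type {
	case "res":
		return "response:" + f.ID, nil // would be delivered to a pending Send()
	case "event":
		return "event:" + f.Event, nil // would be delivered to an event handler
	case "req":
		return "request:" + f.ID, nil // gateway-initiated request
	default:
		return "", fmt.Errorf("unknown frame type %q", f.Type)
	}
}

func main() {
	route, err := classify([]byte(`{"type":"event","event":"presence"}`))
	fmt.Println(route, err)
}
```

Switching on a single discriminator field keeps the read path to one decode per message; the payload stays as raw JSON until the target handler knows its shape.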

Protocol Flow

  1. Connect to ws://<gateway-host>:18789/
  2. Handshake: wait for connect.challenge → send connect with auth token → receive hello-ok
  3. On connect: call agents.list + sessions.list RPCs for initial state
  4. Subscribe to broadcast events (sessions.changed, presence, agent.config)
  5. Merge agent config + session runtime into AgentCardData
  6. Push changes through existing SSE broker
  7. Auto-reconnect with exponential backoff (1s → 30s max)
  8. Fall back to seeded demo data and REST polling if the WS connection fails
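
Step 7 can be sketched as a small delay function. The PR gives only the bounds (1s → 30s max); the doubling factor and the names `baseBackoff`, `maxBackoff`, and `nextBackoff` are assumptions for illustration.

```go
package main

import (
	"fmt"
	"time"
)

const (
	baseBackoff = 1 * time.Second  // first retry delay, from the PR description
	maxBackoff  = 30 * time.Second // cap, from the PR description
)

// nextBackoff doubles the previous delay and clamps the result to
// [baseBackoff, maxBackoff]. Doubling is an assumed growth factor.
func nextBackoff(prev time.Duration) time.Duration {
	if prev < baseBackoff {
		return baseBackoff
	}
	next := prev * 2
	if next > maxBackoff {
		return maxBackoff
	}
	return next
}

func main() {
	d := time.Duration(0)
	for i := 0; i < 7; i++ {
		d = nextBackoff(d)
		fmt.Println(d) // 1s, 2s, 4s, 8s, 16s, 30s, 30s
	}
}
```

A reconnect loop would typically reset the delay to zero after a successful handshake so that a later disconnect starts again at 1s.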

Config Vars

  • GATEWAY_WS_URL — WebSocket endpoint (default: ws://host.docker.internal:18789/)
  • OPENCLAW_GATEWAY_TOKEN — auth token for gateway
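
A minimal sketch of how these two variables might be read, assuming the documented default for the URL. The `gatewayConfig` type, the `loadGatewayConfig` function, and the injected lookup parameter are hypothetical and may not match config.go.

```go
package main

import (
	"fmt"
	"os"
)

// Default endpoint as documented in the PR description.
const defaultGatewayWSURL = "ws://host.docker.internal:18789/"

// gatewayConfig is a hypothetical holder for the two new settings.
type gatewayConfig struct {
	WSURL string
	Token string
}

// loadGatewayConfig reads both variables through lookup, which is
// os.LookupEnv in normal use and can be replaced in tests.
func loadGatewayConfig(lookup func(string) (string, bool)) gatewayConfig {
	cfg := gatewayConfig{WSURL: defaultGatewayWSURL}
	if v, ok := lookup("GATEWAY_WS_URL"); ok && v != "" {
		cfg.WSURL = v
	}
	if v, ok := lookup("OPENCLAW_GATEWAY_TOKEN"); ok {
		cfg.Token = v
	}
	return cfg
}

func main() {
	cfg := loadGatewayConfig(os.LookupEnv)
	// Print only whether a token is present, never the token itself.
	fmt.Println("gateway:", cfg.WSURL, "token set:", cfg.Token != "")
}
```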

Build Verification

  • go build ./... → PASS
  • go vet ./... → PASS

Closes CUB-200

Dex added 1 commit 2026-05-20 07:33:26 -04:00
CUB-200: implement WebSocket gateway client with v3 protocol
Dev Build / build-test (pull_request) Has been cancelled
Dev Build / deploy-dev (pull_request) Has been cancelled
d28d6e8dac
Replace REST poller with WebSocket client as primary gateway connection:

- wsclient.go: WebSocket client with v3 handshake (connect.challenge →
  connect → hello-ok), frame routing (req/res/event), JSON-RPC Send(),
  auto-reconnect with exponential backoff (1s → 30s max)
- sync.go: Initial sync via agents.list + sessions.list RPCs, merge
  session runtime state into AgentCardData, broadcast fleet.update
- events.go: Real-time event handlers for sessions.changed, presence,
  and agent.config — DB update first, then SSE broadcast
- client.go: REST poller retained as fallback (WS is primary)
- config.go: Add GATEWAY_WS_URL and OPENCLAW_GATEWAY_TOKEN env vars
- main.go: Wire WS client as primary, REST as fallback
- .env.example: Document new WS config vars

Fallback: If WS connection fails, seeded demo data + REST polling
remain available.
overseer added 1 commit 2026-05-20 08:12:38 -04:00
Merge branch 'dev' into agent/dex/CUB-200-ws-gateway-client
Dev Build & Deploy / test-and-build (pull_request) Failing after 0s
Dev Build & Deploy / docker-build-push (pull_request) Has been skipped
6fd2d9bec4
Dex added 1 commit 2026-05-20 12:30:02 -04:00
CUB-200: sync CI workflows with dev branch
Dev Build & Deploy / test-and-build (pull_request) Failing after 0s
Dev Build & Deploy / docker-build-push (pull_request) Has been skipped
d9a1640b10
- Overwrite dev.yml with dev's consolidated version (parameterized Go/Node versions, cleaner install steps)
- Add deploy-dev.yaml from dev (was missing on this branch)
- build-dev.yaml confirmed absent (was deleted on dev in PR #45)
Dex added 1 commit 2026-05-20 17:26:24 -04:00
CUB-200: resolve merge conflicts with dev — adopt dev's consolidated workflows and improved Go gateway code
Dev Build & Deploy / test-and-build (pull_request) Failing after 0s
Dev Build & Deploy / docker-build-push (pull_request) Has been skipped
1b82e1d3a6
Dex added 1 commit 2026-05-20 17:42:37 -04:00
CUB-200: fix WS initial sync ordering — start readLoop before initialSync
Dev Build & Deploy / test-and-build (pull_request) Failing after 0s
Dev Build & Deploy / docker-build-push (pull_request) Has been skipped
d370d5ec23
The root cause of the initial sync timeout was that connectAndRun called
initialSync (which uses Send/RPC) before starting readLoop, but Send's
response delivery depends on readLoop→routeFrame→handleResponse. Without
the readLoop running, agents.list and sessions.list would always time out.

Fix: start readLoop in a goroutine before calling initialSync so that
RPC responses are properly routed back to pending Send() calls. After
initialSync completes, event handlers are registered and MarkWSReady
is called. The connectAndRun function then blocks on the readLoop
goroutine's completion.
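
The ordering problem can be reproduced without a real socket. The sketch below is an assumption-heavy stand-in: channels replace the WebSocket, the method name doubles as the request ID, and `client`, `send`, `fakeGateway`, and `syncWithOrdering` are hypothetical names, not the code in wsclient.go. It shows only why a response cannot reach a pending call unless a read loop is running.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

type response struct {
	ID     string
	Result string
}

// client is a channel-backed stand-in for the WS client.
type client struct {
	mu      sync.Mutex
	pending map[string]chan response
	out     chan string   // requests written to the gateway
	in      chan response // frames read from the gateway
}

func newClient() *client {
	return &client{
		pending: make(map[string]chan response),
		out:     make(chan string, 8),
		in:      make(chan response, 8),
	}
}

// send registers a pending call, writes the request, and waits for readLoop
// to deliver the matching response.
func (c *client) send(id string, timeout time.Duration) (string, error) {
	ch := make(chan response, 1)
	c.mu.Lock()
	c.pending[id] = ch
	c.mu.Unlock()
	c.out <- id
	select {
	case r := <-ch:
		return r.Result, nil
	case <-time.After(timeout):
		c.mu.Lock()
		delete(c.pending, id)
		c.mu.Unlock()
		return "", errors.New("rpc timeout: " + id)
	}
}

// readLoop routes each incoming frame to its pending call until in is closed.
func (c *client) readLoop() {
	for r := range c.in {
		c.mu.Lock()
		ch, ok := c.pending[r.ID]
		delete(c.pending, r.ID)
		c.mu.Unlock()
		if ok {
			ch <- r
		}
	}
}

// fakeGateway answers every request, then closes the inbound side.
func fakeGateway(c *client) {
	for id := range c.out {
		c.in <- response{ID: id, Result: "ok:" + id}
	}
	close(c.in)
}

// syncWithOrdering runs the two initial-sync RPCs, optionally starting the
// read loop first.
func syncWithOrdering(readLoopFirst bool) error {
	c := newClient()
	defer close(c.out)
	go fakeGateway(c)
	if readLoopFirst {
		go c.readLoop()
	}
	for _, m := range []string{"agents.list", "sessions.list"} {
		if _, err := c.send(m, 200*time.Millisecond); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	fmt.Println("readLoop first:", syncWithOrdering(true))
	fmt.Println("sync first:", syncWithOrdering(false))
}
```

With the read loop running, both calls return; without it, the gateway's reply sits unread in the inbound buffer and the first call times out.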

Also added TestConnectAndRun_InitialSyncOrdering which verifies that
agents are persisted from initial sync (would hang/timeout under the
old ordering).
Dex added 1 commit 2026-05-20 17:52:49 -04:00
CUB-200: fix event-loss race — register handlers before readLoop starts
Dev Build & Deploy / test-and-build (pull_request) Failing after 0s
Dev Build & Deploy / docker-build-push (pull_request) Has been skipped
b7b05bb4e3
Move registerEventHandlers() call before the readLoop goroutine starts
in connectAndRun(). This eliminates the startup window where live gateway
events were actively read and dropped as 'unhandled' because handler
registration happened only after initialSync completed.

The handlers only depend on c.agents and c.broker, which are wired in the
constructor — they do not require initialSync to have completed.
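
The race can be illustrated with a small dispatcher. This is a sketch under assumptions: `dispatcher`, `on`, `dispatch`, and `run` are hypothetical names, and a pre-filled channel stands in for a live event that arrives while initial sync is still in flight.

```go
package main

import (
	"fmt"
	"sync"
)

// dispatcher maps event names to handlers and records events that arrive
// with no handler registered.
type dispatcher struct {
	mu       sync.RWMutex
	handlers map[string]func(payload string)
	dropped  []string
}

func newDispatcher() *dispatcher {
	return &dispatcher{handlers: make(map[string]func(string))}
}

func (d *dispatcher) on(event string, h func(string)) {
	d.mu.Lock()
	d.handlers[event] = h
	d.mu.Unlock()
}

func (d *dispatcher) dispatch(event, payload string) {
	d.mu.RLock()
	h, ok := d.handlers[event]
	d.mu.RUnlock()
	if !ok {
		d.mu.Lock()
		d.dropped = append(d.dropped, event) // logged as 'unhandled' in the PR
		d.mu.Unlock()
		return
	}
	h(payload)
}

// run replays one startup with a presence event arriving during sync and
// returns how many events were handled and how many were dropped.
func run(registerFirst bool) (handled, dropped int) {
	d := newDispatcher()
	register := func() { d.on("presence", func(string) { handled++ }) }

	events := make(chan string, 1)
	events <- "presence" // live event arriving during initial sync
	close(events)

	if registerFirst {
		register() // new ordering: handlers exist before any frame is read
	}
	done := make(chan struct{})
	go func() { // stand-in for readLoop
		defer close(done)
		for ev := range events {
			d.dispatch(ev, "{}")
		}
	}()
	<-done // stand-in for initialSync finishing after the event was read
	if !registerFirst {
		register() // old ordering: too late for the event above
	}
	return handled, len(d.dropped)
}

func main() {
	fmt.Println(run(true))  // 1 0
	fmt.Println(run(false)) // 0 1
}
```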

Also adds TestConnectAndRun_EventNotLostDuringSync regression test that
sends a live presence event during initial sync and asserts it is not lost.

All gateway tests pass with -race.
overseer merged commit fd60b0bb57 into dev 2026-05-21 06:54:55 -04:00
overseer deleted branch agent/dex/CUB-200-ws-gateway-client 2026-05-21 06:54:56 -04:00
Reference: CubeCraft-Creations/Control-Center#42