CUB-200: fix event-loss race — register handlers before readLoop starts
Move the registerEventHandlers() call ahead of the readLoop goroutine start in connectAndRun(). This closes the startup window in which live gateway events were read and dropped as 'unhandled' because handlers were registered only after initialSync completed. The handlers depend only on c.agents and c.broker, which are wired in the constructor, so they do not require initialSync to have completed.

Also add TestConnectAndRun_EventNotLostDuringSync, a regression test that sends a live presence event during initial sync and asserts it is not lost. All gateway tests pass with -race.
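The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not the actual WSClient code: the dispatcher type, the run helper, and the "presence" event name are all hypothetical stand-ins, and a channel takes the place of the gateway connection. It assumes only that the read loop drops any event for which no handler is registered at dispatch time.

```go
package main

import (
	"fmt"
	"sync"
)

// dispatcher is a hypothetical stand-in for the client's event routing table.
type dispatcher struct {
	mu       sync.Mutex
	handlers map[string]func(payload string)
	dropped  int // events that arrived with no handler registered
}

func newDispatcher() *dispatcher {
	return &dispatcher{handlers: make(map[string]func(string))}
}

func (d *dispatcher) register(event string, h func(string)) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.handlers[event] = h
}

// dispatch routes one event; an event without a handler is counted as dropped.
func (d *dispatcher) dispatch(event, payload string) {
	d.mu.Lock()
	h, ok := d.handlers[event]
	if !ok {
		d.dropped++
	}
	d.mu.Unlock()
	if ok {
		h(payload)
	}
}

// run simulates one connection. registerFirst selects the fixed ordering
// (handlers before read loop); false selects the old ordering (handlers
// after the simulated initial sync). It returns (handled, dropped).
func run(registerFirst bool) (int, int) {
	d := newDispatcher()
	handled := 0
	register := func() {
		d.register("presence", func(string) { handled++ })
	}
	if registerFirst {
		register()
	}

	events := make(chan string)
	processed := make(chan struct{})
	go func() { // simulated read loop
		for ev := range events {
			d.dispatch(ev, "online")
			processed <- struct{}{}
		}
		close(processed)
	}()

	// Simulated initial sync: a live event arrives and is fully
	// dispatched while the sync is still in flight.
	events <- "presence"
	<-processed

	if !registerFirst {
		register()
	}
	close(events)
	<-processed // returns once the read loop has exited
	return handled, d.dropped
}

func main() {
	h, dr := run(false)
	fmt.Printf("register after sync:  handled=%d dropped=%d\n", h, dr)
	h, dr = run(true)
	fmt.Printf("register before loop: handled=%d dropped=%d\n", h, dr)
}
```

The sketch waits for the event to be fully dispatched before the late registration so that the outcome is deterministic. In the real client the window is timing-dependent, which is presumably why the regression test injects the event during initial sync.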
@@ -229,26 +229,31 @@ func (c *WSClient) connectAndRun(ctx context.Context) error {
 	c.connId = helloOK.ConnID
 	c.connMu.Unlock()

-	// Step 2b: Start the read loop in a goroutine so that Send() in
+	// Step 2b: Register live event handlers BEFORE starting the read
+	// loop. This eliminates the race window where readLoop dispatches
+	// live events as "unhandled" because no handlers are registered yet.
+	// The handlers only depend on c.agents and c.broker, which are wired
+	// in the constructor — they do not need initialSync to have completed.
+	c.registerEventHandlers()
+
+	// Step 2c: Start the read loop in a goroutine so that Send() in
 	// initialSync can receive responses. The read loop goroutine will
 	// continue running after initialSync completes, routing live events
-	// and any future RPC responses.
+	// and any future RPC responses. Because handlers are already
+	// registered, any events arriving during or after initialSync are
+	// dispatched correctly.
 	readLoopErrCh := make(chan error, 1)
 	go func() {
 		readLoopErrCh <- c.readLoop(ctx, conn)
 	}()

-	// Step 2c: Initial sync — fetch agents + sessions from gateway.
-	// This now works because the read loop is active and will route
+	// Step 2d: Initial sync — fetch agents + sessions from gateway.
+	// This works because the read loop is active and will route
 	// response frames back to Send() via handleResponse.
 	if err := c.initialSync(ctx); err != nil {
 		c.logger.Warn("initial sync failed, will continue with read loop", "error", err)
 	}

-	// Step 2d: Register live event handlers (read loop is already
-	// active, so events will be dispatched immediately)
-	c.registerEventHandlers()
-
 	// Notify REST client that WS is live so it stands down.
 	// This must happen AFTER initialSync so that the REST poller
 	// doesn't start polling while we're still syncing.
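The comments in the hunk above say the read loop must be running before initialSync because Send() waits for response frames that the loop routes back via handleResponse. One common way to build that kind of routing is a map of pending requests keyed by request ID. The sketch below illustrates the idea under that assumption only; the frame and client types, the ID scheme, the "sync" payload, and the in-memory fake gateway are all hypothetical and are not taken from the WSClient source.

```go
package main

import (
	"fmt"
	"sync"
)

// frame is a hypothetical wire message carrying a correlation ID.
type frame struct {
	ID      int
	Payload string
}

// client is an in-memory stand-in for a WebSocket client.
type client struct {
	mu      sync.Mutex
	nextID  int
	pending map[int]chan frame // requests awaiting a response, by ID
	out     chan frame         // frames "written" to the gateway
}

func newClient() *client {
	return &client{pending: make(map[int]chan frame), out: make(chan frame, 8)}
}

// send registers a pending request, writes it, and blocks until the
// read loop delivers the matching response.
func (c *client) send(payload string) frame {
	c.mu.Lock()
	c.nextID++
	id := c.nextID
	ch := make(chan frame, 1)
	c.pending[id] = ch
	c.mu.Unlock()

	c.out <- frame{ID: id, Payload: payload}
	return <-ch
}

// handleResponse hands a response frame to the waiting send call, if any.
func (c *client) handleResponse(f frame) bool {
	c.mu.Lock()
	ch, ok := c.pending[f.ID]
	delete(c.pending, f.ID)
	c.mu.Unlock()
	if ok {
		ch <- f
	}
	return ok
}

// readLoop consumes incoming frames until the input channel is closed.
func (c *client) readLoop(in <-chan frame) {
	for f := range in {
		c.handleResponse(f)
	}
}

// roundTrip wires up a fake gateway that answers every request and
// performs a single request/response exchange.
func roundTrip(payload string) string {
	c := newClient()
	in := make(chan frame)
	go c.readLoop(in) // must be running before send, or send never returns
	go func() {       // fake gateway: reply to each request with the same ID
		for req := range c.out {
			in <- frame{ID: req.ID, Payload: "ok:" + req.Payload}
		}
		close(in)
	}()
	resp := c.send(payload)
	close(c.out)
	return resp.Payload
}

func main() {
	fmt.Println(roundTrip("sync"))
}
```

If the readLoop goroutine were started after the send call, send would block indefinitely on its response channel. That is the constraint that keeps the read loop ahead of initialSync in the step ordering.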