CUB-229: Design camera auto-discovery and registration flow #14

Merged
overseer merged 2 commits from agent/dex/CUB-229-camera-auto-discovery into dev 2026-05-28 06:58:40 -04:00
+508
@@ -0,0 +1,508 @@
# Camera Auto-Discovery and Registration Flow — Design Document
> **Status:** Draft | **CUB:** 229 | **Date:** 2026-05-23
> **Depends on:** MQTT_CONTRACT.md v1.0.0 | **Affects:** CUB-189 (POST /cameras), CUB-232 (MQTT subscriber)
---
## 1. Overview
When a new ESP32 camera node powers on and connects to the travel router, it must self-register with the RemoteRig hub without any manual configuration. This document defines the auto-discovery protocol, message schemas, database extensions, error handling, and retry behavior.
### Design Goals
1. **Zero-touch provisioning** — ESP32 node registers itself on first MQTT connect; no dashboard interaction required
2. **Re-registration safe** — same node rejoining after a reboot or network blip is recognized and re-associated, not duplicated
3. **Idempotent** — replaying an announce due to MQTT retain or offline buffering does not create duplicate cameras
4. **Observable** — the dashboard receives real-time SSE events when a camera appears or reconnects
5. **Backward compatible** — existing announce format (`MQTT_CONTRACT.md`) is enhanced, not replaced
---
## 2. ESP32 Announce Message (Registration Request)
### Topic
```
remoterig/cameras/+/announce
```
**Direction:** ESP32 → Hub | **QoS:** 2 | **Retain:** true
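Published on the ESP32's first boot (or after a factory reset), and re-published only by the retry logic in Section 3. The `+` level is the hub's subscription wildcard; each node publishes to a concrete topic, because MQTT does not allow wildcards in a PUBLISH topic name. The message is retained so the hub still receives it if the hub starts or restarts after the ESP32 came online.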
### JSON Schema
```json
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "CameraAnnounce",
"type": "object",
"required": ["mac_address", "firmware_version", "capabilities"],
"properties": {
"mac_address": {
"type": "string",
"pattern": "^([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2})$",
"description": "ESP32 Wi-Fi station MAC address — the stable, globally unique hardware identifier"
},
"firmware_version": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$",
"description": "Semver of the ESP32 firmware (e.g. 0.2.0)"
},
"capabilities": {
"type": "array",
"items": { "type": "string", "enum": ["start_stop", "status", "reboot", "heartbeat"] },
"minItems": 1,
"description": "Supported feature flags. Minimal: [\"status\"]. Full: [\"start_stop\", \"status\", \"reboot\", \"heartbeat\"]"
},
"friendly_name": {
"type": "string",
"maxLength": 64,
"description": "Default human-readable name (e.g. 'ESP32-AA-BB-CC'). If omitted, hub generates one from the MAC."
},
"device_type": {
"type": "string",
"enum": ["esp32-gopro", "esp32-generic"],
"default": "esp32-gopro",
"description": "Device class for future multi-type support"
},
"mqtt_client_id": {
"type": "string",
"maxLength": 64,
"description": "The MQTT client ID the ESP32 connected with (diagnostic)"
},
"sdk_version": {
"type": "string",
"description": "ESP-IDF or Arduino SDK version (diagnostic)"
}
}
}
```
### Example — Minimal
```json
{
"mac_address": "AA:BB:CC:DD:EE:FF",
"firmware_version": "0.1.0",
"capabilities": ["status", "heartbeat"]
}
```
### Example — Full
```json
{
"mac_address": "AA:BB:CC:DD:EE:FF",
"firmware_version": "0.2.0",
"capabilities": ["start_stop", "status", "reboot", "heartbeat"],
"friendly_name": "GoPro Hero3 #1",
"device_type": "esp32-gopro",
"mqtt_client_id": "remoterig-ddeeff",
"sdk_version": "ESP-IDF v5.1.4"
}
```
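The schema above could be mirrored on the hub roughly as follows. This is a minimal sketch, not the hub's actual code: `CameraAnnounce` as a Go type, `DecodeAnnounce`, and `ValidationCode` are hypothetical names, and mapping "no recognized capability" to `CAPABILITY_REQUIRED` is an assumption based on the error table in Section 3. The schema defines no error code for a malformed `firmware_version`, so the sketch does not check it.

```go
package main

import (
	"encoding/json"
	"regexp"
)

// CameraAnnounce mirrors the CameraAnnounce JSON schema (hypothetical Go type).
type CameraAnnounce struct {
	MacAddress      string   `json:"mac_address"`
	FirmwareVersion string   `json:"firmware_version"`
	Capabilities    []string `json:"capabilities"`
	FriendlyName    string   `json:"friendly_name,omitempty"`
	DeviceType      string   `json:"device_type,omitempty"`
	MqttClientID    string   `json:"mqtt_client_id,omitempty"`
	SdkVersion      string   `json:"sdk_version,omitempty"`
}

// macPattern is the mac_address pattern copied from the schema.
var macPattern = regexp.MustCompile(`^([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2})$`)

// knownCapabilities is the capability enum from the schema.
var knownCapabilities = map[string]bool{
	"start_stop": true, "status": true, "reboot": true, "heartbeat": true,
}

// DecodeAnnounce parses an announce payload and applies the schema default
// for device_type when the field is omitted.
func DecodeAnnounce(payload []byte) (CameraAnnounce, error) {
	var a CameraAnnounce
	if err := json.Unmarshal(payload, &a); err != nil {
		return CameraAnnounce{}, err
	}
	if a.DeviceType == "" {
		a.DeviceType = "esp32-gopro"
	}
	return a, nil
}

// ValidationCode returns the registration error_code for an announce, or ""
// when the MAC and capabilities checks pass.
func ValidationCode(a CameraAnnounce) string {
	if !macPattern.MatchString(a.MacAddress) {
		return "INVALID_MAC"
	}
	valid := 0
	for _, c := range a.Capabilities {
		if knownCapabilities[c] {
			valid++
		}
	}
	if valid == 0 {
		// Assumption: an array with no recognized entries counts as empty.
		return "CAPABILITY_REQUIRED"
	}
	return ""
}
```

Decoding the minimal example and passing the result to `ValidationCode` yields an empty code, with `DeviceType` filled in as `esp32-gopro`.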
### MAC Address as Identity
The ESP32's Wi-Fi station MAC is the only stable, globally unique identifier available on a closed network (no cloud, no serial number burned at factory). It is:
- **Globally unique** — OUI-assigned by Espressif
- **Immutable** — persists across firmware flashes and reboots
- **Available before MQTT connect** — no dependency on hub-assigned ID
The hub maps `mac_address → camera_id`. The `camera_id` (e.g. `cam-001`) is a short, human-friendly alias assigned at registration time.
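Because the schema accepts either `:` or `-` separators and mixed case, the `mac_address → camera_id` lookup only works reliably if the MAC is stored in one canonical form. The sketch below assumes upper-case, colon-separated storage, which this document does not specify; `NormalizeMAC` and `ClientID` are hypothetical helpers. `ClientID` follows the `remoterig-<mac_last6>` convention used in the sequence diagram.

```go
package main

import "strings"

// NormalizeMAC converts a MAC that already matched the schema pattern into an
// assumed canonical form: upper-case hex with ':' separators.
func NormalizeMAC(mac string) string {
	return strings.ToUpper(strings.ReplaceAll(mac, "-", ":"))
}

// ClientID derives the MQTT client ID "remoterig-<mac_last6>" from a MAC,
// using the last six hex digits in lower case.
func ClientID(mac string) string {
	hex := strings.ReplaceAll(NormalizeMAC(mac), ":", "")
	if len(hex) > 6 {
		hex = hex[len(hex)-6:]
	}
	return "remoterig-" + strings.ToLower(hex)
}
```

With this normalization, `aa-bb-cc-dd-ee-ff` and `AA:BB:CC:DD:EE:FF` resolve to the same row, and both produce the client ID `remoterig-ddeeff` shown in the full example.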
---
## 3. Hub Response Protocol
When the hub processes an announce, it MUST publish a response so the ESP32 knows its registration outcome. The response goes to the **command topic** for the assigned camera.
### Response Topic
```
remoterig/cameras/<camera_id>/command
```
**Direction:** Hub → ESP32 | **QoS:** 2 | **Retain:** false
### Response Schema
```json
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "RegistrationResponse",
"type": "object",
"required": ["command", "request_id"],
"properties": {
"command": {
"type": "string",
"enum": ["registered", "registration_error"],
"description": "Outcome of the registration request"
},
"request_id": {
"type": "string",
"description": "Correlation ID built from the announcing node's MAC address and a timestamp (e.g. req-AABBCCDDEEFF-1684771200)"
},
"camera_id": {
"type": "string",
"pattern": "^cam-\\d{3}$",
"description": "Assigned camera ID (present on success only)"
},
"error_code": {
"type": "string",
"enum": ["INVALID_MAC", "CAPABILITY_REQUIRED", "DB_WRITE_FAILED", "RATE_LIMITED"],
"description": "Machine-readable error code (present on failure only)"
},
"error_message": {
"type": "string",
"description": "Human-readable error description (present on failure only)"
},
"retry_after_sec": {
"type": "integer",
"minimum": 5,
"description": "Suggested retry delay in seconds (present on transient failures only; omitted for fatal errors)"
},
"timestamp": {
"type": "string",
"format": "date-time",
"description": "ISO 8601 — hub clock time of the response"
}
}
}
```
### Success Response Example
```json
{
"command": "registered",
"request_id": "req-AABBCCDDEEFF-1684771200",
"camera_id": "cam-004",
"timestamp": "2026-05-23T14:30:00Z"
}
```
### Error Responses
| error_code | Meaning | retry_after_sec | ESP32 action |
|---|---|---|---|
| `INVALID_MAC` | MAC address absent or malformed | — (fatal) | Log error, halt registration |
| `CAPABILITY_REQUIRED` | No valid capabilities specified | — (fatal) | Log error, halt registration |
| `DB_WRITE_FAILED` | Hub database is unavailable (disk full, etc.) | 60 | Retry after delay |
| `RATE_LIMITED` | Too many registration attempts in a window | 30 | Retry after delay |
Example error response:
```json
{
"command": "registration_error",
"request_id": "req-AABBCCDDEEFF-1684771200",
"error_code": "DB_WRITE_FAILED",
"error_message": "Database write failed: disk I/O error",
"retry_after_sec": 60,
"timestamp": "2026-05-23T14:30:00Z"
}
```
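One way the hub could build these responses is sketched below. The type and constructor names (`RegistrationResponse`, `NewRegistered`, `NewRegistrationError`, `IsFatal`) are hypothetical; the fatal/transient split follows the error table above. Timestamps are passed in as Unix seconds purely to keep the example deterministic.

```go
package main

import "time"

// RegistrationResponse mirrors the RegistrationResponse JSON schema.
type RegistrationResponse struct {
	Command       string `json:"command"`
	RequestID     string `json:"request_id"`
	CameraID      string `json:"camera_id,omitempty"`
	ErrorCode     string `json:"error_code,omitempty"`
	ErrorMessage  string `json:"error_message,omitempty"`
	RetryAfterSec int    `json:"retry_after_sec,omitempty"`
	Timestamp     string `json:"timestamp,omitempty"`
}

// IsFatal reports whether the ESP32 should stop retrying after this code.
// Unknown codes are treated as fatal here, which is an assumption.
func IsFatal(code string) bool {
	switch code {
	case "DB_WRITE_FAILED", "RATE_LIMITED":
		return false
	default:
		return true
	}
}

// stamp formats hub clock time as an ISO 8601 (RFC 3339) UTC string.
func stamp(unixSec int64) string {
	return time.Unix(unixSec, 0).UTC().Format(time.RFC3339)
}

// NewRegistered builds a success response carrying the assigned camera ID.
func NewRegistered(requestID, cameraID string, unixSec int64) RegistrationResponse {
	return RegistrationResponse{
		Command:   "registered",
		RequestID: requestID,
		CameraID:  cameraID,
		Timestamp: stamp(unixSec),
	}
}

// NewRegistrationError builds a failure response. retry_after_sec is dropped
// for fatal codes so the field only appears on transient failures.
func NewRegistrationError(requestID, code, msg string, retryAfterSec int, unixSec int64) RegistrationResponse {
	if IsFatal(code) {
		retryAfterSec = 0 // omitted from JSON via omitempty
	}
	return RegistrationResponse{
		Command:       "registration_error",
		RequestID:     requestID,
		ErrorCode:     code,
		ErrorMessage:  msg,
		RetryAfterSec: retryAfterSec,
		Timestamp:     stamp(unixSec),
	}
}
```

Marshalling the result with `encoding/json` produces the same field layout as the examples above, with failure-only fields absent on success and `camera_id` absent on failure.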
### ESP32 Retry Logic
```
ESP32 publishes announce (QoS 2, retain)
├── Subscribe to remoterig/cameras/+/command (QoS 2)
├── Wait for command = "registered" or "registration_error"
├── Timeout after 30s → retry announce (with exponential backoff)
│ ├── 1st attempt: immediate
│ ├── 2nd attempt: wait 5s
│ ├── 3rd attempt: wait 10s
│ ├── 4th attempt: wait 20s
│ └── 5th+ attempt: wait 30s, repeat every 30s
├── On success (registered): store camera_id in NVS, begin normal status loop
├── On fatal error (INVALID_MAC, CAPABILITY_REQUIRED):
│ Log error, blink LED pattern, do not retry
└── On transient error (DB_WRITE_FAILED, RATE_LIMITED):
Wait retry_after_sec (capped at 120s), then re-publish announce
```
**After successful registration:** On subsequent boots, the ESP32 reads `camera_id` from NVS (non-volatile storage). It does NOT re-publish announce unless:
- `camera_id` is missing from NVS (factory reset / first boot)
- The hub publishes `command: "reregister"` to force re-registration (admin action)
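The retry schedule above can be expressed as two small functions. The firmware itself would presumably be written in C or C++; Go is used here only to keep the document's examples in one language, and `AnnounceDelay` and `TransientDelay` are hypothetical names.

```go
package main

import "time"

// AnnounceDelay returns how long to wait before the given announce attempt
// (1-based) when no response arrived within the 30s timeout.
func AnnounceDelay(attempt int) time.Duration {
	switch {
	case attempt <= 1:
		return 0 // 1st attempt: immediate
	case attempt == 2:
		return 5 * time.Second
	case attempt == 3:
		return 10 * time.Second
	case attempt == 4:
		return 20 * time.Second
	default:
		return 30 * time.Second // 5th attempt onward: every 30s
	}
}

// TransientDelay converts a hub-suggested retry_after_sec into a wait,
// capped at 120s as described above. Negative values are treated as zero.
func TransientDelay(retryAfterSec int) time.Duration {
	if retryAfterSec < 0 {
		retryAfterSec = 0
	}
	if retryAfterSec > 120 {
		retryAfterSec = 120
	}
	return time.Duration(retryAfterSec) * time.Second
}
```

Capping the hub's suggestion on the device side means a misconfigured hub cannot park a node for longer than two minutes.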
---
## 4. Hub Processing Logic
### Registration Flow
```
Hub receives announce on remoterig/cameras/+/announce
├── 1. VALIDATE: mac_address present? matches pattern? → if no: publish INVALID_MAC error
├── 2. VALIDATE: capabilities non-empty? → if no: publish CAPABILITY_REQUIRED error
├── 3. RATE LIMIT: >5 registrations from same MAC in 60s? → RATE_LIMITED error
├── 4. LOOKUP: SELECT camera_id FROM cameras WHERE mac_address = ?
│ │
│ ├── FOUND → EXISTING CAMERA:
│ │ ├── Update: friendly_name, firmware_version, capabilities, updated_at
│ │ ├── Publish registered response with existing camera_id
│ │ ├── SSE broadcast: "camera_reconnected"
│ │ └── Clear MQTT stale announce (publish empty retained message)
│ │
│ └── NOT FOUND → NEW CAMERA:
│ ├── Generate camera_id: "cam-NNN" (sequential)
│ ├── INSERT into cameras
│ ├── Publish registered response with new camera_id
│ ├── SSE broadcast: "camera_registered"
│ └── Clear MQTT stale announce (publish empty retained message)
└── 5. CLEANUP: Publish zero-byte retained message to announce topic
(prevents stale announces after camera is registered)
```
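The lookup-or-create step (step 4) is what makes registration idempotent. The sketch below uses an in-memory map as a stand-in for the `cameras` table, so it illustrates the branching only and not the SQLite access; `Registry`, `Register`, and `ShouldProcess` are hypothetical names. `ShouldProcess` covers one consequence of step 5: the zero-byte message that clears the retained announce is also delivered to current subscribers, so the hub's own handler needs to skip empty payloads.

```go
package main

import (
	"fmt"
	"sync"
)

// Registry is an in-memory stand-in for the cameras table, keyed by MAC.
type Registry struct {
	mu    sync.Mutex
	byMAC map[string]string // mac_address -> camera_id
	next  int               // next sequential number for cam-NNN
}

// NewRegistry returns an empty registry whose first ID will be cam-001.
func NewRegistry() *Registry {
	return &Registry{byMAC: make(map[string]string), next: 1}
}

// ShouldProcess reports whether an announce payload needs handling. Empty
// payloads are the hub's own "clear retained message" publishes.
func ShouldProcess(payload []byte) bool {
	return len(payload) > 0
}

// Register returns the camera_id for a MAC and the SSE event to broadcast.
// A known MAC keeps its existing ID; a new MAC gets the next cam-NNN.
func (r *Registry) Register(mac string) (cameraID, event string) {
	r.mu.Lock()
	defer r.mu.Unlock()

	if id, ok := r.byMAC[mac]; ok {
		return id, "camera_reconnected"
	}
	id := fmt.Sprintf("cam-%03d", r.next)
	r.next++
	r.byMAC[mac] = id
	return id, "camera_registered"
}
```

Announcing the same MAC twice returns the same `camera_id` both times and only changes the event name, which is the behavior Design Goals 2 and 3 ask for. In the real hub the lookup and insert would need to run in one transaction to give the same guarantee.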
### Rate Limiting
To protect against buggy firmware or network loops:
| Window | Max Attempts | Action |
|--------|-------------|--------|
| 60 seconds | 5 per MAC | Reject with `RATE_LIMITED`, `retry_after_sec: 30` |
| 5 minutes | 20 per MAC | Reject with `RATE_LIMITED`, `retry_after_sec: 60` |
Rate limit state is in-memory only (not persisted). Restarting the hub resets the counters.
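A sliding-window limiter covering both rows of the table might look like the sketch below. It is one possible approach, not a settled design: `RateLimiter` and `Allow` are hypothetical names, time is passed in as Unix seconds to keep the example deterministic, and rejected attempts are not counted against the window (the table does not say either way).

```go
package main

import "sync"

// limit is one row of the rate-limit table.
type limit struct {
	windowSec     int64
	maxAttempts   int
	retryAfterSec int
}

// limits lists the longer window first so that, when both are exceeded,
// the larger retry_after_sec is returned.
var limits = []limit{
	{windowSec: 300, maxAttempts: 20, retryAfterSec: 60},
	{windowSec: 60, maxAttempts: 5, retryAfterSec: 30},
}

// RateLimiter keeps per-MAC attempt timestamps in memory only.
type RateLimiter struct {
	mu       sync.Mutex
	attempts map[string][]int64
}

// NewRateLimiter returns a limiter with no recorded attempts.
func NewRateLimiter() *RateLimiter {
	return &RateLimiter{attempts: make(map[string][]int64)}
}

// Allow records an accepted attempt at nowUnix and returns (true, 0), or
// returns (false, retryAfterSec) when a window is already full.
func (l *RateLimiter) Allow(mac string, nowUnix int64) (bool, int) {
	l.mu.Lock()
	defer l.mu.Unlock()

	// Drop timestamps older than the longest window (5 minutes).
	kept := l.attempts[mac][:0]
	for _, t := range l.attempts[mac] {
		if t > nowUnix-300 {
			kept = append(kept, t)
		}
	}
	l.attempts[mac] = kept

	for _, lim := range limits {
		count := 0
		for _, t := range kept {
			if t > nowUnix-lim.windowSec {
				count++
			}
		}
		if count >= lim.maxAttempts {
			return false, lim.retryAfterSec
		}
	}
	l.attempts[mac] = append(kept, nowUnix)
	return true, 0
}
```

With this logic the first five announces from a MAC inside 60 seconds are accepted and the sixth is rejected with `retry_after_sec: 30`. Creating a fresh `RateLimiter` at startup matches the "counters reset on restart" behavior.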
---
## 5. Database Schema Changes
### Extended `cameras` Table
```sql
-- Migration: 002_add_camera_registration_fields.sql
ALTER TABLE cameras ADD COLUMN firmware_version TEXT;
ALTER TABLE cameras ADD COLUMN capabilities TEXT NOT NULL DEFAULT '["status"]';
ALTER TABLE cameras ADD COLUMN device_type TEXT NOT NULL DEFAULT 'esp32-gopro';
ALTER TABLE cameras ADD COLUMN registration_status TEXT NOT NULL DEFAULT 'pending'
CHECK(registration_status IN ('pending', 'registered', 'error', 'decommissioned'));
ALTER TABLE cameras ADD COLUMN last_announce_at DATETIME;
ALTER TABLE cameras ADD COLUMN registration_error TEXT;
ALTER TABLE cameras ADD COLUMN mqtt_client_id TEXT;
-- Index for MAC lookups (already exists but confirm)
-- CREATE INDEX IF NOT EXISTS idx_cameras_mac ON cameras(mac_address);
-- Index for registration status filtering
CREATE INDEX IF NOT EXISTS idx_cameras_reg_status ON cameras(registration_status);
-- Index for finding stale registrations (cameras that announced but never sent status)
CREATE INDEX IF NOT EXISTS idx_cameras_last_announce ON cameras(last_announce_at);
```
### Full Table Definition (post-migration)
| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| `camera_id` | TEXT | PK | Hub-assigned short ID, e.g. `cam-001` |
| `friendly_name` | TEXT | NOT NULL | Human-readable name |
| `mac_address` | TEXT | UNIQUE | ESP32 Wi-Fi station MAC |
| `firmware_version` | TEXT | — | Firmware semver reported by ESP32 |
| `capabilities` | TEXT | NOT NULL, DEFAULT `'["status"]'` | JSON array of strings |
| `device_type` | TEXT | NOT NULL, DEFAULT `'esp32-gopro'` | Device class |
| `registration_status` | TEXT | NOT NULL, DEFAULT `'pending'` | `pending`, `registered`, `error`, `decommissioned` |
| `last_announce_at` | DATETIME | — | Timestamp of most recent announce |
| `registration_error` | TEXT | — | Last registration error message (cleared on success) |
| `mqtt_client_id` | TEXT | — | MQTT client ID from the announce |
| `created_at` | DATETIME | NOT NULL, DEFAULT `datetime('now')` | First registration timestamp |
| `updated_at` | DATETIME | NOT NULL, DEFAULT `datetime('now')` | Last update timestamp |
### Go Model Extension
The existing `models.Camera` struct gains:
```go
import "time"

// Camera is one row of the cameras table. Fields marked "new" are added by
// migration 002_add_camera_registration_fields.sql.
type Camera struct {
	CameraID           string     `json:"camera_id"`
	FriendlyName       string     `json:"friendly_name"`
	MacAddress         string     `json:"mac_address,omitempty"`
	FirmwareVersion    string     `json:"firmware_version,omitempty"`   // new
	Capabilities       []string   `json:"capabilities"`                 // new; stored as a JSON-encoded TEXT column
	DeviceType         string     `json:"device_type"`                  // new
	RegistrationStatus string     `json:"registration_status"`          // new
	LastAnnounceAt     *time.Time `json:"last_announce_at,omitempty"`   // new; nil until the first announce
	RegistrationError  string     `json:"registration_error,omitempty"` // new; cleared on success
	MqttClientID       string     `json:"mqtt_client_id,omitempty"`     // new
	CreatedAt          time.Time  `json:"created_at"`
	UpdatedAt          time.Time  `json:"updated_at"`
}
```
> **Note on `capabilities` storage:** SQLite does not have a native JSON array type. Store as TEXT (JSON-encoded array). Serialize/deserialize in the Go model layer. Migration default is `'["status"]'` — the minimum capability for a useful camera.
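
One way to keep that serialization in the model layer is a named slice type that implements the standard `driver.Valuer` and `sql.Scanner` interfaces, so `database/sql` converts it on every read and write. This is a sketch of one option; the type name `Capabilities` and the choice to write a nil slice as `[]` are assumptions, not part of this design.

```go
package main

import (
	"database/sql/driver"
	"encoding/json"
	"fmt"
)

// Capabilities is a []string that is stored as a JSON-encoded TEXT column.
type Capabilities []string

// Value implements driver.Valuer: it encodes the slice as a JSON array.
// A nil slice is written as "[]" so the NOT NULL column never sees "null".
func (c Capabilities) Value() (driver.Value, error) {
	if c == nil {
		c = Capabilities{}
	}
	b, err := json.Marshal([]string(c))
	if err != nil {
		return nil, err
	}
	return string(b), nil
}

// Scan implements sql.Scanner: it decodes the TEXT column back into a slice.
// SQLite drivers may hand back either string or []byte for TEXT.
func (c *Capabilities) Scan(src any) error {
	var raw []byte
	switch v := src.(type) {
	case string:
		raw = []byte(v)
	case []byte:
		raw = v
	default:
		return fmt.Errorf("capabilities: unsupported column type %T", src)
	}
	return json.Unmarshal(raw, c)
}
```

If the `Camera.Capabilities` field used this type instead of a bare `[]string`, it could be passed directly to `Exec` and `Scan` calls without separate marshalling code at each call site.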
---
## 6. Registration Flow Sequence Diagram
```mermaid
sequenceDiagram
participant ESP32
participant Broker as MQTT Broker (Mosquitto)
participant Hub as Go Hub
participant DB as SQLite
participant SSE as SSE Hub
participant UI as Dashboard UI
Note over ESP32: Power on / First boot
ESP32->>ESP32: Read camera_id from NVS
alt camera_id NOT in NVS (first boot or factory reset)
ESP32->>Broker: CONNECT (client_id: remoterig-<mac_last6>)
Broker-->>ESP32: CONNACK
ESP32->>Broker: SUBSCRIBE remoterig/cameras/+/command (QoS 2)
Broker-->>ESP32: SUBACK
ESP32->>Broker: PUBLISH announce (QoS 2, retain)
Note over ESP32,Broker: {mac_address, firmware_version, capabilities, ...}
Broker->>Hub: Forward announce
Hub->>Hub: Validate: MAC present? capabilities non-empty?
alt Validation fails
Hub->>Broker: PUBLISH command {command: "registration_error", error_code: "INVALID_MAC"}
Broker->>ESP32: Forward error
Note over ESP32: Log error, halt (fatal)
else Validation passes
Hub->>Hub: Rate limit check
alt Rate limited
Hub->>Broker: PUBLISH command {error_code: "RATE_LIMITED", retry_after_sec: 30}
Broker->>ESP32: Forward error
Note over ESP32: Wait retry_after_sec, retry
else Allowed
Hub->>DB: SELECT camera_id WHERE mac_address = ?
alt MAC already registered
DB-->>Hub: camera_id = "cam-002"
Hub->>DB: UPDATE cameras SET firmware_version, capabilities, friendly_name, ...
Hub->>SSE: Broadcast "camera_reconnected"
else New MAC
DB-->>Hub: no rows
Hub->>DB: SELECT MAX(camera_id) → "cam-003"
Hub->>Hub: Generate "cam-004"
Hub->>DB: INSERT INTO cameras (cam-004, ...)
Hub->>SSE: Broadcast "camera_registered"
end
Hub->>Broker: PUBLISH command {command: "registered", camera_id: "cam-002" or "cam-004"}
Broker->>ESP32: Forward registration response
Hub->>Broker: PUBLISH announce (zero-byte retain) — clear stale announce
SSE-->>UI: camera_registered / camera_reconnected event
UI->>UI: Show new camera card in grid
end
end
else camera_id FOUND in NVS (subsequent boot)
Note over ESP32: Skip announce, proceed to status loop
ESP32->>Broker: PUBLISH status (QoS 1, retain)
Broker->>Hub: Forward status
Hub->>SSE: Broadcast camera_status
SSE-->>UI: Update camera card
end
```
---
## 7. Reconnection vs. Registration
It is critical to distinguish three scenarios:
### Scenario A: Reconnection (camera was previously registered)
```
ESP32 boots → reads camera_id from NVS → publishes status on remoterig/cameras/<id>/status
→ Hub sees status on a known camera_id → updates online flag → SSE broadcast
```
**No announce published.** The camera already has its identity.
### Scenario B: First Registration (or factory reset)
```
ESP32 boots → NVS empty → publishes announce → Hub assigns camera_id →
ESP32 stores camera_id in NVS → begins status loop on remoterig/cameras/<id>/status
```
### Scenario C: Hub Restart (ESP32 already running)
```
Hub restarts → subscribes to remoterig/cameras/+/announce →
MQTT broker delivers retained announce messages →
Hub processes each → re-registration safe (MAC already exists → update only)
```
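This is why announce messages use `retain: true`: if the hub was down or restarting when an ESP32 announced, the broker delivers the retained announce as soon as the hub resubscribes. Because the hub clears the retained message after processing it (Section 4), only announces that have not yet been processed are redelivered this way.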
---
## 8. Security Considerations
| Concern | Mitigation |
|---------|-----------|
| Rogue node spoofing a MAC | Closed network (travel router, no internet). MAC filtering at the router level as defense-in-depth (future). |
| Replay attacks | Announce is idempotent — replaying it only updates timestamps, doesn't create duplicates. |
| Denial of registration | Rate limiting (Section 4) prevents flooding. |
| Unauthorized decommission | No `decommission` MQTT command exists. Decommission is admin-only via HTTP API with API key auth. |
---
## 9. Open Questions & Decisions
| Question | Decision | Rationale |
|----------|----------|-----------|
| **MAC as identity?** | ✅ Yes | Only globally unique, immutable ID available on a closed network. |
| **`camera_id` format?** | `cam-NNN` (zero-padded sequential) | Short, sortable, human-friendly. Collision-free with DB sequence. |
| **Re-registration behavior?** | Update existing, don't create duplicate | Announcing with same MAC = reconnection, not new camera. |
| **Retain on announce?** | ✅ Yes, cleared after processing | Allows hub restart recovery. Cleanup prevents stale data. |
| **Response protocol?** | Publish to `command` topic | Reuses existing command channel. ESP32 subscribes before publishing announce. |
| **Capabilities stored?** | ✅ Yes, in `capabilities` column | Enables future feature gating (e.g., "this camera can't start/stop recording"). |
| **`device_type` added?** | ✅ Yes, default `esp32-gopro` | Allows future camera types (e.g., Raspberry Pi CSI, USB webcam). |
| **Dashboard rename after auto-registration?** | ✅ Yes (via existing POST /cameras or settings API in future) | Already called out in MQTT_CONTRACT.md. No new work in this CUB. |
| **NVS key for camera_id?** | `"cam_id"` | Simple, unambiguous. |
---
## 10. Implementation Plan
This design document covers the protocol and schema design. Implementation is tracked in the following sub-issues:
| CUB | Title | Agent | Depends On |
|-----|-------|-------|------------|
| CUB-229 | Design camera auto-discovery and registration flow | Dex | — (this task) |
| CUB-229a | Migration: add registration fields to cameras table | Hex | CUB-229 |
| CUB-229b | Go model update: Camera struct with new fields | Dex | CUB-229a |
| CUB-229c | MQTT subscriber: registration response protocol | Dex | CUB-229b |
| CUB-229d | Rate limiting for announce messages | Dex | CUB-229b |
| CUB-229e | SSE events: camera_registered / camera_reconnected | Dex | CUB-229c |
| CUB-229f | ESP32 firmware: NVS storage + announce on first boot | Pip | CUB-229 |
| CUB-229g | ESP32 firmware: command subscription + registration ACK handling | Pip | CUB-229c |
| CUB-229h | Update MQTT_CONTRACT.md with registration response spec | Dex | CUB-229 |
| CUB-229i | Integration test: camera auto-registration end-to-end | Dex/Pip | CUB-229e, CUB-229g |
---
## 11. References
- [MQTT_CONTRACT.md](../MQTT_CONTRACT.md) — Network topology, topic hierarchy, existing status/heartbeat/command schemas
- [CONTEXT.md](../CONTEXT.md) — RemoteRig tech stack, directory layout, database schema
- [CUB-230 (Offline Buffer & Replay)](https://linear.app/cubecraft-creations/issue/CUB-230) — Related: offline buffering uses same dedup strategy
- [CUB-232 (MQTT Subscriber)](https://linear.app/cubecraft-creations/issue/CUB-232) — The subscriber that will implement this registration logic
- [CUB-189 (POST /cameras)](https://linear.app/cubecraft-creations/issue/CUB-189) — HTTP registration endpoint (may be replaced/supplemented by auto-discovery)