diff --git a/docs/design/camera-auto-discovery.md b/docs/design/camera-auto-discovery.md
new file mode 100644
index 0000000..05f97d9
--- /dev/null
+++ b/docs/design/camera-auto-discovery.md
@@ -0,0 +1,508 @@
+# Camera Auto-Discovery and Registration Flow — Design Document
+
+> **Status:** Draft | **CUB:** 229 | **Date:** 2026-05-23
+> **Depends on:** MQTT_CONTRACT.md v1.0.0 | **Affects:** CUB-189 (POST /cameras), CUB-232 (MQTT subscriber)
+
+---
+
+## 1. Overview
+
+When a new ESP32 camera node powers on and connects to the travel router, it must self-register with the RemoteRig hub without any manual configuration. This document defines the auto-discovery protocol, message schemas, database extensions, error handling, and retry behavior.
+
+### Design Goals
+
+1. **Zero-touch provisioning** — ESP32 node registers itself on first MQTT connect; no dashboard interaction required
+2. **Re-registration safe** — same node rejoining after a reboot or network blip is recognized and re-associated, not duplicated
+3. **Idempotent** — replaying an announce due to MQTT retain or offline buffering does not create duplicate cameras
+4. **Observable** — the dashboard receives real-time SSE events when a camera appears or reconnects
+5. **Backward compatible** — existing announce format (`MQTT_CONTRACT.md`) is enhanced, not replaced
+
+---
+
+## 2. ESP32 Announce Message (Registration Request)
+
+### Topic
+
+```
+remoterig/cameras/+/announce
+```
+
+**Direction:** ESP32 → Hub | **QoS:** 2 | **Retain:** true
+
+Published once on ESP32 first boot (or factory reset). Retained so the hub sees it even if it restarts after the ESP32 came online.
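Note that `+` is MQTT's single-level wildcard. It is only valid in a subscription filter, so the node has to publish to a concrete topic that the hub's filter matches. The sketch below illustrates that matching rule and how the wildcard level can be recovered on the hub side. It is an illustration under assumptions, not the hub's actual code: `matchTopic` is a hypothetical helper, and this document does not specify what value the node places in the wildcard position.

```go
package main

import "strings"

// announceFilter is the subscription filter given in the Topic section.
const announceFilter = "remoterig/cameras/+/announce"

// matchTopic (hypothetical helper) reports whether a concrete topic matches
// an MQTT filter that uses only the single-level wildcard "+". It returns
// the topic levels that the wildcards matched, in order. The multi-level
// wildcard "#" is deliberately not handled in this sketch.
func matchTopic(filter, topic string) ([]string, bool) {
	f := strings.Split(filter, "/")
	t := strings.Split(topic, "/")
	// "+" matches exactly one level, so both sides need the same depth.
	if len(f) != len(t) {
		return nil, false
	}
	var wild []string
	for i := range f {
		switch {
		case f[i] == "+":
			wild = append(wild, t[i])
		case f[i] != t[i]:
			return nil, false
		}
	}
	return wild, true
}
```

With this filter, a call such as `matchTopic(announceFilter, "remoterig/cameras/aabbccddeeff/announce")` matches and yields the level `aabbccddeeff` (an assumed example value), while a topic with no middle level, such as `remoterig/cameras/announce`, does not match.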
+
+### JSON Schema
+
+```json
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "title": "CameraAnnounce",
+  "type": "object",
+  "required": ["mac_address", "firmware_version", "capabilities"],
+  "properties": {
+    "mac_address": {
+      "type": "string",
+      "pattern": "^([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2})$",
+      "description": "ESP32 Wi-Fi station MAC address — the stable, globally unique hardware identifier"
+    },
+    "firmware_version": {
+      "type": "string",
+      "pattern": "^\\d+\\.\\d+\\.\\d+$",
+      "description": "Semver of the ESP32 firmware (e.g. 0.2.0)"
+    },
+    "capabilities": {
+      "type": "array",
+      "items": { "type": "string", "enum": ["start_stop", "status", "reboot", "heartbeat"] },
+      "minItems": 1,
+      "description": "Supported feature flags. Minimal: [\"status\"]. Full: [\"start_stop\", \"status\", \"reboot\", \"heartbeat\"]"
+    },
+    "friendly_name": {
+      "type": "string",
+      "maxLength": 64,
+      "description": "Default human-readable name (e.g. 'ESP32-AA-BB-CC'). If omitted, hub generates one from the MAC."
+    },
+    "device_type": {
+      "type": "string",
+      "enum": ["esp32-gopro", "esp32-generic"],
+      "default": "esp32-gopro",
+      "description": "Device class for future multi-type support"
+    },
+    "mqtt_client_id": {
+      "type": "string",
+      "maxLength": 64,
+      "description": "The MQTT client ID the ESP32 connected with (diagnostic)"
+    },
+    "sdk_version": {
+      "type": "string",
+      "description": "ESP-IDF or Arduino SDK version (diagnostic)"
+    }
+  }
+}
+```
+
+### Example — Minimal
+
+```json
+{
+  "mac_address": "AA:BB:CC:DD:EE:FF",
+  "firmware_version": "0.1.0",
+  "capabilities": ["status", "heartbeat"]
+}
+```
+
+### Example — Full
+
+```json
+{
+  "mac_address": "AA:BB:CC:DD:EE:FF",
+  "firmware_version": "0.2.0",
+  "capabilities": ["start_stop", "status", "reboot", "heartbeat"],
+  "friendly_name": "GoPro Hero3 #1",
+  "device_type": "esp32-gopro",
+  "mqtt_client_id": "remoterig-ddeeff",
+  "sdk_version": "ESP-IDF v5.1.4"
+}
+```
+
+### MAC Address as Identity
+
+The ESP32's Wi-Fi station MAC is the only stable, globally unique identifier available on a closed network (no cloud, no serial number burned at factory). It is:
+
+- **Globally unique** — OUI-assigned by Espressif
+- **Immutable** — persists across firmware flashes and reboots
+- **Available before MQTT connect** — no dependency on hub-assigned ID
+
+The hub maps `mac_address → camera_id`. The `camera_id` (e.g. `cam-001`) is a short, human-friendly alias assigned at registration time.
+
+---
+
+## 3. Hub Response Protocol
+
+When the hub processes an announce, it MUST publish a response so the ESP32 knows its registration outcome. The response goes to the **command topic** for the assigned camera.
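Before it can respond, the hub has to check the announce payload against the Section 2 schema and decide which outcome applies. The following Go sketch shows one way that check could look. It rests on several assumptions that are not stated in this document: `validateAnnounce` and `Announce` are hypothetical names, unparseable JSON is reported as `INVALID_MAC` because no dedicated code is defined, unknown capability strings are dropped rather than rejected, and the MAC is normalized to upper case with colons so that `aa-bb-...` and `AA:BB:...` map to the same camera row.

```go
package main

import (
	"encoding/json"
	"regexp"
	"strings"
)

// Announce mirrors the required fields of the CameraAnnounce schema.
type Announce struct {
	MacAddress      string   `json:"mac_address"`
	FirmwareVersion string   `json:"firmware_version"`
	Capabilities    []string `json:"capabilities"`
}

// macPattern is the mac_address pattern from the schema.
var macPattern = regexp.MustCompile(`^([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2})$`)

// knownCapabilities is the capabilities enum from the schema.
var knownCapabilities = map[string]bool{
	"start_stop": true, "status": true, "reboot": true, "heartbeat": true,
}

// validateAnnounce (hypothetical helper) parses an announce payload and
// returns the normalized announce plus an error code. An empty code means
// the payload passed both validation steps.
func validateAnnounce(payload []byte) (Announce, string) {
	var a Announce
	// Assumption: a payload that is not valid JSON is treated as a missing MAC.
	if err := json.Unmarshal(payload, &a); err != nil {
		return Announce{}, "INVALID_MAC"
	}
	if !macPattern.MatchString(a.MacAddress) {
		return Announce{}, "INVALID_MAC"
	}
	// Assumption: normalize so both separator styles and cases compare equal.
	a.MacAddress = strings.ToUpper(strings.ReplaceAll(a.MacAddress, "-", ":"))

	// Assumption: keep only capabilities listed in the schema enum.
	var valid []string
	for _, c := range a.Capabilities {
		if knownCapabilities[c] {
			valid = append(valid, c)
		}
	}
	if len(valid) == 0 {
		return Announce{}, "CAPABILITY_REQUIRED"
	}
	a.Capabilities = valid
	return a, ""
}
```

A non-empty code from this helper would map directly onto the `error_code` field of the response described below, and an empty code means the hub can proceed to the lookup step.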
+
+### Response Topic
+
+```
+remoterig/cameras//command
+```
+
+**Direction:** Hub → ESP32 | **QoS:** 2 | **Retain:** false
+
+### Response Schema
+
+```json
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "title": "RegistrationResponse",
+  "type": "object",
+  "required": ["command", "request_id"],
+  "properties": {
+    "command": {
+      "type": "string",
+      "enum": ["registered", "registration_error"],
+      "description": "Outcome of the registration request"
+    },
+    "request_id": {
+      "type": "string",
+      "description": "Echo of the announce message's MAC + timestamp hash for correlation"
+    },
+    "camera_id": {
+      "type": "string",
+      "pattern": "^cam-\\d{3}$",
+      "description": "Assigned camera ID (present on success only)"
+    },
+    "error_code": {
+      "type": "string",
+      "enum": ["INVALID_MAC", "CAPABILITY_REQUIRED", "DB_WRITE_FAILED", "RATE_LIMITED"],
+      "description": "Machine-readable error code (present on failure only)"
+    },
+    "error_message": {
+      "type": "string",
+      "description": "Human-readable error description (present on failure only)"
+    },
+    "retry_after_sec": {
+      "type": "integer",
+      "minimum": 5,
+      "description": "Suggested retry delay in seconds (present on failure only)"
+    },
+    "timestamp": {
+      "type": "string",
+      "format": "date-time",
+      "description": "ISO 8601 — hub clock time of the response"
+    }
+  }
+}
+```
+
+### Success Response Example
+
+```json
+{
+  "command": "registered",
+  "request_id": "req-AABBCCDDEEFF-1684771200",
+  "camera_id": "cam-004",
+  "timestamp": "2026-05-23T14:30:00Z"
+}
+```
+
+### Error Responses
+
+| error_code | Meaning | retry_after_sec | ESP32 action |
+|---|---|---|---|
+| `INVALID_MAC` | MAC address absent or malformed | — (fatal) | Log error, halt registration |
+| `CAPABILITY_REQUIRED` | No valid capabilities specified | — (fatal) | Log error, halt registration |
+| `DB_WRITE_FAILED` | Hub database is unavailable (disk full, etc.) | 60 | Retry after delay |
+| `RATE_LIMITED` | Too many registration attempts in a window | 30 | Retry after delay |
+
+Example error response:
+
+```json
+{
+  "command": "registration_error",
+  "request_id": "req-AABBCCDDEEFF-1684771200",
+  "error_code": "DB_WRITE_FAILED",
+  "error_message": "Database write failed: disk I/O error",
+  "retry_after_sec": 60,
+  "timestamp": "2026-05-23T14:30:00Z"
+}
+```
+
+### ESP32 Retry Logic
+
+```
+ESP32 publishes announce (QoS 2, retain)
+  │
+  ├── Subscribe to remoterig/cameras/+/command (QoS 2)
+  │
+  ├── Wait for command = "registered" or "registration_error"
+  │
+  ├── Timeout after 30s → retry announce (with exponential backoff)
+  │   ├── 1st attempt: immediate
+  │   ├── 2nd attempt: wait 5s
+  │   ├── 3rd attempt: wait 10s
+  │   ├── 4th attempt: wait 20s
+  │   └── 5th+ attempt: wait 30s, repeat every 30s
+  │
+  ├── On success (registered): store camera_id in NVS, begin normal status loop
+  │
+  ├── On fatal error (INVALID_MAC, CAPABILITY_REQUIRED):
+  │     Log error, blink LED pattern, do not retry
+  │
+  └── On transient error (DB_WRITE_FAILED, RATE_LIMITED):
+        Wait retry_after_sec (capped at 120s), then re-publish announce
+```
+
+**After successful registration:** On subsequent boots, the ESP32 reads `camera_id` from NVS (non-volatile storage). It does NOT re-publish announce unless:
+- `camera_id` is missing from NVS (factory reset / first boot)
+- The hub publishes `command: "reregister"` to force re-registration (admin action)
+
+---
+
+## 4. Hub Processing Logic
+
+### Registration Flow
+
+```
+Hub receives announce on remoterig/cameras/+/announce
+  │
+  ├── 1. VALIDATE: mac_address present? matches pattern? → if no: publish INVALID_MAC error
+  │
+  ├── 2. VALIDATE: capabilities non-empty? → if no: publish CAPABILITY_REQUIRED error
+  │
+  ├── 3. RATE LIMIT: >5 registrations from same IP/MAC in 60s? → RATE_LIMITED error
+  │
+  ├── 4. LOOKUP: SELECT camera_id FROM cameras WHERE mac_address = ?
+  │     │
+  │     ├── FOUND → EXISTING CAMERA:
+  │     │     ├── Update: friendly_name, firmware_version, capabilities, updated_at
+  │     │     ├── Publish registered response with existing camera_id
+  │     │     ├── SSE broadcast: "camera_reconnected"
+  │     │     └── Clear MQTT stale announce (publish empty retained message)
+  │     │
+  │     └── NOT FOUND → NEW CAMERA:
+  │           ├── Generate camera_id: "cam-NNN" (sequential)
+  │           ├── INSERT into cameras
+  │           ├── Publish registered response with new camera_id
+  │           ├── SSE broadcast: "camera_registered"
+  │           └── Clear MQTT stale announce (publish empty retained message)
+  │
+  └── 5. CLEANUP: Publish zero-byte retained message to announce topic
+        (prevents stale announces after camera is registered)
+```
+
+### Rate Limiting
+
+To protect against buggy firmware or network loops:
+
+| Window | Max Attempts | Action |
+|--------|-------------|--------|
+| 60 seconds | 5 per MAC | Reject with `RATE_LIMITED`, `retry_after_sec: 30` |
+| 5 minutes | 20 per MAC | Reject with `RATE_LIMITED`, `retry_after_sec: 60` |
+
+Rate limit state is in-memory only (not persisted). Restarting the hub resets the counters.
+
+---
+
+## 5. Database Schema Changes
+
+### Extended `cameras` Table
+
+```sql
+-- Migration: 002_add_camera_registration_fields.sql
+
+ALTER TABLE cameras ADD COLUMN firmware_version TEXT;
+ALTER TABLE cameras ADD COLUMN capabilities TEXT NOT NULL DEFAULT '["status"]';
+ALTER TABLE cameras ADD COLUMN device_type TEXT NOT NULL DEFAULT 'esp32-gopro';
+ALTER TABLE cameras ADD COLUMN registration_status TEXT NOT NULL DEFAULT 'pending'
+    CHECK(registration_status IN ('pending', 'registered', 'error', 'decommissioned'));
+ALTER TABLE cameras ADD COLUMN last_announce_at DATETIME;
+ALTER TABLE cameras ADD COLUMN registration_error TEXT;
+ALTER TABLE cameras ADD COLUMN mqtt_client_id TEXT;
+
+-- Index for MAC lookups (already exists but confirm)
+-- CREATE INDEX IF NOT EXISTS idx_cameras_mac ON cameras(mac_address);
+
+-- Index for registration status filtering
+CREATE INDEX IF NOT EXISTS idx_cameras_reg_status ON cameras(registration_status);
+
+-- Index for finding stale registrations (cameras that announced but never sent status)
+CREATE INDEX IF NOT EXISTS idx_cameras_last_announce ON cameras(last_announce_at);
+```
+
+### Full Table Definition (post-migration)
+
+| Column | Type | Constraints | Description |
+|--------|------|-------------|-------------|
+| `camera_id` | TEXT | PK | Hub-assigned short ID, e.g. `cam-001` |
+| `friendly_name` | TEXT | NOT NULL | Human-readable name |
+| `mac_address` | TEXT | UNIQUE | ESP32 Wi-Fi station MAC |
+| `firmware_version` | TEXT | — | Firmware semver reported by ESP32 |
+| `capabilities` | TEXT | NOT NULL, DEFAULT `'["status"]'` | JSON array of strings |
+| `device_type` | TEXT | NOT NULL, DEFAULT `'esp32-gopro'` | Device class |
+| `registration_status` | TEXT | NOT NULL, DEFAULT `'pending'` | `pending`, `registered`, `error`, `decommissioned` |
+| `last_announce_at` | DATETIME | — | Timestamp of most recent announce |
+| `registration_error` | TEXT | — | Last registration error message (cleared on success) |
+| `mqtt_client_id` | TEXT | — | MQTT client ID from the announce |
+| `created_at` | DATETIME | NOT NULL, DEFAULT `datetime('now')` | First registration timestamp |
+| `updated_at` | DATETIME | NOT NULL, DEFAULT `datetime('now')` | Last update timestamp |
+
+### Go Model Extension
+
+The existing `models.Camera` struct gains:
+
+```go
+type Camera struct {
+	CameraID           string     `json:"camera_id"`
+	FriendlyName       string     `json:"friendly_name"`
+	MacAddress         string     `json:"mac_address,omitempty"`
+	FirmwareVersion    string     `json:"firmware_version,omitempty"`
+	Capabilities       []string   `json:"capabilities"`
+	DeviceType         string     `json:"device_type"`
+	RegistrationStatus string     `json:"registration_status"`
+	LastAnnounceAt     *time.Time `json:"last_announce_at,omitempty"`
+	RegistrationError  string     `json:"registration_error,omitempty"`
+	MqttClientID       string     `json:"mqtt_client_id,omitempty"`
+	CreatedAt          time.Time  `json:"created_at"`
+	UpdatedAt          time.Time  `json:"updated_at"`
+}
+```
+
+> **Note on `capabilities` storage:** SQLite does not have a native JSON array type. Store as TEXT (JSON-encoded array). Serialize/deserialize in the Go model layer. Migration default is `'["status"]'` — the minimum capability for a useful camera.
+
+---
+
+## 6. Registration Flow Sequence Diagram
+
+```mermaid
+sequenceDiagram
+    participant ESP32
+    participant Broker as MQTT Broker (Mosquitto)
+    participant Hub as Go Hub
+    participant DB as SQLite
+    participant SSE as SSE Hub
+    participant UI as Dashboard UI
+
+    Note over ESP32: Power on / First boot
+
+    ESP32->>ESP32: Read camera_id from NVS
+    alt camera_id NOT in NVS (first boot or factory reset)
+        ESP32->>Broker: CONNECT (client_id: remoterig-)
+        Broker-->>ESP32: CONNACK
+
+        ESP32->>Broker: SUBSCRIBE remoterig/cameras/+/command (QoS 2)
+        Broker-->>ESP32: SUBACK
+
+        ESP32->>Broker: PUBLISH remoterig/cameras/announce (QoS 2, retain)
+        Note over ESP32,Broker: {mac_address, firmware_version, capabilities, ...}
+        Broker->>Hub: Forward announce
+
+        Hub->>Hub: Validate: MAC present? capabilities non-empty?
+        alt Validation fails
+            Hub->>Broker: PUBLISH command {command: "registration_error", error_code: "INVALID_MAC"}
+            Broker->>ESP32: Forward error
+            Note over ESP32: Log error, halt (fatal)
+        else Validation passes
+            Hub->>Hub: Rate limit check
+            alt Rate limited
+                Hub->>Broker: PUBLISH command {error_code: "RATE_LIMITED", retry_after_sec: 30}
+                Broker->>ESP32: Forward error
+                Note over ESP32: Wait retry_after_sec, retry
+            else Allowed
+                Hub->>DB: SELECT camera_id WHERE mac_address = ?
+                alt MAC already registered
+                    DB-->>Hub: camera_id = "cam-002"
+                    Hub->>DB: UPDATE cameras SET firmware_version, capabilities, friendly_name, ...
+                    Hub->>SSE: Broadcast "camera_reconnected"
+                else New MAC
+                    DB-->>Hub: no rows
+                    Hub->>DB: SELECT MAX(camera_id) → "cam-003"
+                    Hub->>Hub: Generate "cam-004"
+                    Hub->>DB: INSERT INTO cameras (cam-004, ...)
+                    Hub->>SSE: Broadcast "camera_registered"
+                end
+
+                Hub->>Broker: PUBLISH command {command: "registered", camera_id: "cam-004"}
+                Broker->>ESP32: Forward registration response
+
+                Hub->>Broker: PUBLISH announce (zero-byte retain) — clear stale announce
+
+                SSE-->>UI: camera_registered / camera_reconnected event
+                UI->>UI: Show new camera card in grid
+            end
+        end
+    else camera_id FOUND in NVS (subsequent boot)
+        Note over ESP32: Skip announce, proceed to status loop
+        ESP32->>Broker: PUBLISH status (QoS 1, retain)
+        Broker->>Hub: Forward status
+        Hub->>SSE: Broadcast camera_status
+        SSE-->>UI: Update camera card
+    end
+```
+
+---
+
+## 7. Reconnection vs. Registration
+
+It is critical to distinguish three scenarios:
+
+### Scenario A: Reconnection (camera was previously registered)
+
+```
+ESP32 boots → reads camera_id from NVS → publishes status on remoterig/cameras//status
+→ Hub sees status on a known camera_id → updates online flag → SSE broadcast
+```
+
+**No announce published.** The camera already has its identity.
+
+### Scenario B: First Registration (or factory reset)
+
+```
+ESP32 boots → NVS empty → publishes announce → Hub assigns camera_id →
+ESP32 stores camera_id in NVS → begins status loop on remoterig/cameras//status
+```
+
+### Scenario C: Hub Restart (ESP32 already running)
+
+```
+Hub restarts → subscribes to remoterig/cameras/+/announce →
+MQTT broker delivers retained announce messages →
+Hub processes each → re-registration safe (MAC already exists → update only)
+```
+
+This is why announce messages use `retain: true`. If the hub restarts while ESP32s are running, it re-discovers them from retained announces.
+
+---
+
+## 8. Security Considerations
+
+| Concern | Mitigation |
+|---------|-----------|
+| Rogue node spoofing a MAC | Closed network (travel router, no internet). MAC filtering at the router level as defense-in-depth (future). |
+| Replay attacks | Announce is idempotent — replaying it only updates timestamps, doesn't create duplicates. |
+| Denial of registration | Rate limiting (Section 4) prevents flooding. |
+| Unauthorized decommission | No `decommission` MQTT command exists. Decommission is admin-only via HTTP API with API key auth. |
+
+---
+
+## 9. Open Questions & Decisions
+
+| Question | Decision | Rationale |
+|----------|----------|-----------|
+| **MAC as identity?** | ✅ Yes | Only globally unique, immutable ID available on a closed network. |
+| **`camera_id` format?** | `cam-NNN` (zero-padded sequential) | Short, sortable, human-friendly. Collision-free with DB sequence. |
+| **Re-registration behavior?** | Update existing, don't create duplicate | Announcing with same MAC = reconnection, not new camera. |
+| **Retain on announce?** | ✅ Yes, cleared after processing | Allows hub restart recovery. Cleanup prevents stale data. |
+| **Response protocol?** | Publish to `command` topic | Reuses existing command channel. ESP32 subscribes before publishing announce. |
+| **Capabilities stored?** | ✅ Yes, in `capabilities` column | Enables future feature gating (e.g., "this camera can't start/stop recording"). |
+| **`device_type` added?** | ✅ Yes, default `esp32-gopro` | Allows future camera types (e.g., Raspberry Pi CSI, USB webcam). |
+| **Dashboard rename after auto-registration?** | ✅ Yes (via existing POST /cameras or settings API in future) | Already called out in MQTT_CONTRACT.md. No new work in this CUB. |
+| **NVS key for camera_id?** | `"cam_id"` | Simple, unambiguous. |
+
+---
+
+## 10. Implementation Plan
+
+This design document covers the protocol and schema design. Implementation is tracked in the following sub-issues:
+
+| CUB | Title | Agent | Depends On |
+|-----|-------|-------|------------|
+| CUB-229 | Design camera auto-discovery and registration flow | Dex | — (this task) |
+| CUB-229a | Migration: add registration fields to cameras table | Hex | CUB-229 |
+| CUB-229b | Go model update: Camera struct with new fields | Dex | CUB-229a |
+| CUB-229c | MQTT subscriber: registration response protocol | Dex | CUB-229b |
+| CUB-229d | Rate limiting for announce messages | Dex | CUB-229b |
+| CUB-229e | SSE events: camera_registered / camera_reconnected | Dex | CUB-229c |
+| CUB-229f | ESP32 firmware: NVS storage + announce on first boot | Pip | CUB-229 |
+| CUB-229g | ESP32 firmware: command subscription + registration ACK handling | Pip | CUB-229c |
+| CUB-229h | Update MQTT_CONTRACT.md with registration response spec | Dex | CUB-229 |
+| CUB-229i | Integration test: camera auto-registration end-to-end | Dex/Pip | CUB-229e, CUB-229g |
+
+---
+
+## 11. References
+
+- [MQTT_CONTRACT.md](../MQTT_CONTRACT.md) — Network topology, topic hierarchy, existing status/heartbeat/command schemas
+- [CONTEXT.md](../CONTEXT.md) — RemoteRig tech stack, directory layout, database schema
+- [CUB-230 (Offline Buffer & Replay)](https://linear.app/cubecraft-creations/issue/CUB-230) — Related: offline buffering uses same dedup strategy
+- [CUB-232 (MQTT Subscriber)](https://linear.app/cubecraft-creations/issue/CUB-232) — The subscriber that will implement this registration logic
+- [CUB-189 (POST /cameras)](https://linear.app/cubecraft-creations/issue/CUB-189) — HTTP registration endpoint (may be replaced/supplemented by auto-discovery)