What the working sessions settled
Lessons learned
The decisions below are the ones the prototype sessions and integration reviews actually resolved — engineering calls worth carrying forward, each with the reasoning that produced it and the conditions under which it should be reopened.
A note on scope: the Seeing States runtime items below — capture, the floating surface, docking, and persistence — are documented design decisions carried forward from the working sessions. This hub’s bench is a state-machine simulation, not a live screen-capture runtime, so these are lessons to apply when building the real thing — not things implemented here.
01
Read cursor state from a DOM data-cursor-state attribute, not React context.
Rationale. Any element can declare its own posture, and the cursor walks the DOM ancestry to pick it up. Posture can be sprinkled across deeply nested host code with no prop drilling.
Revisit if. Hosts that can’t be annotated with data-* attributes (e.g. canvas or cross-origin content) need a different posture source.
HANDOFF §8.1
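For decision 01, the ancestry walk can be sketched roughly as follows. This is a minimal illustration, not the handoff's implementation: `PostureNode`, `resolveCursorState`, and the `"normal"` fallback are assumed names, and the node type is structural so the function also runs outside a browser. On a real element, `element.closest("[data-cursor-state]")` performs the same upward search.

```typescript
// Minimal shape of a DOM element; real Elements satisfy it structurally.
interface PostureNode {
  getAttribute(name: string): string | null;
  parentElement: PostureNode | null;
}

const CURSOR_STATE_ATTR = "data-cursor-state";

// Walk from the hovered node up through its ancestors and return the first
// declared posture. `fallback` is a hypothetical default for unannotated trees.
function resolveCursorState(
  start: PostureNode | null,
  fallback: string = "normal",
): string {
  for (let node = start; node !== null; node = node.parentElement) {
    const state = node.getAttribute(CURSOR_STATE_ATTR);
    if (state !== null && state !== "") return state;
  }
  return fallback;
}
```

Because the nearest annotated ancestor wins, a deeply nested control can override the posture declared by the panel that contains it.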
02
Implement slow-follow as a state-dependent lerp factor (0.055 held vs 0.32 normal), not a CSS transition.
Rationale. The slowed held motion is the entire concept in one gesture; driving it from the lerp factor keeps it physical and responsive rather than a fixed-duration ease. Tested clean at 60Hz.
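Revisit if. The motion has not been verified at 144Hz; a lerp factor applied once per frame closes the gap faster at higher refresh rates, so test there before shipping. If the critically damped lerp feels mechanical, try an underdamped spring instead.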
HANDOFF §8.2 · §9.1
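For decision 02, a state-dependent lerp might look like the sketch below. The two factors come from the decision itself; the frame-time rescaling in `factorForFrame` is an assumption added here to illustrate one way of keeping the feel consistent at 144Hz, and the function names are hypothetical.

```typescript
type CursorState = "held" | "normal";

// Per-frame factors from the decision above, treated as tuned at 60Hz.
const LERP_FACTOR: Record<CursorState, number> = { held: 0.055, normal: 0.32 };
const REFERENCE_FRAME_MS = 1000 / 60;

// Rescale the 60Hz factor to the actual frame time so the follow speed stays
// roughly constant at other refresh rates (an assumption of this sketch).
function factorForFrame(state: CursorState, dtMs: number): number {
  const base = LERP_FACTOR[state];
  return 1 - Math.pow(1 - base, dtMs / REFERENCE_FRAME_MS);
}

// Move `current` a fraction of the way toward `target`.
function follow(current: number, target: number, factor: number): number {
  return current + (target - current) * factor;
}
```

Called once per animation frame for each axis, `follow(x, pointerX, factorForFrame(state, dt))` leaves the same remaining distance after two half-length frames as after one full-length frame.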
03
Anchor markup to elements via a target selector, recomputing the bounding rect on layout and resize.
Rationale. Screens swap and viewports change; selector-anchored callouts stay glued to their elements where pixel-anchored ones would drift off.
Revisit if. Targets that move continuously (animations, scroll-linked transforms) may need per-frame recompute rather than layout/resize only.
HANDOFF §8.3
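For decision 03, the recompute step can be sketched as a pure function that re-resolves each selector and re-measures the target. The type and function names (`Callout`, `placeCallouts`, and so on) are illustrative assumptions; the root and element types are structural so the sketch does not depend on a live DOM.

```typescript
interface BoxRect { left: number; top: number; width: number; height: number; }
interface Measurable { getBoundingClientRect(): BoxRect; }
interface QueryRoot { querySelector(selector: string): Measurable | null; }

interface Callout { target: string; label: string; }
interface PlacedCallout extends Callout { rect: BoxRect; }

// Re-resolve every callout's selector and re-measure it. Callouts whose
// target is missing from the current screen are dropped rather than left
// pointing at stale pixels.
function placeCallouts(root: QueryRoot, callouts: readonly Callout[]): PlacedCallout[] {
  const placed: PlacedCallout[] = [];
  for (const callout of callouts) {
    const el = root.querySelector(callout.target);
    if (el === null) continue;
    const { left, top, width, height } = el.getBoundingClientRect();
    placed.push({ ...callout, rect: { left, top, width, height } });
  }
  return placed;
}
```

In a browser this would typically be re-run from a `resize` listener or a `ResizeObserver` callback; continuously moving targets would need it called per frame instead.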
04
Author scripted scenarios as keyframes-as-data; the player just walks the array.
Rationale. Each frame is a self-contained snapshot with its own duration. The player carries no scenario-specific logic, so flows stay inspectable, reorderable, and cheap to edit.
Revisit if. Branching or user-driven flows that aren’t a single linear sequence will outgrow a flat array and need a real state machine.
HANDOFF §8.4
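For decision 04, a player that "just walks the array" can be as small as the sketch below. The `ScenarioFrame` shape and the `frameAt` name are assumptions for illustration; the only properties taken from the decision are that each frame is a self-contained snapshot and carries its own duration.

```typescript
interface ScenarioFrame<S> {
  state: S;          // self-contained snapshot for this step
  durationMs: number; // how long the snapshot stays on screen
}

// Return the frame active `elapsedMs` after the scenario starts, or null once
// the scenario has finished (or when the array is empty).
function frameAt<S>(
  frames: readonly ScenarioFrame<S>[],
  elapsedMs: number,
): ScenarioFrame<S> | null {
  let end = 0;
  for (const frame of frames) {
    end += frame.durationMs;
    if (elapsedMs < end) return frame;
  }
  return null;
}
```

Reordering or editing a scenario then means editing data only; nothing in `frameAt` knows which scenario it is playing.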
05
Debounce every shown surface with a min-dwell of ≥350ms.
Rationale. Backend state can move on faster than a human can read; holding each surface for a floor duration stops chips and overlays from flickering in and out.
Revisit if. 350ms feels laggy for fast-confirming actions; tune per surface, but never below the readable floor.
SPEC §4
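For decision 05, one way to enforce a minimum dwell is a small gate that holds the visible value until the floor has elapsed and then lets the latest request through. This is a sketch under assumptions: `createDwellGate` and its method names are hypothetical, timestamps are passed in rather than read from a clock, and intermediate states that arrive during the hold are skipped.

```typescript
const MIN_DWELL_MS = 350;

interface DwellGate<T> {
  // Ask to show `next`; returns what is actually visible at `nowMs`.
  request(next: T, nowMs: number): T;
  // Poll from a timer or animation frame to release a held-back value.
  visibleAt(nowMs: number): T;
}

function createDwellGate<T>(
  initial: T,
  startMs: number,
  minDwellMs: number = MIN_DWELL_MS,
): DwellGate<T> {
  let shown = initial;
  let shownAt = startMs;
  let pending: { value: T } | null = null;

  function settle(nowMs: number): T {
    if (pending !== null && nowMs - shownAt >= minDwellMs) {
      shown = pending.value;
      shownAt = nowMs;
      pending = null;
    }
    return shown;
  }

  return {
    request(next: T, nowMs: number): T {
      // Latest request wins; asking for what is already shown cancels the swap.
      pending = next === shown ? null : { value: next };
      return settle(nowMs);
    },
    visibleAt: settle,
  };
}
```

Tuning per surface then amounts to passing a different `minDwellMs` when the gate for that surface is created.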
06
Default capture to external getDisplayMedia with selfBrowserSurface: "exclude"; self-tab is a secondary, masked mode.
Rationale. Roughly 22 of 28 audited scenarios involve content outside the Copilot surface; defaulting to own-tab would break Watch-for, form-fill, IT-alert, dashboard, and compare. Excluding the self-surface also removes the recursion (copilot-sees-itself) risk.
Revisit if. Firefox lacks selfBrowserSurface (fall back to a bbox self-mask); Safari lacks both selfBrowserSurface and a reliable preferCurrentTab (treat as screen-scope only).
INTEGRATION §9.1
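For decision 06, the option selection can be sketched as below. `buildCaptureOptions`, `needsSelfMask`, and the `supportsSelfBrowserSurface` flag are hypothetical: the flag is supplied by the caller because this sketch does not assume any particular way of detecting the option. `CaptureOptions` is a local stand-in for the options bag, not the browser's own type.

```typescript
type CaptureMode = "external" | "self-tab";

// Local stand-in for the options bag passed to getDisplayMedia.
interface CaptureOptions {
  video: true;
  audio: false;
  selfBrowserSurface?: "include" | "exclude";
  preferCurrentTab?: boolean;
}

function buildCaptureOptions(
  mode: CaptureMode,
  supportsSelfBrowserSurface: boolean, // caller-supplied, hypothetical flag
): CaptureOptions {
  const options: CaptureOptions = { video: true, audio: false };
  if (mode === "self-tab") {
    // Secondary mode: capture our own tab; the caller must mask the UI.
    options.preferCurrentTab = true;
    return options;
  }
  if (supportsSelfBrowserSurface) {
    // Default mode: keep our own tab out of the picker entirely.
    options.selfBrowserSurface = "exclude";
  }
  return options;
}

// True when frames may contain the Copilot surface and need a self-mask.
function needsSelfMask(mode: CaptureMode, supportsSelfBrowserSurface: boolean): boolean {
  return mode === "self-tab" || !supportsSelfBrowserSurface;
}
```

The result would be passed to `navigator.mediaDevices.getDisplayMedia(...)`. The two options are never set together here, since preferring the current tab while excluding it is contradictory.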
07
Make the floating surface a Document Picture-in-Picture window when available; CSS position: fixed is the fallback only.
Rationale. Document PiP is the only browser primitive that is genuinely always-on-top across tab and app switches and across monitors. With CSS-fixed alone, “always-on-top” is a polite fiction — the toolbar vanishes the moment the user looks at the app they need help with.
Revisit if. Firefox/Safari fall back to CSS-fixed scoped to the host page with explicit “always-on-top is not available” copy; camera (V-07) calls must route back through the opener, not the PiP doc.
INTEGRATION §9.2
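For decision 07, the choice between Document PiP and the CSS-fixed fallback can be sketched as follows. `PipCapableHost`, `chooseSurfaceMode`, `openFloatingSurface`, and the default window size are assumptions for illustration; the host type only models the `documentPictureInPicture.requestWindow` entry point so the sketch stays independent of browser typings.

```typescript
type SurfaceMode = "document-pip" | "css-fixed";

// Minimal shape of the parts of `window` this sketch touches.
interface PipCapableHost<W> {
  documentPictureInPicture?: {
    requestWindow(options?: { width?: number; height?: number }): Promise<W>;
  };
}

type FloatingSurface<W> =
  | { mode: "document-pip"; pipWindow: W }
  | { mode: "css-fixed" };

function chooseSurfaceMode<W>(host: PipCapableHost<W>): SurfaceMode {
  return host.documentPictureInPicture !== undefined ? "document-pip" : "css-fixed";
}

// Open the PiP window when possible; otherwise report the fallback so the UI
// can show its "always-on-top is not available" copy.
async function openFloatingSurface<W>(
  host: PipCapableHost<W>,
  size: { width: number; height: number } = { width: 360, height: 120 }, // placeholder size
): Promise<FloatingSurface<W>> {
  const pip = host.documentPictureInPicture;
  if (pip === undefined) return { mode: "css-fixed" };
  try {
    const pipWindow = await pip.requestWindow(size);
    return { mode: "document-pip", pipWindow };
  } catch {
    // The request can be rejected (for example without a user gesture).
    return { mode: "css-fixed" };
  }
}
```

Returning the mode alongside the window lets the caller decide where the toolbar mounts and which copy to show without probing the browser a second time.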
08
Dock the toolbar bottom-right at rest: draggable, snap-to-edge across eight anchors, position persisted per surface.
Rationale. No single fixed position avoids occlusion across the priority scenarios (right-aligned dashboards, center-screen forms, modal IT alerts), so it must move; snap anchors keep muscle memory while avoiding a layout-shifting free-float. Right and bottom keep the gaze near content and clear of browser/OS chrome.
Revisit if. In PiP mode the OS owns window position, so the eight anchors apply only to CSS-fixed and sidebar; the V6 voice-pill halo stays top-center always.
INTEGRATION §9.3
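For decision 08, snap-to-edge can be sketched as picking the nearest of eight anchor points when a drag ends. The decision only says "eight anchors"; treating them as the four corners plus the four edge midpoints is an assumption of this sketch, as are the type and function names.

```typescript
type DockAnchor =
  | "top-left" | "top" | "top-right"
  | "left" | "right"
  | "bottom-left" | "bottom" | "bottom-right";

interface DockPoint { x: number; y: number; }
interface ViewportSize { width: number; height: number; }

const DEFAULT_ANCHOR: DockAnchor = "bottom-right"; // resting position

// Assumed layout: four corners plus four edge midpoints.
function anchorPositions(v: ViewportSize): Record<DockAnchor, DockPoint> {
  const cx = v.width / 2;
  const cy = v.height / 2;
  return {
    "top-left": { x: 0, y: 0 },
    "top": { x: cx, y: 0 },
    "top-right": { x: v.width, y: 0 },
    "left": { x: 0, y: cy },
    "right": { x: v.width, y: cy },
    "bottom-left": { x: 0, y: v.height },
    "bottom": { x: cx, y: v.height },
    "bottom-right": { x: v.width, y: v.height },
  };
}

// Snap a drop point to the closest anchor by squared distance.
function nearestAnchor(dropped: DockPoint, v: ViewportSize): DockAnchor {
  const positions = anchorPositions(v);
  let best: DockAnchor = DEFAULT_ANCHOR;
  let bestDist = Infinity;
  for (const anchor of Object.keys(positions) as DockAnchor[]) {
    const p = positions[anchor];
    const d = (p.x - dropped.x) ** 2 + (p.y - dropped.y) ** 2;
    if (d < bestDist) {
      best = anchor;
      bestDist = d;
    }
  }
  return best;
}
```

Persisting the anchor name rather than pixel coordinates means the stored position stays valid when the viewport is resized.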
09
Persist state in three tiers — T1 ephemeral (RAM), T2 session (sessionStorage), T3 persistent (localStorage + IndexedDB); pixel data never leaves MacroStore.
Rationale. Refresh should feel like “right where I left off for setup, but vision starts clean.” Lifecycle/attention/buffer are derived fresh per share (T1); an armed monitor evaporates and must re-consent (T2); macros, dock anchors, and consent survive (T3). Frames and telemetry never touch web storage.
Revisit if. A monitor that survives refresh must re-prompt (refresh is not silent consent); IndexedDB quota-exceeded on macros needs its own failure-table row and an offer to delete oldest first.
INTEGRATION §9.4
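For decision 09, the tier assignments can be written down as data so that every piece of state has exactly one home. The mapping below follows the rationale above; the identifier names, the `"reprompt"` outcome label, and the `isWebStorageAllowed` guard are assumptions made for this sketch.

```typescript
type Tier = "T1" | "T2" | "T3";
type StateKind =
  | "lifecycle" | "attention" | "buffer" // derived fresh per share
  | "armedMonitor"                        // must re-consent after refresh
  | "macros" | "dockAnchors" | "consent"; // survive refresh

const TIER_OF: Record<StateKind, Tier> = {
  lifecycle: "T1",
  attention: "T1",
  buffer: "T1",
  armedMonitor: "T2",
  macros: "T3",
  dockAnchors: "T3",
  consent: "T3",
};

const BACKEND_OF: Record<Tier, string> = {
  T1: "memory",
  T2: "sessionStorage",
  T3: "localStorage + IndexedDB",
};

type RefreshOutcome = "reset" | "reprompt" | "restore";

// What a page refresh does to each kind of state.
function onRefresh(kind: StateKind): RefreshOutcome {
  const tier = TIER_OF[kind];
  if (tier === "T1") return "reset";
  if (tier === "T2") return "reprompt"; // refresh is not silent consent
  return "restore";
}

// Frames and telemetry never touch web storage, whatever the tier.
const WEB_STORAGE_FORBIDDEN: ReadonlySet<string> = new Set(["frames", "telemetry"]);

function isWebStorageAllowed(kind: string): boolean {
  return !WEB_STORAGE_FORBIDDEN.has(kind);
}
```

Keeping the table in one place gives a quota-exceeded handler or a re-consent prompt a single lookup to consult instead of scattered storage calls.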