Sequence Live
A real-time AI assistant for live conversations. It understands what you're looking at and answers as you speak: a fast first response lands immediately, followed by a deeper follow-up, inside a minimal HUD that stays out of your way.
A live assistant that keeps up with you.
Live conversations don't wait for slow models. Sequence Live pairs two agents in a streaming merge: a fast agent lands a natural first response almost instantly, while a deeper agent thinks in parallel and extends the answer as you go. No dead air, no generic placeholders — a real answer, right when you need it.
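The dual-agent merge described above can be sketched roughly like this (a minimal illustration, assuming token-streaming agents; the agent functions, token contents, and timings here are stand-ins, not the actual implementation):

```python
import asyncio

async def fast_agent(prompt: str):
    # Stand-in for a small, low-latency model: streams tokens almost at once.
    for tok in ["Here's", " the", " short", " answer."]:
        await asyncio.sleep(0.01)
        yield tok

async def deep_agent(prompt: str):
    # Stand-in for a larger model: slower to start, richer continuation.
    await asyncio.sleep(0.2)
    for tok in [" A", " deeper", " follow-up", " extends", " it."]:
        await asyncio.sleep(0.02)
        yield tok

async def collect(gen) -> str:
    return "".join([tok async for tok in gen])

async def streaming_merge(prompt: str) -> str:
    # Both agents are dispatched with identical context at the same moment.
    deep_task = asyncio.ensure_future(collect(deep_agent(prompt)))
    out = []
    async for tok in fast_agent(prompt):   # first response streams immediately
        out.append(tok)
    out.append(await deep_task)            # follow-up appends: no dead air
    return "".join(out)

print(asyncio.run(streaming_merge("What does this code do?")))
```

The key design point: the deeper agent starts thinking the moment the fast agent does, so its continuation is already in flight while the first response is on screen.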
Because the assistant also understands what you're looking at — a codebase, a slide, a document, a whiteboard — its responses stay grounded in the context in front of you, not just the words being spoken.
From spoken question to on-screen answer in under half a second.
Silence detected
Voice-activity detection confirms the other speaker has stopped.
Transcription
On-device GPU transcription. Sub-100 ms warm inference.
Parallel dispatch
Both agents receive identical context at the same moment.
First response
Fast initial answer lands on the HUD, grounded in visible context.
Follow-up streams in
Deeper continuation appends to the same surface, no repetition.
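The first stage above, silence detection, can be sketched as a simple energy-threshold check. This is a minimal illustration only; a production VAD is typically a trained model, and the function names, frame size, and threshold here are assumptions:

```python
import array

def rms(frame: bytes) -> float:
    """Root-mean-square energy of a 16-bit little-endian PCM frame."""
    samples = array.array("h", frame)
    if not samples:
        return 0.0
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def turn_ended(frames: list[bytes], threshold: float = 500.0,
               hang_frames: int = 15) -> bool:
    """The other speaker has stopped when the last `hang_frames` frames
    all fall below the energy threshold. With 20 ms frames, 15 frames
    is roughly 300 ms of sustained silence before dispatch."""
    tail = frames[-hang_frames:]
    return len(tail) == hang_frames and all(rms(f) < threshold for f in tail)
```

Requiring a short run of quiet frames, rather than a single one, avoids firing on mid-sentence pauses.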
Four systems that keep it fast, grounded, and unobtrusive.
Fast first response, deeper follow-up
A fast first response lands immediately, and a deeper follow-up streams in right after — no dead air, no waiting.
Audio in and out, on-device
Captures both sides of a conversation with speaker-labelled transcripts, and returns answers through the same low-latency loop.
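One way the two capture channels could fold into a single speaker-labelled transcript, sketched minimally (the `Segment` type and the channel names are illustrative assumptions, not the actual implementation):

```python
from dataclasses import dataclass

@dataclass
class Segment:
    speaker: str   # "me" (microphone) or "them" (system audio)
    start: float   # seconds from session start
    text: str

def merge_transcript(mic: list[Segment], system: list[Segment]) -> list[Segment]:
    # Interleave both channels into one transcript ordered by start time;
    # the channel each segment arrived on supplies its speaker label.
    return sorted(mic + system, key=lambda s: s.start)
```

Because each side is captured on its own channel, speaker attribution falls out of the capture path rather than requiring diarization.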
Minimal, never in your way
A quiet overlay that surfaces answers exactly where you need them and stays out of the way the rest of the time.
Under 500 ms end to end
GPU keepalive, cached prompts, and a tight first-response budget eliminate cold-start spikes.
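A GPU keepalive can be as simple as a background thread running a trivial inference on a timer, so weights and kernels stay resident between turns. A minimal sketch, where the `model` callable and the interval are assumptions for illustration:

```python
import threading, time

def start_keepalive(model, interval_s: float = 5.0) -> threading.Thread:
    """Periodically run a tiny warm inference so the model never goes cold.
    `model` is any callable that triggers a forward pass."""
    def ping():
        while True:
            model("ping")          # trivial inference keeps the GPU warm
            time.sleep(interval_s)
    t = threading.Thread(target=ping, daemon=True)
    t.start()
    return t
```

The daemon thread costs one tiny inference every few seconds and dies with the process; in exchange, the first real request of a turn never pays a cold-start penalty.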
What you see mid-call.
The live HUD and transcript view.
Where it stands today.
Core application shell
Minimal HUD overlay, foundation for the live surface.
Dual-channel audio + transcription
Both sides of a conversation captured in parallel, transcribed on-device with sub-100 ms inference.
Dual-agent streaming merge
Fast first response and deeper follow-up wired together. First response lands immediately; follow-up streams in without repetition.
Visual understanding
The assistant now grounds its answers in what's on your screen — code, slides, documents — not just the spoken context.
Scaling, auth, launch prep
Everything works end-to-end. Remaining work is productization: authentication, account system, deployment, and getting it into people's hands.