In progress
Eval Results
Results from the Tab Agent user study and AI backend comparison will be published here.
Pending
Claim 1 — Grouping quality
Does Tab Agent's AI grouping match how users would naturally organize their tabs? Measured with an agreement-rate test involving 5–8 participants.
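One possible way to score agreement is sketched below. It assumes the agreement rate is computed pairwise: for every pair of tabs, check whether the AI and the participant both put them in the same group or both kept them apart (this is the Rand index). The study's actual metric is not specified here, and the function name and sample tab names are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(ai_groups: dict[str, str], user_groups: dict[str, str]) -> float:
    """Fraction of tab pairs on which two groupings agree (Rand index).

    Each dict maps a tab identifier to the group label it was assigned.
    Group label names do not need to match between the two groupings.
    """
    # Only score tabs that appear in both groupings.
    tabs = sorted(ai_groups.keys() & user_groups.keys())
    pairs = list(combinations(tabs, 2))
    if not pairs:
        raise ValueError("need at least two tabs present in both groupings")
    # A pair agrees when both groupings join it or both split it.
    agree = sum(
        (ai_groups[a] == ai_groups[b]) == (user_groups[a] == user_groups[b])
        for a, b in pairs
    )
    return agree / len(pairs)


# Illustrative placeholder data, not study results.
ai = {"gmail": "work", "jira": "work", "youtube": "fun", "reddit": "fun"}
user = {"gmail": "comms", "jira": "dev", "youtube": "leisure", "reddit": "leisure"}
rate = pairwise_agreement(ai, user)
```

A pairwise score avoids having to match group names between the AI and the participant, which matters because two people can label the same cluster differently.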
Pending
Claim 2 — Memory savings
Does sleeping tabs via Tab Agent free meaningful browser memory? Validated against Chrome Task Manager measurements.
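A minimal sketch of how before/after readings could be turned into a single savings figure. It assumes per-tab memory values (in MB) are copied by hand from Chrome Task Manager before and after sleeping tabs, and that a tab missing from the "after" reading is using no memory. Both the function name and the numbers are hypothetical.

```python
def memory_freed_mb(before_mb: dict[str, float], after_mb: dict[str, float]) -> float:
    """Total memory freed across all tabs listed in the 'before' reading.

    Assumption: a tab absent from 'after_mb' was slept and no longer
    appears in the task manager, so it counts as 0 MB.
    """
    return sum(mb - after_mb.get(tab, 0.0) for tab, mb in before_mb.items())


# Illustrative placeholder readings, not measurements.
before = {"docs": 310.0, "news": 185.5, "mail": 240.0}
after = {"mail": 238.0}  # "docs" and "news" were slept
freed = memory_freed_mb(before, after)
```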
Pending
Claim 3 — Task speed
Do users find and switch to tabs faster with Tab Agent than without? Measured via time-to-find tasks across 20 open tabs.
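The timing comparison could be summarized along these lines. The sketch assumes each participant performs the same task once with Tab Agent and once without, and reports the median of the per-participant time differences. The analysis method, function name, and sample times are assumptions, not the study's confirmed protocol.

```python
from statistics import median


def median_time_saved(with_agent_s: list[float], without_agent_s: list[float]) -> float:
    """Median per-participant time saved in seconds (positive means faster with Tab Agent).

    The two lists must be in the same participant order.
    """
    if not with_agent_s or len(with_agent_s) != len(without_agent_s):
        raise ValueError("need one paired measurement per participant")
    diffs = [without - with_ for with_, without in zip(with_agent_s, without_agent_s)]
    return median(diffs)


# Illustrative placeholder times, not study results.
saved = median_time_saved([4.0, 6.5, 3.0], [9.0, 8.5, 7.0])
```

Pairing the measurements per participant keeps individual differences in browsing speed from dominating the result, and the median is less sensitive than the mean to one unusually slow trial.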
AI backend comparison
Grouping quality will be compared across three models using blind rating:
Gemini Nano | Baseline | Free, on-device
Claude Haiku | Gold standard | Anthropic API
GPT-4o mini | Comparison | OpenAI API
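For the blind rating, raters should not be able to tell which model produced which grouping. A minimal sketch of one way to do that, assuming each model's output is shown under a neutral label and the label order is reshuffled for every rated item. The function name and the "Option A/B/C" labels are hypothetical.

```python
import random

MODELS = ["Gemini Nano", "Claude Haiku", "GPT-4o mini"]


def blind_labels(models: list[str], rng: random.Random) -> dict[str, str]:
    """Map neutral labels ("Option A", ...) to models in a random order.

    Keep the returned key private until rating is finished.
    """
    shuffled = list(models)  # copy so the caller's list is not mutated
    rng.shuffle(shuffled)
    return {f"Option {chr(ord('A') + i)}": m for i, m in enumerate(shuffled)}


# A seeded generator makes the assignment reproducible for later unblinding.
key = blind_labels(MODELS, random.Random(0))
```

Calling `blind_labels` once per rated item, rather than once per session, prevents raters from learning that a given label always corresponds to the same model.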