In progress

Eval Results

Results from the Tab Agent user study and AI backend comparison will be published here.

Pending

Claim 1 — Grouping quality

Does Tab Agent's AI grouping match how users would naturally organize their tabs? Measured via an agreement-rate test with 5–8 participants.
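
The study protocol does not yet specify how the agreement rate will be computed. One possible definition, shown here purely as a sketch, is pairwise agreement (the Rand index): the fraction of tab pairs that both groupings treat the same way, either together or apart. The function name, the dictionary layout, and the tab and group labels are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_agreement(ai_groups, user_groups):
    """Fraction of tab pairs on which two groupings agree (Rand index).

    Both arguments map a tab id to a group label (hypothetical layout).
    A pair counts as agreement when both groupings put the two tabs in
    the same group, or both put them in different groups.
    """
    if set(ai_groups) != set(user_groups):
        raise ValueError("both groupings must cover the same tabs")
    pairs = list(combinations(sorted(ai_groups), 2))
    if not pairs:
        return 1.0  # zero or one tab: nothing to disagree about
    agree = sum(
        (ai_groups[a] == ai_groups[b]) == (user_groups[a] == user_groups[b])
        for a, b in pairs
    )
    return agree / len(pairs)


# Made-up example, not study data: the two groupings differ only on t3/t4.
ai = {"t1": "work", "t2": "work", "t3": "news", "t4": "news"}
user = {"t1": "A", "t2": "A", "t3": "B", "t4": "C"}
rate = pairwise_agreement(ai, user)  # 5 of the 6 pairs agree
```

A pairwise measure has the advantage that group names do not need to match between the AI and the participant; only the partition of tabs matters.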

Pending

Claim 2 — Memory savings

Does sleeping tabs via Tab Agent free meaningful browser memory? Validated against Chrome Task Manager measurements.
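
As a rough sketch of how the before/after readings could be summarized, the snippet below totals per-tab memory footprints noted by hand from Chrome Task Manager and reports the amount freed. The function name and the input layout are hypothetical, the numbers are placeholders rather than measurements, and the sketch assumes a sleeping tab that no longer appears in the Task Manager uses 0 MB.

```python
def memory_freed_mb(before_mb, after_mb):
    """Return (MB freed, percent of the starting footprint freed).

    Both arguments map a tab id to its memory footprint in MB, as read
    manually from Chrome Task Manager (hypothetical layout). A tab that
    is missing from `after_mb` is assumed to use 0 MB.
    """
    total_before = sum(before_mb.values())
    total_after = sum(after_mb.get(tab, 0.0) for tab in before_mb)
    freed = total_before - total_after
    percent = 100.0 * freed / total_before if total_before else 0.0
    return freed, percent


# Placeholder readings, not real measurements.
before = {"a": 200.0, "b": 100.0, "c": 100.0}
after = {"a": 200.0, "b": 20.0}  # tab "c" is asleep and no longer listed
freed, percent = memory_freed_mb(before, after)
```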

Pending

Claim 3 — Task speed

Do users find and switch to tabs faster with Tab Agent than without? Measured via timed tab-finding tasks with 20 tabs open.
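
One way the timing data might be compared, assuming each participant completes the task both with and without Tab Agent, is a paired difference per participant summarized by its median. This is only a sketch: the function name is hypothetical, the paired design is an assumption, and the times are placeholders rather than study results.

```python
from statistics import median


def median_time_saved(baseline_s, with_agent_s):
    """Median per-participant time saved, in seconds.

    The two lists hold time-to-find values for the same participants in
    the same order (assumed paired design). A positive result means the
    task was faster with Tab Agent.
    """
    if not baseline_s or len(baseline_s) != len(with_agent_s):
        raise ValueError("need equal-length, non-empty paired samples")
    diffs = [b - w for b, w in zip(baseline_s, with_agent_s)]
    return median(diffs)


# Placeholder timings, not study data.
saved = median_time_saved([12.0, 9.0, 15.0], [8.0, 9.5, 10.0])
```

The median is used rather than the mean because with only a handful of participants a single slow trial would otherwise dominate the summary.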

AI backend comparison

Grouping quality will be compared across three models using blind rating:

Gemini Nano | Baseline | Free, on-device
Claude Haiku | Gold standard | Anthropic API
GPT-4o mini | Comparison | OpenAI API
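
The blind rating could be organized along the following lines: each model's grouping is shown to raters under an anonymous label, and scores are mapped back to model names only after rating is finished. This is a minimal sketch, not the study's actual procedure. The function names, the fixed seed, and the numeric scores are assumptions made for illustration.

```python
import random
from statistics import mean


def blind_labels(models, seed):
    """Map anonymous labels ('A', 'B', ...) to model names in shuffled order."""
    rng = random.Random(seed)  # fixed seed keeps the key reproducible
    shuffled = list(models)
    rng.shuffle(shuffled)
    return {chr(ord("A") + i): name for i, name in enumerate(shuffled)}


def mean_rating_by_model(key, ratings):
    """Unblind (label, score) pairs and return the mean score per model."""
    by_model = {}
    for label, score in ratings:
        by_model.setdefault(key[label], []).append(score)
    return {model: mean(scores) for model, scores in by_model.items()}


models = ["Gemini Nano", "Claude Haiku", "GPT-4o mini"]
key = blind_labels(models, seed=0)  # kept hidden from raters

# Placeholder scores, not study data; raters only ever see the labels.
label_of = {model: label for label, model in key.items()}
ratings = [
    (label_of["Gemini Nano"], 3),
    (label_of["Gemini Nano"], 5),
    (label_of["Claude Haiku"], 4),
]
summary = mean_rating_by_model(key, ratings)
```

Reshuffling the labels for each participant, for example by varying the seed, would keep raters from learning which label corresponds to which model.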