AI Performance & Quality
Models Comparison
Compare opens a side-by-side view with two (or more) playground panels. Each panel has its own model picker, and a Sync toggle fans your prompt out so every synced panel sees the same input. It is the fastest way to decide whether a frontier model is worth the credits.
What it does
Compare lives at /chatbots/[id]/compare and renders multiple chat panels in a row. Each panel runs against your trained agent but uses the model you pick in its own header. Type into one panel and the other panels stay independent — unless you turn Sync on.
Two default panels open when you land on the page; add more with Add an instance.
Instances
Each instance is a self-contained chat pane:
- Header — the same model picker from the playground
- Sync toggle — see the next section
- Toolbar — per-panel filters (coming soon) and a 'mark as winner' button (coming soon)
- Message stream — independent message history per panel
- Composer — type into any panel; output stays in that panel unless Sync is on
Two panels fit comfortably side-by-side at a typical 1280px+ viewport. On mobile or in narrower windows, panels become horizontally swipeable with snap-scrolling.
The Sync toggle
Sync is the killer feature. Each panel has its own Sync switch in its header. When Sync is on for two or more panels, a prompt typed into any of those panels is sent to all of them, so you see how each model responds to identical input.
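The fan-out rule can be sketched in a few lines. This is an illustration of the behavior described here, not the product's actual code: the `Panel` shape, the `promptTargets` function, and the placeholder model names are all assumptions made for the example.

```typescript
// Hypothetical panel shape; field names are illustrative, not the product's schema.
interface Panel {
  id: string;
  model: string;
  syncOn: boolean;
}

// Returns the ids of the panels that should receive a prompt typed into `sourceId`.
function promptTargets(panels: Panel[], sourceId: string): string[] {
  const source = panels.find((p) => p.id === sourceId);
  if (!source) return [];

  const synced = panels.filter((p) => p.syncOn);

  // Fan out only when the source panel is synced and at least one other panel is too.
  if (source.syncOn && synced.length >= 2) {
    return synced.map((p) => p.id);
  }

  // Otherwise the prompt stays in the panel it was typed into.
  return [sourceId];
}

// Example: panels a and b are synced, panel c is not.
const examplePanels: Panel[] = [
  { id: "a", model: "model-one", syncOn: true },
  { id: "b", model: "model-two", syncOn: true },
  { id: "c", model: "model-three", syncOn: false },
];

console.log(promptTargets(examplePanels, "a")); // [ 'a', 'b' ]
console.log(promptTargets(examplePanels, "c")); // [ 'c' ]
```

In this sketch, a prompt typed into an unsynced panel never reaches the synced ones, which matches the "panels stay independent unless Sync is on" behavior.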
Header actions
- Back to Playground — returns to the single-panel view.
- Clear all chats — wipes every panel’s message history without changing model selections or panel count.
- Reset — drops back to the default two-panel layout with default models.
- Add an instance — adds another chat pane to the right.
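One way to picture how the state-changing actions differ is as pure functions over a list of panels. The sketch below is hypothetical: `PanelState`, `applyHeaderAction`, and the `DEFAULT_MODEL` placeholder are invented for illustration, and it assumes Reset also clears message history, which the list above does not state explicitly. Back to Playground is navigation only, so it is left out.

```typescript
// Hypothetical state shape for the Compare page; names are illustrative only.
interface PanelState {
  model: string;
  messages: string[];
}

// Placeholder value, not a real model id.
const DEFAULT_MODEL = "default-model";

type HeaderAction = "clearAll" | "reset" | "addInstance";

function applyHeaderAction(panels: PanelState[], action: HeaderAction): PanelState[] {
  switch (action) {
    case "clearAll":
      // Wipe history, keep each panel's model and the panel count.
      return panels.map((p) => ({ ...p, messages: [] }));
    case "reset":
      // Back to two panels with default models (assumed to start empty).
      return [
        { model: DEFAULT_MODEL, messages: [] },
        { model: DEFAULT_MODEL, messages: [] },
      ];
    case "addInstance":
      // Append a new empty pane on the right.
      return [...panels, { model: DEFAULT_MODEL, messages: [] }];
  }
}

// Example: three panels with history, then Clear all chats.
const before: PanelState[] = [
  { model: "model-one", messages: ["hi"] },
  { model: "model-two", messages: ["hi"] },
  { model: "model-three", messages: [] },
];
const cleared = applyHeaderAction(before, "clearAll");
console.log(cleared.length); // 3
console.log(cleared[0].model); // model-one
```

The key distinction the sketch captures is that Clear all chats preserves your setup, while Reset discards it.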
Tips
- Compare a Mini model against a frontier model on the same five questions; the Mini usually wins on cost, and the frontier model's marginal quality lift usually isn't worth the extra credits
- Add a third instance with a different provider entirely (Anthropic vs OpenAI) — each handles tone and refusals differently
- Use the panel ellipsis menu to remove instances; the page won't let you remove the last panel (no point in an empty Compare view)
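The last-panel rule from the final tip amounts to a simple guard. A minimal sketch, assuming each panel carries an `id`; the `removePanel` helper is hypothetical and not part of the product:

```typescript
// Hypothetical helper: removes a panel by id, but never the last remaining one.
function removePanel<T extends { id: string }>(panels: T[], id: string): T[] {
  // Refuse to remove when only one panel is left.
  if (panels.length <= 1) return panels;
  return panels.filter((p) => p.id !== id);
}

// Example: removing from two panels works; removing the last one is a no-op.
const two = [{ id: "a" }, { id: "b" }];
const one = removePanel(two, "b");
console.log(one.length); // 1
console.log(removePanel(one, "a").length); // 1
```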
Related
Playground
The single-panel view that Compare extends. Same model picker, same prompt config.
Best Practices — Choosing a model
Heuristics for picking the right model before you spend credits A/B-testing.
Response Quality
Compare is one tool; tightening the prompt and the sources usually moves quality more than swapping models.