A/B experiments

A/B experiments split live traffic between agent variants (different personas, CTAs, or triggers) so you can measure which one drives more engagement, more leads, or longer conversations. Assignment is sticky: a given visitor always sees the same variant.

Access & permissions

Experiments live under the A/B experiments tab in every agent's navigation. Only workspace Owners and Admins can create, start, or stop experiments. Editors can view experiments but cannot change them.

Experiment kinds

Persona — variants override the agent's name and tone in the system prompt for assigned visitors.
CTA — variants record the assignment but don't yet alter runtime behavior (measurement only today).
Trigger — similar to CTA: assignment is recorded for analysis but runtime behavior is unchanged.
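As a rough illustration of the difference between the three kinds, the sketch below merges a persona variant's config into a base persona and leaves CTA and trigger variants untouched. The function name `effective_persona`, the `kind` strings, and the shape of the base persona dict are assumptions made for this example; only the `persona.name` and `persona.tone` fields come from the config format described on this page.

```python
import json


def effective_persona(base: dict, kind: str, variant_config_json: str) -> dict:
    """Return the persona to use for a visitor assigned to a variant.

    Hypothetical helper: the real runtime may apply overrides differently.
    """
    if kind != "persona":
        # CTA and trigger variants are measurement-only today,
        # so the base persona is returned unchanged.
        return dict(base)

    override = json.loads(variant_config_json).get("persona", {})
    merged = dict(base)
    # Only name and tone are overridden; other base fields are kept.
    for field in ("name", "tone"):
        if field in override:
            merged[field] = override[field]
    return merged
```

A persona variant replaces only the fields it specifies, so a variant that sets just `tone` would keep the agent's original name.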

Configuration

Creating an experiment requires:

At least two variants. Default is control + treatment at a 50/50 split.
Weights as positive integers — they're normalized automatically, so 1/1 and 50/50 are equivalent.
For persona experiments, a config JSON for each variant. For example:

{ "persona": { "name": "Aria", "tone": "warm and concise" } }
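To make the weight rule concrete, here is a minimal sketch of how integer weights could be validated and normalized into traffic shares. The function name `normalize_weights` and the error messages are hypothetical; the product's actual validation may differ.

```python
from fractions import Fraction


def normalize_weights(weights: dict[str, int]) -> dict[str, Fraction]:
    """Turn positive integer weights into exact traffic shares that sum to 1."""
    if len(weights) < 2:
        raise ValueError("an experiment needs at least two variants")
    if any(not isinstance(w, int) or w <= 0 for w in weights.values()):
        raise ValueError("weights must be positive integers")

    total = sum(weights.values())
    # Fraction keeps the shares exact, so 1/1 and 50/50 compare equal.
    return {name: Fraction(w, total) for name, w in weights.items()}
```

With this sketch, `{"control": 1, "treatment": 1}` and `{"control": 50, "treatment": 50}` both normalize to one half each, while `{"control": 3, "treatment": 1}` sends three quarters of traffic to control.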

How assignment works

On a visitor's first message the runtime:

Checks whether the visitor already has an assigned variant. If yes, reuse it.
Otherwise, picks the most recently started experiment that is still running.
Hashes visitor_id together with experiment_id and buckets the visitor into a variant according to the weights.
Persists the assignment so future conversations from the same visitor stay on the same variant.
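The hashing and bucketing steps above can be sketched as follows. This is an illustration under stated assumptions, not the runtime's actual code: SHA-256, the `experiment_id:visitor_id` key format, and the in-memory `store` dict standing in for persisted assignments are all choices made for this example. Selecting which experiment is running is assumed to have happened already.

```python
import hashlib


def pick_variant(visitor_id: str, experiment_id: str, weights: dict[str, int]) -> str:
    """Deterministically bucket a visitor into a variant according to the weights."""
    total = sum(weights.values())
    key = f"{experiment_id}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Map the first 8 bytes of the hash to a slot in [0, total).
    slot = int.from_bytes(digest[:8], "big") % total

    cumulative = 0
    for name, weight in weights.items():
        cumulative += weight
        if slot < cumulative:
            return name
    raise AssertionError("unreachable: slot is always below the total weight")


def assign(visitor_id: str, experiment_id: str, weights: dict[str, int], store: dict) -> str:
    """Reuse a stored assignment if one exists, otherwise compute and persist it."""
    if visitor_id not in store:
        store[visitor_id] = pick_variant(visitor_id, experiment_id, weights)
    return store[visitor_id]
```

Because the hash input includes the experiment ID, the same visitor can land in different buckets across experiments, which avoids always putting the same people in the first variant. Persisting the result means a later change to the weights does not move visitors who were already assigned.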

Measurement

Assignments are stored as rows in experiment_assignments, and every conversation's variant_id is stamped at message time. Joining against the messages and leads tables yields lift on engagement (message count, session length) and conversion (lead capture rate) per variant.
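As a sketch of what such a join might look like, the example below computes average messages per conversation and lead capture rate per variant using SQLite. The schema is hypothetical: the table name `conversations` and the columns `id` and `conversation_id` are assumptions made for illustration, and the real tables are likely to have more columns and different keys.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
SCHEMA = """
CREATE TABLE conversations (id INTEGER PRIMARY KEY, variant_id TEXT);
CREATE TABLE messages (id INTEGER PRIMARY KEY, conversation_id INTEGER);
CREATE TABLE leads (id INTEGER PRIMARY KEY, conversation_id INTEGER);
"""

REPORT_QUERY = """
SELECT c.variant_id,
       COUNT(DISTINCT c.id) AS conversations,
       COUNT(DISTINCT m.id) * 1.0 / COUNT(DISTINCT c.id) AS avg_messages,
       COUNT(DISTINCT l.conversation_id) * 1.0 / COUNT(DISTINCT c.id) AS lead_rate
FROM conversations c
LEFT JOIN messages m ON m.conversation_id = c.id
LEFT JOIN leads l ON l.conversation_id = c.id
GROUP BY c.variant_id
ORDER BY c.variant_id
"""


def variant_report(conn: sqlite3.Connection) -> list[tuple]:
    """Return (variant_id, conversations, avg_messages, lead_rate) per variant."""
    return conn.execute(REPORT_QUERY).fetchall()
```

The `DISTINCT` counts matter here: joining messages and leads in one query multiplies rows, so counting without `DISTINCT` would overstate both metrics. Lift is then the ratio of a treatment variant's metric to the control's.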

Lifecycle

Start — activates the experiment and begins assigning visitors.
Stop — halts new assignments. Visitors already mid-conversation keep their variant for consistency.
Delete — removes the experiment record. Historical assignment data on conversations remains, so reports stay accurate.

Note. Only one experiment is “active” per agent at a time — the most recently started one. Stop the current experiment before starting a new one to avoid mixed populations.
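The "most recently started" rule in the note can be illustrated with a small sketch. The `Experiment` class, its `status` values, and the `started_at` field are hypothetical names chosen for this example and are not taken from the product's data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Experiment:
    id: str
    status: str  # hypothetical values: "draft", "running", "stopped"
    started_at: Optional[datetime] = None


def active_experiment(experiments: list[Experiment]) -> Optional[Experiment]:
    """Return the running experiment with the latest start time, if any."""
    running = [
        e for e in experiments
        if e.status == "running" and e.started_at is not None
    ]
    return max(running, key=lambda e: e.started_at, default=None)
```

If two experiments are left running, only the later one receives new visitors under this rule, while visitors already assigned by the earlier one keep their variants. That is the mixed population the note warns about.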