Scenarios - Veval

What is a scenario?

A scenario is a named set of test cases for your agent. Each run posts pass/fail results to the dashboard so you can track quality over time. Scenarios have two item types:

Type	How it works	Cost
Synthetic	Live LLM, fresh input you define	Real API cost
Trace-backed	Mocked LLM, replays a recorded production trace	Zero

RunScenarioAsync

var result = await veval.RunScenarioAsync(
    scenarioName:       "my-scenario",
    agent:              service.ExecuteAsync,
    scenarioAssertions: [
        TraceAssert.NoErrors(),
        TraceAssert.MaxCost(0.10m),
        TraceAssert.StepExists("classify"),
    ],
    items: [
        new ScenarioItem { Name = "question 1", Input = "What is prompt caching?" },
        new ScenarioItem { Name = "question 2", Input = "What is tool use?" },
    ]
);

Parameter	Description
`scenarioName`	Identifies the scenario in the dashboard
`agent`	Your agent — same signature as `RunAsync`
`scenarioAssertions`	Assertions that apply to every item
`items`	List of `ScenarioItem` — inline or fetched from dashboard

Synthetic items

Use these for new inputs you want to test with a live LLM.

new ScenarioItem
{
    Name  = "prompt caching question",
    Input = "Explain prompt caching in one sentence.",
}

Trace-backed items

Use these to replay a recorded production trace with mocked LLM responses — no API cost.

new ScenarioItem
{
    Name       = "production trace replay",
    TraceId    = "tr_4ec4e79c5d03...",
    Assertions = [TraceAssert.MaxCost(0.05m)],  // per-item assertion
}

The trace must contain recorded steps. If it has none, Veval throws an exception rather than silently calling the live LLM.

Per-item assertions

Each `ScenarioItem` can carry its own assertions on top of the scenario-level ones:

new ScenarioItem
{
    Name       = "expensive question",
    Input      = "Write me a novel.",
    Assertions = [TraceAssert.MaxCost(0.50m)],  // only for this item
}
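
To make the "on top of" wording concrete, here is a minimal sketch that assumes the per-item list is simply appended to the scenario-level list and every check is evaluated. `FakeTrace` and `Check` are hypothetical stand-ins invented for this illustration; they are not Veval types, and Veval's actual evaluation logic may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Scenario-level checks, applied to every item in the scenario.
var scenarioAssertions = new List<Check>
{
    new("NoErrors", t => t.ErrorCount == 0),
    new("MaxCost(0.10)", t => t.Cost <= 0.10m),
};

// Extra checks carried by one item only.
var itemAssertions = new List<Check>
{
    new("MaxCost(0.05)", t => t.Cost <= 0.05m),
};

// A made-up trace: no errors, cost between the two limits.
var trace = new FakeTrace(Cost: 0.07m, ErrorCount: 0);

// ASSUMPTION: both lists are combined and every check must pass.
var failures = scenarioAssertions
    .Concat(itemAssertions)
    .Where(c => !c.Passes(trace))
    .Select(c => c.Name)
    .ToList();

Console.WriteLine(failures.Count == 0
    ? "passed"
    : "failed: " + string.Join(", ", failures));

// Hypothetical stand-ins (not part of Veval) so the sketch runs on its own.
record FakeTrace(decimal Cost, int ErrorCount);
record Check(string Name, Func<FakeTrace, bool> Passes);
```

If the lists really are combined this way, a per-item limit can tighten a scenario-level limit but cannot loosen it, so a looser per-item `MaxCost` only has an effect when the scenario does not set a stricter one.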

Fetching items from the dashboard

If `items` is null or omitted, Veval fetches the scenario's items from the API automatically. This lets you manage test cases in the dashboard without redeploying code.

// items omitted (null) → fetched from the dashboard by scenarioName
var result = await veval.RunScenarioAsync(
    scenarioName:       "my-scenario",
    agent:              service.ExecuteAsync,
    scenarioAssertions: [TraceAssert.NoErrors()]
);

ScenarioRunResult

Console.WriteLine($"{result.PassCount}/{result.Results.Count} passed");

foreach (var item in result.Results)
{
    Console.WriteLine($"{(item.Passed ? "✓" : "✗")} {item.Item.Name}");
    foreach (var failure in item.Failures)
        Console.WriteLine($"  → {failure}");
}

Member	Description
`Passed`	True if all items passed
`PassCount`	Number of passing items
`FailCount`	Number of failing items
`Results`	List of `ItemRunResult`
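
One common use of these members is to fail a CI job when any item fails. The sketch below is a hedged illustration of that idea: the three record types are hypothetical stand-ins that mirror only the members shown above (they are not the Veval types), and representing each failure as a `string` is an assumption. In real code, `result` would come from `RunScenarioAsync`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in result: one passing item and one failing item.
var result = new ScenarioRunResult(new List<ItemRunResult>
{
    new(new ScenarioItem("question 1"), true, new List<string>()),
    new(new ScenarioItem("question 2"), false, new List<string> { "cost above limit" }),
});

// Report only the failing items.
foreach (var item in result.Results.Where(r => !r.Passed))
    Console.WriteLine($"FAILED {item.Item.Name}: {string.Join("; ", item.Failures)}");

Console.WriteLine($"{result.PassCount} passed, {result.FailCount} failed");

// A non-zero exit code makes most CI runners mark the step as failed.
Environment.ExitCode = result.Passed ? 0 : 1;

// Hypothetical stand-ins (not part of Veval) mirroring the documented members.
record ScenarioItem(string Name);
record ItemRunResult(ScenarioItem Item, bool Passed, IReadOnlyList<string> Failures);
record ScenarioRunResult(IReadOnlyList<ItemRunResult> Results)
{
    public int PassCount => Results.Count(r => r.Passed);
    public int FailCount => Results.Count - PassCount;
    public bool Passed => FailCount == 0;
}
```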
