Replay - Veval

Overview

VevalTestSdk is a test double for IVevalSdk. It replays a recorded trace so your agent code runs against mocked LLM responses, which makes tests deterministic, fast, and free of API cost. Unlike calling VevalSdk directly with a trace, VevalTestSdk guarantees that nothing falls through to a live LLM: if your agent uses a step name that isn’t in the recorded trace, it throws immediately.

Setup

// 1. Load a production trace
var trace = await veval.GetTraceAsync("tr_4ec4e79c5d03...");

// 2. Create the test double
var testSdk = new VevalTestSdk(new VevalOptions { ApiKey = "YOUR_API_KEY" })
    .WithReplay(trace);

// 3. Inject into your service (IVevalSdk interface)
var service = new MyAgentService(testSdk, claude);

// 4. Call normally — no live LLM
var result = await testSdk.RunAsync("my-agent", service.ExecuteAsync);

VevalTestSdk still reports traces and scenario results to the dashboard, so runs count in your history.

In a test project

[Fact]
public async Task Agent_ReturnsExpectedOutput()
{
    var trace = LoadFixture("trace_prompt_caching.json");  // or fetch from API

    var testSdk = new VevalTestSdk(new VevalOptions { ApiKey = TestApiKey })
        .WithReplay(trace);
    var service = new MyAgentService(testSdk, mockClaude);  // mockClaude: placeholder for your stubbed Claude client

    var result = await testSdk.RunAsync("my-agent", service.ExecuteAsync, input: "test input");

    Assert.Equal("success", testSdk.LastStatus);
    Assert.Contains("caching", result);
}
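
The test above calls LoadFixture without showing it. One possible shape is sketched below. It assumes the trace type can be deserialized with System.Text.Json, which this page does not confirm, and the names `Fixtures` and `Load<T>` are hypothetical rather than part of the Veval SDK. Substitute the SDK's trace type for `T`.

```csharp
using System;
using System.IO;
using System.Text.Json;

// Hypothetical test helper; not part of the Veval SDK.
public static class Fixtures
{
    // Case-insensitive matching so "traceId" in JSON can bind to a TraceId property.
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNameCaseInsensitive = true,
    };

    // Reads a JSON file from disk and deserializes it into T.
    // Throws if the file is missing or its content is the JSON literal null.
    public static T Load<T>(string path)
    {
        var json = File.ReadAllText(path);
        return JsonSerializer.Deserialize<T>(json, Options)
            ?? throw new InvalidDataException($"Fixture '{path}' deserialized to null.");
    }
}
```

Checking the fixture file into the test project keeps the test offline, at the cost of re-exporting the trace whenever the agent's steps change.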

Strict mock mode

If your agent calls TrackStepAsync with a step name that isn’t in the recorded trace, VevalTestSdk throws:

InvalidOperationException: Replay mode: no mock output for step 'new_step'.
Available steps: classify, answer.
This would have made a real LLM call.

This catches regressions where code changes add new LLM calls that weren’t in the original trace.
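
To make the behavior concrete, here is a minimal sketch of a strict lookup over recorded step outputs. It illustrates the idea only and is not Veval's implementation: `StrictStepMocks` and `GetOutput` are hypothetical names, and recorded outputs are simplified to strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for illustration; not part of the Veval SDK.
public sealed class StrictStepMocks
{
    private readonly Dictionary<string, string> _outputs;

    public StrictStepMocks(IDictionary<string, string> recordedOutputs)
    {
        // Copy the recorded outputs so later changes to the caller's map have no effect.
        _outputs = new Dictionary<string, string>(recordedOutputs, StringComparer.Ordinal);
    }

    // Returns the recorded output for a step, or throws rather than
    // falling through to a live call when the step was never recorded.
    public string GetOutput(string stepName)
    {
        if (_outputs.TryGetValue(stepName, out var output))
            return output;

        throw new InvalidOperationException(
            $"Replay mode: no mock output for step '{stepName}'. " +
            $"Available steps: {string.Join(", ", _outputs.Keys)}.");
    }
}
```

Failing on an unknown step, instead of returning a default, is what turns a silently added LLM call into a visible test failure.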

ReplayAsync directly

For lower-level control, use ReplayAsync with explicit assertions:

var replayResult = await testSdk.ReplayAsync(
    trace,
    service.ExecuteAsync,
    new ReplayOptions
    {
        MockLlmResponses = true,
        Assertions       = [
            TraceAssert.NoErrors(),
            TraceAssert.StepExists("classify"),
        ],
    }
);

Console.WriteLine(replayResult.Failures.Count == 0 ? "Pass" : "Fail");
foreach (var f in replayResult.Failures)
    Console.WriteLine(f);

| `ReplayResult` member | Description |
| --- | --- |
| `Failures` | List of assertion failure messages |
| `ReplayedContext` | The `VevalExecutionContext` from the run |
| `Output` | Return value of your agent |
| `Status` | `"success"` or `"error"` |
| `Error` | Exception message if status is `"error"` |

LastStatus / LastError

After RunAsync, inspect the outcome without catching exceptions:

await testSdk.RunAsync("my-agent", service.ExecuteAsync);

Assert.Equal("success", testSdk.LastStatus);
Assert.Null(testSdk.LastError);
