Overview
VevalTestSdk is a test double for IVevalSdk. It replays a recorded trace so your agent code runs with mocked LLM responses — deterministic, fast, and free.
Unlike calling VevalSdk directly with a trace, VevalTestSdk guarantees no live LLM fallthrough. If a step name doesn’t match the recorded trace, it throws immediately.
Setup
// 1. Load a production trace (veval is assumed to be an existing VevalSdk client)
var trace = await veval.GetTraceAsync("tr_4ec4e79c5d03...");
// 2. Create the test double
var testSdk = new VevalTestSdk(new VevalOptions { ApiKey = "YOUR_API_KEY" })
    .WithReplay(trace);
// 3. Inject into your service (it should depend on the IVevalSdk interface)
var service = new MyAgentService(testSdk, claude);
// 4. Call normally; no live LLM call is made
var result = await testSdk.RunAsync("my-agent", service.ExecuteAsync);
VevalTestSdk still reports traces and scenario results to the dashboard, so runs count in your history.
In a test project
[Fact]
public async Task Agent_ReturnsExpectedOutput()
{
    var trace = LoadFixture("trace_prompt_caching.json"); // or fetch from API
    var testSdk = new VevalTestSdk(new VevalOptions { ApiKey = TestApiKey })
        .WithReplay(trace);
    // mockClaude is a stubbed Claude client that you supply in the test
    var service = new MyAgentService(testSdk, mockClaude);
    var result = await testSdk.RunAsync("my-agent", service.ExecuteAsync, input: "test input");
    Assert.Equal("success", testSdk.LastStatus);
    Assert.Contains("caching", result);
}
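The test above calls LoadFixture, which this guide does not define. One possible shape for such a helper is sketched below, assuming fixtures are JSON files copied to a Fixtures folder next to the test binaries and that the trace type can be deserialized with System.Text.Json. The helper name, the folder name, and the generic parameter are all hypothetical; substitute whatever trace type your SDK version exposes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical helper: deserialize a JSON fixture from <output dir>/Fixtures.
// T stands in for the SDK's trace type, which this guide does not name.
T LoadFixture<T>(string fileName) where T : class
{
    var path = Path.Combine(AppContext.BaseDirectory, "Fixtures", fileName);
    var json = File.ReadAllText(path);
    var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
    return JsonSerializer.Deserialize<T>(json, options)
        ?? throw new InvalidDataException($"Fixture '{fileName}' deserialized to null.");
}

// Demo with a throwaway fixture; a real test project would ship the file instead.
var fixtureDir = Path.Combine(AppContext.BaseDirectory, "Fixtures");
Directory.CreateDirectory(fixtureDir);
File.WriteAllText(Path.Combine(fixtureDir, "demo.json"), "{\"Name\":\"demo\"}");

var loaded = LoadFixture<Dictionary<string, string>>("demo.json");
Console.WriteLine(loaded["Name"]); // prints "demo"
```

Checking fixtures into the repository keeps the test offline and pins the recorded responses to a known trace, at the cost of refreshing the file when the agent's steps change.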
Strict mock mode
If your agent calls TrackStepAsync with a step name that isn’t in the recorded trace, VevalTestSdk throws:
InvalidOperationException: Replay mode: no mock output for step 'new_step'.
Available steps: classify, answer.
This would have made a real LLM call.
This catches regressions where code changes add new LLM calls that weren’t in the original trace.
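The strict behavior can be pictured as a lookup that refuses to fall through. The following is a minimal standalone sketch of that idea, not the SDK's actual implementation: the dictionary stands in for a recorded trace, and the ReplayStep function and the sample outputs are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a recorded trace: step name -> recorded output.
var recorded = new Dictionary<string, string>
{
    ["classify"] = "billing",
    ["answer"] = "Your invoice is attached.",
};

// Strict lookup: return the recorded output, or throw rather than
// silently falling through to a live call for an unrecorded step.
string ReplayStep(string stepName)
{
    if (recorded.TryGetValue(stepName, out var output))
        return output;

    throw new InvalidOperationException(
        $"Replay mode: no mock output for step '{stepName}'. " +
        $"Available steps: {string.Join(", ", recorded.Keys)}.");
}

Console.WriteLine(ReplayStep("classify")); // recorded step: returns its output

try
{
    ReplayStep("new_step"); // not in the trace: throws
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message);
}
```

Failing loudly on an unknown step is the design choice that makes the replay safe to run in CI: a missing recording shows up as a test failure instead of an unplanned API call.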
ReplayAsync directly
For lower-level control, use ReplayAsync with explicit assertions:
var replayResult = await testSdk.ReplayAsync(
    trace,
    service.ExecuteAsync,
    new ReplayOptions
    {
        MockLlmResponses = true,
        Assertions = [
            TraceAssert.NoErrors(),
            TraceAssert.StepExists("classify"),
        ],
    }
);
Console.WriteLine(replayResult.Failures.Count == 0 ? "Pass" : "Fail");
foreach (var f in replayResult.Failures)
    Console.WriteLine(f);
| ReplayResult member | Description |
|---|---|
| Failures | List of assertion failure messages |
| ReplayedContext | The VevalExecutionContext from the run |
| Output | Return value of your agent |
| Status | "success" or "error" |
| Error | Exception message if status is "error" |
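Because a replay can end with Status set to "error" even when no assertion failed, it may be worth checking Status and Error alongside Failures. The sketch below shows one way to fold the three into a single summary line. Summarize is a hypothetical helper, and its parameter types (a string list, a string, and a nullable string) are assumptions about the shapes of Failures, Status, and Error; the sample messages are made up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: combine Failures, Status, and Error into one verdict.
// A run-level error takes priority over assertion failures.
string Summarize(IReadOnlyList<string> failures, string status, string? error)
{
    if (status == "error")
        return $"Error: {error}";

    return failures.Count == 0
        ? "Pass"
        : $"Fail ({failures.Count}): {string.Join("; ", failures)}";
}

// Made-up sample values standing in for a real ReplayResult.
Console.WriteLine(Summarize(Array.Empty<string>(), "success", null));
// prints "Pass"
Console.WriteLine(Summarize(new[] { "step 'classify' missing" }, "success", null));
// prints "Fail (1): step 'classify' missing"
Console.WriteLine(Summarize(Array.Empty<string>(), "error", "timeout"));
// prints "Error: timeout"
```

With a real result, the call would look like Summarize(replayResult.Failures, replayResult.Status, replayResult.Error), provided the member types match the assumptions above.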
LastStatus / LastError
After RunAsync, inspect the outcome without catching exceptions:
await testSdk.RunAsync("my-agent", service.ExecuteAsync);
Assert.Equal("success", testSdk.LastStatus);
Assert.Null(testSdk.LastError);