A Session holds one or more environments. See Concepts → The lifecycle for how reset / login / evaluate fit together; this page is the API reference.
Creating
Provide exactly one of envs, testcase, or artifacts.
The timeout parameter sets how many seconds after creation the session expires (default 1800 = 30 min).
testcase= auto-resets. envs= does not — call session.reset() first thing yourself.
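The creation rules above can be sketched as a small stand-alone check. This is not the SDK's own code: `check_create_args` and `needs_manual_reset` are hypothetical helpers written for illustration, and only the parameter names `envs`, `testcase`, and `artifacts` come from this page.

```python
def check_create_args(envs=None, testcase=None, artifacts=None):
    """Return the name of the single source given, or raise ValueError.

    Hypothetical helper: mirrors the "exactly one of" rule described above.
    """
    given = [
        name
        for name, value in (
            ("envs", envs),
            ("testcase", testcase),
            ("artifacts", artifacts),
        )
        if value is not None
    ]
    if len(given) != 1:
        raise ValueError(
            f"provide exactly one of envs, testcase, or artifacts; got {given or 'none'}"
        )
    return given[0]


def needs_manual_reset(source):
    """Say whether the caller must call session.reset() after creation.

    Per this page: testcase= auto-resets, envs= does not.
    The page does not state what artifacts= does, so None means "unknown".
    """
    return {"testcase": False, "envs": True}.get(source)
```

With this sketch, `check_create_args(envs=[...])` returns `"envs"`, and `needs_manual_reset("envs")` is `True`, which is the case where you call session.reset() yourself.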
Operations
reset()
Initialize mutation logging on every env. Call this before login and the agent run — see Concepts → Reset.
get_state()
Flush pending writes and return mutations per env.
evaluate(value=None)
Score the session against its linked testcase. value is required for OUTPUT scoring; omit for MUTATION-only.
get_public_url(port=None)
Browser-accessible URLs per env, keyed by alias.
link_testcase(testcase_id)
Link a testcase to a session created from envs= so evaluate() has scoring config to read.
login(browser) / desktop login
For app-sim sessions, pass a Playwright Browser. For desktop sessions, use desktop.sdk.login(session) instead — session.login(browser) raises if session.desktop_env is set. Full pattern: Examples → Full evaluation.
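One way to keep the two login paths apart is a small dispatcher that branches on session.desktop_env. The sketch below is an assumption-laden illustration, not part of the SDK: `login_any` and its `desktop_login` parameter are hypothetical, and the caller is expected to pass in the desktop login function (for example desktop.sdk.login) rather than have the helper import it.

```python
def login_any(session, browser=None, desktop_login=None):
    """Route to the right login call based on session.desktop_env.

    Hypothetical helper. `desktop_login` is whatever callable performs the
    desktop login for a session; `browser` is a Playwright Browser.
    """
    if session.desktop_env is not None:
        # Desktop sessions: session.login(browser) would raise, so avoid it.
        if desktop_login is None:
            raise ValueError("desktop session: pass the desktop login callable")
        return desktop_login(session)
    # App-sim sessions: log in through the browser.
    if browser is None:
        raise ValueError("app-sim session: pass a Playwright Browser")
    return session.login(browser)
```

Branching before the call avoids relying on the exception that session.login(browser) raises for desktop sessions.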
close()
Call close() inside a try/finally so VMs go away on crash.
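The cleanup pattern can be sketched with a stand-in object, since the import path and constructor for the real Session are not shown on this page. `FakeSession` and `run_session` are hypothetical; only the method names reset() and close() come from this reference.

```python
class FakeSession:
    """Stand-in that records calls; not the real Session class."""

    def __init__(self):
        self.calls = []

    def reset(self):
        self.calls.append("reset")

    def close(self):
        self.calls.append("close")


def run_session(session, work):
    """Run `work(session)` and always close the session afterwards."""
    try:
        # Needed for sessions created from envs=; testcase= auto-resets.
        session.reset()
        return work(session)
    finally:
        # Runs on success and on any exception, so the VMs are released.
        session.close()
```

If `work` raises, the exception still propagates to the caller, but close() has already run by then.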
Accessing envs
Properties
| Property | Type | Description |
|---|---|---|
| session_id | str | Unique session identifier |
| task_public_id | str \| None | Test case ID if created from testcase= |
| envs | list[Environment] | All envs in the session |
| desktop_env | Environment \| None | First env where is_desktop=True |