Eval definitions

Eval definitions contain the prompt, expected behavior, starter project, and runtime options Sapient uses for API Performance runs.

Commands

Command	Description
`sapient api-performance evals list`	List eval definitions.
`sapient api-performance evals retrieve --eval-id <eval_id>`	Retrieve one eval definition.
`sapient api-performance evals update --eval-id <eval_id>`	Update one eval definition.

List eval definitions

sapient api-performance evals list --output-format json

Filter to enabled evals for one integration:

sapient api-performance evals list \
  --integration-id int_123 \
  --enabled true

Example response:

{
  "data": [
    {
      "id": "eval_123",
      "endpoint_id": "endpoint_123",
      "eval_type": "integration",
      "prompt": "Use the API to crawl a page.",
      "custom_prompt": null,
      "expected_behavior": "The agent completes the task successfully.",
      "starting_project_id": "starter_123",
      "docs_mode": "include",
      "include_env_vars": true,
      "env_profile_ids": ["prod"],
      "model_ids": ["claude-sonnet-4.5"],
      "compare_skills": false,
      "skills_mode": "selected",
      "skills_enabled": true,
      "skill_ids": ["skill_123"],
      "mcp_enabled": false,
      "run_context": {},
      "enabled": true
    }
  ],
  "meta": {
    "count": 1
  }
}

Retrieve an eval definition

sapient api-performance evals retrieve --eval-id eval_123

Update an eval definition

Change the custom prompt and expected behavior:

sapient api-performance evals update --eval-id eval_123 \
  --custom-prompt '"Use the TypeScript SDK to crawl https://example.com."' \
  --expected-behavior '"The agent installs the SDK, sends a valid request, and prints the crawl result."'
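The example above passes each string value as a JSON string literal, so a prompt held in a shell variable needs its double quotes, backslashes, and newlines escaped before it is passed. The following is a minimal sketch, assuming the flag accepts any valid JSON string. `json_string` and `set_custom_prompt` are hypothetical helper names, not part of the Sapient CLI.

```shell
#!/usr/bin/env bash

# Hypothetical helper: encode a shell string as a JSON string literal.
# Covers backslash, double quote, newline, and tab only; other control
# characters are not handled in this sketch.
json_string() {
  local s=$1
  s=${s//\\/\\\\}      # backslash -> \\ (must run first)
  s=${s//\"/\\\"}      # double quote -> \"
  s=${s//$'\n'/\\n}    # newline -> \n
  s=${s//$'\t'/\\t}    # tab -> \t
  printf '"%s"' "$s"
}

# Hypothetical wrapper: set the custom prompt of one eval from plain text.
set_custom_prompt() {
  local eval_id=$1 prompt=$2
  sapient api-performance evals update --eval-id "$eval_id" \
    --custom-prompt "$(json_string "$prompt")"
}
```

With these defined, `set_custom_prompt eval_123 'Use the "v2" endpoint.'` would pass `"Use the \"v2\" endpoint."` as the flag value.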

Attach a starter project:

sapient api-performance evals update --eval-id eval_123 \
  --starting-project-id '"starter_123"'

Control runtime context:

sapient api-performance evals update --eval-id eval_123 \
  --docs-mode '"include"' \
  --include-env-vars true \
  --env-profile-ids '["prod"]' \
  --model-ids '["claude-sonnet-4.5"]' \
  --mcp-enabled false \
  --run-context '{"framework":"nextjs","package_manager":"pnpm"}'

Control skill usage:

sapient api-performance evals update --eval-id eval_123 \
  --skills-mode '"selected"' \
  --skill-ids '["skill_123"]' \
  --compare-skills true
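When the skill IDs come from a script rather than being typed by hand, the JSON array for `--skill-ids` can be assembled from positional arguments. This is a sketch under stated assumptions: `json_id_array` and `select_skills` are hypothetical helpers, and IDs are assumed to contain only letters, digits, underscores, and hyphens, so no JSON escaping is applied.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build a JSON array of strings from the arguments.
# Assumption: IDs use only letters, digits, underscores, and hyphens;
# anything else (including an empty ID) is rejected instead of escaped.
json_id_array() {
  local out="" id
  for id in "$@"; do
    case $id in
      ''|*[!A-Za-z0-9_-]*)
        echo "invalid id: $id" >&2
        return 1
        ;;
    esac
    out+="${out:+,}\"$id\""   # comma-separate every element after the first
  done
  printf '[%s]' "$out"
}

# Hypothetical wrapper: switch an eval to selected skills and set the IDs.
select_skills() {
  local eval_id=$1 ids
  shift
  ids=$(json_id_array "$@") || return 1
  sapient api-performance evals update --eval-id "$eval_id" \
    --skills-mode '"selected"' \
    --skill-ids "$ids"
}
```

For example, `select_skills eval_123 skill_123 skill_456` would pass `["skill_123","skill_456"]` to `--skill-ids`.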

Set `--skills-mode '"all"'` to use all available skills for the organization, or `--skills-mode '"none"'` to disable skill context. Use `--compare-skills true` to have Sapient compare skill-enabled runs against the default eval path.

Disable an eval without deleting it:

sapient api-performance evals update --eval-id eval_123 --enabled false
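To disable several evals in one pass, the same command can be wrapped in a loop. This is a minimal sketch: `disable_evals` is a hypothetical helper name, and it assumes the CLI exits with a non-zero status when an update fails.

```shell
#!/usr/bin/env bash

# Hypothetical helper: disable every eval ID given as an argument.
# Continues past failures and returns 1 if any update failed.
disable_evals() {
  local id failed=0
  for id in "$@"; do
    if ! sapient api-performance evals update --eval-id "$id" --enabled false; then
      echo "failed to disable $id" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Calling `disable_evals eval_123 eval_456` runs one update per ID and reports any that fail on standard error.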

Updatable fields:

Flag	Description
`--custom-prompt`	Replacement prompt text for the eval.
`--expected-behavior`	Expected behavior used by graders and diagnosis.
`--starting-project-id`	Starter project cloned before the eval. Pass an empty string to clear it.
`--docs-mode`	Docs behavior for the eval. Use `default`, `include`, or `exclude`.
`--include-env-vars`	Include configured environment variable names in the eval prompt.
`--env-profile-ids`	Environment profile IDs to use for this eval as a JSON array. Empty array clears the selection.
`--model-ids`	Model IDs to use for this eval as a JSON array. Empty array uses config-level models.
`--skills-mode`	Skill behavior for the eval. Use `none`, `all`, or `selected`.
`--skill-ids`	Skill IDs to use when `--skills-mode` is set to `selected`, as a JSON array.
`--compare-skills`	Compare skill-enabled runs against the default eval path.
`--mcp-enabled`	Enable MCP context for the eval.
`--run-context`	Additional runtime context as JSON.
`--enabled`	Enable or disable the eval.
