Custom evals are higher-level tasks Sapient should test with coding agents. Use them for workflows that span multiple operations, such as installing an SDK, authenticating, and sending a first request.
Commands
| Command | Description |
|---|---|
| sapient api-performance custom-evals list | List custom evals. |
| sapient api-performance custom-evals create | Create a custom eval. |
| sapient api-performance custom-evals retrieve --custom-eval-id <custom_eval_id> | Retrieve one custom eval. |
| sapient api-performance custom-evals update --custom-eval-id <custom_eval_id> | Update one custom eval. |
| sapient api-performance custom-evals delete --custom-eval-id <custom_eval_id> | Delete one custom eval. |
| sapient api-performance custom-evals history list | List custom eval history rows. |
List custom evals
sapient api-performance custom-evals list --output-format json
Example response:
{
  "data": [
    {
      "id": "custom_eval_123",
      "prompt": "Install the SDK and crawl https://example.com",
      "prompt_template": null,
      "description": "Measures whether an agent can complete the first SDK request.",
      "competitor": null,
      "prompt_competitor": null,
      "prompt_competitor_group": null,
      "category": {
        "name": "Getting started",
        "slug": "getting-started"
      },
      "eval": {
        "id": "eval_123",
        "eval_type": "integration",
        "enabled": true
      }
    }
  ],
  "meta": {
    "count": 1
  }
}
Create a custom eval
sapient api-performance custom-evals create \
--prompt "Install the SDK and crawl https://example.com" \
--category-name "Getting started" \
--description "Measures whether an agent can complete the first SDK request."
Configure the generated eval at creation time:
sapient api-performance custom-evals create \
--prompt "Install the SDK and crawl https://example.com" \
--description "The agent should complete the first working request." \
--eval '{"docs_mode":"include","include_env_vars":true,"mcp_enabled":false}'
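The --eval value is a JSON object passed as a single shell argument, so quoting mistakes are easy to make when the settings come from variables. The following is a minimal sketch, assuming a POSIX shell. build_eval_json is a hypothetical helper, not part of the Sapient CLI, and it only covers the three keys shown in the example above.

```shell
# Hypothetical helper (not part of the Sapient CLI): prints the JSON object
# used in the --eval example above.
# $1 = docs_mode string, $2 = include_env_vars, $3 = mcp_enabled
# Assumes $1 contains no double quotes or backslashes (no JSON escaping is done).
build_eval_json() {
  # Accept only literal JSON booleans for the two switches.
  for flag in "$2" "$3"; do
    case "$flag" in
      true|false) ;;
      *) echo "expected true or false, got: $flag" >&2; return 1 ;;
    esac
  done
  printf '{"docs_mode":"%s","include_env_vars":%s,"mcp_enabled":%s}\n' "$1" "$2" "$3"
}
```

Passing --eval "$(build_eval_json include true false)" to the create command would then send the same payload as the hand-written example.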
Use a prompt template when Sapient should render the prompt for a specific company. Templates support {{company_name}} and {{company_domain}}.
sapient api-performance custom-evals create \
--body '{"prompt":"Build an integration for Example API.","prompt_template":"Build an integration for {{company_name}} using docs from {{company_domain}}.","category_name":"Getting started"}'
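To preview what a template might look like once the placeholders are filled in, the substitution can be approximated locally. This is an illustration only: Sapient performs the real rendering, and render_prompt is a hypothetical helper whose behavior may differ from Sapient's in edge cases.

```shell
# Illustration only: approximates filling in a prompt template.
# render_prompt is hypothetical and not part of the Sapient CLI.
# $1 = template, $2 = company name, $3 = company domain
render_prompt() {
  # Uses | as the sed delimiter; assumes the values contain no |, & or backslash.
  printf '%s\n' "$1" |
    sed -e "s|{{company_name}}|$2|g" -e "s|{{company_domain}}|$3|g"
}
```

Calling render_prompt with the template from the command above, a company name, and a domain prints the prompt with both placeholders replaced, which is a quick way to catch a misspelled placeholder before creating the eval.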
Retrieve a custom eval
sapient api-performance custom-evals retrieve --custom-eval-id custom_eval_123
List custom eval history
List evaluated history rows for all custom evals:
sapient api-performance custom-evals history list \
--limit 50
Filter to one custom eval:
sapient api-performance custom-evals history list \
--custom-eval-id custom_eval_123 \
--sort-by time \
--sort-dir desc \
--limit 25
History rows include the custom eval prompt, target labels, pass/fail result, score, tool call count, latency, and run ID. Use the returned run ID with sapient api-performance runs retrieve to inspect full run detail.
Useful flags:
| Flag | Description |
|---|---|
| --custom-eval-id | Filters history to one custom eval. |
| --result-filter | Filters by result. Use all, passed, or failed. |
| --target-key | Filters to a target key from a previous history response. |
| --category-id | Filters to one custom eval category. |
| --search | Searches prompts, categories, target labels, and errors. |
| --limit | Maximum rows to return. |
| --offset | Pagination offset. |
| --sort-by | Sorts by time, use_case, target, result, score, tool_calls, or latency. |
| --sort-dir | Sort direction. Use asc or desc. |
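The flags above can be combined into small wrappers for common queries. The sketch below assumes a POSIX shell and uses only flags from the table. Both function names are hypothetical, and it assumes that --offset counts rows rather than pages and that sorting by latency in desc order lists the slowest rows first.

```shell
# Hypothetical helper: fetch one page of history rows.
# $1 = zero-based page number, $2 = page size
# Assumes --offset is a row offset, so page N starts at row N * size.
history_page() {
  offset=$(( $1 * $2 ))
  sapient api-performance custom-evals history list \
    --limit "$2" \
    --offset "$offset"
}

# Hypothetical helper: list failed rows for one custom eval, slowest first.
# $1 = custom eval ID, $2 = maximum rows (defaults to 10)
slowest_failures() {
  sapient api-performance custom-evals history list \
    --custom-eval-id "$1" \
    --result-filter failed \
    --sort-by latency \
    --sort-dir desc \
    --limit "${2:-10}"
}
```

For example, history_page 2 25 requests rows 50 through 74, and slowest_failures custom_eval_123 requests up to ten failed rows for that eval.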
Update a custom eval
sapient api-performance custom-evals update --custom-eval-id custom_eval_123 \
--description "Covers SDK install, authentication, and first crawl request."
Update the attached eval definition:
sapient api-performance custom-evals update --custom-eval-id custom_eval_123 \
--eval '{"expected_behavior":"The agent installs the SDK, authenticates, and prints the crawl result."}'
Updateable fields:
| Flag | Description |
|---|---|
| --prompt | Task prompt. |
| --prompt-template | Optional prompt template. Supports {{company_name}} and {{company_domain}}. |
| --category-name | Category label for organizing custom evals. |
| --description | Expected task outcome. |
| --competitor | Competitor ID for competitor-specific eval variants. |
| --prompt-competitor-group | Base custom eval ID for competitor variants. |
| --eval | Eval definition update object as JSON. |
Delete a custom eval
sapient api-performance custom-evals delete --custom-eval-id custom_eval_123
Deleting a custom eval removes it from future API Performance runs.