Variant Definitions
This directory contains only V8-hybrid — the variant proposed in PR #189.
All other variants (V1–V7) were used during the evaluation but are not included in this PR to keep the diff reviewable. They are published at:
ascerra/code-agent-eval-scenarios — variants/
Variant inventory
| ID | Name | Browse | Description |
|---|---|---|---|
| V1 | fullsend-single-skill | view | PR #189 original — agent + single skill + scan-secrets |
| V2 | fullsend-multi-skill | view | PR #189 with skill split into 4 pieces |
| V3 | vanilla-claude | view | No guardrails baseline — just a prompt |
| V4 | claudemd-only | view | CLAUDE.md instructions only — stopped early |
| V5 | apex | view | Enhanced V1 with reasoning protocol, self-review, minimal diff |
| V6 | apex-github | view | V5 hardcoded for GitHub (rigidity hurts) |
| V7 | ultimate | view | V5 + "understand before you act" + reproduction step |
| V8 | hybrid | here | Cleaned V1 + V5 minimal-diff + V7 reproduction/task-type |
