ProofAgent Community is the open ecosystem for adversarial, multi-turn, domain-aware AI agent evaluation — built around the open-source ProofAgent Harness (Apache 2.0).
What's in the community
The ecosystem brings together adversarial traps, juror personas, scoring rubrics, agent skills, and domain benchmarks contributed by developers, researchers, and enterprise teams who deploy AI agents in production. Everything is open and inspectable on GitHub.
183 adversarial attack traps across 11 families: social engineering, prompt injection, compliance (GDPR, HIPAA, PCI, SOX), tool misuse, factuality, data exfiltration, and more
Composite attack chains — 5 to 7 turn adversarial sequences blending authority pressure, urgency, sympathy, and refusal-as-betrayal
Juror personas — rigorous, lenient, contrarian — that score agent transcripts independently before consensus
Agent skills and behavioral benchmarks shared across the community for reproducible evaluation
Contribute
The community accepts pull requests for new traps, agent specs, juror personas, and benchmarks. Each contribution is validated against the canonical trap manifest schema and tested against the conductor pipeline. Browse the GitHub repository to author your own trap pack or distribute one as a pip-installable package (proofagent_traps_<name>).
Built around ProofAgent Harness
The harness ships with a planner, conductor, jury, consensus, and reporter pipeline. Bring your own LLM via LiteLLM (Anthropic, OpenAI, Gemini, Bedrock, Ollama, vLLM, lm-studio). Run fully local, in CI/CD, or as part of a production deployment workflow.