ProofAgent is the accountability platform for production AI agents. It turns agent risk into deployment evidence through adversarial multi-juror scoring, production log audits, artifact reviews, signed readiness reports, and human review. The platform is built around the open-source ProofAgent Harness.

How do I test my AI agent with ProofAgent?

Install the open-source harness with 'pip install proofagent-harness', wrap your agent in a function returning AgentResponse, then call Harness().evaluate(my_agent, role, goal, knowledge, context). The harness runs adversarial multi-turn sessions and returns a /10 readiness score with traceable findings and fix recommendations.

What is adversarial multi-juror scoring?

Adversarial multi-juror scoring is ProofAgent's evaluation approach: a planner picks domain traps, a conductor applies sustained pressure across 25+ turns, and three independent juror agents score every behavior change. No single LLM call ever decides the verdict — the jury agents reach consensus or debate to a final score.

Is ProofAgent SOC 2 / HIPAA / GDPR compliant?

ProofAgent is SOC 2 Type II aligned, HIPAA-ready (BAAs available for enterprise customers), and follows GDPR best practices. Enterprise customers can deploy on-premises or in a private cloud with SSO/SAML, RBAC, tamper-evident audit logs, TLS 1.2+ in transit, and AES-256 at rest.

Can I use my own LLM with ProofAgent?

Yes. ProofAgent is BYO Harness LLM — the harness internals can run on any LLM provider (OpenAI, Anthropic, Google, local models). You bring your own model and API key; the harness orchestrates the multi-juror evaluation around it.

What metrics does ProofAgent measure?

11+ production metrics including Task Success, Hallucination Control, Safety, Policy Compliance, Memory Stability, Tone and Empathy, Manipulation Resistance, Tool Picking, Reasoning Quality, Relevance, and Drift Detection. Every metric is anchored to per-turn transcript evidence.

What is the difference between ProofAgent Platform and ProofAgent Harness OSS?

ProofAgent Harness OSS is the open-source multi-turn adversarial testing engine — Tier 1 of the platform, available standalone for developers and CI under Apache 2.0. ProofAgent Platform is the enterprise product that adds the other four tiers (production log audit, artifact review, multi-agent orchestration scoring, expert human review), a hosted dashboard, REST API, governance features, signed readiness reports, and dedicated support.

ProofAgent Harness — the best open-source AI agent evaluation framework

Name: ProofAgent Platform
Brand: ProofAgent
Availability: InStock

The most rigorous, research-backed, multi-turn adversarial AI agent testing tool. Apache 2.0 open-source. Bring your own LLM. Runs locally, in CI/CD, or in production. 183 bundled traps across 11 attack families. Trusted by engineering teams shipping production-grade AI agents. Full methodology documented in the published whitepaper (arXiv:2605.24134).

What the Harness does

The Harness runs a 5-stage pipeline against any callable AI agent. A planner picks domain-relevant traps from the 183-trap library, a conductor runs N adversarial turns with realistic attacks, three juror personas independently score the transcript across 5 canonical metrics, consensus resolves disagreements through Delphi or debate rounds, and a reporter produces a signed readiness verdict with transcript-linked findings.

Why multi-turn adversarial matters

Single-shot evaluation misses how production AI agents actually fail. The Harness ships composite attack chains — 5 to 7 turn adversarial sequences that blend authority pressure, urgency framing, sympathy appeals, refusal-as-betrayal pivots, and policy gaslighting. Real adversaries don't use "ignore previous instructions"; they apply sustained pressure across many turns. The Harness models that.

Quickstart

Install with pip install proofagent-harness. Wrap your existing AI agent as a callable that takes a string and returns a string (or an AgentResponse with tools_called and retrievals for deeper scoring). Pass it to Harness.evaluate() with your system prompt, tools, and knowledge corpus. Run locally with any LiteLLM-supported model — Anthropic, OpenAI, Gemini, Bedrock, Ollama, vLLM, lm-studio.

183 adversarial traps across 11 families: prompt injection, social engineering, compliance, tool misuse, data exfiltration, factuality, code safety, business logic, policy drift, verbal abuse, bias
Composite attack chain support for sustained multi-turn pressure
3-juror Delphi consensus reduces single-judge bias
pytest integration with assertion-style thresholds
Local-first — your context never leaves the machine