POIROT Protocol

Peer-Oriented Identification & Resolution of Operational Threats

Iñaki Dellibarda Varela1*, R. Sendra-Arranz1, Pablo Romero-Sorozabal1, J.M. Valverde-García1, Annemarie F. Laudanski1,2, Álvaro Gutiérrez3, Eduardo Rocon1*†, Manuel Cebrian1†

1 Center for Automation and Robotics, Spanish National Research Council (CSIC-UPM), Madrid, Spain
2 Biomechanics of Human Mobility Laboratory, Dept. of Kinesiology and Health Sciences, University of Waterloo, Canada
3 ETSI Telecomunicación, Universidad Politécnica de Madrid (UPM), Madrid, Spain
* Corresponding authors  ·  † Equal supervision contribution

A consensus-based framework for detecting and attributing errors in multi-agent AI systems through collaborative peer interrogation and weighted voting.

🎯 What is POIROT?

The Problem

Multi-agent AI systems are increasingly deployed in critical domains — healthcare, finance, autonomous systems — but detecting and attributing errors across distributed agents remains an open problem. Traditional debugging approaches fail when agents have partial observability and faults propagate silently through agent interactions.

The Solution

Rather than relying on an external judge, POIROT turns the system's own agents into investigators. Each agent already understands its role and what it observed — making them the best-placed experts to reason about what went wrong. Through structured peer interrogation and weighted consensus voting, this collective knowledge outperforms single-LLM baselines by up to +26 percentage points.

🔬 How POIROT Works: 5-Phase Protocol

Phase 1: Error Vector Space Construction

The POIROT Agent analyzes the multi-agent system description to identify all potential error locations and constructs an N-dimensional error vector space.

INPUT
  • System architecture description
  • Agent roles and responsibilities
OUTPUT
  • Error dimension labels
  • Binary vector representation [0,1,0,...]
  • Component descriptions
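The Phase 1 output can be sketched as a small data structure. This is a hypothetical illustration — the class name `ErrorVectorSpace` and the example dimension labels are ours, not part of the protocol specification:

```python
from dataclasses import dataclass

@dataclass
class ErrorVectorSpace:
    """Sketch of the N-dimensional error vector space built in Phase 1."""
    labels: list[str]        # one label per error dimension
    descriptions: list[str]  # human-readable component descriptions

    @property
    def n_dimensions(self) -> int:
        return len(self.labels)

    def encode(self, suspected: set[str]) -> list[int]:
        """Binary vector representation: 1 where a dimension is suspected."""
        return [1 if label in suspected else 0 for label in self.labels]

# Illustrative 3-agent system (labels are placeholders)
space = ErrorVectorSpace(
    labels=["planner", "retriever", "executor"],
    descriptions=["Task planning agent", "Document retrieval agent",
                  "Action execution agent"],
)
print(space.encode({"retriever"}))  # → [0, 1, 0]
```

The binary encoding is what later phases vote over: each dimension corresponds to one candidate error location.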

Phase 2: Individual Analysis

Each agent independently analyzes the session execution logs from its own perspective. Agents see only the messages they participated in, which keeps observations evidence-based and prevents hallucination.

🔍 KEY FEATURES
  • Message Filtering: agents see only their own messages, preventing confabulation
  • JSON Output: structured anomaly reports with evidence citations
  • Transparency: agents acknowledge when they see nothing wrong
  • Non-Participants: agents not in the session can still join Phase 2
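A minimal sketch of the per-agent message filtering and structured report, under stated assumptions: the log fields (`sender`, `recipient`, `text`) and the report schema are hypothetical stand-ins for whatever the actual implementation uses, and the anomaly check is a trivial placeholder for LLM-based analysis:

```python
import json

def phase2_report(agent_id: str, session_log: list[dict]) -> str:
    """Filter the log to this agent's own messages and emit a JSON report."""
    # Message filtering: only messages the agent sent or received are visible
    visible = [m for m in session_log
               if agent_id in (m["sender"], m["recipient"])]
    # Placeholder anomaly check standing in for the agent's LLM analysis
    anomalies = [{"message_id": m["id"],
                  "observation": "unexpected content",
                  "evidence": m["text"]}
                 for m in visible if "ERROR" in m["text"]]
    return json.dumps({
        "agent": agent_id,
        "anomalies": anomalies,
        # Transparency: explicitly state when nothing looks wrong
        "nothing_observed": not anomalies,
    })

log = [
    {"id": 1, "sender": "planner", "recipient": "executor", "text": "plan ok"},
    {"id": 2, "sender": "executor", "recipient": "planner", "text": "ERROR: step failed"},
    {"id": 3, "sender": "retriever", "recipient": "planner", "text": "docs ready"},
]
print(phase2_report("executor", log))
```

Note that the retriever, which never saw message 2, would report `nothing_observed: true` — the filtering is what keeps each report grounded in first-hand evidence.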

Phase 3: Peer Consultation Protocol

Agents receive each other's Phase 2 reports and, in turns, can interrogate their peers — asking follow-up questions, requesting clarifications, or challenging observations. This structured dialogue lets agents cross-reference their partial views of the session. Once the consultation concludes, each agent produces its final fault attribution decision with a full justification.
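The turn structure of the consultation can be sketched as a simple round-robin loop. This is an illustration only: `ask` stands in for an LLM call, and the prompt wording and round count are assumptions, not the protocol's actual parameters:

```python
def consultation(agents: list[str], reports: dict[str, str], ask, rounds: int = 1):
    """Round-robin peer interrogation: each agent questions every peer in turn."""
    transcript = []
    for _ in range(rounds):
        for questioner in agents:            # agents take turns interrogating
            for respondent in agents:
                if respondent == questioner:
                    continue
                # Questioner challenges the peer's Phase 2 report...
                q = ask(questioner, f"Challenge this report: {reports[respondent]}")
                # ...and the peer answers the follow-up
                a = ask(respondent, q)
                transcript.append((questioner, respondent, q, a))
    return transcript

# Stubbed LLM call for illustration: echoes the speaker and a prompt prefix
stub = lambda agent, prompt: f"{agent} says: {prompt[:20]}"
t = consultation(["planner", "executor"],
                 {"planner": "saw nothing", "executor": "step 3 failed"}, stub)
```

After the loop, each agent would receive the full transcript and produce its final fault attribution with justification.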

Phase 4: Weighted Voting with Hamming Distance

Agent votes are weighted based on their proximity to the suspected error location using Hamming distance in the error vector space. Agents voting for themselves receive maximum weight; votes far from their position receive lower weight.

📊 VOTING FORMULA
Hamming Similarity:
similarity = 1 - (hamming_distance / N_dimensions)

Vote Weight:
weight = baseline + 0.5 × similarity

Example:
• Agent voting for self: weight ≈ 0.75 (high confidence)
• Agent voting nearby: weight ≈ 0.58 (medium)
• Agent voting far: weight ≈ 0.42 (low)

Final Consensus:
consensus[i] = sum(vote[i] × weight) / sum(weights)
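The formulas above translate directly into code. In this sketch, `baseline=0.25` is our assumption, chosen so the worked weights reproduce: a self-vote has similarity 1 and weight 0.25 + 0.5 × 1 = 0.75:

```python
def hamming_weighted_consensus(votes: dict, positions: dict,
                               baseline: float = 0.25) -> list[float]:
    """Aggregate binary fault votes, weighting each agent's vote by its
    Hamming similarity to its own position in the error vector space.
    (baseline=0.25 is an assumed value consistent with the worked example.)"""
    n = len(next(iter(positions.values())))   # N_dimensions
    consensus = [0.0] * n
    total = 0.0
    for agent, vote in votes.items():
        dist = sum(p != v for p, v in zip(positions[agent], vote))
        similarity = 1 - dist / n             # 1 - hamming_distance / N
        weight = baseline + 0.5 * similarity  # self-vote => weight 0.75
        total += weight
        consensus = [c + v * weight for c, v in zip(consensus, vote)]
    return [c / total for c in consensus]     # sum(vote × weight) / sum(weights)

# Illustrative 3-agent system: everyone votes for the "retriever" dimension
positions = {"planner": [1, 0, 0], "retriever": [0, 1, 0], "executor": [0, 0, 1]}
votes     = {"planner": [0, 1, 0], "retriever": [0, 1, 0], "executor": [0, 1, 0]}
print(hamming_weighted_consensus(votes, positions))  # → [0.0, 1.0, 0.0]
```

Here the retriever's self-vote carries weight 0.75 while the two distant votes each carry about 0.42, and the unanimous verdict drives the consensus for that dimension to 1.0.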

Phase 5: Fault Localization

Once all weighted votes are aggregated, POIROT identifies the most probable fault location: the component or set of components with the highest consensus score across the error vector space. The result is a ranked attribution — indicating not just where the failure likely originated, but with what degree of collective confidence.
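The final localization step is then a ranking over the consensus vector (a sketch; the component labels and scores are illustrative):

```python
def rank_attribution(consensus: list[float], labels: list[str]) -> list[tuple]:
    """Return (component, consensus_score) pairs sorted highest-first."""
    return sorted(zip(labels, consensus), key=lambda pair: pair[1], reverse=True)

ranking = rank_attribution([0.12, 0.71, 0.17],
                           ["planner", "retriever", "executor"])
print(ranking[0])  # → ('retriever', 0.71)
```

The top entry gives the most probable fault location, and the score attached to each entry expresses the degree of collective confidence behind it.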

🔬 Our Evaluation Benchmark

📐 BLAME

Benchmark for Localizing Agent Malfunctions Effectively — the open evaluation suite we developed to validate POIROT across two distinct multi-agent domains. BLAME provides structured fault injection scenarios, ground truth attribution vectors, and standardized metrics for benchmarking agent debugging protocols.

🏥 CORTEX — Medical rehabilitation · 7 dimensions · 15 fault scenarios · 3 agents
💹 TradingAgents — Algorithmic trading · 15 dimensions · 6 fault scenarios · 12 agents

📈 Validation Results


⏱️ Who & When Benchmark

An open multi-agent benchmark evaluating fault attribution on dynamic, real-world conversational tasks — identifying which agent made an error and at what point. POIROT achieves 42% overall accuracy on 126 heterogeneous cases, with perfect attribution on single-agent scenarios and strong performance as pipeline complexity grows.

  • 42% overall accuracy — 53 / 126 correct
  • 100% on single-agent tasks — 4/4, perfect attribution
  • 67% on 4-agent tasks — 6/9 correct
  • 126 total cases across task categories
🏥 BLAME · CORTEX — POIROT vs. Baseline

| Model             | Baseline | POIROT |
|-------------------|----------|--------|
| Gemini 2.5 Pro    | 27.8%    | 40.5%  |
| DeepSeek Reasoner | 16.7%    | 42.3%  |
| GPT-oss 120B      | 32.7%    | 31.3%  |
| GPT-oss 20B       | 12.7%    | 19.3%  |
💹 BLAME · TradingAgents — POIROT vs. Baseline

| Model             | Baseline | POIROT |
|-------------------|----------|--------|
| Gemini 2.5 Pro    | 25%      | 66.7%  |
| DeepSeek Reasoner | 25.5%    | 44.1%  |
| GPT-oss 120B      | 34.4%    | 48.7%  |
| GPT-oss 20B       | 40.2%    | 48.4%  |

🎬 Live Demonstrations

Explore real POIROT analyses across two multi-agent systems from the BLAME benchmark. Each case shows the complete 5-phase protocol with actual error injection, agent deliberation, and consensus voting.