
POIROT Protocol
Peer-Oriented Identification & Resolution of Operational Threats
Iñaki Dellibarda Varela1*, R. Sendra-Arranz1, Pablo Romero-Sorozabal1, J.M. Valverde-García1, Annemarie F. Laudanski1,2, Álvaro Gutiérrez3, Eduardo Rocon1*†, Manuel Cebrian1†
A consensus-based framework for detecting and attributing errors in multi-agent AI systems through collaborative peer interrogation and weighted voting.
🎯 What is POIROT?
The Problem
Multi-agent AI systems are increasingly deployed in critical domains — healthcare, finance, autonomous systems — but detecting and attributing errors across distributed agents remains an open problem. Traditional debugging approaches fail when agents have partial observability and faults propagate silently through agent interactions.
The Solution
Rather than relying on an external judge, POIROT turns the system's own agents into investigators. Each agent already understands its role and knows what it observed, which makes the agents the best-placed experts to reason about what went wrong. Through structured peer interrogation and weighted consensus voting, this collective knowledge outperforms single-LLM baselines by up to 26 percentage points.
🔬 How POIROT Works: 5-Phase Protocol
Phase 1: Error Vector Space Construction
The POIROT Agent analyzes the multi-agent system description to identify all potential error locations and constructs an N-dimensional error vector space.
- System architecture description
- Agent roles and responsibilities
- Error dimension labels
- Binary vector representation [0,1,0,...]
- Component descriptions
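As a minimal sketch of what such an error vector space could look like, the snippet below maps component labels to positions in a binary vector. The `ErrorSpace` class and the component names (`planner`, `retriever`, `executor`) are hypothetical and chosen only for illustration; they are not taken from the POIROT implementation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorSpace:
    """Hypothetical container: one label per error dimension."""

    labels: tuple[str, ...]

    @property
    def n_dimensions(self) -> int:
        return len(self.labels)

    def encode(self, faulty: set[str]) -> list[int]:
        """Return a binary vector with 1 at every faulty component."""
        unknown = faulty - set(self.labels)
        if unknown:
            raise ValueError(f"unknown components: {sorted(unknown)}")
        return [1 if label in faulty else 0 for label in self.labels]

    def decode(self, vector: list[int]) -> list[str]:
        """Return the labels whose bit is set in the vector."""
        if len(vector) != self.n_dimensions:
            raise ValueError("vector length must match the number of dimensions")
        return [label for label, bit in zip(self.labels, vector) if bit]


# Illustrative three-component system (names are assumptions).
space = ErrorSpace(labels=("planner", "retriever", "executor"))
single_fault = space.encode({"retriever"})       # [0, 1, 0]
multi_fault = space.decode([1, 0, 1])            # ['planner', 'executor']
```

A multi-component failure is simply a vector with more than one bit set, which is why a binary vector rather than a single index is a natural fit here.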
Phase 2: Individual Analysis
Each agent independently analyzes the session execution logs from its own perspective. An agent sees only the messages it participated in, which limits hallucination and keeps its observations grounded in evidence.
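The visibility rule above can be sketched as a simple filter over the session log. The `Message` structure and its fields are assumptions made for this example; the actual log format used by POIROT is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical log entry: who sent it, who received it, and the text."""

    sender: str
    recipients: tuple
    content: str


def visible_log(log, agent):
    """Keep only the messages the agent sent or received."""
    return [m for m in log if agent == m.sender or agent in m.recipients]


# Illustrative session log (agent names are assumptions).
session_log = [
    Message("planner", ("retriever",), "Fetch the patient history."),
    Message("retriever", ("planner",), "History fetched."),
    Message("planner", ("executor",), "Run the evaluation."),
]

retriever_view = visible_log(session_log, "retriever")  # first two messages
executor_view = visible_log(session_log, "executor")    # last message only
```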
Phase 3: Peer Consultation Protocol
Agents receive each other's Phase 2 reports and take turns interrogating their peers: asking follow-up questions, requesting clarifications, or challenging observations. This structured dialogue lets agents cross-reference their partial views of the session. Once the consultation concludes, each agent produces its final fault attribution decision with a full justification.
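One way to picture the turn-taking is a round-robin loop in which each agent may put one question to a peer per round, and the consultation ends when nobody has anything left to ask. This is only a sketch under stated assumptions: the `consult` function, the `ask` and `answer` callables (stand-ins for LLM-backed agents), and the `max_rounds` cap are all hypothetical and not part of the published protocol.

```python
def consult(reports, ask, answer, max_rounds=2):
    """Run a round-robin consultation and return the question/answer transcript.

    reports: dict mapping agent name to its Phase 2 report.
    ask(asker, reports): returns (target, question) or None (hypothetical).
    answer(target, asker, question): returns the reply text (hypothetical).
    """
    transcript = []
    for _ in range(max_rounds):
        anyone_asked = False
        for asker in reports:
            request = ask(asker, reports)
            if request is None:
                continue
            target, question = request
            # Ignore questions to oneself or to agents outside the session.
            if target == asker or target not in reports:
                continue
            reply = answer(target, asker, question)
            transcript.append((asker, target, question, reply))
            anyone_asked = True
        if not anyone_asked:
            break  # nobody has further questions
    return transcript


# Deterministic stubs in place of real agents (illustrative only).
reports = {
    "planner": "Plan looked valid.",
    "executor": "Received a malformed plan.",
}
pending = {"planner": ("executor", "Which field was malformed?")}


def ask(asker, reports):
    return pending.pop(asker, None)


def answer(target, asker, question):
    return f"{target} to {asker}: {reports[target]}"


transcript = consult(reports, ask, answer)
```

In this stubbed run the planner asks the executor a single question, and the loop stops in the second round because no agent has a follow-up.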
Phase 4: Weighted Voting with Hamming Distance
Each agent's vote is weighted by how close the agent is to the suspected error location, measured by Hamming distance in the error vector space. An agent voting for itself receives the maximum weight; a vote for a location far from the agent's own position receives a lower weight.
Similarity:
similarity = 1 - (hamming_distance / N_dimensions)
Vote Weight:
weight = baseline + 0.5 × similarity
Example:
• Agent voting for self: weight ≈ 0.75 (high confidence)
• Agent voting nearby: weight ≈ 0.58 (medium)
• Agent voting far: weight ≈ 0.42 (low)
Final Consensus:
consensus[i] = sum_a(vote_a[i] × weight_a) / sum_a(weight_a)
where a ranges over the voting agents and i indexes the error dimensions.
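The formulas above can be sketched in a few lines of Python. Two points are assumptions rather than facts from the protocol description: the baseline of 0.25 is inferred from the example (a self-vote has similarity 1 and weight 0.75), and the Hamming distance is taken between the agent's own position vector and its vote vector. The three-dimensional vectors are illustrative.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def vote_weight(position, vote, baseline=0.25):
    """weight = baseline + 0.5 * similarity (baseline value is an assumption)."""
    similarity = 1 - hamming_distance(position, vote) / len(vote)
    return baseline + 0.5 * similarity


def consensus(positions, votes, baseline=0.25):
    """Weighted average of the votes, computed per error dimension."""
    weights = [vote_weight(p, v, baseline) for p, v in zip(positions, votes)]
    total = sum(weights)
    n_dimensions = len(votes[0])
    return [
        sum(w * v[i] for w, v in zip(weights, votes)) / total
        for i in range(n_dimensions)
    ]


# Three agents in a 3-dimensional error space (illustrative values).
positions = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
votes = [
    [1, 0, 0],  # agent 0 votes for itself: distance 0, weight 0.75
    [1, 1, 0],  # agent 1 votes nearby:     distance 1, weight ~0.58
    [1, 0, 0],  # agent 2 votes far:        distance 2, weight ~0.42
]

weights = [vote_weight(p, v) for p, v in zip(positions, votes)]
scores = consensus(positions, votes)  # approximately [1.0, 0.33, 0.0]
```

With N = 3 and the inferred baseline, Hamming distances of 0, 1, and 2 reproduce the example weights of roughly 0.75, 0.58, and 0.42.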
Phase 5: Fault Localization
Once all weighted votes are aggregated, POIROT identifies the most probable fault location: the component or set of components with the highest consensus score across the error vector space. The result is a ranked attribution — indicating not just where the failure likely originated, but with what degree of collective confidence.
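A ranked attribution of this kind could be produced by sorting components by consensus score, as in the sketch below. The `rank_components` and `localize` helpers, the component labels, and the 0.5 cut-off are hypothetical; the description above does not state how POIROT selects the final set of components from the ranking.

```python
def rank_components(labels, scores):
    """Pair each component with its consensus score, highest score first."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)


def localize(labels, scores, threshold=0.5):
    """Return components whose score reaches the threshold (assumed cut-off)."""
    return [label for label, score in rank_components(labels, scores)
            if score >= threshold]


# Illustrative consensus scores for a three-component system.
labels = ("planner", "retriever", "executor")
scores = [1.0, 1 / 3, 0.0]

ranking = rank_components(labels, scores)
suspects = localize(labels, scores)  # ['planner']
```

Keeping the full ranking alongside the thresholded set preserves the degree of collective confidence for every component, not only the top suspect.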
📈 Validation Results
[Chart: BLAME · CORTEX, POIROT vs. Baseline]
[Chart: BLAME · TradingAgents, POIROT vs. Baseline]
🎬 Live Demonstrations
Explore real POIROT analyses across two multi-agent systems from the BLAME benchmark. Each case shows the complete 5-phase protocol with actual error injection, agent deliberation, and consensus voting.
CORTEX
A multi-agent system of medical specialists evaluates pediatric cerebral palsy rehabilitation with the Discover2Walk exoskeleton. Explore cases ranging from single-agent errors to simultaneous five-component failures — and watch POIROT build consensus attribution.
TradingAgents
A 12-agent financial pipeline spanning data ingestion, research, risk deliberation, and execution. Each session operates near the LLM context limit — making fault attribution across the pipeline a uniquely demanding challenge.