Why AI code reviews need adversarial scrutiny to catch real bugs

Developers have long relied on AI tools like Claude to review code changes, but these systems often deliver either an overly cheerful "Looks good! 👍" or generic feedback that mixes real bugs with personal stylistic preferences. This tendency to please rather than critique undermines the very purpose of code reviews. One engineer sought a solution by designing Tribunal, a skill that replaces polite AI feedback with an adversarial process where reviewers actively challenge each other’s assessments.

Moving beyond polite AI feedback

Single-model AI code reviews suffer from a fundamental flaw: models are trained to be agreeable, not critical. Even when instructed to "be harsh," a single model tends to hedge, producing feedback that is either vague or overly cautious. Tribunal replaces this approach with a system in which multiple AI agents take opposing roles, forcing genuine scrutiny through confrontation rather than collaboration.

How adversarial roles expose real flaws

Tribunal operates through a four-stage adversarial process designed to eliminate ambiguity and surface genuine issues:

The Hater: Each file in a code change receives a dedicated "hater" agent whose sole job is to dissect the diff ruthlessly. This agent ignores stylistic preferences and focuses exclusively on technical flaws—correctness issues, race conditions, memory leaks, edge cases, and security vulnerabilities. Its feedback is intentionally biased toward finding problems, regardless of the developer’s intent.

The Integration Checker: While the hater examines individual files, a separate agent scans for cross-module bugs that might slip through. This agent looks for inconsistencies like changed function signatures, mismatched return values, or violated invariants between files—problems that often evade single-file reviews.

The Judge: For every accusation raised, a neutral judge reviews the actual code to determine whether the complaint is valid. Unlike the hater, the judge considers documentation and comments as evidence of intent. Its role is to sift through accusations, separating genuine flaws from misunderstandings or deliberate design choices.

The Verdict: The final report keeps only the accusations the judge could not refute or conceded outright as weaknesses. Everything the judge dismissed stays in the full transcript, which provides a transparent record of the review. Crucially, if the hater finds no issues, it is allowed to return an empty report. An empty result is treated as an honest outcome rather than a failure to produce feedback.
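The four stages above can be pictured as a small pipeline. The following is only a sketch of the control flow, not Tribunal's actual implementation: in the real skill each role is a Claude sub-agent, whereas here the hater, integration checker, and judge are plain Python callables, and every name (Accusation, Ruling, run_tribunal) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Accusation:
    file: str     # file the complaint is about
    claim: str    # what the reviewer thinks is wrong
    source: str   # "hater" or "integration"

@dataclass(frozen=True)
class Ruling:
    accusation: Accusation
    upheld: bool      # True when the judge could not refute the accusation
    reasoning: str

# Hypothetical stand-ins for the three agent roles.
Hater = Callable[[str, str], list[Accusation]]               # (path, diff) -> accusations
IntegrationChecker = Callable[[dict[str, str]], list[Accusation]]
Judge = Callable[[Accusation, dict[str, str]], Ruling]

def run_tribunal(
    diffs: dict[str, str],
    hater: Hater,
    checker: IntegrationChecker,
    judge: Judge,
) -> tuple[list[Ruling], list[Ruling]]:
    """Return (verdict, transcript) for a change given as {path: diff}."""
    accusations: list[Accusation] = []

    # Stage 1: one hater pass per changed file.
    for path, diff in diffs.items():
        accusations.extend(hater(path, diff))

    # Stage 2: a single cross-file pass over the whole change.
    accusations.extend(checker(diffs))

    # Stage 3: the judge rules on every accusation individually.
    transcript = [judge(accusation, diffs) for accusation in accusations]

    # Stage 4: only upheld accusations reach the final report;
    # the transcript keeps everything, including dismissed claims.
    verdict = [ruling for ruling in transcript if ruling.upheld]
    return verdict, transcript
```

One property worth noting in this sketch: nothing forces the pipeline to produce findings. If the hater and the checker return no accusations, both the verdict and the transcript are empty lists, which mirrors the empty report described above.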

The power of conflict in code review

The strength of Tribunal lies not in any single agent but in the friction between them. Each reviewer is confined to a deliberately narrow role: one only hunts for flaws, while another must rule on every accusation against the actual code. The author's claim is that this split produces sharper, more honest feedback than a single model trying to balance criticism and politeness on its own, cutting false positives while surfacing genuine concerns that might otherwise go unnoticed.

Seamless integration and broad compatibility

Tribunal is designed to be portable and easy to adopt. It operates purely as a set of Claude sub-agents using the Agent tool, requiring no external runtime or dependencies. The system supports multiple programming languages, including Python, JavaScript/TypeScript, Go, Rust, and Java, with minimal configuration needed to add support for additional languages. It works seamlessly within Claude Code and Claude Cowork, making it a versatile tool for developers regardless of their tech stack.

Installation is straightforward: download the SKILL.md file from the repository and place it in the ~/.claude/skills/ directory. Once it is installed, typing /tribunal in any repository starts a review, so it fits into an existing workflow without extra setup.
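For readers who would rather script the copy step, here is a minimal sketch, assuming SKILL.md has already been downloaded locally. The function name install_skill and the optional subfolder argument are hypothetical. The article only names ~/.claude/skills/ as the destination, so the subfolder is offered as an option in case the repository's own instructions call for a per-skill directory.

```python
import shutil
from pathlib import Path
from typing import Optional

def install_skill(
    skill_file: Path,
    skills_dir: Path,
    subfolder: Optional[str] = None,
) -> Path:
    """Copy a downloaded SKILL.md into the skills directory and return its new path."""
    if skill_file.name != "SKILL.md":
        raise ValueError(f"expected a file named SKILL.md, got {skill_file.name}")

    # Assumption: by default the file goes directly into skills_dir, as the
    # article describes; pass subfolder to nest it one level deeper instead.
    target_dir = skills_dir / subfolder if subfolder else skills_dir
    target_dir.mkdir(parents=True, exist_ok=True)

    destination = target_dir / "SKILL.md"
    shutil.copyfile(skill_file, destination)
    return destination
```

A call such as install_skill(Path("Downloads/SKILL.md"), Path.home() / ".claude" / "skills") would follow the article's instructions literally. The download path in that call is illustrative only.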

This adversarial approach raises an important question: does forcing models into polarized roles genuinely yield better results than asking a single model to be harsh? The creator of Tribunal is eager to hear from developers who test the system or attempt to break it, seeking real-world feedback to refine the approach. As AI tools become more integrated into development workflows, ensuring they provide critical, actionable feedback—rather than just polite praise—could be the key to writing more robust, secure, and maintainable code.

