
Structured Multi-Agent Debate with Domain-Expert Routing

The LocalKin Team

April 2026

Keywords: multi-agent debate, domain routing, conductor architecture, structured deliberation, traditional Chinese medicine, quantitative finance, swarm intelligence

Abstract

Multi-agent debate has emerged as a powerful paradigm for improving reasoning quality in large language model (LLM) systems. However, existing approaches broadcast every question to all participating agents, regardless of domain relevance. This introduces computational waste and dilutes expert signal with noise from agents whose expertise is orthogonal to the question.

We present Domain-Expert Routing, a conductor-based architecture that interposes a routing layer between incoming queries and the agent pool. A conductor agent selects a relevant subset of experts (typically 4 to 6 from a fleet of 11 to 75 agents), then orchestrates a structured multi-round debate among only those agents. We instantiate this architecture in two production systems: (1) a traditional Chinese medicine (TCM) consultation system routing to 11 historical physician agents, and (2) a quantitative finance pipeline executing a sequential protocol (Phase 0 through Phase 5) with different agent subsets at each phase.

Across both domains, domain-expert routing achieves higher consensus quality (80.2% weighted agreement in TCM debates), reduces per-query agent invocations by 55% to 65%, and enables Phase 0 verification gates that catch hallucinated financial data before publication.

1. Introduction

The multi-agent debate paradigm posits that LLM agents, when given the opportunity to argue, rebut, and revise their positions across multiple rounds, produce more accurate outputs than any single agent in isolation (Du et al., 2023; Liang et al., 2024).

Yet a fundamental inefficiency persists: every agent participates in every debate. When a patient presents with gynecological symptoms, the acupuncture specialist, the pharmacologist, and the theoretical cosmologist all weigh in. This creates three problems: (1) computational cost scales with fleet size, (2) irrelevant opinions introduce noise, and (3) agents operating outside their domain are more likely to hallucinate.

We propose domain-expert routing. A conductor agent selects a task-appropriate subset of experts before initiating the debate.

2. Architecture

2.1 TCM Routing

The TCM conductor manages 11 historical physician agents spanning from the Yellow Emperor to the Qing dynasty. The routing table maps clinical categories to expert subsets:

| Category | Expert Subset | Rationale |
|---|---|---|
| General internal medicine | Zhang Zhongjing, Sun Simiao, Li Dongyuan, Zhu Danxi | Core diagnosticians |
| Warm disease / fever | Zhang Zhongjing, Ye Tianshi, Liu Wansu, Sun Simiao | Ye Tianshi's Wei-Qi-Ying-Xue system |
| Gynecology | Fu Qingzhu, Zhang Zhongjing, Zhu Danxi, Sun Simiao | Fu Qingzhu's specialty |
| Acupuncture | Huangfu Mi, Zhang Zhongjing, Sun Simiao, Hua Tuo | Huangfu Mi's Zhenjiu Jiayi Jing |
| Surgery / emergency | Hua Tuo, Zhang Zhongjing, Sun Simiao, Huangfu Mi | Hua Tuo's surgical expertise |
| Pharmacology | Li Shizhen, Sun Simiao, Zhang Zhongjing | Li Shizhen's Bencao Gangmu |
| Theory / pedagogy | Huang Di, Zhang Zhongjing, Zhu Danxi, Liu Wansu, Li Dongyuan | Foundational theorists |
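The routing table above can be sketched as a plain lookup. This is a minimal illustration, not the production conductor: the category keys, the `route` function, and the fallback to general internal medicine for unrecognized categories are all assumptions, and the paper does not describe how the conductor classifies a query into a category.

```python
# Hypothetical encoding of the routing table; key names are illustrative.
ROUTING_TABLE: dict[str, list[str]] = {
    "general_internal_medicine": ["Zhang Zhongjing", "Sun Simiao", "Li Dongyuan", "Zhu Danxi"],
    "warm_disease_fever": ["Zhang Zhongjing", "Ye Tianshi", "Liu Wansu", "Sun Simiao"],
    "gynecology": ["Fu Qingzhu", "Zhang Zhongjing", "Zhu Danxi", "Sun Simiao"],
    "acupuncture": ["Huangfu Mi", "Zhang Zhongjing", "Sun Simiao", "Hua Tuo"],
    "surgery_emergency": ["Hua Tuo", "Zhang Zhongjing", "Sun Simiao", "Huangfu Mi"],
    "pharmacology": ["Li Shizhen", "Sun Simiao", "Zhang Zhongjing"],
    "theory_pedagogy": ["Huang Di", "Zhang Zhongjing", "Zhu Danxi", "Liu Wansu", "Li Dongyuan"],
}

# Assumption: unrecognized categories fall back to the general panel.
DEFAULT_CATEGORY = "general_internal_medicine"


def route(category: str) -> list[str]:
    """Return a copy of the expert subset for a clinical category."""
    return list(ROUTING_TABLE.get(category, ROUTING_TABLE[DEFAULT_CATEGORY]))


# Distinct agents referenced anywhere in the table.
fleet = {name for experts in ROUTING_TABLE.values() for name in experts}

print(len(fleet))
print(route("gynecology"))
```

Keeping the table as data rather than logic means a new category or a changed subset is a one-line edit, and the full fleet can be derived from the table instead of being maintained separately.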

2.2 Quant Pipeline Routing

| Phase | Name | Agents | Output |
|---|---|---|---|
| Phase 0 | Price verification | stock_price skill (API) | Verified price + timestamp |
| Phase 1 | Data collection | 4 analysts | Independent reports |
| Phase 2 | Adversarial debate | Bull team vs. Bear team | Debate transcript |
| Phase 3 | Trade proposal | Trader agent | Entry/exit/sizing |
| Phase 4 | Risk check | Risk manager | Approval / rejection |
| Phase 5 | Publication | Conductor | Final report to KinBook |
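The sequential structure of the pipeline can be sketched as an ordered list of phases where each phase must succeed before the next one runs. The names `PhaseResult`, `run_pipeline`, and `stub` are hypothetical, the phases are placeholders rather than real agents, and halting on any failed phase is an assumption (the paper specifies a halt only for a Phase 0 failure).

```python
from dataclasses import dataclass
from typing import Callable

IDLE = "[IDLE]"


@dataclass
class PhaseResult:
    ok: bool
    output: str


# A phase reads the outputs of earlier phases and returns its own result.
Phase = Callable[[dict], PhaseResult]


def run_pipeline(phases: list[tuple[str, Phase]]) -> tuple[str, dict]:
    """Run phases in order; return (final output or IDLE, outputs so far)."""
    context: dict = {}
    final = IDLE
    for name, phase in phases:
        result = phase(context)
        if not result.ok:
            return IDLE, context  # later phases never run
        context[name] = result.output
        final = result.output
    return final, context


def stub(output: str, ok: bool = True) -> Phase:
    """Build a placeholder phase that ignores its input (illustration only)."""
    return lambda context: PhaseResult(ok, output)


PHASES = [
    ("phase0_price_verification", stub("verified price + timestamp")),
    ("phase1_data_collection", stub("independent reports")),
    ("phase2_adversarial_debate", stub("debate transcript")),
    ("phase3_trade_proposal", stub("entry/exit/sizing")),
    ("phase4_risk_check", stub("approval")),
    ("phase5_publication", stub("final report")),
]

final, outputs = run_pipeline(PHASES)
print(final)
```

Replacing the first entry with `stub("API error", ok=False)` makes `run_pipeline` return `[IDLE]` with an empty outputs dictionary, since no downstream phase is invoked.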

3. Debate Protocol

3.1 Round 1: Diverse Strategies

Each agent receives a DMAD reasoning strategy assignment from eight strategies: analytical, analogical, contrastive, first-principles, empirical, devil's advocate, systems thinking, and historical.

Agents respond in structured format with DOMAIN_ANGLE, POSITION, CONFIDENCE, REASONING, EVIDENCE, and INDEPENDENCE fields.
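As a rough sketch of Round 1, the snippet below assigns strategies and parses a structured response. Several details are assumptions: the paper does not say how strategies are assigned (round-robin is used here), the wire format is assumed to be one `FIELD: value` pair per line, and CONFIDENCE is assumed to be a number.

```python
from dataclasses import dataclass

STRATEGIES = [
    "analytical", "analogical", "contrastive", "first-principles",
    "empirical", "devil's advocate", "systems thinking", "historical",
]


@dataclass
class AgentResponse:
    domain_angle: str
    position: str
    confidence: float  # assumed numeric; the scale is not specified in the paper
    reasoning: str
    evidence: str
    independence: str


def assign_strategies(agents: list[str]) -> dict[str, str]:
    """Round-robin assignment (an assumption), wrapping after eight agents."""
    return {agent: STRATEGIES[i % len(STRATEGIES)] for i, agent in enumerate(agents)}


def parse_response(text: str) -> AgentResponse:
    """Parse single-line 'FIELD: value' pairs; raises KeyError if a field is missing."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().upper()] = value.strip()
    return AgentResponse(
        domain_angle=fields["DOMAIN_ANGLE"],
        position=fields["POSITION"],
        confidence=float(fields["CONFIDENCE"]),
        reasoning=fields["REASONING"],
        evidence=fields["EVIDENCE"],
        independence=fields["INDEPENDENCE"],
    )


# Made-up sample response for illustration only.
sample = "\n".join([
    "DOMAIN_ANGLE: formula composition",
    "POSITION: tonify Qi",
    "CONFIDENCE: 0.8",
    "REASONING: constitution is deficient",
    "EVIDENCE: classical formula precedent",
    "INDEPENDENCE: formed before reading others",
])
response = parse_response(sample)
print(response.position, response.confidence)
```

This parser handles only single-line values; multi-line REASONING or EVIDENCE fields would need a more tolerant format such as JSON.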

3.2 Round 2+: Informed Revision

Agents receive all prior positions, a cumulative evidence pool, and an IBIS rebuttal pool. Position changes are tracked explicitly.
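One way to keep the shared state that later rounds depend on is a small container that accumulates evidence and rebuttals and logs every position change. The `DebateState` class and its field names are hypothetical; the paper does not describe the actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class DebateState:
    positions: dict[str, str] = field(default_factory=dict)  # latest position per agent
    evidence_pool: list[str] = field(default_factory=list)   # cumulative across rounds
    rebuttal_pool: list[str] = field(default_factory=list)   # IBIS-style rebuttals
    # Each change is (round number, agent, old position, new position).
    changes: list[tuple[int, str, str, str]] = field(default_factory=list)

    def record_round(
        self,
        round_no: int,
        new_positions: dict[str, str],
        evidence: list[str],
        rebuttals: list[str],
    ) -> None:
        """Merge one round's output and log any agent whose position moved."""
        for agent, position in new_positions.items():
            old = self.positions.get(agent)
            if old is not None and old != position:
                self.changes.append((round_no, agent, old, position))
            self.positions[agent] = position
        self.evidence_pool.extend(evidence)
        self.rebuttal_pool.extend(rebuttals)


# Made-up two-round example.
state = DebateState()
state.record_round(1, {"A": "tonify Qi", "B": "clear heat"}, ["e1", "e2"], [])
state.record_round(2, {"A": "tonify Qi", "B": "tonify Qi"}, ["e3"], ["r1"])
print(state.changes)
```

Logging changes as explicit tuples makes the conformity check in the consensus step a simple count over this list.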

3.3 Consensus Mechanism

Positions are tallied using confidence-weighted voting. A position is declared consensus if its weighted ratio exceeds 0.70. Consensus inertia detection flags potential social conformity when >60% of agents changed position and >50% self-report as influenced.
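The tally and the inertia check can be sketched as follows. Two points are assumptions, since the paper does not define them: the weighted ratio is taken to be a position's summed confidence divided by the summed confidence of all votes, and both inertia percentages are computed over all participating agents.

```python
from collections import defaultdict
from typing import Optional

CONSENSUS_THRESHOLD = 0.70
CHANGED_THRESHOLD = 0.60
INFLUENCED_THRESHOLD = 0.50


def weighted_consensus(votes: list[tuple[str, float]]) -> tuple[Optional[str], float, bool]:
    """votes are (position, confidence); return (top position, ratio, is_consensus)."""
    totals: dict[str, float] = defaultdict(float)
    for position, confidence in votes:
        totals[position] += confidence
    total = sum(totals.values())
    if total == 0:
        return None, 0.0, False
    top = max(totals, key=lambda p: totals[p])
    ratio = totals[top] / total
    return top, ratio, ratio > CONSENSUS_THRESHOLD


def inertia_flag(n_agents: int, n_changed: int, n_influenced: int) -> bool:
    """Flag possible social conformity when both thresholds are exceeded."""
    if n_agents == 0:
        return False
    return (n_changed / n_agents > CHANGED_THRESHOLD
            and n_influenced / n_agents > INFLUENCED_THRESHOLD)


# Made-up votes for illustration; these are not the paper's data.
votes = [("tonify Qi", 0.9), ("tonify Qi", 0.8), ("tonify Qi", 0.7), ("clear heat", 0.6)]
top, ratio, is_consensus = weighted_consensus(votes)
print(top, round(ratio, 3), is_consensus)
print(inertia_flag(n_agents=5, n_changed=4, n_influenced=3))
```

Because both comparisons are strict, a position sitting exactly at the 0.70 ratio is not declared consensus under this reading.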

4. Safety Architecture

4.1 Phase 0 Verification Gate

Financial reports require real-time price verification before any analytical content is generated. If the stock_price skill returns an error, the conductor halts with [IDLE].

Phase 0 verification is protected from the self-evolution mechanism: it is classified as a runtime safety constraint that the swarm architect cannot modify.
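A minimal sketch of such a gate is shown below. `fetch_price` is a hypothetical stand-in for the stock_price skill, whose real interface is not documented here, and rejecting non-positive prices is an extra assumption beyond the error case the paper describes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Union

IDLE = "[IDLE]"


@dataclass(frozen=True)
class VerifiedPrice:
    ticker: str
    price: float
    timestamp: datetime


def phase0_gate(ticker: str, fetch_price: Callable[[str], float]) -> Union[VerifiedPrice, str]:
    """Return a VerifiedPrice, or IDLE if the lookup fails."""
    try:
        price = fetch_price(ticker)
    except Exception:
        return IDLE  # any lookup error halts the pipeline
    if not price > 0:
        return IDLE  # assumption: a non-positive (or NaN) price counts as a failure
    return VerifiedPrice(ticker, price, datetime.now(timezone.utc))


def working_api(ticker: str) -> float:
    """Fake price source used only for this illustration."""
    return 123.45


def broken_api(ticker: str) -> float:
    """Fake price source that always fails."""
    raise ConnectionError("price service unavailable")


print(phase0_gate("XYZ", working_api))
print(phase0_gate("XYZ", broken_api))
```

Making `VerifiedPrice` frozen reflects the intent that downstream agents consume the verified value but cannot alter it.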

4.2 Mandatory Disclaimers

Disclaimers are appended at the runtime level, not the agent level, ensuring they cannot be omitted by agent self-modification.
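The idea of appending at the runtime level can be illustrated with a wrapper that sits outside the agent. The `with_disclaimer` function is hypothetical and the disclaimer text is a placeholder, since the actual wording is not given in this paper.

```python
from typing import Callable

# Placeholder; the real disclaimer wording is not given in this paper.
DISCLAIMER = "[DISCLAIMER TEXT]"


def with_disclaimer(agent_fn: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap an agent so the runtime, not the agent, appends the disclaimer."""
    def wrapped(query: str) -> str:
        output = agent_fn(query).rstrip()
        if output.endswith(DISCLAIMER):
            return output  # already present; do not duplicate
        return f"{output}\n\n{DISCLAIMER}"
    return wrapped


def forgetful_agent(query: str) -> str:
    """Fake agent whose prompt no longer mentions the disclaimer."""
    return f"Analysis of {query}."


safe_agent = with_disclaimer(forgetful_agent)
print(safe_agent("XYZ"))
```

Because the wrapper lives in the runtime, an agent that rewrites its own prompt or output template still cannot produce a final report without the disclaimer.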

4.3 Ollama Fallback Policy

When the primary LLM provider is unavailable and agents fall back to local models, sensitive domain agents enter [IDLE] mode. Silence is preferable to confabulation.
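This policy amounts to a small dispatch rule, sketched below. The set of sensitive domains and the `dispatch` signature are assumptions; the paper does not enumerate which domains are treated as sensitive.

```python
from typing import Callable

IDLE = "[IDLE]"

# Assumption: which domains count as sensitive is not listed in the paper.
SENSITIVE_DOMAINS = {"tcm", "finance"}


def dispatch(
    domain: str,
    primary_available: bool,
    run_primary: Callable[[], str],
    run_local: Callable[[], str],
) -> str:
    """Use the primary provider when possible; otherwise idle sensitive agents."""
    if primary_available:
        return run_primary()
    if domain in SENSITIVE_DOMAINS:
        return IDLE  # silence is preferable to confabulation
    return run_local()


print(dispatch("finance", False, lambda: "primary answer", lambda: "local answer"))
print(dispatch("general", False, lambda: "primary answer", lambda: "local answer"))
```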

5. Evaluation

5.1 TCM: Spring Pollen Debate

Five masters debated spring allergy treatment. Weighted support ratio: 80.2%. Verdict: consensus for tonifying Qi as primary approach, with heat-clearing as complementary for damp-heat constitution patients.

5.2 Quality Trajectory

| Metric | Day 1 | Day 5 |
|---|---|---|
| Phase 0 compliance | 67% | 100% |
| Disclaimer presence | 80% | 100% |
| Hallucinated prices | 2 | 0 |
| Overall compliance | 75% | 92% |

5.3 Routing Efficiency

| System | Fleet Size | Broadcast | Routed | Reduction |
|---|---|---|---|---|
| TCM | 11 | 11 | 4.3 (mean) | 61% |
| Quant | 6 | 6 | 2.5 (mean per phase) | 58% |
| Prediction | 75 | 10 (max) | 4.6 (mean) | 54% |
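The reported reductions are consistent with computing one minus the ratio of routed to broadcast invocations, rounded to the nearest percent. That formula is an inference from the table rather than a definition stated in the paper; note that for the Prediction system the baseline is the broadcast figure (10), not the fleet size (75). A quick check:

```python
# (broadcast, routed) pairs taken from the routing efficiency table.
ROWS = {
    "TCM": (11, 4.3),
    "Quant": (6, 2.5),
    "Prediction": (10, 4.6),
}


def reduction_pct(broadcast: float, routed: float) -> int:
    """Percentage of agent invocations avoided relative to broadcasting."""
    return round(100 * (1 - routed / broadcast))


reductions = {name: reduction_pct(b, r) for name, (b, r) in ROWS.items()}
print(reductions)
```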

6. Conclusion

Domain-expert routing addresses a practical inefficiency in multi-agent debate: not every agent needs to weigh in on every question. The key insight is that expertise is not uniformly distributed, and debate protocols should respect this by routing questions to the agents best equipped to answer them.

References

Du, Y., et al. (2023). Improving Factuality and Reasoning through Multiagent Debate. arXiv:2305.14325.

Liang, T., et al. (2024). Encouraging Divergent Thinking through Multi-Agent Debate. arXiv:2305.19118.

Sun, J. (2026). Self-Evolving Multi-Agent Swarms. Technical Report, The LocalKin Team.