The Challenge of Multi-Table Reasoning
Multi-table question answering requires relational reasoning across tables: joining separate tables, filtering rows, and performing arithmetic on the results. Standard prompting techniques often fail because LLMs struggle to stay logically consistent while tracking several schemas at once. In particular, models frequently hallucinate relationships between tables or misinterpret column headers, which leads to incorrect queries or faulty reasoning chains.
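To make the task concrete, here is a minimal sketch of a multi-table question that needs a join, a filter, and an aggregate. The tables, columns, and rows are hypothetical and chosen only for illustration; they do not come from the study. The second query shows how a hallucinated relationship (joining on the wrong key) can run without error and still return a wrong answer.

```python
import sqlite3

# Hypothetical two-table database; all names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'EU'), (2, 'US'), (3, 'EU');
INSERT INTO orders VALUES (10, 1, 100.0), (11, 2, 250.0), (12, 3, 50.0), (13, 1, 25.0);
""")

# Question: "What is the total order amount for EU customers?"
# Correct reasoning: join on the shared key, filter by region, then sum.
CORRECT_SQL = """
SELECT COALESCE(SUM(o.amount), 0.0)
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
WHERE c.region = ?
"""

# Faulty reasoning: a hallucinated relationship between order_id and customer_id.
# The query is valid SQL, so the mistake is silent.
HALLUCINATED_SQL = """
SELECT COALESCE(SUM(o.amount), 0.0)
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.order_id
WHERE c.region = ?
"""

def total_for_region(sql: str, region: str) -> float:
    """Run one of the queries above and return the summed amount."""
    return conn.execute(sql, (region,)).fetchone()[0]

correct_total = total_for_region(CORRECT_SQL, "EU")
wrong_total = total_for_region(HALLUCINATED_SQL, "EU")
print(correct_total, wrong_total)
```

Both queries execute, so only the join condition distinguishes the right answer from the wrong one. This is the kind of error that is hard to catch from the query's surface form alone.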
Synthetic Contrastive Reasoning (SCR)
The proposed framework, Synthetic Contrastive Reasoning (SCR), addresses these failures by shifting the focus from simple instruction following to contrastive learning. Instead of relying on standard supervised fine-tuning alone, the researchers generate synthetic training data that pairs each correct reasoning path with "hard negative" examples: reasoning chains that look plausible but contain subtle logical or relational errors. By training the model to explicitly contrast these paths, the system develops a more robust internal representation of relational logic.
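The pairing of a correct reasoning path with a hard negative can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the `ContrastivePair` type, the `make_hard_negative` perturbation (swapping one column reference), and the margin-based loss are all hypothetical stand-ins, since the summary does not specify how negatives are generated or which objective is used.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ContrastivePair:
    """One synthetic training example: a question with a correct and a flawed chain."""
    question: str
    positive: str
    hard_negative: str

def make_hard_negative(chain: str, correct_ref: str, wrong_ref: str) -> str:
    """Assumed perturbation: swap one column reference to create a subtle relational error."""
    if correct_ref not in chain:
        raise ValueError(f"{correct_ref!r} does not appear in the reasoning chain")
    return chain.replace(correct_ref, wrong_ref, 1)

def margin_loss(score_pos: float, score_neg: float, margin: float = 1.0) -> float:
    """Assumed objective: zero once the correct chain outscores the negative by `margin`."""
    return max(0.0, margin - (score_pos - score_neg))

positive_chain = (
    "Join orders to customers on orders.customer_id = customers.customer_id, "
    "keep rows where region = 'EU', then sum orders.amount."
)
pair = ContrastivePair(
    question="What is the total order amount for EU customers?",
    positive=positive_chain,
    hard_negative=make_hard_negative(
        positive_chain, "orders.customer_id", "orders.order_id"
    ),
)
print(pair.hard_negative)
```

The negative differs from the positive by a single join key, which is what makes it "hard": it reads as plausible, yet it describes a relationship that does not exist. In a real training setup the two scores passed to `margin_loss` would come from the model, for example as sequence log-likelihoods.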
Impact on Model Performance
The study reports that models trained with SCR significantly outperform baseline models on standard multi-table benchmarks. The primary gain is a reduction in "schema-alignment errors," where the model maps a question to the wrong column or table. Because training forces the model to identify why a specific reasoning step is invalid, the framework improves its handling of complex SQL generation and multi-step arithmetic, which leads to higher accuracy in zero-shot and few-shot settings.
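A schema-alignment error can be illustrated with a small check that compares the `table.column` references in a generated query or reasoning chain against the known schema. The schema and the `schema_alignment_errors` helper below are hypothetical and serve only as a diagnostic sketch; they are not part of the SCR framework or its evaluation.

```python
# Hypothetical schema: table name -> set of column names.
SCHEMA = {
    "customers": {"customer_id", "region"},
    "orders": {"order_id", "customer_id", "amount"},
}

def schema_alignment_errors(references, schema):
    """Return (reference, reason) pairs for references that do not match the schema."""
    errors = []
    for ref in references:
        table, _, column = ref.partition(".")
        if table not in schema:
            errors.append((ref, "unknown table"))
        elif column not in schema[table]:
            errors.append((ref, "unknown column"))
    return errors

# References a model might emit for "total order amount for EU customers".
predicted = ["orders.amount", "customers.amount", "invoices.total"]
errors = schema_alignment_errors(predicted, SCHEMA)
print(errors)
```

Here `customers.amount` names a real table but the wrong column, and `invoices.total` names a table that does not exist. A check like this only catches references that are absent from the schema; a reference to an existing but semantically wrong column still needs the kind of reasoning the framework targets.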