#logic
DeFAb: A New Benchmark for Defeasible Abduction in LLMs
DeFAb is a new, verifiable benchmark that tests how well foundation models handle defeasible abduction: forming logical explanations that can be retracted or revised when new, contradictory information arrives.
arXiv cs.AI