POPL 2026 (series) / Student Research Competition /
RESpecBench: Rigorous Evaluation of Specification Generation with Automated Verification
Wed 14 Jan 2026 17:30 - 19:30 at Halle 0 - SRC Poster Session
Thu 15 Jan 2026 10:20 - 10:36 at Dortoirs - SRC Talks
Thu 15 Jan 2026 10:20 - 10:36 at Dortoirs - SRC Talks
We introduce RESpecBench, a multi-domain LLM benchmark with a sound and automated specification equivalence checker across multiple domains, measuring the LLM’s ability to produce precise, semantically equivalent specifications from informal natural language descriptions. We evaluate several state-of-the-art LLMs on RESpecBench and compare our sound verifier to LLM-as-a-Judge pipelines, demonstrating that LLM-as-a-Judge produces unreliable verdicts and substantially overestimates specification correctness.
Wed 14 JanDisplayed time zone: Brussels, Copenhagen, Madrid, Paris change
Wed 14 Jan
Displayed time zone: Brussels, Copenhagen, Madrid, Paris change
Thu 15 JanDisplayed time zone: Brussels, Copenhagen, Madrid, Paris change
Thu 15 Jan
Displayed time zone: Brussels, Copenhagen, Madrid, Paris change
10:20 - 12:00 | |||
10:20 16mPoster | RESpecBench: Rigorous Evaluation of Specification Generation with Automated Verification Student Research Competition | ||
10:36 16mPoster | Amortized Analysis of Splay Trees via a Lax Homomorphism Student Research Competition Lukas Kebuladze Carnegie Mellon University Pre-print | ||
10:53 16mPoster | Semantic Completeness of Higher-Order Probabilistic Separation Logics Student Research Competition Puming Liu New York University Shanghai | ||
11:10 16mPoster | Language Model for MPI Student Research Competition | ||
11:26 16mPoster | Bootstrapping a Verified Compiler for an Imperative Language in Rocq Student Research Competition Kacper F. Korban EPFL | ||
11:43 16mPoster | An End-to-end Theory for Compositional Symbolic Execution: From Expressive Specifications to SMT Student Research Competition Shivanandan Tamil Kumaran Imperial College London | ||