- Our refinement relation paper was accepted at MEMOCODE 2019!
- Our symbolic execution paper was accepted at SpISA 2019!
- Our Chum paper was accepted at CPP 2019!
The Chum project is an exploration of formalizing the semantics of the x86-64 instruction set architecture (ISA) and using those semantics to describe the behavior of an x86-64 binary in a theorem prover. This involves formalization of the semantics of individual x86-64 instructions (“small step semantics”) in a theorem prover, which yields a formal model of the x86-64 ISA. Additionally, it involves lifting the semantics of an x86-64 binary program into the prover, which enables reasoning over that binary’s properties as guided by the machine model.
Past efforts have extracted small-step semantics for the x86-64 ISA into a non-formal representation, such as Heule et al.’s Strata semantics that were machine-learned on live x86-64 hardware. The Chum project lifts those Strata semantics into the Isabelle/HOL theorem prover through a syntactical translation via an intermediate language called Chum, the project’s namesake. Chum is a bitvector language with formal semantics, allowing formal reasoning over it.
To lift an x86-64 binary into a prover, the binary is disassembled to Chum, which is followed by reassembly (for symbolization) and a subsequent syntactic translation into Isabelle/HOL.
Verification of binary programs requires reasoning at a higher level of abstraction than at the individual instruction level. Chum explores this at the block level: sequences of instructions that are devoid of loops but include conditionals (“big step semantics”). Those big step semantics are extracted out of an x86-64 binary through formal symbolic execution, which is guided by a set of rewrite rules that are proven correct within Isabelle/HOL. This automatically yields a formal equivalence between decompiled x86-64 binary code and the generated big step semantics.
Trusted computing base
Chum’s TCB is entirely restricted to machine model extraction and formalization: that is, the Strata-to-Chum and Chum-to-Isabelle/HOL parsers, disassemblers, and reassemblers. The TCB is verified by testing, aided in part by the Chum language’s formal semantics.
- Establishing a Refinement Relation between Binaries and Abstract Code, Freek Verbeek, Joshua A. Bockenek, Abhijith Bharadwaj, Ian Roessle, and Binoy Ravindran, ACM-IEEE International Conference on Formal Methods and Models for System Design (MEMOCODE 2019), October 9-11, 2019, San Diego, USA.
- Symbolic Execution of x86 Assembly in Isabelle/HOL, Freek Verbeek, Abhijith Bharadwaj, Joshua A. Bockenek, Ian Roessle, and Binoy Ravindran, Workshop on Instruction Set Architecture Specification (SpISA 2019), September 13, 2019, Portland, Oregon, USA.
- Formally Verified Big Step Semantics out of x86-64 Binaries, Ian Roessle, Freek Verbeek, and Binoy Ravindran, the 8th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP’19), January 14-15, 2019, Cascais/Lisbon, Portugal.
Code and proofs
- Ian Roessle, Virginia Tech, email@example.com
- Freek Verbeek, Virginia Tech, firstname.lastname@example.org
- Binoy Ravindran, Virginia Tech, email@example.com
Chum is an open-source project from the Systems Software Research Group (SSRG) at Virginia Tech. It is supported by the Office of Naval Research (ONR) under grant N00014-17-1-2297 and the Naval Sea Systems Command (NAVSEA)/the Naval Engineering Education Consortium (NEEC) under grant N00174-16-C-0018.