FoxDec

The FoxDec (for Formal x86-64 Decompilation) project is investigating decompilation of x86-64 binaries into C code that is sound as well as fully recompilable. Soundness ensures that the decompiled C code is functionally equivalent to the input binary. Recompilability ensures that the decompiled C code can be successfully compiled to generate an executable binary.

FoxDec is currently actively being developed. In its current stage, it does disassembly, control flow reconstruction and function boundary detection. Work-in-progress is variable analysis, decompilation to C, data flow analysis, and much more.

NEWS

Our paper on binary-level ppointer analysis has been accepted at ICSE’25!
Our paper on symbolization has been accepted at CCS’24!
Our decompilation-to-C paper has received the Best Paper award at SEFM 2020!

Introduction
How to build
How to use
How to recompile
Papers
Contact

Introduction

Formally verified decompilation. Decompilation to a high-level language involves multiple phases. At a high-level, the phases usually include disassembly that lifts assembly code from binary, control flow graph (CFG) recovery that extracts program CFG from assembly, extraction of high-level program constructs (e.g., statements, variables, references) from assembly, and type assignment. FoxDec is investigating techniques for the decompilation phases that are formally verified.

FoxDec’s decompilation phases include disassembly; CFG recovery; extraction of an abstract code that models a program as a CFG of basic blocks; converting basic blocks into sequential code that models the program’s corresponding state changes over memory, registers, and flags; variable analysis that maps memory regions to variables and references; and type analysis that assigns types. Converting basic blocks into sequential code that captures program state changes requires a formal model of the underlying machine (i.e., formal semantics of x86-64 instructions).

Central to formally verified decompilation is the notion of sound decompilation. FoxDec defines soundness for each of these decompilation phases (sound disassembly is explored in a different project) and formally verifies them: algorithms for each phase are formalized in the Isabelle/HOL theorem prover and proven correct.

Use cases. Sound, recompilable decompilation to C has a variety of use cases. For example, patching a binary to fix errors or potential security exploits is highly complex in settings where source code or third-party libraries are no longer available or build processes or tools have become outdated. Patching at the C-level is relatively easier and a compelling alternative when the decompiled C code is formally proven to be functionally equivalent to the binary.

Other use cases include binary analysis, binary porting (as an alternative to software emulation), and binary optimization, each of which can now be performed at the C level (e.g., using off-the-shelf C code analysis and optimization tools).

How to build

Download FoxDec here. This will use Docker to build and run FoxDec. Section How to use contains further instructions. The GitHub page is here.

NOTE: instructions for building without Docker can be found here (only relevant for developpers).

How to use

These are instructions for a quickstart on the already supplied binary wc.

Move the binary of interest to ./binary/. The binary wc has already been supplied as running example.
Run FoxDec on the binary ./foxdec.sh wc
Directory ./artifacts/ is now populated with information.

The following files are generated:

$NAME.json and $NAME.json.txt: Contain disassembled instructions, control flow recovery, function boundaries, invariants, pointer analysis results. The two files contain the exact same information, one in JSON format and the other in humanly readable format.
$NAME.metrics.txt: a log containing metrics such as running time, number of covered instructions, accuracy of pointer analysis, etc. The two files contain the exact same information, one in JSON format and the other in humanly readable format.
$NAME.callgraph.dot: A Graphviz .dot file containing the ACG (Annotated Call Graph), annotated with verification conditions necessary to ensure "normal" behavior (e.g., no stack overflows, calling convention adherence).
nasm/$NAME.asm: recompilable NASM code.
functions/$ENTRY/$NAME.dot: For each function entry $ENTRY a control flow graph (CFG) and a log containing a pre- and a postcondition.

NOTE: it is possible to manually add entry points, by adding a plain-text file ./binary/$NAME.entry where each line is an entry address in hexadecimal notation.

How to recompile

The following instructions recompile the generated NASM code back into a binary.

nasm -felf64 -g -F dwarf -o $NAME.o $NAME.asm
gcc -c __gmon_start__.c -o __gmon_start__.o
gcc -g -m64 -nostartfiles -fgnu-tm -o a.out $NAME.o __gmon_start__.o

For the last compilation step, one must manually supply additional libraries as needed.

Papers

Verifiably Correct Lifting of Position-Independent x86-64 Binaries to Symbolized Assembly, Freek Verbeek, Nico Naus, Binoy Ravindran. Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security (CCS ’24).
Formally Verified Lifting of C-compiled x86-64 Binaries, Freek Verbeek, Joshua A. Bockenek, Zhoulai Fu and Binoy Ravindran, 43rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2022), June 13-17, 2022, San Diego, USA.
Sound C Code Decompilation for a Subset of x86-64 Binaries, Freek Verbeek, Pierre Olivier, and Binoy Ravindran, 18th International Conference on Software Engineering and Formal Methods (SEFM 2020), September 14-18, 2020, Amsterdam, The Netherlands.

Contacts

Binoy Ravindran, Virginia Tech, binoy@vt.edu
Freek Verbeek, Virginia Tech, freek@vt.edu

FoxDec is an open-source project from the Systems Software Research Group (SSRG) at Virginia Tech. It is supported by the Defense Advanced Research Projects Agency (DARPA) and Naval Information Warfare Center Pacific (NIWC Pacific) under Contract No. N66001-21-C-4028 and DARPA under Agreement No. HR.00112090028, by the US Office of Naval Research (ONR) under grant N00014-17-1-2297, and by the US Naval Surface Warfare Center Dahlgren Division/Naval Engineering Education Consortium (NEEC) under grants N00174-20-1-0009 and N00174-16-C-0018.

Table of Contents