LLVM with formal memory Usage CErtificates



The Luce project is an investigation into how to automatically reason about memory corruption-related weird machines. To achieve this goal, we are exploring algorithmic techniques that, at compile-time, compute a class of memory corruption-related, unintended emergent behaviors that are formally proven accurate.

While the term Luce is an acronym, standing for LLVM with formal memory usage certificates, it is also Latin and Italian for light and is pronounced as such.

Memory preservation

Central to reasoning about memory usage is the notion of memory preservation: specific regions of memory can be formally proven to not be modified by a program operating in that memory space. We are investigating how to compute preconditions on the relationships between program-modified memory regions that prevent weird machines from emerging at run-time: for any pair of memory regions accessed by the program, the preconditions state whether those regions must be (spatially) separate, enclosed, or overlapping in order to prevent potential memory corruption exploits. For example, the preconditions may state that heap memory regions pointed to by a function’s arguments must not overlap with the function’s stack frame regions. This is necessary to prevent potential return-address exploits, such as return-oriented programming (ROP) attacks.

FMUCs and automated proof generation

Luce’s algorithms operate on assembly code, which may be produced by a compiler or obtained via disassembly. From that assembly, Luce automatically generates a formal memory usage certificate, or FMUC, for every function analyzed. Each FMUC contains:

Diagram of FMUC generation process going from assembly to CFG extraction to syntactic control flow to symbolic execution and SMT solving to resulting FMUC and then the loading into Isabelle.

The theorems and preconditions mentioned above leverage Hoare logic. The proof ingredients contain required assumptions on memory layout, control flow information, and invariants. Those are computed using a suite of techniques, including syntactic control flow extraction, SMT solving, and symbolic execution.

An example showing C program source, equivalent assembly code, and resultant FMUC.

As the FMUCs are (automatically) generated by untrusted components (i.e., handcrafted control flow analyzers, SMT solvers, etc.), Luce uses an interactive theorem prover to verify them. The proof obligations embodied in the FMUCs are semi-automatically discharged by the Isabelle/HOL theorem prover, leveraging customized proof strategies. Only minor modifications and some proof guidance when dealing with loops are needed to finish the proofs.


Code and proofs


Luce is an open-source project from the Systems Software Research Group (SSRG) at Virginia Tech. It is based on work supported by the Defense Advanced Research Projects Agency (DARPA) under Agreement No. HR.00112090028.