Publications

You can also find my articles on my Google Scholar profile.

Preprints


Crux, a Precise Verifier for Rust and Other Languages

Published in ArXiv [cs.PL], 2024

Abstract: We present Crux, a cross-language verification tool for Rust and C/LLVM. Crux targets bounded, intricate pieces of code that are difficult for humans to get right: for example, cryptographic modules and serializer / deserializer pairs. Crux builds on the same framework as the mature SAW-Cryptol toolchain, but Crux provides an interface where proofs are phrased as symbolic unit tests. Crux is designed for use in production environments, and has already seen use in industry.

Download Paper

Macaw: A Machine Code Toolbox for the Busy Binary Analyst

Published in ArXiv [cs.PL], 2024

Abstract: When attempting to understand the behavior of an executable, a binary analyst can make use of many different techniques. These include program slicing, dynamic instrumentation, binary-level rewriting, symbolic execution, and formal verification, all of which can uncover insights into how a piece of machine code behaves. As a result, there is no one-size-fits-all binary analysis tool, so a binary analysis researcher will often combine several different tools. Sometimes, a researcher will even need to design new tools to study problems that existing frameworks are not well equipped to handle. Designing such tools from complete scratch is rarely time- or cost-effective, however, given the scale and complexity of modern instruction set architectures.

Download Paper

Refining Existential Properties in Separation Logic Analyses

Published in ArXiv [cs.LO], 2015

Abstract: In separation logic program analyses, tractability is generally achieved by restricting invariants to a finite abstract domain. As this domain cannot vary, loss of information can cause failure even when verification is possible in the underlying logic. In this paper, we propose a CEGAR-like method for detecting spurious failures and avoiding them by refining the abstract domain. Our approach is geared towards discovering existential properties, e.g. “list contains value x”. To diagnose failures, we use abduction, a technique for inferring command preconditions. Our method works backwards from an error, identifying necessary information lost by abstraction, and refining the forward analysis to avoid the error. We define domains for several classes of existential properties, and show their effectiveness on case studies adapted from Redis, Azureus and FreeRTOS.

Download Paper

Conference Papers


Daedalus: Safer Document Parsing

Published in Programming Language Design and Implementation (PLDI), 2024

Abstract: Despite decades of contributions to the theoretical foundations of parsing and the many tools available to aid in parser development, many security attacks in the wild still exploit parsers. The issues are myriad—flaws in memory management in contexts lacking memory safety, flaws in syntactic or semantic validation of input, and misinterpretation of hundred-page-plus standards documents. It remains challenging to build and maintain parsers for common, mature data formats. In response to these challenges, we present Daedalus, a new domain-specific language (DSL) and toolchain for writing safe parsers. Daedalus is built around functional-style parser combinators, which suit the rich data dependencies often found in complex data formats. It adds domain-specific constructs for stream manipulation, allowing the natural expression of parsing noncontiguous formats. Balancing between expressivity and domain-specific constructs lends Daedalus specifications simplicity and leaves them amenable to analysis. As a stand-alone DSL, Daedalus is able to generate safe parsers in multiple languages, currently C++ and Haskell. We have implemented 20 data formats with Daedalus, including two large, complex formats—PDF and NITF—and our evaluation shows that Daedalus parsers are concise and performant. Our experience with PDF forms our largest case study. We worked with the PDF Association to build a reference implementation, which was subject to a red-teaming exercise along with a number of other PDF parsers and was the only parser to be found free of defects.

Download Paper

Research Report: An Optim (l) Approach to Parsing Random-Access Formats

Published in Tenth LangSec Workshop at IEEE Security & Privacy, 2024

Abstract: We introduce a Domain Specific Language (DSL) that allows for the declarative specification of random access formats (formats that include offsets or require out of order parsing, e.g., zip, ICC, and PDF). Our DSL is composed by layering three distinct computational languages: first, we have the base layer of primitive parsers (these could use various “off the shelf” parsing technology). Second, upon this base we layer our XRP (eXplicit Region Parser) DSL, a pure functional language for declaratively specifying random-access formats. XRP, although similar to parser combinators, gives greater control and expressiveness by making parsing regions explicit. Third, we embed XRP into Optim (l), a novel DSL that describes Directed Acyclic Graphs (DAGs) of computations. A node represents a computation in XRP, an edge represents a dependency between the computations. From a declarative Optim (l) program we can generate an imperative module that provides an API for ondemand and incremental parsing of the random access format.

Download Paper

Trustworthy Runtime Verification via Bisimulation (Experience Report)

Published in International Conference on Functional Programming (ICFP), 2023

Abstract: When runtime verification is used to monitor safety-critical systems, it is essential that monitoring code behaves correctly. The Copilot runtime verification framework pursues this goal by automatically generating C monitor programs from a high-level DSL embedded in Haskell. In safety-critical domains, every piece of deployed code must be accompanied by an assurance argument that is convincing to human auditors. However, it is difficult for auditors to determine with confidence that a compiled monitor cannot crash and implements the behavior required by the Copilot semantics.

Download Paper

Formally Verifying Industry Cryptography

Published in IEEE Security & Privacy (Volume: 20, Issue: 3, May-June 2022), 2022

Abstract: Over the past five years, Galois has formally verified several cryptographic systems that are used in demanding industry environments. This article discusses our approach to these verification projects, focusing on the practical engineering challenges that exist when building and deploying proofs in industry.

Download Paper

Verified Cryptographic Code for Everybody

Published in Computer Aided Verification (CAV), 2021

Abstract: We have completed machine-assisted proofs of two highly-optimized cryptographic primitives, AES-256-GCM and SHA-384. We have verified that the implementations of these primitives, written in a mix of C and x86 assembly, are memory safe and functionally correct, by which we mean input-output equivalent to their algorithmic specifications. Our proofs were completed using SAW, a bounded cryptographic verification tool which we have extended to handle embedded x86. The code we have verified comes from AWS LibCrypto. This code is identical to BoringSSL and very similar to OpenSSL, from which it ultimately derives. We believe we are the first to formally verify these implementations, which protect the security of nearly everybody on the internet.

Download Paper

On the Formal Verification of the Stellar Consensus Protocol

Published in Workshop on Formal Methods for Blockchains (FMBC), 2020

Abstract: The Stellar Consensus Protocol (SCP) is a quorum-based BFT consensus protocol. However, instead of using threshold-based quorums, SCP is permissionless and its quorum system emerges from participants’ self-declared trust relationships. In this paper, we describe the methodology we deploy to formally verify the safety and liveness of SCP for arbitrary but fixed configurations. The proof uses a combination of Ivy and Isabelle/HOL. In Ivy, we model SCP in first-order logic, and we verify safety and liveness under eventual synchrony. In Isabelle/HOL, we prove the validity of our first-order encoding with respect to a more direct higher-order model. SCP is currently deployed in the Stellar Network, and we believe this is the first mechanized proof of both safety and liveness, specified in LTL, for a deployed BFT protocol.

Download Paper

Compositional Verification of Compiler Optimisations on Relaxed Memory

Published in European Symposium on Programming (ESOP), 2018

Abstract: A valid compiler optimisation transforms a block in a program without introducing new observable behaviours to the program as a whole. Deciding which optimisations are valid can be difficult, and depends closely on the semantic model of the programming language. Axiomatic relaxed models, such as C++11, present particular challenges for determining validity, because such models allow subtle effects of a block transformation to be observed by the rest of the program. In this paper we present a denotational theory that captures optimisation validity on an axiomatic model corresponding to a fragment of C++11. Our theory allows verifying an optimisation compositionally, by considering only the block it transforms instead of the whole program. Using this property, we realise the theory in the first push-button tool that can verify real-world optimisations under an axiomatic memory model.

Download Paper

Starling: Lightweight Concurrency Verification with Views

Published in Computer Aided Verification (CAV), 2017

Abstract: Modern program logics have made it feasible to verify the most complex concurrent algorithms. However, many such logics are complex, and most lack automated tool support. We propose Starling, a new lightweight logic and automated tool for concurrency verification. Starling takes a proof outline written in an abstracted Hoare-logic style, and converts it into proof terms that can be discharged by a sequential solver. Starling’s approach is generic in its structure, making it easy to target different solvers. In this paper we verify shared-variable algorithms using the Z3 SMT solver, and heap-based algorithms using the GRASShopper solver. We have applied our approach to a range of concurrent algorithms, including Rust’s atomic reference counter, the Linux ticketed lock, the CLH queue-lock, and a fine-grained list algorithm.

Download Paper

Proving Linearizability Using Partial Orders

Published in European Symposium on Programming (ESOP), 2017

Abstract: Linearizability is the commonly accepted notion of correctness for concurrent data structures. It requires that any execution of the data structure is justified by a linearization — a linear order on operations satisfying the data structure’s sequential specification. Proving linearizability is often challenging because an operation’s position in the linearization order may depend on future operations. This makes it very difficult to incrementally construct the linearization in a proof.

Download Paper

A Scalable, Correct Time-Stamped Stack

Published in Principles of Programming Languages (POPL), 2015

Abstract: Concurrent data-structures, such as stacks, queues, and deques, often implicitly enforce a total order over elements in their underlying memory layout. However, much of this order is unnecessary: linearizability only requires that elements are ordered if the insert methods ran in sequence. We propose a new approach which uses timestamping to avoid unnecessary ordering. Pairs of elements can be left unordered if their associated insert operations ran concurrently, and order imposed as necessary at the eventual removal.

Download Paper

Learning Assertions to Verify Linked-List Programs

Published in Software Engineering and Formal Methods (SEFM), 2015

Abstract: C programs that manipulate list-based dynamic data structures remain a challenging target for static verification. In this paper we employ the dynamic analysis of dsOli to locate and identify data structure operations in a program, and then use this information to automatically annotate that program with assertions in separation logic. These annotations comprise candidate pre/post-conditions and loop invariants suitable to statically verify memory safety with the verification tool VeriFast. By using both textbook and real-world examples on our prototype implementation, we show that the generated assertions are often discharged automatically. Even when this is not the case, candidate invariants are of great help to the verification engineer, significantly reducing the manual verification effort.

Download Paper

Towards Rigorously Faking Bidirectional Model Transformations

Published in Analysis of Model Transformations Workshop, 2014

Abstract: Bidirectional model transformations (bx) are mechanisms for automatically restoring consistency between multiple concurrently modified models. They are, however, challenging to implement; many model transformation languages not supporting them at all. In this paper, we propose an approach for automatically obtaining the consistency guarantees of bx without the complexities of a bx language. First, we show how to “fake” true bidirectionality using pairs of unidirectional transformations and inter-model consistency constraints in Epsilon. Then, we propose to automatically verify that these transformations are consistency preserving—thus indistinguishable from true bx—by defining translations to graph rewrite rules and nested conditions, and leveraging recent proof calculi for graph transformation verification.

Download Paper

Ribbon Proofs for Separation Logic

Published in European Symposium on Programming (ESOP), 2013

Abstract: We present ribbon proofs, a diagrammatic system for proving program correctness based on separation logic. Ribbon proofs emphasise the structure of a proof, so are intelligible and pedagogical. Because they contain less redundancy than proof outlines, and allow each proof step to be checked locally, they may be more scalable. Where proof outlines are cumbersome to modify, ribbon proofs can be visually manoeuvred to yield proofs of variant programs. This paper introduces the ribbon proof system, proves its soundness and completeness, and outlines a prototype tool for validating the diagrams in Isabelle.

Download Paper

Library Abstraction for C/C++ Concurrency

Published in Principals of Programming Languages (POPL), 2013

Abstract: When constructing complex concurrent systems, abstraction is vital: programmers should be able to reason about concurrent libraries in terms of abstract specifications that hide the implementation details. Relaxed memory models present substantial challenges in this respect, as libraries need not provide sequentially consistent abstractions: to avoid unnecessary synchronisation, they may allow clients to observe relaxed memory effects, and library specifications must capture these.

Download Paper

Journal Articles


Verifying Custom Synchronization Constructs Using Higher-Order Separation Logic

Published in Transactions on Programming Languages and Systems (TOPLAS), Volume 38, Issue 2, 2016

Abstract: Synchronization constructs lie at the heart of any reliable concurrent program. Many such constructs are standard (e.g., locks, queues, stacks, and hash-tables). However, many concurrent applications require custom synchronization constructs with special-purpose behavior. These constructs present a significant challenge for verification. Like standard constructs, they rely on subtle racy behavior, but unlike standard constructs, they may not have well-understood abstract interfaces. As they are custom built, such constructs are also far more likely to be unreliable.

Download Paper

Proof-Directed Parallelization Synthesis by Separation Logic

Published in Transactions on Programming Languages and Systems (TOPLAS), Volume 35, Issue 2, 2013

Abstract: We present an analysis which takes as its input a sequential program, augmented with annotations indicating potential parallelization opportunities, and a sequential proof, written in separation logic, and produces a correctly synchronized parallelized program and proof of that program. Unlike previous work, ours is not a simple independence analysis that admits parallelization only when threads do not interfere; rather, we insert synchronization to preserve dependencies in the sequential program that might be violated by a naïve translation. Separation logic allows us to parallelize fine-grained patterns of resource usage, moving beyond straightforward points-to analysis. The sequential proof need only represent shape properties, meaning we can handle complex algorithms without verifying every aspect of their behavior.

Download Paper

Misc


PhD Thesis: Graph Transformation and Pointer Structures

Published in University of York, Department of Computer Science, 2008

Abstract: This thesis is concerned with the use of graph-transformation rules to specify and manipulate pointer structures. In it, we show that graph transformation can form the basis of a practical and well-formalised approach to specifying pointer properties. We also show that graph transformation rules can be used as an efficient mechanism for checking the properties of graphs.

Download Paper