Skip to content

Abstractions for Reasoning about Memory in F* Low*

Catalin Hritcu edited this page May 7, 2019 · 25 revisions

Background on Low*

Low* is a DSL in F* that allows to build verified C-like programs. It provides a low-level memory model with support for pointers, arrays, stack and heap allocation, deallocation, and arbitrary aliasing patterns. It has a straightforward metatheory built by relating its memory model to that of CompCert Clight.

Low* has been used successfully to verify a variety of non-trivial programs. The main verification task in Low* involves reasoning about memory. For this, Low* provides libraries that structure specifications in the style of Kassios' dynamic frames (aka modifies clauses). This style of specification has a number of benefits: it is general, can handle arbitrary pattern of aliasing, and it is embedded within first-order logic so encodeable to an SMT solver.

Nevertheless, reasoning about memory in Low* remains challenging, incurring a significant specification and proof overhead in stating and maintaining anti-aliasing invariants. We find that it scales poorly in two respects:

Scaling problem 1: reasoning about the shape and contents of memory is intermingled with reasoning about other concerns (e.g., arithmetic) and the interactions can cause proofs to slow down significantly;

Scaling problem 2: when programming with a large number of mutable locations, specifications and proofs it often becomes quadratic to reason about locations pairwise.

Separation logic

A widespread approach to reasoning about memory (and an alternative to dynamic frames) is separation logic (SL). SL simplifies reasoning about memory, particularly in the common case where accessing mutually disjoint memory locations. In such a setting, SL specifications and proofs can closely resemble reasoning about functional programs that manipulate an n-tuple of values, largely ignoring the fact that these values are actually maintained within mutable memory cells.

Structuring proofs using separation logic stands to significantly simplify the verification of many (most?) Low* programs, since most programs use relatively simple aliasing patterns. However, building verification conditions in separation logic in a manner suitable for encoding to an SMT solver remains an active area of research.

Prior attempts at encoding separation logic in F*/Low* involved the following elements:

  • A library that defined the separating conjunction ( * ) of SL as an assertion on top of F*'s existing memory models.

  • Specifying programs using * instead of the underlying, lower-level memory model concepts

  • Building verification conditions as usual, but pre-processing the VC with tactics to solve for quantifiers using AC-rewriting of the *-connective, before feeding this VC to the SMT solver.

While this approach is general and may still be promising, we have yet to make it work at scale. Part of the difficulty is perhaps due to the need to run tactics efficiently over monolithic large VCs that mix a variety of concerns, memory being only one part. In other words, our prior attempts have tackled "Scaling problem 2", but "Scaling problem 1" remains a significant hurdle.

Lenses for separating structured reasoning about memory

Meanwhile, we have been exploring another approach that aims to provide similar SL-like capabilities to reasoning about memory, but through a very different mechanism, that aims to also address "Scaling problem 1", i.e., we are aiming to factor reasoning about memory footprints out of the VC completely.

The main idea: LensST a l pre post

We provide an indexed effect as a structured way to specify the footprint of a computation.

In contrast to Low*'s ST a pre post effect, where read footprints are specified using liveness assertions in pre and write footprints are specified using modifies assertions in post ...

The index l : hs_lens a b on the LensST effect describes:

  • the set of memory locations that a computation may read or write

  • a set of abstract handles to name those memory locations (e.g., integers, some enumerated type, fields of a tuple etc.)

  • a bijection between memories restricted to those locations and some type (the "view"), the abstraction at which pre and post assert properties about memory (restricted to their footprint)

For example, consider two styles of specification for a computation that swaps the contents of two pointers.

  1. Classic Low*
  val swap (x y: pointer a)
    : Stack unit
      (requires fun h -> 
          live h x /\
          live h y /\
          disjoint x y)
      (ensures fun h0 _ h1 ->
          live h1 x /\
          live h1 y /\
          h1.[x] == h0.[y] /\
          h1.[y] == h0.[x])
  1. LensST
  val swap (x y: pointer a)
    : LensST unit (pair_lens x y)
      (requires fun h -> True)
      (ensures fun (x0, y0) _ (x1, u1) ->
          x1 == y0 /\
          y1 == x0)

Some main benefits of this style:

  • It factors out specification and reasoning about memory footprints from the rest of the spec. (addressing problem 1, i.e., memory reasoning is no longer intermingled with the rest)

  • It internalizes reasoning about disjointness / liveness and maybe even some footprint-local invariants into the lens, freeing the rest of the program from confronting them. (addressing problem 2, providing a form of separation logic)

  • It provides a user-defined memory model on top of the underlying monolithic memory model (making reasoning more functional, in the style of a user-defined state monad)

Some challenges:

  1. Composition: We want to compose lenses and decompose lenses, in support of framing. For example, we want to call a computation val write (x:pointer a) (v:a) : LensST unit (ptr_lens x) ... from swap and have a way to conveniently select one of the pointers in the enclosing footprint of swap.

    This gets trickier if we have a lens made up of, say, 5 resources and we want to frame away two of them retaining the others ... requiring potentially some kind of AC-rewriting or proofs of lens inclusion.

  2. Abstraction: Lenses should be defined not just based on pointers to memory. Instead, we should be able to encapsulate any stateful resource within a lens. E.g., a resource can be represent a handle to some state (e.g., state), a memory footprint associated with it (fp : state -> loc) and an operation to read and update the state abstractly (sel: mem -> state -> t and upd: mem -> state -> t -> mem), and some abstract invariant (inv:state -> mem -> prop) with the expected properties.

  3. Keys for resources: To build efficient, normalizable verification conditions, we need to map resources to unique resource handles (e.g., 0, 1, 2, 3... etc.) How to assign these handles, how to manage them in the presence of dynamic allocation, unfolding abstractions etc.

Clone this wiki locally