Abstract interpretation analysis using APRON and Clang


Introduction

Abstract interpretation is a theory in computer science on which are based practical methods of software static analysis. A good presentation of the topic is given in the APRON library web page: a detailed presentation by Antoine Miné. The APRON library provides implementations of abstract domains. It is used for the mathematical analysis of the program. The Clang library, which is a part of the LLVM project, provides the tools for actually parsing a program file, creating the abstract syntax tree (AST), the control flow graph (CFG) and an implementation of the work list algorithm (also referred to as chaotic iteration), which runs on top of the CFG. The work list algorithm analyzes lines in the code using the abstract domain (the Intervals domain, in this case) using the APRON library.

The target of this work was to create a software tool that takes a C program as input and outputs the result of static analysis made by using the APRON library. Since getting it all done was not an easy task I thought it will be helpful to post this project online. It should be noticed that all I cared about is to solder the two libraries together as fast as possible so the resulting software could look ugly in its code style and very un-efficient. The version numbers I used were APRON 0.9.9 and Clang 3.6.0.


The example

For now, this ‘worklist-clang-apron’ tool was tested (actually, built around) one example input code and supports only few operations (see “further development” section). The Example program is taken from the graduate course: Program analysis and verification by Noam Rinetzky.

The example input program code, loop_example.c:

#include <stdio.h>

static void loop_example(void)
{
    int x;
    x = 7;

    while (x < 1000)
        ++x;

    if (x != 1000)
        printf("Unable to prove x == 1000!\n");
}

int main(void)
{
    loop_example();

    return 0;
}

The example program’s CFG, as created by Clang’s CFG tool:

CFG

The ‘worklist-clang-apron’ tool output, after the work list algorithm has reached a fixed point:

Fixed point reached
Each block abstract value:
[B7] interval of dim (1,0):
       x in [-oo,+oo]
[B6] interval of dim (1,0):
       x in [7,7]
[B5] interval of dim (1,0):
       x in [7,1000]
[B4] interval of dim (1,0):
       x in [8,1000]
[B3] interval of dim (1,0):
       x in [8,1000]
[B1] interval of dim (1,0): bottom
[B2] interval of dim (1,0):
       x in [1000,1000]
[B0] interval of dim (1,0):
       x in [1000,1000]

– It can be seen that block 0 ([B0]), the exit block, holds correctly the concrete value x=1000, while the block of the error branch, [B1], has the abstract “bottom” value, meaning that it wasn’t reached during the iteration.


Install


Running the tool

The execution usage:

$ ./worklist <c file> <0|1|2> <function to analyze>

The 2nd argument is for CFG output. 0: none. 1: textual CFG. 2: graphic CFG using Graphviz.

It is recommended to first run the example, including the creation of the CFG plot:

$ ./worklist loop_example.c 2 loop_example


Further development

Please comment here about bugs and problems with installation or running the tool.

There are many more operations that need to be added to the transfer functions: now are supported only conditions of the type ‘<‘ (less than), ‘>=’ (greater or equal), ‘==’ (equal), ‘!=’ (unequal), and the arithmetical operations of adding a constant (‘x+10’) and increment by one (‘x++’), over integer variables. It won’t be hard to update the tool with the other operations.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: