Merge AE into main (#19)
* Merge changes from main (#16)

* Fix cmake

* do not make clean

---------

Co-authored-by: Austin Mordahl <austin_noroot@chronos.utdallas.edu>

* Add SugarlyzerConfig locally

* Make sample size configurable, and fix urllib3 to prevent chunked error

* Update docs

* Add comparison script

* Some updates

* Remove zachfiles

* Fix deduplication

* Fix deduplication

* Fix deduplication

* Moved postprocessing to sugarlyzer instead of to jupyter notebook

* Merge configurations in baseline results

* Syntax error

* Remove unnecessary files

* remove unnecessary files

* update API to be compatible with newer versions of python

* Updated readme

* Anonymize notebook

* removed link

* Anonymization, as well as more specific pointers to the paper

* Some small updates to fix exceptions.

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Add scripts

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Add scripts

* Remove tests

* Fix space issue

* Fix the progress bar

* fix removal

* fix removal

* fix alarms

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* For some reason, moving the deletion worked?

* Remove unnecessary logs

* fix file deletion

* Remove intermediate files

* Update requests version to fix breaking change in DockerPy (#17)

* Update requests version to fix breaking change in DockerPy

* Delete results.json

* fixed missing lib

* fix dockerfile cache (#18)

---------

Co-authored-by: Austin Mordahl <austin_noroot@chronos.utdallas.edu>
Co-authored-by: arjpeg <58893337+arjpeg@users.noreply.github.com>
3 people authored Sep 27, 2024
1 parent 7c1a822 commit c6ed20c
Showing 533 changed files with 83,855 additions and 10,219 deletions.
32 changes: 27 additions & 5 deletions Dockerfile
@@ -19,6 +19,19 @@ RUN apt-get update \
&& rm /tmp/cmake-install.sh \
&& ln -s /opt/cmake-3.24.1/bin/* /usr/local/bin

# Install cmake From https://www.softwarepronto.com/2022/09/dockerubuntu-installing-latest-cmake-on.html
RUN apt-get update \
&& apt-get -y install build-essential \
&& apt-get install -y wget \
&& rm -rf /var/lib/apt/lists/* \
&& wget https://github.com/Kitware/CMake/releases/download/v3.24.1/cmake-3.24.1-Linux-x86_64.sh \
-q -O /tmp/cmake-install.sh \
&& chmod u+x /tmp/cmake-install.sh \
&& mkdir /opt/cmake-3.24.1 \
&& /tmp/cmake-install.sh --skip-license --prefix=/opt/cmake-3.24.1 \
&& rm /tmp/cmake-install.sh \
&& ln -s /opt/cmake-3.24.1/bin/* /usr/local/bin

ARG JOBS
RUN git clone https://github.com/Z3Prover/z3.git

@@ -46,14 +59,23 @@ ENV CLASSPATH=:/superc/classes:/superc/bin/json-simple-1.1.1.jar:/superc/bin/jun
ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
RUN cd superc && make configure && make

WORKDIR /
ADD "https://api.github.com/repos/pattersonz/sugarlyzerconfig/commits?per_page=1" latest_commit
RUN git clone https://github.com/pattersonz/SugarlyzerConfig

# fix to make docker not reinstall everything when the code changes
# (cache fix)
RUN python3.10 -m venv /venv
ENV PATH=/venv/bin:$PATH
ADD . /Sugarlyzer

RUN mkdir /Sugarlyzer
WORKDIR /Sugarlyzer

COPY requirements.txt .

RUN python -m pip install -r requirements.txt --use-pep517

ADD . .

RUN mv resources/SugarlyzerConfig /SugarlyzerConfig

RUN python -m pip install -e .

WORKDIR /

126 changes: 112 additions & 14 deletions README.md
@@ -1,30 +1,41 @@
# Sugarlyzer

Sugarlyzer is a framework for performing static analysis using off-the-shelf bug finders on C software product lines.

This artifact is capable of running a subset of experiments from our paper. Specifically, this artifact is capable of
producing the results from Table 3 in RQ1, which show the comparison between our approach and the sampling-based approach.
Additionally, we can run analysis on the TOSEM benchmarks (Column 3 of Table 6, as well as analysis time).
We are still working on integrating and testing the analysis for the family-based baseline (Column 2 of Table 6) as well as the
analyses for the Varbugs benchmark (Table 2), and these will be included in a future release of Sugarlyzer.

# Prerequisites
This application is written for Python version >= 3.10.0. Furthermore,
Sugarlyzer runs its analyses in Docker containers in order to maintain consistent
environments across runs, so you must have a working Docker installation. We suggest using PyEnv to manage multiple python versions.
This application is written for Python version >= 3.10.0. We suggest using PyEnv to manage multiple Python versions.
Furthermore, Sugarlyzer runs its analyses in Docker containers in order to maintain consistent
environments across runs, so you must have a working Docker installation.

# Usage
# Setup

We recommend creating a virtual environment for Sugarlyzer. To do so, run
First, create a virtual environment for Sugarlyzer. To do so, run

`python -m venv <name_of_virtual_environment>`

where 'python' points to a python 3.10.0 or higher installation. This will create a new folder. If, for example, you named your virtual environment 'venv', then
you can activate it as follows:
where `python` points to a Python 3.10.0 or higher installation (note that this may be `python3` on your system).
This will create a new folder.
If, for example, you named your virtual environment 'venv', then you can activate it as follows:

`source ./venv/bin/activate`

Your shell prompt should now have a prefix with the name of the virtual environment.
Now, when you install dependencies, they will be installed into this virtual environment instead of globally.

In order to install Sugarlyzer's dependencies, from the root directory of the repository, run

`python -m pip install -r requirements.txt`
`pip install -r requirements.txt`

Where `python` points to a Python executable of at least version 3.10.0.
This will install all of the Python dependencies required. Then, in order to install
the application, run

`python -m pip install -e .`
`pip install -e .`

This installation will put two executables on your system PATH: `dispatcher`, and `tester`.
`dispatcher` is the command you run from your host, while `tester` is the command you run from inside the Docker container (under normal usage, a user
@@ -34,18 +45,105 @@ container creation, which can take quite a while. especially for Clang which nee
Simply run `dispatcher --help` from anywhere in order to see the helpdoc on how to
invoke Sugarlyzer.

# Usage

`dispatcher` is the primary interface for interacting with Sugarlyzer. Using `dispatcher`, we can run two types of analysis.
First, we can run static analysis on desugared code (our primary contribution) (Sections 5.2.2 and 5.3).
Second, we can run the sampling-based baseline, which uses configuration samples from Mordahl et al.'s 2019 work [1] (Section 5.2.2).

An example of running static analysis on desugared code can be seen by running

```
dispatcher -t infer -p toybox --jobs <<number of jobs you want to run concurrently>>
```

This will run the Infer static analyzer on the desugared code of Toybox. Run with 8 cores, this experiment took about 30 minutes and produced 21 reports.

To run baseline experiments, simply pass the `--baseline` parameter. **However, note that this will, by default, run all 1000 configurations from Mordahl et al.'s FSE 2019 work.** To limit the number of configurations that are run, use the `--sample-size` parameter.
For example, to run Infer's analysis on 10 random configurations of Toybox, use the following command:

```
dispatcher -t infer -p toybox --baselines --sample-size 10 --jobs <<number of jobs you want to run concurrently>>
```

This should produce about 17 reports (the exact number may change depending on which configurations are sampled).

Alternative analyzers and target programs can be specified with `-t` and `-p`, respectively.
Currently, the Infer (infer), Clang (clang), and Phasar (phasar) static analyzers are implemented.
We have also integrated six target systems (per Section 5.1).
From Mordahl et al.'s work [1], we integrated axTLS 2.1.4 (axtls), Toybox 0.7.5 (toybox), and Busybox 1.28.0 (busybox).
From von Rhein et al.'s work [2], we integrated Busybox 1.18.5 (tosembusybox), OpenSSL 1.0.1c (tosemopenssl), and uClibc 0.9.33.2 (tosemuclibc).

**Note that baseline experiments only work on the target programs from Mordahl et al.'s work [1]. The other experiments were run using different tooling that is not a part of this artifact. These experiments will be integrated in a future version of Sugarlyzer.**
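
For scripting several runs, the invocations above can be assembled programmatically. The following is a minimal sketch, not part of Sugarlyzer: `build_dispatcher_command` is a hypothetical helper, it only builds the argument list (nothing is executed), and it assumes the baseline flag is spelled `--baselines` as in the example command and takes no value.

```python
import shlex


def build_dispatcher_command(tool, program, jobs, baselines=False, sample_size=None):
    """Assemble a dispatcher invocation as an argv list without running it."""
    argv = ["dispatcher", "-t", tool, "-p", program, "--jobs", str(jobs)]
    if baselines:
        # Assumption: spelled as in the example command, with no argument.
        argv.append("--baselines")
        if sample_size is not None:
            # Limits how many baseline configurations are analyzed.
            argv += ["--sample-size", str(sample_size)]
    return argv


# Desugared analysis of Toybox with Infer, using 8 concurrent jobs.
desugared = build_dispatcher_command("infer", "toybox", jobs=8)
# Sampling-based baseline on 10 configurations of Toybox.
baseline = build_dispatcher_command("infer", "toybox", jobs=8, baselines=True, sample_size=10)

print(shlex.join(desugared))
print(shlex.join(baseline))
```

The resulting lists can be passed to `subprocess.run` once the environment from the Setup section is active.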

## For Artifact Reviewers

We have provided three scripts: `runDesugared.sh`, `runBaselines.sh`, and `runSmallExperiments.sh`. These run the desugared analysis, the sampling-based baseline, and a small subset of experiments, respectively. The first two scripts take longer than a day to run when parallelized to 60 cores, so we recommend that artifact reviewers run the `runSmallExperiments.sh` script, which takes approximately an hour if run with 8 jobs at a time. This script runs the desugared analysis with Infer, as well as the sampling-based baseline on 10 configurations.

# Results

By default, results are written to a `results.json` file in the root directory, but this location can be changed with the `-r` option.
The file contains a JSON list of the alarms that were detected during the analysis.
Alarms on desugared inputs have the following relevant fields:

- input_file: The file on which the report was detected.
- input_line: The line in the desugared file on which the alarm was detected.
- original_line: The line(s) in the original file that the input_line corresponds to.
- message: The alarm message.
- bug_type: The type of check that was being performed (e.g., a memory leak).
- presence_condition: The condition under which the alarm exists. A blank condition, like "Or(And())", indicates that the alarm is present in all variants of the SPL.

Baseline alarms are formatted somewhat differently. Specifically, instead of presence_condition, they have a "configuration" field that lists configurations under which the alarm was detected.
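
As a rough illustration of consuming this file, the sketch below counts desugared alarms per `bug_type` and picks out those whose `presence_condition` is the blank condition. It assumes the top level of `results.json` is a JSON array of objects with the fields listed above; `load_alarms`, `summarize`, and the sample records are hypothetical and not taken from a real run.

```python
import json
from collections import Counter

# The blank condition the README gives as "present in all variants".
UNCONDITIONAL = "Or(And())"


def load_alarms(path="results.json"):
    """Read the list of alarms from a results file (assumed to be a JSON array)."""
    with open(path) as handle:
        return json.load(handle)


def summarize(alarms):
    """Return (alarm count per bug_type, alarms present in every variant)."""
    by_type = Counter(alarm.get("bug_type", "<unknown>") for alarm in alarms)
    unconditional = [a for a in alarms if a.get("presence_condition") == UNCONDITIONAL]
    return by_type, unconditional


# Made-up records for illustration only; paths, messages, bug types, and the
# non-blank presence condition are hypothetical.
sample = [
    {"input_file": "a.desugared.c", "input_line": 40, "original_line": "12",
     "message": "example message", "bug_type": "memory_leak",
     "presence_condition": "Or(And())"},
    {"input_file": "a.desugared.c", "input_line": 95, "original_line": "30",
     "message": "example message", "bug_type": "null_dereference",
     "presence_condition": "Or(And(FEATURE_X))"},
    {"input_file": "b.desugared.c", "input_line": 7, "original_line": "3",
     "message": "example message", "bug_type": "memory_leak",
     "presence_condition": "Or(And(FEATURE_Y))"},
]

by_type, unconditional = summarize(sample)
print(dict(by_type))
print(len(unconditional), "alarm(s) present in all variants")
```

For a real run, replace `sample` with `load_alarms()` or `load_alarms("<path passed to -r>")`.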

# Processing Results

We provide a Jupyter notebook, located at `scripts/comparison.ipynb`. This notebook reports the time that desugaring and analysis took (Tables 2 and 6), and compares baseline and desugared results to show their overlap (Column 5 of Table 2).
Instructions for using the notebook are embedded in the notebook.

# Extending with New Tools

To extend Sugarlyzer with new tools, the following steps must be performed.
Extending Sugarlyzer with new analysis tools is straightforward. To extend Sugarlyzer with new tools, the following steps must be performed.
1. Add a new Dockerfile at `resources/tools/<tool_name>/Dockerfile`. This Dockerfile *must* 1) inherit from the sugarlyzer/base:latest image, which contains Sugarlyzer and its dependencies, and 2) install the tool so it can be invoked from the command line. *Please note that the tool name exposed to the user via the command line, and the name of the tool as passed to AbstractTool, are exactly the same as the name of this folder.*
2. Add a new class to `src/sugarlyzer/analyses` that inherits from AbstractTool. The only method that must be implemented is `analyze`, which takes as input a path to a code file and returns an iterable of result files containing the analysis results. Also, update `src/sugarlyzer/analyses/AnalysisToolFactory` to correctly return an instance of your tool given its name.
3. Add a new reader to `src/sugarlyzer/readers` that inherits from AbstractReader. The only function that must be implemented is `read_output`, which takes as input a report file as produced by the runner implemented in step 2 and returns Alarm objects.*


\* Note that, depending on your needs, it may be necessary to derive your own subtype of `Alarm`, as we do for Clang.
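
The shape of steps 2 and 3 can be sketched as follows. This is an assumption-laden illustration rather than Sugarlyzer's actual API: the `AbstractTool`, `AbstractReader`, and `Alarm` classes below are simplified stand-ins defined locally so the example runs on its own, and `TodoTool`/`TodoReader` are a toy analyzer invented for the demonstration. The real base classes may require constructor arguments or extra methods.

```python
import tempfile
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable


@dataclass
class Alarm:
    """Stand-in for Sugarlyzer's Alarm; the real class has more fields."""
    input_file: str
    input_line: int
    message: str


class AbstractTool(ABC):
    """Stand-in base class: one analyze() method, as the README describes."""
    @abstractmethod
    def analyze(self, file: Path) -> Iterable[Path]: ...


class AbstractReader(ABC):
    """Stand-in base class: one read_output() method, as the README describes."""
    @abstractmethod
    def read_output(self, report_file: Path) -> Iterable[Alarm]: ...


class TodoTool(AbstractTool):
    """Toy 'analyzer' that reports every source line containing TODO."""
    def analyze(self, file: Path) -> Iterable[Path]:
        report = file.with_suffix(".report")
        lines = file.read_text().splitlines()
        hits = [f"{file}:{n}:TODO found" for n, text in enumerate(lines, 1) if "TODO" in text]
        report.write_text("\n".join(hits))
        return [report]


class TodoReader(AbstractReader):
    """Parses the path:line:message rows written by TodoTool."""
    def read_output(self, report_file: Path) -> Iterable[Alarm]:
        alarms = []
        for row in report_file.read_text().splitlines():
            # Split from the right so colons inside the path are preserved.
            path, line, message = row.rsplit(":", 2)
            alarms.append(Alarm(path, int(line), message))
        return alarms


with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "demo.c"
    source.write_text("int x;\n// TODO free x\n")
    reports = list(TodoTool().analyze(source))
    alarms = list(TodoReader().read_output(reports[0]))

print(alarms)
```

The split between a tool that writes report files and a reader that turns them into alarms mirrors the two steps above, so either half can be replaced independently.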

# Extending with New Programs

To extend Sugarlyzer with new programs, the following steps must be performed:
The process for extending Sugarlyzer with new programs is more involved, and we are happy to help with such an integration. Generally, the process looks like this:
1. Add a new folder to `resources/programs` with the name of the program/set of programs you wish to use. Note that, like tools, Sugarlyzer will use the name of this folder to refer to the program.
2. This folder must have two elements. First, a runnable script (make sure to update the permissions before you try to run Sugarlyzer) that places the program somewhere in the /targets folder. This will be run in the Docker container, so it won't modify your host system. Second, a `program.json` file which contains two fields. "build_script", which contains the name of the build script you just added, and "source_location", which is a list of folders to search for source files. If "source_location" is omitted, all .c files in the /results directory will be used.
2. This folder must have two elements. First, a runnable script (make sure to update its permissions before you try to run Sugarlyzer) that places the program somewhere in the /targets folder. The script runs inside the Docker container, so it won't modify your host system. Second, a `program.json` file, which contains fields that tell Sugarlyzer how the code is structured. We suggest looking at existing files for examples. The required fields are "build_script", which contains the location of the aforementioned script; "project_root", which contains the name of the root folder of the source code; and "included_files_and_directories", which tells Sugarlyzer which files and directories need to be included to compile each file. This last field is a list of records, where each record contains an "included_files" and an "included_directories" field. For example, an excerpt of axTLS's file is shown below:

```
"included_files_and_directories": [
{
"included_files": [
"/SugarlyzerConfig/axtlsInc.h"
],
"included_directories": [
"/SugarlyzerConfig/",
"/SugarlyzerConfig/stdinc/usr/include/",
"/SugarlyzerConfig/stdinc/usr/include/x86_64-linux-gnu/",
"/SugarlyzerConfig/stdinc/usr/lib/gcc/x86_64-linux-gnu/9/include/"
]
},
{
"file_pattern": "aes\\.c$",
"included_files": [],
"included_directories": [
"config",
"ssl",
"crypto"
]
}
]
```

The first entry applies to all files in axTLS -- i.e., every file should be compiled with the axtlsInc.h file, as well as the directories listed under included_directories. The second entry has a filter (`file_pattern`), which tells us that for any file that matches the regular expression `aes\.c$`, we should additionally include the `config`, `ssl`, and `crypto` directories when we compile the file.
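
To make the matching rule concrete, here is a minimal sketch of how such entries could be combined for a given source file. It is not Sugarlyzer's implementation: `includes_for` is a hypothetical helper, and the use of `re.search` (rather than a full match) and the accumulate-in-order behavior are assumptions based on the description above.

```python
import re

# The axTLS excerpt above, written as Python data. The JSON string
# "aes\\.c$" decodes to the regular expression aes\.c$.
entries = [
    {
        "included_files": ["/SugarlyzerConfig/axtlsInc.h"],
        "included_directories": [
            "/SugarlyzerConfig/",
            "/SugarlyzerConfig/stdinc/usr/include/",
            "/SugarlyzerConfig/stdinc/usr/include/x86_64-linux-gnu/",
            "/SugarlyzerConfig/stdinc/usr/lib/gcc/x86_64-linux-gnu/9/include/",
        ],
    },
    {
        "file_pattern": r"aes\.c$",
        "included_files": [],
        "included_directories": ["config", "ssl", "crypto"],
    },
]


def includes_for(file_path, entries):
    """Collect included files and directories from every entry that applies."""
    files, directories = [], []
    for entry in entries:
        pattern = entry.get("file_pattern")
        # Entries without a pattern apply to every file (assumed semantics).
        if pattern is None or re.search(pattern, file_path):
            files += entry.get("included_files", [])
            directories += entry.get("included_directories", [])
    return files, directories


aes_files, aes_dirs = includes_for("crypto/aes.c", entries)
other_files, other_dirs = includes_for("ssl/tls1.c", entries)
print(len(aes_dirs), len(other_dirs))
```

Here `crypto/aes.c` picks up the three extra directories from the second entry, while `ssl/tls1.c` (a file name chosen only for illustration) receives just the global includes.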

[1] Mordahl, Austin, Jeho Oh, Ugur Koc, Shiyi Wei, and Paul Gazzillo. "An empirical study of real-world variability bugs detected by variability-oblivious tools." In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 50-61. 2019.

[2] Rhein, Alexander Von, Jörg Liebig, Andreas Janker, Christian Kästner, and Sven Apel. "Variability-aware static analysis at scale: An empirical study." ACM Transactions on Software Engineering and Methodology (TOSEM) 27, no. 4 (2018): 1-33.

[3] Abal, Iago, Jean Melo, Ştefan Stănciulescu, Claus Brabrand, Márcio Ribeiro, and Andrzej Wąsowski. "Variability bugs in highly configurable systems: A qualitative analysis." ACM Transactions on Software Engineering and Methodology (TOSEM) 26, no. 3 (2018): 1-34.
125 changes: 0 additions & 125 deletions kgenerateBeta/Config.in

This file was deleted.

10 changes: 0 additions & 10 deletions kgenerateBeta/axtlsFormat.txt

This file was deleted.
