-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
74 lines (50 loc) · 2.98 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
---
output: md_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# signatr
<!-- badges: start -->
<!-- badges: end -->
The goal of signatr is to infer functions signatures by fuzzing their inputs.
## Installation
You can install `signatr` with `devtools::install_github("PRL-PRG/signatr")` in R.
Apart from CRAN dependencies, it also requires the following ones from github.
Some need some more preparation before using `devtools::install`:
- [sxpdb](https://github.com/PRL-PRG/sxpdb/): R value database
- [generatr](https://github.com/reallyTG/generatr): fuzzing utilities
- [contractr](https://github.com/PRL-PRG/contractr):type signature parsing and checking for R
- [argtracer](https://github.com/PRL-PRG/argtracer): trace R values and store them in the R value database. It requires a modified R interpreter called R-dyntrace. It can be installed from [here](https://github.com/PRL-PRG/R-dyntrace/tree/r-4.0.2) for version 4.0.2. Then add the resulting `R` binary in `bin` in your `PATH` and run the installation.
## Demo
The following is a short demonstration of the basic signatr functionnalities i.e., how to create the value database by running R code and then how to use it for fuzzing.
```{r}
library(signatr)
```
To generate a database of values, we need some code to run. One way to get it is to extract it from an existing R package, for example `stringr`:
```{r}
extract_package_code("stringr", output_dir = "demo")
```
This will extract all the runnable snippets from the package documentation and tests into the given directory. For example:
```{r}
cat(readLines("demo/examples/str_detect.Rd.R", n = 5))
```
Next, we trace the file by executing it and recording all the calls using the `trace_file` function:
```{r}
trace_file("demo/examples/str_detect.Rd.R", db_path = "demo.sxpdb")
```
The resulting database is stored as demo.sxpdb. In this example, after running the `str_detect.Rd.R` file, the database contains 20 unique values. This can be repeated for all the other files. To trace multiple files in parallel, one database is created per file, which are merged using the merge_dbs function once they are all available. This also allows us to run larger experiments on multiple machines.
Once the database is ready, we can start fuzzing. The fuzzer has a number of configration points, but the easiest starting point is the `quick_fuzz` helper function:
```{r}
R <- quick_fuzz("stringr", "str_detect", "demo.sxpdb", budget = 100, action = "infer")
```
The infer_call_signature function infers types for each call argument and return value using the type annotation language, specifically, the `contractr` type inference tool provided by the work is used. `infer_call_signature` returns a data frame with details for each call. The data frame includes the inferred call signature in the `result` column:
```{r}
R
```