-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy paththesis.tex
executable file
·289 lines (247 loc) · 15.8 KB
/
thesis.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
\documentclass{article}
\usepackage{caption}
\usepackage{code}
\usepackage{geometry}
\usepackage{graphicx}
\usepackage{multirow}
\usepackage{url}
\usepackage{verbatim}
\usepackage{fancyvrb}
\usepackage{booktabs}
\usepackage{tabularx}
\usepackage{color}
\usepackage{colortbl}
\definecolor{Gray}{gray}{0.9}
\newcommand\ch[1]{\textcolor[rgb]{0,0,1}{\textbf{#1}}}
\newcommand\clh[1]{\textcolor[rgb]{0,.5,1}{\textbf{#1}}}
\newcommand\coh[1]{\textcolor[rgb]{0,.6,0}{\textbf{#1}}}
\newtheorem{research-question}{Research Question}[section]
\title{Celeriac: A Daikon .NET Front End}
\author{Kellen Donohue}
\newcommand{\todo}[1]{{\color{red}\bfseries [[#1]]}}
\begin{document}
\maketitle
\abstract{
Celeriac is a .NET Front End for Daikon, it allows invariant detection in C\#, F\#, and other .NET programs. This paper discusses the features, design, and a research use of the tool. Daikon is a tool which takes a program trace and infers likely program invariants. However, it needs a front-end, such as Celeriac to modify a program to produce this trace. This paper describes many ways users can interface with Celeriac. It explains the architecture of Celeriac, with explanations of design decisions relevant to other profiler tools. It discusses the results of a preliminary research study examining how .NET developers uses code contracts today, what the behavior of .NET programs is as observed by Daikon, and these two differ.
}
\section{Introduction}
\subsection{Contracts}
Contracts provide an agreement between a method implementer and caller, and also describe well-formed object. The following code snippet shows how a typical bank account withdraw method may be annotated with contracts.
\begin{center}
\begin{Verbatim}[commandchars=\\\{\}]
\ch{class} BankAccount \{
\ch{bool} Withdraw(\clh{Account} acct, \ch{decimal} amt) \{
\coh{// Don't withdraw a negative amount.}
\clh{Contract}.Requires(amt >= 0);
\coh{// Return whether withdraw succeeded.}
\clh{Contract}.Ensures(
\clh{Contract}.Result<\ch{bool}>() == acct.bal >= amt);
\coh{// Update the account properly.}
\clh{Contract}.Ensures(acct.bal ==
\clh{Contract}.OldValue(acct.bal) - \clh{Contract}.Result<\ch{bool}>()
? amt : 0);
...
\}
\}
\end{Verbatim}
\end{center} \captionof{figure}{Classic Bank Account method annotated with CodeContracts}
\ \\
Contracts are useful for ensuring the correct behavior of programs. Additionally, the .NET framework has many tools that are enhanced by the use of contracts, in a framework called CodeContracts. The Pex white-box test generation tool \cite{pex} won't generate tests that violate specified contracts, which is important for ensuring that generated tests satisfy a specification. Developers can use contracts to statically verify program properties \cite{static} and automatically generate API documentation \cite{contracts}.
\subsection{Daikon}
To help developers insert appropriate contracts the Daikon tool \cite{daikon} dynamically detects likely program invariants. It takes as input a program trace, and outputs a list of possible invariants based on that trace. The following figure demonstrates a code snippet and the possible invariants that Daikon may infer on the snippet.
\begin{center}
\begin{Verbatim}[commandchars=\\\{\}]
\ch{public void} makeEmpty() \{
\clh{Arrays}.fill(theArray, 0, topOfStack + 1, \ch{null});
topOfStack = -1;
\}
-----------------------------------------------------
Daikon Invariants:
DataStructures.StackAr.makeEmpty():::EXIT
size(this.theArray[]) == orig(size(this.theArray[]))
this.theArray[] elements == null
this.topOfStack == -1
this.theArray[0..this.topOfStack] == []
this.topOfStack <= orig(this.topOfStack)
\end{Verbatim}
\end{center} \captionof{figure}{Example of Java code and Daikon invariants that could be generated for it}
\ \\
In this example all the Daikon inferred invariants are true of the source code, however that is not always the case. Since Daikon can only output invariants based on input traces it may output invariants that are too weak (since it is limited by the type of invariants that may occur) or too strong (since it may over-generalize if given a trace that doesn't well demonstrate the program).
\\ \\
Daikon works as follows, and as described in the figure below. First, front-end modifies a program to add instrumentation calls which output a data trace during execution. During this process a declaration file, or description of the program schema in terms of methods and variables is output. The datatrace created during execution along with the declaration file are then run through Daikon, which outputs a list of possible program invariants. These invariants can be inserted as contracts, assertions, or documentation in the source program, among a variety of other possible uses.
\begin{center}
\includegraphics[scale=.5]{daikon}
\end{center} \captionof{figure}{Operation of Daikon}
\subsection{Contributions}
The contribution of this work is as follows. The technical contribution is Celeriac, a Daikon front end for the .NET programming languages C\#, F\#, and VB.NET. The research contribution is an investigation of which types of contracts developers choose to include in their programs, and why. Section 2 described how Celeriac relates to other Daikon front ends, and other .NET profiler tools. Section 3 describes the use of Celeriac. Section 4 describes the implementation of Celeriac, including important design contributions. Section 5 describes the application of Celeriac to answer software engineering questions. Finally, Section 6 concludes.
\section{Related Work}
Daikon .NET front ends exist for some other languages. The Chicory front end, for Java, is very popular, and Celeriac was modeled after the external behavior or Chicory in many cases. The Kvasir front end, for C++, was a model for the testing suite as well.
\\ \\
Celeriac is unique among Daikon front ends in that no other front end has been written for a functional language, such as F\#. Also, no other front ends have explicitly targeted a multi-language platform, such as .NET.
\newpage
\section{User Documentation}
Daikon trace files may be very large (on the order of gigabytes), even for reasonable programs. A challenge for users of Daikon on programs such as these is to limit the trace to areas of particular interest, for example by looking only at certain classes. To support this Celeriac supports a wide variety of options for filtering which variables or program points are output. Celeriac also supports options to produce additional output, such as the values of pure methods, or the ability to synthesize linked lists.
\\ \\
Included here is the Celeriac user documentation, as available to users on the Celeriac project homepage \footnote{\url{https://code.google.com/p/daikon-dot-net-front-end/w/list}}. The documentation describes all the optional features of Celeriac.
\newpage
\section{Developer Documentation}
Included here is the Celeriac developer documentation. It describes the architecture of Celeriac, along with design decisions made for Celeriac that are useful insights for building other profiler tools.
\newpage
\section{Research Contributions}
Our research team used Celeriac in an investigation of the following research questions:
\begin{research-question}
What kinds of Code Contracts do developers manually (without the aid of tools such as Daikon that infer invariants) include in their programs?
\end{research-question}
\begin{research-question}
What are the observable behaviors of .NET programs as inferred by Daikon? How do these differ (quantitatively and qualitatively) from the types of contracts developers include.
\end{research-question}
This section consists of excerpts from an upcoming paper, describing the study investigating these questions, and other sections. The included content showcases the use of Celeriac in answering the research questions.
\input{subject-programs}
\label{sec:survey}
\subsection{Included Code Contracts}
We analyzed and categorized the contracts developers included in their sample programs, the results are in Table 2 below:
\begin{center}{\small \input{cc-usage-table}}
\captionof{table}{Code Contract usage for the subject programs. The number
listed next to the program name is the number of source code
lines in the program. The
columns REQ(uires), ENS(ures), and INV(ariants) correspond to
preconditions, postconditions, and object invariants,
respectively. A single contract can be counted in multiple
categories (rows).
%
The vast majority of preconditions written with Code Contracts
simply check the presence of information; the majority of
postconditions ensure that information is produced, or specify which
information is produced.}
\end{center}
\label{table:cc-usage}
The vast majority of contracts developers include are nullness checks. These are typically not very interesting, as they don't well describe the program. Also, often if the nullness contracts were to fail the program would crash soon anyways, limiting the usefulness of performing runtime assertions. We believed programs contained much richer invariants that developers could include as contracts, and the next subsection describes the invariants inferred by Daikon.
\input{paper-celeriac}
\section{Related Work}
\subsection{Other Daikon Front-Ends}
Daikon front-ends exist
for other programming languages including Java~\cite{ErnstPGMPTX2007},
C/C++~\cite{Guo2006,Rudd2010}, and Eiffel~\cite{Polikarpova2009}. Both
the Java and C/C++ front-end support a dynamic comparability analysis
(as compared to the static analysis described in
Section~\ref{sec:comparability}); the Eiffel front-end does not
perform any comparability analyses.
%
Interestingly, the Eiffel front-end assumes that all nullary methods
are pure and supports a black-list for impure methods.
\subsection{Language-Specific Features and Daikon Enhancements}
To support the .NET languages, we included features in Celeriac that
aren't present in the other Daikon front-ends. Additionally, I
introduced new metadata to Daikon to support the features.
\subsubsection{Properties}
A .NET property is a hybrid of a field and method. Each property has an
associated \verb|get| and \verb|set| method. These methods are
accessed without parenthesis, as if they were a field:
\begin{verbatim}
myObject.Property = ...
var x = myObject.Property
\end{verbatim}
In many cases, the property is backed by a private field. Since this
is a common case, the .NET languages provide auto-properties, which
automatically generate the backing field. Unlike fields however,
properties may throw exceptions, produce different values each time
they are called (e.g., \verb|Date.Now|), or even perform
side-effects.
%
To enable pretty printing and filtering of properties in Daikon, I
introduced a \verb|is_property| variable flag.
\subsubsection{Readonly Variables}
The .NET languages include a \verb|readonly| keyword that specifies
that a variable must be assigned in the constructor, or given a
constant value; the equivalent in Java is the \verb|final|
keyword.
%
For readonly variables, Daikon produces redundant information for
method exit points stating that the variable has not been modified
(e.g., for a variable \verb|x|, \verb|x == orig(x)|).
%
We introduced an \verb|is_readonly| variable flag to Daikon to filter
out these cases from the Daikon output.
%
In .NET, the \verb|==| operator performs a reference-equality
comparison for expressions with reference-type, and a value-based
comparison for expressions with value-types
%
\footnote{In C\#, the == and != operators can be
overloaded to perform a value equality check for a reference-type;
we do not consider this case.}.
%
Therefore, the semantics of the \verb|is_readonly| flag depends on the
type of the expression.
%
The \verb|readonly| keyword is shallow. For reference types, the
keyword prevents the reference from being reassigned, but does not
prevent the object from being modified through the reference.
%
Therefore, when considering a composite expression (e.g.,
\verb|this.foo.bar|), Celeriac cannot naively use the \verb|readonly|
attribute of the last field.
%
To compute whether an expression should be flagged as
\verb|is_readonly|, Celeriac starts at the root (e.g., \verb|this|in
the the case of \verb|this.foo.bar|) of the composite expression and
propagates reference immutability and value immutability
information. Reference immutability is propagated via the
\verb|readonly| fields (\verb|this| is reference-immutable). Value
immutability is propagated / introduced for \verb|readonly| fields
that have an immutable type. We define immutability conservatively ---
an immutable type is one that is composed of \verb|readonly| fields
with immutable types.
% Celeriac uses a conservative immutability heuristic: reference types
% that only consist of other immutable reference types or value types
% are considered immutable.
% %
% Properties, however, complicate the matter since a one of a type's
% properties may not be idempotent even if all of the fields are
% readonly (i.e., the property can reference another non-immutable
% object); Celeriac does not attempt to handle this case.
% \paragraph{Enumerations}
% \todo{Why is this interesting?}
% %
% Enumerations in .NET are backed by integers. In fact, uses of the
% enumeration constants are replaced by integer constants in the IL (the
% comparability analysis infers which integers are enumeration constants
% based on co-occurence with a variable with an enumeration type).
% %
% To aid pretty-printing of contracts involving enumerations, we
% introduced an \verb|is_enum| variable flag to Daikon.
\subsubsection{Exceptions}
Some specification languages such as JML allow developers to specify
exceptional postconditions --- postconditions that are true when a
given exception is thrown (note that this is may be different than the
conditions that cause the exception to be thrown).
%
To support reasoning about exceptional postconditions, Celeriac
produces a separate output program point for each exception type
thrown by a method. Currently, however, this information cannot be
fully integrated into Daikon.
%
Daikon has a fixed notion of the relationship between preconditions,
postconditions, and object invariants. Each return location of a
method has its own program point in Daikon. Invariants that appear at
every return location are aggregate into the set of postconditions for
the method. Additionally, preconditions and postconditions that appear
for every method are automatically lifted to an object invariant
program point.
%
To correctly support exceptions, an additional layer would need to be
introduced to separately aggregate the regular and exceptional return
locations of a method.
%
For future work, we plan to use the information to detect cases in
which a method's exceptional postconditions are not consistent with
its normal postconditions --- e.g., the method does not clean up the
object's internal state before exiting with an exception.
% \paragraph{F\# Lists}
% \todo{How does we support F\# lists?}
\section{Conclusion}
This paper has presented Celeriac, a .NET front-end for Daikon. The engineering contribution is Celeriac, the Daikon .NET front end. The conceptual contributions are design decisions when writing profiler tools, as described in the developer manual, and new Daikon invariant detection capabilities, such as conceptual post conditions. The research contribution is an analysis of which types of contracts developers choose to include in their code, how these differ from the inferred behavior or the program, and why.
\newpage
\bibliographystyle{plain}
\bibliography{thesis}
\end{document}