forked from eisop/checker-framework
-
Notifications
You must be signed in to change notification settings - Fork 0
/
new-contributor-projects.html-old
711 lines (588 loc) · 27.1 KB
/
new-contributor-projects.html-old
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
<h2 id="Avoiding_exponential_blowup_when_processing_DAGs">Avoiding exponential blowup when processing DAGs</h2>
<!-- John Field of Google is interested in this. -->
<p>
Google's <a href="https://bazel.build/">Bazel</a> open-source project is a
publicly-released version of their build system, Blaze. Blaze builds every
line of source code that is written by any Google programmer — all of
that source code appears in a single repository! Therefore, Bazel/Blaze needs to
be fast. Bazel represents all of the source code and its dependencies as a
large DAG (a
<a href="https://en.wikipedia.org/wiki/Directed_acyclic_graph">directed
acyclic graph</a>). It needs to manipulate these DAGs efficiently.
</p>
<p>
One of the biggest problems that the Bazel developers face is exponential
blowup of DAG sizes and therefore of run time. Periodically, one of the
Bazel developers makes such a mistake, and Bazel becomes unusable until
they can diagnose and fix the problem.
</p>
<p>
Here are two different ways to view the problem.
</p>
<ol>
<li>
In a DAG, multiple nodes may have the same child. Traversing the DAG
naively would visit the child multiple times — in the worst case,
exponentially many times. It is necessary to avoid doing so.
</li>
<li>
Bazel contains a function that takes a DAG as input and generates a DAG
as output (the output is an object graph). The Bazel developers want to
ensure that the size of the output DAG is O(|input DAG|). The input DAG is
processed bottom-up (it ensures that each input node is visited once) and
Bazel stores the results of intermediate computations that construct the
output DAG with nodes of the input DAG. The key thing the Bazel developers
want to avoid is copying intermediate subgraphs that have unbounded size.
</li>
</ol>
<p>
More concretely, there is only one Java type for all DAGs, and there is a
method <code>flatten()</code>. It's a mistake to call
<code>flatten()</code> on certain DAGs, because doing so may cause
exponential blowup.
</p>
<!--
</p>
<p>
A pluggable type system can subdivide the Java type, using this type hierarchy:
</p>
<p>
@PossbilyPropagatable
|
@NotPropagatable
</p>
<p>
A call to flatten() is permitted only if the receiver is known to be
@NotPropagatable.
</p>
<p>
This should be very easy to code up and evaluate, once we have more precise
definitions.
-->
<p>
The goal of this project would be to better understand the problem with
Bazel, to formalize it, and to create a program analysis that solves
the problem. You would evaluate your work by running it on the Bazel
codebase to discover latent problems, or by providing it to the Bazel
developers to run each time they propose a code change. The Bazel
team is interested in collaborating by evaluating a tool.
</p>
<h2 id="case-study-index-out-of-bounds">Array indexing (index-out-of-bounds errors)</h2>
<p>
An index-out-of-bounds error occurs when a programmer provides an illegal
index to an array or list, as in <code>a[i]</code> or <code>a.get(i)</code>
where <code>i</code> is less than 0 or greater than the length of
<code>a</code>. In languages like C, this is disastrous: buffers
overflows lead to about 1/6 of all security vulnerabilities. In languages
like Java, the result is “merely” that the program crashes. In
both languages, it is desirable to prevent programmers from making this
error and to prevent users from suffering the bad consequences.
</p>
<p>
This project will be a substantial case study with
the <a href="https://eisop.github.io/cf/manual/#index-checker">Index
Checker</a>. The first goal is to identify its merits and limitations.
Does it scale up to big,
interesting programs? Are there common, important code patterns that it
fails to handle? Does it produce too many false positive warnings? Does
it place too heavy a burden on the user, either in terms of annotations or
in terms of complexity of the error messages? Worst of all, are there
unknown unsoundnesses in the tool?
The second goal is to improve its precision enough to make it usable by
real-world programmers.
</p>
<p>
A <a href="#index-checker-mutable-length">related project</a> is to extend
the Index Checker to handle
mutable collections such as
<code>List</code>s, where the <code>remove()</code> method makes sound,
precise analysis very tricky.
</p>
<h3 id="case-study-nullness-bazel">Bazel tool</h3>
<!-- John Field of Google is interested in this. -->
<p>
This project is related to the
<a href="https://bazel.build/">Bazel</a> build system, and was
proposed by its development manager.
</p>
<p>
The Bazel codebase contains 1586 occurrences of the <code>@Nullable</code>
annotation. This annotation indicates that a variable may hold a null
value. This is valuable documentation and helps programmers avoid null
pointer exceptions that would crash Bazel. However, these annotations are
not checked by any tool. Instead, programmers have to do their best to
obey the <code>@Nullable</code> specifications in the source code. This is
a lost opportunity, since documentation is most useful when it is
automatically processed and verified. (For several years, Google tried
using <a href="https://findbugs.sourceforge.net/">FindBugs</a>, but they
eventually abandoned it: its analysis is too weak, suffering too many
false positives and false negatives.)
</p>
<p>
Despite the programmers' best efforts, null pointer exceptions do still
creep into the code, impacting users. The Bazel developers would like to
prevent these. They want a guarantee, at compile time, that no null
pointer exceptions will occur at run time.
</p>
<p>
Such a tool already exists: the
<a href="https://eisop.github.io/cf/manual/#nullness-checker">Nullness
Checker</a> of the <a href="https://eisop.github.io/">Checker
Framework</a>. It runs as a compiler plug-in, and it issues a warning at
every possible null pointer dereference. If it issues no warnings, the
code is guaranteed not to throw a <code>NullPointerException</code> at run time.
</p>
<p>
The goal of this project is to do a large-scale case study of the Nullness
Checker on Bazel. The main goal is to understand how the Nullness Checker
can be used on a large-scale industrial codebase. How many lurking bugs
does it find? What
<a href="https://eisop.github.io/cf/api/org/checkerframework/checker/nullness/qual/Nullable.html"><code>@Nullable</code></a>
annotations are missing from the codebase because the developers failed to
write them? What are its limitations, such as code patterns that it cannot
recognize as safe? (You might create new analyses and incorporating them
into the Nullness Checker, or you might just reporting bugs to the Nullness
Checker developers for fixing.) What burdens does it place on users? Is
the cost-benefit tradeoff worth the effort — that is, should Google
adopt this tool more broadly? How should it be improved? Are the most
needed improvements in the precision of the analysis, or in the UI of the
tooling?
</p>
<h3 id="case-study-nullness-bcel">BCEL library</h3>
<p>
Annotate the BCEL library to express its contracts with respect to nullness.
Show that the BCEL library has no null pointer exceptions (or find bugs
in BCEL). There are
already <a href="https://github.com/apache/commons-bcel/compare/trunk...typetools:trunk?expand=1">some
annotations</a> in BCEL, but they have not been verified as correct by
running the Nullness Checker on BCEL. (Currently, those annotations are
trusted when type-checking clients of BCEL.)
</p>
<p>
To get started:
</p>
<ul>
<li>Fork https://github.com/typetools/commons-bcel.git</li>
<li>Clone your new fork</li>
<li><code>git checkout typecheck-nullness</code></li>
<li>mvn verify</li>
</ul>
<p>
Some challenging aspects of this case study are:
</p>
<ul>
<li>
There is some poor design that needs to be resolved in discussions with
the BCEL maintainers. For example, consider the <code>copy()</code>
method. Some implementations of <code>copy()</code> return null, but
are not documented to do so. In addition, some implementations
of <code>copy()</code> catch and ignore exceptions. I think it would
be nicest to change the methods to never return null, but to throw an
exception instead. (This is no more burdensome to users, who currently
have to check for null.) Alternately, the methods could all be
documented to return null.
</li>
</ul>
<h2 id="your-own-new-type-system">Invent your own new type system</h2>
<p>
We also welcome your ideas for new type systems. For example, any run-time
failure can probably be prevented at compile time with the right
analysis. Can you come up with a way to fix your pet peeve?
</p>
<p>
It is easiest, but not required, to choose an existing type system from the
literature, since that means you can skip the design stage and go right to
implementation.
</p>
<p>
This task can be simple or very
challenging, depending on how ambitious the type system is. Remember to
focus on what helps a software developer most!
</p>
<h2 id="Bounded-size_strings">Bounded-size strings</h2>
<!-- John Field of Google is interested in this. -->
<p>
Windows cannot run command lines longer than 8191 characters. Creating a
too-long command line causes failures when the program is run on Windows.
These failures are irritating when discovered during testing, and
embarrassing or worse when discovered during deployment. The same command
line would work on Unix, which has longer command-line limits, and as a
result developers may not realize that their change to a command can cause
such a problem.
</p>
<p>
Programmers would like to enforce that they don't accidentally pass a
too-long string to the <code>exec()</code> routine. The goal of this
project is to give a compile-time tool that provides such a guarantee.
</p>
<p>
Here are two possible solutions.
</p>
<p>
<b>Simple solution:</b>
For each array and list, determine whether its length is known at compile
time. The routines that build a command line are only allowed to take such
constant-length lists, on the assumption that if the length is constant,
its concatenation is probably short enough.
</p>
<p>
<b>More complex solution:</b>
For each String, have a compile-time estimate of its maximum length. Only
permit <code>exec()</code> to be called on strings whose estimate is no more than 8191.
String concatenation would return a string whose estimated size is the sum
of the maximums of its arguments, and likewise for concatenating an array
or list of strings.
</p>
<h2 id="lock-ordering">Lock ordering</h2>
<p>
The <a href="https://eisop.github.io/cf/manual/#lock-checker">Lock
Checker</a> prevents race conditions by ensuring that locks are held when
they need to be. It does not prevent deadlocks that can result from locks
being acquired in the wrong order. This project would extend the Lock
Checker to address deadlocks, or create a new checker to do so.
</p>
<p>
Suppose that a program contains two different locks. Suppose that one
thread tries to acquire lockA then lockB, and another thread tries to
acquire lockB then lockA, and each thread acquires its first lock. Then
both locks will wait forever for the other lock to become available. The
program will not make any more progress and is said to
be <a href="https://en.wikipedia.org/wiki/Deadlock">deadlocked</a>.
</p>
<p>
If all threads acquire locks in the same order — in our example, say
lockA then lockB — then deadlocks do not happen. You will extend the
Lock Checker to verify this property.
</p>
<h2 id="asm">Upgrade to a newer version of ASM</h2>
<p>
The
<a href="https://eisop.github.io/afu/">Annotation
File Utilities</a>, or AFU, insert annotations into, and extract
annotations from, <code>.java</code> files, <code>.class</code> files,
and text files. These programs were written before the
<a href="https://asm.ow2.org/">ASM</a> bytecode library supported Java 8's
type annotations. Therefore, the AFU has its own custom version of ASM
that supports type annotations. Now that ASM 6 has been released and it
supports type annotations, the AFU needs to be slightly changed to use
the official ASM 6 library instead of its own custom ASM variant.
</p>
<p>
This project is a good way to learn about <code>.class</code> files and
Java bytecodes: how they are stored, and how to manipulate them.
</p>
<p>
(Kush Gupta is working on this project.)
</p>
<h2 id="typestate">Stateful type systems</h2>
<p>
This project is to improve support for
<a href="https://eisop.github.io/cf/manual/#typestate-checker">typestate checking</a>
</p>
<p>
Ordinarily, a program variable has
the same type throughout its lifetime from when the variable is declared
until it goes out of scope. "Typestate"
permits the type of an object or variable to <em>change</em> in a controlled way.
Essentially, it is a combination of standard type systems with dataflow
analysis. For instance, a file object changes from unopened, to opened, to
closed; certain operations such as writing to the file are only permitted
when the file is in the opened typestate. Another way of saying this is
that <code>write</code> is permitted after <code>open</code>, but not after <code>close</code>.
Typestate
is applicable to many other types of software properties as well.
</p>
<p>
Two <a href="https://eisop.github.io/cf/manual/">typestate checking frameworks</a>
exist for the Checker Framework. Neither is being maintained; a new one
needs to be written.
</p>
<h2 id="analysis_diffs">Tool for analysis diffs</h2>
<!--
This project idea is duplicated at
~mernst/public_html/uw-only/research/potential-research-projects.html .
-->
<p>
Many program analyses are too verbose for a person to read their entire
output. However, after a program change, the analysis results may
change only slightly. An "analysis diff" tool could show the
difference between the analysis run on the old code and the analysis run
on the new code.
</p>
<ul>
<li>
The analysis diffs may help the programmer to better understand the
changes.
</li>
<li>
Bug detection tools. such
as <a href="https://findbugs.sourceforge.net/">FindBugs</a> or
the <a href="https://eisop.github.io/">Checker Framework</a>, have
extremely verbose output when first run on a program. Programmers
could examine and fix only the warnings about code they have changed
(and that they are currently thinking about).
</li>
<li>
Tools that always have large output, such as inference tools, could
become manageable to users if output is shown in small doses.
</li>
<li>
You can probably think of other uses.
</li>
</ul>
<p>
The analysis diff tool would take as input two analysis results (the
previous and the current one). It would output only the new parts of its
second input. (It could optionally output a complete diff between two
analysis results.)
</p>
<p>
One challenge is dealing with changed line numbers and other analysis
output differences between runs.
</p>
<p>
It would be nice to integrate the tool with git pre-commit hooks or GitHub
pull requests, to enable either of the following functionality (for either
commits to master or for pull requests):
</p>
<ul>
<li>
Permit only those commits/pulls that do not add any new analysis warnings.
</li>
<li>
Permit only those commits/pulls that are "clean" — the analysis
issues no warnings for any changed line.
</li>
</ul>
<p>
A concrete example of an analysis diff tool
is <a href="https://github.com/plume-lib/checklink/blob/master/checklink-persistent-errors">checklink-persistent-errors</a>;
see the documentation at the top of the file. That tool only works for
one particular analysis, the W3C Link Checker.
An analysis diff tool also appears to be built into FindBugs.
The goal of this project is to build a general-purpose tool that is easy
to apply to new analyses.
</p>
<h2 id="performance">Performance improvements</h2>
<p>
The Checker Framework runs much slower than the standard javac compiler
— often 20 times slower! This is not acceptable as part of a
developer's regular process, so we need to speed up the Checker
Framework. This project involves determining the cause of slowness in
the Checker Framework, and correcting those problems.
</p>
<p>
This is a good way to learn about performance tuning for Java applications.
</p>
<p>
Some concrete tasks include:
</p>
<ul>
<li>Profile the Checker Framework. Run a profiler such as
<a href="https://www.yourkit.com/java/profiler/">YourKit</a> to determine
which parts of the Checker Framework consume the most CPU time and memory.
</li>
<li>Perhaps compare a profile of the Checker Framework against a profile
of regular javac. This probably is not necessary because the Checker
Framework is so much slower than regular javac.
</li>
<li>Consider interning string values. The Checker Framework does a fair
amount of string manipulation, in part because it reads from resources
such as stub files that do not produce <code>Element</code>s. Interning
could save time when doing comparisons. You can verify the correctness
of the optimization by running the
<a href="https://eisop.github.io/cf/manual/#interning-checker">Interning
Checker</a> on the Checker Framework code. Compare the run time of the
Checker Framework before and after this optimization.
</li>
<li> Based on profiling results, devise other optimizations, implement
them, and evaluate them.
</li>
</ul>
<h2 id="run-time-checking">Run-time checking</h2>
<p>
Implement run-time checking to complement compile-time checking. This will
let users combine the power of static checking with that of dynamic
checking.
</p>
<p>
Every type system is too strict: it rejects some programs that never go
wrong at run time. A human must insert a type loophole to make such a
program type-check. For example, Java takes this approach with its
cast operation (and in some other places).
</p>
<p>
When doing type-checking, it is desirable to automatically insert run-time
checks at each operation that the static checker was unable to verify.
(Again, Java takes exactly this approach.) This guards against mistakes by
the human who inserted the type loopholes. A nice property of this
approach is that it enables you to prevent errors in a program
with no type annotations: whenever the static checker is unable
to verify an operation, it would insert a dynamic check. Run-time checking
would also be useful in verifying whether the suppressed warnings are
correct — whether the programmer made a mistake when writing them.
</p>
<p>
The annotation processor (the pluggable type-checker) should automatically
insert the checks, as part of the compilation process.
</p>
<p>
There should be various modes for the run-time checks:
</p>
<ul>
<li>fail immediately.</li>
<li>logging, to permit post-mortem debugging without crashing the program.</li>
</ul>
<p>
The run-time penalty should be small: a run-time check is necessary only
at the location of each cast or suppressed warning. Everywhere that the
compile-time checker reports no possible error, there is no need to insert a
check. But, it will be an interesting project to determine how to minimize
the run-time cost.
</p>
<p>
Another interesting, and more challenging, design question is whether you need to add and maintain
a run-time representation of the property being tested. It's easy to test
whether a particular value is null, but how do you test whether it is
tainted, or should be treated as immutable? For a more concrete example,
see the discussion of the (not yet implemented)
[Javari run-time checker](http://pag.csail.mit.edu/pubs/ref-immutability-oopsla2005-abstract.html).
Adding this run-time support would be an interesting and challenging project.
</p>
<p>
We developed a prototype for the
<a href="https://ece.uwaterloo.ca/~wdietl/publications/pubs/EnerJ11-abstract.html">EnerJ runtime system</a>.
That code could be used as starting point, or you could start afresh.
</p>
<p>
In the short term, this could be prototyped as a source- or
bytecode-rewriting approach; but integrating it into the type checker is a
better long-term implementation strategy.
</p>
<h2 id="ide-support">IDE and build system support</h2>
<p>
The Checker Framework comes with support for
<a href="https://eisop.github.io/cf/manual/#external-tools">external tools</a>,
including both IDEs (such as an Eclipse plug-in) and build tools
(instructions for Maven, etc.).
</p>
<p>
These plug-ins and other integration should be improved.
We have a number of concrete ideas, but you will also probably come up
with some after a few minutes of using the existing IDE plugins!
</p>
<p>
This is only a task for someone who is already an expert, such as someone
who has built IDE plugins before or is very familiar with the build system.
One reason is that these tools tend to be complex, which can lead to
subtle problems.
Another reason is that we don't want to be stuck maintaining code written by
someone who is just learning how to write an IDE plugin.
</p>
<p>
Rather than modifying the Checker Framework's existing support or
building new support from scratch, it may be better to adapt some other
project's support for build systems and IDEs. For instance, you might
make <a href="https://github.com/coala/coala">coala</a> support the
Checker Framework, or you might adapt the tool integration provided
by <a href="http://errorprone.info/">Error Prone</a>.
</p>
<!--
Integrate the Checker Framework with an IDE such as Eclipse, IntelliJ, IDEA or NetBeans,
which will make pluggable type-checking easier to use and more attractive to developers.
Some specific projects include:
* Create an IDE plug-in that invokes the checker and reports any errors to the developer, just as the IDE currently does for type errors.
* Improve the existing Eclipse plug-in.
* Add a button to the IDE that hides/shows all annotations of a given variety, to reduce clutter. This would be useful beyond type annotations.
* Implement quick fixes for common errors.
* Integrate type inference with the IDE, to make it easier for a developer to add annotations to code.
* Improve an IDE's existing refactoring support so that it is aware of, and properly retains, type annotations.
* Highlight defaulted types and flow-sensitive type refinements, making error messages easier to understand.
* Highlight the parts of the code that influenced flow-sensitive type refinements, again to make errors easier to comprehend.
IDEs that build on the OpenJDK Java compiler could benefit from even tighter integration with the Checker Framework.
This project may entail the following:
* familiarity with the IDE plug-in development environment, Eclipse PDT or NetBeans plug-in support: By the end of the project, the developer should be familiar with building reasonably complex IDE plug-ins.
* UI/UX design: The developer would design a UI experience that is appealing to developers and integrates well with the developer's workflow!
-->
<h2 id="exhaustive-testing">Model checking of a type system</h2>
<p>
Design and implement an algorithm to check type soundness of a type system
by exhaustively verifying the type checker on all programs up to a certain
size. The challenge lies in efficient enumeration of all programs and
avoiding redundant checks, and in knowing the expected outcome of the
tests. This approach is related to bounded exhaustive
testing and model checking; for a reference, see
[Efficient Software Model Checking of Soundness of Type Systems](http://www.eecs.umich.edu/~bchandra/publications/oopsla08.pdf).
</p>
<p>
A valuable project all by itself would be to compare heavy-weight and
light-weight type inference this whole-program inference vs. Checker
Framework Inference vs. Julia, to understand when each one is worth using.
</p>
<h2 id="determinism">Determinism</h2>
<p>
Programs are easier to use and debug if their output is deterministic. For
example, it is easier to test a deterministic program, because
nondeterminism can lead to flaky tests that sometimes succeed and sometimes
fail. As another example, it is easier for a user or programmer to compare
two deterministic executions than two nondeterministic executions.
</p>
<p>
We have created a prototype Determinism Checker. It is documented in
this <a href="https://checkerframework.org/determinism-checker-manual/manual.html#determinism-checker">draft
manual chapter</a>, and its implementation is in
<a href="https://github.com/t-rasmud/checker-framework/tree/nondet-checker">t-rasmud/nondet-checker</a>.
You will do a case study of this type system.
</p>
<h2 id="optional-case-study">Detecting errors in use of the Optional class</h2>
<!-- This project is duplicated in the highlight section and in
the Checker Framework's new-contributor-projects.html . -->
<p>
Java 8 introduced the
<a href="https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/Optional.html"><code>Optional</code></a>
class, a container that is either empty or contains a non-null value.
It is intended to solve the problem of null
pointer exceptions. However, <code>Optional</code>
has <a href="https://homes.cs.washington.edu/~mernst/advice/nothing-is-better-than-optional.html">its
own problems</a>, leading
to <a href="https://stuartmarks.wordpress.com/2016/09/27/vjug24-session-on-optional/">extensive
advice</a> on when and how to use Optional. It is difficult for
programmers to remember all these rules.
</p>
<p>
The goal of this project is to build a tool to check uses of Optional and run it on open-source projects. The research questions are:
</p>
<ul>
<li>Is it possible to build a tool that enforces good style in using Optional?
</li>
<li>How much effort is is for programmers to use such a tool?
</li>
<li>Does real-world code obey the rules about use of Optional, or not?
</li>
</ul>
<p>
We have
a <a href="https://eisop.github.io/cf/manual/#optional-checker">prototype
verification tool</a> that checks some but not all rules about use of
Optional (https://checkerframework.org/manual/#optional-checker). This
project will do case studies of it and extend it.
</p>
<p>
The methodology is to find open-source projects that use Optional(you can
do this by searching GitHub, for example), run the tool on them, and read
the tool's warnings. Each warning will lead to either a bug report against
an open-source project or an improvement to the verification tool.
</p>
<!--
For the Optional Class project, the first task is to run the existing
verification tool: https://checkerframework.org/manual/#optional-checker .
Doing so will give you a feel for its strengths and weaknesses (and the latter
may suggest ways to improve it). First, read parts of the Checker Framework
manual first (the "How to read this manual" section tells you what to read). In
addition to reading the section about the Optional Checker, do follow the link
to the webpage “Nothing is better than the Optional type”. Once you have done
that reading, you need to choose a project to type-check. You can choose a
program that you know uses Optional, or search GitHub for one. For each warning
issued by the Optional Checker, follow the methodology at
https://checkerframework.org/manual/#handling-warnings .
-->