Here is the definitive guide to the project and the assessment.
The first thing you should do is find an interesting topic. The topic should be centered around program analysis, and should demonstrate your understanding of the course. The normal case is to choose one more benchmark cases which you want to explore building a faster or more precise analysis for.
It is possible to choose your own custom project. Talk to the teacher, if you are interested.
Before starting on the project, you have to hand in a project proposal. The project proposal is ~2 pages and correspond roughly to the introduction of your final paper. It should formatted using the ACM small format (see template), and will be evaluated according to the CGIE model, and graded 1-4:
Context: what is the problem?
Gap: why does the current techniques not work?
Innovation: what is your proposed solution?
Evaluation: when have you solved the problem?
Plan: what is the plan?
Presentation: is the proposal formatted correctly and presented nicely?
As an appendix to the proposal, please also submit a table of Technical Contributions (§3), as described in Project Overview Sheet (§2.2). Remember to fill in the maximal grade for each student.
You need to score at least 12 points and none of the parameters can be 1. If your proposal are rejected you have to resubmit (even after the deadline). The proposals are graded as you submit them, so you do not have to wait to the last day.
As inspiration, here is selection of project ideas. Each project is described using abstracts from the CGIE Model as well as the potential technical contributions of those projects.
Developers know they should use a static analysis when developing code. However, there currently exist so many options that it is hard to chose.
In this project, we will investigate the state of the art, by implement runners for (five?) different analyses on the JPAMB suite, to illustrate their differences. We will expand the benchmark suite until we have found a clear difference between each different analyses, so we can discuss their short-comings.
ID | Name | Max |
---|---|---|
PWP | Write Paper | 10 |
PEX | Find Illustrating Examples | 4 |
PRW | Read and relate to recent papers | 10 |
NAN | Integrate Analyses | 15 |
EXB | Extend Benchmark Suite | 5 |
Sum | 44 |
With a maximum of 44 contribution points the suggested maximal grade is 7 (Groups of 4), 4 (Groups of 5), and 02 (Groups of 6).
Strings are one of the basic units of computing. However, currently none of the techniques shown in class handle strings.
In this project, we extend our current analyses to handle strings. To evaluate our implementation we will expand the JPAMB suite to contain cases that show normal use cases of strings.
ID | Name | Max |
---|---|---|
PWP | Write Paper | 10 |
PEX | Find Illustrating Examples | 4 |
PRW | Read and relate to recent papers | 10 |
IIN | Implement Interpreter | 10 |
ISY | Implement Syntactic Analysis | 10 |
IAI | Implement Abstract Interpreter | 10 |
IAB | Implement Novel Abstractions | 10 |
NDT | Handle Novel Datatype | 10 |
EXB | Extend Benchmark Suite | 5 |
Sum | 79 |
With a maximum of 79 contribution points the suggested maximal grade is 10 (Groups of 4), 7 (Groups of 5), and 7 (Groups of 6).
When writing software, developers (and now also LLMs) often add code which might be needed later, but never is. Ideally, the computer should be able to remove code which is not needed automatically. Currently, both dynamic and static analysis approaches have been used to remove unused method bodies and classes. However, current work does not remove code inside of functions.
In this project, we will use static and dynamic analysis techniques to estimate possible code-coverage on the source line level, and remove all the code not needed. To evaluate our implementation we will run our debloater on JPAMB suite and show we can remove X% of the code while all cases still can evaluate.
ID | Name | Max |
---|---|---|
PWP | Write Paper | 10 |
PEX | Find Illustrating Examples | 4 |
PRW | Read and relate to recent papers | 10 |
IIN | Implement Interpreter | 10 |
ISY | Implement Syntactic Analysis | 10 |
IAI | Implement Abstract Interpreter | 10 |
IAB | Implement Novel Abstractions | 10 |
NDT | Handle Novel Datatype | 10 |
EXB | Extend Benchmark Suite | 5 |
NCR | Analysis informed code-rewriting | 10 |
IBA | Implement Unbounded Static Analysis | 7 |
Sum | 96 |
With a maximum of 96 contribution points the suggested maximal grade is 12 (Groups of 4), 10 (Groups of 5), and 10 (Groups of 6).
Fuzzing is one of the most used analysis techniques in production. But for deep programs it can be hard to direct the fuzzer to all parts of the program. Currently, some dynamic techniques, like coverage and in some cases concolic analysis are used to improve the efficiency. But since these techniques are dynamic, they cannot help us with sections we have missed.
In this project, we will use static analysis techniques to improve fuzzing coverage. We do this by rating branches which changes to get higher coverage higher. To evaluate our implementation we will show that our hybrid fuzzer explores cases in the JPAMB faster than pure fuzzing.
ID | Name | Max |
---|---|---|
EXB | Extend Benchmark Suite | 5 |
IAB | Implement Novel Abstractions | 10 |
IAI | Implement Abstract Interpreter | 10 |
IIN | Implement Interpreter | 10 |
ICF | Implement Coverage-based Fuzzing | 10 |
ISY | Implement Syntactic Analysis | 10 |
NDT | Handle Novel Datatype | 10 |
NIN | Use Bytecode Instrumentation | 10 |
PEX | Find Illustrating Examples | 4 |
PWP | Write Paper | 10 |
Sum | 89 |
With a maximum of 89 contribution points the suggested maximal grade is 12 (Groups of 4), 10 (Groups of 5), and 7 (Groups of 6).
Your final grade will be an overall assessment based on:
the Written Answers (§2.1), handed in by your feedback group
the Paper (§2.3), handed in by your project group
the individualized Project Overview Sheet (§2.2), handed in by the project groups.
and the Video Presentation (§2.4) and Oral Defence (§2.5), done in the project groups.
The work will be evaluated according to the learning objectives. Special weight will be put on that you are using the content of the course (indicated by the technical contributions), that the project is well done (easy to read, use of techniques are warranted by the problem, and claims are supported by evaluation), and finally (for extra points) if the project problem is interesting and the solution is innovative.
In the end of the course, your feedback group must hand in a single PDF, with written answers to of all the Questions (§2.5) asked in class. The answers should be around a paragraph. The answers will be graded on correctness, and account for no more than 10% of the final grade.
Technical contributions is an advisory grade, which goal is to set the expectations of the course. At the exam, your grade might be higher or lower than your calculated grade, depending on the outcome of project as a whole.
Additionally to the Paper (§2.3), your project group also must hand in a single A4 page, containing the name, student number, the feedback group and the images of each of the students in the group. Finally you should also create a table summarizing the technical contributions of each of the group members.
The table, should have a row for each technical contribution. They should contain the contribution, the points it is worth, and how the points should be distributed between the students.
Contribution | Points | sXXXXXX | sYYYYYY | sZZZZZZ | ... |
---|---|---|---|---|---|
IIN Implement Interpreter | 10 | 4 | 3 | 3 | - |
IAI Implement Abstract Interpreter | 10 | 0 | 0 | 10 | - |
NAN Integrate Analyses (3) | 15 | 8 | 4 | 7 | - |
EXB Extend Benchmark Suite (10) | 5 | 2 | 1 | 0 | - |
- | - | - | - | - | - |
Total | - | 14 | 8 | 21 | - |
Suggested Maximal Grade | - | 7 | 4 | 12 | - |
Finally, the numbers are tallied up and used to calculate a maximal grade, each individual student can get. The grades are calculated from this table (subject to change):
Grade | Point | Diversity |
---|---|---|
12 | Across 5 contributions | |
10 | Across 4 contributions | |
7 | Across 3 contributions | |
4 | Across 3 contributions | |
02 | Across 3 contributions |
The paper should be at most 10 pages, excluding references. It must be formatted using the ACM format in acmsmall
mode. To be sure to adhere to the format.
Before you continue writing the paper, consider the advice in this lecture, by Simon Peyton Jones.
The final paper should contain the following sections (approximate page numbers in parentheses):
An abstract, describing the paper.
An introduction (1-2 pages), with
a good motivation and description of the problem you are trying to solve (consider using examples),
the problem stated as a research question,
a short description of your solution, and
a list of contributions with forward references, described in the lecture by Simon P. Jones.
An example illustrating your approach (1-2 pages).
An approach and theory section: high level description of the technique used to analyze (2-3 pages), with
a formal description of the approach;
a walkthrough of the theory needed to understand the technique; and
any theoretical guarantees of the approach.
An evaluation section, describing the evaluation and the results (2-3 pages), with
an empirical evaluation of the approach,
compared against at least one other technique (or variations (optimizations) of your technique),
an analysis of the results, and
a discussion of the threats to validity.
A related work section, discussion of the technique, and alternatives techniques to solve the same problem (1 - 2 pages), with
an explanation why other techniques from the lectures (and potentially the literature) were not used.
A conclusion (0 - 1/2 page), which is
A short recap of the paper.
The day before the first exam your group has to hand in a video presentation of your project. It should highlight the project you have done and help start a conversation about your project. It can be no longer than 5 min.
Finally, the projects are defended orally in a group exam.
The presentation plan is still being worked out. But expect ~30 minutes per group starting with the 5 minutes video presentation, and ending in individual questioning
The questions are to your technical contributions in the project as well as the course material in general, especially the Questions (§2.5).
This list is incomplete. Please send suggestions on email to get it expanded or revised.
Here is an table of technical contributions. Use the formula to calculate the number of points gained. The max points is the maximal points the group can get per contribution.
IIN Implement Interpreter (Max: 10)
IAI Implement Abstract Interpreter (Max: 10)
ISY Implement Syntactic Analysis (Max: 10)
Formula: 5 per analysis
For example look through the bytecode for values to use in a fuzzer. Or quickly detect if there is assertions in a method.
ICF Implement Coverage-based Fuzzing (Max: 10)
IAB Implement Novel Abstractions (Max: 10)
Formula: 5 per abstraction
Implement new abstractions not covered in class. This could be machine word abstractions or polyhedras.
ISE Implement Symbolic Execution (Max: 10)
IBA Implement Unbounded Static Analysis (Max: 7)
Run your static analysis until a fixed point.
NAB Integrate Abstractions (Max: 10)
Formula: 5 per abstraction after the first
NAN Integrate Analyses (Max: 15)
Formula: 5 per analysis after the first
NIN Use Bytecode Instrumentation (Max: 10)
Instrument the JVM bytcode, and make the JVM run the dynamic analysis.
NCR Analysis informed code-rewriting (Max: 10)
Using information from an analysis to automatically rewrite code.
NDT Handle Novel Datatype (Max: 10)
Formula: 5 per datatype
Handle datatypes like strings or doubles, which are currently not covered by the material.
EXB Extend Benchmark Suite (Max: 5)
Formula: 0.5 per method added
Extend the JPAMB benchmark suite to contain new methods. Pull requests are appreciated but not required.
ERC Make Analysis run on Real Code (Max: 10)
Formula: 1 per 10k lines handled
EOP Optimize the code (Max: 10)
Formula: 5 per order of magnitude faster
ENJ Support a Different Language than Java (Max: 10)
TAB Prove abstractions correct (Max: 10)
Formula: 5 per abstraction
TWI Prove termination for widening operators (Max: 10)
Formula: 5 per abstraction
PWP Write Paper (Max: 10)
Formula: 1 per page
PEX Find Illustrating Examples (Max: 4)
Formula: 2 per example
PRW Read and relate to recent papers (Max: 10)
Formula: 1 per paper