Project

Christian Gram Kalhauge

Table of Contents
  1. Choosing a Project§1
    1. Proposal§1.1
      1. Project Ideas§1.2
      2. Assessment§2
        1. Written Answers§2.1
          1. Project Overview Sheet§2.2
            1. Paper§2.3
              1. Video Presentation§2.4
                1. Oral Defence§2.5
                2. Technical Contributions§3
                  1. Implementation§3.1
                    1. Engineering§3.2
                      1. Evaluation§3.3
                        1. Theory§3.4
                          1. Presentation§3.5

                          Here is the definitive guide to the project and the assessment.

                          Choosing a Project §1

                          The first thing you should do is find an interesting topic. The topic should be centered around program analysis, and should demonstrate your understanding of the course. The normal case is to choose one more benchmark cases which you want to explore building a faster or more precise analysis for.

                          It is possible to choose your own custom project. Talk to the teacher, if you are interested.

                          Proposal §1.1

                          Before starting on the project, you have to hand in a project proposal. The project proposal is ~2 pages and correspond roughly to the introduction of your final paper. It should formatted using the ACM small format (see template), and will be evaluated according to the CGIE model, and graded 1-4:

                          • Context: what is the problem?

                          • Gap: why does the current techniques not work?

                          • Innovation: what is your proposed solution?

                          • Evaluation: when have you solved the problem?

                          • Plan: what is the plan?

                          • Presentation: is the proposal formatted correctly and presented nicely?

                          As an appendix to the proposal, please also submit a table of Technical Contributions (§3), as described in Project Overview Sheet (§2.2). Remember to fill in the maximal grade for each student.

                          You need to score at least 12 points and none of the parameters can be 1. If your proposal are rejected you have to resubmit (even after the deadline). The proposals are graded as you submit them, so you do not have to wait to the last day.

                          Project Ideas §1.2

                          As inspiration, here is selection of project ideas. Each project is described using abstracts from the CGIE Model as well as the potential technical contributions of those projects.

                          Analysis Survey

                          Developers know they should use a static analysis when developing code. However, there currently exist so many options that it is hard to chose.

                          In this project, we will investigate the state of the art, by implement runners for (five?) different analyses on the JPAMB suite, to illustrate their differences. We will expand the benchmark suite until we have found a clear difference between each different analyses, so we can discuss their short-comings.

                          IDNameMax
                          PWPWrite Paper10
                          PEXFind Illustrating Examples4
                          PRWRead and relate to recent papers10
                          NANIntegrate Analyses15
                          EXBExtend Benchmark Suite5
                          Sum44

                          With a maximum of 44 contribution points the suggested maximal grade is 7 (Groups of 4), 4 (Groups of 5), and 02 (Groups of 6).

                          Handling String

                          Strings are one of the basic units of computing. However, currently none of the techniques shown in class handle strings.

                          In this project, we extend our current analyses to handle strings. To evaluate our implementation we will expand the JPAMB suite to contain cases that show normal use cases of strings.

                          IDNameMax
                          PWPWrite Paper10
                          PEXFind Illustrating Examples4
                          PRWRead and relate to recent papers10
                          IINImplement Interpreter10
                          ISYImplement Syntactic Analysis10
                          IAIImplement Abstract Interpreter10
                          IABImplement Novel Abstractions10
                          NDTHandle Novel Datatype10
                          EXBExtend Benchmark Suite5
                          Sum79

                          With a maximum of 79 contribution points the suggested maximal grade is 10 (Groups of 4), 7 (Groups of 5), and 7 (Groups of 6).

                          Program Debloating

                          When writing software, developers (and now also LLMs) often add code which might be needed later, but never is. Ideally, the computer should be able to remove code which is not needed automatically. Currently, both dynamic and static analysis approaches have been used to remove unused method bodies and classes. However, current work does not remove code inside of functions.

                          In this project, we will use static and dynamic analysis techniques to estimate possible code-coverage on the source line level, and remove all the code not needed. To evaluate our implementation we will run our debloater on JPAMB suite and show we can remove X% of the code while all cases still can evaluate.

                          IDNameMax
                          PWPWrite Paper10
                          PEXFind Illustrating Examples4
                          PRWRead and relate to recent papers10
                          IINImplement Interpreter10
                          ISYImplement Syntactic Analysis10
                          IAIImplement Abstract Interpreter10
                          IABImplement Novel Abstractions10
                          NDTHandle Novel Datatype10
                          EXBExtend Benchmark Suite5
                          NCRAnalysis informed code-rewriting10
                          IBAImplement Unbounded Static Analysis7
                          Sum96

                          With a maximum of 96 contribution points the suggested maximal grade is 12 (Groups of 4), 10 (Groups of 5), and 10 (Groups of 6).

                          Statically Improved Fuzzing

                          Fuzzing is one of the most used analysis techniques in production. But for deep programs it can be hard to direct the fuzzer to all parts of the program. Currently, some dynamic techniques, like coverage and in some cases concolic analysis are used to improve the efficiency. But since these techniques are dynamic, they cannot help us with sections we have missed.

                          In this project, we will use static analysis techniques to improve fuzzing coverage. We do this by rating branches which changes to get higher coverage higher. To evaluate our implementation we will show that our hybrid fuzzer explores cases in the JPAMB faster than pure fuzzing.

                          IDNameMax
                          EXBExtend Benchmark Suite5
                          IABImplement Novel Abstractions10
                          IAIImplement Abstract Interpreter10
                          IINImplement Interpreter10
                          ICFImplement Coverage-based Fuzzing10
                          ISYImplement Syntactic Analysis10
                          NDTHandle Novel Datatype10
                          NINUse Bytecode Instrumentation10
                          PEXFind Illustrating Examples4
                          PWPWrite Paper10
                          Sum89

                          With a maximum of 89 contribution points the suggested maximal grade is 12 (Groups of 4), 10 (Groups of 5), and 7 (Groups of 6).

                          Assessment §2

                          Your final grade will be an overall assessment based on:

                          The work will be evaluated according to the learning objectives. Special weight will be put on that you are using the content of the course (indicated by the technical contributions), that the project is well done (easy to read, use of techniques are warranted by the problem, and claims are supported by evaluation), and finally (for extra points) if the project problem is interesting and the solution is innovative.

                          Written Answers §2.1

                          In the end of the course, your feedback group must hand in a single PDF, with written answers to of all the Questions (§2.5) asked in class. The answers should be around a paragraph. The answers will be graded on correctness, and account for no more than 10% of the final grade.

                          Project Overview Sheet §2.2

                          Technical contributions is an advisory grade, which goal is to set the expectations of the course. At the exam, your grade might be higher or lower than your calculated grade, depending on the outcome of project as a whole.

                          Additionally to the Paper (§2.3), your project group also must hand in a single A4 page, containing the name, student number, the feedback group and the images of each of the students in the group. Finally you should also create a table summarizing the technical contributions of each of the group members.

                          The table, should have a row for each technical contribution. They should contain the contribution, the points it is worth, and how the points should be distributed between the students.

                          ContributionPointssXXXXXXsYYYYYYsZZZZZZ...
                          IIN Implement Interpreter10433-
                          IAI Implement Abstract Interpreter100010-
                          NAN Integrate Analyses (3)15847-
                          EXB Extend Benchmark Suite (10)5210-
                          ------
                          Total-14821-
                          Suggested Maximal Grade-7412-

                          Finally, the numbers are tallied up and used to calculate a maximal grade, each individual student can get. The grades are calculated from this table (subject to change):

                          GradePointDiversity
                          1220Across 5 contributions
                          1015Across 4 contributions
                          710Across 3 contributions
                          47Across 3 contributions
                          025Across 3 contributions

                          Paper §2.3

                          The paper should be at most 10 pages, excluding references. It must be formatted using the ACM format in acmsmall mode. To be sure to adhere to the format.

                          Before you continue writing the paper, consider the advice in this lecture, by Simon Peyton Jones.

                          The final paper should contain the following sections (approximate page numbers in parentheses):

                          • An abstract, describing the paper.

                          • An introduction (1-2 pages), with

                            • a good motivation and description of the problem you are trying to solve (consider using examples),

                            • the problem stated as a research question,

                            • a short description of your solution, and

                            • a list of contributions with forward references, described in the lecture by Simon P. Jones.

                          • An example illustrating your approach (1-2 pages).

                          • An approach and theory section: high level description of the technique used to analyze (2-3 pages), with

                            • a formal description of the approach;

                            • a walkthrough of the theory needed to understand the technique; and

                            • any theoretical guarantees of the approach.

                          • An evaluation section, describing the evaluation and the results (2-3 pages), with

                            • an empirical evaluation of the approach,

                            • compared against at least one other technique (or variations (optimizations) of your technique),

                            • an analysis of the results, and

                            • a discussion of the threats to validity.

                          • A related work section, discussion of the technique, and alternatives techniques to solve the same problem (1 - 2 pages), with

                            • an explanation why other techniques from the lectures (and potentially the literature) were not used.

                          • A conclusion (0 - 1/2 page), which is

                            • A short recap of the paper.

                          Video Presentation §2.4

                          The day before the first exam your group has to hand in a video presentation of your project. It should highlight the project you have done and help start a conversation about your project. It can be no longer than 5 min.

                          Oral Defence §2.5

                          Finally, the projects are defended orally in a group exam.

                          The presentation plan is still being worked out. But expect ~30 minutes per group starting with the 5 minutes video presentation, and ending in individual questioning

                          The questions are to your technical contributions in the project as well as the course material in general, especially the Questions (§2.5).

                          Technical Contributions §3

                          This list is incomplete. Please send suggestions on email to get it expanded or revised.

                          Here is an table of technical contributions. Use the formula to calculate the number of points gained. The max points is the maximal points the group can get per contribution.

                          Implementation §3.1

                          IIN Implement Interpreter (Max: 10)

                          IAI Implement Abstract Interpreter (Max: 10)

                          ISY Implement Syntactic Analysis (Max: 10)

                          Formula: 5 per analysis

                          For example look through the bytecode for values to use in a fuzzer. Or quickly detect if there is assertions in a method.

                          ICF Implement Coverage-based Fuzzing (Max: 10)

                          IAB Implement Novel Abstractions (Max: 10)

                          Formula: 5 per abstraction

                          Implement new abstractions not covered in class. This could be machine word abstractions or polyhedras.

                          ISE Implement Symbolic Execution (Max: 10)

                          IBA Implement Unbounded Static Analysis (Max: 7)

                          Run your static analysis until a fixed point.

                          Engineering §3.2

                          NAB Integrate Abstractions (Max: 10)

                          Formula: 5 per abstraction after the first

                          NAN Integrate Analyses (Max: 15)

                          Formula: 5 per analysis after the first

                          NIN Use Bytecode Instrumentation (Max: 10)

                          Instrument the JVM bytcode, and make the JVM run the dynamic analysis.

                          NCR Analysis informed code-rewriting (Max: 10)

                          Using information from an analysis to automatically rewrite code.

                          NDT Handle Novel Datatype (Max: 10)

                          Formula: 5 per datatype

                          Handle datatypes like strings or doubles, which are currently not covered by the material.

                          Evaluation §3.3

                          EXB Extend Benchmark Suite (Max: 5)

                          Formula: 0.5 per method added

                          Extend the JPAMB benchmark suite to contain new methods. Pull requests are appreciated but not required.

                          ERC Make Analysis run on Real Code (Max: 10)

                          Formula: 1 per 10k lines handled

                          EOP Optimize the code (Max: 10)

                          Formula: 5 per order of magnitude faster

                          ENJ Support a Different Language than Java (Max: 10)

                          Theory §3.4

                          TAB Prove abstractions correct (Max: 10)

                          Formula: 5 per abstraction

                          TWI Prove termination for widening operators (Max: 10)

                          Formula: 5 per abstraction

                          Presentation §3.5

                          PWP Write Paper (Max: 10)

                          Formula: 1 per page

                          PEX Find Illustrating Examples (Max: 4)

                          Formula: 2 per example

                          PRW Read and relate to recent papers (Max: 10)

                          Formula: 1 per paper