Dynamic Analysis

Let's run it and see?

Christian Gram Kalhauge

Table of Contents

  1. What is Dynamic Analysis?
  2. Running the Program
  3. Testing
  4. Selecting Input

Questions

  1. Name cases where the choice of a dynamic analysis makes the most sense?

  2. What are the primary limitation and challenges of dynamic analysis?

  3. When is random sampling of inputs better than testing small values and visa versa?

What is Dynamic Analysis? (ยง1)

  1. Trace Selection
  2. Trace Analysis
  3. Trace Abstraction and Prediction
Fill Input State Space Prediction 1.2 Input Selection 1.1 Analysis 1.3 Repeat 2.1

The three steps of a dynamic analysis, 1.1 we select a trace from the program, 1.2 we try to predict a class of traces from which our trace came from, 1.3 we analyse the set of traces. If we did not find anything we might repeat the analysis 2.1

Trace Selection (ยง1.1)

ฯ„โˆˆ๐’๐ž๐ฅ๐ž๐œ๐ญ(P,ฯƒ)โ‰กฯ„โˆˆSem(P)โˆงฯ„0=ฯƒ

Trace Analysis (ยง1.2)

  • did it end in a failure? ๐“๐ซ๐š๐œ๐ž๐šŽ๐š›๐š›(ฯ„)โ‰กฯ„โˆ’1=err(โ€˜...โ€™)

  • did any opened resource, not get closed? ๐“๐ซ๐š๐œ๐ž๐š›๐šŽ๐šœ(ฯ„)โ‰กโˆƒfi.๐Ž(ฯ„i,f)โˆงโˆ€j.ยฌ๐‚(ฯ„j,f)โˆจj<i

  • did we execute this instruction? ๐“๐ซ๐š๐œ๐ž๐šŒ๐š˜๐šŸ๐šŽ๐š›(i)(ฯ„)โ‰กiโˆˆ{e.ฮนย |ย eโˆˆฯ„}

โˆƒฯƒโˆˆIp,ฯ„โˆˆ๐’๐ž๐ฅ๐ž๐œ๐ญ(P,ฯƒ).๐“๐ซ๐š๐œ๐žX(ฯ„)โŸนโ„’X(P)
โˆ€ฯƒโˆˆIp,ฯ„โˆˆ๐’๐ž๐ฅ๐ž๐œ๐ญ(P,ฯƒ).๐“๐ซ๐š๐œ๐žX(ฯ„)โ‰กโ„’X(P)

Story: No analysis are truly sound!

Trace Abstraction and Prediction (ยง1.3)

  • [ฯ„]โІ๐“๐ซ๐š๐œ๐ž

  • ๐“๐ซ๐š๐œ๐žX([ฯ„])โŸนโˆƒฯ„โ€ฒโˆˆ[ฯ„].๐“๐ซ๐š๐œ๐žX(ฯ„โ€ฒ)

The Path Equivalence Class (ยง1.3.1)

int checkTheWrongThing(int a) { 
  if (a != 0) { 
    return a / 0;
  } else { 
    return 0;
  }
}
[ฯ„]ฯ€={tโ€ฒโˆˆ๐“๐ซ๐š๐œ๐žย |ย โˆ€iโˆˆ[1,โˆž).(tโ€ฒi).ฮน=(ti).ฮน}

Coverage (ยง1.3.2)

#ฮน(T)={ฯ„i.ฮนย |ย iโˆˆ[0,โˆž],ฯ„โˆˆT}

Instruction coverage vs Path equivalence

#ฮน([ฯ„]ฯ€)=#ฮน({ฯ„})

Trace Abstraction (ยง1.3.3)

How much do we need to save to disk?

  • Everything done at runtime ... fast and enables reactions.

  • Everything done at rest ... lazy and enables more analysis.

Running the Program (ยง2)

Why Should We Run the Program? (ยง2.1)

  • Every warning is real

  • Every warning has a trace

Why Shouldn't We Run the Program? (ยง2.2)

  • Setting up the Environment

  • Warning by doing

  • Only tells you what happens

Testing (ยง3)

Run the program and see if it crashes.

Turn your interpreter into a dynamic analysis.

  1. If the method takes arguments, exit early, otherwise,

  2. Run the case with your interpreter,

  3. Give the case returned 100%, and all others 0%.

Characterization Trace Testing (ยง3.1)

Check if the output is the same.

$ uv run jpamb interpret -r out.expected <your_interpreter>

Assertions (ยง3.2)

Crash the program early!

Tiger Style!

public static int square(int height, int width) {
  assert height > 0; 
  assert width > 0;
  return height * width;
}

Negative Space Programming

case Get(field=field):
  assert field.fieldid.name == "assertionsDisabled"
  frame.stack.push(Value.int(0)) # Push False

Sanitizers = Automatic Assertions

Selecting Input (ยง4)

  1. Random Inputs
  2. Dictionaries
  3. Coverage-Guided Fuzzing
  4. Property Based Testing
  5. The Small-Scope Hypothesis

Random Inputs (ยง4.1)

Choose inputs at random, hope to hit all trace classes

Easy to hit both trace classes!

void split(int i) { 
  if (i > 0) { 
    ...
  } else {
    ...
  }
}

Dictionaries (ยง4.2)

Not all Trace classes are easy to hit

void isEqualToZero(int i) { 
  if (i == 0) { 
    ...
  } else {
    ...
  }
}

The Dictionary Solution

  1. build a dictionary of interesting values, and

  2. make it more likely to choose those.

Coverage-Guided Fuzzing (ยง4.3)

A more complicated technique.

@Case("([C: 'h','e','l','l','o']) -> ok")
@Case("([C: 'x']) -> assertion error")
@Case("([C: ]) -> out of bounds")
@Tag({ ARRAY })
public static void arraySpellsHello(char[] array) {
  assert array[0] == 'h'
      && array[1] == 'e'
      && array[2] == 'l'
      && array[3] == 'l'
      && array[4] == 'o';
}

This case is not easy!

Given a function from byte-strings to program inputs:

  • pick an interesting test-case.

  • try to change it by removing, adding or modifying bytes.

  • convert it to a program input and run it and see if it adds coverage.

  • if so add it to the interesting cases.

Property Based Testing (ยง4.4)

Assertions + Fuzzing = Magic!

(Hypothesis) Use it in your interpreter!

from hypothesis import given, strategies as st

@given(st.lists(st.integers() | st.floats()))
def test_sort_correct(lst):
    # lst is a random list of numbers
    assert my_sort(lst) == sorted(lst)

test_sort_correct()

The Small-Scope Hypothesis (ยง4.5)

The Small-Scope Hypothesis

Most bugs in a program can be found by investigating only a small section of the inputs.

Will be found every time.

void isEqualToZero(int i) { 
  if (i == 0) { 
    ...
  } else {
    ...
  }
}

Will never be found.

void isEqualToAMillion(int i) { 
  if (i == 1000000) { 
    ...
  } else {
    ...
  }
}

We can create small-check generators

def gen_int(depth): 
  yield 0
  for i in range(depth):
    yield (i + 1)
    yield -(i + 1)

Call using iterative deepening gen_int(0), gen_int(1), gen_int(2)

Questions

  1. Name cases where the choice of a dynamic analysis makes the most sense?

  2. What are the primary limitation and challenges of dynamic analysis?

  3. When is random sampling of inputs better than testing small values and visa versa?