Dynamic Analysis

Let's run it and see?

Christian Gram Kalhauge

What is Dynamic Analysis?
Running the Program
Testing
Selecting Input

Questions

In which cases does choosing a dynamic analysis make the most sense?
What are the primary limitations and challenges of dynamic analysis?
When is random sampling of inputs better than testing small values, and vice versa?

What is Dynamic Analysis? (§1)

Trace Selection
Trace Analysis
Trace Abstraction and Prediction

The three steps of a dynamic analysis: (1.1) we select a trace from the program, (1.2) we try to predict the class of traces that our trace came from, and (1.3) we analyse that set of traces. If we do not find anything, we might repeat the analysis (2.1).

Trace Selection (§1.1)

$τ \in \mathbf{Select}(P, σ) \equiv τ \in \mathrm{Sem}(P) \land τ_{0} = σ$

Trace Analysis (§1.2)

Did it end in a failure? $\mathbf{Trace}_{\mathtt{err}}(τ) \equiv τ_{-1} = \mathrm{err}(\text{‘...’})$
Did any opened resource not get closed? $\mathbf{Trace}_{\mathtt{res}}(τ) \equiv \exists f, i . \mathbf{O}(τ_{i}, f) \land \forall j . (\neg \mathbf{C}(τ_{j}, f) \lor j < i)$
Did we execute this instruction? $\mathbf{Trace}_{\mathtt{cover}(i)}(τ) \equiv i \in \{e.ι \mid e \in τ\}$
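The three trace predicates can be sketched in Python. This is only an illustration under assumptions: the `Event` class, its fields, and the event kinds `"open"`, `"close"` and `"err"` are hypothetical names chosen here, not part of any existing interpreter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One step of a trace (hypothetical representation)."""
    instr: int           # the instruction label, written ι in the formulas
    kind: str = "step"   # assumed kinds: "step", "open", "close", "err"
    resource: str = ""   # resource name for "open" and "close" events


def trace_err(trace):
    # Trace_err: the last event of the trace is an error.
    return trace[-1].kind == "err"


def trace_res(trace):
    # Trace_res: some resource f is opened at index i,
    # and every close of f happens before i.
    for i, event in enumerate(trace):
        if event.kind != "open":
            continue
        closes = [j for j, other in enumerate(trace)
                  if other.kind == "close" and other.resource == event.resource]
        if all(j < i for j in closes):
            return True
    return False


def trace_cover(trace, instr):
    # Trace_cover(i): instruction i occurs somewhere in the trace.
    return instr in {event.instr for event in trace}


leaky = [Event(0, "open", "f"), Event(1), Event(2, "err")]
clean = [Event(0, "open", "f"), Event(1, "close", "f")]
```

Here `leaky` satisfies all of `trace_err`, `trace_res` and `trace_cover(leaky, 1)`, while `clean` satisfies neither `trace_err` nor `trace_res`.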

$\exists σ \in I_{p}, τ \in \mathbf{Select}(P, σ) . \mathbf{Trace}_{X}(τ) \implies \mathcal{L}_{X}(P)$

$\forall σ \in I_{p}, τ \in \mathbf{Select}(P, σ) . \mathbf{Trace}_{X}(τ) \equiv \mathcal{L}_{X}(P)$

Story: No analysis is truly sound!

Trace Abstraction and Prediction (§1.3)

$[τ] \subseteq \mathbf{Trace}$
$\mathbf{Trace}_{X}([τ]) \implies \exists τ' \in [τ] . \mathbf{Trace}_{X}(τ')$

The Path Equivalence Class (§1.3.1)

int checkTheWrongThing(int a) { 
  if (a != 0) { 
    return a / 0;
  } else { 
    return 0;
  }
}

$[τ]_{π} = \{τ' \in \mathbf{Trace} \mid \forall i \in [1, \infty) . τ'_{i}.ι = τ_{i}.ι\}$
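As a rough illustration of the path equivalence class, the sketch below re-implements `checkTheWrongThing` in Python and records which instructions each run visits. The labels `"if"`, `"div"` and `"ret0"` are made-up names for the three instructions, and the division by zero is modelled as an outcome string rather than a real exception.

```python
def check_the_wrong_thing(a):
    """Return (path, outcome) for one run; the path is the sequence of labels."""
    labels = ["if"]
    if a != 0:
        labels.append("div")   # models `return a / 0;`
        return tuple(labels), "divide by zero"
    labels.append("ret0")      # models `return 0;`
    return tuple(labels), "ok"


def path_classes(inputs):
    """Group inputs by the path their trace follows."""
    classes = {}
    for a in inputs:
        path, _outcome = check_the_wrong_thing(a)
        classes.setdefault(path, []).append(a)
    return classes


classes = path_classes([-5, 0, 1, 42])
```

All non-zero inputs land in the same class, so observing the failure for one of them predicts it for the whole class.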

Coverage (§1.3.2)

$\#_{ι}(T) = \{τ_{i}.ι \mid i \in [0, \infty), τ \in T\}$

Instruction coverage vs Path equivalence

$\#_{ι}([τ]_{π}) = \#_{ι}(\{τ\})$
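A small sketch of instruction coverage, assuming each trace is given simply as its sequence of instruction labels. The labels and the three example traces are hypothetical.

```python
def instruction_coverage(traces):
    """The set of instruction labels that occur in any of the given traces."""
    return {label for trace in traces for label in trace}


# Hypothetical traces, each written as its sequence of instruction labels.
nonzero_a = ("if", "div")
nonzero_b = ("if", "div")   # follows the same path as nonzero_a
zero = ("if", "ret0")

same_class = instruction_coverage([nonzero_a, nonzero_b])
both_classes = instruction_coverage([nonzero_a, zero])
```

Adding a second trace from the same path class does not change the coverage, while a trace from the other class does.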

Trace Abstraction (§1.3.3)

How much do we need to save to disk?

Everything done at runtime ... fast and enables reactions.
Everything done at rest ... lazy and enables more analysis.

Running the Program (§2)

Why Should We Run the Program? (§2.1)

Every warning is real
Every warning has a trace

Why Shouldn't We Run the Program? (§2.2)

Setting up the Environment
Warning by doing
Only tells you what happens

Testing (§3)

Run the program and see if it crashes.

Turn your interpreter into a dynamic analysis.

If the method takes arguments, exit early; otherwise,
run the case with your interpreter,
give the outcome it returned 100%, and all others 0%.
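The recipe above can be sketched as follows. The list of outcome names and the `interpret` callback are assumptions made for this example; substitute whatever your own interpreter and tooling use.

```python
# Assumed set of possible outcomes; adjust to the ones your tooling expects.
OUTCOMES = ["ok", "assertion error", "divide by zero", "out of bounds"]


def predict(takes_arguments, interpret):
    """Turn one interpreter run into a prediction for every outcome.

    `interpret` is a hypothetical zero-argument callback that runs the
    method and returns the observed outcome as a string.
    """
    if takes_arguments:
        return None  # exit early: we have no inputs to run the method with
    observed = interpret()
    return {outcome: (100 if outcome == observed else 0) for outcome in OUTCOMES}


prediction = predict(False, lambda: "divide by zero")
```

For a method without arguments there is a single input, so one run decides the prediction.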

Characterization Trace Testing (§3.1)

Check if the output is the same.

$ uv run jpamb interpret -r out.expected <your_interpreter>

Assertions (§3.2)

Crash the program early!

Tiger Style!

public static int square(int height, int width) {
  assert height > 0; 
  assert width > 0;
  return height * width;
}

Negative Space Programming

case Get(field=field):
  assert field.fieldid.name == "assertionsDisabled"
  frame.stack.push(Value.int(0)) # Push False

Sanitizers = Automatic Assertions

Selecting Input (§4)

Random Inputs
Dictionaries
Coverage-Guided Fuzzing
Property Based Testing
The Small-Scope Hypothesis

Random Inputs (§4.1)

Choose inputs at random, hope to hit all trace classes

Easy to hit both trace classes!

void split(int i) { 
  if (i > 0) { 
    ...
  } else {
    ...
  }
}
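A minimal sketch of random input selection for `split`, assuming 32-bit integer inputs and a seeded generator so that the run is repeatable. The class names returned by `split` are made up for the example.

```python
import random


def split(i):
    """Python stand-in for the method above; returns the branch that was taken."""
    return "positive" if i > 0 else "non-positive"


def sample_classes(rng, samples):
    """Run `split` on uniformly random 32-bit integers and collect the branches hit."""
    return {split(rng.randint(-2**31, 2**31 - 1)) for _ in range(samples)}


hit = sample_classes(random.Random(0), 100)
```

Roughly half of all integers fall in each class, so a handful of samples is almost certain to hit both.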

Dictionaries (§4.2)

Not all trace classes are easy to hit

void isEqualToZero(int i) { 
  if (i == 0) { 
    ...
  } else {
    ...
  }
}

The Dictionary Solution

build a dictionary of interesting values, and
make it more likely to choose those.
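One way to sketch the dictionary idea: with some probability draw from a fixed list of interesting values, otherwise fall back to a uniform draw. The list `INTERESTING`, the probability `0.5`, and the function names are assumptions for illustration.

```python
import random

# Assumed dictionary of interesting values: zero, small numbers and the int bounds.
INTERESTING = [0, 1, -1, 2**31 - 1, -2**31]


def pick_int(rng, dictionary_chance=0.5):
    """Pick from the dictionary with the given chance, else uniformly at random."""
    if rng.random() < dictionary_chance:
        return rng.choice(INTERESTING)
    return rng.randint(-2**31, 2**31 - 1)


def is_equal_to_zero(i):
    """Python stand-in for the method above; returns the branch that was taken."""
    return "zero" if i == 0 else "non-zero"


rng = random.Random(0)
hit = {is_equal_to_zero(pick_int(rng)) for _ in range(1000)}
```

A uniform draw hits `0` with a chance of about one in four billion, while this generator picks it roughly one time in ten.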

Coverage-Guided Fuzzing (§4.3)

A more complicated technique.

@Case("([C: 'h','e','l','l','o']) -> ok")
@Case("([C: 'x']) -> assertion error")
@Case("([C: ]) -> out of bounds")
@Tag({ ARRAY })
public static void arraySpellsHello(char[] array) {
  assert array[0] == 'h'
      && array[1] == 'e'
      && array[2] == 'l'
      && array[3] == 'l'
      && array[4] == 'o';
}

This case is not easy!

Given a function from byte-strings to program inputs:

pick an interesting test-case.
try to change it by removing, adding or modifying bytes.
convert it to a program input and run it and see if it adds coverage.
if so add it to the interesting cases.
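The loop above can be sketched in Python for a stand-in of `arraySpellsHello`. Everything here is an assumption made for illustration: `decode` is one possible function from byte-strings to inputs (it maps each byte to a lowercase letter), the coverage points are made-up labels, and real fuzzers use far more refined mutation and scheduling.

```python
import random

TARGET = "hello"


def decode(data):
    """Assumed mapping from a byte-string to a char array: one letter per byte."""
    return [chr(ord("a") + byte % 26) for byte in data]


def run(chars):
    """Stand-in for arraySpellsHello; returns (outcome, covered points)."""
    covered = set()
    for k, expected in enumerate(TARGET):
        if k >= len(chars):
            return "out of bounds", covered
        covered.add(("read", k))
        if chars[k] != expected:      # `&&` short-circuits at the first mismatch
            return "assertion error", covered
        covered.add(("match", k))
    return "ok", covered


def mutate(data, rng):
    """Remove, modify or add a single byte."""
    data = bytearray(data)
    choice = rng.choice(["remove", "modify", "add"])
    if choice == "remove" and data:
        del data[rng.randrange(len(data))]
    elif choice == "modify" and data:
        data[rng.randrange(len(data))] = rng.randrange(256)
    else:
        data.insert(rng.randint(0, len(data)), rng.randrange(256))
    return bytes(data)


def fuzz(rng, budget=1_000_000):
    """Return a byte-string whose decoding makes `run` return "ok", or None."""
    corpus = [b""]   # the interesting test cases
    seen = set()     # all coverage points observed so far
    for _ in range(budget):
        candidate = mutate(rng.choice(corpus), rng)
        outcome, covered = run(decode(candidate))
        if not covered <= seen:       # the candidate adds coverage: keep it
            seen |= covered
            corpus.append(candidate)
        if outcome == "ok":
            return candidate
    return None


found = fuzz(random.Random(0))
```

Pure random sampling would have to guess all five letters at once. Here every matched letter adds coverage, so the search keeps the partial progress and only has to guess one letter at a time.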

Property Based Testing (§4.4)

Assertions + Fuzzing = Magic!

(Hypothesis) Use it in your interpreter!

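from hypothesis import given, strategies as st

def my_sort(lst):
    # Stand-in for the function under test; replace it with your own sort.
    return sorted(lst)

# NaN is excluded: comparisons with NaN are always false, so the sorted order is not well defined.
@given(st.lists(st.integers() | st.floats(allow_nan=False)))
def test_sort_correct(lst):
    # lst is a random list of numbers
    assert my_sort(lst) == sorted(lst)

test_sort_correct()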

The Small-Scope Hypothesis (§4.5)

The Small-Scope Hypothesis

Most bugs in a program can be found by investigating only a small scope of its inputs.

Will be found every time.

void isEqualToZero(int i) { 
  if (i == 0) { 
    ...
  } else {
    ...
  }
}

Will never be found.

void isEqualToAMillion(int i) { 
  if (i == 1000000) { 
    ...
  } else {
    ...
  }
}

We can create small-check generators

def gen_int(depth): 
  yield 0
  for i in range(depth):
    yield (i + 1)
    yield -(i + 1)

Call using iterative deepening: gen_int(0), gen_int(1), gen_int(2), and so on.
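A sketch of the iterative deepening loop, reusing `gen_int` from above. The helper `find_smallest` and its `predicate` parameter are hypothetical names; the predicate stands for "this input triggers the bug".

```python
def gen_int(depth):
    """Small-check generator from above: 0, then 1, -1, 2, -2, ... up to depth."""
    yield 0
    for i in range(depth):
        yield i + 1
        yield -(i + 1)


def find_smallest(predicate, max_depth):
    """Try depth 0, 1, 2, ... and return (depth, value) for the first hit, else None."""
    for depth in range(max_depth + 1):
        for value in gen_int(depth):
            if predicate(value):
                return depth, value
    return None


zero_hit = find_smallest(lambda i: i == 0, 10)
million_hit = find_smallest(lambda i: i == 1000000, 100)
```

Each depth re-tests the values of the previous depths, which keeps the sketch simple at the cost of some repeated work. With a bound of 100, the value `1000000` is never reached.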

Questions

In which cases does choosing a dynamic analysis make the most sense?
What are the primary limitations and challenges of dynamic analysis?
When is random sampling of inputs better than testing small values, and vice versa?
