Profiling

This section describes how best to profile yk and interpreters.

JIT statistics

At the end of an interpreter run, yk can print out some simple statistics about what happened during execution. If the YKD_LOG_STATS=<path> environment variable is defined, then JSON statistics will be written to the file at <path> once the interpreter "drops" the YkMt instance. The special value - (i.e. a single dash) can be used for <path> to indicate stderr.

Note that if the interpreter starts multiple yk instances, then the contents of <file> are undefined (at best the file will be nondeterministically overwritten as instances are "dropped", but output may be interleaved, or otherwise bizarre).

Output from YKD_LOG_STATS looks as follows:

{                                       
    "duration_compiling": 5.5219,                                               
    "duration_deopting": 2.2638,
    "duration_jit_executing": 0.2,
    "duration_outside_yk": 0.142,
    "duration_trace_mapping": 3.3797,                                           
    "traces_collected_err": 0,                                                  
    "traces_collected_ok": 11,                                                  
    "traces_compiled_err": 1,
    "traces_compiled_ok": 10                                                    
}

Fields and their meaning are as follows:

  • duration_compiling. Float, seconds. How long was spent compiling traces?
  • duration_deopting. Float, seconds. How long was spent deoptimising from failed guards?
  • duration_jit_executing. Float, seconds. How long was spent executing JIT compiled code?
  • duration_outside_yk. Float, seconds. How long was spent outside yk? This is a proxy for "how much time was spent in the interpreter", but is inherently an over-approximation because we can't truly know exactly what the system outside Yk counts as "interpreting" or not. For example, if an interpreter thread puts itself to sleep, we will still count it as time spent "outside yk".
  • duration_trace_mapping. Float, seconds. How long was spent mapping a "raw" trace to compiler-ready IR?
  • trace_executions. Unsigned integer. How many times have traces been executed? Note that the same trace can count arbitrarily many times to this.
  • traces_collected_err. Unsigned integer. How many traces were collected unsuccessfully?
  • traces_collected_ok. Unsigned integer. How many traces were collected successfully?
  • traces_compiled_err. Unsigned integer. How many traces were compiled unsuccessfully?
  • traces_compiled_ok. Unsigned integer. How many traces were compiled successfully?

Perf

On Linux, perf can be used to profile yk. You first need to record an execution of an interpreter and then separately view the profiling data that was generated.

Recording a Profile

To record a profile we first recommend compiling yk with debugging info embedded. cargo's debug profile does this automatically, but because no code optimisation is performed, the profiles are unrepresentative. We recommend using yk's provided profile-with-debug profile, which turns on --release-style code optimisation and embeds debugging information:

$ cargo build --profile=release-with-debug

Ensure that the interpreter you are profiling links to the appropriate version of yk and then call perf record:

$ perf record --call-graph dwarf -g ./interpreter ...args...

This uses --call-graph dwarf to force perf use DWARF debugging information: this will only be useful if you have compiled yk with embedded debugging information, as recommended above.

Viewing a profile

perf profiles can be visualised in a number of ways. When using perf report or perf script we currently recommend passing --no-inline to avoid the huge processing time incurred by indirectly running addr2line (note that this might change in the future).

Terminal

To quickly view a profile in the terminal:

$ perf report -g --no-inline

Firefox profiler

After processing perf's output, you can use Firefox's Profiler to view the data locally. Note that this does not upload the data --- all processing happens in your browser! First process the data:

$ perf script -F +pid --no-inline > out.perf

Then go to the Firefox Profiler page, press "Load a profile from file" and upload out.perf.

Flame graphs

You can make a flame graph using the Rust flamegraph tool. Install with cargo install flamegraph and then use flamegraph to profile and produce a flamegraph in one go with:

$ /path/to/cargo/bin/flamegraph --no-inline -- ./interpreter ...args...

Note that flamegraph passes --call-graph=dwarf to perf record by default (pass -v to see the exact perf invocation used).

This will produce an svg file which you can then view.