Introduction

This book contains documentation for the yk metatracing system. A preformatted version of this book can be found at https://ykjit.github.io/yk/.

This book is written in mdbook format. If you want to edit this book, mdbook serve --open runs a local server, opens a formatted version of this book in your browser, and automatically refreshes the browser when you save a markdown file after editing.

Major components

yk is a meta-tracing system that turns existing C interpreters into JIT-compiling VMs. It is comprised of two major components:

ykllvm is a fork of LLVM that must be used to: compile C interpreters with the necessary extra information for yk; link in the yk Rust library.
The yk Rust library is the core of the run-time JIT system.

Terminology

yk utilises two different Intermediate Representations (IRs):

AOT IR is the Ahead-Of-Time IR generated by ykllvm and embedded in the binary of a C interpreter. AOT IR is similar to LLVM IR, though customised and simplified for yk.
JIT IR is the IR generated (from AOT IR) at run-time by yk and which is dynamically converted into machine code.

There are three styles of "trace" in yk:

When a hot loop in a program is detected, the actions of the interpreter are recorded to make an AOT IR trace.
The AOT IR trace is combined with AOT IR and then compiled into JIT IR to make a JIT IR trace.
The JIT IR trace is compiled into machine code to make an executable trace.

yk Internals

Installation

This section details how to get yk up and running.

System Requirements

At the time of writing, yk requires the following:

A Linux system with a CPU that supports Intel Processor Trace. (grep intel_pt /proc/cpuinfo to check)
Linux perf (for collecting PT traces).
A Yk-enabled programming language interpreter.
A recent nightly install of Rust.

Note that at present, non-root users can only use Yk if /proc/sys/kernel/perf_event_paranoid is set to -1.

Building

Clone the main yk repository and build it with cargo:

$ git clone --recurse-submodules --depth 1 \
  https://github.com/ykjit/yk/
$ cd yk
$ cargo build --release

Note that this will also clone ykllvm as a submodule of yk. If you later want access to the full git history, either remove --depth 1 or run git fetch --unshallow.

Available Interpreters

The following interpreters use Yk:

Interpreter	Status
yklua	pre-alpha
ykcbf	pre-alpha

Development

This section explains how to setup and use yk.

Configuring the build

Start by following the general installation instructions.

The yk repo is a Rust workspace (i.e. a collection of crates). You can build and test in the usual ways using cargo. For example, to build and test the system, run:

cargo test

`YKB_YKLLVM_BIN_DIR`

Under normal circumstances, yk builds a copy of its LLVM fork "ykllvm", which it also uses it to build interpreters (via the compiler's use of yk-config). You can use your own ykllvm build by specifying the directory where the executables (e.g. clang, llvm-config, and so on) are stored with YKB_YKLLVM_BIN_DIR.

yk does not check your installation for compatibility: it is your responsibility to ensure that your ykllvm build matches that expected by yk.

It is also undefined behaviour to move between defining this variable and not within a repository using yk (including the yk repository itself). If you want to set/unset YKB_YKLLVM_BIN_DIR then cargo clean any repositories using yk before rebuilding them.

`YKB_TRACER`

The YKB_TRACER environment variable allows building yk with either hwt (Hardware Tracer) or swt (Software Software Tracer).

swt - CPU architecture-independent. The default.

hwt - Relies on Intel PT. Suitable only for x86 CPUs supporting it.

Run-time configuration

There are a number of environment variables which control the run_time behaviour of the yk system.

General configuration

Variables prefixed with YK_ allow the user to control aspects of yk's execution.

The following environment variables are available:

YK_HOT_THRESHOLD: an integer from 0..4294967295 (both inclusive) that determines how many executions of a hot loop are needed before it is traced. Defaults to 131.
YK_SIDETRACE_THRESHOLD: an integer from 0..4294967295 (both inclusive) that determines how many times a guard needs to fail before a sidetrace is created. Defaults to 5.

Debugging

Variables prefixed with YKD_ are intended to help interpreter authors debug performance issues. Some are only available in certain compile-time configurations of yk, either because they increase the binary size, or slow performance down.

The following environment variables are available (some only in certain configurations of yk):

YKD_LOG=[<path>:]<level> specifies where, and how much, general information yk will log during execution.

If <path>: (i.e. a path followed by ":") is specified then output is sent to that path. The special value - (i.e. a single dash) can be used for <path> to indicate stderr. If not specified, logs to stderr.

<level> specifies the level of logging, each adding to the previous: level 0 turns off all yk logging; level 1 shows major errors only; and level 2 warnings. Levels above 3 are used for internal yk debugging, and their precise output, and indeed the maximum level may change without warning. Currently: level 3 shows tracing events (e.g. starting/stopping tracing); and level 4 shows trace execution and deoptimisation. Note that some information, at all levels, may or may not be displayed based on compile-time options. Defaults to 1.
YKD_LOG_IR [with the ykd feature]
YKD_LOG_STATS

Debugging

Trace optimisation

Trace optimisation can make it difficult to understand why a yk interpreter has behaved in the way it does. It is worth trying to run your code with the optimiser turned off. You can do this with the YKD_OPT environment variable, which takes the following values:

1: turn the optimiser on. Default if not otherwise specified.
0: turn the optimiser off.

Debugging JITted code

Often you will find the need to inspect JITted code with a debugger. If the problem trace comes from a C test (i.e. one of the test cases under tests/c), then you can use the gdb_c_test tool.

The tool automates the compilation and invocation of the resulting binary under GDB.

The simplest invocation of gdb_c_test (from the top-level of the yk repo) would look like:

bin/gdb_c_test simple.c

This will automatically compile and run the tests/c/simple.c test under GDB. This would be ideal if you have a crashing trace, as it will dump you into a GDB shell at the time of the crash.

To see what else you can do with gdb_c_test, run:

bin/gdb_c_test --help

For help on using GDB, see the GDB documentation.

GDB plugin

Yk comes with a GDB plugin that allows the debugger to show higher-level information in the source view window.

The plugin is built by default and put in target/yk_gdb_plugin.so.

To use it, put this line in ~/.gdbinit:

jit-reader-load /path/to/yk/target/yk_gdb_plugin.so

Then when you run GDB, you should see:

Yk JIT support loaded.

When you are inside JITted code, the source view will show higher-level debugging information. You can show the assembler and source views on one GDB screen using the "split" layout. Type:

la spl

Profiling

This section describes how best to profile yk and interpreters.

JIT statistics

At the end of an interpreter run, yk can print out some simple statistics about what happened during execution. If the YKD_LOG_STATS=<path> environment variable is defined, then JSON statistics will be written to the file at <path> once the interpreter "drops" the YkMt instance. The special value - (i.e. a single dash) can be used for <path> to indicate stderr.

Note that if the interpreter starts multiple yk instances, then the contents of <file> are undefined (at best the file will be nondeterministically overwritten as instances are "dropped", but output may be interleaved, or otherwise bizarre).

Output from YKD_LOG_STATS looks as follows:

{                                       
    "duration_compiling": 5.5219,                                               
    "duration_deopting": 2.2638,
    "duration_jit_executing": 0.2,
    "duration_outside_yk": 0.142,
    "duration_trace_mapping": 3.3797,                                           
    "traces_collected_err": 0,                                                  
    "traces_collected_ok": 11,                                                  
    "traces_compiled_err": 1,
    "traces_compiled_ok": 10                                                    
}

Fields and their meaning are as follows:

duration_compiling. Float, seconds. How long was spent compiling traces?
duration_deopting. Float, seconds. How long was spent deoptimising from failed guards?
duration_jit_executing. Float, seconds. How long was spent executing JIT compiled code?
duration_outside_yk. Float, seconds. How long was spent outside yk? This is a proxy for "how much time was spent in the interpreter", but is inherently an over-approximation because we can't truly know exactly what the system outside Yk counts as "interpreting" or not. For example, if an interpreter thread puts itself to sleep, we will still count it as time spent "outside yk".
duration_trace_mapping. Float, seconds. How long was spent mapping a "raw" trace to compiler-ready IR?
trace_executions. Unsigned integer. How many times have traces been executed? Note that the same trace can count arbitrarily many times to this.
traces_collected_err. Unsigned integer. How many traces were collected unsuccessfully?
traces_collected_ok. Unsigned integer. How many traces were collected successfully?
traces_compiled_err. Unsigned integer. How many traces were compiled unsuccessfully?
traces_compiled_ok. Unsigned integer. How many traces were compiled successfully?

Perf

On Linux, perf can be used to profile yk. You first need to record an execution of an interpreter and then separately view the profiling data that was generated.

Recording a Profile

To record a profile we first recommend compiling yk with debugging info embedded. cargo's debug profile does this automatically, but because no code optimisation is performed, the profiles are unrepresentative. We recommend using yk's provided profile-with-debug profile, which turns on --release-style code optimisation and embeds debugging information:

$ cargo build --profile=release-with-debug

Ensure that the interpreter you are profiling links to the appropriate version of yk and then call perf record:

$ perf record --call-graph dwarf -g ./interpreter ...args...

This uses --call-graph dwarf to force perf use DWARF debugging information: this will only be useful if you have compiled yk with embedded debugging information, as recommended above.

Viewing a profile

perf profiles can be visualised in a number of ways. When using perf report or perf script we currently recommend passing --no-inline to avoid the huge processing time incurred by indirectly running addr2line (note that this might change in the future).

Terminal

To quickly view a profile in the terminal:

$ perf report -g --no-inline

Firefox profiler

After processing perf's output, you can use Firefox's Profiler to view the data locally. Note that this does not upload the data --- all processing happens in your browser! First process the data:

$ perf script -F +pid --no-inline > out.perf

Then go to the Firefox Profiler page, press "Load a profile from file" and upload out.perf.

Flame graphs

You can make a flame graph using the Rust flamegraph tool. Install with cargo install flamegraph and then use flamegraph to profile and produce a flamegraph in one go with:

$ /path/to/cargo/bin/flamegraph --no-inline -- ./interpreter ...args...

Note that flamegraph passes --call-graph=dwarf to perf record by default (pass -v to see the exact perf invocation used).

This will produce an svg file which you can then view.

Understanding Traces

yk can print the traces it has created to stderr to help with debugging. However, these traces are often lengthy, and not always easy to understand. This section briefly explains how to get yk to print its traces, and how to make them a bit easier to understand.

Producing a trace

`YKD_LOG_IR`

YKD_LOG_IR=[<path>:]<irstage_1>[,...,<irstage_n>] logs IR from different stages to path. The special value - (i.e. a single dash) can be used for <path> to indicate stderr.

The following ir_stages are supported:

aot: the entire AOT IR for the interpreter.
jit-pre-opt: the JIT IR trace before optimisation.
jit-post-opt: the JIT IR trace after optimisation.
jit-asm: the assembler code of the compiled JIT IR trace.
jit-asm-full: the assembler code of the compiled JIT IR trace with instruction offsets and virtual addresses annotated.

Gotchas

Yk has some unusual implications, requirements, and limitations that interpreter authors should be aware of.

Trace quality depends upon completeness of interpreter IR.

Yk works by recording traces (i.e. individual control flow paths) of your interpreter's implementation. Each trace ends up as an ordered list of LLVM IR blocks which are "stitched together" and compiled. Unless callees are marked yk_noinline, the JIT will seek to inline them into the trace because, generally speaking, the more the JIT can inline, the more optimal the JITted code will be.

In order to inline a function call, the JIT needs to have LLVM IR for the callee. Yk uses fat LTO to collect (and embed into the resulting binary) a "full-program" IR for your interpreter. yk-config provides the relevant clang flags to make this happen. You should make sure that the build system of your interpreter uses the relevant flags. Namely yk-config --cppflgs --cflags for compiling C code, and yk-config --ldflags --libs for linking.

It follows that shared objects, which at the time of writing cannot take part in LTO, cannot be inlined into traces. If your interpreter dlopen()s shared objects at runtime (as is common for C extensions) Yk will be unable to trace the newly loaded code.

Symbol visibility

The JIT relies upon the use of dlsym() at runtime in order to lookup any given symbol from its virtual address. For this to work all symbols must be exposed in the dynamic symbol table.

yk-config provides flags to put every function's symbol into the dynamic symbol table. Since distinct symbols of the same name can exist (e.g. static functions in C), but dynamic symbol names must be unique, symbols may be mangled (mangling is done by the fat LTO module merger). If your interpreter does its own symbol introspection, Yk may break it.

Extra sections in your interpreter binary.

yk-config will add flags that add the following sections to your binary:

.llvmbc: LLVM bytecode for your interpreter. Used to construct traces. This is a standard LLVM section (but extended by Yk).
.llvm_bb_addr_map: The basic block address map. Used to map virtual addresses back to LLVM IR blocks. This is a standard LLVM section (but extended by Yk).
.llvm_stackmaps: Stackmap table. Used to identify the locations of live LLVM IR variables. This is a standard LLVM section (but extended by Yk).

Other interpreter requirements and gotchas

Yk can only currently work with "simple interpreter loop"-style interpreters and cannot yet handle unstructured interpreter loops (e.g. threaded dispatch).
Yk currently assumes that no new code is loaded at runtime (e.g. dlopen()), and that no code is unloaded (e.g. dlclose()). Self modifying interpreters will also confuse the JIT.
Yk currently doesn't handle calls to pthread_exit() gracefully (more details).
Yk currently doesn't handle setjmp()/longjmp().
You cannot valgrind an interpreter that is using Intel PT for tracing (more details).

Internals

Debugging / Testing

Compile-time features

`yk_testing`

The yk_testing Cargo feature is enabled whenever the tests crate is being compiled, so a regular cargo build in the root of the workspace will enable the feature (to build without the feature enabled, do cargo build -p ykcapi).

Run-time debugging / testing features

`YKD_SERIALISE_COMPILATION`

When YKD_SERIALISE_COMPILATION=1, calls to yk_control_point(loc) will block while loc is being compiled.

This variable is only available when building ykrt with the yk_testing Cargo feature enabled.

Working on yk

yk has several features designed to make it easier to work on yk itself. Most of these are transparent to the developer (e.g. rebuilding ykllvm when needed): in this page we document those that are not.

clangd

The yk build system generates compilation command databases for use with clangd. If you want diagnostics and/or completion in your editor (via an LSP), you will have to configure the LSP to use clangd (the automated build system puts a clangd binary into target/<debug|release>/ykllvm/bin that you could use).

Pull Requests

We welcome all and any contributions, from code to documentation to examples, submitted as pull requests to one of our GitHub repositories. Pull requests are reviewed by a yk contributor: changes may be requested in an ongoing dialogue between reviewee and reviewer; and, if and when agreement is reached, the pull request will be merged.

This page explains yk's approach to the pull request model, based on observations of good practise in other projects, and our own experiences. Please note that this is a living process: the details are subject to continual refinement and change, and no document can perfectly capture all the possibilities that will crop up in real life.

What makes a good pull request?

The typical aim of a pull request to a yk repository is to make a focussed change such as: add a new feature; fix a bug; or edit documentation. Good pull requests are:

Self-contained. A pull request should not try and change everything at once. Ideally a pull request fixes a single bug or adds a single feature; but sometimes a set of interdependent fixes or features makes more sense. In general, small pull requests are better than big pull requests.
Tested. Every piece of new functionality, and every bug fix, must have at least one accompanying test, unless testing is impractical. Authors should also strive not to break any other part of the system. yk runs various tests before a pull request is merged: this reduces, but does not remove, the chances of a merged pull request breaking expectations: careful thought and testing on the part of the pull request author are still required.
Documented. Code diffs show what has been changed, but not why. Documentation explains to other humans the context for a change (i.e. why something needed to be changed), the reason why the change is as it is (i.e. could the change have taken a different form? what alternatives were considered or attempted?), and the consequences of doing so (i.e. does this make something easier or harder in the future?). Documentation comes in the form of comments in code, "external" documentation (such as you read here), and commit and pull request messages. In whatever form it comes, documentation must be clear (as easy to understand as possible), concise (as short as possible while getting all the required points across), complete (not missing any important points), and necessary (not documenting that which is obvious).
Useful. The more people that will benefit from a pull request, the more likely it is to be merged. The bar for "useful" is set fairly low, but that does not mean that every pull request should be merged: for example, a pull request which simply changes code to your preferred style is not only not useful but may cause problems for other people working in parallel. However, porting the code to a little used platform is often useful (provided it doesn't cause undue problems for more widely used platforms).
Harmonious. As far as possible, new code should feel like a natural extension of existing code. Following existing naming conventions, for example, can sometimes grate, but internal consistency helps those who read and edit the code later.

The reviewer and reviewee covenant

The aim of pull request reviewing is to make yk better while ensuring that our quality standards are maintained. Ultimately, it is in yk's best interests for all pull requests which satisfy the criteria listed in the previous section to be merged in.

Both reviewer and reviewee have their part to play in making this process work. Most importantly, both parties must assume good faith on the part of the other: for example, questions are an opportunity to learn or explain, not to attack. Clear, polite, communication between both parties is required at all times. Reviewers should respond in a timely manner to comments, while understanding that reviewees may have many outside responsibilities that mean their responses are less timely.

Reviewers should help a reviewee meet the expected standards, via questioning and explicit guidance, without setting the bar unnecessarily high. Reviewers need to accept:

that a pull request cannot solve every problem, and some problems are best deferred to a future pull request;
and that some variance of individual style is inevitable and acceptable.

Put another way, while we set high standards, and require all contributions to meet them, we are not foolish enough to expect perfection. In particular, reviewers must be welcoming to newcomers, who may not be familiar with our processes or standards, and adjust accordingly.

The pull request process in yk repositories

To raise a pull request, you must:

Fork the relevant yk GitHub repository to your account. Note that pull requests must never come from a branch on a yk repository.
Create a new branch in your fork. If you are unsure about the name, start with something generic and rename it later with git branch -m <new name>.
Make your changes and commit them. It is not only allowed, but often best, to have multiple commits: each commit should be a logical change building upon one of its predecessors. Each commit should be capable of passing all tests successfully (amongst other things, to avoid breaking git bisect).
Push your branch to your GitHub fork and raise a pull request. Give the pull request a meaningful title (for example, "Fix bug" is not helpful but "Deal with an empty list correctly" is) and a description (empty descriptions are almost never acceptable). If the pull request fixes all aspects of a GitHub issue, add the text Fixes #<GitHub issue number>, as a line on its own, into the description: GitHub will then automatically close that issue when the pull request is merged.

Your pull request has now been raised and will be reviewed by a yk contributor. The aim of a review is two fold: to ensure that contributions to yk are of an acceptable quality; and that at least two people (the reviewer and reviewee) understand what was changed and why.

The reviewer will comment in detail on specific lines of the commit and on the overall pull request. Comments might be queries (e.g. "can the list ever be empty at this point?") or corrections (e.g. "we need to deal with the possibility of an empty list at this point"). For each comment you can either:

Address the issue. For example, if the reviewer has pointed out correctly that something needs to be changed then:
1. make a small additional commit (one per fix);
2. push it to your branch;
3. and, in the same place as the reviewer made their comment, add a new comment Fixed in <git hash>.
The reviewer will then review your change and either: mark the conversation as "resolved" if their point is adequately addressed; or raise further comments otherwise.

Note that the reviewee must not: mark conversations as resolved (only the reviewer should do so); force push updates to your branch unless explicitly requested to do so by the reviewer (always make new commits on the existing branch).
Add a query. You might not understand the reviewer's question: it's fine to ask for clarification.
Explain why you think there is not an issue. The reviewer might have misunderstood either the details of a change, or the pull request's context (e.g. the pull request might not have intended to fix every possible issue). A complete, and polite explanation, can help both reviewer and reviewee focus on the essentials.

Often multiple rounds of comments, queries, and fixes are required. When the reviewer is happy with the changes, they will ask the reviewee to "please squash" their pull request. The aim here is to provide a readable sequence of commits for those who later need to look at the history of changes to a particular part of the repository. The general aim is to provide a pleasing sequence of individually documented commits which clearly articulate a change to future readers. At the very least, all of the "fix commits" must be merged away: commonly, these are merged into the main commits, or, though rarely, into a new commit(s). It is not required, and is often undesirable, to squash all commits down into a single commit: when multiple commits can better explain a change, they are much preferred. During squashing, you should also check that commit messages still accurately document their contents: revise those which need updating.

The process of squashing is synonymous with git rebase, so when you are asked to squash it is also acceptable to: rebase the commit against the master branch at the same time; and force push the resulting rebased branch.

Not that being asked to squash or to update a commit message are the only times you may force push an update to a branch. In both cases, the reviewee must only do so when explicitly asked to by a reviewer. If you are unsure, please ask a reviewer before force pushing.

Documentation

Most of us prefer programming to writing documentation, whether that documentation is in the form of a comment or commit description (etc). That can lead us to rush, or omit, documentation, making life difficult for pull request reviewers and future contributors. We must prioritise creating and maintaining documentation.

yk's documentation must be clear, concise, and complete. A good rule of thumb is to write documentation to the quality you would like to read it: use "proper" English (e.g. avoid colloquialisms), capitalise and punctuate appropriately, and don't expect to phrase things perfectly on your first attempt.

In the context of pull requests, bear in mind that the code captures what has been changed, but not why: good documentation explains the context of a change, the reason the change is as it is, and the consequences of the change. The pull request itself should come with a description of what it aims to achieve and each individual commit must also contain a self-contained description of its changes.

Formatting

yk's continuous integration setup only accepts contributions of well-formatted code. Changed Rust code must be formatted with cargo fmt. Changed C++ code must be formatted with cargo xtask cfmt. Please run both before you raise PR, and rerun them each time you are about to make a commit in response to reviewer comments.

Automated testing

Before pull requests are merged into yk they must pass automated tests. yk uses bors and Buildbot to run the .buildbot.sh file in yk's root in a fresh Docker image: if that file executes successfully, the pull request is suitable for merging. Pull requests may edit .buildbot.sh as with any other file, though one must be careful not to slow down how long it takes to run unduly. In general, only yk contributors can issue bors commands, though they can in certain situations give external users the right to issue commands on a given pull request. Users given this privilege should use it responsibly.

Keyboard shortcuts

Yk