Introduction
This book contains documentation for the yk metatracing system. A preformatted version of this book can be found at https://ykjit.github.io/yk/.
This book is written in mdbook format. If you want to edit this book, running mdbook serve --open starts a local server, opens a formatted version of this book in your browser, and automatically refreshes the browser whenever you save a markdown file.
Major components
yk is a meta-tracing system that turns existing C interpreters into JIT-compiling VMs. It comprises two major components:
- ykllvm is a fork of LLVM that must be used to: compile C interpreters with the necessary extra information for yk; and link in the yk Rust library.
- The yk Rust library is the core of the run-time JIT system.
Terminology
yk utilises two different Intermediate Representations (IRs):
- AOT IR is the Ahead-Of-Time IR generated by ykllvm and embedded in the binary of a C interpreter. AOT IR is similar to LLVM IR, though customised and simplified for yk.
- JIT IR is the IR generated (from AOT IR) at run-time by yk, and which is dynamically converted into machine code.
There are three styles of "trace" in yk:
- When a hot loop in a program is detected, the actions of the interpreter are recorded to make an AOT IR trace.
- The AOT IR trace is combined with AOT IR and then compiled into JIT IR to make a JIT IR trace.
- The JIT IR trace is compiled into machine code to make an executable trace.
yk Internals
Installation
This section details how to get yk up and running.
System Requirements
At the time of writing, yk requires the following:
- A Linux system with a CPU that supports Intel Processor Trace (run grep intel_pt /proc/cpuinfo to check).
- Linux perf (for collecting PT traces).
- A recent nightly install of Rust.
Note that at present, non-root users can only use Yk if /proc/sys/kernel/perf_event_paranoid is set to -1.
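For example, to check the current setting and (as root) relax it until the next reboot:
$ cat /proc/sys/kernel/perf_event_paranoid
$ sudo sysctl -w kernel.perf_event_paranoid=-1
Note that a setting changed with sysctl -w does not persist across reboots.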
Building
Clone the main yk repository and build it with cargo:
$ git clone --recurse-submodules --depth 1 \
https://github.com/ykjit/yk/
$ cd yk
$ cargo build --release
Note that this will also clone ykllvm as a submodule of yk. If you later want access to the full git history, either remove --depth 1 or run git fetch --unshallow.
Available Interpreters
The following interpreters use Yk:
Development
This section explains how to set up and use yk.
Configuring the build
Start by following the general installation instructions.
The yk repo is a Rust workspace (i.e. a collection of crates). You can build and test in the usual ways using cargo. For example, to build and test the system, run:
cargo test
YKB_YKLLVM_BIN_DIR
Under normal circumstances, yk builds a copy of its LLVM fork "ykllvm", which it also uses to build interpreters (via the compiler's use of yk-config).
You can use your own ykllvm build by specifying the directory where the executables (e.g. clang, llvm-config, and so on) are stored with YKB_YKLLVM_BIN_DIR.
yk does not check your installation for compatibility: it is your responsibility to ensure that your ykllvm build matches that expected by yk.
It is also undefined behaviour to switch between defining and not defining this variable within a repository using yk (including the yk repository itself). If you want to set/unset YKB_YKLLVM_BIN_DIR, run cargo clean in any repositories using yk before rebuilding them.
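For example, assuming a prebuilt ykllvm whose binaries live in /opt/ykllvm/bin (a hypothetical path):
$ export YKB_YKLLVM_BIN_DIR=/opt/ykllvm/bin
$ cargo build --release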
YKB_TRACER
The YKB_TRACER environment variable allows building yk with either hwt (Hardware Tracer) or swt (Software Tracer):
- hwt: relies on Intel PT, suitable only for x86 CPUs supporting it.
- swt: CPU architecture-independent, but with fewer features compared to hwt.
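For example, to build yk with the software tracer:
$ YKB_TRACER=swt cargo build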
Run-time configuration
There are a number of environment variables which control the run-time behaviour of the yk system.
General configuration
Variables prefixed with YK_ allow the user to control aspects of yk's execution.
The following environment variables are available:
- YK_LOG=[<path>:]<level> specifies where, and how much, general information yk will log during execution. If <path>: (i.e. a path followed by ":") is specified then output is sent to that path; the special value - (i.e. a single dash) can be used for <path> to indicate stderr. If no path is specified, yk logs to stderr. <level> specifies the level of logging, with each level adding to the previous: level 0 turns off all yk logging; level 1 shows major errors only; and level 2 adds warnings. Levels of 3 and above are used for internal yk debugging, and their precise output, and indeed the maximum level, may change without warning. Currently: level 3 logs transitions of a Location; and level 4 logs JIT events such as starting/stopping tracing. Note that some information, at all levels, may or may not be displayed based on compile-time options. Defaults to 1.
- YK_HOT_THRESHOLD: an integer from 0 to 4294967295 (both inclusive) that determines how many executions of a hot loop are needed before it is traced. Defaults to 50. See the example below.
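For example, to run an interpreter with warnings logged to stderr and a lower hot threshold (./myinterp is a hypothetical interpreter binary):
$ YK_LOG=-:2 YK_HOT_THRESHOLD=10 ./myinterp program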
Debugging
Variables prefixed with YKD_
are intended to help interpreter authors debug
performance issues. Some are only available in certain compile-time
configurations of yk, either because they increase the binary size, or slow
performance down.
The following environment variables are available (some only in certain configurations of yk):
- YKD_LOG_IR [with the ykd feature]
- YKD_LOG_STATS
Debugging
Trace optimisation
Trace optimisation can make it difficult to understand why a yk interpreter has
behaved in the way it does. It is worth trying to run your code with the
optimiser turned off. You can do this with the YKD_OPT
environment
variable, which takes the following values:
- 1: turn the optimiser on. Default if not otherwise specified.
- 0: turn the optimiser off.
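For example, to run the hypothetical ./myinterp with trace optimisation disabled:
$ YKD_OPT=0 ./myinterp program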
Debugging JITted code
Often you will find the need to inspect JITted code with a debugger. If the problem trace comes from a C test (i.e. one of the test cases under tests/c), then you can use the gdb_c_test tool.
The tool automates the compilation and invocation of the resulting binary under GDB.
The simplest invocation of gdb_c_test (from the top-level of the yk repo) would look like:
bin/gdb_c_test simple.c
This will automatically compile and run the tests/c/simple.c test under GDB. This is ideal if you have a crashing trace, as it will dump you into a GDB shell at the time of the crash.
To see what else you can do with gdb_c_test, run:
bin/gdb_c_test --help
For help on using GDB, see the GDB documentation.
GDB plugin
Yk comes with a GDB plugin that allows the debugger to show higher-level information in the source view window.
The plugin is built by default and put in target/yk_gdb_plugin.so.
To use it, put this line in ~/.gdbinit:
jit-reader-load /path/to/yk/target/yk_gdb_plugin.so
Then when you run GDB, you should see:
Yk JIT support loaded.
When you are inside JITted code, the source view will show higher-level debugging information. You can show the assembler and source views on one GDB screen using the "split" layout. Type:
la spl
(i.e. the abbreviated form of GDB's layout split command).
Profiling
This section describes how best to profile yk and interpreters.
JIT statistics
At the end of an interpreter run, yk can print out some simple statistics about what happened during execution. If the YKD_LOG_STATS=<path> environment variable is defined, then JSON statistics will be written to the file at <path> once the interpreter "drops" the YkMt instance. The special value - (i.e. a single dash) can be used for <path> to indicate stderr.
Note that if the interpreter starts multiple yk instances, then the contents of <path> are undefined (at best the file will be nondeterministically overwritten as instances are "dropped", but output may be interleaved, or otherwise bizarre).
Output from YKD_LOG_STATS looks as follows:
{
"duration_compiling": 5.5219,
"duration_deopting": 2.2638,
"duration_jit_executing": 0.2,
"duration_outside_yk": 0.142,
"duration_trace_mapping": 3.3797,
"traces_collected_err": 0,
"traces_collected_ok": 11,
"traces_compiled_err": 1,
"traces_compiled_ok": 10
}
Fields and their meaning are as follows:
- duration_compiling. Float, seconds. How long was spent compiling traces?
- duration_deopting. Float, seconds. How long was spent deoptimising from failed guards?
- duration_jit_executing. Float, seconds. How long was spent executing JIT compiled code?
- duration_outside_yk. Float, seconds. How long was spent outside yk? This is a proxy for "how much time was spent in the interpreter", but is inherently an over-approximation because we can't truly know exactly what the system outside Yk counts as "interpreting" or not. For example, if an interpreter thread puts itself to sleep, we will still count it as time spent "outside yk".
- duration_trace_mapping. Float, seconds. How long was spent mapping a "raw" trace to compiler-ready IR?
- trace_executions. Unsigned integer. How many times have traces been executed? Note that the same trace can count arbitrarily many times to this.
- traces_collected_err. Unsigned integer. How many traces were collected unsuccessfully?
- traces_collected_ok. Unsigned integer. How many traces were collected successfully?
- traces_compiled_err. Unsigned integer. How many traces were compiled unsuccessfully?
- traces_compiled_ok. Unsigned integer. How many traces were compiled successfully?
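For example, to print statistics to stderr when the (hypothetical) ./myinterp exits:
$ YKD_LOG_STATS=- ./myinterp program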
Perf
On Linux, perf
can be used to profile yk. You first need to record an
execution of an interpreter and then separately view the profiling data that
was generated.
Recording a Profile
To record a profile we first recommend compiling yk with debugging info embedded. cargo's debug profile does this automatically, but because no code optimisation is performed, the profiles are unrepresentative. We recommend using yk's provided release-with-debug profile, which turns on --release-style code optimisation and embeds debugging information:
$ cargo build --profile=release-with-debug
Ensure that the interpreter you are profiling links to the appropriate version of yk and then call perf record:
$ perf record --call-graph dwarf -g ./interpreter ...args...
This uses --call-graph dwarf to force perf to use DWARF debugging information: this will only be useful if you have compiled yk with embedded debugging information, as recommended above.
Viewing a profile
perf profiles can be visualised in a number of ways. When using perf report or perf script we currently recommend passing --no-inline to avoid the huge processing time incurred by indirectly running addr2line (note that this might change in the future).
Terminal
To quickly view a profile in the terminal:
$ perf report -g --no-inline
Firefox profiler
After processing perf's output, you can use Firefox's Profiler to view the data locally. Note that this does not upload the data --- all processing happens in your browser! First process the data:
$ perf script -F +pid --no-inline > out.perf
Then go to the Firefox Profiler page, press "Load a profile from file" and upload out.perf.
Flame graphs
You can make a flame graph using the Rust flamegraph tool. Install it with cargo install flamegraph and then use flamegraph to profile and produce a flamegraph in one go with:
$ /path/to/cargo/bin/flamegraph --no-inline -- ./interpreter ...args...
Note that flamegraph passes --call-graph=dwarf to perf record by default (pass -v to see the exact perf invocation used). This will produce an svg file which you can then view.
Understanding Traces
yk can print the traces it has created to stderr
to help with debugging.
However, these traces are often lengthy, and not always easy to understand.
This section briefly explains how to get yk to print its traces, and how
to make them a bit easier to understand.
Producing a trace
YKD_LOG_IR
YKD_LOG_IR=[<path>:]<irstage_1>[,...,<irstage_n>] logs IR from different stages to <path>. The special value - (i.e. a single dash) can be used for <path> to indicate stderr.
The following irstages are supported:
- aot: the entire AOT IR for the interpreter.
- jit-pre-opt: the JIT IR trace before optimisation.
- jit-post-opt: the JIT IR trace after optimisation.
- jit-asm: the assembler code of the compiled JIT IR trace.
- jit-asm-full: the assembler code of the compiled JIT IR trace, with instruction offsets and virtual addresses annotated.
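For example, to log the JIT IR trace before and after optimisation to stderr (./myinterp is a hypothetical interpreter binary):
$ YKD_LOG_IR=-:jit-pre-opt,jit-post-opt ./myinterp program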
Gotchas
Yk has some unusual implications, requirements, and limitations that interpreter authors should be aware of.
Trace quality depends upon completeness of interpreter IR.
Yk works by recording traces (i.e. individual control flow paths) of your
interpreter's implementation. Each trace ends up as an ordered list of LLVM IR
blocks which are "stitched together" and compiled. Unless callees are marked yk_noinline, the JIT will seek to inline them into the trace because, generally speaking, the more the JIT can inline, the more optimal the JITted code will be.
In order to inline a function call, the JIT needs to have LLVM IR for the
callee. Yk uses fat LTO to collect (and embed into the resulting binary) a
"full-program" IR for your interpreter. yk-config
provides the relevant clang
flags to make this happen. You should make sure that the build system of your
interpreter uses the relevant flags. Namely yk-config --cppflgs --cflags
for
compiling C code, and yk-config --ldflags --libs
for linking.
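A minimal sketch of compiling and linking a single-file interpreter (interp.c is a hypothetical file; real build systems will differ, and yk-config must be on your PATH):
$ clang $(yk-config --cppflags --cflags) -c interp.c -o interp.o
$ clang interp.o $(yk-config --ldflags --libs) -o interp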
It follows that shared objects, which at the time of writing cannot take part in LTO, cannot be inlined into traces. If your interpreter dlopen()s shared objects at runtime (as is common for C extensions), Yk will be unable to trace the newly loaded code.
Symbol visibility
The JIT relies upon the use of dlsym() at runtime in order to look up any given symbol from its virtual address. For this to work, all symbols must be exposed in the dynamic symbol table.
yk-config
provides flags to put every function's symbol into the dynamic
symbol table. Since distinct symbols of the same name can exist (e.g. static
functions in C), but dynamic symbol names must be unique, symbols may be
mangled (mangling is done by the fat LTO module merger). If your interpreter
does its own symbol introspection, Yk may break it.
Extra sections in your interpreter binary.
yk-config will add flags that add the following sections to your binary:
- .llvmbc: LLVM bytecode for your interpreter. Used to construct traces. This is a standard LLVM section (but extended by Yk).
- .llvm_bb_addr_map: the basic block address map. Used to map virtual addresses back to LLVM IR blocks. This is a standard LLVM section (but extended by Yk).
- .llvm_stackmaps: stackmap table. Used to identify the locations of live LLVM IR variables. This is a standard LLVM section (but extended by Yk).
Other interpreter requirements and gotchas
- Yk can only currently work with "simple interpreter loop"-style interpreters and cannot yet handle unstructured interpreter loops (e.g. threaded dispatch).
- Yk currently assumes that no new code is loaded at runtime (e.g. dlopen()), and that no code is unloaded (e.g. dlclose()). Self-modifying interpreters will also confuse the JIT.
- Yk currently doesn't handle calls to pthread_exit() gracefully (more details).
- Yk currently doesn't handle setjmp()/longjmp().
- You cannot valgrind an interpreter that is using Intel PT for tracing (more details).
Internals
Debugging / Testing
Compile-time features
yk_testing
The yk_testing Cargo feature is enabled whenever the tests crate is being compiled, so a regular cargo build in the root of the workspace will enable the feature (to build without the feature enabled, do cargo build -p ykcapi).
Run-time debugging / testing features
YKD_SERIALISE_COMPILATION
When YKD_SERIALISE_COMPILATION=1, calls to yk_control_point(loc) will block while loc is being compiled.
This variable is only available when building ykrt with the yk_testing Cargo feature enabled.
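For example, assuming an interpreter built against a ykrt with yk_testing enabled (./myinterp is hypothetical):
$ YKD_SERIALISE_COMPILATION=1 ./myinterp program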
Working on yk
yk has several features designed to make it easier to work on yk itself. Most of these are transparent to the developer (e.g. rebuilding ykllvm when needed): in this page we document those that are not.
clangd
The yk build system generates compilation command databases for use with clangd. If you want diagnostics and/or completion in your editor (via an LSP), you will have to configure the LSP to use clangd (the automated build system puts a clangd binary into target/<debug|release>/ykllvm/bin that you could use).
Pull Requests
We welcome all and any contributions, from code to documentation to examples, submitted as pull requests to one of our GitHub repositories. Pull requests are reviewed by a yk contributor: changes may be requested in an ongoing dialogue between reviewee and reviewer; and, if and when agreement is reached, the pull request will be merged.
This page explains yk's approach to the pull request model, based on observations of good practise in other projects, and our own experiences. Please note that this is a living process: the details are subject to continual refinement and change, and no document can perfectly capture all the possibilities that will crop up in real life.
What makes a good pull request?
The typical aim of a pull request to a yk repository is to make a focussed change such as: add a new feature; fix a bug; or edit documentation. Good pull requests are:
- Self-contained. A pull request should not try and change everything at once. Ideally a pull request fixes a single bug or adds a single feature; but sometimes a set of interdependent fixes or features makes more sense. In general, small pull requests are better than big pull requests.
- Tested. Every piece of new functionality, and every bug fix, must have at least one accompanying test, unless testing is impractical. Authors should also strive not to break any other part of the system. yk runs various tests before a pull request is merged: this reduces, but does not remove, the chances of a merged pull request breaking expectations: careful thought and testing on the part of the pull request author are still required.
- Documented. Code diffs show what has been changed, but not why. Documentation explains to other humans the context for a change (i.e. why something needed to be changed), the reason why the change is as it is (i.e. could the change have taken a different form? what alternatives were considered or attempted?), and the consequences of doing so (i.e. does this make something easier or harder in the future?). Documentation comes in the form of comments in code, "external" documentation (such as you read here), and commit and pull request messages. In whatever form it comes, documentation must be clear (as easy to understand as possible), concise (as short as possible while getting all the required points across), complete (not missing any important points), and necessary (not documenting that which is obvious).
- Useful. The more people that will benefit from a pull request, the more likely it is to be merged. The bar for "useful" is set fairly low, but that does not mean that every pull request should be merged: for example, a pull request which simply changes code to your preferred style is not only not useful but may cause problems for other people working in parallel. However, porting the code to a little used platform is often useful (provided it doesn't cause undue problems for more widely used platforms).
- Harmonious. As far as possible, new code should feel like a natural extension of existing code. Following existing naming conventions, for example, can sometimes grate, but internal consistency helps those who read and edit the code later.
The reviewer and reviewee covenant
The aim of pull request reviewing is to make yk better while ensuring that our quality standards are maintained. Ultimately, it is in yk's best interests for all pull requests which satisfy the criteria listed in the previous section to be merged in.
Both reviewer and reviewee have their part to play in making this process work. Most importantly, both parties must assume good faith on the part of the other: for example, questions are an opportunity to learn or explain, not to attack. Clear, polite, communication between both parties is required at all times. Reviewers should respond in a timely manner to comments, while understanding that reviewees may have many outside responsibilities that mean their responses are less timely.
Reviewers should help a reviewee meet the expected standards, via questioning and explicit guidance, without setting the bar unnecessarily high. Reviewers need to accept:
- that a pull request cannot solve every problem, and some problems are best deferred to a future pull request;
- and that some variance of individual style is inevitable and acceptable.
Put another way, while we set high standards, and require all contributions to meet them, we are not foolish enough to expect perfection. In particular, reviewers must be welcoming to newcomers, who may not be familiar with our processes or standards, and adjust accordingly.
The pull request process in yk repositories
To raise a pull request, you must:
- Fork the relevant yk GitHub repository to your account. Note that pull requests must never come from a branch on a yk repository.
- Create a new branch in your fork. If you are unsure about the name, start with something generic and rename it later with git branch -m <new name>.
- Make your changes and commit them. It is not only allowed, but often best, to have multiple commits: each commit should be a logical change building upon one of its predecessors. Each commit should be capable of passing all tests successfully (amongst other things, to avoid breaking git bisect).
- Push your branch to your GitHub fork and raise a pull request (see the sketch after this list). Give the pull request a meaningful title (for example, "Fix bug" is not helpful but "Deal with an empty list correctly" is) and a description (empty descriptions are almost never acceptable). If the pull request fixes all aspects of a GitHub issue, add the text Fixes #<GitHub issue number>, as a line on its own, into the description: GitHub will then automatically close that issue when the pull request is merged.
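A minimal sketch of that workflow at the command line (your GitHub username and branch name will differ):
$ git clone git@github.com:<your-username>/yk.git
$ cd yk
$ git checkout -b fix-empty-list
# ...edit files, then commit...
$ git commit -am "Deal with an empty list correctly"
$ git push origin fix-empty-list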
Your pull request has now been raised and will be reviewed by a yk contributor. The aim of a review is two fold: to ensure that contributions to yk are of an acceptable quality; and that at least two people (the reviewer and reviewee) understand what was changed and why.
The reviewer will comment in detail on specific lines of the commit and on the overall pull request. Comments might be queries (e.g. "can the list ever be empty at this point?") or corrections (e.g. "we need to deal with the possibility of an empty list at this point"). For each comment you can either:
- Address the issue. For example, if the reviewer has pointed out correctly that something needs to be changed then:
  - make a small additional commit (one per fix);
  - push it to your branch;
  - and, in the same place as the reviewer made their comment, add a new comment Fixed in <git hash>.
The reviewer will then review your change and either: mark the conversation as "resolved" if their point is adequately addressed; or raise further comments otherwise.
Note that the reviewee must not: mark conversations as resolved (only the reviewer should do so); or force push updates to the branch unless explicitly requested to do so by the reviewer (always make new commits on the existing branch).
- Add a query. You might not understand the reviewer's question: it's fine to ask for clarification.
- Explain why you think there is not an issue. The reviewer might have misunderstood either the details of a change, or the pull request's context (e.g. the pull request might not have intended to fix every possible issue). A complete and polite explanation can help both reviewer and reviewee focus on the essentials.
Often multiple rounds of comments, queries, and fixes are required. When the reviewer is happy with the changes, they will ask the reviewee to "please squash" their pull request. The aim here is to provide a readable sequence of individually documented commits which clearly articulate a change to those who later need to look at the history of changes to a particular part of the repository. At the very least, all of the "fix commits" must be merged away: commonly, these are merged into the main commits or, more rarely, into one or more new commits. It is not required, and is often undesirable, to squash all commits down into a single commit: when multiple commits can better explain a change, they are much preferred. During squashing, you should also check that commit messages still accurately document their contents: revise those which need updating.
The process of squashing is synonymous with git rebase, so when you are asked to squash it is also acceptable to: rebase the commit against the master branch at the same time; and force push the resulting rebased branch.
Note that being asked to squash or to update a commit message are the only times you may force push an update to a branch. In both cases, the reviewee must only do so when explicitly asked to by a reviewer. If you are unsure, please ask a reviewer before force pushing.
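A typical squash might look like the following sketch (remote and branch names are illustrative):
$ git fetch origin
$ git rebase -i origin/master
$ git push --force-with-lease origin fix-empty-list
During the interactive rebase, mark each "fix commit" as fixup (or squash) against its parent.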
Documentation
Most of us prefer programming to writing documentation, whether that documentation is in the form of a comment or commit description (etc). That can lead us to rush, or omit, documentation, making life difficult for pull request reviewers and future contributors. We must prioritise creating and maintaining documentation.
yk's documentation must be clear, concise, and complete. A good rule of thumb is to write documentation to the quality you would like to read it: use "proper" English (e.g. avoid colloquialisms), capitalise and punctuate appropriately, and don't expect to phrase things perfectly on your first attempt.
In the context of pull requests, bear in mind that the code captures what has been changed, but not why: good documentation explains the context of a change, the reason the change is as it is, and the consequences of the change. The pull request itself should come with a description of what it aims to achieve and each individual commit must also contain a self-contained description of its changes.
Formatting
yk's continuous integration setup only accepts contributions of well-formatted code. Changed Rust code must be formatted with cargo fmt. Changed C++ code must be formatted with cargo xtask cfmt. Please run both before you raise a PR, and rerun them each time you are about to make a commit in response to reviewer comments.
Automated testing
Before pull requests are merged into yk they must pass automated tests. yk uses bors and Buildbot to run the .buildbot.sh file in yk's root in a fresh Docker image: if that file executes successfully, the pull request is suitable for merging. Pull requests may edit .buildbot.sh as with any other file, though one must be careful not to unduly slow down how long it takes to run. In general, only yk contributors can issue bors commands, though they can in certain situations give external users the right to issue commands on a given pull request. Users given this privilege should use it responsibly.