This module implements the basic framework for the compiler's
self-profiling support. It provides the SelfProfiler type which enables
recording "events". An event is something that starts and ends at a given
point in time and has an ID and a kind attached to it. This allows for
tracing the compiler's activity.
Internally, this module uses the custom-tailored measureme crate to record
events to disk efficiently, in a compact format that can be post-processed
and analyzed by the suite of tools in the measureme project. The highest
priority for the tracing framework is incurring as little overhead as
possible.
Events have a few properties:
- The event_kind designates the broad category of an event (e.g. does it correspond to the execution of a query provider or to loading something from the incr. comp. on-disk cache, etc).
- The event_id designates the query invocation or function call it corresponds to, possibly including the query key or function arguments.
- Each event stores the ID of the thread it was recorded on.
- The timestamp stores beginning and end of the event, or the single point in time it occurred at for "instant" events.
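The four properties above can be pictured as a plain data structure. The following is a minimal sketch for illustration only: the type, variant, and field names are assumptions, not the types measureme or the compiler actually use.

```rust
use std::time::Duration;

/// Broad category of an event (variant names are illustrative assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EventKind {
    QueryProvider,
    IncrCacheLoad,
    GenericActivity,
}

/// Either an interval with a beginning and an end, or a single point in time.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Timestamp {
    Interval { start: Duration, end: Duration },
    Instant(Duration),
}

/// One recorded event with the four properties listed above.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Event {
    event_kind: EventKind,
    event_id: String,
    thread_id: u32,
    timestamp: Timestamp,
}

impl Event {
    /// Length of an interval event; `None` for instant events
    /// (and for a malformed interval whose end precedes its start).
    fn duration(&self) -> Option<Duration> {
        match self.timestamp {
            Timestamp::Interval { start, end } => end.checked_sub(start),
            Timestamp::Instant(_) => None,
        }
    }
}
```

Timestamps are modelled here as offsets from the start of the profiling session. Storing the event_id as an owned String is exactly the cost that the ID scheme described later is designed to avoid.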
Event generation can be filtered by event kind. Recording all possible
events generates a lot of data, much of which is not needed for most kinds
of analysis. So, in order to keep overhead as low as possible for a given
use case, the
SelfProfiler will only record the kinds of events that
pass the filter specified as a command line argument to the compiler.
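One way to picture filtering by event kind is a bit mask that is consulted before anything is recorded. This is a sketch under stated assumptions: the EventFilter type, its constants, the filter names accepted by parse, and the Profiler stand-in are all hypothetical and do not reproduce the compiler's actual filter set or command line syntax.

```rust
/// Bit mask selecting which kinds of events get recorded (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EventFilter(u32);

impl EventFilter {
    const NONE: EventFilter = EventFilter(0);
    const GENERIC_ACTIVITIES: EventFilter = EventFilter(1 << 0);
    const QUERY_PROVIDERS: EventFilter = EventFilter(1 << 1);
    const INCR_CACHE_LOADS: EventFilter = EventFilter(1 << 2);

    fn union(self, other: EventFilter) -> EventFilter {
        EventFilter(self.0 | other.0)
    }

    /// True if every bit of `kind` is enabled in `self`.
    fn contains(self, kind: EventFilter) -> bool {
        (self.0 & kind.0) == kind.0
    }

    /// Parse a comma-separated list of (made-up) filter names.
    fn parse(spec: &str) -> Result<EventFilter, String> {
        let mut filter = EventFilter::NONE;
        for name in spec.split(',').map(str::trim).filter(|s| !s.is_empty()) {
            let flag = match name {
                "generic-activity" => EventFilter::GENERIC_ACTIVITIES,
                "query-provider" => EventFilter::QUERY_PROVIDERS,
                "incr-cache-load" => EventFilter::INCR_CACHE_LOADS,
                other => return Err(format!("unknown event filter: {}", other)),
            };
            filter = filter.union(flag);
        }
        Ok(filter)
    }
}

/// Toy profiler that only keeps events whose kind passes the filter.
struct Profiler {
    filter: EventFilter,
    recorded: Vec<&'static str>,
}

impl Profiler {
    fn record(&mut self, kind: EventFilter, label: &'static str) {
        // The check is a single mask test, so filtered-out events cost almost nothing.
        if self.filter.contains(kind) {
            self.recorded.push(label);
        }
    }
}
```

Because the check happens before any event data is assembled, event kinds that are switched off add only a mask test to the hot path.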
As far as measureme is concerned, event_ids are just strings. However,
it would incur too much overhead to generate and persist each event_id
string at the point where the event is recorded. To make this more
efficient, measureme has two features:
- Strings can share their content, so that recurring parts don't have to be copied over and over again. One allocates a string in measureme and gets back a StringId, which is then used to refer to that string. measureme strings are actually DAGs of string components so that arbitrary sharing of substrings can be done efficiently. This is useful because event_ids contain lots of redundant text like query names or def-path components.
- StringIds can be "virtual", which means that the client picks a numeric ID according to some application-specific scheme and can later map that ID to an actual string. This is used to cheaply generate event_ids while the events actually occur, causing little timing distortion, and then later map those StringIds, in bulk, to actual event_id strings. This way the largest part of the tracing overhead is localized to one contiguous chunk of time.
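The two features can be illustrated with a toy string table. This is only a sketch of the idea and not measureme's API or on-disk format: StringTable, Component, the method names, and the FIRST_CONCRETE_ID split between virtual and concrete IDs are assumptions made for the example.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct StringId(u32);

/// Hypothetical split: IDs below this value are virtual, the rest are concrete.
const FIRST_CONCRETE_ID: u32 = 1_000_000;

/// A string is a list of components: literal text or a reference to another
/// string. References are what turn the table into a DAG with shared parts.
enum Component {
    Text(String),
    Ref(StringId),
}

#[derive(Default)]
struct StringTable {
    /// Concrete strings, indexed by `id - FIRST_CONCRETE_ID`.
    strings: Vec<Vec<Component>>,
    /// Virtual IDs picked by the client, mapped to concrete strings later.
    virtual_map: HashMap<StringId, StringId>,
}

impl StringTable {
    /// Allocate a concrete string and return its ID.
    fn alloc(&mut self, components: Vec<Component>) -> StringId {
        let id = StringId(FIRST_CONCRETE_ID + self.strings.len() as u32);
        self.strings.push(components);
        id
    }

    /// Map a client-chosen virtual ID to an already allocated concrete string.
    fn map_virtual(&mut self, virtual_id: StringId, concrete: StringId) {
        assert!(virtual_id.0 < FIRST_CONCRETE_ID);
        assert!(concrete.0 >= FIRST_CONCRETE_ID);
        self.virtual_map.insert(virtual_id, concrete);
    }

    /// Reassemble the full text; `None` if the ID is unknown or not yet mapped.
    fn resolve(&self, id: StringId) -> Option<String> {
        let concrete = if id.0 < FIRST_CONCRETE_ID {
            *self.virtual_map.get(&id)?
        } else {
            id
        };
        let components = self.strings.get((concrete.0 - FIRST_CONCRETE_ID) as usize)?;
        let mut out = String::new();
        for component in components {
            match component {
                Component::Text(text) => out.push_str(text),
                Component::Ref(other) => out.push_str(&self.resolve(*other)?),
            }
        }
        Some(out)
    }
}
```

In this sketch a query name is stored once and referenced from every event_id that contains it, and a virtual ID resolves to nothing until it has been mapped, which mirrors the "record now, name later" idea described above.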
How are these
event_ids generated in the compiler? For things that occur
infrequently (e.g. "generic activities"), we just allocate the string the
first time it is used and then keep the
StringId in a hash table. This
is implemented in
For queries it gets more interesting: First we need a unique numeric ID for
each query invocation (the
QueryInvocationId). This ID is used as the
StringId we use as
event_id for a given event. This ID has to
be available both when the query is executed and later, together with the
query key, when we allocate the actual
event_id strings in bulk.
We could make the compiler generate and keep track of such an ID for each
query invocation, but luckily we already have something that fits all the
requirements: the query's DepNodeIndex. So we use the numeric value of the
DepNodeIndex as the event_id when recording the event and then, just before
the query context is dropped, we walk the entire query cache (which stores
the DepNodeIndex along with the query key for each invocation) and allocate
the corresponding strings together with a mapping for
DepNodeIndex as StringId.
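Both ways of assigning an event_id described above can be sketched in a few lines. Everything here is a simplified assumption: the Profiler stand-in, its method names, the FIRST_CONCRETE_ID split, and the representation of the query cache as a slice of (key, index) pairs are hypothetical and not the compiler's real data structures.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct StringId(u32);

/// Stand-in for the compiler's `DepNodeIndex` (hypothetical definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct DepNodeIndex(u32);

/// Hypothetical split: IDs below this value are virtual, the rest are concrete.
const FIRST_CONCRETE_ID: u32 = 1_000_000;

#[derive(Default)]
struct Profiler {
    /// Event IDs in the order the events were recorded.
    recorded: Vec<StringId>,
    /// Final `event_id` text for each ID, virtual or concrete.
    strings: HashMap<StringId, String>,
    /// Cache for infrequent events: label -> already allocated ID.
    string_cache: HashMap<String, StringId>,
}

impl Profiler {
    /// Infrequent events: allocate the string on first use, then reuse its ID.
    fn cached_string_id(&mut self, label: &str) -> StringId {
        if let Some(&id) = self.string_cache.get(label) {
            return id;
        }
        let id = StringId(FIRST_CONCRETE_ID + self.string_cache.len() as u32);
        self.string_cache.insert(label.to_string(), id);
        self.strings.insert(id, label.to_string());
        id
    }

    /// Hot path for queries: no string is built, the index value is the ID.
    fn record_query_event(&mut self, index: DepNodeIndex) {
        debug_assert!(index.0 < FIRST_CONCRETE_ID);
        self.recorded.push(StringId(index.0));
    }

    /// Cold path: walk the query cache once and allocate all strings in bulk.
    fn alloc_query_strings(&mut self, query_name: &str, cache: &[(String, DepNodeIndex)]) {
        for (key, index) in cache {
            let text = format!("{}({})", query_name, key);
            self.strings.insert(StringId(index.0), text);
        }
    }

    /// Look up the text of an `event_id`; `None` until it has been allocated.
    fn event_id_string(&self, id: StringId) -> Option<&str> {
        self.strings.get(&id).map(String::as_str)
    }
}
```

The point of the split is visible in record_query_event: while events are being timed, only an integer is stored, and all string formatting is deferred to a single pass over the cache at the end.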
Something that uniquely identifies a query invocation.
A reference to the SelfProfiler. It can be cloned and sent across thread boundaries at will.
MmapSerializationSink is faster on macOS and Linux, but FileSerializationSink is faster on Windows.