If a new process is spawned after the recording has started (processes
spawned *by* sysprof are done before recording starts) then try to extract
information about that process and append it to the recording.
The goal here is to get enough process information to actually decode
samples from the process without creating fork()/exec() amplification.
Related GNOME/gnome-builder#2090
This is more of what we want to be doing anyway; we don't care about all
the forks in existence.
Additionally, include the comm[] with the pid so that instruments can take
action based on it.
This is by no means perfect, but it gets the kernel tasks running on my
machine out of the profiles. We will no doubt need to add more in the
future, or find a way to record a flag for that in the capture format.
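
As a rough sketch of what acting on comm[] could look like, assuming a
simple prefix match: the helper name and the prefix list below are
illustrative assumptions, not the list this change actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative prefixes only; NOT the list used by this change. */
static const char *const kernel_comm_prefixes[] = {
  "kworker/", "ksoftirqd/", "migration/", "rcu_",
};

/* Hypothetical helper: guess from comm[] whether a pid is a kernel task. */
static bool
comm_looks_like_kernel_task (const char *comm)
{
  size_t n = sizeof kernel_comm_prefixes / sizeof kernel_comm_prefixes[0];

  assert (comm != NULL);

  for (size_t i = 0; i < n; i++)
    {
      const char *prefix = kernel_comm_prefixes[i];

      /* Match only at the start of comm. */
      if (strncmp (comm, prefix, strlen (prefix)) == 0)
        return true;
    }

  return false;
}
```

A userspace process can pick any comm it likes, so a check like this can
misfire, which is one reason a flag in the capture format would be more
reliable.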
If we get a request for a process that we have not captured any information
about then give it the "Unknown Process" symbol. That way we do not crash
and we also maintain our invariant of not mutating the hash table.
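
A minimal sketch of the fallback, assuming a small fixed-size table keyed
by pid. ProcessTable, its functions, and N_SLOTS are hypothetical names
for illustration. The point is that a miss returns a shared constant
instead of inserting a new entry.

```c
#include <assert.h>
#include <stddef.h>

#define N_SLOTS 64 /* illustrative fixed capacity */

typedef struct { int pid; const char *name; int used; } Slot;
typedef struct { Slot slots[N_SLOTS]; } ProcessTable;

static const char unknown_process[] = "Unknown Process";

/* Load-time only: linear probing insert. */
static void
process_table_insert (ProcessTable *t, int pid, const char *name)
{
  size_t i;

  assert (pid >= 0);
  i = (size_t)pid % N_SLOTS;

  for (size_t n = 0; n < N_SLOTS; n++, i = (i + 1) % N_SLOTS)
    {
      if (!t->slots[i].used || t->slots[i].pid == pid)
        {
          t->slots[i].pid = pid;
          t->slots[i].name = name;
          t->slots[i].used = 1;
          return;
        }
    }

  assert (!"process table full");
}

/* Read-only lookup: takes a const table and never inserts on a miss. */
static const char *
process_table_lookup (const ProcessTable *t, int pid)
{
  size_t i;

  assert (pid >= 0);
  i = (size_t)pid % N_SLOTS;

  for (size_t n = 0; n < N_SLOTS; n++, i = (i + 1) % N_SLOTS)
    {
      const Slot *s = &t->slots[i];

      if (!s->used)
        break;
      if (s->pid == pid)
        return s->name;
    }

  return unknown_process;
}
```

Taking the table as const in the lookup lets the compiler help enforce
the no-mutation rule.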
If we get an empty string, just normalize that to NULL so that we can be
more likely to match equality checks via hash comparison.
Additionally, break hashes out into two so that we can improve the
situation where some symbols do not have paths but still match. This
can happen with bundled symbols.
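
The empty-string normalization can be sketched as below. The function
names are hypothetical, and djb2 stands in for whatever string hash the
real code uses. Hashing NULL to a fixed value is an assumption made here
so that "" and NULL compare equal by hash.

```c
#include <stddef.h>
#include <stdint.h>

/* Treat "" the same as NULL so both compare and hash identically. */
static const char *
normalize_empty (const char *s)
{
  return (s != NULL && s[0] != '\0') ? s : NULL;
}

/* djb2-style hash (a stand-in); NULL and "" both hash to 0. */
static uint32_t
str_hash0 (const char *s)
{
  uint32_t h = 5381;

  s = normalize_empty (s);
  if (s == NULL)
    return 0;

  for (; *s != '\0'; s++)
    h = h * 33u + (unsigned char)*s;

  return h;
}
```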
We only mutate this during loading of the document so that we can be
confident in multi-threaded workers after loading. This just asserts
that invariant holds true.
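
One way to assert such an invariant is a "loaded" flag checked on every
mutation. This is only a sketch: LoadOnlyCache and its functions are
hypothetical names, and the real structure holds more than a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container that may only change while loading. */
typedef struct {
  bool   loaded;    /* set once document loading has finished */
  size_t n_entries;
} LoadOnlyCache;

/* Mutation: only legal before the cache is sealed. */
static void
load_only_cache_add (LoadOnlyCache *c)
{
  assert (c != NULL);
  assert (!c->loaded); /* the invariant: no mutation after loading */
  c->n_entries++;
}

/* Called once at the end of loading; workers may read after this. */
static void
load_only_cache_seal (LoadOnlyCache *c)
{
  assert (c != NULL);
  c->loaded = true;
}

/* Read-only access, safe from multiple threads once sealed. */
static size_t
load_only_cache_count (const LoadOnlyCache *c)
{
  assert (c != NULL);
  return c->n_entries;
}
```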
This shouldn't affect categorizing because that only uses the value if
is_toplevel. But with this added, we can use the count for weights in
other tooling w/o needing augmentation.
These are largely pre-sorted, but not fully when you have merged data. This
uses timsort to speed that up a bit.
In particular, the various sorts compare as follows (for a
~32,000,000 record capture):
  g_array_sort_with_data() => 3.9 seconds
  qsort_r()                => 3.7 seconds
  gtk_tim_sort()           => 0.79 seconds
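
To illustrate why largely pre-sorted input helps, here is a natural merge
sort over int64_t keys. It is not timsort and not gtk_tim_sort(); it only
shows the run-detection idea that timsort builds on. The function names
are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Merge sorted a[lo..mid) and a[mid..hi) through the scratch buffer. */
static void
merge_runs (int64_t *a, int64_t *tmp, size_t lo, size_t mid, size_t hi)
{
  size_t i = lo, j = mid, k = lo;

  while (i < mid && j < hi)
    tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++]; /* ties take the left run */
  while (i < mid)
    tmp[k++] = a[i++];
  while (j < hi)
    tmp[k++] = a[j++];

  memcpy (a + lo, tmp + lo, (hi - lo) * sizeof *a);
}

/* End (exclusive) of the non-decreasing run starting at lo. */
static size_t
run_end (const int64_t *a, size_t lo, size_t n)
{
  size_t i = lo + 1;

  while (i < n && a[i - 1] <= a[i])
    i++;
  return i;
}

/* Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
static int
natural_merge_sort (int64_t *a, size_t n)
{
  int64_t *tmp;

  if (n < 2)
    return 0;
  assert (a != NULL);

  if ((tmp = malloc (n * sizeof *tmp)) == NULL)
    return -1;

  for (;;)
    {
      size_t lo = 0, merges = 0;

      /* One pass: merge each adjacent pair of existing runs. */
      while (lo < n)
        {
          size_t mid = run_end (a, lo, n);
          size_t hi;

          if (mid == n)
            break; /* last run has no partner this pass */
          hi = run_end (a, mid, n);
          merge_runs (a, tmp, lo, mid, hi);
          merges++;
          lo = hi;
        }

      if (merges == 0)
        break; /* the whole array is a single run: sorted */
    }

  free (tmp);
  return 0;
}
```

Already-sorted input costs a single linear scan, and input made of a few
merged sorted streams needs only a few passes.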
This starts the perf streams from prepare instead of from record so that
we can do the linux instrument work in prepare. The samples are dropped
until our start-time is set.
Doing it this way greatly reduces how much sysprof-cli and sysprofd show
up as overhead in the callgraph, which is useful because the user gets to
see what they really care about.
It has the added benefit that we're less likely to see the pkla processes
showing up from authorizing our D-Bus connection for creating perf streams.