This doesn't add support for the legacy symbol mangling scheme, which is
currently the default pending tool support for the v0 symbol mangling
scheme. The legacy scheme is similar enough to C++'s symbol mangling
scheme that demangling its symbols with the C++ demangler generally
produces readable output. The v0 scheme is entirely custom and, because
of backreferences and the encoding of all generic arguments, not very
readable when mangled, so supporting it is more important than
supporting the legacy scheme.
While fixing things for Flatpak (which has a path that deduplicates some
of the path parts) works there, it breaks locating the debuglink in
other places such as GNOME OS.
This tries both forms, using the long form first and then the short
form second, since Flatpak is likely a subset of everything that needs
to be located.
If we have a path like /app/bin/gnome-builder and the debug prefix is
/app/lib/debug then we don't want to end up with /app/lib/debug/app/bin
as the real data directory is /app/lib/debug/bin.
This often works with /usr because /usr/lib/debug/usr can link back to its
parent. But we should try to do the right thing instead of relying on that
anyway.
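
As a rough illustration of the two path forms discussed above, the sketch
below builds both candidates in plain C. The helper names
(join_paths, debug_dir_long, debug_dir_short) and the assumption that the
install root is the debug prefix minus a trailing "/lib/debug" are
hypothetical and are not taken from the actual implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated "a" followed by "b". */
static char *
join_paths (const char *a, const char *b)
{
  size_t len = strlen (a) + strlen (b) + 1;
  char *ret = malloc (len);

  if (ret != NULL)
    snprintf (ret, len, "%s%s", a, b);
  return ret;
}

/* Long form: the full directory appended to the debug prefix,
 * e.g. /app/lib/debug + /app/bin => /app/lib/debug/app/bin */
static char *
debug_dir_long (const char *debug_prefix, const char *dir)
{
  assert (debug_prefix != NULL && dir != NULL);
  return join_paths (debug_prefix, dir);
}

/* Short form: drop the install root shared by both paths,
 * e.g. /app/lib/debug + /app/bin => /app/lib/debug/bin
 * Assumption: the install root is the prefix without "/lib/debug".
 * Returns NULL when the short form does not apply. */
static char *
debug_dir_short (const char *debug_prefix, const char *dir)
{
  static const char suffix[] = "/lib/debug";
  size_t suffix_len = sizeof suffix - 1;
  size_t prefix_len, root_len;

  assert (debug_prefix != NULL && dir != NULL);

  prefix_len = strlen (debug_prefix);
  if (prefix_len < suffix_len ||
      strcmp (debug_prefix + prefix_len - suffix_len, suffix) != 0)
    return NULL;

  root_len = prefix_len - suffix_len;

  /* dir must live under the install root */
  if (strncmp (dir, debug_prefix, root_len) != 0 || dir[root_len] != '/')
    return NULL;

  return join_paths (debug_prefix, dir + root_len);
}
```

A caller would try each candidate in turn and keep the first directory
that actually contains the debug file; both results must be released with
free().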
We need access to this from the process info but can share the instance.
It sucks to walk the hashtable here, but the alternative is to make these
recursive so that we can check a parent mount namespace.
Until then, take the hit and iterate all the pids to populate them with
the additional device.
Related GNOME/gnome-builder#2090
If a new process is spawned after the recording has started (processes
spawned *by* sysprof are done before recording starts) then try to extract
information about that process and append it to the recording.
The goal here is to get enough process information to actually decode the
process without creating fork()/exec() amplification.
Related GNOME/gnome-builder#2090
This is closer to what we want to be doing anyway; we don't care about
all the forks in existence.
Additionally, include the comm[] with the pid so that instruments can take
action based on it.
This is by no means perfect, but it gets the kernel tasks running on my
machine out of the profiles. We will no doubt need to add more in the
future, or find a way to record a flag for that in the capture format.
If we get a request for a process that we have not captured any information
about then give it the "Unknown Process" symbol. That way we do not crash
and we also maintain our invariant of not mutating the hash table.
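
A minimal sketch of that fallback, using a read-only array scan as a
stand-in for the real hash table. The ProcessInfo type, the
lookup_process name, and the shared unknown_process instance are
hypothetical and only illustrate the idea of returning a placeholder
instead of inserting a new entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record type; the real process info carries more state. */
typedef struct
{
  int         pid;
  const char *name;
} ProcessInfo;

/* Shared placeholder returned for pids we never captured. */
static const ProcessInfo unknown_process = { -1, "Unknown Process" };

/* The table is const, so a lookup miss cannot mutate it: callers always
 * get a valid pointer, either a real entry or the shared placeholder. */
static const ProcessInfo *
lookup_process (const ProcessInfo *table,
                size_t             n_entries,
                int                pid)
{
  assert (table != NULL || n_entries == 0);

  for (size_t i = 0; i < n_entries; i++)
    {
      if (table[i].pid == pid)
        return &table[i];
    }

  return &unknown_process;
}
```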
If we get an empty string, just normalize that to NULL so that we can be
more likely to match equality checks via hash comparison.
Additionally, break hashes out into two so that we can improve the
situation where some symbols do not have paths but still match. This
can happen with bundled symbols.
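
One way to picture the normalization and the split hashes is sketched
below. Everything here is an assumption made for illustration: the
SymbolKey layout, the helper names, the djb2-style hash, and in
particular the matching rule (compare the path hash only when both sides
have a path) are not confirmed by the actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Treat "" the same as NULL so both compare (and hash) identically. */
static const char *
normalize_empty (const char *str)
{
  return (str != NULL && str[0] == '\0') ? NULL : str;
}

/* Simple djb2-style string hash, used here only as a placeholder. */
static unsigned int
str_hash (const char *str)
{
  unsigned int h = 5381;

  for (; *str != '\0'; str++)
    h = h * 33u + (unsigned char)*str;
  return h;
}

/* Hypothetical key: name and path are hashed separately. */
typedef struct
{
  unsigned int name_hash;
  unsigned int path_hash;
  bool         has_path;
} SymbolKey;

static SymbolKey
symbol_key_init (const char *name, const char *path)
{
  SymbolKey key = { 0, 0, false };

  assert (name != NULL);

  path = normalize_empty (path);
  key.name_hash = str_hash (name);
  if (path != NULL)
    {
      key.path_hash = str_hash (path);
      key.has_path = true;
    }
  return key;
}

/* Assumed rule: names must match; paths are compared only when both
 * keys have one, so a path-less (bundled) symbol can still match. */
static bool
symbol_key_matches (const SymbolKey *a, const SymbolKey *b)
{
  if (a->name_hash != b->name_hash)
    return false;
  if (a->has_path && b->has_path)
    return a->path_hash == b->path_hash;
  return true;
}
```

Since this compares hashes only, a real implementation would still need
to confirm candidates with a string comparison to rule out collisions.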
We only mutate this during loading of the document so that we can be
confident in multi-threaded workers after loading. This just asserts
that invariant holds true.
This shouldn't affect categorizing because that only uses the value if
is_toplevel. But with this added, we can use the count for weights in
other tooling w/o needing augmentation.
These are largely pre-sorted, but not fully when you have merged data. This
uses timsort to speed that up a bit.
In particular, the various sorts compare as follows (for a
~32,000,000 record capture):
g_array_sort_with_data() => 3.9 seconds
qsort_r() => 3.7 seconds
gtk_tim_sort() => 0.79 seconds