It has previously been implicitly pulled in by libsysprof-capture, but
that will change in future.
Correspondingly, add some missing `glib.h` includes.
Signed-off-by: Philip Withnall <withnall@endlessm.com>
Helps: #40
We can't get all the symbols here because of -Bsymbolic on the glib
library, but we can get the higher-level bit. And if we're blocking for
a long period of time, knowing that can help track things down.
We want to be backtracing directly into the capture buffer, but also need
to skip a small number of frames.
If we call the backtrace before filling in information, we can capture to
the position *before* ev->addrs and then overwrite that data right after.
This seems to be significantly faster than doing the manual stepping. A
quick look shows that it has a number of special cases which we'd have
to duplicate, so best to just use it directly.
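The trick of backtracing to a position before the address array can be
sketched roughly as follows. This is only an illustration under stated
assumptions: it uses glibc's backtrace() from <execinfo.h> as a stand-in
unwinder, and sample_event_t, record_sample(), SKIP_FRAMES and MAX_FRAMES
are hypothetical names, not the actual capture structures. The right
number of frames to skip depends on inlining and on what the unwinder
reports.

```c
#include <assert.h>
#include <execinfo.h>
#include <stddef.h>
#include <stdint.h>

#define SKIP_FRAMES 2   /* hypothetical: frames belonging to the collector */
#define MAX_FRAMES  32  /* hypothetical: frames kept per sample */

/* Hypothetical event layout: a fixed header followed by the addresses. */
typedef struct
{
  uint64_t alloc_addr;
  uint64_t alloc_size;
  uint32_t n_addrs;
  uint32_t padding;
  void    *addrs[MAX_FRAMES];
} sample_event_t;

/* The header must be large enough to absorb the frames we throw away. */
static_assert (offsetof (sample_event_t, addrs) >= SKIP_FRAMES * sizeof (void *),
               "header too small to absorb the skipped frames");

static void
record_sample (sample_event_t *ev, uint64_t alloc_addr, uint64_t alloc_size)
{
  /* Start writing SKIP_FRAMES slots *before* ev->addrs, so the unwanted
   * leading frames land in the header and the rest land in ev->addrs. */
  void **start = (void **)(void *)((char *)ev
                                   + offsetof (sample_event_t, addrs)
                                   - SKIP_FRAMES * sizeof (void *));
  int n = backtrace (start, MAX_FRAMES + SKIP_FRAMES);

  /* Fill in the header afterwards, overwriting the skipped frames. */
  ev->alloc_addr = alloc_addr;
  ev->alloc_size = alloc_size;
  ev->n_addrs = n > SKIP_FRAMES ? (uint32_t)(n - SKIP_FRAMES) : 0;
  ev->padding = 0;
}
```

No temporary buffer or memmove() is needed; the cost is that the header
has to be written after the backtrace rather than before it.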
This uses a simplified form of collection without writing to capture
files directly. The data is written into a ring buffer for Sysprof to
pick up and copy into the real destination file. Using the mmap() ring
buffer allows loss of data when Sysprof cannot keep up, but on the other
hand allows the inferior to be fast enough to be useful.
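The trade-off above (drop data rather than stall the inferior) can be
illustrated with a small single-producer, single-consumer ring. This is a
sketch only: it lives in ordinary process memory rather than a shared
mmap() region, and ring_t, record_t, ring_push() and ring_pop() are
hypothetical names, not the Sysprof API.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 4  /* hypothetical capacity; must be a power of two */

typedef struct { uint64_t addr; uint64_t size; } record_t;

typedef struct
{
  _Atomic uint32_t head;     /* next slot to write; producer only */
  _Atomic uint32_t tail;     /* next slot to read; consumer only */
  uint32_t         dropped;  /* records lost because the ring was full */
  record_t         slots[RING_SLOTS];
} ring_t;

/* Producer side (the inferior): never blocks. If the consumer has not
 * kept up, the record is counted as dropped and the caller moves on. */
static bool
ring_push (ring_t *r, const record_t *rec)
{
  uint32_t head = atomic_load_explicit (&r->head, memory_order_relaxed);
  uint32_t tail = atomic_load_explicit (&r->tail, memory_order_acquire);

  if (head - tail == RING_SLOTS)
    {
      r->dropped++;
      return false;
    }

  r->slots[head % RING_SLOTS] = *rec;
  atomic_store_explicit (&r->head, head + 1, memory_order_release);
  return true;
}

/* Consumer side (the profiler): copies one record out, if any. */
static bool
ring_pop (ring_t *r, record_t *out)
{
  uint32_t tail = atomic_load_explicit (&r->tail, memory_order_relaxed);
  uint32_t head = atomic_load_explicit (&r->head, memory_order_acquire);

  if (head == tail)
    return false;

  *out = r->slots[tail % RING_SLOTS];
  atomic_store_explicit (&r->tail, tail + 1, memory_order_release);
  return true;
}
```

Because the producer never waits on the consumer, the worst case for the
inferior is a failed push, which is what keeps the overhead bounded.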
This brings over some of the techniques from the old memprof design.
Sysprof and memprof shared a lot of code, so it is pretty natural to
bring back the same callgraph view based on memory allocations.
This reuses the StackStash just like it did in memprof. While it
would be nice to reuse some existing tools out there, the fit of
memprof with sysprof is so naturally aligned that it's not really a
big deal to bring back the LD_PRELOAD. The value really comes
from seeing all this stuff together instead of multiple apps.
There are plenty of things we can implement on top of this that
we are not doing yet such as temporary allocations, cross-thread
frees, graphing the heap, and graphing differences between the
heap at two points in time. I'd like all of these things, given
enough time to make them useful.
This is still a bit slow though due to the global lock we take
to access the writer. To improve the speed here we need to get
rid of that lock and head towards a design that allows a thread
to request a new writer from Sysprof and save it in TLS (to be
destroyed when the thread exits).
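One possible shape for the per-thread writer idea, sketched with POSIX
thread-specific data so that a destructor runs when the thread exits.
Everything here is hypothetical (writer_t, get_thread_writer(), worker(),
and the calloc() standing in for "request a new writer from Sysprof");
it only shows the locking-free lookup and the cleanup on thread exit.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical per-thread writer state. */
typedef struct { unsigned long n_events; } writer_t;

static pthread_key_t  writer_key;
static pthread_once_t writer_once = PTHREAD_ONCE_INIT;
static atomic_int     writers_destroyed;  /* for illustration only */

/* Runs automatically when a thread that owns a writer exits. */
static void
writer_free (void *data)
{
  free (data);
  atomic_fetch_add (&writers_destroyed, 1);
}

static void
writer_key_init (void)
{
  pthread_key_create (&writer_key, writer_free);
}

/* Returns this thread's writer, creating it on first use. No global
 * lock is taken on the fast path. */
static writer_t *
get_thread_writer (void)
{
  writer_t *w;

  pthread_once (&writer_once, writer_key_init);

  w = pthread_getspecific (writer_key);
  if (w == NULL)
    {
      /* Stand-in for requesting a new writer from Sysprof. */
      w = calloc (1, sizeof *w);
      if (w != NULL && pthread_setspecific (writer_key, w) != 0)
        {
          free (w);
          w = NULL;
        }
    }

  return w;
}

/* Example thread body: record one event through the thread's own writer. */
static void *
worker (void *arg)
{
  writer_t *w = get_thread_writer ();

  (void)arg;
  if (w != NULL)
    w->n_events++;
  return NULL;
}
```

With this layout each thread only ever touches its own writer, so the
allocation hooks would not need to serialize on a shared lock.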