Commit Graph

235 Commits

Author SHA1 Message Date
45c8c95706 libsysprof-capture: Drop GError usage from SysprofCaptureWriter
Use `errno` instead, which is icky, but given that all of the failure
modes are from POSIX I/O functions, it’s at least in keeping with them.

This is a major API break.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Helps: #40
2020-07-03 22:00:34 +01:00
e19d70bca0 libsysprof-capture: Drop GError usage from SysprofCaptureReader
Use `errno` instead, which is icky, but given that all of the failure
modes are from POSIX I/O functions, it’s at least in keeping with them.

This is a major API break.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Helps: #40
2020-07-03 22:00:34 +01:00
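The `errno` convention described in the two commits above can be sketched as follows. This is only an illustration of the pattern, not the actual `SysprofCaptureWriter` or `SysprofCaptureReader` API: `example_open_capture()` and its parameters are hypothetical names invented for this sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of libsysprof-capture): opens a file
 * read-only and reports failure through errno, in the style of POSIX I/O
 * functions, instead of filling in a GError. */
static bool
example_open_capture (const char *path,
                      int        *out_fd)
{
  int fd;

  if (path == NULL || out_fd == NULL)
    {
      errno = EINVAL;
      return false;
    }

  fd = open (path, O_RDONLY);
  if (fd == -1)
    return false; /* errno was already set by open() */

  *out_fd = fd;
  return true;
}
```

A caller checks the boolean result and then inspects `errno` (for example with `strerror()`), much as it would after calling `open()` directly.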
6a45f020f7 libsysprof-capture: Add SysprofCaptureJitmapIter to replace GHashTable
Change `sysprof_capture_reader_read_jitmap()` to return a `const
SysprofCaptureJitmap *` (like the other `read` functions), and add a new
`SysprofCaptureJitmapIter` type to allow easy iteration over the jitmap.

This allows a use of `GHashTable` to be removed from the API. It breaks
the libsysprof-capture API and ABI.

All the callers iterate over the jitmap rather than looking up elements
by key. If that functionality is needed in future, additional API can be
added to allow it on `SysprofCaptureJitmap`.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Helps: #40
2020-07-03 22:00:34 +01:00
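An iterator of the kind described above could look roughly like this. It is a sketch under stated assumptions: the `ExampleJitmap` and `ExampleJitmapIter` types, the function names, and the packed layout (an 8-byte address followed by a NUL-terminated name) are all invented for illustration and are not taken from the libsysprof-capture headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, not the real libsysprof-capture types. */
typedef struct
{
  const uint8_t *data;      /* packed entries: 8-byte address, then NUL-terminated name */
  size_t         len;       /* total number of bytes in data */
  unsigned int   n_entries; /* number of (address, name) pairs */
} ExampleJitmap;

typedef struct
{
  const ExampleJitmap *jitmap;
  size_t               pos; /* byte offset of the next entry */
  unsigned int         i;   /* index of the next entry */
} ExampleJitmapIter;

static void
example_jitmap_iter_init (ExampleJitmapIter   *iter,
                          const ExampleJitmap *jitmap)
{
  iter->jitmap = jitmap;
  iter->pos = 0;
  iter->i = 0;
}

/* Returns true and fills addr/name while entries remain; returns false at
 * the end of the map or if the remaining data is truncated. */
static bool
example_jitmap_iter_next (ExampleJitmapIter  *iter,
                          uint64_t           *addr,
                          const char        **name)
{
  const ExampleJitmap *j = iter->jitmap;
  const uint8_t *nul;

  if (iter->i >= j->n_entries || j->len - iter->pos < sizeof (uint64_t))
    return false;

  /* memcpy avoids an unaligned read of the packed address. */
  memcpy (addr, j->data + iter->pos, sizeof *addr);
  iter->pos += sizeof *addr;

  nul = memchr (j->data + iter->pos, '\0', j->len - iter->pos);
  if (nul == NULL)
    return false;

  *name = (const char *) (j->data + iter->pos);
  iter->pos = (size_t) (nul - j->data) + 1;
  iter->i++;

  return true;
}
```

Callers then loop with `while (example_jitmap_iter_next (&iter, &addr, &name))`, which covers the iterate-everything use case without exposing a hash table in the API.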
5636bbf4f0 libsysprof-capture: Use stdbool instead of gboolean
Another step towards dropping GLib as a dependency of
libsysprof-capture.

Unlike the previous commit which replaced GLib integer types with the
bitwise equivalent C standard types, `stdbool` is potentially a different
width from `gboolean`, so this is an ABI break.

It therefore involves some changes to callback functions in the tests
and tools, and in libsysprof.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Helps: #40
2020-07-02 21:07:11 +01:00
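The width difference mentioned above is easy to see in isolation. In this small sketch, `example_gboolean` is a stand-in for GLib's `gboolean` (a typedef of `int`), so GLib itself is not needed; the helper function names are made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for GLib's gboolean, which is a typedef of int (gint). */
typedef int example_gboolean;

/* Size of a flag as it was passed or stored before the change. */
static size_t
example_old_flag_size (void)
{
  return sizeof (example_gboolean);
}

/* Size of the same flag once it is declared as a C99 bool. */
static size_t
example_new_flag_size (void)
{
  return sizeof (bool);
}
```

On common platforms the first value is 4 and the second is 1, so struct layouts and the return types of callbacks change, which is why callers had to be updated.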
b449baa205 libsysprof-capture: Move MappedRingBufferSource to libsysprof
As preparation for dropping the GLib dependency from libsysprof-capture,
move the `GSource` which links a `MappedRingBuffer` to a `GMainContext`
from libsysprof-capture to libsysprof.

This requires adding one new piece of API to libsysprof-capture to check
whether the `MappedRingBuffer` is empty.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Helps: #40
2020-07-02 21:07:11 +01:00
d2047fb557 libsysprof-capture: Move autocleanup definitions to libsysprof
In preparation for dropping the GLib dependency from libsysprof-capture,
move the autocleanup definitions up to libsysprof. Add a new header for
them.

This is slightly awkward in the tools, which depend on
libsysprof-capture but not libsysprof. Rather than make them depend on
libsysprof (which might be disabled at configure time), include the
`sysprof-capture-autocleanups.h` file between source directories.
`SYSPROF_COMPILATION` needs to be defined for this to work.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Helps: #40
2020-07-02 21:07:11 +01:00
4b6855a2ab libsysprof: Add missing preload dependencies on glib-2.0
It has previously been implicitly pulled in by libsysprof-capture, but
that will change in future.

Correspondingly, add some missing `glib.h` includes.

Signed-off-by: Philip Withnall <withnall@endlessm.com>

Helps: #40
2020-07-02 21:07:11 +01:00
eb0d58dcc4 speedtrack: track g_main_context_iteration()
We can't get all the symbols here because of -Bsymbolic on the glib
library, but we can get the higher level bit. And if we're blocking for
a period of time, it can help track things down to know we block for
longer time periods.
2020-03-13 16:49:38 -07:00
55f8f313b7 speedtrack: add sync and syncfs wrappers 2020-03-13 16:10:14 -07:00
37afd71370 speedtrack: start on simple port of iobt as "speedtrack"
The long term goal here is to help people find issues with their main
loop performance because of mixed workloads getting in the way of
interactivity.
2020-03-13 15:51:33 -07:00
a70907be0e libsysprof: add simple preload source
This source is useful to quickly add an LD_PRELOAD to a profiler.
2020-03-13 15:50:46 -07:00
f3d6bd3ed3 memprof: fix joining of LD_PRELOAD 2020-03-13 15:19:18 -07:00
390c5cde18 libsysprof-ui: add access to control source from .ui 2020-03-13 15:19:06 -07:00
f7a53ca8f9 preload: move backtrace helper into helper
This is useful so that we can have more of these LD_PRELOAD tools without
having to duplicate this code.
2020-03-13 12:06:39 -07:00
a6c39af553 preload: move to libdir from libexecdir
This isn't an executable, it just belongs in libdir.
2020-03-05 15:46:03 -08:00
6d8841267a preload: add assertion for performance hack
We steal two pointers temporarily, so ensure that we have the space to
overwrite a couple of addresses.
2020-03-05 15:22:17 -08:00
4af293a364 libsysprof: apply whole-system during capture replay
Fixes #31
2020-03-04 11:05:13 -08:00
1fbeabf2a2 libsysprof: add context check for inline symbol decoding 2020-02-26 10:55:34 -08:00
f01298ead5 libsysprof: allow disabling the kernel symbol resolver
The kernel symbol resolver requires access to sysprofd, which might not
be available in some contexts (such as when no polkit agent is available).

This allows profiling to continue working in those contexts by disabling
the kernel symbol resolver with the user-only setting.
2020-02-26 10:24:40 -08:00
14973bc2b9 callgraph: fix loading of stacks coming from backtrace() 2020-02-26 09:31:31 -08:00
26bf532a8c libsysprof: include process ID when cmdline is not available
We shouldn't really hit this, but if we do, it's easy enough to synthesize
a real parent node for the process in question.
2020-02-26 08:48:05 -08:00
cd2c2c954a libsysprof: extend LD_PRELOAD when pre-existing 2020-02-21 12:40:55 -08:00
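Extending a pre-existing `LD_PRELOAD` value rather than replacing it amounts to a careful string join. The sketch below assumes a colon separator (the dynamic linker accepts colons or spaces); `example_extend_preload()` is a hypothetical helper, not the code from this commit.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated LD_PRELOAD value that keeps
 * any existing entries and appends lib. The caller frees the result.
 * Returns NULL if allocation fails. */
static char *
example_extend_preload (const char *existing,
                        const char *lib)
{
  size_t len;
  char *joined;

  if (existing == NULL || existing[0] == '\0')
    {
      /* Nothing to preserve: the new value is just the library path. */
      joined = malloc (strlen (lib) + 1);
      if (joined != NULL)
        strcpy (joined, lib);
      return joined;
    }

  /* existing + ':' + lib + NUL terminator */
  len = strlen (existing) + 1 + strlen (lib) + 1;
  joined = malloc (len);
  if (joined != NULL)
    snprintf (joined, len, "%s:%s", existing, lib);

  return joined;
}
```

The empty-string check matters: joining onto an empty value would otherwise produce a leading separator.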
5b88e9c3aa build: check for unw_set_cache_size()
This may not be available on older libunwind versions such as 1.2.
2020-02-20 11:22:56 -08:00
b0157683ef preload: use #ifdef, not #if ENABLE_LIBUNWIND 2020-02-20 11:08:26 -08:00
348a1ef110 libsysprof: fix use of {0} initializers for older GCC
Older GCC versions complain about this so we might as well squash it.
2020-02-20 11:08:21 -08:00
0cb4bb61ac libsysprof: remove use of GAtomicRCBox
Switching this to use an embedded ref count allows us to backport to
operating systems restricted to GLib 2.56.
2020-02-20 11:03:05 -08:00
cd8a99402f build: fall back to __sync_synchronize()
If we're running on a GCC older than 4.9, then we won't have stdatomic.h
available. We can use a full barrier via __sync_synchronize() instead to
get the same effect, albeit more slowly.
2020-02-20 10:38:35 -08:00
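A fallback of this shape might be selected at compile time as sketched below. `HAVE_STDATOMIC_H` is assumed to be a macro defined by the build system when the header is found, and `EXAMPLE_BARRIER()` and `example_publish()` are hypothetical names; the actual macro names used in Sysprof are not shown in the commit message.

```c
/* HAVE_STDATOMIC_H is assumed to come from a build-time header check. */
#ifdef HAVE_STDATOMIC_H
# include <stdatomic.h>
# define EXAMPLE_BARRIER() atomic_thread_fence (memory_order_seq_cst)
#else
/* GCC builtin that issues a full memory barrier; available long before
 * stdatomic.h existed. */
# define EXAMPLE_BARRIER() __sync_synchronize ()
#endif

static int example_data;
static int example_ready;

/* Single-threaded demonstration of where the barrier sits: the payload is
 * stored first, then the barrier, then the flag that announces it. */
static void
example_publish (int value)
{
  example_data = value;
  EXAMPLE_BARRIER ();
  example_ready = 1;
}
```

The full barrier is stronger than most call sites need, which is where the extra cost comes from.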
8e81b1fcf9 libsysprof: add summary information for memprof profile 2020-02-19 00:16:36 -08:00
e3cb30e4ee libsysprof: apply selection to temporary allocations 2020-02-18 19:19:01 -08:00
84e2c288dc libsysprof: add support for calculating temporary allocations
This is useful to find allocations free'd right after they were created.

A temporary allocation is currently defined as a free() right after an
allocation of that same memory address. From a quick glance, that appears
to be similar to what I've been seeing in heaptrack all these years.

In the long term, I'd expect we can do something more useful such as
"freed from similar stack trace" since things like g_strdup_printf()
would obviously break important temporary allocations.
2020-02-18 17:18:04 -08:00
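The definition above (a free() right after an allocation of the same address) can be sketched as a scan over adjacent events. This is an illustration only: `ExampleAllocEvent` and `example_count_temporary()` are invented names, and "right after" is interpreted here as "the very next event in the stream", which is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event record: either an allocation or a free of addr. */
typedef struct
{
  bool      is_free;
  uintptr_t addr;
} ExampleAllocEvent;

/* Counts allocations that are freed by the immediately following event. */
static size_t
example_count_temporary (const ExampleAllocEvent *events,
                         size_t                   n_events)
{
  size_t count = 0;

  for (size_t i = 1; i < n_events; i++)
    {
      const ExampleAllocEvent *prev = &events[i - 1];
      const ExampleAllocEvent *cur = &events[i];

      if (!prev->is_free && cur->is_free && prev->addr == cur->addr)
        count++;
    }

  return count;
}
```

An allocation that is followed by another allocation before its free is not counted, which is the limitation the commit message alludes to for helpers that allocate internally.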
87004f5d24 preload: skip an additional stack frame
Now that we do some math here, we need to skip another frame to keep this
more useful and not show everything inside of sysprof_collector_allocate().
2020-02-18 14:11:09 -08:00
7490a774ab libsysprof-capture: use signed int for backtrace return
This allows us to more safely subtract 1 from the unw_backtrace() to get
the proper number of frames (and detect it in the collector).
2020-02-18 14:03:19 -08:00
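The reason a signed type is safer here is that subtracting 1 from an unsigned zero wraps around to a huge value. A minimal sketch, with the hypothetical helper `example_usable_frames()` standing in for the real collector code:

```c
/* Hypothetical helper: given the raw count returned by an unwinder, drop one
 * frame and clamp failures (zero or negative counts) to zero frames. */
static unsigned int
example_usable_frames (int n_captured)
{
  int n = n_captured - 1;

  /* With a signed int, a failed or empty unwind shows up as n <= 0 and can
   * be detected. With an unsigned type, 0 - 1 would wrap to UINT_MAX. */
  return n > 0 ? (unsigned int) n : 0;
}
```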
9f43bf2813 libsysprof: only add process cmdline info once
We might get this information from multiple sources (such as Linux's perf
or the proc data source). So only add this information once to avoid
having additional data we don't care about.

This also helps ensure we get a proper tree for the callgraph without
splitting things between updated cmdline information.
2020-02-18 13:45:24 -08:00
93acce520f libsysprof: plug leak and do less strdups
Small leak based on the number of PROCESS frames we see. Easy to fix and
easy to not do the g_strdup_printf() at all in those cases.
2020-02-18 13:44:25 -08:00
70bea64f88 libsysprof-capture: allow for backtrace skip optimization
We want to be backtracing directly into the capture buffer, but also need
to skip a small number of frames.

If we call the backtrace before filling in information, we can capture to
the position *before* ev->addrs and then overwrite that data right after.
2020-02-18 13:35:18 -08:00
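The trick described above, unwinding into the space just before the address array and then overwriting the skipped entries with header data, can be sketched like this. Everything here is hypothetical: the frame layout, the constants, and `example_fake_backtrace()`, which stands in for a real unwinder so the sketch needs no external library.

```c
#include <stdint.h>

enum
{
  EXAMPLE_HEADER_WORDS = 2, /* words[0] = tid, words[1] = n_addrs */
  EXAMPLE_MAX_FRAMES = 8,
  EXAMPLE_SKIP = 2          /* must not exceed EXAMPLE_HEADER_WORDS */
};

/* Hypothetical frame: header words followed directly by the addresses. */
typedef struct
{
  uint64_t words[EXAMPLE_HEADER_WORDS + EXAMPLE_MAX_FRAMES];
} ExampleFrame;

/* Stand-in for a real unwinder: writes fake addresses 0x1000, 0x1001, ...
 * and returns how many it wrote. */
static int
example_fake_backtrace (uint64_t *out,
                        int       max)
{
  for (int i = 0; i < max; i++)
    out[i] = 0x1000 + (uint64_t) i;
  return max;
}

static void
example_fill_frame (ExampleFrame *frame,
                    uint64_t      tid)
{
  uint64_t *addrs = &frame->words[EXAMPLE_HEADER_WORDS];
  int n;

  /* Unwind first, starting EXAMPLE_SKIP words before addrs, so the frames to
   * be skipped land in the header area and the wanted frames land in addrs. */
  n = example_fake_backtrace (addrs - EXAMPLE_SKIP,
                              EXAMPLE_MAX_FRAMES + EXAMPLE_SKIP);
  n -= EXAMPLE_SKIP;
  if (n < 0)
    n = 0;

  /* Only now fill in the header, overwriting the skipped frames. */
  frame->words[0] = tid;
  frame->words[1] = (uint64_t) n;
}
```

This avoids unwinding into a temporary buffer and copying, at the cost of requiring enough header space in front of the addresses to absorb the skipped frames.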
b6dc058d62 libsysprof: check for time series after extracting processes 2020-02-18 13:17:49 -08:00
a37ad780ca preload: use unw_backtrace()
This seems to be significantly faster than doing the manual stepping. A
quick look shows that it has a number of special cases which we'd have
to duplicate, so best to just use it directly.
2020-02-17 12:05:32 -08:00
ebeba62669 preload: setup cache size for libunwind 2020-02-17 12:03:02 -08:00
e06638d665 build: make libunwind optional 2020-02-17 12:02:44 -08:00
63f781eef9 preload: setup per-thread caching
We also need to invalidate caches at some point on dlopen()/dlclose().
2020-02-16 21:02:21 -07:00
ba2f6dfa54 preload: don't track stack for free
Ideally we'll use this later on for temporary allocations, but we can
base that on time rather than on similar stacks.
2020-02-16 21:01:52 -07:00
2329a6e25e preload: define UNW_LOCAL_ONLY for libunwind 2020-02-16 16:44:57 -08:00
ed7c9bbaef libsysprof-capture: fix advancement in drain callback 2020-02-15 22:02:03 -07:00
e3ed30eb48 libsysprof-capture: remove framing data from MappedRingBuffer
This removes the 8 bytes of framing data from the MappedRingBuffer which
means we can write more data without racing. But also this means that we
can eventually use the mapped ring buffer as our normal buffer for
capture writing (to be done later).
2020-02-15 20:46:05 -07:00
e7f2702f88 memprof: simplify memprof source
This doesn't need to be using trace-fd anymore now that we have the
collector API.
2020-02-13 18:58:35 -08:00
04d599c718 preload: port memory collector to collector API
This uses a simplified form of collection without writing to capture
files directly. The data is written into a ring buffer for Sysprof to
pick up and copy into the real destination file. Using the mmap() ring
buffer allows loss of data when Sysprof cannot keep up, but on the other
hand allows the inferior to be fast enough to be useful.
2020-02-13 18:58:03 -08:00
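The trade-off described above, where a fast writer is allowed to lose data when the reader falls behind, can be illustrated with a small lossy ring buffer. This is a single-threaded sketch with invented names (`ExampleRing`, `example_ring_write()`, `example_ring_drain()`); the real buffer lives in memory shared via mmap() and needs memory barriers, none of which is modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

enum { EXAMPLE_RING_SIZE = 16 };

typedef struct
{
  unsigned char data[EXAMPLE_RING_SIZE];
  size_t        head;    /* total bytes consumed by the reader */
  size_t        tail;    /* total bytes produced by the writer */
  size_t        dropped; /* number of writes rejected for lack of space */
} ExampleRing;

/* Writer side: never blocks. If the record does not fit, it is dropped. */
static bool
example_ring_write (ExampleRing *ring,
                    const void  *buf,
                    size_t       len)
{
  const unsigned char *src = buf;

  if (len > EXAMPLE_RING_SIZE - (ring->tail - ring->head))
    {
      ring->dropped++;
      return false;
    }

  for (size_t i = 0; i < len; i++)
    ring->data[(ring->tail + i) % EXAMPLE_RING_SIZE] = src[i];
  ring->tail += len;

  return true;
}

/* Reader side: copies out up to max bytes and frees that space. */
static size_t
example_ring_drain (ExampleRing *ring,
                    void        *out,
                    size_t       max)
{
  unsigned char *dst = out;
  size_t avail = ring->tail - ring->head;
  size_t n = avail < max ? avail : max;

  for (size_t i = 0; i < n; i++)
    dst[i] = ring->data[(ring->head + i) % EXAMPLE_RING_SIZE];
  ring->head += n;

  return n;
}
```

Because the writer only checks for space and returns, the profiled process stays fast; the cost is that records are lost whenever the reader cannot drain quickly enough.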
089f5d7c56 control-fd: add SysprofControlSource
This is a source that will allow the inferior to call into Sysprof to
create a new mmap()'d ring buffer to share data. This allows significantly
less overhead in the child process as Sysprof itself will take care of
copying the data out of the inferior into the final capture file. There is
more copying of course, but less intrusive to the inferior itself.
2020-02-13 18:53:58 -08:00
3e7acd5663 libsysprof: protect against bad reads 2020-02-13 14:32:34 -08:00
3f07cf2748 libsysprof: decode allocation frames into symbol map
This ensures that we have proper symbols when opening a file with
allocation frames.
2020-02-11 18:40:29 -08:00
33c81a3a9c memprof: add memory profiling using LD_PRELOAD
This brings over some of the techniques from the old memprof design.
Sysprof and memprof shared a lot of code, so it is pretty natural to
bring back the same callgraph view based on memory allocations.

This reuses the StackStash just like it did in memprof. While it
would be nice to reuse some existing tools out there, the fit of
memprof with sysprof is so naturally aligned, it's not really a
big deal to bring back the LD_PRELOAD. The value really comes
from seeing all this stuff together instead of multiple apps.

There are plenty of things we can implement on top of this that
we are not doing yet such as temporary allocations, cross-thread
frees, graphing the heap, and graphing differences between the
heap at two points in time. I'd like all of these things, given
enough time to make them useful.

This is still a bit slow though due to the global lock we take
to access the writer. To improve the speed here we need to get
rid of that lock and head towards a design that allows a thread
to request a new writer from Sysprof and save it in TLS (to be
destroyed when the thread exits).
2020-02-07 19:00:33 -08:00