
2702 Commits

Author SHA1 Message Date
64a886eea8 libsysprof-analyze: handle NULL process info
This can happen for "process 0" for example, or anything that was
recorded for a pid which did not get a SysprofCaptureProcess frame.
2023-05-15 14:33:41 -07:00
22828daad1 libsysprof-analyze: deduplicate and sort kernel address ranges
Turns out we do need this, and we cannot trust kallsyms all that much even
from duplicated entries on immediate next lines.
2023-05-15 14:31:18 -07:00
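
The sort-and-deduplicate step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `KernelSym` struct and the `kernel_syms_sort_and_dedup()` name are hypothetical, and it assumes that two entries count as duplicates when they share the same address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record for one parsed kallsyms entry. */
typedef struct {
  uint64_t    address;
  const char *name;
} KernelSym;

int
kernel_sym_compare (const void *a, const void *b)
{
  const KernelSym *sa = a;
  const KernelSym *sb = b;

  if (sa->address < sb->address)
    return -1;
  if (sa->address > sb->address)
    return 1;
  return 0;
}

/* Sorts @syms by address, then compacts the array in place so that only
 * one entry per address remains. qsort() is not stable, so which of two
 * duplicates survives is unspecified. Returns the new entry count. */
size_t
kernel_syms_sort_and_dedup (KernelSym *syms, size_t n)
{
  size_t out = 0;

  assert (syms != NULL || n == 0);

  if (n == 0)
    return 0;

  qsort (syms, n, sizeof *syms, kernel_sym_compare);

  for (size_t i = 1; i < n; i++)
    {
      /* Keep an entry only if its address differs from the last kept one. */
      if (syms[i].address != syms[out].address)
        syms[++out] = syms[i];
    }

  return out + 1;
}
```

Sorting first means duplicates end up adjacent regardless of where they appeared in the input, so a single linear pass is enough to drop them.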
65318afa51 libsysprof-analyze: use define for final address range 2023-05-15 14:30:29 -07:00
a90b9a2fc7 libsysprof-analyze: use separate cache for kernel symbols
Additionally this reduces some GHashTable lookup costs by doing it once
for the process-info per-traceable rather than one per instruction pointer
per traceable.
2023-05-15 14:30:06 -07:00
909228174e libsysprof-analyze: use mapped file from file chunk 2023-05-15 13:51:22 -07:00
131d9fba29 libsysprof-analyze: implement kallsyms symbolizer
This does a simple binary search across the parsed kallsyms using the
addresses we've parsed. We need to be sure we've created the array properly
so that our bounds checking will prevent infinite loops in the tight
binary search loop.
2023-05-15 13:32:34 -07:00
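
A binary search of this kind might look like the sketch below. It is an assumption-laden illustration rather than the commit's implementation: `KernelSym` and `kernel_syms_lookup()` are hypothetical names, the array is assumed to be sorted ascending by address, and a symbol is assumed to cover every address from its own start up to the next symbol's start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one parsed kallsyms entry. */
typedef struct {
  uint64_t    address;
  const char *name;
} KernelSym;

/* Returns the entry with the greatest address <= @address, or NULL if
 * @address lies below the first entry. @syms must be sorted ascending. */
const KernelSym *
kernel_syms_lookup (const KernelSym *syms, size_t n, uint64_t address)
{
  size_t lo = 0;
  size_t hi = n;

  assert (syms != NULL || n == 0);

  /* Search the half-open range [lo, hi). Every iteration either raises
   * lo past mid or lowers hi to mid, so the range strictly shrinks and
   * the loop cannot spin forever. */
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (syms[mid].address <= address)
        lo = mid + 1;
      else
        hi = mid;
    }

  /* lo now equals the number of entries whose address is <= @address. */
  return lo > 0 ? &syms[lo - 1] : NULL;
}
```

Using a half-open range with `lo = mid + 1` is one common way to get the termination guarantee the commit message is concerned about; other formulations work as long as the range shrinks on every step.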
525b30a42f libsysprof-analyze: give symbolizer access to address context
The kallsym symbol resolver will need this to short-circuit unless we're
within a kernel address context.
2023-05-15 13:07:48 -07:00
14e5cf06a5 libsysprof-analyze: include kallsym symbolizer in default
Generally we're capturing Linux systems, and even if not, the capture may
contain embedded Linux symbols if on a secondary system.
2023-05-15 13:04:28 -07:00
c6135ac538 libsysprof-analyze: parse embedded /proc/kallsyms.gz
This is useful so that even if we do not get __symbols__ in the capture
file we can decode symbols from the target machine.
2023-05-15 13:03:46 -07:00
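
For orientation, each line of /proc/kallsyms has the form "address type name", optionally followed by a module name in brackets. A minimal sketch of parsing one already-decompressed line is shown below; `kallsyms_parse_line()` and `KALLSYM_NAME_MAX` are hypothetical names chosen for illustration, and the sketch ignores the optional module field.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative buffer size; not taken from the sysprof sources. */
#define KALLSYM_NAME_MAX 256

/* Parses one "address type name" line. Any trailing "[module]" field is
 * left unread. Returns true if all three fields were found. */
bool
kallsyms_parse_line (const char *line,
                     uint64_t   *address,
                     char       *type,
                     char        name[KALLSYM_NAME_MAX])
{
  assert (line != NULL);
  assert (address != NULL && type != NULL && name != NULL);

  /* %255s leaves room for the terminating NUL in a 256-byte buffer. */
  return sscanf (line, "%" SCNx64 " %c %255s", address, type, name) == 3;
}
```

Note that when kallsyms is read without sufficient privileges the kernel typically reports every address as zero, so a real parser may want to detect that case before building address ranges.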
fa55594e23 libsysprof-analyze: make return type GRefString 2023-05-15 12:46:46 -07:00
11f0531591 libsysprof: include gzip'd /proc/kallsyms in capture
Compressed, this adds about 2.5 MB to the capture file for the contents of
the kallsyms. However, that is useful so that we can decode kernel symbols
after the fact without relying on __symbols__ to be tacked on by the
recording machine.
2023-05-15 12:15:08 -07:00
580889f8cb libsysprof-analyze: short-circuit when address > max
This adds an O(1) check at the head of the lookup to avoid looking at
every RB_RIGHT() in the tree when address falls beyond the upper bound of
the interval tree.
2023-05-15 11:20:44 -07:00
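
The idea can be sketched with a plain binary search tree whose nodes carry the maximum end address of their subtree. This is a simplified stand-in, not the commit's code: the actual cache uses a red-black tree from sys/tree.h, whereas `IntervalNode` and `interval_tree_lookup()` here are hypothetical, unbalanced, and assume non-overlapping ranges with an exclusive `end`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node: covers [begin, end) and is keyed by begin.
 * max_end is the largest end address found anywhere in the subtree. */
typedef struct IntervalNode {
  uint64_t                   begin;
  uint64_t                   end;
  uint64_t                   max_end;
  const struct IntervalNode *left;
  const struct IntervalNode *right;
} IntervalNode;

/* Finds the node whose range contains @address, or NULL. */
const IntervalNode *
interval_tree_lookup (const IntervalNode *root, uint64_t address)
{
  /* O(1) short-circuit: nothing in the tree reaches this address, so
   * skip the walk down the right-hand spine entirely. */
  if (root == NULL || address >= root->max_end)
    return NULL;

  for (const IntervalNode *node = root; node != NULL; )
    {
      assert (node->begin < node->end);

      if (address < node->begin)
        node = node->left;
      else if (address < node->end)
        return node;
      else
        node = node->right;
    }

  return NULL;
}
```

Without the check at the top, an address above every range would still visit one node per level on the way to the rightmost leaf before failing.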
00ecc41209 libsysprof-analyze: make SysprofDocumentSymbols private
This instead moves to a public API on the document to symbolize now
that we've gotten much of the necessary bits private in loading the
document. This commit ensures that we only do loading via the loader
now (and removes the incorrect use from the tests so they too go
through the loader).

We check for NoSymbolizer in document symbols so that we can skip any
decoding. That keeps various use cases fast where you don't want to
waste time on symbolizing if you don't need to look at symbols.

There is plenty more we can do to batch decode symbols with some more
API changes, but that will come after we have kernel/userland decoding
integrated from this library.

We may still want to get all symbols into a single symbol cache, but
given that we have address ranges associated with them, that may not
be very useful beyond the hashtable to pid-specific cache we have now.

If symbols were shared between processes, that'd make more sense, but
we aren't doing that (albeit strings are shared between symbol
instances to reduce that overhead).
2023-05-15 10:56:09 -07:00
ed030d2c25 libsysprof-analyze: fix nick for symbol 2023-05-15 10:56:09 -07:00
0e95fd3841 libsysprof-analyze: add SysprofNoSymbolizer
A symbolizer that will never symbolize. We can use this later internally
to short-circuit various symbolization steps during loading.
2023-05-15 10:56:09 -07:00
a27f700c06 libsysprof-analyze: add scaffolding for kallsyms parsing
We will want to start embedding this content in the capture file (but
after gzipping it as it's otherwise quite large). This will get things in
place so that we can parse that .gz file into the address ranges and
decode symbols found within the capture file.
2023-05-14 18:45:03 -07:00
1bcb534e16 libsysprof-analyze: add api to get size for file 2023-05-14 18:36:29 -07:00
6429768373 libsysprof-analyze: add API to open input stream 2023-05-14 18:19:30 -07:00
3350ad61eb libsysprof-analyze: introduce SysprofDocumentLoader
and thereby make a bunch of the exposed API on SysprofDocument private.
Instead we'll push some of that to the loader but for now the tests can
keep doing what they're doing using the private API.

The goal here is to not expose a SysprofDocument pointer until the document
has been loaded and symbolized via the loader API. Then we can lookup
symbols directly from the document w/o intermediary objects.
2023-05-12 15:41:48 -07:00
9f6e16e373 libsysprof-analyze: take depth from translation func 2023-05-12 14:29:16 -07:00
59044bd813 libsysprof-analyze: remove implementation debugging code 2023-05-12 14:15:18 -07:00
81c384a974 libsysprof-analyze: use interval tree for symbol cache
This uses an augmented red-black tree to create an interval tree with
non-interval lookups. That amounts to storing address ranges within the
red-black tree, but looking up by single address.
2023-05-12 14:13:40 -07:00
379db77349 libsysprof-analyze: add sys/tree.h
This pulls in the venerable sys/tree.h so that we may use it in some
upcoming cache code.
2023-05-12 14:12:33 -07:00
3a8471906f libsysprof-analyze: build static library for testing internals
Otherwise we can't test our private API which very much will need it to
ensure correctness.
2023-05-12 14:11:17 -07:00
582986f5c9 libsysprof-capture: free deduplicated array entries 2023-05-12 14:09:43 -07:00
dc99b46254 libsysprof-capture: fix leak of mapped ring buffer structure 2023-05-12 14:02:46 -07:00
8c17830522 libsysprof-analyze: resolve via symbol cache for pid 2023-05-11 15:48:13 -07:00
c0bc3c047a libsysprof-analyze: get stack addresses as group 2023-05-11 15:38:14 -07:00
2d1bf107e5 libsysprof-analyze: handle section edge better 2023-05-11 15:37:59 -07:00
ff1b4d00bd libsysprof-analyze: implement key/value superblock option API
This currently gets used in libsysprof to try to cross mount namespaces to
somewhere we can access a binary on the same subvolume.
2023-05-11 15:21:36 -07:00
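
A key/value lookup over a comma-separated superblock option string (for example "rw,subvolid=256,subvol=/root" on btrfs) could be sketched as follows. The function name `superblock_options_get()` and its return convention are assumptions made for illustration; they are not the API added by this commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Looks up @key in a comma-separated option string.
 * Returns a newly allocated copy of the value, an empty string for a
 * flag with no value (such as "rw"), or NULL if the key is absent or
 * allocation fails. The caller frees the result. */
char *
superblock_options_get (const char *options, const char *key)
{
  size_t key_len;
  const char *iter;

  assert (options != NULL);
  assert (key != NULL && key[0] != '\0');

  key_len = strlen (key);
  iter = options;

  while (*iter != '\0')
    {
      size_t len = strcspn (iter, ",");  /* length of this option */

      /* Match the whole key, so "subvol" does not match "subvolid". */
      if (len >= key_len &&
          strncmp (iter, key, key_len) == 0 &&
          (len == key_len || iter[key_len] == '='))
        {
          const char *value = iter + key_len;
          size_t value_len = len - key_len;
          char *copy;

          if (value_len > 0)
            {
              value++;      /* skip the '=' */
              value_len--;
            }

          copy = malloc (value_len + 1);
          if (copy == NULL)
            return NULL;

          memcpy (copy, value, value_len);
          copy[value_len] = '\0';
          return copy;
        }

      iter += len;
      if (*iter == ',')
        iter++;
    }

  return NULL;
}
```

Looking up "subvol" in the example string would yield "/root", which is the sort of value needed when trying to locate a binary on the same subvolume.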
35f87b6121 libsysprof-analyze: add superblock-options property
This lets you get the full string that was parsed from the mountinfo
rather than having to go through our yet-to-be-implemented specific
option API.
2023-05-11 15:13:23 -07:00
1f9d37837d libsysprof-analyze: use pid list to load mountinfo 2023-05-11 15:02:52 -07:00
971166a82f libsysprof-analyze: keep bitset of known pids
This can be handy for iterating all the pids we know we've seen during
process loading.
2023-05-11 15:02:32 -07:00
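
As a rough illustration of why a bitset suits this job, the sketch below tracks pids in a fixed-size bit array and iterates them in ascending order. It is a plain-C stand-in with hypothetical names (`PidSet`, `pid_set_add()`, `pid_set_next()`), not the bitset type the library uses, and the capacity is an arbitrary illustrative limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative capacity only; real systems may allow larger pids. */
#define PID_SET_CAPACITY 32768

typedef struct {
  uint64_t words[PID_SET_CAPACITY / 64];
} PidSet;

/* Marks @pid as seen. Adding the same pid twice is harmless. */
void
pid_set_add (PidSet *set, int pid)
{
  assert (set != NULL);
  assert (pid >= 0 && pid < PID_SET_CAPACITY);

  set->words[pid / 64] |= UINT64_C (1) << (pid % 64);
}

bool
pid_set_contains (const PidSet *set, int pid)
{
  assert (set != NULL);

  if (pid < 0 || pid >= PID_SET_CAPACITY)
    return false;

  return (set->words[pid / 64] >> (pid % 64)) & 1;
}

/* Returns the smallest known pid >= @start, or -1 if there is none. */
int
pid_set_next (const PidSet *set, int start)
{
  for (int pid = start < 0 ? 0 : start; pid < PID_SET_CAPACITY; pid++)
    {
      if (pid_set_contains (set, pid))
        return pid;
    }

  return -1;
}
```

Iterating all known pids is then a loop of the form `for (pid = pid_set_next (&set, 0); pid >= 0; pid = pid_set_next (&set, pid + 1))`, starting from a zero-initialized set.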
7c37120edf libsysprof-analyze: make SysprofMount public API
And expose it via sysprof_document_process_list_mounts() so that when
inspecting processes we can see what binaries were mapped as well as what
the filesystem looked like to locate those mapped paths.
2023-05-11 14:37:02 -07:00
352fa617c0 libsysprof-analyze: fix transfer ownership of refstring 2023-05-11 14:05:07 -07:00
a393dd9acd libsysprof-analyze: parse various data from mountinfo
We still need to parse optional fields for filesystem type and what not
so that we can resolve btrfs subvolumes and more.
2023-05-11 13:58:34 -07:00
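
For reference, a /proc/[pid]/mountinfo line carries the mount ID, parent ID, major:minor device, root, mount point and mount options, then zero or more optional fields, a lone "-" separator, and finally the filesystem type, mount source and superblock options. A minimal sketch of splitting such a line is shown below; the `MountInfo` struct, its field sizes, and `mountinfo_parse_line()` are hypothetical and do not reflect how SysprofMount stores this data. Escaped characters in paths (such as "\040" for a space) are left as-is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the fields of one mountinfo line. */
typedef struct {
  int      mount_id;
  int      parent_id;
  unsigned major;
  unsigned minor;
  char     root[256];
  char     mount_point[256];
  char     fs_type[64];
  char     source[256];
  char     super_options[256];
} MountInfo;

/* Parses one mountinfo line into @out. Returns false on malformed input. */
bool
mountinfo_parse_line (const char *line, MountInfo *out)
{
  const char *sep;

  assert (line != NULL);
  assert (out != NULL);

  /* Fixed leading fields: IDs, device numbers, root and mount point. */
  if (sscanf (line, "%d %d %u:%u %255s %255s",
              &out->mount_id, &out->parent_id,
              &out->major, &out->minor,
              out->root, out->mount_point) != 6)
    return false;

  /* Optional fields vary in number, so skip ahead to the separator. */
  sep = strstr (line, " - ");
  if (sep == NULL)
    return false;

  return sscanf (sep + 3, "%63s %255s %255s",
                 out->fs_type, out->source, out->super_options) == 3;
}
```

The filesystem type and superblock options after the separator are the pieces that matter for resolving things like btrfs subvolumes.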
f914580675 libsysprof-analyze: remove process list
This is done with a GtkBitset index and so we can drop this unused list
model wrapper.
2023-05-11 13:15:28 -07:00
9e01216945 libsysprof-analyze: remove unused code
We went a different route for this, so we can just drop the unused code.
2023-05-11 13:14:19 -07:00
5d5f0a5085 libsysprof-analyze: plumb access to string pooling
We want various subsystems to start using this, but we need to plumb it
to the symbolizers to take advantage of it.
2023-05-11 13:09:50 -07:00
5abad47160 libsysprof-analyze: give access to memory maps from process
And add it to test tool to ensure it works.
2023-05-11 12:32:32 -07:00
9b5e25037b libsysprof-analyze: add SysprofProcessInfo
This internal type is used to collect things about a process like the
memory maps, address layout, and symbol cache. This can persist once
parsed at startup, then applied to objects created on demand such as the
SysprofDocumentProcess or used by symbolizers internally rather than
complicated function arguments.
2023-05-11 12:21:32 -07:00
ccd790fef5 libsysprof-analyze: make AddressLayout a ListModel 2023-05-11 11:41:17 -07:00
8f4fa95663 libsysprof-analyze: use mountnamespace/addresslayout in symbolize 2023-05-10 16:51:05 -07:00
c72955e7d4 libsysprof-analyze: add address layout to contain mmap regions 2023-05-10 16:48:48 -07:00
b3a4c295c3 libsysprof-analyze: add basic symbol cache
This relies on begin/end range for the symbols to create something akin to
an interval tree, albeit with GSequence. If performance needs to be
addressed, can probably augment SysprofSymbol for an interval rbtree.
2023-05-10 15:14:09 -07:00
0b0fe9f903 libsysprof-analyze: give mount access to strings in ctor 2023-05-10 12:55:30 -07:00
7118c38b2b libsysprof-analyze: break out string helper
We'll want strings to be deduplicated a bunch, and may need to pass this
around to make that a bit easier to ensure.
2023-05-10 12:42:28 -07:00
fa39a3291a libsysprof-analyze: start plumbing mounts into namespaces 2023-05-09 21:03:13 -07:00
fd6256e68f libsysprof-analyze: start on object to represent a mount 2023-05-09 20:46:10 -07:00
a9f615cff0 libsysprof-analyze: add scaffolding for mountinfo parsing
I want to do this differently than we did in libsysprof, which is going
to require a bit of thinking on how we should represent something like
a SysprofMount within the mount namespace.
2023-05-09 17:51:38 -07:00