This API helps us simplify some of the tooling to iterate
through a capture. In particular, we might want to setup a
bunch of matches and then just iterate through the items.
This can also allow delaying the iteration until the future
which might be handy for visualizers which won't want to block
the main loop.
I'm not jazzed about the 64k buffer created for every cursor
due to the SpCaptureReader copy, but it's probably not a big
deal in practice until we start doing more exotic things.
This function allows copying a capture so that we can do
additional reads. This does, however, copy the buffers which
might be more memory than we want for large usage. We can
tweak things as we go to figure out the cursors.
In case we are building in a flatpak, we might want to rely on a system
installed sysprofd. This means we might want to pretend we have sysprofd
support (to be found on the system), but not actually build sysprofd.
We need this for recent changes to how natural sizing will work now that
the max-content-* properties are stabilizing for their first public ABI
release.
The kernel says here http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/kernel/events/ring_buffer.c?id=7a1dcf6ad#n61 :
* Since the mmap() consumer (userspace) can run on a different CPU:
*
* kernel user
*
* if (LOAD ->data_tail) { LOAD ->data_head
* (A) smp_rmb() (C)
* STORE $data LOAD $data
* smp_wmb() (B) smp_mb() (D)
* STORE ->data_head STORE ->data_tail
* }
*
* Where A pairs with D, and B pairs with C.
*
* In our case (A) is a control dependency that separates the load of
* the ->data_tail and the stores of $data. In case ->data_tail
* indicates there is no room in the buffer to store $data we do not.
*
* D needs to be a full barrier since it separates the data READ
* from the tail WRITE.
*
* For B a WMB is sufficient since it separates two WRITEs, and for C
* an RMB is sufficient since it separates two READs.
*
* See perf_output_begin().
So I'm pretty sure we need a full barrier before writing out data_tail.