mirror of
https://github.com/varun-r-mallya/sysprof.git
synced 2026-02-12 16:10:54 +00:00
Update TODO
This commit is contained in:
239
TODO
239
TODO
@ -23,11 +23,40 @@ Before 1.0.4:
|
|||||||
|
|
||||||
Before 1.2:
|
Before 1.2:
|
||||||
|
|
||||||
|
* On 32 bit, NMI stackframes are not filtered out which leads to
|
||||||
|
wrong kernel traces
|
||||||
|
|
||||||
|
* Can we track kernel modules being inserted/removed?
|
||||||
|
|
||||||
|
* Does it make sense to try and locate a kernel binary, or can we
|
||||||
|
always just use kallsyms.
|
||||||
|
|
||||||
|
* open_inode() would be useful in many cases
|
||||||
|
|
||||||
|
* Is the double mapping of the ring buffer ok?
|
||||||
|
|
||||||
|
* Why do we get EINVAL when we try to track forks?
|
||||||
|
|
||||||
|
* Sometimes it get samples for unknown processes. This may be due to
|
||||||
|
forking without execing.
|
||||||
|
|
||||||
* Give an informative error message if run as root
|
* Give an informative error message if run as root
|
||||||
|
|
||||||
|
* What are the events generated for pid 0 and pid 1? They have no
|
||||||
|
stacktrace, and an eip in the kernel.
|
||||||
|
|
||||||
|
* Delete the binparser stuff and fix the elf parser to just deal with
|
||||||
|
32 bits vs 64 bits. Or use C++ like behdad does in harfbuzz?
|
||||||
|
|
||||||
|
* We often get "No map". I suspect this is because the vdso stackframe
|
||||||
|
is broken.
|
||||||
|
|
||||||
* Find out why gtk_tree_view_columns_autosize() apparently doesn't
|
* Find out why gtk_tree_view_columns_autosize() apparently doesn't
|
||||||
work on empty tree views.
|
work on empty tree views.
|
||||||
|
|
||||||
|
* Write my own tree model? There is still performance issues in
|
||||||
|
footreestore.
|
||||||
|
|
||||||
* Hack to disable recursion for binaries without symbols causes the
|
* Hack to disable recursion for binaries without symbols causes the
|
||||||
symbols to not work the way other symbols do. A better approach is
|
symbols to not work the way other symbols do. A better approach is
|
||||||
probably to simply generate a new symbol for every appearance except
|
probably to simply generate a new symbol for every appearance except
|
||||||
@ -36,57 +65,22 @@ Before 1.2:
|
|||||||
"has same parent" may be the correct criterion in all cases. (done:
|
"has same parent" may be the correct criterion in all cases. (done:
|
||||||
master now doesn't fold those recursions anymore)
|
master now doesn't fold those recursions anymore)
|
||||||
|
|
||||||
* See if we can make "In file <blah>" not be treated as a recursive function.
|
* See if we can make "In file <blah>" not be treated as a recursive
|
||||||
Maybe simply treat each individual address in the file as a function.
|
function. Maybe simply treat each individual address in the file
|
||||||
Or try to parse the machine code. Positions that are called are likely
|
as a function. Or try to parse the machine code. Positions that
|
||||||
to be functions.
|
are called are likely to be functions.
|
||||||
|
|
||||||
- Treat identical addresses as one function
|
- Treat identical addresses as one function
|
||||||
|
|
||||||
- Treat all addresses within a library that don't have children are
|
- Treat all addresses within a library that don't have children
|
||||||
treated as one function.
|
are treated as one function.
|
||||||
|
|
||||||
This will have the effect of coalescing adjacent siblings without
|
This will have the effect of coalescing adjacent siblings without
|
||||||
children. Which is what you want since you can't tell them apart
|
children. Which is what you want since you can't tell them apart
|
||||||
anyway. It will never be a great experience though.
|
anyway. It will never be a great experience though.
|
||||||
|
|
||||||
* Find out what is going on with kernel threads:
|
|
||||||
|
|
||||||
[(ksoftirqd/0)] 0.00 0.03
|
|
||||||
No map ([(ksoftirqd/0)]) 0.00 0.03
|
|
||||||
kernel 0.00 0.03
|
|
||||||
do_softirq 0.00 0.03
|
|
||||||
__do_softirq 0.00 0.03
|
|
||||||
|
|
||||||
* Make sure there aren't leftover stacktraces from last time when
|
|
||||||
profiling begins.
|
|
||||||
|
|
||||||
* Is the move-to-front in process_locate_map() really worth it?
|
|
||||||
|
|
||||||
* Whenever we fail to lock the atomic variable, track this, and send the
|
|
||||||
information to userspace as an indication of the overhead of the profiling.
|
|
||||||
Although there is inherent aliasing here since stack scanning happens at
|
|
||||||
regular intervals.
|
|
||||||
|
|
||||||
* Apparently, if you upgrade the kernel, then don't re-run configure,
|
|
||||||
the kernel Makefile will delete all of /lib/modules/<release>/kernel
|
|
||||||
if you run make install in the module directory. Need to find out what
|
|
||||||
is going on.
|
|
||||||
|
|
||||||
* Performance:
|
|
||||||
Switching between descendant views is a slow:
|
|
||||||
- gtk_tree_store_get_path() is O(n^2) and accounts
|
|
||||||
for 43% of the time.
|
|
||||||
- GObject signal emission overhead accounts for 18% of
|
|
||||||
the time.
|
|
||||||
Consider adding a forked version of GtkTreeStore with
|
|
||||||
performance and bug fixes.
|
|
||||||
|
|
||||||
* Make sure that labels look decent in case of "No Map" etc.
|
* Make sure that labels look decent in case of "No Map" etc.
|
||||||
|
|
||||||
* If we end up believing the kernel's own stacktraces, maybe
|
|
||||||
/proc/kallsyms shouldn't be parsed until the user hits profile.
|
|
||||||
|
|
||||||
* Elf bugs:
|
* Elf bugs:
|
||||||
- error handling for bin_parser is necessary.
|
- error handling for bin_parser is necessary.
|
||||||
|
|
||||||
@ -209,7 +203,7 @@ Before 1.2:
|
|||||||
* Rename stack_stash_foreach_by_address() to stack_stash_foreach_unique(),
|
* Rename stack_stash_foreach_by_address() to stack_stash_foreach_unique(),
|
||||||
or maybe not ...
|
or maybe not ...
|
||||||
|
|
||||||
* Make it compilable against a non-running kernel.
|
Which things are we actually using from stack stash now?
|
||||||
|
|
||||||
* Maybe report idle time? Although this would come for free with the
|
* Maybe report idle time? Although this would come for free with the
|
||||||
timelines.
|
timelines.
|
||||||
@ -231,15 +225,6 @@ Before 1.2:
|
|||||||
|
|
||||||
* Make it compile and work on x86-64
|
* Make it compile and work on x86-64
|
||||||
|
|
||||||
* With kernel module not installed, select Profiler->Start, then dismiss
|
|
||||||
the alert. This causes the start button to appear prelighted. Probably
|
|
||||||
just another gtk+ bug.
|
|
||||||
|
|
||||||
- Fix bugs/performance issues:
|
|
||||||
- add_trace_to_tree() might be a little slow when dealing with deeply
|
|
||||||
recursive profiles. Hypothesis: seen_nodes can grow large, and the
|
|
||||||
algorithm is O(n^2) in the length of the trace.
|
|
||||||
|
|
||||||
- make the things we put in a stackstash real
|
- make the things we put in a stackstash real
|
||||||
objects so that
|
objects so that
|
||||||
- we can save them
|
- we can save them
|
||||||
@ -326,9 +311,7 @@ Before 1.2:
|
|||||||
stype_define_record (object, "object",
|
stype_define_record (object, "object",
|
||||||
name, total, self, NULL);
|
name, total, self, NULL);
|
||||||
stype_define_pointer (...);
|
stype_define_pointer (...);
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
* See if the auto-expanding can be made more intelligent
|
* See if the auto-expanding can be made more intelligent
|
||||||
- "Everything" should be expanded exactly one level
|
- "Everything" should be expanded exactly one level
|
||||||
- all trees should be expanded at least one level
|
- all trees should be expanded at least one level
|
||||||
@ -502,6 +485,7 @@ Before 1.2:
|
|||||||
- See if there is a way to make it distcheck
|
- See if there is a way to make it distcheck
|
||||||
|
|
||||||
- grep "FIXME - not10"
|
- grep "FIXME - not10"
|
||||||
|
- grep FIXME
|
||||||
|
|
||||||
- translation should be hooked up
|
- translation should be hooked up
|
||||||
|
|
||||||
@ -570,6 +554,7 @@ http://www.linuxbase.org/spec/booksets/LSB-Embedded/LSB-Embedded/ehframe.html
|
|||||||
the details of a function are not that interesting
|
the details of a function are not that interesting
|
||||||
together with a tree. (Could add radio buttons somewhere in
|
together with a tree. (Could add radio buttons somewhere in
|
||||||
in the right pane).
|
in the right pane).
|
||||||
|
- Open a new window for the function.
|
||||||
|
|
||||||
- Add view->ancestors/descendants menu items
|
- Add view->ancestors/descendants menu items
|
||||||
|
|
||||||
@ -581,63 +566,12 @@ http://www.linuxbase.org/spec/booksets/LSB-Embedded/LSB-Embedded/ehframe.html
|
|||||||
find the immediate callers.
|
find the immediate callers.
|
||||||
- Now it's back to just listing the immediate callers.
|
- Now it's back to just listing the immediate callers.
|
||||||
|
|
||||||
- Have kernel module report the file the address was found in
|
|
||||||
Should avoid a lot of potential broken/raciness with dlopen etc.
|
|
||||||
Probably better to send a list of maps with each trace. Which
|
|
||||||
shouldn't really be that expensive. We already walk the list of
|
|
||||||
maps in process_ensure_map() on every trace. And we can do hashing
|
|
||||||
if it turns out to be a problem.
|
|
||||||
Maybe replace the SysprofStackTrace with a union, so that
|
|
||||||
it can be either a list of maps, or a stacktrace. Both map lists and
|
|
||||||
stacktraces would come with a hashcode.allowing userspac. This avoids
|
|
||||||
the problem that maps could take up a lot of extra bandwidth.
|
|
||||||
|
|
||||||
struct MapInfo
|
|
||||||
{
|
|
||||||
long start;
|
|
||||||
long end;
|
|
||||||
long offset;
|
|
||||||
long inode;
|
|
||||||
}
|
|
||||||
|
|
||||||
struct Maps
|
|
||||||
{
|
|
||||||
int hash_code;
|
|
||||||
int n_maps;
|
|
||||||
MapInfo info [127];
|
|
||||||
char filenames [2048];
|
|
||||||
}
|
|
||||||
|
|
||||||
- Figure out how Google's pprof script works. Then add real call graph
|
- Figure out how Google's pprof script works. Then add real call graph
|
||||||
drawing. (google's script is really simple; uses dot from graphviz).
|
drawing. (google's script is really simple; uses dot from graphviz).
|
||||||
KCacheGrind also uses dot to do graph drawing.
|
KCacheGrind also uses dot to do graph drawing.
|
||||||
|
|
||||||
- hide internal stuff in ProfileDescendant
|
- hide internal stuff in ProfileDescendant
|
||||||
|
|
||||||
- possibly add dependency on glib 2.8 if it is released at that point.
|
|
||||||
(g_file_replace())
|
|
||||||
|
|
||||||
* Some notes about timer interrupt handling in Linux
|
|
||||||
|
|
||||||
On an SMP system APIC is used - the interesting file is arch/i386/kernel/apic.c
|
|
||||||
|
|
||||||
On UP systems, the normal IRQ0 is used
|
|
||||||
When the interrupt happens,
|
|
||||||
entry.S
|
|
||||||
calls do_IRQ, which sets up the special interrupt stack,
|
|
||||||
and calls __do_IRQ, which is in /kernel/irq/handle.c.
|
|
||||||
This calls the corresponding irqaction, which has previously
|
|
||||||
been setup by arch/i386/mach-default/setup.c to point to
|
|
||||||
timer_interrupt, which is in arch/i386/kernel/time.c.
|
|
||||||
This calls do_timer_interrupt_hooks() which is defined in
|
|
||||||
/include/asm-i386/mach-default/do_timer.h. This function
|
|
||||||
then calls profile_tick().
|
|
||||||
|
|
||||||
Note when the CPU switches from user mode to kernel mode, it
|
|
||||||
pushes SS/ESP on top of the kernel stack, but when it switches
|
|
||||||
from kernel mode to kernel mode, it does _not_ push SS/ESP.
|
|
||||||
It does in both cases push EIP though.
|
|
||||||
|
|
||||||
Later:
|
Later:
|
||||||
|
|
||||||
- If the stack trace ends in a memory access instruction, send the
|
- If the stack trace ends in a memory access instruction, send the
|
||||||
@ -844,6 +778,103 @@ Later:
|
|||||||
|
|
||||||
-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ALREADY DONE: -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ALREADY DONE: -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
|
||||||
|
|
||||||
|
* Find out what is going on with kernel threads:
|
||||||
|
|
||||||
|
[(ksoftirqd/0)] 0.00 0.03
|
||||||
|
No map ([(ksoftirqd/0)]) 0.00 0.03
|
||||||
|
kernel 0.00 0.03
|
||||||
|
do_softirq 0.00 0.03
|
||||||
|
__do_softirq 0.00 0.03
|
||||||
|
|
||||||
|
* Make sure there aren't leftover stacktraces from last time when
|
||||||
|
profiling begins.
|
||||||
|
|
||||||
|
* Is the move-to-front in process_locate_map() really worth it?
|
||||||
|
|
||||||
|
* Whenever we fail to lock the atomic variable, track this, and send the
|
||||||
|
information to userspace as an indication of the overhead of the profiling.
|
||||||
|
Although there is inherent aliasing here since stack scanning happens at
|
||||||
|
regular intervals.
|
||||||
|
|
||||||
|
* Apparently, if you upgrade the kernel, then don't re-run configure,
|
||||||
|
the kernel Makefile will delete all of /lib/modules/<release>/kernel
|
||||||
|
if you run make install in the module directory. Need to find out what
|
||||||
|
is going on.
|
||||||
|
|
||||||
|
* Performance:
|
||||||
|
Switching between descendant views is a slow:
|
||||||
|
- gtk_tree_store_get_path() is O(n^2) and accounts
|
||||||
|
for 43% of the time.
|
||||||
|
- GObject signal emission overhead accounts for 18% of
|
||||||
|
the time.
|
||||||
|
Consider adding a forked version of GtkTreeStore with
|
||||||
|
performance and bug fixes.
|
||||||
|
|
||||||
|
* If we end up believing the kernel's own stacktraces, maybe
|
||||||
|
/proc/kallsyms shouldn't be parsed until the user hits profile.
|
||||||
|
|
||||||
|
* Make it compilable against a non-running kernel.
|
||||||
|
|
||||||
|
* With kernel module not installed, select Profiler->Start, then dismiss
|
||||||
|
the alert. This causes the start button to appear prelighted. Probably
|
||||||
|
just another gtk+ bug.
|
||||||
|
|
||||||
|
- Fix bugs/performance issues:
|
||||||
|
- add_trace_to_tree() might be a little slow when dealing with deeply
|
||||||
|
recursive profiles. Hypothesis: seen_nodes can grow large, and the
|
||||||
|
algorithm is O(n^2) in the length of the trace.
|
||||||
|
|
||||||
|
- Have kernel module report the file the address was found in
|
||||||
|
Should avoid a lot of potential broken/raciness with dlopen etc.
|
||||||
|
Probably better to send a list of maps with each trace. Which
|
||||||
|
shouldn't really be that expensive. We already walk the list of
|
||||||
|
maps in process_ensure_map() on every trace. And we can do hashing
|
||||||
|
if it turns out to be a problem.
|
||||||
|
Maybe replace the SysprofStackTrace with a union, so that
|
||||||
|
it can be either a list of maps, or a stacktrace. Both map lists and
|
||||||
|
stacktraces would come with a hashcode.allowing userspac. This avoids
|
||||||
|
the problem that maps could take up a lot of extra bandwidth.
|
||||||
|
|
||||||
|
struct MapInfo
|
||||||
|
{
|
||||||
|
long start;
|
||||||
|
long end;
|
||||||
|
long offset;
|
||||||
|
long inode;
|
||||||
|
}
|
||||||
|
|
||||||
|
struct Maps
|
||||||
|
{
|
||||||
|
int hash_code;
|
||||||
|
int n_maps;
|
||||||
|
MapInfo info [127];
|
||||||
|
char filenames [2048];
|
||||||
|
}
|
||||||
|
|
||||||
|
- possibly add dependency on glib 2.8 if it is released at that point.
|
||||||
|
(g_file_replace())
|
||||||
|
|
||||||
|
* Some notes about timer interrupt handling in Linux
|
||||||
|
|
||||||
|
On an SMP system APIC is used - the interesting file is arch/i386/kernel/apic.c
|
||||||
|
|
||||||
|
On UP systems, the normal IRQ0 is used
|
||||||
|
When the interrupt happens,
|
||||||
|
entry.S
|
||||||
|
calls do_IRQ, which sets up the special interrupt stack,
|
||||||
|
and calls __do_IRQ, which is in /kernel/irq/handle.c.
|
||||||
|
This calls the corresponding irqaction, which has previously
|
||||||
|
been setup by arch/i386/mach-default/setup.c to point to
|
||||||
|
timer_interrupt, which is in arch/i386/kernel/time.c.
|
||||||
|
This calls do_timer_interrupt_hooks() which is defined in
|
||||||
|
/include/asm-i386/mach-default/do_timer.h. This function
|
||||||
|
then calls profile_tick().
|
||||||
|
|
||||||
|
Note when the CPU switches from user mode to kernel mode, it
|
||||||
|
pushes SS/ESP on top of the kernel stack, but when it switches
|
||||||
|
from kernel mode to kernel mode, it does _not_ push SS/ESP.
|
||||||
|
It does in both cases push EIP though.
|
||||||
|
|
||||||
* Rename sysprof-text to sysprof-cli
|
* Rename sysprof-text to sysprof-cli
|
||||||
|
|
||||||
* Find out why the samples label won't right adjust
|
* Find out why the samples label won't right adjust
|
||||||
|
|||||||
@ -773,7 +773,7 @@ process_sample (state_t *state, StackStash *resolved, sample_t *sample)
|
|||||||
if (!process)
|
if (!process)
|
||||||
{
|
{
|
||||||
static gboolean warned;
|
static gboolean warned;
|
||||||
if (!warned)
|
if (!warned || sample->pid != 0)
|
||||||
{
|
{
|
||||||
g_warning ("sample for unknown process %d", sample->pid);
|
g_warning ("sample for unknown process %d", sample->pid);
|
||||||
warned = TRUE;
|
warned = TRUE;
|
||||||
|
|||||||
Reference in New Issue
Block a user