What profiling tools, if any, have people had success with?
I'd like to do some performance optimization for emulators running on the Pi. I've never done CPU profiling on Linux before, but I have intermediate Linux experience. I'm perfectly comfortable recompiling or making intrusive source changes if necessary, but I'd like as simple a workflow as possible.
Sampling-based and instrumentation-based profiling tools are both fine. A quick search of the forums suggests that some people have profiled various applications, but there's no mention of the tools they used (other than valgrind not working).
Profiling on the Pi?
- Posts: 31
- Joined: Wed Sep 05, 2012 2:59 am
Have you tried OProfile?
- Posts: 150
- Joined: Wed Nov 23, 2011 1:29 pm
I got oprofile building yesterday, but when I try to run operf:
- Code: Select all
pi@raspberrypi:~/oprofile-0.9.8/pe_profiling$ operf
Unexpected error running operf: No such device
Please use the opcontrol command instead of operf.
main() calls _check_perf_events_cap(), which does a __NR_perf_event_open syscall. It returns ENODEV, which apparently was not expected by oprofile. I'm guessing this means that the kernel doesn't provide support for perf_events, although /proc/sys/kernel/perf_event_paranoid exists.
Any idea how I can add support for this? I'm not opposed to recompiling my kernel.
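One way to narrow down an ENODEV like this is to look at which perf event sources the kernel actually registered. The sketch below assumes the usual sysfs location (/sys/bus/event_source/devices); the helper name list_event_sources is made up for illustration, and the entry names you see will vary by kernel and platform.

```shell
#!/bin/sh
# Hypothetical helper: list the perf event sources the kernel has registered.
# Assumes the usual sysfs location; pass another directory to override it.
list_event_sources() {
    dir="${1:-/sys/bus/event_source/devices}"
    if [ ! -d "$dir" ]; then
        echo "no event sources directory at $dir"
        return 1
    fi
    for entry in "$dir"/*; do
        # Skip the unexpanded glob when the directory is empty.
        [ -e "$entry" ] || continue
        basename "$entry"
    done
}

# An entry such as "software" suggests kernel-maintained events are available;
# a CPU PMU entry (its name varies by platform) suggests hardware counters are too.
list_event_sources || true
```

If only software-style entries show up and no CPU PMU entry, that would be consistent with perf_events being built in while the hardware PMU device is not registered.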
- Posts: 31
- Joined: Wed Sep 05, 2012 2:59 am
jlongstreet wrote:I got oprofile building yesterday, but when I try to run operf:
- Code: Select all
pi@raspberrypi:~/oprofile-0.9.8/pe_profiling$ operf
Unexpected error running operf: No such device
Please use the opcontrol command instead of operf.
main() calls _check_perf_events_cap(), which does a __NR_perf_event_open syscall. It returns ENODEV, which apparently was not expected by oprofile. I'm guessing this means that the kernel doesn't provide support for perf_events, although /proc/sys/kernel/perf_event_paranoid exists.
Any idea how I can add support for this? I'm not opposed to recompiling my kernel.
The ARM PMU should have an IRQ that needs to be wired up to an "arm-pmu" platform device, see arch/arm/mach-bcmring/arch.c as an example. I can't see the PMU IRQ in the arm peripherals datasheet though so it's either not wired up to the controllers or just not documented...
Jamie
- Posts: 3
- Joined: Wed Sep 19, 2012 10:38 am
jamieiles wrote:The ARM PMU should have an IRQ that needs to be wired up to an "arm-pmu" platform device, see arch/arm/mach-bcmring/arch.c as an example. I can't see the PMU IRQ in the arm peripherals datasheet though so it's either not wired up to the controllers or just not documented...
Unfortunately the interrupt is not available.
I believe the wraparound time for the PMU registers is quite high, so it may be possible to get the same effect from a timer.
- Moderator
- Posts: 3245
- Joined: Wed Aug 17, 2011 7:41 pm
- Location: Cambridge
What about gperftools: http://code.google.com/p/gperftools/
They contain a sampling profiler that seems to work nicely on the Pi.
- Posts: 2
- Joined: Wed Oct 10, 2012 1:55 pm
- Location: Berlin
Yeah, it looks like OProfile is too much trouble for the moment. I'm building gperftools now.
I needed to add "-march=armv7-a" to CFLAGS and CXXFLAGS to get it compiling, because Debian armhf apparently targets ARMv4 out of the box. I'm going to post a bug on the google code page about that.
Assuming gperftools works, it looks like it'll do what I want. I'll post back here with updates.
- Posts: 31
- Joined: Wed Sep 05, 2012 2:59 am
Hmm, it seems that the memory barriers are still not supported, even after compiling for -march=armv7-a:
- Code: Select all
pi@raspberrypi:~/profiletest$ LD_LIBRARY_PATH=/usr/local/lib gdb ./test
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/pi/profiletest/test...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/pi/profiletest/test
Program received signal SIGILL, Illegal instruction.
Acquire_CompareAndSwap (new_value=1, ptr=0x11024, old_value=0)
at ./src/base/atomicops-internals-arm-v6plus.h:138
138 MemoryBarrier();
(gdb)
- Posts: 31
- Joined: Wed Sep 05, 2012 2:59 am
jlongstreet wrote:Hmm, it seems that the memory barriers are still not supported, even after compiling for -march=armv7-a:
Why are you building for armv7-a? We're an armv6.
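A quick sanity check before picking -march is to ask the running system what it is. This is only a sketch: march_flag_for is a hypothetical helper, the mapping covers just the two cases discussed in this thread, and uname -m reports the kernel's view of the machine rather than anything the compiler knows.

```shell
#!/bin/sh
# Hypothetical helper: map a `uname -m` machine string to a GCC -march flag.
march_flag_for() {
    case "$1" in
        armv6*) echo "-march=armv6" ;;
        armv7*) echo "-march=armv7-a" ;;
        *)      echo "" ;;   # unknown machine: let the compiler default apply
    esac
}

machine=$(uname -m)
flag=$(march_flag_for "$machine")
echo "machine: $machine, suggested flag: ${flag:-none}"
# Example use when configuring a build (not run here):
#   ./configure CFLAGS="$flag" CXXFLAGS="$flag"
```

On the Pi discussed here this should report an armv6 machine, so building with -march=armv7-a would let the compiler emit instructions the CPU cannot execute, which matches the SIGILL above.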
- Moderator
- Posts: 3245
- Joined: Wed Aug 17, 2011 7:41 pm
- Location: Cambridge
I've gotten oprofile to work before; I think I had to compile the kernel module for it to work.
But sadly, none of the performance monitoring stuff worked: the IRQ never triggered.
I had to force a module option to make it use the timer tick IRQ.
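For anyone using perf rather than oprofile, a rough analogue of timer-driven sampling is the cpu-clock software event, which should not depend on the PMU overflow interrupt. This is a different tool from the oprofile module option mentioned above, not the same mechanism. The wrapper name profile_with_timer, the 1000 Hz rate, and the emulator path are all placeholders.

```shell
#!/bin/sh
# Sketch: sample with the timer-driven "cpu-clock" software event.
# The 1000 Hz sampling rate is an arbitrary choice for illustration.
profile_with_timer() {
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found; install it first" >&2
        return 127
    fi
    # -e selects the event, -F the sampling frequency, -g records call graphs.
    perf record -e cpu-clock -F 1000 -g -o perf.data -- "$@"
}

# Usage (the emulator path is a placeholder):
#   profile_with_timer ./my-emulator rom.bin
#   perf report -g -i perf.data
```

Timer-based samples show where time is spent but carry no cache-miss or branch information, so this only partly replaces the hardware counters.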
- Posts: 92
- Joined: Sat Aug 18, 2012 2:33 pm
I use gperftools from the Arch repos and this works nicely:
- Code: Select all
[mypi] ~ %CPUPROFILE=ls.prof LD_PRELOAD=/usr/lib/libprofiler.so /bin/ls -R >| /dev/null
[mypi] ~ %pprof --text /bin/ls ls.prof | head
Using local file /bin/ls.
Using local file ls.prof.
Total: 324 samples
81 25.0% 25.0% 81 25.0% __getdents64
66 20.4% 45.4% 66 20.4% __openat_nocancel
55 17.0% 62.3% 55 17.0% strcoll_l
8 2.5% 64.8% 8 2.5% _int_malloc
7 2.2% 67.0% 7 2.2% __fxstat64
5 1.5% 68.5% 5 1.5% _int_free
5 1.5% 70.1% 5 1.5% closedir
5 1.5% 71.6% 5 1.5% readdir64@@GLIBC_2.4
4 1.2% 72.8% 4 1.2% __aeabi_read_tp
- Posts: 2
- Joined: Wed Oct 10, 2012 1:55 pm
- Location: Berlin
It seems there have been problems building tcmalloc (used in gperftools) for ARM in the past - e.g. http://code.google.com/p/chromium-os/is ... l?id=34620. So it may not be in complete working order for all ARM CPUs.
There is an implementation with gperftools but it doesn't seem to be working for all flavours of ARM (hence the 'dmb' instruction compilation error).
After some inspection of the code, I was able to make a small kludge that got tcmalloc to build for ARMV5, which enabled me to successfully profile a small executable on my Pi running Raspbian. BTW, this isn't a recommended kludge for public consumption, really just a proof-of-concept to help track down the issue.
Building for ARMV7 stops the compiler complaining, but then generates instructions that the Pi CPU doesn't understand, as it is the wrong version.
The right thing to do would be to post a bug against tcmalloc to get a proper fix. A possible workaround in the meantime would be to try getting a proper ARMV5 build of gperftools going.
- Posts: 5
- Joined: Sat Aug 04, 2012 6:02 pm
I'm using linux-tools-3.2 on raspbian.
Install with:
sudo apt-get install linux-tools
Record a profile:
perf record -g "path-to-executable"
Report the profile:
perf report -g
Better than a poke in the eye with a sharp stick -- but not much. I'm going to have to add my own instrumentation I think to suit my needs.
- Posts: 56
- Joined: Sat Jul 07, 2012 11:21 pm
- Location: Zero Page
Thanks for the information on using perf record.
I'd also like to use perf stat to query the hardware counters but I keep getting the error:
Error: open_counter returned with 19 (No such device). /bin/dmesg may provide additional information.
Fatal: Not all events could be opened.
Does anyone know how to fix this?
I'm guessing that I'll need to recompile a kernel with support for this enabled.
If there's another way to get around this, please let me know.
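Until hardware counters are available, one possible stopgap is to ask perf stat for software events only, since those are maintained by the kernel and should not need the PMU. This is a sketch, not a fix for the ENODEV itself: the wrapper name stat_software_only and the choice of events are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: count only kernel-maintained software events, which need no PMU.
SOFTWARE_EVENTS="task-clock,context-switches,page-faults"

stat_software_only() {
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found; install it first" >&2
        return 127
    fi
    perf stat -e "$SOFTWARE_EVENTS" -- "$@"
}

# Usage (placeholder command):
#   stat_software_only ./my-emulator rom.bin
```

This gives no cycle or instruction counts, so it is only useful for coarse questions such as how much CPU time a run took or how often it faulted.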
- Posts: 1
- Joined: Mon Nov 05, 2012 5:10 pm
robquant wrote:I use gperftools from the Arch repos and this works nicely:
However, the current tarball did not compile on raspbian, because it used armv7 instructions even when compiling for armv6. Fortunately, a patch has been committed recently which solves this issue:
http://code.google.com/p/gperftools/issues/detail?id=493
Still compiling, but it seems to work now ...
- Posts: 18
- Joined: Thu Jan 17, 2013 11:00 am