HermannSW
Posts: 1876
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Raspberry Pi Zero "cycle counter register" always returns 0?

Wed Jul 27, 2016 2:38 pm

Yesterday I successfully installed kernel header files needed to build kernel modules on Raspbian:
viewtopic.php?p=1015439#p1015439

Next I went to "the" page on the Raspberry Pi CPU "cycle counter register":
http://blog.regehr.org/archives/794

The precompiled module did not work because I have a different kernel (4.4.11+), but compiling the kernel module worked without any issues.

I tried to use this small sample program to read the cycle counter register:

Code: Select all

#include <stdio.h>
#include <unistd.h>

static inline unsigned ccnt_read (void)
{
  unsigned cc;
  __asm__ volatile ("mrc p15, 0, %0, c15, c12, 1":"=r" (cc));
  return cc;
}

int main()
{
  unsigned cc0,cc1;

  cc0 = ccnt_read();
  sleep(1);
  cc1 = ccnt_read();

  printf("%u %u\n",cc0,cc1);

  return 0;
}
So far so good: the program dies with "Illegal instruction" before the kernel module is loaded, and runs normally afterwards:

Code: Select all

[email protected]:~/raspbian-ccr $ ./tst
Illegal instruction
[email protected]:~/raspbian-ccr $ sudo insmod enable-ccr.ko
[email protected]:~/raspbian-ccr $ dmesg | tail -1
[ 2339.369603] User-level access to CCR has been turned on.
[email protected]:~/raspbian-ccr $ ./tst
0 0
[email protected]:~/raspbian-ccr $ 
Although there is a 1 second sleep between the two reads of the cycle counter register, both report 0 :-(


Just to be on the safe side, I converted the little test program into a kernel module:

Code: Select all

#include <linux/module.h>
#include <linux/kernel.h>

//#include <linux/time.h>
#include <asm/delay.h>

/*
 * works for ARM1176JZ-F
 */

static inline unsigned ccnt_read (void)
{
  unsigned cc;
  __asm__ volatile ("mrc p15, 0, %0, c15, c12, 1":"=r" (cc));
  return cc;
}

int init_module(void)
{
  unsigned cc0,cc1;

  cc0 = ccnt_read();
  udelay(10);
  cc1 = ccnt_read();

  printk("%u %u\n",cc0,cc1);

  return 0;
}
void cleanup_module(void)
{
}

MODULE_LICENSE("GPL");
Unfortunately this module prints 0 twice as well:

Code: Select all

[email protected]:~/raspbian-ccr $ sudo reboot 0
Connection to raspberrypi02.local closed by remote host.
Connection to raspberrypi02.local closed.
[email protected]:~$ ssh -X [email protected]
Warning: Permanently added the ECDSA host key for IP address '10.42.0.43' to the list of known hosts.
[email protected]'s password: 

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Wed Jul 27 14:32:38 2016
[email protected]:~ $ sudo insmod raspbian-ccr/tst-1.ko
[email protected]:~ $ dmesg | tail -2
[   42.907963] random: nonblocking pool is initialized
[  201.024787] 0 0
[email protected]:~ $ 
I verified against the manual that the assembler statement for reading the cycle counter is correct:
http://infocenter.arm.com/help/topic/co ... f#page=270
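
One way to double-check the encoding independently of the manual is to compute the instruction word and compare it with what "objdump -d" shows for the compiled code. The sketch below assumes the generic 32-bit ARM MRC instruction layout with condition code AL; encode_mrc is a hypothetical helper name, not something from this thread or from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Build the 32-bit ARM instruction word for
 *   mrc p<cp>, <opc1>, r<rt>, c<crn>, c<crm>, <opc2>
 * assuming condition code AL (always). */
static inline uint32_t encode_mrc(unsigned cp, unsigned opc1, unsigned rt,
                                  unsigned crn, unsigned crm, unsigned opc2)
{
  /* Reject field values that do not fit their bit width. */
  assert(cp < 16 && rt < 16 && crn < 16 && crm < 16);
  assert(opc1 < 8 && opc2 < 8);

  return 0xEE100010u                 /* cond=AL, MRC (bit 20), bit 4 set */
       | ((uint32_t)opc1 << 21)      /* bits [23:21] */
       | ((uint32_t)crn  << 16)      /* bits [19:16] */
       | ((uint32_t)rt   << 12)      /* bits [15:12] */
       | ((uint32_t)cp   << 8)       /* bits [11:8]  */
       | ((uint32_t)opc2 << 5)       /* bits [7:5]   */
       |  (uint32_t)crm;             /* bits [3:0]   */
}
```

For "mrc p15, 0, r0, c15, c12, 1", encode_mrc(15, 0, 0, 15, 12, 1) gives 0xEE1F0F3C. The compiler may pick a register other than r0, which changes only bits [15:12].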


Some questions:
  • Does the cycle counter register work on the Raspberry Pi Zero?
  • Is something needed to start the cycle counter register?
  • Is reading the cycle counter register the equivalent of Intel rdtsc and Arduino Due SysTick->VAL?

Hermann.
⇨https://stamm-wilbrandt.de/en/Raspberry_camera.html

https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://stamm-wilbrandt.de/github_repo_i420toh264
https://github.com/Hermann-SW/fork-raspiraw
https://twitter.com/HermannSW

HermannSW

Re: Raspberry Pi Zero "cycle counter register" always return

Thu Jul 28, 2016 8:13 pm

After a lot of googling I found this relevant posting:
http://stackoverflow.com/questions/3247 ... ve#tab-top

It says that the counters need to be enabled first. The code there helps, although it is written for the Cortex-A8. At the bottom of this post you will find a kernel module containing that code, with the asm statements replaced by their equivalents for the Pi Zero's ARM11, and with links to the A8 and ARM11 specs for each asm statement. That is perhaps a good starting point if you want to help get CPU clock cycle counter timing working on the Pi Zero.

I reduced the problem to this really short kernel module:

Code: Select all

/* Kernel Programming */
#define MODULE
#define LINUX
#define __KERNEL__

#include <linux/module.h>  /* Needed by all modules */
#include <linux/kernel.h>  /* Needed for KERN_ALERT */

#include <linux/delay.h>

int init_module(void)
{
  unsigned int t;


  printk(KERN_ALERT "Hello rd-1.\n");

  __asm__ volatile ("mrc p15, 0, %0, c15, c12, 0":"=r" (t));
  // Read Performance Monitor Control Register

  printk ("t=%x\n", t);


  t=1;
  __asm__ volatile ("mcr p15, 0, %0, c15, c12, 0":"=r" (t));
  // Write Performance Monitor Control Register


  udelay(10);
  __asm__ volatile ("mrc p15, 0, %0, c15, c12, 0":"=r" (t));
  // Read Performance Monitor Control Register

  printk ("t=%x\n", t);

  return 0;
}

void cleanup_module(void)
{
  printk(KERN_ALERT "Goodbye rd-1.\n");
}  

MODULE_LICENSE("GPL");


It reads the Performance Monitor Control Register (getting 0), then writes 1 to it to enable all three counters, and then reads it again. Although I inserted a delay of 10us (the dmesg timestamps show that the delay happens), the second read returns 0 as well. I think it should return 1, shouldn't it?

Code: Select all

[email protected]:~/arm11-1 $ sudo insmod rd-1.ko 
[email protected]:~/arm11-1 $ sudo rmmod rd-1.ko 
[email protected]:~/arm11-1 $ dmesg | tail -4
[ 1074.924256] Hello rd-1.
[ 1074.926731] t=0
[ 1074.926755] t=0
[ 1079.411175] Goodbye rd-1.
[email protected]:~/arm11-1 $ uname -r
4.4.15+
[email protected]:~/arm11-1 $ 

Any explanation, help or pointer to working Pi Zero code reading clock cycles is appreciated.
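
A possible explanation, which this thread does not confirm: both "mcr" statements in this post pass the value with an output constraint ("=r"). With that constraint GCC treats t as something the asm produces, so it is free not to load t=1 into the register first, and whatever happens to be in that register is what gets written to the control register. For a write, the value has to be an input operand. The sketch below shows the two accessors with the constraints the way round they are usually written. pmnc_write, pmnc_read, pmnc_enable and the stand-in variable fake_pmnc are names made up for illustration.

```c
#ifndef __KERNEL__
#include <assert.h>   /* only for checking the sketch on a host build */
#endif

#if defined(__arm__) && defined(__KERNEL__)
/* Inside a kernel module on the Pi Zero: real coprocessor accesses. */
static inline void pmnc_write(unsigned int value)
{
  /* mcr moves value INTO the coprocessor, so it is an input operand:
   * empty output list, then "r"(value). */
  __asm__ volatile ("mcr p15, 0, %0, c15, c12, 0" : : "r" (value));
}

static inline unsigned int pmnc_read(void)
{
  unsigned int value;
  /* mrc moves the register OUT of the coprocessor: output operand "=r". */
  __asm__ volatile ("mrc p15, 0, %0, c15, c12, 0" : "=r" (value));
  return value;
}
#else
/* Anywhere else a plain variable stands in for the register, so the
 * sketch still compiles and the call sequence can be exercised. */
static unsigned int fake_pmnc;
static inline void pmnc_write(unsigned int value) { fake_pmnc = value; }
static inline unsigned int pmnc_read(void) { return fake_pmnc; }
#endif

/* Write 1 (the enable bit) and return what reads back. */
static inline unsigned int pmnc_enable(void)
{
  pmnc_write(1u);
  return pmnc_read();
}
```

On a host build, assert(pmnc_enable() == 1u) passes trivially because of the stand-in. On the Pi, the interesting part is whether printing the return value in dmesg now shows bit 0 set.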


Hermann.

Code: Select all

//  http://stackoverflow.com/questions/3247373/how-to-measure-program-execution-time-in-arm-cortex-a8-processor?answertab=active#tab-top

/* Kernel Programming */
#define MODULE
#define LINUX
#define __KERNEL__

#include <linux/module.h>  /* Needed by all modules */
#include <linux/kernel.h>  /* Needed for KERN_ALERT */

#include <linux/delay.h>

static inline unsigned int get_cyclecount (void)
{
  unsigned int value;
  // Read CCNT Register
// http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344k/Bgbjjhaj.html
//asm volatile ("MRC p15, 0, %0, c9, c13, 0\t\n": "=r"(value));  

  __asm__ volatile ("mrc p15, 0, %0, c15, c12, 1":"=r" (value));
// http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0360f/BIHCGFCF.html
  return value;
}

static inline void init_perfcounters (int32_t do_reset, int32_t enable_divider)
{
  // in general enable all counters (including cycle counter)
  int32_t value = 1;

  // perform reset:  
  if (do_reset)
  {
    value |= 2;     // reset all counters to zero.
    value |= 4;     // reset cycle counter to zero.
  } 

  if (enable_divider)
    value |= 8;     // enable "by 64" divider for CCNT.

//  value |= 16;  // export events to external ... (a8)

// [6:4] interrupts, 0 disable (arm11)
// [10:8] clear overflows
  value |= 0x380;


  // program the performance-counter control-register:
// http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344k/Bgbdeggf.html
// asm volatile ("MCR p15, 0, %0, c9, c12, 0\t\n" :: "r"(value));  


  // enable all counters:  
// http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344k/Bgbedifc.html
//  asm volatile ("MCR p15, 0, %0, c9, c12, 1\t\n" :: "r"(0x8000000f));  


  // clear overflows:
// http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344k/Bgbidaah.html
//  asm volatile ("MCR p15, 0, %0, c9, c12, 3\t\n" :: "r"(0x8000000f));


  __asm__ volatile ("mcr p15, 0, %0, c15, c12, 0":"=r" (value));
// http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0360f/BIHCJBAA.html
}



int init_module(void)
{
  unsigned int overhead, t;


  printk(KERN_ALERT "Hello arm11-1.\n");

#if 0
  /* enable user-mode access to the performance counter*/
// http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344k/Bgbcjifb.html
//asm ("MCR p15, 0, %0, C9, C14, 0\n\t" :: "r"(1));
  __asm__ volatile ("mcr p15,  0, %0, c15,  c9, 0\n" : : "r" (1));
// 
#endif


  /* disable counter overflow interrupts (just in case)*/
// http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0344k/Bgbhgacf.html
//  asm ("MCR p15, 0, %0, C9, C14, 2\n\t" :: "r"(0x8000000f));
// 0x8 => CCNT, f => other counters
// done for arm11 in init_perfcounters()


  // init counters:
  init_perfcounters (0, 0); 


  // measure the counting overhead:
  overhead = get_cyclecount();
  overhead = get_cyclecount() - overhead;    

printk("a\n");
  t = get_cyclecount();
  udelay(200);
  t = get_cyclecount() - t;
printk("o\n");


  printk ("ov=%u\n", overhead);
  printk ("dt=%u\n", t);

  return 0;
}

void cleanup_module(void)
{
  printk(KERN_ALERT "Goodbye arm11-1.\n");
}  

MODULE_LICENSE("GPL");
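
The control value built in init_perfcounters() can be spelled out bit by bit. The sketch below follows the bit positions given in the comments of the module above (bits 0 to 3, interrupt enables in [6:4], overflow flags in [10:8]). The macro names and pmnc_value() are invented here, and the layout should be checked against the ARM1176 manual before relying on it. Note that a mask covering bits [10:8] is 0x700, whereas 0x380 covers bits [9:7].

```c
#include <assert.h>

#define PMNC_E         (1u << 0)    /* enable all counters */
#define PMNC_P         (1u << 1)    /* reset counters to zero */
#define PMNC_C         (1u << 2)    /* reset cycle counter to zero */
#define PMNC_D         (1u << 3)    /* count every 64th cycle */
#define PMNC_INT_MASK  (0x7u << 4)  /* [6:4] interrupt enables */
#define PMNC_OVF_MASK  (0x7u << 8)  /* [10:8] overflow flags */

/* Compose the value to write to the control register:
 * counters enabled, interrupts off, overflow flags cleared. */
static inline unsigned int pmnc_value(int do_reset, int enable_divider)
{
  unsigned int value = PMNC_E;

  if (do_reset)
    value |= PMNC_P | PMNC_C;
  if (enable_divider)
    value |= PMNC_D;

  value |= PMNC_OVF_MASK;           /* bits [10:8], i.e. 0x700 */

  /* The interrupt enable bits must stay 0. */
  assert((value & PMNC_INT_MASK) == 0);
  return value;
}
```

With these definitions pmnc_value(0, 0) is 0x701. The value would then be passed to the "mcr" statement as an input operand.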

joan
Posts: 14668
Joined: Thu Jul 05, 2012 5:09 pm
Location: UK

Re: Raspberry Pi Zero "cycle counter register" always return

Thu Jul 28, 2016 9:30 pm

It may be the sort of question the bare metal guys could answer. It seems pretty low level.

HermannSW

Re: Raspberry Pi Zero "cycle counter register" always return

Fri Jul 29, 2016 5:53 pm

Searching this forum (bare metal) did not give a solution.

But googling for "arm11" and "cycle counter" turned up this page with code, "Counting cycles on the Nokia N810":
http://bench.cr.yp.to/cpucycles/n810.html

Luckily the Nokia N810 is ARM11-based like the Raspberry Pi Zero, and the code worked without changes. See my initial posting for information on how to build kernel modules on the Pi Zero:

Code: Select all

[email protected]:~/cpucycles4ns-1 $ ll
total 8
-rw-r--r-- 1 pi pi 2711 Jul 29 17:12 cpucycles4ns.c
-rw-r--r-- 1 pi pi  162 Jul 29 17:12 Makefile
[email protected]:~/cpucycles4ns-1 $ cat Makefile 
obj-m += cpucycles4ns.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
[email protected]:~/cpucycles4ns-1 $ 
[email protected]:~/cpucycles4ns-1 $ make
make -C /lib/modules/4.4.15+/build M=/home/pi/cpucycles4ns-1 modules
make[1]: Entering directory '/home/pi/linux-19cf22758bad1e120ee13a5170f59df560dfcdea'
  CC [M]  /home/pi/cpucycles4ns-1/cpucycles4ns.o
/home/pi/cpucycles4ns-1/cpucycles4ns.c: In function ‘device_read’:
/home/pi/cpucycles4ns-1/cpucycles4ns.c:31:20: warning: ignoring return value of ‘copy_to_user’, declared with attribute warn_unused_result [-Wunused-result]
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /home/pi/cpucycles4ns-1/cpucycles4ns.mod.o
  LD [M]  /home/pi/cpucycles4ns-1/cpucycles4ns.ko
make[1]: Leaving directory '/home/pi/linux-19cf22758bad1e120ee13a5170f59df560dfcdea'
[email protected]:~/cpucycles4ns-1 $
Installing the kernel module (cycle counts 3 and 11) and removing it (155741306 and 155741306) both report clock cycles in dmesg:

Code: Select all

[email protected]:~/cpucycles4ns-1 $ sudo insmod cpucycles4ns.ko
[email protected]:~/cpucycles4ns-1 $ 
[email protected]:~/cpucycles4ns-1 $ sudo rmmod cpucycles4ns.ko
[email protected]:~/cpucycles4ns-1 $ 
[email protected]:~/cpucycles4ns-1 $ dmesg | egrep "^\[ (239|240)"
[ 2396.886833] cpucycles4ns starting
[ 2396.886860] cpucycles4ns creating device
[ 2396.886885] cpucycles4ns suggests mknod /dev/cpucycles4ns c 243 0
[ 2396.886894] cpucycles4ns enabling cycle counter
[ 2396.886902] cpucycles4ns 3 11
[ 2400.285792] cpucycles4ns removing device
[ 2400.285825] cpucycles4ns disabling cycle counter
[ 2400.285837] cpucycles4ns 155741306 155741306
[ 2400.285845] cpucycles4ns stopping
[email protected]:~/cpucycles4ns-1 $
What the code in the previous posting called "overhead" (the difference between two consecutive reads of the cycle counter) is 11-3=8 cycles. Unlike on the N810, one clock cycle on the Pi Zero is 1ns, given its nominal 1 GHz clock!!! The two cycle counter reads on module removal return the same value because all counters have been disabled by the preceding asm statement.
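
The arithmetic above can be captured in two small helpers. This is a sketch, assuming a free-running 32-bit counter and a known, constant clock rate. cycles_elapsed and cycles_to_ns are hypothetical names, and the 1 GHz figure used in the note below is the Pi Zero's nominal clock, not something measured here.

```c
#include <assert.h>
#include <stdint.h>

/* Difference between two reads of a 32-bit cycle counter.
 * Unsigned subtraction wraps modulo 2^32, so the result is still
 * right if the counter overflowed once between the two reads. */
static inline uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
  return end - start;
}

/* Convert a cycle count to nanoseconds for a clock of hz cycles/second.
 * The product is computed in 64 bits so it cannot overflow. */
static inline uint64_t cycles_to_ns(uint32_t cycles, uint64_t hz)
{
  assert(hz != 0);
  return (uint64_t)cycles * 1000000000ULL / hz;
}
```

cycles_elapsed(3, 11) gives the 8 cycles mentioned above, and cycles_to_ns(8, 1000000000ULL) gives 8, i.e. 8ns at 1 GHz.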


I am so happy to have working code to start and read the cycle counter. Now I need to understand what this code does differently from the code in my previous posting. Also, I don't need the device "/dev/cpucycles4ns".

Hermann.

P.S:
See related 'Raspberry Pi Zero 1ns "cycle counter register"' thread with code samples:
viewtopic.php?f=63&t=155830