User avatar
HermannSW
Posts: 2022
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

6line "copyin" microsecond timestamping

Sun Feb 02, 2020 10:55 am

I did need to timestamp several framebuffer related code alternatives for finding most efficient.
I was used to Linux clock_gettime(), but did not need its nanosecond precision.
So I developed below minimal "copyin" timestamping.
Only 6 lines of code (compared to one include line and external file).
And microsecond resolution allows to store timestamp as long, overflow after 35min is OK for my applications.
Implementation does not look nice.
But even with 2 character indentation I use, there is always space in leftmost two colums for "_" and "__" measuring macros.
"___(N, BLK)" macro is simple solution for repeating N times code block BLK.
"prev_us" is last timestamp taken (with "_" or "__" macro).
I did compile my code with "gcc -O6 -Wall -pedantic -Wextra" without any warnings.

Code: Select all

#include <time.h>
long prev_us = 0, delta_us = 0;
#define _  {long a=prev_us;struct timespec t;clock_gettime(CLOCK_REALTIME,&t);\
  prev_us = 1000000 * t.tv_sec + t.tv_nsec / 1000; delta_us = prev_us - a;}
#define __ _ printf("%ld\n", delta_us);
#define ___(n, blk) for(int i=0; i<n; ++i) { blk }

Here is part of code that illustrates how to use:

Code: Select all

___(5,
_ for (y = (vinfo.yres - 480) / 2; y < (vinfo.yres + 480) / 2; y++)
  {
    unsigned *row = fbp + y * finfo.line_length/4;
    for (x = (vinfo.xres - 640) / 2; x < (vinfo.xres + 640) / 2; x++)
      row[x] = 0xC31C4A;
  }
__

_ msync(fbp, finfo.smem_len, MS_SYNC);
__)

This is the output generated by just that part (microsecond time deltas):

Code: Select all

3252
6
3180
2
3178
3
3241
6
3195
8
⇨https://stamm-wilbrandt.de/en/Raspberry_camera.html

https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://stamm-wilbrandt.de/github_repo_i420toh264
https://github.com/Hermann-SW/fork-raspiraw
https://twitter.com/HermannSW

User avatar
buja
Posts: 542
Joined: Wed Dec 31, 2014 8:21 am
Location: Netherlands

Re: 6line "copyin" microsecond timestamping

Sun Feb 02, 2020 11:17 am

HermannSW wrote:
Sun Feb 02, 2020 10:55 am
...
Implementation does not look nice.
...
Using underscores as identifiers, not so nice is bit of an understatement I think. :D
Quick hack from the hospital?

jahboater
Posts: 5203
Joined: Wed Feb 04, 2015 6:38 pm
Location: West Dorset

Re: 6line "copyin" microsecond timestamping

Sun Feb 02, 2020 12:24 pm

GCC has statement expressions:

Code: Select all

#define usec() ({ struct timespec t; clock_gettime( CLOCK_MONOTONIC_RAW, &t ); \
                  1000000 * (long)t.tv_sec + t.tv_nsec / 1000; })
long start = usec();
printf( "%ld usec\n", usec() - start );
Not strictly portable but other compilers such as Clang accept them too.

Since "long" is only 32 bits for Pi/Raspbian, I would use int64_t instead.

CLOCK_MONOTONIC is a much better clock for timing things, because it cannot go backwards!
CLOCK_MONOTONIC_RAW is the hardware clock and is not affected by the continual adjustments by NTP.
"We are in the beginning of a mass extinction, and all you can talk
about is money and fairy tales of eternal economic growth."
- Greta Thunberg

User avatar
HermannSW
Posts: 2022
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: 6line "copyin" microsecond timestamping

Sun Feb 02, 2020 12:29 pm

buja wrote:
Sun Feb 02, 2020 11:17 am
Using underscores as identifiers, not so nice is bit of an understatement I think. :D
Agreed in general, but they are only short time in source code for profiing areas of interest, then they will be removed.
Quick hack from the hospital?
Sure ;-)

Fiz water bottle is nice part of my WoodenBoardPi now:
Image
⇨https://stamm-wilbrandt.de/en/Raspberry_camera.html

https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://stamm-wilbrandt.de/github_repo_i420toh264
https://github.com/Hermann-SW/fork-raspiraw
https://twitter.com/HermannSW

User avatar
HermannSW
Posts: 2022
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: 6line "copyin" microsecond timestamping

Sun Feb 02, 2020 12:40 pm

jahboater wrote:
Sun Feb 02, 2020 12:24 pm
Since "long" is only 32 bits for Pi/Raspbian, I would use int64_t instead.
int64_t will not hurt here because of precision reduced from nanoseconds to microseconds.
But only needed for avoiding overflow after 35min -- int64_t will overflow after 292471 years ...
CLOCK_MONOTONIC is a much better clock for timing things, because it cannot go backwards!
CLOCK_MONOTONIC_RAW is the hardware clock and is not affected by the continual adjustments by NTP.
Interesting.
GCC has statement expressions:

Code: Select all

#define usec() ({ struct timespec t; clock_gettime( CLOCK_MONOTONIC_RAW, &t ); \
                  1000000 * (long)t.tv_sec + t.tv_nsec / 1000; })
long start = usec();
printf( "%ld usec\n", usec() - start );
"printf()" is a "slow" fuction, so I would want to to avoid it being executed before the timestamp gets taken.

The macros I showed can be used differently to get timestamps at different places with no slow printf() in between.
The "__" macro calls "_" macro for taking timestamp first, before doing printf(), for that reason:

Code: Select all

...
_ t0=prev_us;
...
_ t1=prev_us;
...
_ t2=prev_us;
...
⇨https://stamm-wilbrandt.de/en/Raspberry_camera.html

https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://stamm-wilbrandt.de/github_repo_i420toh264
https://github.com/Hermann-SW/fork-raspiraw
https://twitter.com/HermannSW

jahboater
Posts: 5203
Joined: Wed Feb 04, 2015 6:38 pm
Location: West Dorset

Re: 6line "copyin" microsecond timestamping

Sun Feb 02, 2020 2:08 pm

HermannSW wrote:
Sun Feb 02, 2020 12:40 pm
"printf()" is a "slow" function, so I would want to to avoid it being executed before the timestamp gets taken.
I know, it was just to illustrate how to use usec().

usec() behaves like a pure function. Its much better than relying on global variables in a macro - yuk!
Last edited by jahboater on Sun Feb 02, 2020 2:46 pm, edited 2 times in total.
"We are in the beginning of a mass extinction, and all you can talk
about is money and fairy tales of eternal economic growth."
- Greta Thunberg

jahboater
Posts: 5203
Joined: Wed Feb 04, 2015 6:38 pm
Location: West Dorset

Re: 6line "copyin" microsecond timestamping

Sun Feb 02, 2020 2:11 pm

HermannSW wrote:
Sun Feb 02, 2020 12:40 pm
jahboater wrote:
Sun Feb 02, 2020 12:24 pm
Since "long" is only 32 bits for Pi/Raspbian, I would use int64_t instead.
int64_t will not hurt here because of precision reduced from nanoseconds to microseconds.
tv_sec is a "time_t" and you are multiplying it by a million. It will overflow 32-bits now.

Code: Select all

>> time
Bin (63) 00000000 00000000 00000000 00000000 | 01011110 00110110 | 11011001 | 00001011 (0)
Oct 13615554413,  Dec 1580652811,  Hex 5E36D90B
>> time * 1000000
Bin (63) 00000000 00000101 10011101 10011000 | 01101001 10111010 | 11101000 | 10000000 (0)
Oct 54731415156564200,  Dec 1580652818000000,  Hex 59D9869BAE880
>> 
The fact that you divide the tv_nsec field by 1000 does not help.
"We are in the beginning of a mass extinction, and all you can talk
about is money and fairy tales of eternal economic growth."
- Greta Thunberg

User avatar
HermannSW
Posts: 2022
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: 6line "copyin" microsecond timestamping

Sun Feb 02, 2020 8:11 pm

Thanks jahboater for identifying this overflow bug.
The tv_sec component of the timespec structure is unix epoch time, seconds since 1/1/1970, just now:

Code: Select all

[email protected]:~ $ date +%s
1580673682
[email protected]:~ $

I had to add a new line in order to fix it -- base_sec avoids the overflow for 35 minutes:

Code: Select all

[email protected]:~ $ diff fb.c.posted fb.c
10c10
< long prev_us = 0, delta_us = 0;
---
> long prev_us = 0, delta_us = 0, base_sec = 0;
12c12,13
<   prev_us = 1000000 * t.tv_sec + t.tv_nsec / 1000; delta_us = prev_us - a;}
---
>   if (!base_sec) { base_sec = t.tv_sec; }\
>   prev_us=1000000*(t.tv_sec-base_sec)+t.tv_nsec/1000; delta_us = prev_us - a;}
[email protected]:~ $ 



So this is the complete corrected version:

Code: Select all

#include <time.h>
long prev_us = 0, delta_us = 0, base_sec = 0;
#define _  {long a=prev_us;struct timespec t;clock_gettime(CLOCK_REALTIME,&t);\
  if (!base_sec) { base_sec = t.tv_sec; }\
  prev_us=1000000*(t.tv_sec-base_sec)+t.tv_nsec/1000; delta_us = prev_us - a;}
#define __ _ printf("%ld\n", delta_us);
#define ___(n, blk) for(int i=0; i<n; ++i) { blk }

I wanted to know the cost of calling clock_gettime() and wrote below code reporting the cost in nanoseconds, far less than 1000ns or 1us. This nicely fits the 1us precision of the presented timestamping method:

Code: Select all

[email protected]:~ $ ./tt.0
521
313
260
260
313
[email protected]:~ $ 

tt.0.c

Code: Select all

#include <stdio.h>
#include <time.h>
struct timespec t0,t1;
#define ___(n, blk) for(int i=0; i<n; ++i) { blk }
int main()
{
___(5,
  clock_gettime(CLOCK_REALTIME, &t0);
  clock_gettime(CLOCK_REALTIME, &t1);
  printf("%ld\n", 1000000000*(t1.tv_sec-t0.tv_sec)+(t1.tv_nsec-t0.tv_nsec));
)

  return 0;
}
⇨https://stamm-wilbrandt.de/en/Raspberry_camera.html

https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://stamm-wilbrandt.de/github_repo_i420toh264
https://github.com/Hermann-SW/fork-raspiraw
https://twitter.com/HermannSW

jahboater
Posts: 5203
Joined: Wed Feb 04, 2015 6:38 pm
Location: West Dorset

Re: 6line "copyin" microsecond timestamping

Sun Feb 02, 2020 8:34 pm

base_sec is a good idea.
Do you get smoother, more regular times with CLOCK_MONOTONIC_RAW ?
NTP (or systemd) constantly micro adjusts the clock so that there are no sudden "steps" in the system time that could confuse software.
And the time-of-day clock can go backwards of course!

I did this:-

Code: Select all

static uint64_t
counter( void )
{
  struct timespec now;
  clock_gettime( CLOCK_MONOTONIC_RAW, &now );
  return (uint64_t)now.tv_sec * 1000000000ULL + (uint64_t)now.tv_nsec;
}
to return a nano second resolution counter, or as good as the system can provide.
64-bit arithmetic will not overflow.
The Pi can do the entire expression in a single "multiply-add" instruction in aarch64 mode, which is pretty cool I think!
"We are in the beginning of a mass extinction, and all you can talk
about is money and fairy tales of eternal economic growth."
- Greta Thunberg

User avatar
HermannSW
Posts: 2022
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: 6line "copyin" microsecond timestamping

Sun Feb 02, 2020 8:46 pm

jahboater wrote:
Sun Feb 02, 2020 8:34 pm
base_sec is a good idea.
Do you get smoother, more regular times with CLOCK_MONOTONIC_RAW ?
No, I used that clock with last nanosecond precision program

Code: Select all

[email protected]:~ $ diff tt.0.c tt.1.c
8,9c8,9
<   clock_gettime(CLOCK_REALTIME, &t0);
<   clock_gettime(CLOCK_REALTIME, &t1);
---
>   clock_gettime(CLOCK_MONOTONIC_RAW, &t0);
>   clock_gettime(CLOCK_MONOTONIC_RAW, &t1);
[email protected]:~ $ 

and got weird results:

Code: Select all

[email protected]:~ $ ./tt.0
521
469
364
312
260
[email protected]:~ $ ./tt.1
2136
1927
1407
1250
1250
[email protected]:~ $ 

I think I will go with CLOCK_REALTIME with repeated tests -- if the measured time deltas agree then all is fine.
⇨https://stamm-wilbrandt.de/en/Raspberry_camera.html

https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://stamm-wilbrandt.de/github_repo_i420toh264
https://github.com/Hermann-SW/fork-raspiraw
https://twitter.com/HermannSW

jahboater
Posts: 5203
Joined: Wed Feb 04, 2015 6:38 pm
Location: West Dorset

Re: 6line "copyin" microsecond timestamping

Sun Feb 02, 2020 8:55 pm

HermannSW wrote:
Sun Feb 02, 2020 8:46 pm
and got weird results:

Code: Select all

[email protected]:~ $ ./tt.0
521
469
364
312
260
[email protected]:~ $ ./tt.1
2136
1927
1407
1250
1250
[email protected]:~ $ 
That's interesting.
I speculate that the real hardware clock resolution on the Pi is quite course, maybe 1us .
The finer changes you are seeing with the real time clock may be the result of the micro adjustments.
The manual page says:-

Code: Select all

CLOCK_MONOTONIC
       Clock that cannot be set and  represents  monotonic  time  since  some  unspecified
       starting  point.   This  clock is not affected by discontinuous jumps in the system
       time (e.g., if the  system  administrator  manually  changes  the  clock),  but  is
       affected by the incremental adjustments performed by adjtime(3) and NTP.

CLOCK_MONOTONIC_COARSE (since Linux 2.6.32; Linux-specific)
       A faster but less precise version of CLOCK_MONOTONIC.  Use when you need very fast,
       but not fine-grained timestamps.

CLOCK_MONOTONIC_RAW (since Linux 2.6.28; Linux-specific)
       Similar to CLOCK_MONOTONIC, but provides access to a raw hardware-based  time  that
       is  not subject to NTP adjustments or the incremental adjustments performed by adj-
       time(3).
As you can see, it says "provides access to a raw hardware-based time", so my guess is that its as accurate as the Pi can really provide.
"We are in the beginning of a mass extinction, and all you can talk
about is money and fairy tales of eternal economic growth."
- Greta Thunberg

User avatar
HermannSW
Posts: 2022
Joined: Fri Jul 22, 2016 9:09 pm
Location: Eberbach, Germany
Contact: Website Twitter YouTube

Re: 6line "copyin" microsecond timestamping

Wed Feb 05, 2020 3:34 pm

The profiling can be used as is, or as basis for new analysis tool.
In thread "High framerate camera multiple displays on framebuffer"
https://www.raspberrypi.org/forums/view ... 3&t=264109
awk scripts separate the "__" macro duration outputs (odd lines report "fread()" durations, even lines report "draw frame" durations).
"paste" command for side-by-side output of three command outputs is handy as well.

Code: Select all

[email protected]:~ $ ./yuvrgbfb 6000 -a
compiling rgb2fb.c
     fread[us];      fill[us];   530 frames were captured at 90fps
      1   0xxx;    114   65xx;   frame delta[us]
     54   2xxx;    269   66xx;      1 
    281   3xxx;    119   67xx;      2 11077
     34   4xxx;     23   68xx;     29 11079
    119   5xxx;      3   69xx;    177 11080
     36   6xxx;      2   70xx;    234 11081
      5   7xxx;              ;     74 11082
              ;              ;     11 11083
              ;              ;      1 11084
              ;              ;      1 11086
[email protected]:~ $
⇨https://stamm-wilbrandt.de/en/Raspberry_camera.html

https://github.com/Hermann-SW/Raspberry_v1_camera_global_external_shutter
https://stamm-wilbrandt.de/github_repo_i420toh264
https://github.com/Hermann-SW/fork-raspiraw
https://twitter.com/HermannSW

Return to “C/C++”