sibiquin
Posts: 21
Joined: Thu Nov 01, 2012 4:45 pm

Would this help overruns due to USB activity?

Fri Jul 27, 2018 3:04 pm

I have been fighting to get a reliable data acquisition system running on the Pi. Data comes in at 12 Mbps on the I2S interface and is lifted into user memory space by the ALSA library. (The data is not actually audio, but ALSA does not care; it just moves bits from the serial buffers into user space.) A user-space process writes the data to a USB disk. Occasionally, though, ALSA signals a buffer overrun, which I can correlate with disk write activity. I suspect that during USB writes the I2S interrupts are delayed long enough that the hardware buffer overruns. Even a single overrun is fatal for this application. Making the buffers the maximum size allowed by ALSA does not avoid the problem.
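As a rough feel for the timing budget: the stall a buffer can absorb before it overruns is its size divided by the incoming data rate. The sketch below only does that arithmetic. The helper name `headroom_ms` and the 64 KiB figure in the note after it are illustrative assumptions, not values from this setup.

```cpp
#include <cassert>
#include <cstddef>

// Time, in milliseconds, that a buffer of `buffer_bytes` can absorb at an
// incoming rate of `bits_per_second` before it overruns.
// (Hypothetical helper, for illustration only.)
double headroom_ms(std::size_t buffer_bytes, double bits_per_second) {
    assert(bits_per_second > 0.0);
    return static_cast<double>(buffer_bytes) * 8.0 / bits_per_second * 1000.0;
}
```

For a hypothetical 64 KiB buffer at 12 Mbps, `headroom_ms(65536, 12e6)` comes to roughly 43.7 ms, so any stall longer than that would overrun it.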

No amount of disk system tuning seems to help (and writing to the CF card is even worse). Will this RT kernel help in that scenario? If interrupts are disabled during USB activity (by design), then I suspect not. I also suspect that moving the user-space process to a dedicated core would not help, as the issue is interrupt service time, not CPU dispatching of the user-space process.

Thoughts?

jdb
Raspberry Pi Engineer & Forum Moderator
Posts: 2468
Joined: Thu Jul 11, 2013 2:37 pm

Re: New RT (Real Time) kernel branch

Fri Jul 27, 2018 3:21 pm

If you're using a single thread to both read the sound samples and write the data out to disk, your program will block while the write completes. By default, USB drives are written to synchronously (to help guard against surprise removal).

You could try mounting the USB drive with the "async" mount option enabled, or test writing to the SD card which will do async writes by default.
Rockets are loud.
https://astro-pi.org

sibiquin
Posts: 21
Joined: Thu Nov 01, 2012 4:45 pm

Re: New RT (Real Time) kernel branch

Fri Jul 27, 2018 8:18 pm

The user-space application is multithreaded. One thread calls ALSA to get the next buffer; as soon as it comes back, that buffer is handed off to a processing thread and we call right back into ALSA for the next buffer. The disk writes do not happen on the same thread as the I2S reads, and yet they still cause delays in those reads. We verified with detailed timing that there is almost no time between getting a buffer from ALSA and calling back for the next one. The reading is completely decoupled from disk writing at the application layer, though apparently not in the OS.

karrika
Posts: 1284
Joined: Mon Oct 19, 2015 6:21 am
Location: Finland

Re: New RT (Real Time) kernel branch

Sun Jul 29, 2018 4:33 am

For high speed streaming I always use 3 buffers.

One buffer for reading in data. When it is full continue with buffer 2. When it is full continue with buffer 3.

Once a buffer is full start processing it.

Stream out processed buffers in the same fashion as while reading.

All these 3 buffers are shared by three processes without any locking or mutex stuff.

Oh, one more thing. Use DMA for the transfers if possible.
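A minimal sketch of one lock-free handoff between two stages, in the spirit of the three-buffer scheme above. The post does not say how a full buffer is signalled to the next stage, so the C++ atomics with acquire/release ordering used here are an assumption, as are the class name `BufferRing` and the 4096-byte slot size. It covers a single producer and a single consumer only; a reader, processor and writer chain would need one such handoff per stage boundary.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sizes, chosen for illustration only.
constexpr std::size_t kSlots = 3;         // "3 buffers" from the post
constexpr std::size_t kSlotBytes = 4096;  // assumed buffer size

// Single-producer / single-consumer ring of fixed buffers. The producer
// fills slots in the order 0, 1, 2, 0, ... and the consumer drains them in
// the same order. There is no mutex: each side advances only its own counter.
class BufferRing {
public:
    // Producer: slot to fill next, or nullptr if every slot is still full
    // (the overrun condition).
    std::uint8_t* acquire_write() {
        const std::size_t w = written_.load(std::memory_order_relaxed);
        const std::size_t r = read_.load(std::memory_order_acquire);
        if (w - r == kSlots) return nullptr;
        return slots_[w % kSlots].data();
    }
    // Producer: publish the slot that was just filled.
    void commit_write() {
        const std::size_t w = written_.load(std::memory_order_relaxed);
        assert(w - read_.load(std::memory_order_acquire) < kSlots);
        written_.store(w + 1, std::memory_order_release);
    }
    // Consumer: oldest full slot, or nullptr if none is ready yet.
    const std::uint8_t* acquire_read() {
        const std::size_t r = read_.load(std::memory_order_relaxed);
        const std::size_t w = written_.load(std::memory_order_acquire);
        if (w == r) return nullptr;
        return slots_[r % kSlots].data();
    }
    // Consumer: hand the slot back to the producer.
    void release_read() {
        const std::size_t r = read_.load(std::memory_order_relaxed);
        assert(r != written_.load(std::memory_order_acquire));
        read_.store(r + 1, std::memory_order_release);
    }

private:
    std::array<std::array<std::uint8_t, kSlotBytes>, kSlots> slots_{};
    std::atomic<std::size_t> written_{0};  // slots published so far
    std::atomic<std::size_t> read_{0};     // slots released so far
};
```

The producer treats a null return from `acquire_write()` as an overrun; the consumer simply polls or sleeps when `acquire_read()` returns null.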

sibiquin
Posts: 21
Joined: Thu Nov 01, 2012 4:45 pm

Re: New RT (Real Time) kernel branch

Mon Jul 30, 2018 11:06 pm

I did not describe the buffering scheme in detail, but yes, it is triple buffered (reader, processor, writer). All the threads are in a single process; I wonder whether making them separate processes would matter. From appearances, ALL interrupts get blocked during disk I/O, so I doubt that it would help.

Eirikur
Posts: 88
Joined: Sun Sep 09, 2018 9:43 pm

Re: New RT (Real Time) kernel branch

Sun Sep 09, 2018 9:51 pm

I want to thank PhilE and everyone who worked/works on this kernel. I built it today and it solved my problems and I'm sure it will help out in other cases in my music keyboard application.

Ironically, my problem was audio stuttering caused by the heartbeat LED blinking that enabling direct booting from USB seems to bring along with it. I've not seen this mentioned anywhere! I did try setting the activity LED to a different trigger or no trigger, but I still got the stuttering in exactly the same pattern as the beeps.

So, the RT kernel saved me from having to bother the fellow who crammed USB booting into that tiny amount of space.

Cheers!

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 3475
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: New RT (Real Time) kernel branch

Mon Sep 10, 2018 7:59 am

Thanks, but I can't claim credit for anything except a bit of guidance and giving the project semi-official blessing. Tiejun Chen is the person doing all of the work.

gizurieta
Posts: 1
Joined: Fri Feb 08, 2019 9:26 am

Re: New RT (Real Time) kernel branch

Fri Feb 08, 2019 9:34 am

I found this work from Mauro Riva @ https://hackaday.io/project/123415-real ... pi/details

"The Preempt-RT patched Raspbian kernel (4.14.y-rt) offers a solution to reduce the kernel latency. But, you lose a lot of CPU and communication performance. The data transfer over Ethernet is reduced to 34% and the CPU performance up to 12%. If your application is sampling sensors really fast and it doesn't do a lot of "math" or/and data transfers, probably you need to patch the Raspberry Pi kernel. Otherwise, you should use the standard kernel"

I would like somebody here to comment on whether his conclusions, or his method using RT-Tools, have points worth highlighting or improving.

Also look at the latency tests conducted by Mete Balci and published at
https://metebalci.com/blog/latency-of-r ... .9-kernel/

Note that the comparison seems to be kernel 4.9.47-rt37-v7+ versus 4.14.y-rt. The tests by Mr. Riva and Mr. Balci show similar latency on cores 2, 3 and 4, but differ considerably on core 1. Mr. Balci found an issue and was forced to apply a patch:

"A final solution instead of a workaround
The solution, of course, would be to understand the crashes of the FIQ exception handler and to fix the underlying problem. We found that the lockup occurs when the IRQ handler thread gets preempted while it holds the FIQ spin lock. The solution, thus, is to disable the IRQ while the FIQ spin lock is held, irrespective of whether the interrupt handler is threaded or not. The patch that was created for this purpose introduces two new macros

fiq_fsm_spin_lock_irqsave
fiq_fsm_spin_unlock_irqrestore

to facilitate the implementation. The complete patch is available here. To better compare the two kernel versions – original kernel with FIQ disabled vs. patched kernel with FIQ enabled – a shadow system was installed at rack #b, slot #3 that runs the patched kernel. So far, everything runs well."

Reference: https://www.osadl.org/Single-View.111+M ... c57.0.html

Paleloshow
Posts: 13
Joined: Mon Feb 11, 2019 9:23 pm

Re: New RT (Real Time) kernel branch

Wed May 01, 2019 9:53 pm

I have managed to install the Preempt RT patch for my Pi 3B+, following the instructions from this github user:

https://github.com/thanhtam-h/rpi23-4.9.80-rt

But I want to know whether just installing the patch on the Pi is enough to make it run in real time. I intend to use the Pi as a digital relay, so I need a high sampling rate and the ability to run complex algorithms between samples. I have also used the configuration and code provided by tjrob (his post is quoted below):
tjrob wrote:
Tue Dec 11, 2018 2:45 am
Linux is most definitely not a real-time operating system, and if you just use the default configuration you'll find you occasionally have latencies as large as 50 milliseconds. That's pretty hopeless for anything that needs real-time response. You can certainly install some real-time OS, or write a kernel module -- those require real expertise and are a major effort. If the requirements are such that this approach can meet them, it will be much easier. Note that the results of using some RTOS, or writing a kernel module or "bare metal" code may be less than expected, as the RPi hardware is not capable of disabling interrupts on a single core (tests show that the Linux Kernel routine local_irq_disable() disables interrupts on all cores).

I describe how to achieve latencies less than 3 microseconds 99+% of the time, with a worst-case latency of 41 microseconds. This is on a Raspberry Pi 3B+ running 2018-06-27-raspbian-stretch-lite, without any kernel drivers or other exotica. Note that on single-core models the results will be much worse. While this is not "hard" real time, it is sufficient for many applications. My applications are for a scientific instrument and involve sampling hardware at 1,000 Hz and writing each result to a socket.

By "latency" I mean the time difference between the time that "something" happens, and the time the program knows that it happened. Here "something" could be an edge on a GPIO, a time-delay ends, a specific time of day, etc. As best I can tell, large latencies are due to an interrupt causing the kernel to schedule some other task. Because of that, worst-case latencies do not add, and during one loop the code can call several routines that have latencies, but it essentially never happens that more than one of them gets a large latency. The RPi has a hardware counter that increments at 1 MHz, so measuring latency with 1 microsecond (us) resolution is easy.

The idea is to dedicate one core on a RPi 3B+ to the real-time process, and then write it as a simple loop doing whatever it needs to do:

Code:

initialize
for(;;) {
    wait for connection to the socket
    for(;;) {
        wait for GPIO edge indicating ADC data are ready
        read ADC channels
        fprintf(socket, ...)
    }
    close socket
}
Yes, there is time to use fprintf(). But using stdout to an ssh connection is barely possible at 1 kHz, while using a bare TCP connection leaves a lot more headroom.

How to do it
  1. add this argument to /boot/cmdline.txt:
    isolcpus=3
    This prevents Linux from scheduling processes on core 3. But interrupts still happen on it, and there is an essential kernel task running on it. Still, it is a good start, but not at all sufficient.
  2. Set the process's CPU affinity to 3. See attached code.
  3. Disable turbo mode. Otherwise the OS will change the CPU clock frequency, and that screws up the SPI clock and the overall timing. See attached code.
  4. Set the process to real-time FIFO scheduling, with maximum priority. See attached code.
  5. Permit real-time processes to use 100% of the CPU. Without this you'll get 50 millisecond delays once per second. See attached code.
Attached are Realtime.h and Realtime.cc that implement the code needed in your process for #2-5. Just call Realtime::setup() during initialization.
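The attached Realtime.cc is not reproduced in this thread, so the following is only a sketch of how steps 2, 4 and 5 might look with standard Linux calls. The function names are hypothetical and this is not tjrob's code. Step 3 (turbo mode) is left out because the post does not say how the attached code handles it.

```cpp
#include <cassert>
#include <cstdio>
#include <sched.h>

// Step 2: pin the calling thread to one core (core 3 in the post).
// Returns 0 on success, -1 on failure.
int pin_to_core(int core) {
    if (core < 0 || core >= CPU_SETSIZE) return -1;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

// Step 4: switch to SCHED_FIFO at the maximum priority.
// Needs root or CAP_SYS_NICE. Returns 0 on success, -1 on failure.
int use_fifo_max_priority() {
    sched_param param{};
    param.sched_priority = sched_get_priority_max(SCHED_FIFO);
    if (param.sched_priority < 0) return -1;
    return sched_setscheduler(0, SCHED_FIFO, &param);
}

// Step 5: lift the real-time throttle so RT tasks may use 100% of the CPU.
// Writing -1 to this file disables the limit. Needs root.
int disable_rt_throttling(
    const char* path = "/proc/sys/kernel/sched_rt_runtime_us") {
    assert(path != nullptr);
    std::FILE* f = std::fopen(path, "w");
    if (f == nullptr) return -1;
    int rc = (std::fputs("-1\n", f) >= 0) ? 0 : -1;
    if (std::fclose(f) != 0) rc = -1;  // write errors surface on flush
    return rc;
}
```

A process would call `pin_to_core(3)`, `use_fifo_max_priority()` and `disable_rt_throttling()` once during initialization and check each return value, since all three can fail without sufficient privileges.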

Also attached is latency.cc to measure the latencies. Note that to measure the GPIO latency it needs GPIO22 connected to GPIO23. Here is a screenshot of an 18-hour run on a Raspberry Pi 3B+:
Screen Shot 2018-12-10 at 5.39.52 PM.png

The line
0us: 0 76961522 31486427 9058 ...
Means that no sample had a 0-microsecond latency, 76961522 samples had a 1-microsecond latency, 31486427 samples had a 2-microsecond latency, etc. See the comments in the code for an explanation of what is measured.
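The attached latency.cc is not shown here either. As a sketch of the bookkeeping behind such a line, the class below keeps one bin per microsecond of latency. The class name, the bin count of 64 and the ten-bins-per-row print layout are assumptions; how each latency is measured is left to the caller.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Histogram with one bin per microsecond of latency. Latencies at or above
// kBins microseconds are clamped into the last bin.
class LatencyHistogram {
public:
    static constexpr std::size_t kBins = 64;  // assumed bin count

    void record(std::uint64_t latency_us) {
        const std::size_t bin =
            latency_us < kBins ? static_cast<std::size_t>(latency_us) : kBins - 1;
        ++counts_[bin];
        if (latency_us > worst_us_) worst_us_ = latency_us;
    }

    std::uint64_t count(std::size_t bin) const {
        assert(bin < kBins);
        return counts_[bin];
    }

    std::uint64_t worst_us() const { return worst_us_; }

    // Print ten bins per row, each row labelled with its first bin.
    void print(std::FILE* out) const {
        for (std::size_t row = 0; row < kBins; row += 10) {
            std::fprintf(out, "%zuus:", row);
            for (std::size_t i = row; i < row + 10 && i < kBins; ++i)
                std::fprintf(out, " %llu",
                             static_cast<unsigned long long>(counts_[i]));
            std::fputc('\n', out);
        }
    }

private:
    std::array<std::uint64_t, kBins> counts_{};
    std::uint64_t worst_us_ = 0;
};
```

The measurement loop would call `record()` once per sample and `print(stdout)` at the end of the run.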
I have a neural network, already trained, for Iris classification. Using tjrob's real-time code, the neural network ran in 14 ms on average, but I would like to make it run even faster, in something like 1 or 2 ms. Does anyone have suggestions on how to do it?

If you have sample real-time code that uses this patch, I would really appreciate it.

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 3475
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: New RT (Real Time) kernel branch

Thu May 02, 2019 8:47 am

Paleloshow wrote:
Wed May 01, 2019 9:53 pm
I have a neural network, already trained, for Iris classification. Using tjrob's real-time code, the neural network ran in 14 ms on average, but I would like to make it run even faster, in something like 1 or 2 ms. Does anyone have suggestions on how to do it?
What makes you think that your code would run faster with a different scheduler? You need to understand the difference between throughput and latency.

I work best when given one task at a time, taking each one to completion before starting the next. That is how I achieve my highest throughput. However, sometimes a task involves waiting around for something - a long compile, perhaps, or an email response from someone - so it makes sense to task switch while waiting otherwise I would be idle. There is an overhead in task switching - when that compile finishes or the email arrives it takes a while to get back into the original task, but as long as I don't get interrupted too frequently my overall throughput is increased.

Accepting an increased latency can improve overall throughput. If a particularly urgent task arrives, immediately dropping whatever I'm doing - possibly mid-sentence - to complete that task minimises the latency of handling that urgent request, but it may take longer to resume the original task, causing the overall throughput to decrease. Tuning the system to minimise latency can actually increase overheads: you notice a spill on a work surface and wish to wipe it up but there is another object on the surface. With a cloth in one hand, the quickest way to proceed is probably to pick up the object in the other hand (assuming you have the use of two hands...), wipe up the spill, and then put the object down again, but for that short time you are unable to answer the telephone. A slower but more interruptible approach is to move the object somewhere else, wipe, then move it back.

RT kernels aim to improve responsiveness to external events by minimising latency. They do this by moving work from non-interruptible context to high priority threads. What they don't do is improve throughput, and they make virtually no difference if only a single task is running per core with no I/O. Neural network code is likely to be limited only by CPU speed and memory bandwidth, not I/O. The only way to get a speed up of a factor of 10 is to improve the algorithm or increase the clock speeds, neither of which is changed by an RT kernel.

Paleloshow
Posts: 13
Joined: Mon Feb 11, 2019 9:23 pm

Re: New RT (Real Time) kernel branch

Thu May 02, 2019 12:08 pm

Thanks for the response. I am new to RT and have only recently started using it for my Master's research project. I plan to use the Pi as a digital relay: I need a fast acquisition time to read voltage and current signals, with a sampling frequency of around 1 or 2 kHz. That means 0.5 to 1 ms between samples, and during this time I need to send the buffered data to a protection algorithm, which makes the necessary measurements. The idea is to be able to run ANY algorithm, including ones that use neural networks or fuzzy logic. This project is based on another one that used a BeagleBone Black instead of a Pi, and that project managed to sample and analyse the data with 0.5 ms between samples. Unfortunately I have a limited budget, so I need to get this working on the Pi. Any suggestions?

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 3475
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: New RT (Real Time) kernel branch

Thu May 02, 2019 12:20 pm

To see where the effort needs to go, split the problem in two: replace the wait & capture part with something that uses fixed samples (or pre-captured samples in memory), then see how quickly your processing (prediction?) can run. Only if it's quick enough to handle samples in realtime (faster than realtime, actually) is it worth thinking about an RT kernel. If it isn't fast enough, how close is it? Pis can usually be overclocked by quite a margin, provided they have a decent power supply and enough ventilation.
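One way to run that experiment is sketched below: time the processing step over samples already held in memory and compare the result with the sample period. The function names are hypothetical, and `process` stands in for whatever prediction code is being measured.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Mean wall-clock time, in microseconds, of one call to `process` over a
// block of pre-captured samples held in memory.
double mean_process_time_us(
    const std::function<void(const std::vector<double>&)>& process,
    const std::vector<double>& samples,
    std::size_t iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) process(samples);
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}

// Time between samples, in microseconds, for a given sampling rate.
double sample_period_us(double sample_rate_hz) {
    assert(sample_rate_hz > 0.0);
    return 1e6 / sample_rate_hz;
}

// True if the processing time fits inside one sample period.
bool fits_in_realtime(double process_us, double sample_rate_hz) {
    return process_us < sample_period_us(sample_rate_hz);
}
```

With the figures quoted in this thread, a 14 ms run against a 1 kHz sampling rate means `fits_in_realtime(14000.0, 1000.0)` is false: the processing takes about 14 sample periods.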

Paleloshow
Posts: 13
Joined: Mon Feb 11, 2019 9:23 pm

Re: New RT (Real Time) kernel branch

Thu May 02, 2019 8:10 pm

PhilE wrote:
Thu May 02, 2019 12:20 pm
To see where the effort needs to go, split the problem in two: replace the wait & capture part with something that uses fixed samples (or pre-captured samples in memory), then see how quickly your processing (prediction?) can run. Only if it's quick enough to handle samples in realtime (faster than realtime, actually) is it worth thinking about an RT kernel. If it isn't fast enough, how close is it? Pis can usually be overclocked by quite a margin, provided they have a decent power supply and enough ventilation.
I have actually acquired a really good power source for my Pi: 5 V/5 A. It delivers enough power and it's stable. As for the RT kernel, I am using Preempt RT. I have considered using Xenomai, but its API is too complicated for me to understand. Preempt RT is easier: there is no API and I just need to set the priorities to maximum. As for the neural network, its samples come from an iris.data file. I am also using a C library called Genann, which is a lightweight neural network library. It provides functions to create, train and run neural networks. After training, the code creates a .data file with the network configuration (number of inputs, outputs, hidden layers and neurons in each layer). Then I wrote a program that loads this net.data file, initializes the network and runs it, using the iris.data file as input. After compiling, I saved the resulting executable. Finally, I used the tjrob code I mentioned earlier, which measures latency, to measure the time spent executing the program.

As for overclocking, I saw some posts in the forum recommending against it, since the 3B+ is already an overclocked version of the Pi 3.

I hope I was clear this time.

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 3475
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: New RT (Real Time) kernel branch

Thu May 02, 2019 8:53 pm

Where's the realtime, latency-sensitive element? This just sounds like some straight-forward number crunching on data from a file, where throughput should be all that matters. Is it even multithreaded?

Paleloshow
Posts: 13
Joined: Mon Feb 11, 2019 9:23 pm

Re: New RT (Real Time) kernel branch

Thu May 02, 2019 11:28 pm

PhilE wrote:
Thu May 02, 2019 8:53 pm
Where's the realtime, latency-sensitive element? This just sounds like some straight-forward number crunching on data from a file, where throughput should be all that matters. Is it even multithreaded?
The ADC part of the project will be done later. When I started working with the Pi and real time, I thought that real time was about running applications faster than usual. But going through the forum I realised that RT is about latency and interrupts from events, in order to make the Pi respond faster (at least that is what I have understood so far). I was using the iris.data file to see how long the Pi takes to run the neural network. Because it was taking 14 ms I thought that could be improved by the RT kernel, but it seems that I need to build the sampling section of the project before running the neural network. As for multithreading, does that mean creating tasks with different priorities? I am still new to these real-time issues. Sorry if it is taking me longer than expected to understand this.

The reason I worry so much about time is that power systems need a really fast response, around 1 ms. During this time the Pi needs to sample the voltage information, put it in a buffer, send it to the measurement routines to calculate phasors, and then pass the result to the algorithm so that a fault can be detected. If one is detected, a trip signal must be sent from the Pi back to the grid, indicating that a fault has occurred. I am afraid that assigning different priorities will stop the Pi from running those tasks in time.

As I said, I am new to RT, so I need all the help I can get.

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 3475
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: New RT (Real Time) kernel branch

Fri May 03, 2019 8:19 am

There may be a point in your project where real-time responsiveness becomes an issue, but you are a long way from there yet.

CPU-intensive tasks are limited by clock speed and RAM access speed (both of which may be reduced if the processor is too hot or under-powered). The laws of physics make it hard to keep increasing clock speeds, so the way to increase the megaflops count is to run multiple cores in parallel. Taking advantage of a multi-core platform requires that a task can be broken down into smaller elements that can run concurrently. Some tasks are best arranged in a pipeline (the classic production line, where an item progresses through multiple stages of processing, all stages running concurrently but on different items), and some tasks are better submitted to a pool of workers, each of which runs the work item to completion before picking up the next. On a 4-core device it makes sense to have at least 4 worker threads - more if there is some I/O latency.
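A minimal sketch of the worker-pool arrangement, assuming the work items are independent of one another. The function name `run_pool`, the `int` item type and the shared atomic counter used to hand out items are illustrative choices, not something prescribed in this thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Apply `work` to every item using a small pool of worker threads. Each
// worker claims the next unclaimed item, runs it to completion, then claims
// another, until no items are left.
void run_pool(std::vector<int>& items,
              const std::function<void(int&)>& work,
              unsigned workers = 4) {
    assert(workers > 0);
    std::atomic<std::size_t> next{0};  // index of the next unclaimed item
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&] {
            for (;;) {
                const std::size_t i = next.fetch_add(1);
                if (i >= items.size()) return;
                work(items[i]);
            }
        });
    }
    for (std::thread& t : pool) t.join();
}
```

Because every index is claimed exactly once, no two workers ever touch the same item, so the items themselves need no locking.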

In your case you are a factor of 10 down on the required performance. Going multithreaded won't bridge that gap on its own - changing compiler optimisation settings might help a bit, and enabling NEON support can help some algorithms a lot, but I still think you may be out of luck.

Gene1934
Posts: 56
Joined: Tue May 02, 2017 12:47 pm

Re: New RT (Real Time) kernel branch

Sun Jun 30, 2019 11:30 pm

Just last week I got around to using the guysoft RealtimePi converter to generate an installable version of stretch-full for a pi-3b that runs a big old Sheldon lathe using linuxcnc. That kernel still doesn't know anything about the pi's video engines, so it's still a framebuffer that runs close to a second behind the machine. But the machine is working noticeably smoother than it was with an earlier jessie rt-v7 kernel that had about a 70 microsecond latency.

And it's a better realtime: the servo-thread at 1 kilohertz measures about 28 u-secs, which is good enough to run machinery with.

Now I'm thinking of going to a pi-4, but there's a huge problem: the spi driver for the mesa interface cards, rpspi.ko, was written by Prof. Bertho Stultans so as to preclude its running on anything but a pi-3b, so it will need to be rebuilt to run on a pi-4. So we either need to find someone to fix that, or we need a new spi driver that can cope with 50+ megahertz spi clocking. It's not something I would feel confident to tackle, as 4 years ago I had a pulmonary embolism (usually fatal, but he wasn't ready for me) that cost me a few points in the IQ dept. That, and the age (84 now) of this wet ram, are two strikes working on an out.

To guysoft: I tried yesterday to build a buster-rc2, but it didn't convert the 4.19.50 to an rt-v7 in the finished image. Has the instruction sequence changed, or does it need updating for buster? I was working with fresh git clones of both utilities.

Thank you for any help or encouragement.

Cheers, Gene84

maknu7
Posts: 1
Joined: Wed Jul 10, 2019 12:13 pm

Re: New RT (Real Time) kernel branch

Wed Jul 10, 2019 12:57 pm

Based on the following sources: I got the following results on a Raspberry Pi 3B+:
  • I tried to configure and compile the official rpi-4.19.y-rt branch on a 4.19.50-based Raspbian installation. The build went fine, but I hit various problems I could not resolve when booting into the new kernel. The rpi-4.19 builds from the following nightly images (flashed directly with Etcher) boot better, but still not cleanly: http://unofficialpi.org/Distros/RealtimePi/nightly/
  • I successfully configured and compiled rpi-4.14.y-rt on a 4.14.91-based Raspbian installation and got a clean boot from the start. My advice for the moment would be to stay on rpi-4.14.y-rt.
Hope that helps. Have fun.

Gene1934
Posts: 56
Joined: Tue May 02, 2017 12:47 pm

Re: New RT (Real Time) kernel branch

Tue Sep 10, 2019 1:25 am

Is that kernel now available as an installable deb? The new package manager's sorting is no match for synaptic, and I can't find the answer in its somewhat abbreviated comments.

And does it also use the new video drivers on the rpi3b+? I am plumb amazed at the glxgears frame rate; it's about 20x faster than jessie or stretch with the standard non-rt kernel.

Thanks.

Gene1934

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 3475
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: New RT (Real Time) kernel branch

Tue Sep 10, 2019 8:26 am

The RT kernel is officially hosted but not officially supported. The aim is to reduce the amount of effort required to use a -rt kernel by pre-applying the necessary RT patches, but we have no plans to build and distribute it.

Gene1934
Posts: 56
Joined: Tue May 02, 2017 12:47 pm

Re: New RT (Real Time) kernel branch

Tue Sep 10, 2019 11:03 am

This will be for the updated, new faster video, on a rpi-3b+ running buster.

Where can I get the src for 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l, complete with the config that built it? Uncommenting the deb-src lines doesn't seem to show the src for this kernel.

Also, do you have a current build-and-install guide? Link please.

Thanks, gene1934

Gene1934
Posts: 56
Joined: Tue May 02, 2017 12:47 pm

Re: New RT (Real Time) kernel branch

Mon Oct 07, 2019 3:15 pm

Unfortunately, the build instructions above will not allow a preempt-rt build, so while it boots nicely, there is no realtime.

Digging through the github link and looking at the bcm2711_defconfig, all the realtime stuff is missing. And if it is added to the .config from another src, it is all removed by the first invocation of make, before the build actually starts.

The video works great, but there is nothing realtime about it. Latencies are far worse than with a default build.

How can I fix this?

Thanks all.

PhilE
Raspberry Pi Engineer & Forum Moderator
Posts: 3475
Joined: Mon Sep 29, 2014 1:07 pm
Location: Cambridge

Re: New RT (Real Time) kernel branch

Mon Oct 07, 2019 3:24 pm

1. You should be using bcm2709_defconfig for 3B+ (which you indicated is what you are building for) - bcm2711_defconfig is for 4B.
2. Both bcm2709_defconfig and bcm2711_defconfig in the rpi-4.19.y-rt branch (you did checkout the correct branch, didn't you?) define CONFIG_PREEMPT_RT_FULL, which results in the following settings:

Code:

CONFIG_PREEMPT_RT_BASE=y
CONFIG_PREEMPT_RT_FULL=y
CONFIG_RT_MUTEXES=y
Which RT stuff is missing?
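One quick way to answer that question is to scan the generated .config for the three settings listed above. The sketch below does this on the file's text; the function name `missing_rt_options` is hypothetical, and reading the file into a string is left to the caller.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Return the RT settings from the post that do NOT appear as exact lines in
// the text of a kernel .config. An empty result means all three are set.
std::vector<std::string> missing_rt_options(const std::string& config_text) {
    std::vector<std::string> wanted = {
        "CONFIG_PREEMPT_RT_BASE=y",
        "CONFIG_PREEMPT_RT_FULL=y",
        "CONFIG_RT_MUTEXES=y",
    };
    std::istringstream config(config_text);
    std::string line;
    while (std::getline(config, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        for (auto it = wanted.begin(); it != wanted.end(); ++it) {
            if (*it == line) {
                wanted.erase(it);
                break;
            }
        }
    }
    return wanted;
}
```

If the result is non-empty after running the defconfig step, the checked-out tree is probably not the -rt branch.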

Gene1934
Posts: 56
Joined: Tue May 02, 2017 12:47 pm

Re: New RT (Real Time) kernel branch

Mon Oct 07, 2019 4:21 pm

Dumbass at work, I missed the -rt on my git pull, so I will start over on both the 3 and the 4, thank you a bunch.

gene1934

Gene1934
Posts: 56
Joined: Tue May 02, 2017 12:47 pm

Re: New RT (Real Time) kernel branch

Mon Oct 07, 2019 4:47 pm

I blew those pulls away on both boards and added -rt to the --branch argument to make a 4.19.y-rt pull. No errors on either machine, but no diffs either: a make bcm27xx_defconfig still produces a .config that's missing the above config lines needed to build a fully preemptible kernel, on both boards. What did I screw up this time?

Thanks

gene1934
