Colander
Posts: 5
Joined: Tue Jan 31, 2017 2:56 pm

Multithreading CPU usage

Thu Feb 02, 2017 10:14 pm

I am having trouble getting the most performance out of multithreaded Java applications on my Raspberry Pi 2. Whenever I try to run code in multiple threads, the maximum CPU load of the Java process stays at about 140% (of one core). I have also tried running four separate processes at once, and that did end up using all of the CPU.
The code runs perfectly on my PC. I have tried different JDKs and many ways of using multiple threads - the Thread class, parallel streams...
I am running stock Raspbian.
Thanks in advance!

knute
Posts: 608
Joined: Thu Oct 23, 2014 12:14 am
Location: Texas
Contact: Website

Re: Multithreading CPU usage

Fri Feb 03, 2017 4:42 pm

So what are you using to measure your 140%? Can you make an SSCCE that I can try? I've gotten some multi-threaded Java programs to use all the cores but I had to use top to see it and then it gave some interesting information. Post an SSCCE that demonstrates your problem.

Colander
Posts: 5
Joined: Tue Jan 31, 2017 2:56 pm

Re: Multithreading CPU usage

Sat Feb 04, 2017 11:58 am

This is the code I was running (if you try it, feel free to redirect the output to /dev/null)

Code: Select all

public class Main {
    public static void main(String[] args) {
        new Thread() {
            @Override
            public void run() {
                for (int i = 0; i < 10000000; i++) {
                    System.out.println("THREAD #0 " + i);
                }
            }
        }.start();
        new Thread() {
            @Override
            public void run() {
                for (int i = 0; i < 10000000; i++) {
                    System.out.println("THREAD #1 " + i);
                }
            }
        }.start();
        new Thread() {
            @Override
            public void run() {
                for (int i = 0; i < 10000000; i++) {
                    System.out.println("THREAD #2 " + i);
                }
            }
        }.start();
        new Thread() {
            @Override
            public void run() {
                for (int i = 0; i < 10000000; i++) {
                    System.out.println("THREAD #3 " + i);
                }
            }
        }.start();
    }
}
The CPU usage was measured using top. When I tried it this morning, it capped at around 160%.
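
For anyone who wants to see how much of each thread's time is real CPU time rather than waiting on the console, here is a rough sketch. It compares per-thread CPU time from ThreadMXBean with elapsed wall-clock time. The class name CpuVsWall, the helper measurePrinting and the line count of 100000 are made up for illustration and are not part of the program above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper class, not part of the original program.
class CpuVsWall {
    // Prints `lines` lines on the calling thread and returns
    // {cpuNanos, wallNanos}. cpuNanos is -1 if the JVM cannot measure it.
    static long[] measurePrinting(String label, int lines) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean supported = bean.isCurrentThreadCpuTimeSupported();
        long cpuStart = supported ? bean.getCurrentThreadCpuTime() : 0;
        long wallStart = System.nanoTime();
        for (int i = 0; i < lines; i++) {
            System.out.println(label + " " + i);
        }
        long wall = System.nanoTime() - wallStart;
        long cpu = supported ? bean.getCurrentThreadCpuTime() - cpuStart : -1;
        return new long[] { cpu, wall };
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final String label = "THREAD #" + t;
            threads[t] = new Thread(() -> {
                long[] r = measurePrinting(label, 100000);
                // Report on stderr so it survives redirecting stdout to /dev/null.
                System.err.printf("%s: cpu %d ms, wall %d ms%n",
                        label, r[0] / 1000000, r[1] / 1000000);
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
    }
}
```

If the CPU figure comes out well below the wall figure, the thread spent much of its time blocked rather than computing, which would be consistent with top showing less than 400%.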

EULERPI
Posts: 51
Joined: Sun May 15, 2016 2:44 pm

Re: Multithreading CPU usage

Sat Feb 04, 2017 12:56 pm

Hi,

I think that on modern operating systems, when a program asks for a lot of Input/Output (things that take far longer than the CPU takes to execute an instruction), the O/S (under the bonnet) reports the CPU as not being active on that thread until the I/O, or part of it, is complete.

So, as your program mainly executes output operations (System.out.println) in its threads, that may explain why you haven't seen 400% reported by top on a Pi 2 with its four cores.

I would expect that if you moved the System.out.println call outside the for loop (just as a test) you might get nearer to 400% (I don't yet know how to run a Java program to test that).

Regards

Nick

Colander
Posts: 5
Joined: Tue Jan 31, 2017 2:56 pm

Re: Multithreading CPU usage

Sat Feb 04, 2017 1:04 pm

I thought the same thing, but when I reduce the program to just 1 thread and run it 4 times at once, it does use all 400%.
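
A possible explanation, offered as an assumption rather than a confirmed cause: four threads in one process share a single System.out, and in the OpenJDK implementations I am aware of println takes a lock on that stream, whereas four separate processes each have their own. A minimal sketch of one way to reduce that contention is to build the text in a per-thread buffer and print it in large chunks. The class name BufferedThreads, the helper buildChunk and the chunk size of 10000 are hypothetical.

```java
// Hypothetical variant of the four-thread program, not the original code.
class BufferedThreads {
    // Builds the lines for i in [from, to) locally, without touching System.out.
    static String buildChunk(String label, int from, int to) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < to; i++) {
            sb.append(label).append(' ').append(i).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        final int total = 10000000;
        final int chunk = 10000; // assumed chunk size, tune as needed
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final String label = "THREAD #" + t;
            threads[t] = new Thread(() -> {
                for (int from = 0; from < total; from += chunk) {
                    int to = Math.min(from + chunk, total);
                    // One print call per chunk, so the shared stream is
                    // used once per 10000 lines instead of once per line.
                    System.out.print(buildChunk(label, from, to));
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
    }
}
```

Note that this changes how the output interleaves: lines from one thread now arrive in blocks rather than one at a time.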

EULERPI
Posts: 51
Joined: Sun May 15, 2016 2:44 pm

Re: Multithreading CPU usage

Sat Feb 04, 2017 1:36 pm

Hi,

I think that's because your program does lots of output operations, which cause the program to 'pause' and report back to the O/S that it can't continue execution until the next part of the output operation is complete.

I'm guessing here, but making your Java program run four times (instances) at once doesn't get up to 400% because of:

1) The multitasking nature of an operating system is not 100% efficient - there is probably a minimum time set to wait for a complex I/O operation to complete before handing back to the O/S. Also, CPU register values probably have to be moved to/from caches/RAM, and the caches/RAM are slower than the CPU.

2) All the instances of your single-thread variant run 4 times in parallel are outputting to the same resource (via System.out.println). If the resource has any CPU software involved in its implementation, I would expect there to be extra delay for each instance of the program.

Note: I expect that if a program was compiled as opposed to interpreted, had code plus data smaller than the level 1 cache, did very little I/O and no floating point, and was executed 4 times in parallel while the Pi 2 / O/S was doing little else, then top might report as near to 400% as the speed of the level 1 cache allowed.

Regards

Nick

Colander
Posts: 5
Joined: Tue Jan 31, 2017 2:56 pm

Re: Multithreading CPU usage

Sun Feb 05, 2017 3:15 pm

Thank you for the help. I was not aware of all the issues the I/O could cause, so this was obviously a bad example. The main aim of my project is floating point computation, and when I made an example for that case, I was able to use all 400%.

Code: Select all

new Thread() {
    @Override
    public void run() {
        double sum = 0;
        for (long i = 0; i < 10000000000L; i++) {
            sum += i * Math.PI * Math.E;
        }
        System.out.println(sum);
    }
}.start();
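
The same kind of CPU-bound work can also be spread over however many cores the machine reports, instead of starting a fixed number of threads by hand. This is only a sketch: the class name ParallelSum, its two helpers and the loop count of 400000000 are made up for illustration, and it uses a plain fixed thread pool from java.util.concurrent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class, not taken from the project.
class ParallelSum {
    // Sums i * PI * E for i in [from, to) on the calling thread.
    static double partialSum(long from, long to) {
        double sum = 0;
        for (long i = from; i < to; i++) {
            sum += i * Math.PI * Math.E;
        }
        return sum;
    }

    // Splits [0, n) into one chunk per worker and adds up the partial results.
    static double parallelSum(long n, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Double>> parts = new ArrayList<>();
            long chunk = n / workers;
            for (int w = 0; w < workers; w++) {
                final long from = w * chunk;
                // The last worker also takes any remainder.
                final long to = (w == workers - 1) ? n : from + chunk;
                Callable<Double> task = () -> partialSum(from, to);
                parts.add(pool.submit(task));
            }
            double total = 0;
            for (Future<Double> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(parallelSum(400000000L, cores));
    }
}
```

Each worker keeps its own local sum and the results are only combined after the workers finish, so there is no shared variable for the threads to fight over.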


EULERPI
Posts: 51
Joined: Sun May 15, 2016 2:44 pm

Re: Multithreading CPU usage

Sun Feb 05, 2017 4:27 pm

Hi,

Glad to be of help. Your question prompted me to learn how to prepare and run a Java program.

I moved the System.out.println call outside the for loop, made the for loops longer using long integer types and values, and got all CPUs reported as 100% in htop.

It also sent the SoC temperature of the Pi 3 up to 83 degrees centigrade (in a Pi-Blox case).

Regards

Nick

knute
Posts: 608
Joined: Thu Oct 23, 2014 12:14 am
Location: Texas
Contact: Website

Re: Multithreading CPU usage

Sun Feb 05, 2017 5:22 pm

The console I/O is slow, and I think there is synchronization to prevent output from one thread mixing with another's, which slows it down even more. I made some test code from yours which produces some interesting results on my Pi3.

Code: Select all

import java.util.concurrent.atomic.DoubleAdder;
import java.util.stream.LongStream;

public class test {
    // DoubleAdder makes the four concurrent additions atomic;
    // "+=" on a volatile double is a non-atomic read-modify-write.
    static final DoubleAdder all4 = new DoubleAdder();

    public static void main(String... args) throws Exception {
        // Method 1: parallel stream, multiplying every element
        long start = System.currentTimeMillis();
        double sum = LongStream.range(0,400000000L).
         parallel().
         mapToDouble(i -> i * Math.PI * Math.E).
         sum();
        long stop = System.currentTimeMillis();

        System.out.printf("%d:%e\n",stop-start,sum);

        // Method 2: parallel stream, summing the longs and multiplying once
        start = System.currentTimeMillis();
        sum = LongStream.range(0,400000000L).
         parallel().
         sum() * Math.PI * Math.E;
        stop = System.currentTimeMillis();

        System.out.printf("%d:%e\n",stop-start,sum);

        // Method 3: four explicit threads, one quarter of the range each
        Thread t1 = new Thread(() -> {
            double sum1 = 0.0;
            for (long i=0; i<100000000L; i++)
                sum1 += i * Math.PI * Math.E;
            all4.add(sum1);
        });
        Thread t2 = new Thread(() -> {
            double sum2 = 0.0;
            for (long i=100000000; i<200000000L; i++)
                sum2 += i * Math.PI * Math.E;
            all4.add(sum2);
        });
        Thread t3 = new Thread(() -> {
            double sum3 = 0.0;
            for (long i=200000000; i<300000000L; i++)
                sum3 += i * Math.PI * Math.E;
            all4.add(sum3);
        });
        Thread t4 = new Thread(() -> {
            double sum4 = 0.0;
            for (long i=300000000; i<400000000L; i++)
                sum4 += i * Math.PI * Math.E;
            all4.add(sum4);
        });

        start = System.currentTimeMillis();
        t1.start();
        t2.start();
        t3.start();
        t4.start();
        t1.join();
        t2.join();
        t3.join();
        t4.join();
        stop = System.currentTimeMillis();

        System.out.printf("%d:%e\n",stop-start,all4.sum());
    }
}
produces:

pi@raspberrypi:~ $ java test
17975:6.831787e+17
7551:6.831787e+17
5777:6.831787e+17

on my Pi3 and:

C:\Users\Knute Johnson>java test
328:6.831787e+17
62:6.831787e+17
422:6.831787e+17

on my Dell desktop running Windows 10.

I'm not sure why, on the Pi, the first method using the stream is so much slower than the multi-threaded method. The results on my desktop are more what I would expect.

Anyway it was interesting.
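
One thing that might be worth ruling out, purely as a guess, is JIT warm-up, since each variant above is timed only once. A rough sketch of timing the same stream several times in one JVM run follows. The class name RepeatTiming, its helpers and the choice of five runs are hypothetical.

```java
import java.util.function.DoubleSupplier;
import java.util.stream.LongStream;

// Hypothetical timing harness, not part of the test program above.
class RepeatTiming {
    // The first stream variant from the test program, for a range of n values.
    static double streamSum(long n) {
        return LongStream.range(0, n)
                .parallel()
                .mapToDouble(i -> i * Math.PI * Math.E)
                .sum();
    }

    // Runs the task `runs` times and returns the elapsed milliseconds of each run.
    static long[] timeRuns(DoubleSupplier task, int runs) {
        long[] millis = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            double result = task.getAsDouble();
            millis[r] = (System.nanoTime() - start) / 1000000;
            System.out.printf("run %d: %d ms (%e)%n", r, millis[r], result);
        }
        return millis;
    }

    public static void main(String[] args) {
        timeRuns(() -> streamSum(400000000L), 5);
    }
}
```

If later runs are much faster than the first, warm-up is part of the story; if they are all about the same, the cause lies elsewhere.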

knute
Posts: 608
Joined: Thu Oct 23, 2014 12:14 am
Location: Texas
Contact: Website

Re: Multithreading CPU usage

Sun Feb 05, 2017 5:29 pm

and running it on the 'server' VM significantly improves the performance:

pi@raspberrypi:~ $ java -server test
5373:6.831787e+17
1000:6.831787e+17
2144:6.831787e+17
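
To confirm which VM a program is actually running on, the standard system properties can be printed from inside it. This is a small sketch; the class name VmInfo is made up, and the exact wording of the VM name (for example whether it mentions "Client" or "Server") depends on the JVM build.

```java
// Hypothetical helper for checking the running VM.
class VmInfo {
    // Describes the running VM using standard system properties.
    static String describe() {
        return System.getProperty("java.vm.name")
                + " " + System.getProperty("java.vm.version")
                + ", " + Runtime.getRuntime().availableProcessors() + " processors";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with plain java and once with java -server should show whether the flag changes the VM in use on a given installation.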

Colander
Posts: 5
Joined: Tue Jan 31, 2017 2:56 pm

Re: Multithreading CPU usage

Sat Feb 18, 2017 12:38 am

I just tried the 'server' VM on my project and got about a 25% performance gain, thanks for the tip!
