Memory leakage with multithreaded EGL/OpenVG programming


by blackshard83 » Mon Jan 13, 2014 8:18 am
Hello guys, I'm new here on the forum.

I'd like to discuss some problems I've run into with multithreaded graphics programming.
I am working on a Python project with a lot of 2D graphical interaction. I built some C/C++ libraries to access the low-level EGL/OpenVG primitives, and I'm using the Raspbian distro as a base.

I am using multithreading for things like fade-in and fade-out effects: the main thread renders frames, and then the fade-in and fade-out threads overlay a rectangle with the proper opacity over the rendered frame to simulate the effect.

I worked out that I have to:

1) bind the proper API in each thread I'm using with an eglBindAPI call. If I don't do this, the behavior of subsequent EGL calls is undefined (i.e. they may or may not work...)
2) use eglMakeCurrent to bind a context and a surface to a thread, do the rendering work, and then call eglMakeCurrent again to unbind them so they can be bound on another thread

The problem I'm facing is that the more threads I create, the more memory gets used. I create a new thread each time a fade-in/out effect is required, and I noticed that each thread increases the memory usage of my application, which looks like a memory leak in the EGL implementation.
I tried calling eglReleaseThread just before a thread terminates. The documentation says that eglReleaseThread should mark the EGL thread state to be freed as soon as possible, but the resources never get released and the leak is still there.

Has anyone had experience with this problem? I wonder if there's room to file a bug report with Broadcom/the Raspberry Pi Foundation to address this.
Posts: 16
Joined: Fri Jan 10, 2014 8:31 am
by blackshard83 » Fri Jan 31, 2014 10:31 am
I did some experiments but haven't found a solution to this memory leak.

To both bump this thread and give a simple proof everyone can run, I made a little program.
It just initializes the EGL environment and does an on-screen color fade using a newly created thread on each iteration.

Inside the .tar.gz archive there are three files:
leak_test - the executable binary file
leak_test.c - the source code
compile.sh - a shell script that simply invokes cc with proper include paths and libraries

You can watch the process acquiring more and more memory as it runs, even just using top.
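For anyone who prefers to log the growth from inside the test program instead of watching top, here is a minimal sketch. It is not part of leak_test.c; the function name read_vm_rss_kb is made up for illustration, and it assumes a Linux system where /proc/self/status contains a VmRSS line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in leak_test.c): returns the resident set size
 * of the calling process in kB, or -1 if /proc/self/status cannot be read
 * or contains no VmRSS line. Linux-specific. */
long read_vm_rss_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long rss_kb = -1;

    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "VmRSS:", 6) == 0) {
            /* The value follows the label, e.g. "VmRSS:     1234 kB" */
            if (sscanf(line + 6, "%ld", &rss_kb) != 1)
                rss_kb = -1;
            break;
        }
    }
    fclose(f);
    return rss_kb;
}
```

Printing the return value once per iteration of the thread-spawning loop should show whether the figure keeps climbing or levels off.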

BTW I'm using Raspbian with kernel 3.6.11+; I'll test an updated Raspbian later to see if the problem is still there.

edit: I also tried the latest Raspbian with kernel 3.10.28+ and the leak is still there :(
Attachments
leak_test.tar.gz
by jamesh » Sat Feb 01, 2014 10:51 am
Have you run the program in efence or similar (does valgrind work yet on Pi?) to see where the memory leak is coming from? Also, you can instrument the code: log every allocation and deallocation to see if they match up.
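A minimal sketch of that kind of instrumentation, assuming you can route the allocations you care about through your own wrappers. The names counted_malloc, counted_free and outstanding_allocations are hypothetical, and this only counts calls made through the wrappers, so it will not see allocations made inside a prebuilt library.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical wrappers: count allocations that have not been freed yet.
 * A mutex guards the counter because the program under test is threaded. */
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static long live_allocations = 0;

static void adjust_live(long delta)
{
    pthread_mutex_lock(&count_lock);
    live_allocations += delta;
    pthread_mutex_unlock(&count_lock);
}

void *counted_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        adjust_live(1);      /* one more block outstanding */
    return p;
}

void counted_free(void *p)
{
    if (p != NULL)
        adjust_live(-1);     /* one block given back */
    free(p);
}

/* Number of blocks allocated through counted_malloc and not yet freed. */
long outstanding_allocations(void)
{
    long n;
    pthread_mutex_lock(&count_lock);
    n = live_allocations;
    pthread_mutex_unlock(&count_lock);
    return n;
}
```

If outstanding_allocations() keeps growing by a fixed amount per thread, the allocations and deallocations do not match up.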
Soon to be employed engineer - Hurrah! Volunteer at the Raspberry Pi Foundation, helper at PiAcademy September 2014.
Raspberry Pi Engineer & Forum Moderator
Posts: 11911
Joined: Sat Jul 30, 2011 7:41 pm
by blackshard83 » Sun Feb 02, 2014 12:48 pm
I haven't tried any memory leak detector yet.
I'm pretty sure the leak is on the EGL side, but I'm not sure whether it happens because I failed to set up the EGL state correctly, because it is expected*, or because it is a real implementation fault.

Just calling eglBindAPI and then immediately calling eglReleaseThread inside a thread causes the EGL implementation to allocate some context memory for the thread state that never gets freed.

* The EGL documentation says that eglReleaseThread schedules the thread's additional state to be released as soon as possible... clearly it may be that, in the Broadcom implementation, "as soon as possible" means "at the end of the process", but since there's no source code I can't check it :|
by jamesh » Sun Feb 02, 2014 5:27 pm
You should have all the source code for the ARM-side EGL code. Check out the Raspberry Pi GitHub pages (I don't think the sources are installed as standard). Since your memory leak appears to be on the ARM, it's likely to be in the open source EGL code, if not in your code.
by blackshard83 » Sun Feb 02, 2014 6:18 pm
Ok, I took a look at the source code...

I found the implementation of the eglReleaseThread function in the file userland/interface/khronos/egl/egl_client.c.
At line 1249 of that file I see this suspicious comment:

Code:
      //TODO free thread state?


:o
by StuartF » Sun Feb 02, 2014 6:22 pm
I also had this kind of problem, but found that adding usleep( 10000 ) after the eglReleaseThread() call gave EGL enough time to release its memory allocations before the calling thread terminated.
'release as soon as possible' is not the same as 'release immediately'.
Posts: 1
Joined: Sun Feb 02, 2014 5:41 pm
by blackshard83 » Sun Feb 02, 2014 11:26 pm
Thanks for the hint. I'll try the workaround and report if it worked.

edit: I tried adding a usleep (10000); after eglReleaseThread, as suggested, in the leak_test example provided above, but no luck... to me it looks like the implementation simply has no code to free the thread state memory.
by oldnpastit » Mon Feb 03, 2014 7:32 pm
I agree, it looks like the code is missing.

eglReleaseThread() calls platform_hint_thread_finished() which doesn't do anything.

That ought to be OK though because in platform_tls_get(), in khrn_client_platform_linux.c, it does:

Code:
      vcos_thread_at_exit(client_thread_detach, NULL);


Valgrind will probably reveal the answer. Got to go now but might be able to have a look later.
Posts: 31
Joined: Wed Dec 04, 2013 7:57 am
by oldnpastit » Mon Feb 03, 2014 8:28 pm
If you change the loop in leak_test.c as below then the problem goes away:

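Code:
      VCOS_THREAD_T th;
      // Create the worker through VCOS so the vcos_thread_at_exit() cleanup hook runs when it exits
      vcos_thread_create(&th, "foo", NULL, thread_func, egl_state);
      vcos_thread_join(&th,NULL);
      // Original pthread-based version, which leaks:
      //pthread_create (&thread_handle, NULL, thread_func, (void *)egl_state);
      //pthread_join (thread_handle, NULL);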


(This isn't a fix obviously!).

The reason is that the Khronos library is relying on the call to vcos_thread_at_exit() to sort out the cleanup. For a thread created using vcos_thread_create() that all works fine. But for a thread just created with pthread_create(), this doesn't work. For starters, since the thread wasn't created with VCOS, it has no hook for actually being called when the thread exits.

One solution is to find a way to be called back when a thread exits, and another is to change the Khronos code (but I'm not sure how).
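One generic way to be called back when a plain pthread exits is a thread-specific key with a destructor (pthread_key_create). The sketch below only illustrates that mechanism; it is not taken from the Khronos or VCOS sources, all names in it (get_thread_state, free_thread_state, run_workers) are made up, and the 64-byte block merely stands in for whatever per-thread state a library keeps.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t state_key;
static pthread_once_t state_key_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t cleanup_lock = PTHREAD_MUTEX_INITIALIZER;
static int cleanup_calls = 0;   /* how many times the destructor has run */

/* Destructor: called by the pthread runtime when a thread that stored a
 * non-NULL value under state_key exits. */
static void free_thread_state(void *state)
{
    free(state);
    pthread_mutex_lock(&cleanup_lock);
    cleanup_calls++;
    pthread_mutex_unlock(&cleanup_lock);
}

static void make_state_key(void)
{
    pthread_key_create(&state_key, free_thread_state);
}

/* Lazily allocates per-thread state on first use, as a library might. */
static void *get_thread_state(void)
{
    void *state;

    pthread_once(&state_key_once, make_state_key);
    state = pthread_getspecific(state_key);
    if (state == NULL) {
        state = calloc(1, 64);  /* stand-in for real per-thread state */
        if (state != NULL && pthread_setspecific(state_key, state) != 0) {
            free(state);
            state = NULL;
        }
    }
    return state;
}

static void *worker(void *arg)
{
    (void)arg;
    get_thread_state();         /* triggers the lazy allocation */
    return NULL;                /* destructor runs as the thread exits */
}

/* Runs n short-lived threads one after another and returns the total
 * number of destructor calls seen so far. */
int run_workers(int n)
{
    int i, calls;

    for (i = 0; i < n; i++) {
        pthread_t t;
        if (pthread_create(&t, NULL, worker, NULL) != 0)
            break;
        pthread_join(t, NULL);
    }
    pthread_mutex_lock(&cleanup_lock);
    calls = cleanup_calls;
    pthread_mutex_unlock(&cleanup_lock);
    return calls;
}
```

With this arrangement the cleanup does not depend on how the thread was created, which is the property that a hook tied to vcos_thread_create() lacks.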
by oldnpastit » Mon Feb 03, 2014 9:38 pm
This seems to fix it for me:

https://github.com/luked99/userland/tre ... emleak_fix

Caveat: all I've tested is leak_test.c. I'd be happier if valgrind was quieter, but I think it needs to be told about how to correctly track vchiq calls, so it has a lot of false positives at the moment.
by blackshard83 » Tue Feb 04, 2014 12:40 pm
I'll check that out as soon as possible. I hope there aren't side effects, but I have the chance to do a stress test on a real long-running process that will make any leak evident.
by oldnpastit » Tue Feb 04, 2014 7:00 pm
I've just pushed (rewriting history) a slightly nicer version that makes valgrind (on a standalone test) much happier (9ebccf1).
by blackshard83 » Tue Feb 18, 2014 5:06 pm
I did some preliminary tests, leaving the application running all night, and didn't notice any memory leaks. I'll do some more long-term empirical testing and report back here. Anyway, to me the leak seems pretty well fixed.