I have a fuzzy understanding of some parts of a CPU, but I am not really au fait with how things actually work, or with the timing and speed of the hardware.
Is there a short "RPi optimization for idiots" type of guide that would give beginners generally useful tips and stats about the speed of the different parts of the BCM hardware and how they fit together?
As I said, I really am a newbie to any sort of optimization or low-level work on the RPi, and what I am asking may be stupid as I am rather uninformed, but...
I am thinking about things like:
- DMA: can it write to ARM cache memory, how fast is it (setup time, transfer times, etc.), how does it interact with the ARM MMU, and which address types does it use (virtual/flat/paged)?
- MMU: what are the specs of the ARM MMU (transfer speed, setup time, wait states, cache interactions)? Does the CPU sit idle while memcpys occur? How do the MMU and DMA compare, what are the reasons for using one or the other, and, if they can interact, what problems could occur?
- VC4 communication: how do the VC4 and the ARM exchange data, how does the memory split function, and what are the speeds and types of transfer, locking and DMA? Is it possible to share memory space between the ARM and the VC4? Does the VC4 have faster cache memory available, or are there other possible ways of speeding things up?
- ARM cache memory: how does it get used, what are the stats (speed, load times), and which kernel code works out how to cache data or program code? Is it possible to lock program code into cache from userspace and prevent the kernel from flushing it?
- Mesa/X11/Wayland: memory transfers, sharing, and mapping of GLES/EGL/OpenVG buffers and code into windowing systems. Are there any basic optimization or coding methods that everyone should know to get the best speed from the 3D hardware in the VC4, both at a low level and for those writing code that just runs on top of the libraries? Also, how does the RPi Wayland work with Mesa? I got the feeling that the Collabora Wayland was based on dispmanx rather than EGL.
I think this type of material needs some very talented experts to lay the foundations and then give good explanations, so that others can pitch in with bug reports or bug fixes and actually use the infrastructure in the best way. I know there is no substitute for reading the source code, but an overview makes code reading much easier. Also, with Wayland there seems to be very little application programming example code available yet.
I hope this post makes sense, and I hope others can offer some advice.
