by Akane
Sun Jul 12, 2020 12:44 pm
Forum: General programming discussion
Topic: gpu_fft for RPi4?
Replies: 22
Views: 1246

Re: gpu_fft for RPi4?

Although an internal document for the VC6 QPU must exist, it is not publicly available (cf. a post by jamesh). However, for information on the instruction set you can refer to the open-source Mesa library, which generates QPU instructions from its internal IR. Also, py-videocore6 has some wor...
by Akane
Sun Jul 12, 2020 12:17 pm
Forum: General programming discussion
Topic: gpu_fft for RPi4?
Replies: 22
Views: 1246

Re: gpu_fft for RPi4?

The V3D DRM driver works only on the RPi 4, so for older models you need to use py-videocore instead. Alternatively, you can use our C libraries listed in https://gist.github.com/Terminus-IMRC/c5d1f6f78c890c26947a4553296b50d6 . The instruction set and architecture of the VC6 QPU differ from those of the VC4 QPU, e.g. the numb...
by Akane
Sun Jul 12, 2020 3:03 am
Forum: General programming discussion
Topic: gpu_fft for RPi4?
Replies: 22
Views: 1246

Re: gpu_fft for RPi4?

It seems that the EXECUTE_QPU Mailbox call is no longer usable on the Pi 4 (the system freezes when you use it). However, py-videocore6 already has the functionality to allocate and free memory and to submit a QPU job via the V3D DRM kernel driver. The reference code under the "examples" directory may help. Please...
by Akane
Fri Jul 03, 2020 10:07 am
Forum: Bare metal, Assembly language
Topic: "Execute Code" (and other) mailbox tags?
Replies: 10
Views: 714

Re: "Execute Code" (and other) mailbox tags?

Yes, but Pi 3 has another processor named VideoCore IV QPU in addition to ARM and VPU, which means that three processors run in parallel!
by Akane
Thu Jul 02, 2020 1:02 pm
Forum: General discussion
Topic: Any benifit to overclocking gpu_freq for purely headless users?
Replies: 11
Views: 694

Re: Any benifit to overclocking gpu_freq for purely headless users?

I learned so many things from Raspberry Pi and the QPU that I could never have learned otherwise, and with that experience I got a stimulating part-time job that changed my whole life, even though I'm still a student. So I think it's not bad to learn the QPU :D
by Akane
Thu Jul 02, 2020 11:09 am
Forum: Bare metal, Assembly language
Topic: "Execute Code" (and other) mailbox tags?
Replies: 10
Views: 714

Re: "Execute Code" (and other) mailbox tags?

No, some applications use the execute_code call for code acceleration on VPU: https://github.com/raspberrypi/firmware/issues/1140#issuecomment-494380175 I remember Kodi uses that for video/audio encoding/decoding. The instruction set of VPU was reverse-engineered by hermanhermitage and others: https...
by Akane
Thu Jul 02, 2020 9:57 am
Forum: General discussion
Topic: Any benifit to overclocking gpu_freq for purely headless users?
Replies: 11
Views: 694

Re: Any benifit to overclocking gpu_freq for purely headless users?

Any benifit to overclocking gpu_freq for purely headless users? ANY benefit? Yes. As always: the exact answer depends on the details. Not every headless Pi does without the GPU. Example: if your headless Pi uses the vector-processing QPUs in the VideoCore, there are benefits in increasing gpu_freq. ...
by Akane
Wed Jun 17, 2020 1:15 am
Forum: General programming discussion
Topic: Pi4B GPU GFlops comparable to Intel i7 FPU?
Replies: 3
Views: 791

Re: Pi4B GPU GFlops comparable to Intel i7 FPU?

Hi. I'm a developer of py-videocore4. Thank you for testing it! And sorry for the inconvenience during the installation. We updated the instructions 12 days ago, so it might be easier for you now. To get the maximum performance, you need the force_turbo=1 line in your config.txt. This keeps the QPU (V...
by Akane
Fri Oct 11, 2019 11:31 pm
Forum: General discussion
Topic: Pi 4 - full specification of VideoCore 6
Replies: 97
Views: 35924

Re: Pi 4 - full specification of VideoCore 6

Cannot give any schedules, Japan is a particularly long winded process requiring lots of documentation changes plus changes to the silk screening on the board. Check out Roger's blog post on the subject. https://www.raspberrypi.org/blog/compliance-and-why-raspberry-pi-4-may-not-be-available-in-your...
by Akane
Thu Oct 10, 2019 4:07 pm
Forum: General discussion
Topic: Pi 4 - full specification of VideoCore 6
Replies: 97
Views: 35924

Re: Pi 4 - full specification of VideoCore 6

So now we have an electromagnetic anechoic box from Micronix, and we are writing a GPGPU library for VideoCore VI: https://github.com/Idein/py-videocore6 . It's working very well with simple loops, e.g. https://github.com/Idein/py-videocore6/blob/master/tests/test_qpu.py , and should work with more compl...
by Akane
Sat Sep 28, 2019 12:52 am
Forum: General discussion
Topic: Pi 4 - full specification of VideoCore 6
Replies: 97
Views: 35924

Re: Pi 4 - full specification of VideoCore 6

OK, so it appears that the VideoCore6 specifications are not publicly available, so I am afraid I cannot comment on your question with regard to the instructions. You may be able to glean something from the publicly available Mesa driver. I've read the Mesa code many times, only to find that Mesa d...
by Akane
Fri Sep 27, 2019 2:19 pm
Forum: General discussion
Topic: Pi 4 - full specification of VideoCore 6
Replies: 97
Views: 35924

Re: Pi 4 - full specification of VideoCore 6

Thanks. I see.
by Akane
Fri Sep 27, 2019 12:11 pm
Forum: General discussion
Topic: Pi 4 - full specification of VideoCore 6
Replies: 97
Views: 35924

Re: Pi 4 - full specification of VideoCore 6

@jamesh We are trying to create py-videocore6, an easy programming environment for the VideoCore VI QPU, like the one for the VideoCore IV QPU: https://github.com/nineties/py-videocore . For now, we are having difficulty using the VPM and its DMA. Do you know which values should be fed to the vpmsetup instruction for DM...
by Akane
Fri Sep 27, 2019 5:03 am
Forum: General discussion
Topic: Pi 4 - full specification of VideoCore 6
Replies: 97
Views: 35924

Re: Pi 4 - full specification of VideoCore 6

No, nope. The correct theoretical performance of the GPUs is as follows: VideoCore IV @ 250MHz: 250 [MHz] x 3 [slice] x 4 [qpu/slice] x 4 [processor] x 2 [op/clock] = 24 Gflop/s VideoCore IV @ 300MHz: 300 [MHz] x 3 [slice] x 4 [qpu/slice] x 4 [processor] x 2 [op/clock] = 28.8 Gflop/s VideoCore VI @...
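The arithmetic behind the two VideoCore IV figures can be checked with a short sketch. The function and parameter names below are illustrative only; the factors are the ones quoted in the post. The VideoCore VI figures are cut off in the snippet, so they are not reproduced here.

```python
def peak_gflops(clock_mhz, slices, qpus_per_slice, processors_per_qpu, ops_per_clock):
    """Theoretical peak in Gflop/s: the product of all factors, converted from Mflop/s."""
    mflops = clock_mhz * slices * qpus_per_slice * processors_per_qpu * ops_per_clock
    return mflops / 1000


# VideoCore IV factors exactly as quoted in the post (names are hypothetical).
VC4 = dict(slices=3, qpus_per_slice=4, processors_per_qpu=4, ops_per_clock=2)

for mhz in (250, 300):
    print(f"VideoCore IV @ {mhz} MHz: {peak_gflops(mhz, **VC4):.1f} Gflop/s")
```

This is a peak figure only; it assumes every processor issues its maximum number of operations on every clock.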
by Akane
Wed Sep 18, 2019 11:16 pm
Forum: Bare metal, Assembly language
Topic: mmap() fails
Replies: 9
Views: 2502

Re: mmap() fails

TonySterrett wrote:
Wed Sep 18, 2019 12:55 pm
Akane wrote:
Wed Sep 18, 2019 12:33 am
offset should be 0xFE000000?
Yes this is the base address for the Pi 4
The log shows

mapmem: offset = 0

which, I suspect, indicates that the offset is mistakenly set to 0 instead of 0xFE000000.
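A minimal sketch of the check being suggested, assuming the peripherals are mapped through /dev/mem as the thread discusses. The helper names check_offset and map_peripherals are hypothetical, not taken from the code in the thread; only the 0xFE000000 base comes from the posts.

```python
import mmap
import os

PI4_PERIPHERAL_BASE = 0xFE000000  # base address given in the thread for the Pi 4


def check_offset(offset):
    """Reject the two mistakes that matter here: an unset (zero) or misaligned offset."""
    if offset == 0:
        # 0 is a legal mmap offset in general, but here it means the base was never set.
        raise ValueError("offset is 0: the peripheral base was probably never set")
    if offset % mmap.ALLOCATIONGRANULARITY != 0:
        raise ValueError(f"offset {offset:#x} is not aligned to the mapping granularity")
    return offset


def map_peripherals(length=mmap.PAGESIZE, offset=PI4_PERIPHERAL_BASE):
    """Map `length` bytes of /dev/mem at `offset` (needs root on a real Pi)."""
    fd = os.open("/dev/mem", os.O_RDWR | os.O_SYNC)
    try:
        return mmap.mmap(fd, length, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE,
                         offset=check_offset(offset))
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed
```

Calling check_offset before mmap turns a silent "offset = 0" into an immediate error instead of a mapping of the wrong region.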
by Akane
Wed Sep 18, 2019 12:33 am
Forum: Bare metal, Assembly language
Topic: mmap() fails
Replies: 9
Views: 2502

Re: mmap() fails

offset should be 0xFE000000?
by Akane
Tue Sep 10, 2019 6:23 am
Forum: General discussion
Topic: Pi 4 - full specification of VideoCore 6
Replies: 97
Views: 35924

Re: Pi 4 - full specification of VideoCore 6

No, nope. The correct theoretical performance of the GPUs is as follows: VideoCore IV @ 250MHz: 250 [MHz] x 3 [slice] x 4 [qpu/slice] x 4 [processor] x 2 [op/clock] = 24 Gflop/s VideoCore IV @ 300MHz: 300 [MHz] x 3 [slice] x 4 [qpu/slice] x 4 [processor] x 2 [op/clock] = 28.8 Gflop/s VideoCore VI @ ...
by Akane
Sat Oct 14, 2017 10:01 am
Forum: Bare metal, Assembly language
Topic: GPU LED blinking
Replies: 2
Views: 1090

Re: GPU LED blinking

Dom, who is an engineer at Pi Towers, said the QPU cannot access the peripheral bus: viewtopic.php?f=72&t=128309. So I think it's unlikely.
by Akane
Sat Apr 08, 2017 9:18 am
Forum: Graphics programming
Topic: Questions about VideoCore IV GPU
Replies: 20
Views: 16195

Re: Questions about VideoCore IV GPU

It's great to hear that you are doing well. Please continue rocking!

(And sorry for my mistakes in English!)
by Akane
Thu Apr 06, 2017 3:51 pm
Forum: Graphics programming
Topic: Questions about VideoCore IV GPU
Replies: 20
Views: 16195

Re: Questions about VideoCore IV GPU

Thanks for the quick answer, The address can be different across QPU threads, that is, you can read up to 16 x 4bytes of memory on a TMU read. Are the 16 * 4 Bytes the 16 SIMD-elements times 4 Bytes data-type? If so, what does this have to do with threads? Or has the TMU a cache of 16 requests Yes ...
by Akane
Thu Apr 06, 2017 11:25 am
Forum: Graphics programming
Topic: Questions about VideoCore IV GPU
Replies: 20
Views: 16195

Re: Questions about VideoCore IV GPU

For random accesses, I recommend using the TMU because it's more flexible than VPM DMA. To do a TMU read: 1. Write the memory address (4-byte aligned) to TMU[01]_S. The address can be different across QPU threads, that is, you can read up to 16 x 4 bytes of memory in one TMU read. 2. Signal the TMU...
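What "up to 16 x 4 bytes per read, each at its own address" means can be illustrated with a host-side model. This is plain Python, not QPU code, and gather_words is a hypothetical name. It assumes the 16 addresses correspond to the 16 SIMD elements (the post calls them "QPU threads") and that each element fetches one 32-bit word.

```python
import struct


def gather_words(memory, addresses):
    """Model one gather read: each address yields one little-endian 32-bit word."""
    if len(addresses) > 16:
        raise ValueError("one read covers at most 16 addresses")
    words = []
    for addr in addresses:
        if addr % 4 != 0:
            raise ValueError(f"address {addr:#x} is not 4-byte aligned")
        words.append(struct.unpack_from("<I", memory, addr)[0])
    return words


memory = struct.pack("<32I", *range(100, 132))  # 32 words holding 100, 101, ..., 131
addresses = [4 * i for i in (31, 0, 7, 7)]      # scattered word indices, repeats allowed
print(gather_words(memory, addresses))
```

The addresses need not be contiguous or distinct, which is the flexibility the post contrasts with VPM DMA.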
by Akane
Sat Mar 25, 2017 9:36 am
Forum: Interfacing (DSI, CSI, I2C, etc.)
Topic: Way to sync CPU-side L2C and GPU-side one?
Replies: 6
Views: 1675

Re: Way to sync CPU-side L2C and GPU-side one?

Thank you for your reply! You are correct! Eventually I noticed that I must use non-bufferable memory for DMA operations since the L2C on the ARM is speculative, i.e. we cannot know when the memory is fetched while it is mapped. I changed the flow to unmap the memory instead of (3) and re-map it instea...
by Akane
Mon Mar 20, 2017 7:12 am
Forum: Interfacing (DSI, CSI, I2C, etc.)
Topic: Way to sync CPU-side L2C and GPU-side one?
Replies: 6
Views: 1675

Re: Way to sync CPU-side L2C and GPU-side one?

I'm using this code https://github.com/Terminus-IMRC/qmkl/blob/master/src/memory.c#L111 for memory allocation, which is almost the same flow as hello_fft. However, instead of mapping memory on /dev/mem, I'm using /dev/vc-mem to change vma->vm_page_prot: https://github.com/Idein/linux/blob/rpi-4.9.y-...
by Akane
Sun Mar 19, 2017 12:13 pm
Forum: Interfacing (DSI, CSI, I2C, etc.)
Topic: Way to sync CPU-side L2C and GPU-side one?
Replies: 6
Views: 1675

Re: Way to sync CPU-side L2C and GPU-side one?

Thank you for your reply! You need to be careful when specifying "L2". The V3D has a unified L2 cache for pixel data and QPU instructions but there is also a system-level L2 cache. So Raspberry Pi 2&3 have three L2 caches (V3D unified, V3D system-level and ARM unified) ? Whether memory operations ge...
by Akane
Sat Mar 18, 2017 8:37 pm
Forum: Interfacing (DSI, CSI, I2C, etc.)
Topic: Way to sync CPU-side L2C and GPU-side one?
Replies: 6
Views: 1675

Way to sync CPU-side L2C and GPU-side one?

Hello. I'm looking for a way to fix this L2 cache-related problem. Here, let "mapped memory" be memory that is allocated through the Mailbox interface and mmap()'ed on /dev/mem, and let "normal memory" be memory that is allocated with malloc(). I noticed that accessing mapped memory is relativ...
