AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Sat Mar 21, 2015 12:55 pm

C_D wrote:If I could isolate a 'window' of pixels that required updating, is there a sensible way of copying that across to the new framebuffer? I've got a few ideas for isolating the region of interest.
The only way to do this is through one or more calls to write on the device (or calls to memcpy or memory writes if the device is mmap'ed). The kernel.org documentation on the framebuffer is possibly an interesting read.
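As a rough illustration of the memcpy route on an mmap'ed device, here is a minimal sketch that copies a rectangular window row by row. The function name copy_window and its parameters are made up for this example; it assumes 16-bit pixels and that source and destination share the same row stride.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy a w x h window of 16-bit pixels, whose top-left corner is (x, y),
 * from src to dst. Both buffers are assumed to use stride_bytes bytes per
 * row. The caller must make sure the window lies inside both buffers.
 */
static void copy_window(uint8_t *dst, const uint8_t *src,
                        size_t stride_bytes,
                        size_t x, size_t y, size_t w, size_t h)
{
    const size_t bytes_per_pixel = sizeof(uint16_t);
    size_t row;

    for (row = y; row < y + h; row++)
    {
        size_t offset = row * stride_bytes + x * bytes_per_pixel;

        /* one memcpy per row of the window */
        memcpy(dst + offset, src + offset, w * bytes_per_pixel);
    }
}
```

For a real framebuffer, dst would be the mmap'ed device memory and stride_bytes would be the line_length field of struct fb_fix_screeninfo.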

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Sun Mar 22, 2015 12:21 am

I have another version. This time I am only copying pixels that are different onto the framebuffer. This should be as good as it gets, however it may add a little overhead as it iterates through the whole buffer for each frame displayed. I really should profile the different approaches to see which is the fastest.
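For anyone who wants to try the profiling, a minimal timing sketch might look like the following. It compares a "copy only changed pixels" loop with a plain memcpy of the whole frame. All function names here are hypothetical and the timing uses the standard clock() call, so it measures CPU time in ordinary RAM; writes to a real SPI-backed framebuffer may well behave differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Copy only the pixels that differ between front and back.
 * Returns the number of pixels written. */
static size_t copy_changed(uint16_t *fb, const uint16_t *front,
                           const uint16_t *back, size_t pixels)
{
    size_t written = 0;
    size_t i;

    for (i = 0; i < pixels; i++)
    {
        if (front[i] != back[i])
        {
            fb[i] = front[i];
            written++;
        }
    }

    return written;
}

/* Baseline: copy the whole frame unconditionally. */
static void copy_all(uint16_t *fb, const uint16_t *front, size_t pixels)
{
    memcpy(fb, front, pixels * sizeof(uint16_t));
}

/* Average CPU seconds per frame over `frames` runs of copy_changed. */
static double time_copy_changed(uint16_t *fb, const uint16_t *front,
                                const uint16_t *back, size_t pixels,
                                int frames)
{
    clock_t start = clock();
    int f;

    for (f = 0; f < frames; f++)
    {
        copy_changed(fb, front, back, pixels);
    }

    return (double)(clock() - start) / CLOCKS_PER_SEC / frames;
}

/* Average CPU seconds per frame over `frames` runs of copy_all. */
static double time_copy_all(uint16_t *fb, const uint16_t *front,
                            size_t pixels, int frames)
{
    clock_t start = clock();
    int f;

    for (f = 0; f < frames; f++)
    {
        copy_all(fb, front, pixels);
    }

    return (double)(clock() - start) / CLOCKS_PER_SEC / frames;
}
```

Running both timers over a few hundred frames of representative screen content would give a first idea of which approach costs less CPU.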

C_D
Posts: 18
Joined: Tue Mar 10, 2015 9:51 pm
Location: New Zealand

Re: Speeding up rpi-fbcp

Mon Mar 23, 2015 9:33 pm

Trialed your new version of raspi2fb today, works nicely :D

I'm getting around 4-5% cpu usage with it running as a daemon @ 10fps and around 10% cpu usage @ 30fps.

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Tue Mar 24, 2015 3:37 am

C_D wrote:Trialed your new version of raspi2fb today, works nicely :D

I'm getting around 4-5% cpu usage with it running as a daemon @ 10fps and around 10% cpu usage @ 30fps.
That is great. I am glad that it is useful.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 20777
Joined: Sat Jul 30, 2011 7:41 pm

Re: Speeding up rpi-fbcp

Tue Mar 24, 2015 8:35 am

It does sound like something that NEON might help to optimise...

No, I don't know NEON assembler. But there are some good hints on how to write C that makes it easier to convert to NEON by the compiler out there in Google land.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Please direct all questions to the forum, I do not do support via PM.
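As an example of the kind of C that compilers tend to find easier to vectorise, here is a sketch of a per-row change test with no branch and no early exit in the loop body. The names row_changed and copy_changed_rows are invented for this example, it assumes unpadded rows of 16-bit pixels, and whether NEON instructions are actually generated depends on the compiler and its flags; this has not been measured on a Pi.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns non-zero if any pixel in the row differs. The loop is a plain
 * OR-reduction over XORed pixels, with restrict-qualified inputs. */
static uint16_t row_changed(const uint16_t *restrict front,
                            const uint16_t *restrict back,
                            size_t width)
{
    uint16_t diff = 0;
    size_t i;

    for (i = 0; i < width; i++)
    {
        diff |= (uint16_t)(front[i] ^ back[i]);
    }

    return diff;
}

/* Copy only the rows that contain at least one changed pixel.
 * Returns the number of rows copied. */
static size_t copy_changed_rows(uint16_t *fb, const uint16_t *front,
                                const uint16_t *back,
                                size_t width, size_t height)
{
    size_t rows = 0;
    size_t y;

    for (y = 0; y < height; y++)
    {
        size_t offset = y * width;

        if (row_changed(front + offset, back + offset, width))
        {
            memcpy(fb + offset, front + offset, width * sizeof(uint16_t));
            rows++;
        }
    }

    return rows;
}
```

The trade-off is that a whole row is rewritten when a single pixel changes, so this is a different strategy from a per-pixel copy rather than a drop-in replacement.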

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Tue Mar 24, 2015 12:39 pm

jamesh wrote:...But there are some good hints on how to write C that makes it easier to convert to NEON by the compiler out there in Google land.
Thanks James, I will have a look. The only issue for me is that neither of my LCDs is attached to a Raspberry Pi 2 B. I have one attached to a Model B (MZTX-PI-EXT 2.5" 320x240) and one attached to a Model A (tinylcd 480x320). I am sure I could attach either to my Pi 2, but to be honest I have other projects that I would like to start.

shaotim
Posts: 7
Joined: Wed May 06, 2015 5:19 pm

Re: Speeding up rpi-fbcp

Wed May 06, 2015 9:50 pm

I've been reading this thread because I am having the same issue as the OP. I finally got Eclipse set up so that I can cross-compile for the RPi 2, including OpenGL ES 2, on my Linux box, with remote debugging. My next objective is to tackle the tearing effect I'm getting on my 480x320 screens, since a less than decent frame rate would make 3D graphics on the LCD pointless.

I have an ozzmaker and a jbtek 3.5 inch LCD. Both do the same thing: they tear when a full screen is written, but not under X, which I am sure is doing a block transfer for just the areas needing to be updated. I did notice something: in fbtft_device.c many of the display definitions limit the SPI frequency to 32 MHz, which translates to roughly 8 fps assuming a resolution of 480x320 and 888 pixels, or 11 if it's 656 or something odd like that, provided my math is correct. Also, many of the shift registers used on these screens share that limitation.

I was also wondering if the snapshot and the framebuffer dump to the SPI were colliding. In other words, fbcp is trying to copy but the fb is locked while it is dumped by DMA to the SPI, causing some latency issues, especially since the snapshot is being spun on and is effectively locking the SPI transfer out. Just a thought, as I ran into this same problem with another chip on a different screen while writing my own driver for the SSD1963, which gave me 8-bit parallel in and true RGB out. I got it solved and ended up with about 20 fps on a much, much slower chip. Lastly, does anyone have a screen that connects directly to the Pi's SPI without using any external shift registers on the circuit board?

C_D
Posts: 18
Joined: Tue Mar 10, 2015 9:51 pm
Location: New Zealand

Re: Speeding up rpi-fbcp

Thu May 07, 2015 11:13 am

I am using a different screen so I can't comment much on your hardware, but I can say that the way rpi-fbcp updates the framebuffer results in a full screen rewrite every frame. X outputting directly to fb1 feels a lot faster because it does windowed updates and doesn't rewrite the whole screen if it is not required. This makes mouse movements and typing text a lot more responsive even though the screen itself is pretty slow. AndyD's variation on rpi-fbcp does do windowed updates to the framebuffer; there is a higher CPU overhead, but I find that for my screen it results in a much better overall user experience, as mousing and typing become much snappier.
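To make the "windowed update" idea concrete, here is a small sketch of one way to find the bounding rectangle of the pixels that changed between two frames. It is not taken from X or from raspi2fb; the struct dirty_rect and find_dirty_rect names are hypothetical, and it assumes unpadded rows of 16-bit pixels.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bounding box of changed pixels, with inclusive corners. */
struct dirty_rect
{
    size_t x0, y0, x1, y1;
};

/* Scan both frames and report the smallest rectangle that contains every
 * differing pixel. Returns false (and leaves *out untouched) if the
 * frames are identical. */
static bool find_dirty_rect(const uint16_t *front, const uint16_t *back,
                            size_t width, size_t height,
                            struct dirty_rect *out)
{
    bool found = false;
    size_t x, y;

    for (y = 0; y < height; y++)
    {
        for (x = 0; x < width; x++)
        {
            if (front[y * width + x] == back[y * width + x])
            {
                continue;
            }

            if (!found)
            {
                out->x0 = out->x1 = x;
                out->y0 = out->y1 = y;
                found = true;
            }
            else
            {
                if (x < out->x0) { out->x0 = x; }
                if (x > out->x1) { out->x1 = x; }
                out->y1 = y; /* rows are visited in increasing order */
            }
        }
    }

    return found;
}
```

A single rectangle is cheap to track, but it grows quickly when two distant areas change in the same frame (a clock in one corner and the mouse in another, say), so it is only one of several possible schemes.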

shaotim
Posts: 7
Joined: Wed May 06, 2015 5:19 pm

Re: Speeding up rpi-fbcp

Thu May 07, 2015 7:04 pm

I am going to try his code. I think I may have found a bug just by reading the source on GitHub. If anything, I want to see how it improves screen performance; then I will play with the code to see if I can make it even quicker. I've found another possibility for speeding things up as well. I'll report back when I have something more to add.

shaotim
Posts: 7
Joined: Wed May 06, 2015 5:19 pm

Re: Speeding up rpi-fbcp

Thu May 07, 2015 8:39 pm

Is there a git source for bsd/libutil.h? I need to resolve the reference, and I have several libutil.h, .so and .a files on my box and don't know which one was used. I want to get the source files into a usable path for Eclipse so I can debug it remotely.

shaotim
Posts: 7
Joined: Wed May 06, 2015 5:19 pm

Re: Speeding up rpi-fbcp

Thu May 07, 2015 9:24 pm

I changed the following code...

Code: Select all

uint32_t pixels = vinfo.xres * vinfo.yres * 3;

Code: Select all

uint32_t *fbIter = fbp;
        uint32_t *frontCopyIter = frontCopyP;
        uint32_t *backCopyIter = backCopyP;

Code: Select all

for (pixel = 0 ; pixel < (pixels/4) ; pixel++)
        {
            if (*frontCopyIter != *backCopyIter)
            {
                *fbIter = *frontCopyIter;
            }

            ++frontCopyIter;
            ++backCopyIter;
            ++fbIter;
        }
It seems to be running faster to me with these changes, but I could be biased. The ozzmaker PiScreen is running at 32 MHz on the SPI; like I said before, though, the screen is still limited to 8 fps. I may have a workaround for that as well, and maybe get the frame rate above 20, since all of the latency is in the speed of the SPI. I'd much rather be doing an 8-, 16- or 24-bit parallel interface, which is plan D. Here is the full raspi2fb.c with the changes if you'd like to try it.

Code: Select all

//-------------------------------------------------------------------------
//
// The MIT License (MIT)
//
// Copyright (c) 2015 Andrew Duncan
//
// Permission is hereby granted, free of charge, to any person obtaining a
// copy of this software and associated documentation files (the
// "Software"), to deal in the Software without restriction, including
// without limitation the rights to use, copy, modify, merge, publish,
// distribute, sublicense, and/or sell copies of the Software, and to
// permit persons to whom the Software is furnished to do so, subject to
// the following conditions:
//
// The above copyright notice and this permission notice shall be included
// in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
// OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
// MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
// IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
// CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
// TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
// SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
//
//-------------------------------------------------------------------------

#define _GNU_SOURCE

#include <errno.h>
#include <fcntl.h>
#include <getopt.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <syslog.h>
#include <unistd.h>

#include <bsd/libutil.h>

#include <linux/fb.h>

#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/time.h>

#include "bcm_host.h"

#include "syslogUtilities.h"

//-------------------------------------------------------------------------

#define DEFAULT_DEVICE "/dev/fb1"
#define DEFAULT_DISPLAY_NUMBER 0
#define DEFAULT_FPS 10

//-------------------------------------------------------------------------

volatile bool run = true;

//-------------------------------------------------------------------------

void
printUsage(
    FILE *fp,
    const char *name)
{
    fprintf(fp, "\n");
    fprintf(fp, "Usage: %s <options>\n", name);
    fprintf(fp, "\n");
    fprintf(fp, "    --daemon - start in the background as a daemon\n");
    fprintf(fp, "    --device <device> - framebuffer device");
    fprintf(fp, " (default %s)\n", DEFAULT_DEVICE);
    fprintf(fp, "    --display <number> - Raspberry Pi display number");
    fprintf(fp, " (default %d)\n", DEFAULT_DISPLAY_NUMBER);
    fprintf(fp, "    --fps <fps> - set desired frames per second");
    fprintf(fp, " (default %d frames per second)\n", DEFAULT_FPS);
    fprintf(fp, "    --pidfile <pidfile> - create and lock PID file");
    fprintf(fp, " (if being run as a daemon)\n");
    fprintf(fp, "    --help - print usage and exit\n");
    fprintf(fp, "\n");
}

//-------------------------------------------------------------------------

static void
signalHandler(
    int signalNumber)
{
    switch (signalNumber)
    {
    case SIGINT:
    case SIGTERM:

        run = false;
        break;
    };
}

//-------------------------------------------------------------------------

int
main(
    int argc,
    char *argv[])
{
    const char *program = basename(argv[0]);

    int fps = DEFAULT_FPS;
    suseconds_t frameDuration =  1000000 / fps;
    bool isDaemon =  false;
    uint32_t displayNumber = DEFAULT_DISPLAY_NUMBER;
    const char *pidfile = NULL;
    const char *device = DEFAULT_DEVICE;

    //---------------------------------------------------------------------

    static const char *sopts = "df:hn:p:D:";
    static struct option lopts[] = 
    {
        { "daemon", no_argument, NULL, 'd' },
        { "fps", required_argument, NULL, 'f' },
        { "help", no_argument, NULL, 'h' },
        { "display", required_argument, NULL, 'n' },
        { "pidfile", required_argument, NULL, 'p' },
        { "device", required_argument, NULL, 'D' },
        { NULL, no_argument, NULL, 0 }
    };

    int opt = 0;

    while ((opt = getopt_long(argc, argv, sopts, lopts, NULL)) != -1)
    {
        switch (opt)
        {
        case 'd':

            isDaemon = true;
            break;

        case 'f':

            fps = atoi(optarg);

            if (fps > 0)
            {
                frameDuration = 1000000 / fps;
            }
            else
            {
                fps = 1000000 / frameDuration;
            }

            break;

        case 'h':

            printUsage(stdout, program);
            exit(EXIT_SUCCESS);

            break;

        case 'n':

            displayNumber = atoi(optarg);

            break;

        case 'p':

            pidfile = optarg;

            break;

        case 'D':

            device = optarg;

            break;

        default:

            printUsage(stderr, program);
            exit(EXIT_FAILURE);

            break;
        }
    }

    //---------------------------------------------------------------------

    struct pidfh *pfh = NULL;

    if (isDaemon)
    {
        if (pidfile != NULL)
        {
            pid_t otherpid;
            pfh = pidfile_open(pidfile, 0600, &otherpid);

            if (pfh == NULL)
            {
                fprintf(stderr,
                        "%s is already running %jd\n",
                        program,
                        (intmax_t)otherpid);
                exit(EXIT_FAILURE);
            }
        }
        
        if (daemon(0, 0) == -1)
        {
            fprintf(stderr, "Cannot daemonize\n");

            exitAndRemovePidFile(EXIT_FAILURE, pfh);
        }

        if (pfh)
        {
            pidfile_write(pfh);
        }

        openlog(program, LOG_PID, LOG_USER);
    }

    //---------------------------------------------------------------------

    if (signal(SIGINT, signalHandler) == SIG_ERR)
    {
        perrorLog(isDaemon, program, "installing SIGINT signal handler");

        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    //---------------------------------------------------------------------

    if (signal(SIGTERM, signalHandler) == SIG_ERR)
    {
        perrorLog(isDaemon, program, "installing SIGTERM signal handler");

        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    //---------------------------------------------------------------------

    bcm_host_init();

    DISPMANX_DISPLAY_HANDLE_T display
        = vc_dispmanx_display_open(displayNumber);

    if (display == 0)
    {
        messageLog(isDaemon, program, LOG_ERR, "cannot open display");
        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    DISPMANX_MODEINFO_T info;

    if (vc_dispmanx_display_get_info(display, &info) != 0)
    {
        messageLog(isDaemon,
                   program,
                   LOG_ERR,
                   "cannot get display dimensions");
        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    //---------------------------------------------------------------------

    int fbfd = open(device, O_RDWR);

    if (fbfd == -1)
    {
        perrorLog(isDaemon, program, "cannot open framebuffer device");

        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    struct fb_fix_screeninfo finfo;

    if (ioctl(fbfd, FBIOGET_FSCREENINFO, &finfo) == -1)
    {
        perrorLog(isDaemon,
                  program,
                  "cannot get framebuffer fixed information");

        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    struct fb_var_screeninfo vinfo;

    if (ioctl(fbfd, FBIOGET_VSCREENINFO, &vinfo) == -1)
    {
        perrorLog(isDaemon,
                  program,
                  "cannot get framebuffer variable information");

        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    //---------------------------------------------------------------------

    if ((vinfo.xres * 2) != finfo.line_length)
    {
        perrorLog(isDaemon,
                  program,
                  "assumption failed ... framebuffer lines are padded");

        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    if ((vinfo.xres % 16) != 0)
    {
        perrorLog(isDaemon,
                  program,
                  "framebuffer width must be a multiple of 16");

        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    if (vinfo.bits_per_pixel != 16)
    {
        perrorLog(isDaemon,
                  program,
                  "framebuffer is not 16 bits per pixel");

        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    //---------------------------------------------------------------------

    uint16_t *fbp = mmap(0,
                         finfo.smem_len,
                         PROT_READ | PROT_WRITE,
                         MAP_SHARED,
                         fbfd,
                         0);

    if (fbp == MAP_FAILED)
    {
        perrorLog(isDaemon, program, "cannot map framebuffer into memory");

        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    //---------------------------------------------------------------------

    uint32_t image_ptr;

    DISPMANX_RESOURCE_HANDLE_T resourceHandle = 
        vc_dispmanx_resource_create(VC_IMAGE_RGB565,
                                    vinfo.xres,
                                    vinfo.yres,
                                    &image_ptr);

    VC_RECT_T rect;
    vc_dispmanx_rect_set(&rect, 0, 0, vinfo.xres, vinfo.yres);

    //---------------------------------------------------------------------

    uint16_t *backCopyP = malloc(finfo.smem_len);
    uint16_t *frontCopyP = malloc(finfo.smem_len);

    if ((backCopyP == NULL) || (frontCopyP == NULL))
    {
        perrorLog(isDaemon, program, "cannot allocate offscreen buffers");

        exitAndRemovePidFile(EXIT_FAILURE, pfh);
    }

    memset(backCopyP, 0, finfo.line_length * vinfo.yres);

    uint32_t pixels = vinfo.xres * vinfo.yres * 3;

    //---------------------------------------------------------------------

    messageLog(isDaemon,
               program,
               LOG_INFO,
               "copying from %dx%d to %dx%d",
               info.width,
               info.height,
               vinfo.xres,
               vinfo.yres);

    //---------------------------------------------------------------------

    struct timeval start_time;
    struct timeval end_time;
    struct timeval elapsed_time;

    //---------------------------------------------------------------------

    while (run)
    {
        gettimeofday(&start_time, NULL);

        //-----------------------------------------------------------------

        vc_dispmanx_snapshot(display, resourceHandle, 0);
        vc_dispmanx_resource_read_data(resourceHandle,
                                       &rect,
                                       frontCopyP,
                                       finfo.line_length);

        uint32_t *fbIter = fbp;
        uint32_t *frontCopyIter = frontCopyP;
        uint32_t *backCopyIter = backCopyP;

        uint32_t pixel;
        for (pixel = 0 ; pixel < (pixels/4) ; pixel++)
        {
            if (*frontCopyIter != *backCopyIter)
            {
                *fbIter = *frontCopyIter;
            }

            ++frontCopyIter;
            ++backCopyIter;
            ++fbIter;
        }

        uint16_t *tmp = backCopyP;
        backCopyP = frontCopyP;
        frontCopyP = tmp;

        //-----------------------------------------------------------------

        gettimeofday(&end_time, NULL);
        timersub(&end_time, &start_time, &elapsed_time);

        if (elapsed_time.tv_sec == 0)
        {
            if (elapsed_time.tv_usec < frameDuration)
            {
                usleep(frameDuration -  elapsed_time.tv_usec);
            }
        }
    }

    //---------------------------------------------------------------------

    free(frontCopyP);
    free(backCopyP);

    memset(fbp, 0, finfo.smem_len);

    munmap(fbp, finfo.smem_len);
    close(fbfd);

    //---------------------------------------------------------------------

    vc_dispmanx_resource_delete(resourceHandle);
    vc_dispmanx_display_close(display);

    //---------------------------------------------------------------------

    messageLog(isDaemon, program, LOG_INFO, "exiting");

    if (isDaemon)
    {
        closelog();
    }

    if (pfh)
    {
        pidfile_remove(pfh);
    }

    //---------------------------------------------------------------------

    return 0 ;
}

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Fri May 08, 2015 12:36 am

shaotim wrote:I am going to try his code, I think I may have found a bug just by reading the source on github...
Let me know what the bug is. More than happy to fix it.

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Fri May 08, 2015 12:38 am

shaotim wrote:Is there a git source for bsd/libutil.h ? I need to resolve the reference and I have several libutil.h .so .a files on my box.
The libbsd project is here.

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Fri May 08, 2015 12:43 am

shaotim wrote:

Code: Select all

uint32_t pixels = vinfo.xres * vinfo.yres * 3;

Code: Select all

for (pixel = 0 ; pixel < (pixels/4) ; pixel++)
I think your maths is wrong here. You have changed the pointers from uint16_t to uint32_t, so there are half as many comparisons required. I would imagine the following would be correct:-

Code: Select all

uint32_t pixels = vinfo.xres * vinfo.yres;

Code: Select all

for (pixel = 0 ; pixel < (pixels/2) ; pixel++)
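Put together as a stand-alone function, the two-pixels-per-word comparison might look like the sketch below. The name copy_changed_pairs is made up for this example, and it assumes RGB565 frames with an even number of pixels and 32-bit aligned buffers (malloc and mmap both return suitably aligned memory). For comparison, a count of xres * yres * 3 / 4 words is 1.5 times the xres * yres / 2 words that an RGB565 frame actually holds, so that loop would run past the end of the frame.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare two RGB565 frames one 32-bit word (two pixels) at a time and
 * copy the words that differ. `pixels` is xres * yres and is assumed to
 * be even. Returns the number of 32-bit words written. */
static size_t copy_changed_pairs(uint32_t *fb, const uint32_t *front,
                                 const uint32_t *back, size_t pixels)
{
    size_t words = pixels / 2; /* two 16-bit pixels per 32-bit word */
    size_t written = 0;
    size_t i;

    for (i = 0; i < words; i++)
    {
        if (front[i] != back[i])
        {
            fb[i] = front[i];
            written++;
        }
    }

    return written;
}
```

Note that a word is rewritten whenever either of its two pixels changes, so an unchanged neighbour is occasionally written again; that is harmless because it is written with the same value.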

shaotim
Posts: 7
Joined: Wed May 06, 2015 5:19 pm

Re: Speeding up rpi-fbcp

Fri May 08, 2015 2:14 am

I was figuring that the buffer is RGB, no alpha, since most displays take in some form of 3 bytes to a pixel. So a single line would be 480 x RGB (3) = 1440 subpixels (bytes). That would allow for a loop on a single line of only 360 iterations, since the pointers would refer to a 4-byte value. I'm not sure what the register width is in the 2836, but if the registers being used are 32 bits wide it would allow twice as much data to be checked on a single memory read, even better if the data bus is 32 bits wide as well, neither of which I know. The registers could even be 64-bit, in which case some sort of unsigned long would be even better to minimize the amount of time in the loop.

Currently pixels is 153600 iterations for a 480x320 screen, 76800 using a pointer to a 16-bit value, and 38400 using a 32-bit pointer. The problem I thought I saw was that the buffer is actually 460800 bytes, so the loop only checks the first 16 bits of each 24-bit pixel. My thought was that this could lead to artifacts, because pixels that have only changed in the LSB would go unupdated. I was unable to locate the source for bsd/libutils.h, which would allow me to debug it and verify that vinfo.xres and yres are being returned as the actual screen resolution, and not something wacky like the number of subpixels, and that the buffer actually contains more or less a raw 24-bit bitmap of screen data and hasn't been padded to be aligned to 32 bits.

Edit: where I said the first 16 bits of a 24-bit pixel, oops, that's wrong as well, because it would check the first two bytes of the first pixel, then the last byte of the first and the first byte of the second, and so on. So it would actually fall short of a full line check, and in this case, where it's a full screen, the last few lines of the screen may not get checked. In any case, that's where the math wasn't adding up for me.

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Fri May 08, 2015 2:34 am

shaotim wrote:I was figuring that the buffer is RGB, no alpha, since most displays take in some form of 3 bytes to a pixel. So a single line would be 480 x RGB (3) = 1440 subpixels (bytes). That would allow for a loop on a single line of only 360 iterations, since the pointers would refer to a 4-byte value. I'm not sure what the register width is in the 2836, but if the registers being used are 32 bits wide it would allow twice as much data to be checked on a single memory read, even better if the data bus is 32 bits wide as well, neither of which I know. The registers could even be 64-bit, in which case some sort of unsigned long would be even better to minimize the amount of time in the loop.

Hi @shaotim, the buffers are RGB565 which is two bytes per pixel (or 16 bits per pixel). I think this is where the confusion lies. This is enforced in the code here:-

Code: Select all

if (vinfo.bits_per_pixel != 16)
{
    perrorLog(isDaemon,
              program,
              "framebuffer is not 16 bits per pixel");

    exitAndRemovePidFile(EXIT_FAILURE, pfh);
}
Edit: The information is from the framebuffer itself. Have a look at this code and run it on your display.

Code: Select all

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <linux/fb.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

//-------------------------------------------------------------------------

int main()
{
    int fbfd = open("/dev/fb1", O_RDWR);

    if (fbfd == -1)
    {
        perror("Error: cannot open framebuffer device");
        exit(EXIT_FAILURE);
    }

    //---------------------------------------------------------------------

    struct fb_fix_screeninfo finfo;

    if (ioctl(fbfd, FBIOGET_FSCREENINFO, &finfo) == -1)
    {
        perror("Error: reading fixed frame buffer information");
        exit(EXIT_FAILURE);
    }

    printf("fb_fix_screeninfo\n");
    printf("    id = %s\n", finfo.id);
    printf("    smem_start = 0x%0lX\n", finfo.smem_start);
    printf("    smem_len = %d\n", finfo.smem_len);
    printf("    line_length = %d\n", finfo.line_length);
    printf("\n");

    //---------------------------------------------------------------------

    struct fb_var_screeninfo vinfo;

    if (ioctl(fbfd, FBIOGET_VSCREENINFO, &vinfo) == -1)
    {
        perror("Error: reading variable frame buffer information");
        exit(EXIT_FAILURE);
    }

    printf("fb_var_screeninfo\n");
    printf("    xres = %d\n", vinfo.xres);
    printf("    yres = %d\n", vinfo.yres);
    printf("    xoffset = %d\n", vinfo.xoffset);
    printf("    yoffset = %d\n", vinfo.yoffset);
    printf("    bits_per_pixel = %d\n", vinfo.bits_per_pixel);
    printf("    red\n");
    printf("        offset = %d\n", vinfo.red.offset);
    printf("        length = %d\n", vinfo.red.length);
    printf("        msb_right = %d\n", vinfo.red.msb_right);
    printf("    green\n");
    printf("        offset = %d\n", vinfo.green.offset);
    printf("        length = %d\n", vinfo.green.length);
    printf("        msb_right = %d\n", vinfo.green.msb_right);
    printf("    blue\n");
    printf("        offset = %d\n", vinfo.blue.offset);
    printf("        length = %d\n", vinfo.blue.length);
    printf("        msb_right = %d\n", vinfo.blue.msb_right);
    printf("    transp\n");
    printf("        offset = %d\n", vinfo.transp.offset);
    printf("        length = %d\n", vinfo.transp.length);
    printf("        msb_right = %d\n", vinfo.transp.msb_right);
    printf("    height = %d\n", vinfo.height);
    printf("    width = %d\n", vinfo.width);

    return 0;
}
Here is the output from my 320x240 display.

Code: Select all

fb_fix_screeninfo
    id = fb_bd663474
    smem_start = 0x0
    smem_len = 153600
    line_length = 640

fb_var_screeninfo
    xres = 320
    yres = 240
    xoffset = 0
    yoffset = 0
    bits_per_pixel = 16
    red
        offset = 11
        length = 5
        msb_right = 0
    green
        offset = 5
        length = 6
        msb_right = 0
    blue
        offset = 0
        length = 5
        msb_right = 0
    transp
        offset = 0
        length = 0
        msb_right = 0
    height = 0
    width = 0
smem_len = 153600 = 320 x 240 x 2
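To tie the numbers above together, here is a small sketch that packs an 8-bit-per-channel colour into RGB565 using the bit layout printed above (red at offset 11 with length 5, green at offset 5 with length 6, blue at offset 0 with length 5) and computes the size of an unpadded frame. The names pack_rgb565 and frame_bytes are invented for this example.

```c
#include <stdint.h>

/* Pack 8-bit red, green and blue into one RGB565 pixel by keeping the
 * top 5, 6 and 5 bits of each channel respectively. */
static uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Size in bytes of one frame, assuming rows are not padded and
 * bits_per_pixel is a multiple of 8. */
static uint32_t frame_bytes(uint32_t xres, uint32_t yres,
                            uint32_t bits_per_pixel)
{
    return xres * yres * (bits_per_pixel / 8);
}
```

With these, frame_bytes(320, 240, 16) gives 153600, which matches the smem_len reported for the 320x240 display, and frame_bytes(480, 320, 16) gives 307200 rather than the 460800 that a 24-bit buffer would need.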

shaotim
Posts: 7
Joined: Wed May 06, 2015 5:19 pm

Re: Speeding up rpi-fbcp

Fri May 08, 2015 3:04 am

Marvelous! Since each pixel is 16bits, then using a 32bit pointer would allow it to check two pixels at a time :D

so this remains the same

Code: Select all

uint32_t pixels = vinfo.xres * vinfo.yres;
The pointers stay at 32 bits and the loop becomes (edit: changed to)

Code: Select all

for (pixel = 0 ; pixel < (pixels/2 ); pixel++)
        {
            if (*frontCopyIter != *backCopyIter)
            {
                *fbIter = *frontCopyIter;
            }

            ++frontCopyIter;
            ++backCopyIter;
            ++fbIter;
        }
like you said. It would cut the iterations down by half. I'll try this tomorrow, as I don't have QEMU or a Pi set up at home (too many after-hours projects), to see how it reacts and make sure there aren't any artifacts. So my next question is: when the screen data is sent out over the SPI, is it transmitted as 565, or is it bit-padded so each subpixel is a byte? The screens I have each have three 8-bit shift registers, so I figured it was being transmitted padded to 24 bits. I may have a solution that lets us use the SPI at up to 80 MHz, which I'll hopefully test tomorrow as well, because fbtft caps the frequency at 32 MHz. If this works I may be able to get us up to 20 fps, like I said. If not, then I'll go with plan C, which is to design my own Pi board using a standalone screen; plans D and E I may do anyway. Have you tried or tinkered with the above changes to see if they have any benefit?

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Fri May 08, 2015 3:11 am

shaotim wrote:Marvelous! Since each pixel is 16bits, then using a 32bit pointer would allow it to check two pixels at a time :D
Yes ... I tried various different schemes, but I think the answer is to profile some of the different schemes and see which works best. It could probably be made faster with some assembly language.
shaotim wrote:... So my next question is: when the screen data is sent out over the SPI, is it transmitted as 565, or is it bit-padded so each subpixel is a byte? The screens I have each have three 8-bit shift registers, so I figured it was being transmitted padded to 24 bits. I may have a solution that lets us use the SPI at up to 80 MHz, which I'll hopefully test tomorrow as well, because fbtft caps the frequency at 32 MHz. If this works I may be able to get us up to 20 fps, like I said. If not, then I'll go with plan C, which is to design my own Pi board using a standalone screen; plans D and E I may do anyway. Have you tried or tinkered with the above changes to see if they have any benefit?
Unfortunately, I don't know the specifics of Notro's framebuffer driver.

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Fri May 08, 2015 4:42 am

AndyD wrote:... Unfortunately, I don't know the specifics of Notro's framebuffer driver.
Actually, I was driving my display directly from userland using libbcm2835. The code is on GitHub. In this case all the pixel data was transferred to the display as two bytes per pixel (RGB565).

shaotim
Posts: 7
Joined: Wed May 06, 2015 5:19 pm

Re: Speeding up rpi-fbcp

Fri May 08, 2015 5:39 am

I started working with that code because it was the recommended lib for the jbtek 3.5 inchbut i couldnt get the lcd to take init commands, then found it was updated for the rpi 2, then found that the fbtft supports both lcds. Then found out in the schematic that the miso was used by the touchscreen so i couldnt verify that the registers were actually being set. So I canned the little progress i made with that driver, was also working with wiring pi to write my own low level driver, but init commands were an issue, which ill use for the other stuff i need to control. All I had to do was sudo apt-get update and upgrade. Your tft is supported as well. Found the device defintion in fbtft. When I noticed the tearing effect and began researching I came upon this thread, looking for a solution. Your code gives me control of the buffer to buffer copy, but I found that there was a cap on the spi speed. Which I blame for the tearing on a fulll screen update. The device Im working will depend heavily on displaying full screen images and right now I can see the image being drawn.

While it is somewhat quick when it sends an image to the screen, it's not fast enough. I also plan on using OpenGL for 3D renderings, and switching between the two, so the frame rate is a major problem. I'm going to clone fbtft and remove the SPI speed cap so I can do some tests at higher speeds. Since you are sending 16 bits per pixel, you should be getting about 26 fps if the screen is completely redrawn.

I'm beginning to wonder if fbtft is sending 565 as well. If it is, then there shouldn't be any reason why I am seeing redraw on the LCD, which is most noticeable on boot-up while it's scrolling the startup text. My math is SPIfreq/(X*Y*bits_per_pixel), which works out to roughly 13 fps @ 16-bit and 8 @ 24-bit. Eek, that's still kinda sad. Thinking about redraw, I'm wondering if the screen is being redrawn while the buffer is being copied. I wonder if there is a quick way to make sure that only one is happening at a time; double buffering the framebuffer copy, maybe?
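
As a rough check of that arithmetic, here is a minimal sketch in C of the SPIfreq/(X*Y*bits_per_pixel) formula. The function name is hypothetical, and the 480x320 resolution and 32 MHz clock used in the note below are assumptions (the post does not state which values were plugged in). It also ignores command and addressing overhead, so real frame rates would be somewhat lower.

```c
/* Upper bound on full-screen updates per second over SPI:
 * one bit is shifted out per clock, so fps = clock / bits per frame.
 * Ignores per-transfer command overhead and gaps between transfers. */
double spi_max_fps(double spi_hz, unsigned width, unsigned height,
                   unsigned bits_per_pixel)
{
    double bits_per_frame = (double)width * height * bits_per_pixel;
    return spi_hz / bits_per_frame;
}
```

Under those assumptions, spi_max_fps(32e6, 480, 320, 16) comes out at about 13.0 and spi_max_fps(32e6, 480, 320, 24) at about 8.7, which is in line with the figures quoted above.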

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Fri May 08, 2015 5:49 am

shaotim wrote:...Your tft is supported as well...
Yes, I know. I don't actually drive the display from userland any more. Since Notro's fbtft driver is now in the Raspberry Pi kernel, it has been a lot easier to use.

It sounds like there are a lot of things to work out!

gmli
Posts: 4
Joined: Fri Mar 04, 2016 8:10 am

Re: Speeding up rpi-fbcp

Fri Mar 04, 2016 8:18 am

Hi,

(sorry for my English)

I'm making a portable console with the PiTFT 3.5", Retropie and a Raspberry Pi Zero. The "original" fbcp was quite laggy, which is what brought me here.

I'm now using Andy's version, which is far better, with no lag at all. But I'm facing another problem: tearing. Severe tearing.

Do you (Andy?) think it is possible to do vsync in raspi2fb? I would be happy to do it, but I'm not sure how to proceed…

Edit: I've added this at the end of the loop:

Code:

__u32 dummy = 0;
ioctl(fbfd, FBIO_WAITFORVSYNC, &dummy);
But it doesn't work, because this call returns -1 on /dev/fb1 (and 0 on /dev/fb0).
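
A return value of -1 on its own does not say why the call failed; errno does. Here is a minimal sketch in C of a wrapper that reports it. The helper name wait_for_vsync is hypothetical, and which error the fb1 driver actually returns is not stated in this thread.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fb.h>

/* Fallback for older headers; same definition as in <linux/fb.h>. */
#ifndef FBIO_WAITFORVSYNC
#define FBIO_WAITFORVSYNC _IOW('F', 0x20, __u32)
#endif

/* Wait for vsync on an open framebuffer descriptor.
 * Returns 0 on success, otherwise the errno value set by ioctl(). */
int wait_for_vsync(int fbfd)
{
    __u32 dummy = 0;

    if (ioctl(fbfd, FBIO_WAITFORVSYNC, &dummy) == -1) {
        int err = errno; /* save before fprintf can overwrite it */
        fprintf(stderr, "FBIO_WAITFORVSYNC failed: %s\n", strerror(err));
        return err;
    }
    return 0;
}
```

If the reported error indicates that the driver simply does not implement this ioctl, the copy loop would have to get its timing from somewhere other than /dev/fb1.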

mrvanes
Posts: 1
Joined: Thu Aug 11, 2016 11:44 am

Re: Speeding up rpi-fbcp

Thu Aug 11, 2016 12:00 pm

I created an fb0 vsync-callback-based version of rpi-fbcp for those who care or are interested. CPU load is slightly higher (1% on a Pi Zero) than with tasanakorn's 25 ms sleep version (fb1 is 320x240, fb0 refresh rate is 27 fps), but at least I know fbcp is copying every frame exactly once, precisely after it was synced, which feels like a more "correct" way to me.

https://github.com/mrvanes/rpi-fbcp

I checked that the callback is called at exactly 27 fps, using the latest firmware and kernel.

AndyD
Posts: 2327
Joined: Sat Jan 21, 2012 8:13 am
Location: Melbourne, Australia
Contact: Website

Re: Speeding up rpi-fbcp

Thu Aug 11, 2016 9:13 pm

That looks interesting @mrvanes. I will try and see if I can do something similar.
@gmli, sorry I missed your question. I wonder if @mrvanes's approach will help with the tearing.

ill-tempered
Posts: 2
Joined: Tue Sep 06, 2016 5:03 pm

Re: Speeding up rpi-fbcp

Tue Sep 06, 2016 5:07 pm

Any updates? Really interested to see whether there are any options for speeding this up.

When small portions of the screen update, the refresh rate is perfect, but I am getting some tearing when a large portion of the screen has to be changed.
