rlatinov
Posts: 22
Joined: Mon Jun 26, 2017 7:35 pm

Raspberry Pi 3B+ Bare Metal USB Driver

Wed Jun 13, 2018 9:40 pm

I am working on the new Raspberry Pi 3 B+ board in a bare metal environment (32-bit). I have a working USB driver for the older Pi 1 boards. From what I understand, the Pi 1 and the Pi 3 B+ have the same USB host controller (the Synopsys DesignWare 2.0 USB Host Controller, or DWC for short), yet the USB driver that works on the Pi 1 does not work for me on the Pi 3 B+ (or on the Pi 3 B either).

After going through some debugging messages, I found that the problem occurs while the DWC is enumerating the devices: it tries to read the device descriptor of what I am guessing is the on-board USB hub/Ethernet device (LAN7515), gets a transfer error, and is therefore unable to enumerate the device.

My question is why does this happen? If the Pi 1 and the Pi 3 have the same host controller then it should, in theory, at least be able to properly enumerate a device.

If someone can point me in the right direction as to why this happens, it would be greatly appreciated.

Thank you in advance.

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 3:26 am

So we assume you have changed the IO address and done the drop from HYP mode, because you are getting a fair way along. However, you need to explain whether you are compiling for ARMv6 or ARMv8. Specifically, what are your CPU compiler flags? They could be

-march=armv6zk -mtune=arm1176jzf-s
or
-march=armv8-a -mtune=cortex-a53

If they are the latter, you need to be aware that the ARMv8 halfword instructions are faster, and the compiler prefers them to some of the slower load-and-shift sequences it uses for ARMv6. The issue is that you can more easily get alignment faults on 16-bit halfword instructions if you aren't careful with pointers to structs. I had the issue with both my USB and FAT32 code when I first compiled for ARMv8 with the latter settings.

My USB code was based on the old Chadderz code, which has a lot of structs, and a handful of them caused me grief initially because of poor alignment in the packed structs. I also had to change all the bool bitfields he used to plain unsigned bitfields, as bool bitfield behaviour is erratic across GCC compiler versions. The alignment is easy to correct: you just add alignment attributes to the structs, or move the struct entity to an aligned temp variable. The other driver around is USPI, which I haven't had much experience with, but rst maintains a copy and he could advise.

The other possibility is the DMA transfer buffers, which must be 4-byte aligned. You could also be unlucky, with one compilation leaving a buffer aligned and the other not. Are all the buffers for the DMA process using alignment 4 attributes?
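For anyone checking the DMA buffer question above, here is a minimal sketch of forcing a buffer to 4-byte alignment and verifying it at run time. It assumes GCC-style attributes. The names `DMA_ALIGNMENT`, `is_dma_aligned`, `dma_buf` and `get_dma_buf` are made up for illustration and do not come from Chadderz, USPI or Xinu.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the DMA engine wants word (4-byte) aligned addresses. */
#define DMA_ALIGNMENT 4u

/* Nonzero when ptr is a multiple of DMA_ALIGNMENT. */
static int is_dma_aligned(const void *ptr)
{
    return ((uintptr_t)ptr % DMA_ALIGNMENT) == 0;
}

/* Hypothetical transfer buffer. The attribute fixes the alignment no
 * matter how the compiler lays out the neighbouring objects. */
static uint8_t dma_buf[64] __attribute__((aligned(4)));

/* Hand out the buffer only after checking the alignment promise. */
static uint8_t *get_dma_buf(void)
{
    assert(is_dma_aligned(dma_buf));
    return dma_buf;
}
```

Dropping a check like `is_dma_aligned()` in front of every place an address is handed to the controller is a cheap way to find out whether two builds really differ in buffer alignment.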

If you are using Chadderz, my redux of the code is at the link below. It works on Pi 1, Pi 2 and Pi 3 in 32-bit and 64-bit, and the struct alignment attributes may help you:
https://github.com/LdB-ECM/Raspberry-Pi ... m32_64_USB

rst, who also frequents this site, has a copy of USPI:
https://github.com/rsta2/uspi

rlatinov
Posts: 22
Joined: Mon Jun 26, 2017 7:35 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 5:36 am

Yes, I changed the IO address and change from HYP mode to SYS mode.

For the compiler flags, I am using

-mcpu=cortex-a53
and
-mno-unaligned-access

for the RPi 3.

I am using the Embedded Xinu's USB subsystem for the RPi 1, which can be found at https://github.com/xinu-os/xinu/tree/master/device/usb

It appears that Xinu only uses the __packed attribute at the end of their structs.

To my knowledge, Embedded Xinu's code is loosely based off of Chadderz code and the Linux Kernel's code.

I haven't had the chance to look into the way the DMA transfer buffers are initialized, but I will look into it.

I appreciate the help.

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 7:38 am

Found one for you in that code.

Look at the definition of struct usb_descriptor_header; it's only 2 bytes big.

Code: Select all

/** Fields that begin every standard USB descriptor.  */
struct usb_descriptor_header {
    uint8_t bLength;
    uint8_t bDescriptorType;
} __packed;
Now look at the point I have marked with "<<< SEE THIS SUCKER" in usb_get_descriptor.
I believe that could end up balign(2) or balign(4) depending on stack behaviour, and it gets optimized together with that other "uint16_t len" above it.
Then it looks like they pass that in as a pointer to the DMA transfer as (&hdr). If that is going to the DMA and it is balign(2), it is going to have a fit.

I would try adding __attribute__((aligned(4))) to make sure the compiler knows it must end up balign(4) however it gets optimized :-)

One trick you could try is compiling with the optimizer at level zero (-O0); then it shouldn't move stuff around as much. My suspicion is that the new GCC optimizer is exposing flaws in the code. I assume you have a newer version of GCC, as the old version didn't compile for ARMv8.

Code: Select all

usb_get_descriptor(struct usb_device *dev, uint8_t bRequest, uint8_t bmRequestType,
                   uint16_t wValue, uint16_t wIndex, void *buf, uint16_t buflen)
{
    usb_status_t status;
    uint16_t len;
    if (buflen > sizeof(struct usb_descriptor_header))
    {
        /* Get descriptor length.  */
        struct usb_descriptor_header hdr;  // <<< SEE THIS SUCKER
        status = usb_control_msg(dev, NULL, bRequest, bmRequestType,
                                 wValue, wIndex, &hdr, sizeof(hdr));
        if (status != USB_STATUS_SUCCESS)
        {
            return status;
        }

        if (hdr.bLength < sizeof(hdr))
        {
            usb_dev_error(dev, "Descriptor length too short\n");
            return USB_STATUS_INVALID_DATA;
        }

        /* Length to read is the minimum of the descriptor's actual length and
         * the buffer length.  */
        len = min(hdr.bLength, buflen);
    }
    else
    {
        len = buflen;
    }

    /* Read the descriptor for real.  */
    return usb_control_msg(dev, NULL, bRequest, bmRequestType, wValue, wIndex,
                           buf, len);
}
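One caveat on the aligned(4) suggestion, as a minimal sketch assuming GCC: if the attribute goes on the struct type, sizeof is rounded up to a multiple of the alignment, so it becomes 4 instead of 2. usb_get_descriptor above passes sizeof(hdr) as the transfer length and compares bLength against it, so that change is not neutral. Putting the attribute on the local variable keeps sizeof at 2. The type `usb_descriptor_header_a4` and the function `local_header_is_word_aligned` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the Xinu header quoted above. */
struct usb_descriptor_header {
    uint8_t bLength;
    uint8_t bDescriptorType;
} __attribute__((packed));

/* Hypothetical variant with the alignment placed on the type itself. */
struct usb_descriptor_header_a4 {
    uint8_t bLength;
    uint8_t bDescriptorType;
} __attribute__((packed, aligned(4)));

/* Aligning only the variable: sizeof stays 2, the address is word aligned. */
static int local_header_is_word_aligned(void)
{
    struct usb_descriptor_header hdr __attribute__((aligned(4)));

    hdr.bLength = 0;
    hdr.bDescriptorType = 0;
    assert(sizeof(hdr) == 2);
    return (((uintptr_t)&hdr) & 3u) == 0;
}
```

With the type-level attribute, the first control transfer would ask for 4 bytes rather than 2, which may or may not matter to a given device but is worth knowing before comparing logs.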

StevoD
Posts: 19
Joined: Tue Aug 29, 2017 11:37 am

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 11:18 am

rlatinov wrote:
Wed Jun 13, 2018 9:40 pm
After going through some debugging messages, I found that the problem occurs while the DWC is enumerating the devices: it tries to read the device descriptor of what I am guessing is the on-board USB hub/Ethernet device (LAN7515), gets a transfer error, and is therefore unable to enumerate the device.
Hello rlatinov, because you say transfer error, it is probably not an alignment error. You could try doing as Linux does and send a 64-byte request when reading the device descriptor; the Linux driver says some devices are faulty and fail if you send an 8-byte request.

Also make sure you give at least 10 ms settle time after setting the USB address on a new device.
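The 64-byte descriptor request suggested above only changes the wLength field of the SETUP packet. A minimal sketch of building that packet, assuming the standard 8-byte USB control request layout; `usb_setup_packet` and `make_get_device_descriptor` are hypothetical names, not Xinu or Linux functions.

```c
#include <assert.h>
#include <stdint.h>

/* Standard 8-byte USB control SETUP packet. */
struct usb_setup_packet {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
} __attribute__((packed));

#define USB_REQ_GET_DESCRIPTOR  0x06u
#define USB_DESC_TYPE_DEVICE    0x01u

/* Build GET_DESCRIPTOR(Device) asking for wLength bytes.
 * At least 8 bytes are needed to reach bMaxPacketSize0. */
static struct usb_setup_packet make_get_device_descriptor(uint16_t wLength)
{
    struct usb_setup_packet s;

    assert(wLength >= 8);
    s.bmRequestType = 0x80;                      /* device-to-host, standard, device */
    s.bRequest      = USB_REQ_GET_DESCRIPTOR;
    s.wValue        = (uint16_t)(USB_DESC_TYPE_DEVICE << 8); /* type in high byte, index 0 */
    s.wIndex        = 0;
    s.wLength       = wLength;
    return s;
}
```

Calling `make_get_device_descriptor(64)` gives the request described above, while `make_get_device_descriptor(18)` asks for exactly one full device descriptor.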

StevoD
Posts: 19
Joined: Tue Aug 29, 2017 11:37 am

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 11:33 am

LdB wrote:
Thu Jun 14, 2018 7:38 am
Now look at the point I have marked with "<<< SEE THIS SUCKER" in usb_get_descriptor.
I believe that could end up balign(2) or balign(4) depending on stack behaviour, and it gets optimized together with that other "uint16_t len" above it.
Then it looks like they pass that in as a pointer to the DMA transfer as (&hdr). If that is going to the DMA and it is balign(2), it is going to have a fit.
That is plain and simple just wrong!

Struct usb_descriptor_header has only 2 x uint8_t and reading bLength or bDescriptorType can never make an alignment error as it will always use ldrb instruction to read every time.

If you try to say pointer to usb_descriptor_header will give alignment error you should check code in xinu usb first because it looks at alignment of data before sending to usb.

Look at code here https://github.com/xinu-os/xinu/blob/ma ... hcd.c#L988 if you think you know all about everything in xinu.

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 12:46 pm

I don't know xinu at all but I have written a USB driver and I know all the alignment issues inside and out.

You are getting the wrong end of the stick. Yes, they are aware of the problem, but let me walk you through it.

Let me mark a very interesting line of code with <<<< SEE THIS
/* Need to use alternate buffer for DMA, since the actual source or
 * destination is not word-aligned. If the attempted transfer size
 * overflows this alternate buffer, cap it to the greatest number of
 * whole packets that fit. */
chanptr->dma_address = (uint32_t)aligned_bufs[chan];
if (transfer.size > sizeof(aligned_bufs[chan]))
{
    transfer.size = sizeof(aligned_bufs[chan]) -
                    (sizeof(aligned_bufs[chan]) %
                     characteristics.max_packet_size);
    req->short_attempt = 1;
}
/* For OUT endpoints, copy the data to send into the DMA buffer. */
if (characteristics.endpoint_direction == USB_DIRECTION_OUT)
{
    memcpy(aligned_bufs[chan], data, transfer.size); // <<<< SEE THIS
}
Do you see the situation they have? Let's write it out:
An unaligned void* source data pointer. It's guaranteed unaligned; they even checked it and commented it.
An aligned buffer of uint8_t, which will be typecast to void* as it hits memcpy.
A transfer size, which they nicely made match an aligned count.
What happens next? What could possibly go wrong?
The answer is that it is actually up to the C compiler. As an indication, perhaps google "GCC ARM memcpy with unaligned data".

Perhaps I will give you the solution I had to use, complete with my comment, as my USB DMA transfer was no different and it drove me crazy.

Code: Select all

/*==========================================================================}
{	 MY MEMORY COPY .. YEAH I AM OVER THE ARM MEMCOPY ALIGNMENT ISSUES	    }
{==========================================================================*/
void myMemCopy (uint8_t* dest, const uint8_t* source, uint32_t size){
	if (dest && source && size) {
		while (size) {													// While data to copy
			*dest++ = *source++;										// Copy 1 byte from source to dest and increment pointers
			size--;														// Decrement size
		}
	}
}
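One thing to watch with a hand-written byte loop like the one above: as far as I know, an optimizing GCC can recognise the pattern and turn it back into a call to memcpy (the -ftree-loop-distribute-patterns pass), which would defeat the purpose. A minimal sketch of one way around that, going through volatile pointers so every access stays a single byte load or store. `byte_copy` is a made-up name for illustration; building with -fno-tree-loop-distribute-patterns is the other option I am aware of.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy size bytes one at a time. The volatile qualifiers stop the
 * compiler from merging the accesses into wider loads/stores or
 * replacing the loop with a library call. */
static void byte_copy(void *dest, const void *source, size_t size)
{
    volatile uint8_t *d = dest;
    const volatile uint8_t *s = source;

    assert(size == 0 || (dest != NULL && source != NULL));
    while (size--) {
        *d++ = *s++;
    }
}
```

It is slower than a tuned memcpy, but for the few bytes moved during enumeration that should not matter.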
Last edited by LdB on Thu Jun 14, 2018 1:10 pm, edited 2 times in total.

StevoD
Posts: 19
Joined: Tue Aug 29, 2017 11:37 am

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 1:09 pm

LdB wrote:
Thu Jun 14, 2018 12:46 pm
I don't know xinu at all but I have written a USB driver and I know all the alignment issues inside and out.
All such nonsense, I use xinu and it works.

So many projects now have usb and only you have to make your own memcpy to stop alignment error.

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 1:13 pm

Well then we agree and you knew the answer .. so why didn't you tell him and save me crawling thru the code !!!

Anyhow mystery solved and it will probably work if he fixes it, all I knew is it would be an alignment issue.

Do you have a preferred one Xinu uses that you could post for him since he is trying to fix it and my version is very rough?

rlatinov
Posts: 22
Joined: Mon Jun 26, 2017 7:35 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 3:08 pm

Hello,
Thanks for the responses! I really appreciate it.

I tried out what you guys suggested and I am still having no luck. Specifically, I changed 3 things.

I added the -O0 flag for the compiler,

I also changed

Code: Select all

struct usb_descriptor_header {
	uint8_t bLength;
	uint8_t bDescriptorType;
} __packed;
to

Code: Select all

struct __attribute__((__packed__, aligned(4))) usb_descriptor_header {
	uint8_t bLength;
	uint8_t bDescriptorType;
};
For memcpy, I am already using my own version of it, but just to test it out, I made a new one and use it only in the case where you mentioned it.

Code: Select all

static void _memcpy(uint8_t* dest, uint8_t* source, uint32_t size)
{
	if (dest && source && size) 
	{
		while (size) 
		{
			*dest++ = *source++;									
			size--;					
		}
	}
}
and changed this line

Code: Select all

/* For OUT endpoints, copy the data to send into the DMA buffer. */
if (characteristics.endpoint_direction == USB_DIRECTION_OUT)
{
	memcpy(aligned_bufs[chan], data, transfer.size);
}
to

Code: Select all

/* For OUT endpoints, copy the data to send into the DMA buffer. */
if (characteristics.endpoint_direction == USB_DIRECTION_OUT)
{
	_memcpy(aligned_bufs[chan], data, transfer.size);
}
to use the new _memcpy that was suggested.

But like I said, no luck. I still get the same error messages.

Code: Select all

USB: [DEBUG] usb_attach_device(): Reading device descriptor.
USB: [DEBUG] Device 2: usb_submit_xfer_request(): Submitting xfer request (18 bytes, type=Control, dir=IN)
USB: [DEBUG] Device 2: usb_submit_xfer_request(): Control message: {.bmRequestType=0x80, .bRequest=0x06, wValue=0x0100, wIndex=0x0000, wLength=0x0012}
USB: [DEBUG] Device 2: dwc_channel_start_xfer(): Starting SETUP transaction
USB: [DEBUG] Device 2: dwc_channel_start_xfer(): Setting up transactions on channel 7:
                max_packet_size=64, endpoint_number=0, endpoint_direction=OUT,
                low_speed=0, endpoint_type=Control, device_address=2,
                size=8, packet_count=1, packet_id=3, split_enable=0, complete_split=0
USB: [DEBUG] dwc_interrupt_handler(): Received SOF intr (host_frame_number=0x0ccd370c)
USB: [DEBUG] dwc_handle_channel_halted_interrupt(): Handling channel 7 halted interrupt
                (interrupts pending: 0x00000082, characteristics=0x20900040, transfer=0x60080008)
USB: [ERROR] Device 2: Transfer error on channel 7 (interrupts pending: 0x00000082, packet_count=1)
USB: [DEBUG] Device 2: usb_complete_xfer(): Calling completion callback (Actual transfer size 0 of 18 bytes, type=Control, dir=IN, status=-3)
USB: [ERROR] Device 2: Failed to read device descriptor: hardware error
USB: [ERROR] Device 1: Failed to attach new device to port 1: hardware error
USB: [DEBUG] Device 2: usb_free_device(): Releasing USB device structure.
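The raw register values in the log above can be decoded by hand. This is a minimal sketch, assuming the host channel register layout as I understand it from the DWC2 definitions used by Linux and by Xinu's register header (interrupt bit 1 = channel halted, bit 7 = transaction error; transfer size in bits 0-18, packet count in bits 19-28, packet ID in bits 29-30). All macro and function names here are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Host channel interrupt bits (assumed layout, see the note above). */
#define HCINT_XFER_COMPLETE   (1u << 0)
#define HCINT_CHANNEL_HALTED  (1u << 1)
#define HCINT_AHB_ERROR       (1u << 2)
#define HCINT_STALL           (1u << 3)
#define HCINT_NAK             (1u << 4)
#define HCINT_ACK             (1u << 5)
#define HCINT_NYET            (1u << 6)
#define HCINT_XACT_ERROR      (1u << 7)

/* Host channel characteristics fields. */
static uint32_t hcchar_max_packet_size(uint32_t v) { return v & 0x7FFu; }
static uint32_t hcchar_device_address(uint32_t v)  { return (v >> 22) & 0x7Fu; }

/* Host channel transfer fields. */
static uint32_t hctsiz_size(uint32_t v)         { return v & 0x7FFFFu; }
static uint32_t hctsiz_packet_count(uint32_t v) { return (v >> 19) & 0x3FFu; }
static uint32_t hctsiz_packet_id(uint32_t v)    { return (v >> 29) & 0x3u; }

/* Cross-check the decoders against the values printed in the log. */
static void check_against_log(void)
{
    const uint32_t pending = 0x00000082u;
    const uint32_t characteristics = 0x20900040u;
    const uint32_t transfer = 0x60080008u;

    assert(pending == (HCINT_CHANNEL_HALTED | HCINT_XACT_ERROR));
    assert(hcchar_max_packet_size(characteristics) == 64);
    assert(hcchar_device_address(characteristics) == 2);
    assert(hctsiz_size(transfer) == 8);
    assert(hctsiz_packet_count(transfer) == 1);
    assert(hctsiz_packet_id(transfer) == 3);
}
```

If that bit layout is right, 0x82 means the channel halted with a transaction error and nothing else (no STALL, no NAK, no AHB error), and the size, packet count and packet ID match what the driver printed when it set up the SETUP transaction.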

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 3:32 pm

All I can suggest is to search the whole of the DWC file for memcpy and replace every call; there are likely to be a couple. It won't matter, as the code does the same thing, it is just slower.

That was just the outbound copy. There should be the same thing for received data to align it, and it is the read you are having issues with.
It looks right: it's requesting 18 bytes and not getting any back.
If you get stuck I am happy to build it; just point me at which repo you are using.

Update .. yep, found the inbound one. It is marked with my comment "<<<<<< THIS GUY".

Code: Select all

if (dir == USB_DIRECTION_IN)
{
    /* The transfer.size field seems to be updated sanely for IN
     * transfers.  (Good thing too, since otherwise it would be
     * impossible to determine the length of short packets...)  */
    bytes_transferred = req->attempted_bytes_remaining -
                        chanptr->transfer.size;
    /* Copy data from DMA buffer if needed */
    if (!IS_WORD_ALIGNED(req->cur_data_ptr))
    {
        memcpy(req->cur_data_ptr,
               &aligned_bufs[chan][req->attempted_size -
                                   req->attempted_bytes_remaining],
               bytes_transferred);  // <<<<<< THIS GUY
    }
}

rlatinov
Posts: 22
Joined: Mon Jun 26, 2017 7:35 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 3:48 pm

I replaced all the memcpy's in the DWC file and am getting the same errors.

The repo I am using is https://github.com/rlatinovich/xinu/tree/feature/usb
(forked from the xinu repo)

Once again, thank you for your helpfulness.

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Thu Jun 14, 2018 3:53 pm

No worries, hopefully it shouldn't take me long to find it.

Update: I am in makefile hell .. it won't build under Cygwin. It might be faster to set up a spare machine on Linux :-)

rlatinov
Posts: 22
Joined: Mon Jun 26, 2017 7:35 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Fri Jun 15, 2018 2:54 pm

For building, a gcc cross compiler toolchain might work better. Currently I am using the arm-none-eabi-gcc version 7.1.0, if that helps.

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Fri Jun 15, 2018 3:19 pm

I have GCC 6.3, 7.1, 7.2 and Linaro 6.1, 7.1 on my system, so that isn't the problem. If it was just a normal make I could rewrite the makefile, but it does shell executes to run what look like scripts, runs bison to build some conf.h file, and all sorts of weird stuff. The only thing I have seen like this was ReactOS, which had a whole build system they wrote. I have Cygwin on Windows, but even in that terminal all the shell executes failed.

Hence the cross tool won't help me; what I need is a running Linux system by the look of it. What do you use?

At the moment I am just slowly trying to write a manual makefile, but I don't know what conf.h should even look like, so I am having to reverse engineer it by looking at errors and the stuff it must contain. I am figuring I just need the base system and your USB files to look at it. I have got the base code up, but then it looks like the dev header is scripted and built differently for different processors.

rlatinov
Posts: 22
Joined: Mon Jun 26, 2017 7:35 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Fri Jun 15, 2018 4:04 pm

You're right. I forgot about that...

I am using a Fedora Linux system.

I'll post the files here, you can copy and paste them into your code. If you have conf.c then the Makefile will bypass those scripts.

Here is the conf.h: (goes in include/conf.h)

Code: Select all

/* conf.h (GENERATED FILE; DO NOT EDIT) */

#ifndef _CONF_H_
#define _CONF_H_

#include <stddef.h>

/* Device table declarations */

/* Device table entry */
typedef struct dentry
{
	int     num;
	int     minor;
	char    *name;
	devcall (*init)(struct dentry *);
	devcall (*open)(struct dentry *, ...);
	devcall (*close)(struct dentry *);
	devcall (*read)(struct dentry *, void *, uint);
	devcall (*write)(struct dentry *, const void *, uint);
	devcall (*seek)(struct dentry *, long);
	devcall (*getc)(struct dentry *);
	devcall (*putc)(struct dentry *, char);
	devcall (*control)(struct dentry *, int, long, long);
	void    *csr;
	void    (*intr)(void);
	uchar   irq;
} device;

extern const device devtab[]; /* one entry per device */

/* Device name definitions */

#define SERIAL0     0       /* type uart     */
#define DEVNULL     1       /* type null     */
#define CONSOLE     2       /* type tty      */
#define TTYLOOP     3       /* type tty      */

/* Control block sizes */

#define NNULL 1
#define NUART 1
#define NTTY 2

#define DEVMAXNAME 20
#define NDEVS 4


/* Configuration and Size Constants */

#define LITTLE_ENDIAN 0x1234
#define BIG_ENDIAN    0x4321

#define BYTE_ORDER    LITTLE_ENDIAN

#define NTHREAD   100           /* number of user threads           */
#define NSEM      100           /* number of semaphores             */
#define NMAILBOX  15            /* number of mailboxes              */
#define RTCLOCK   TRUE          /* timer support                    */
#define NETEMU    FALSE         /* Network Emulator support         */
#define NVRAM     FALSE         /* nvram support                    */
#define SB_BUS    FALSE         /* Silicon Backplane support        */
#define USE_TLB   FALSE         /* make use of TLB                  */
#define USE_TAR   FALSE         /* enable data archives             */
#define NPOOL     8             /* number of buffer pools available */
#define POOL_MAX_BUFSIZE 2048   /* max size of a buffer in a pool   */
#define POOL_MIN_BUFSIZE 8      /* min size of a buffer in a pool   */
#define POOL_MAX_NBUFS   8192   /* max number of buffers in a pool  */
#define WITH_USB                /* USB support                      */
//#define WITH_DHCPC              /* DHCP client support              */

#endif /* _CONF_H_ */

Here is conf.c: (goes in system/conf.c)

Code: Select all

/* conf.c (GENERATED FILE; DO NOT EDIT) */

#include <conf.h>
#include <device.h>

#include <null.h>
#include <uart.h>
#include <tty.h>

extern devcall ioerr(void);
extern devcall ionull(void);

/* device independent I/O switch */

const device devtab[NDEVS] =
{
/**
 * Format of entries is:
 * dev-number, minor-number, dev-name,
 * init, open, close,
 * read, write, seek,
 * getc, putc, control,
 * dev-csr-address, intr-handler, irq
 */

/* SERIAL0 is uart */
	{ 0, 0, "SERIAL0",
	  (void *)uartInit, (void *)ionull, (void *)ionull,
	  (void *)uartRead, (void *)uartWrite, (void *)ioerr,
	  (void *)uartGetc, (void *)uartPutc, (void *)uartControl,
	  (void *)0x3f201000, (void *)uartInterrupt, 57 },

/* DEVNULL is null */
	{ 1, 0, "DEVNULL",
	  (void *)ionull, (void *)ionull, (void *)ionull,
	  (void *)ionull, (void *)ionull, (void *)ioerr,
	  (void *)ionull, (void *)ionull, (void *)ioerr,
	  (void *)0x0, (void *)ioerr, 0 },

/* CONSOLE is tty */
	{ 2, 0, "CONSOLE",
	  (void *)ttyInit, (void *)ttyOpen, (void *)ttyClose,
	  (void *)ttyRead, (void *)ttyWrite, (void *)ioerr,
	  (void *)ttyGetc, (void *)ttyPutc, (void *)ttyControl,
	  (void *)0x0, (void *)ioerr, 0 },

/* TTYLOOP is tty */
	{ 3, 1, "TTYLOOP",
	  (void *)ttyInit, (void *)ttyOpen, (void *)ttyClose,
	  (void *)ttyRead, (void *)ttyWrite, (void *)ioerr,
	  (void *)ttyGetc, (void *)ttyPutc, (void *)ttyControl,
	  (void *)0x0, (void *)ioerr, 0 }
};


LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Sat Jun 16, 2018 7:53 am

Getting close now, but there is a new one:

Code: Select all

process_begin: CreateProcess(NULL, sh mkvers.sh arm-rpi3, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [../include/version.h] Error 2
I take it version.h is also scripted to be created?

Update: Ignore that, I worked out it must just have a #define for the version in it. I am down to a new problem with /lib/libxc.a, which is right at the point of building the elf file. It compiles the libxc.a file and then does something that crashes.
"Installing" ../lib/libxc.a
'tr' is not recognized as an internal or external command, operable program or batch file.
tr is obviously a linux cmd or bash file to do some sort of filtering
CFILES := $(filter-out $($(shell echo $(LIBNAME) | tr a-z A-Z)_OVERRIDE_CFILES), $(CFILES))
SFILES := $(filter-out $($(shell echo $(LIBNAME) | tr a-z A-Z)_OVERRIDE_SFILES), $(SFILES))
UPDATE: Okay, I worked out roughly what that does and worked out a Windows equivalent, and I now have Xinu.elf and Xinu.boot, but it isn't a bare metal boot file. I am trying to work out what I am supposed to do with this, or how to get a bare metal boot file. It might be the linker script; I am sure it's not supposed to throw these:
Discarded input sections
.data 0x00000000 0x0 ../loader/platforms/arm-rpi3/start.o
.bss 0x00000000 0x0 ../loader/platforms/arm-rpi3/start.o
.data 0x00000000 0x0 ../system/platforms/arm-rpi3/ctxsw.o
.bss 0x00000000 0x0 ../system/platforms/arm-rpi3/ctxsw.o
.data 0x00000000 0x0 ../system/platforms/arm-rpi3/halt.o
.bss 0x00000000 0x0 ../system/platforms/arm-rpi3/halt.o
.data 0x00000000 0x0 ../system/platforms/arm-rpi3/intutils.o
.bss 0x00000000 0x0 ../system/platforms/arm-rpi3/intutils.o
.data 0x00000000 0x0 ../system/platforms/arm-rpi3/irq_handler.o
.bss 0x00000000 0x0 ../system/platforms/arm-rpi3/irq_handler.o
.data 0x00000000 0x0 ../system/platforms/arm-rpi3/memory_barrier.o
.bss 0x00000000 0x0 ../system/platforms/arm-rpi3/memory_barrier.o
.text 0x00000000 0x8 ../system/platforms/arm-rpi3/pause.o
.data 0x00000000 0x0 ../system/platforms/arm-rpi3/pause.o
.bss 0x00000000 0x0 ../system/platforms/arm-rpi3/pause.o
UPDATE: OK, sorted out the linker. You really need a KEEP section, as most linkers now default to throwing away unused sections, so with no KEEP it throws everything away. I just put it on the init, obviously ..... KEEP(*(.init .init.*))

Does the code actually work reliably for you? I just went to put the debug stub online, and it looks like the FPU is still offline (it may come on later than I was expecting) and there is something clunky with the MMU. I am going to bring the screen up in platforminit.c so I can debug :-)

Anyhow, got it running; will look at the USB.
Last edited by LdB on Sat Jun 16, 2018 1:30 pm, edited 6 times in total.

rpdom
Posts: 12648
Joined: Sun May 06, 2012 5:17 am
Location: Ankh-Morpork

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Sat Jun 16, 2018 11:51 am

LdB wrote:
Sat Jun 16, 2018 7:53 am
"Installing" ../lib/libxc.a
'tr' is not recognized as an internal or external command, operable program or batch file.
tr is obviously a linux cmd or bash file to do some sort of filtering
CFILES := $(filter-out $($(shell echo $(LIBNAME) | tr a-z A-Z)_OVERRIDE_CFILES), $(CFILES))
SFILES := $(filter-out $($(shell echo $(LIBNAME) | tr a-z A-Z)_OVERRIDE_SFILES), $(SFILES))
tr stands for "translate". It performs simple editing of the input stream. In this case it is being used to convert all lowercase letters in $LIBNAME into uppercase ones.

(translate characters in the range "a-z" to the characters in the range "A-Z")

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Sat Jun 16, 2018 2:03 pm

Thanks for that rpdom.

rlatinov:
Issue 1:
Okay, you had me going. For the Pi screen you forgot to use PERIPHERALS_BASE instead of MMIO_BASE (0x20000000), so the address in the mailbox code is the old Pi 1 one, hence no screen ... the struct also needs to be packed :-)

I have changed MMIO_BASE to 0x3F000000, but now you have two of these floating around meaning the same thing.

You also have the status register bug: there are two status registers, not one as listed in framebuffer.

Code: Select all

// These are the read pair the ARM reads from
#define MAILBOX_READ 0x00 // the register we read from
#define MAILBOX_READ_STATUS 0x18 // the read status register .. check for EMPTY

// These are the write pair the ARM writes to
#define MAILBOX_WRITE 0x20 // the register we write to
#define MAILBOX_WRITE_STATUS 0x38 // the write status register .. check for FULL
You also have another mailbox in bcm2837_power.c used by the USB and it has status problem as well ... How many mailbox code blocks do you want :-)
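To show how the two status registers are meant to be used, here is a minimal sketch of a mailbox write and read. It assumes the usual VideoCore mailbox conventions (FULL is bit 31, EMPTY is bit 30, the channel number sits in the low 4 bits of the word) and takes the mailbox base as a parameter, which on a Pi 3 would be the peripheral base plus 0xB880. The functions `mbox_reg`, `mailbox_write` and `mailbox_read` are hypothetical names, not the ones in the Xinu framebuffer or power code.

```c
#include <assert.h>
#include <stdint.h>

#define MAILBOX_READ          0x00u
#define MAILBOX_READ_STATUS   0x18u
#define MAILBOX_WRITE         0x20u
#define MAILBOX_WRITE_STATUS  0x38u

#define MAILBOX_FULL   0x80000000u  /* no room to write */
#define MAILBOX_EMPTY  0x40000000u  /* nothing to read */

/* Address of one 32-bit mailbox register. */
static volatile uint32_t *mbox_reg(volatile uint8_t *base, uint32_t offset)
{
    return (volatile uint32_t *)(base + offset);
}

/* Wait on the WRITE status register, then post data on a channel. */
static void mailbox_write(volatile uint8_t *base, uint32_t channel, uint32_t data)
{
    assert((channel & ~0xFu) == 0 && (data & 0xFu) == 0);
    while (*mbox_reg(base, MAILBOX_WRITE_STATUS) & MAILBOX_FULL) {
        /* spin until there is room */
    }
    *mbox_reg(base, MAILBOX_WRITE) = data | channel;
}

/* Wait on the READ status register, then fetch a word for our channel. */
static uint32_t mailbox_read(volatile uint8_t *base, uint32_t channel)
{
    for (;;) {
        while (*mbox_reg(base, MAILBOX_READ_STATUS) & MAILBOX_EMPTY) {
            /* spin until something arrives */
        }
        uint32_t value = *mbox_reg(base, MAILBOX_READ);
        if ((value & 0xFu) == channel) {
            return value & ~0xFu;
        }
    }
}
```

Passing the base in makes it possible to exercise the logic against a plain array on a desktop before pointing it at the real hardware address.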

Issue 2:
The FPU is off and it is using soft floats, because when I went to turn on the FPU in start.s with the code below ... I got the good old "unknown register / unknown command" errors. I put it just after the unaligned-access register change they did. I had to fix up the context switch code as well. If you don't know how, leave all this for later .. I just need it to debug.

Code: Select all

;@"========================================================================="
@#    PI NSACR register setup for access to floating point unit
@#    Cortex A-7 => Section 4.3.34. Non-Secure Access Control Register
@#    Cortex A-53 => Section 4.5.32. Non-Secure Access Control Register
;@"========================================================================="
	mrc p15, 0, r0, c1, c1, 2				;@ Read NSACR into R0
	cmp r0, #0x00000C00						;@ Access turned on or in AARCH32 mode already
	beq .free_to_enable_fpu1
	orr r0, r0, #0x3<<10					;@ Set access to both secure and non secure modes
	mcr p15, 0, r0, c1, c1, 2				;@ Write NSACR
;@"========================================================================="
@#                           Bring fpu online
;@"========================================================================="
.free_to_enable_fpu1:
	mrc p15, 0, r0, c1, c0, #2				;@ R0 = Access Control Register
	orr r0, #(0x300000 + 0xC00000)			;@ Enable Single & Double Precision
	mcr p15,0,r0,c1,c0, #2					;@ Access Control Register = R0
	mov r0, #0x40000000						;@ R0 = Enable VFP
	vmsr fpexc, r0							;@ FPEXC = R0
OK, so you need to add some new flags to the C compiler and assembler flags in platformvars:
CFLAGS += -mcpu=cortex-a53 -mfpu=neon-vfpv4 -mfloat-abi=hard
ASFLAGS += -mcpu=cortex-a53 -mfpu=neon-vfpv4 -mfloat-abi=hard
Then you need to do "make libclean", because you need to rebuild the library to use hard floats.
Then do a "make clean".
Then do a "make" and it should rebuild completely, now with FPU code.

Now I can debug (I don't have a softfloat debugger stub) and moving onto USB :-)

rlatinov
Posts: 22
Joined: Mon Jun 26, 2017 7:35 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Sun Jun 17, 2018 2:24 am

Thanks LdB,

The mailboxes are used the same way as they were on the Pi 1, except that in bcm2837.h the peripheral base address has been updated for the Pi 3. I did not know that I had to change the mailbox addresses as well, so thank you for that.

Also, for my testing I have not been using the screen output but rather I have been using the UART output via a USB to UART connector on my desktop. I don't know if that would be easier for you to use for testing purposes.

Just a question about the FPU. I understand that I have not been using it, but I don't quite understand the necessity of using it. Is it for your debugging purposes or is it a necessary step to reduce bugs?

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Sun Jun 17, 2018 3:30 am

If the debugger stub started using software float maths for offsets, table calcs, float display etc., it would go re-entrant on the same maths routines you are using. I have no way of knowing if the current library is re-entrant safe. The hardware FPU is just a series of registers and result registers; when the debug stub wants to use them it can simply grab them, save them, do what it wants and then put them back. You do the same for context switching when you have them turned on (that is the couple of assembler lines you add). The only other way is for the debugger to carry its own soft copy of the float mathematics so it doesn't clash with what it's debugging, and that is a lot of work.

So basically this is the code to push all the FPU registers onto a stack .. it beats worrying about re-entrancy or writing a soft library.

Code: Select all

fstmdbd sp!, {d0-d15}  // Push Floating point registers D0 to D15 onto stack
fmrx r12, fpscr
push {r12}                       //Push FPSCR onto stack
fmrx r12, fpexc
push {r12}                       //Save FPEXC onto stack
This is the code to load them back off the stack

Code: Select all

 pop {r12}
 fmxr fpexc, r12           //Restore FPEXC from stack
 pop {r12}
 fmxr fpscr, r12           //Restore FPSCR from stack
 fldmiad sp!, {d0-d15} //Restore Floating point registers D0 to D15 from stack
I have both your serial output and the screen debugging.

I fixed up a couple of problems, but there is something funny in the device enumeration: my second device is a mouse, so it is supposed to be a low speed transaction, but it's going at full speed. Device 0 = fake root hub, Device 1 = Pi physical hub .. for me Device 2 = Mouse

Code: Select all

USB: [DEBUG] Device 2: dwc_channel_start_xfer(): Setting up transactions on channel 7:
		max_packet_size=64, endpoint_number=0, endpoint_direction=OUT,
		low_speed=0, endpoint_type=Control, device_address=2,
		size=8, packet_count=1, packet_id=3, split_enable=0, complete_split=0

rlatinov
Posts: 22
Joined: Mon Jun 26, 2017 7:35 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Sun Jun 17, 2018 5:42 am

There is the on board ethernet device, which could be the device you're seeing. Unless you're very sure that it is in fact the mouse.

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Sun Jun 17, 2018 5:48 am

Ah yes, the old LAN9514

Houston we have a bug !!!!
USB: [DEBUG] Device 0: dwc_channel_start_xfer(): Setting up transactions on channel 7:
max_packet_size=64, endpoint_number=0, endpoint_direction=IN,
low_speed=0, endpoint_type=Control, device_address=0,
size=0, packet_count=1, packet_id=2, split_enable=0, complete_split=0
USB: [DEBUG] dwc_interrupt_handler(): Received SOF intr (host_frame_number=0x1a7333a4)
USB: [DEBUG] dwc_handle_channel_halted_interrupt(): Handling channel 7 halted interrupt
(interrupts pending: 0x00000023, characteristics=0x20108040, transfer=0x8007fff8)
USB: [DEBUG] Device 0: dwc_handle_normal_channel_halted(): 1 packets transferred on channel 7
USB: [DEBUG] Device 0: dwc_handle_normal_channel_halted(): attempted_bytes_remaining: 0 Channel transfer size: 524280
USB: [DEBUG] Device 0: dwc_handle_normal_channel_halted(): Calculated 4294443016 bytes transferred
So one packet of max 64 bytes becomes a transfer size of 524280, which when it is subtracted from another value becomes 4294443016 :-)

The interesting part is that 524280 = 0x7FFF8, which is a 19-bit value, the exact width of size in dwc_host_channel_transfer

Code: Select all

uint32_t size         : 19; /* Bits 0-18  */
Interpret that as a signed 19-bit value and it's -8, which would be what we are after.

Hmmm .. it looks like the transfer size wasn't set and it's gone negative.
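The arithmetic can be checked with a few lines of C. This is a minimal sketch that only reproduces the numbers from the log, assuming the size field is the low 19 bits of the transfer register as in the bitfield quoted above; `size_field_as_signed` and `bytes_transferred` are made-up helper names, not Xinu functions.

```c
#include <assert.h>
#include <stdint.h>

#define HCTSIZ_SIZE_BITS 19u
#define HCTSIZ_SIZE_MASK ((1u << HCTSIZ_SIZE_BITS) - 1u)

/* Interpret the 19-bit size field as a two's complement number. */
static int32_t size_field_as_signed(uint32_t reg)
{
    uint32_t field = reg & HCTSIZ_SIZE_MASK;

    if (field & (1u << (HCTSIZ_SIZE_BITS - 1u))) {
        return (int32_t)field - (int32_t)(1u << HCTSIZ_SIZE_BITS);
    }
    return (int32_t)field;
}

/* The same unsigned subtraction the driver performs for IN transfers. */
static uint32_t bytes_transferred(uint32_t attempted_bytes_remaining, uint32_t reg)
{
    return attempted_bytes_remaining - (reg & HCTSIZ_SIZE_MASK);
}

/* Reproduce the three numbers printed in the log. */
static void check_log_numbers(void)
{
    const uint32_t transfer = 0x8007fff8u;

    assert((transfer & HCTSIZ_SIZE_MASK) == 524280u);
    assert(size_field_as_signed(transfer) == -8);
    assert(bytes_transferred(0u, transfer) == 4294443016u);
}
```

So the field reads as 524280 unsigned or -8 signed, and subtracting it from an attempted_bytes_remaining of 0 in 32-bit unsigned arithmetic wraps around to 4294443016, which is the "bytes transferred" figure in the log.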

Did you want the changes for running with the FPU? It's only 3 files you have to change, which are all in your Pi platform stuff.
They are here anyhow if you do: https://github.com/LdB-ECM/Exchange/tre ... r/rlatinov

rlatinov
Posts: 22
Joined: Mon Jun 26, 2017 7:35 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Sun Jun 17, 2018 5:41 pm

That's a very interesting find! I'm guessing that the size is being set automatically by the DWC when it does a transfer? If so, how is it supposed to be corrected?

Thanks for the FPU changes, I will add them to my code.

LdB
Posts: 769
Joined: Wed Dec 07, 2016 2:29 pm

Re: Raspberry Pi 3B+ Bare Metal USB Driver

Sun Jun 17, 2018 5:46 pm

Hehe, I just realized I left my test activity LED flash code in Start.S .... you can remove it, it was just for debugging.

And don't forget to change the lines in platformvars for FPU operation .. I lied, there were 4 file changes :-)
I forgot the secondary cores (1, 2, 3) have to start with the FPU on as well .. it's up there (setupCores.S).

platformvars changes

Code: Select all

CFLAGS   += -mcpu=cortex-a53 -mfpu=neon-vfpv4 -mfloat-abi=hard
ASFLAGS  += -mcpu=cortex-a53 -mfpu=neon-vfpv4 -mfloat-abi=hard
Don't forget to run "make libclean", as you are changing the libraries to hard floats.
