CAN controller


by Zeta » Tue Dec 25, 2012 9:32 pm
Hello Chris,

muellie wrote:Version 1 is damaged and everything seems to be working fine (except the CAN itself (ERROR-Active)) on MCP2515 side.

Beware of the misleading "ERROR-ACTIVE" flag: it does not indicate a problem.
From the MCP2515 datasheet (http://ww1.microchip.com/downloads/en/d ... 21801d.pdf), you can read:
6.6 Error States wrote:Each CAN node is in one of the three error states according to the value of the internal error counters:
1. Error-active.
2. Error-passive.
3. Bus-off (transmitter only).
The error-active state is the usual state where the node can transmit messages and active error frames (made of dominant bits) without any restrictions.
In the error-passive state, messages and passive error frames (made of recessive bits) may be transmitted.
The bus-off state makes it temporarily impossible for the station to participate in the bus communication. During this state, messages can neither be received or transmitted. Only transmitters can go bus-off.

So the error-active state means that error signalling is active (the node reports any error it detects with an active error frame), not that an error has already been detected. When the error counters increase (above 127), the node switches to error-passive, where it can only send passive error frames. Finally, if there are too many transmit errors (transmit error counter above 255), it switches to bus-off, which stops the device from taking part in bus traffic.
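To make the three states concrete, here is a minimal Python sketch that derives the state from the two error counters. It assumes the usual CAN thresholds (error-passive once either counter exceeds 127, bus-off once the transmit counter exceeds 255); `can_error_state` is a hypothetical helper, not part of any driver.

```python
def can_error_state(tec: int, rec: int) -> str:
    """Map the transmit (TEC) and receive (REC) error counters to a state name."""
    if tec > 255:
        # Only the transmit counter can take a node off the bus.
        return "BUS-OFF"
    if tec > 127 or rec > 127:
        return "ERROR-PASSIVE"
    # Normal state: the node may send messages and active error frames.
    return "ERROR-ACTIVE"


for tec, rec in [(0, 0), (128, 0), (0, 130), (256, 0)]:
    print(f"TEC={tec:3d} REC={rec:3d} -> {can_error_state(tec, rec)}")
```

A freshly started interface with both counters at zero therefore reports ERROR-ACTIVE, which is the healthy state.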

I won't have an HDMI monitor or my CAN module with me for a few days, so I will not be able to check whether having HDMI plugged in is a problem. Everything I have done so far was headless, over SSH.
Posts: 72
Joined: Wed Dec 12, 2012 9:51 pm
by muellie » Wed Jan 02, 2013 7:45 pm
Hello,

I wish all of you a happy new year. Zeta, thanks for the reply, got it.

I found the problem: the ribbon wire was too long (20 cm). I am working now with a pretty short one and everything seems to be fine, with HDMI, S-Video as well as SSH.

Unfortunately I ran into the next problem: the MCP2515 doesn't send. I can "send" on SocketCAN (the frames land in the buffer), but nothing goes out through the MCP2515. I do not have another node on the CAN bus; it is only terminated after my board with a second resistor. I am working in loopback mode, which means internal transmission between the transmit and receive buffers, with the ACK bit ignored. Is another CAN node necessary anyway? The setup seems to be fine; this is the only possible cause I have thought of so far. I also tried to get to BUS-OFF and do a restart, after which some activity on the CAN is generated. As far as I can see, the general behavior is as expected (error count OK, bus-off OK, etc.); please have a look below.

Initial Setup:
kernel messages:
Code: Select all
[    0.075777]  BCM2708 mcp251x_init:  got IRQ 195 for MCP2515
[   65.840684] bcm2708_spi bcm2708_spi.0: SPI Controller at 0x20204000 (irq 80)
[   65.893083] CAN device driver interface
[   65.934128] mcp251x spi0.0: CANSTAT 0x80 CANCTRL 0x07
[   65.938441] mcp251x spi0.0: probed
[   66.075257] mcp251x spi0.0: bit-timing not yet defined
[   66.075287] mcp251x spi0.0: unable to set initial baudrate!
...
[   69.536099] can: controller area network core (rev 20090105 abi 8)
...
[   73.774849] can: raw protocol (rev 20090105)
...
[   76.006846] can: broadcast manager protocol (rev 20090105 t)
...
[   95.214349] mcp251x spi0.0: bit-timing not yet defined
[   95.214379] mcp251x spi0.0: unable to set initial baudrate!
[   95.594352] mcp251x spi0.0: CNF: 0x03 0xb5 0x01

interrupts:
Code: Select all
           CPU0       
  3:      10897   ARMCTRL  BCM2708 Timer Tick
 32:     122754   ARMCTRL  dwc_otg, dwc_otg_pcd, dwc_otg_hcd:usb1
 52:    1631075   ARMCTRL  BCM2708 GPIO catchall handler
 65:        315   ARMCTRL  ARM Mailbox IRQ
 66:          1   ARMCTRL  VCHIQ doorbell
 75:          1   ARMCTRL
 77:       6804   ARMCTRL  bcm2708_sdhci (dma)
 80:    3262160   ARMCTRL  bcm2708_spi.0
 83:         19   ARMCTRL  uart-pl011
 84:       8484   ARMCTRL  mmc0
195:    1631075      GPIO  mcp251x
FIQ:              usb_fiq
Err:          0

lsmod:
Code: Select all
Module                  Size  Used by
can_bcm                11130  0
can_raw                 5798  0
can                    22877  2 can_raw,can_bcm
mcp251x                 9059  0
can_dev                 8587  1 mcp251x
spidev                  5152  0
spi_bcm2708             4401  0
evdev                   8714  2
joydev                  9146  0
bnep                   10406  2
rfcomm                 33990  0
bluetooth             156976  10 rfcomm,bnep
snd_bcm2835            12732  0
snd_pcm                75426  1 snd_bcm2835
snd_page_alloc          4987  1 snd_pcm
snd_seq                52584  0
snd_seq_device          6316  1 snd_seq
snd_timer              19642  2 snd_seq,snd_pcm
snd                    52567  5 snd_timer,snd_seq_device,snd_seq,snd_pcm,snd_bcm2835


I always tried to send 15 times (to get the buffer overrun)
# cat /proc/net/can/stats
Code: Select all
       11 transmitted frames (TXF)
        0 received frames (RXF)
        0 matched frames (RXMF)

        0 % total match ratio (RXMR)
        0 frames/s total tx rate (TXR)
        0 frames/s total rx rate (RXR)

        0 % current match ratio (CRXMR)
        0 frames/s current tx rate (CTXR)
        0 frames/s current rx rate (CRXR)

        0 % max match ratio (MRXMR)
        1 frames/s max tx rate (MTXR)
        0 frames/s max rx rate (MRXR)

        0 current receive list entries (CRCV)
        0 maximum receive list entries (MRCV)


Before unplugging the board: # ip -details -statistics link show can0
Code: Select all
3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 10
    link/can
    can <LOOPBACK> state ERROR-ACTIVE restart-ms 0
    bitrate 125000 sample-point 0.875
    tq 500 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
    mcp251x: tseg1 3..16 tseg2 2..8 sjw 1..4 brp 1..64 brp-inc 1
    clock 8000000
    re-started bus-errors arbit-lost error-warn error-pass bus-off
    0          0          0          0          0          0         
    RX: bytes  packets  errors  dropped overrun mcast   
    0          0        0       0       0       0     
    TX: bytes  packets  errors  dropped carrier collsns
    0          0        0       0       0       0     
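As a cross-check of the bit-timing values in this output, here is a minimal Python sketch. It assumes that the `CNF: 0x03 0xb5 0x01` line in the kernel log lists CNF1, CNF2 and CNF3 in that order, and that the `clock 8000000` value is the clock feeding the baud-rate prescaler (so one time quantum is BRP / clock). `decode_cnf` and `bit_timing` are hypothetical helper names.

```python
def decode_cnf(cnf1: int, cnf2: int, cnf3: int) -> dict:
    """Decode the MCP2515 CNF1..CNF3 registers into bit-timing fields.

    Every field is stored as (value - 1) in the register.
    """
    return {
        "brp": (cnf1 & 0x3F) + 1,
        "sjw": (cnf1 >> 6) + 1,
        "prop_seg": (cnf2 & 0x07) + 1,
        "phase_seg1": ((cnf2 >> 3) & 0x07) + 1,
        "phase_seg2": (cnf3 & 0x07) + 1,
    }


def bit_timing(clock_hz, brp, prop_seg, phase_seg1, phase_seg2):
    """Return (tq in ns, bitrate in bit/s, sample point) for one CAN bit."""
    # One bit = 1 sync quantum + propagation segment + two phase segments.
    quanta = 1 + prop_seg + phase_seg1 + phase_seg2
    tq_ns = brp * 1e9 / clock_hz
    bitrate = clock_hz / (brp * quanta)
    # The bit is sampled at the end of phase segment 1.
    sample_point = (1 + prop_seg + phase_seg1) / quanta
    return tq_ns, bitrate, sample_point


fields = decode_cnf(0x03, 0xB5, 0x01)
tq_ns, bitrate, sample_point = bit_timing(
    8_000_000,
    fields["brp"],
    fields["prop_seg"],
    fields["phase_seg1"],
    fields["phase_seg2"],
)
print(fields)
print(tq_ns, bitrate, sample_point)
```

With the values from the log this gives a 500 ns time quantum, 125000 bit/s and a sample point of 0.875, which agrees with what `ip` reports.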


Afterward:
Code: Select all
3: can0: <NO-CARRIER,NOARP,UP,ECHO> mtu 16 qdisc pfifo_fast state DOWN mode DEFAULT qlen 10
    link/can
    can <LOOPBACK> state BUS-OFF restart-ms 0
    bitrate 125000 sample-point 0.875
    tq 500 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
    mcp251x: tseg1 3..16 tseg2 2..8 sjw 1..4 brp 1..64 brp-inc 1
    clock 8000000
    re-started bus-errors arbit-lost error-warn error-pass bus-off
    0          0          0          1          1          1         
    RX: bytes  packets  errors  dropped overrun mcast   
    16         2        2       0       2       0     
    TX: bytes  packets  errors  dropped carrier collsns
    0          0        0       0       0       0     


After restart:
Code: Select all
3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT qlen 10
    link/can
    can <LOOPBACK> state ERROR-ACTIVE restart-ms 0
    bitrate 125000 sample-point 0.875
    tq 500 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
    mcp251x: tseg1 3..16 tseg2 2..8 sjw 1..4 brp 1..64 brp-inc 1
    clock 8000000
    re-started bus-errors arbit-lost error-warn error-pass bus-off
    1          0          0          1          1          1         
    RX: bytes  packets  errors  dropped overrun mcast   
    24         3        2       0       2       0     
    TX: bytes  packets  errors  dropped carrier collsns
    0          0        1       1       0       0     
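The three snapshots differ mainly in the CAN state and in the counter row under `re-started bus-errors ...`. A minimal Python sketch for pulling those two things out of the text, so that snapshots can be compared programmatically; `parse_can_link` is a hypothetical helper and `SAMPLE` is an excerpt of the "Afterward" output.

```python
import re

SAMPLE = """\
3: can0: <NO-CARRIER,NOARP,UP,ECHO> mtu 16 qdisc pfifo_fast state DOWN mode DEFAULT qlen 10
    link/can
    can <LOOPBACK> state BUS-OFF restart-ms 0
    re-started bus-errors arbit-lost error-warn error-pass bus-off
    0          0          0          1          1          1
"""

# CAN controller states as printed by ip; the link-level "state DOWN/UP/UNKNOWN"
# on the first line is deliberately not matched.
CAN_STATE = re.compile(
    r"state (ERROR-ACTIVE|ERROR-WARNING|ERROR-PASSIVE|BUS-OFF|STOPPED|SLEEPING)"
)


def parse_can_link(text: str) -> dict:
    """Extract the CAN state and the error counter row from ip output."""
    match = CAN_STATE.search(text)
    lines = text.splitlines()
    counters = {}
    for i, line in enumerate(lines):
        if line.strip().startswith("re-started") and i + 1 < len(lines):
            names = line.split()
            values = [int(v) for v in lines[i + 1].split()]
            counters = dict(zip(names, values))
    return {"state": match.group(1) if match else None, "counters": counters}


print(parse_can_link(SAMPLE))
```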


Thank you very much for the help in advance.

Best regards
Chris
Posts: 8
Joined: Fri Nov 23, 2012 11:27 am
by Zeta » Wed Jan 02, 2013 9:08 pm
muellie wrote:I wish all of you a happy new year. Zeta, thanks for the reply, got it.

Hello Chris,
Happy new year too !

muellie wrote: I am working in loopback mode, that means internal transmission between receive and transmit buffer and ACK Bit ignored.

I still have a lot to learn about the CAN bus and its driver. For example, how did you set it to loopback mode? Do you know whether this is implemented at the SocketCAN level or at the MCP2515 level?

muellie wrote: Is another CAN node necessary anyway? The setup seems to be fine; this is the only possible cause I have thought of so far. Also the general behavior is as expected.

So far, I have run almost all my tests with other nodes on the bus.
The first time I launched it, the board was alone, so I used only a termination resistor. I couldn't receive frames (there was no device to send them), so I tried to send some. I had a problem close to yours: after some frames (10?) sent with "cansend", it gave an error (something like "buffer full").
I did not try much more, and then got a second node, with which I ran my subsequent tests.

I will have access to my devices again next week, so if you describe how you switch your device to loopback mode, I will be able to run some tests to compare.

Zeta
by muellie » Thu Jan 03, 2013 9:49 pm
Hello,

Zeta wrote:I still have a lot to learn about the CAN bus and its driver. For example, how did you set it in Loopback mode ? Do you know if this is implemented at the SocketCAN level or if it's at the MCP2515 level ?

The controller is put into loopback mode through a register of the MCP2515, written by its driver (mcp251x).
Below you may find the respective parts of the driver:
Code: Select all
...
#  define CANCTRL_REQOP_LOOPBACK    0x40
...
if (priv->can.ctrlmode & CAN_CTRLMODE_LOOPBACK) {
      /* Put device into loopback mode */
      mcp251x_write_reg(spi, CANCTRL, CANCTRL_REQOP_LOOPBACK);
...

The corresponding part of the datasheet is on page 58; a screenshot is attached. Bit 6 (REQOP1) of CANCTRL needs to be set to one, which gives 0x40 for the complete register (all other bits are 0).
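A minimal Python sketch of how the three REQOP bits (bits 7 to 5 of CANCTRL) select the operating mode, to show why 0x40 means loopback. The mode table follows the MCP2515 datasheet; `requested_mode` is a hypothetical helper, not a driver function.

```python
# REQOP<2:0> occupies bits 7..5 of CANCTRL.
REQOP_MODES = {
    0b000: "normal",
    0b001: "sleep",
    0b010: "loopback",
    0b011: "listen-only",
    0b100: "configuration",
}


def requested_mode(canctrl: int) -> str:
    """Return the operating mode requested by a CANCTRL register value."""
    reqop = (canctrl >> 5) & 0x07
    return REQOP_MODES.get(reqop, "invalid")


print(requested_mode(0x40))  # the CANCTRL_REQOP_LOOPBACK value from the driver
print(requested_mode(0x00))
```

`requested_mode(0x40)` gives "loopback" and `requested_mode(0x00)` gives "normal". The same three bit positions in CANSTAT report the current mode, so the `CANSTAT 0x80` seen at probe time corresponds to configuration mode.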

Command to set loopback on:
Code: Select all
ip link set can0 type can loopback on


For further information type:
Code: Select all
ip link set can0 type can help


This info is included in 6.5.1 of can.txt: https://www.kernel.org/doc/Documentation/networking/can.txt

I hope to have the hardware for another node by the end of next week, to run the next test...

Best regards
Chris
Attachment: Loopbackmode.png
by Zeta » Fri Jan 04, 2013 12:21 am
muellie wrote:The controller is set to loopback mode through a register of the MCP2515 via its driver (mcp251x).
Below you may find the respective parts of the driver:
Code: Select all
...
#  define CANCTRL_REQOP_LOOPBACK    0x40
...
if (priv->can.ctrlmode & CAN_CTRLMODE_LOOPBACK) {
      /* Put device into loopback mode */
      mcp251x_write_reg(spi, CANCTRL, CANCTRL_REQOP_LOOPBACK);
...

OK, thanks for the pointers. I should be able to take a few minutes on Monday to try that and compare the output with yours (with/without other nodes).

It is already interesting to see that the mcp2515 driver lacks the loopback mode (whereas it is implemented in the mcp251x driver). The value written to the CANCTRL register is hard-coded to "0":
Code: Select all
/* Finally, enter normal operation mode. */
        err = mcp2515_write(spi, CANCTRL, 0);
There is also no way to set it to "listen only" mode or to "sleep" mode, both of which the mcp251x driver has (if I'm reading both drivers' code correctly).
by lookers » Fri Jan 04, 2013 1:15 pm
vlick wrote:Hi at all, I come back after quite a long period of absence, due to some family issues.

First of all, congratulations to all the guys for the very big steps I saw in interfacing the Can Controller MCP2515: I'll need some time to follow all the discussion and to be able to put in practise all the indications for having it running on the PI.

From my side, I finally received my Pi some weeks ago and started some activities: one of them was to interface a USB CAN controller to the Pi, just to learn how to build a cross-compilation toolchain for the Pi.
I chose to test some Kvaser USB CAN interfaces (which I bought in past years for other projects): they are high-performance (and high-price...) devices, but I decided to test them under Linux with the Pi.
In the other projects, I used them with Windows-based operating systems.
After some days of attempts, I was finally able to compile and load the driver and to see the devices running on the Pi: if someone is interested, I'll be glad to share my experience.
I'm aware that it is a very expensive way to interface a CAN bus with the Pi, but with such a configuration the CPU load is quite low thanks to the high throughput of HS USB bulk communication, leaving the CPU free for other tasks.
As I said before, I'll try to arrange everything necessary to test the MCP2515 in the next week, since I agree that this should be a good solution for the Pi.

vlick


Hi vlick,
If you could post details on your experience I would be interested.

Regards
lookers
Posts: 1
Joined: Fri Jan 04, 2013 1:06 pm
by Zeta » Mon Jan 07, 2013 5:27 pm
Hello Chris,

I made a quick try today, with the mcp251x driver (as the other one doesn't support Loopback), kernel 3.6.y, and the command you taught me:
Code: Select all
ip link set can0 type can loopback on

It seems OK, but as you can see below, the interface is "STOPPED":
Code: Select all
4: can0: <NOARP,ECHO> mtu 16 qdisc noop state DOWN mode DEFAULT qlen 10
    link/can
    can <LOOPBACK> state STOPPED restart-ms 0
    bitrate 0 sample-point 0.000
    tq 0 prop-seg 0 phase-seg1 0 phase-seg2 0 sjw 0
    mcp251x: tseg1 3..16 tseg2 2..8 sjw 1..4 brp 1..64 brp-inc 1
    clock 10000000
    re-started bus-errors arbit-lost error-warn error-pass bus-off
    0          0          0          0          0          0
    RX: bytes  packets  errors  dropped overrun mcast
    0          0        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    0          0        0       0       0       0

I tried to bring it up with the up command, as I usually do, but it fails:
Code: Select all
$ sudo ifconfig can0 up
SIOCSIFFLAGS: Invalid argument

So sending a frame with cansend reports that it is not possible:
Code: Select all
$ cansend can0 185#0001111000111100
write: Network is down

Looking at dmesg, I found a lot of error messages:
Code: Select all
mcp251x spi0.0: can0: bit-timing not yet defined
mcp251x spi0.0: unable to set initial baudrate!

If I set loopback mode together with a bitrate, it works:
Code: Select all
sudo ip link set can0 type can bitrate 125000 loopback on

I can then send a frame:
Code: Select all
$ cansend can0 185#0001111000111100

and receive it in another, simultaneous SSH session (it is shown twice, as there is one frame out and one in):
Code: Select all
$ candump any,0:0
  can0  185   [8]  00 01 11 10 00 11 11 00
  can0  185   [8]  00 01 11 10 00 11 11 00
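For anyone who wants to send the same test frame from a script instead of with cansend, here is a minimal Python sketch. It parses the `<id>#<hex data>` argument used above and packs it into the 16-byte frame layout that a raw SocketCAN socket expects. The helper names are hypothetical, this is not how cansend itself is implemented, and `send_frame` needs a Linux system with a configured CAN interface (it is defined but not called here).

```python
import socket
import struct

# struct can_frame: 32-bit CAN ID, 8-bit length, 3 padding bytes, 8 data bytes.
CAN_FRAME_FMT = "=IB3x8s"


def parse_cansend(arg: str):
    """Split a cansend-style '<id>#<hex data>' string into (can_id, data)."""
    id_part, data_part = arg.split("#", 1)
    return int(id_part, 16), bytes.fromhex(data_part)


def pack_frame(can_id: int, data: bytes) -> bytes:
    """Build the 16-byte frame for a classic CAN message."""
    if len(data) > 8:
        raise ValueError("classic CAN frames carry at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def send_frame(ifname: str, frame: bytes) -> None:
    """Send one packed frame on a raw CAN socket (Linux only)."""
    with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as sock:
        sock.bind((ifname,))
        sock.send(frame)


can_id, data = parse_cansend("185#0001111000111100")
frame = pack_frame(can_id, data)
print(hex(can_id), data.hex(" "), len(frame))
```

Calling `send_frame("can0", frame)` on the Pi would then put the frame on the interface; the 16-byte frame size matches the `mtu 16` shown by `ip`.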

I tried several times and it seems OK:
Code: Select all
4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 10
    link/can
    can <LOOPBACK> state ERROR-ACTIVE restart-ms 0
    bitrate 125000 sample-point 0.875
    tq 500 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
    mcp251x: tseg1 3..16 tseg2 2..8 sjw 1..4 brp 1..64 brp-inc 1
    clock 10000000
    re-started bus-errors arbit-lost error-warn error-pass bus-off
    0          0          0          0          0          0
    RX: bytes  packets  errors  dropped overrun mcast
    40         5        0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    40         5        0       0       0       0

It worked even without a termination resistor.

Hope this can help you.
by muellie » Fri Jan 11, 2013 5:16 pm
Hello,

Zeta, thank you very much for your efforts. Appreciate it! Sorry, of course you need to set the bitrate. :)

Today I had the possibility to test with another node. As I expected after reading Zeta's comment, it didn't help. However, it was quite useful:

I can receive without any problems, but sending is not possible. It just puts all the messages into the CAN netdevice queue until there is an overflow. That's it. I have read that the queue itself is part of the general network stack.
When I do 'ifconfig can0 down', one message is sent on the CAN bus, but 'ip -s link show can0' still shows zero. Checking with a scope on the SPI side, there is communication all the time, and the MCP2515 is answering. It seems the socket tries to send all the time (as expected).

Did anybody have this problem before? Receiving possible but not sending?

Regards
Chris
by muellie » Sun Jan 13, 2013 11:33 am
Hello,

did anyone patch the mcp251x driver?
http://www.mail-archive.com/socketcan-core@lists.berlios.de/msg01747.html

Regards,
Chris
by Zeta » Sun Jan 13, 2013 11:55 am
Hello Chris,


This mail is more than 2 years old. If I look at the source of the linux_3.6.y branch for the Raspberry Pi, it seems that everything in that patch is already there (I only checked the first 4 or 5 of the 8 patches). And there are not many differences from the current Raspbian Linux branch (3.2?).

http://www.mail-archive.com/socketcan-core@lists.berlios.de/msg01756.html

From this answer, you can see that David Miller has pulled it. If it is the same person as this one: http://en.wikipedia.org/wiki/David_S._Miller (one of the Linux networking stack maintainers!), then it has probably reached upstream.

Did you find anything in these patches that was not in your driver?
by muellie » Sun Jan 13, 2013 1:00 pm
Hey,

yes, you are right. By accident I copied the driver out of a trunk I had, and checked it together with some other stuff in my "troubleshooting folder".
Now I checked in the kernel: everything is included. I think I am going to set it aside for a couple of days; maybe it is just some minor thing...

Thanks
Chris
by Zeta » Sun Jan 13, 2013 1:14 pm
Chris,

I still intend to make a wiki page on eLinux.org describing the steps I used to make it work. I wanted to finish my quick project (a CAN to wireless UDP or TCP gateway) with it first, to be sure it really works. Maybe now is the right time to create the page, so that you can compare it with what you did and find the hiccup.

I can also send you my compiled and working files (kernel and drivers) if you want. Send me your e-mail address by private message if you want them.
by Zeta » Sun Jan 20, 2013 11:13 pm
Dear all,

I'm back with good news: I finally took some time to create a reference page on elinux.org for using the MCP2515 on the Raspberry Pi: http://elinux.org/RPi_CANBus

There are still some things to add (like the bash commands to load the needed modules or to check the status), but there is at least what I hope is a complete list of steps to configure and compile the kernel and modules.

I have probably forgotten to write down a step, so some things may be missing. I may also have made a mistake when writing it.
In any case, if you follow these instructions and find an error, please give me feedback (or correct it directly), so that the page is usable by all.

Thanks,

Zeta
by dplamp » Mon Jan 28, 2013 6:02 pm
Hi Zeta,

You did a great job in summarizing this long thread. I was able to apply your patch for the MCP2515. So far I could only load the modules: my PCB is still being manufactured.

However, I can't apply the SPI latency patch. I downloaded the one pointed to by your wiki page. I am using the latest rpi-3.6.y branch. The hunks that fail are far too long to apply by hand. But maybe I skipped something?

Here is what I get:
Code: Select all
$ patch --dry-run -p1 < spi-latency-branch3.6.y.patch
patching file arch/arm/mach-bcm2708/bcm2708.c
Hunk #1 succeeded at 590 with fuzz 1 (offset 6 lines).
patching file drivers/spi/spi-bcm2708.c
Hunk #7 FAILED at 247.
Hunk #8 FAILED at 334.
Hunk #9 succeeded at 434 (offset 6 lines).
Hunk #10 succeeded at 587 (offset 6 lines).
Hunk #11 succeeded at 627 (offset 6 lines).
Hunk #12 succeeded at 641 (offset 6 lines).
2 out of 12 hunks FAILED -- saving rejects to file drivers/spi/spi-bcm2708.c.rej


Thanks.
Posts: 10
Joined: Mon Jan 21, 2013 9:23 am
Location: France
by Zeta » Mon Jan 28, 2013 10:39 pm
Hello dplamp,

Thanks !

I made it using the following git revision (on branch 3.6.y) :
git log wrote:commit 523029f607564ab2080e83a3384feac4439b2b38
Merge: ff1c7e1 072e44f
Author: popcornmix <popcornmix@gmail.com>
Date: Tue Dec 4 23:06:01 2012 +0000

Merge commit 'v3.6.9' into rpi-3.6.y


Something may have been pushed to the repository since the beginning of December that changed the files modified by this patch.
I don't have much time today to look at it. I will give it a try tomorrow and see if I can spot where the difference is.

In the meantime, you can try to rewind to the version I cited above, to confirm that it is a change in the repository.

Zeta
by dplamp » Tue Jan 29, 2013 8:11 am
OK, it worked: your patch first, then the SPI latency one. You might want to write the SHA1 in your wiki page?
Will this SPI patch be merged into the kernel soon? It seems like a major fix.
by dplamp » Tue Jan 29, 2013 8:49 am
A commit dated 2013-01-22, SHA1 91a3be5b2b783b930b2d7cdbf38283b613bce7d4 fixes something in drivers/spi/spi-bcm2708.c, which the SPI latency patch wants to modify.
by samus8zero2x » Thu Jan 31, 2013 3:21 am
Hi,

Was just thinking of doing this exact same project myself today. It seems the Pi community is quite active. I was thinking of designing a small form-factor PCB that slides onto the GPIO header, leaving enough room for a ribbon connector above (the PCB would be sandwiched between the Pi and the ribbon connector). That way it could still be contained in most (if not all) Pi cases.

Someone mentioned a PCB being manufactured...maybe it's not worth the time. Let me know if interested. I'll probably hit the wiki and see how far you guys are.

Dave
Posts: 1
Joined: Thu Jan 31, 2013 3:08 am
by Zeta » Sat Feb 02, 2013 12:09 am
dplamp wrote:OK, it worked : your patch 1st, then the SPI latency one. You might want to write the SHA1 in your wiki page ?
Will this SPI patch be merged in the kernel soon ? It seems like it is a major fix.

Good to know I didn't miss something.

I did not have a lot of time to work on it this week. I can only confirm that there are some differences since the version I used, which prevent the patch from applying directly. I get the same error message as you.
I will try to get back to it this weekend and make a new patch for the new version.

samus8zero2x wrote:I was thinking of designing a small form-factor PCB that slides onto the GPIO header, leaving enough room for a ribbon connector above (the PCB would be sandwiched between the Pi and the ribbon connector). That way it could still be contained in most (if not all) Pi cases.

Someone mentioned a PCB being manufactured...maybe it's not worth the time. Let me know if interested. I'll probably hit the wiki and see how far you guys are.

Hello Dave,
I am making a small board myself, containing the CAN driver along with some other components (IO expanders and an RTC), and other people here have made boards too.
For now I am using a simple breadboard with wires everywhere on it, and it is enough for tests.

When I have finished (and verified that it works), I will open-source my design, but it is quite specific.
If you have a design that fits better in current cases, I am sure some people would be interested too.

Check this topic: viewtopic.php?f=41&t=2298&hilit=canbus
It has more discussion about board designs, with some schematics and printed boards.

Zeta
by Zeta » Sat Feb 02, 2013 1:55 pm
dplamp wrote:A commit dated 2013-01-22, SHA1 91a3be5b2b783b930b2d7cdbf38283b613bce7d4 fixes something in drivers/spi/spi-bcm2708.c, which the SPI latency patch wants to modify.

As this discussion is related to the SPI and not directly to the CAN, I continued the discussion in the corresponding forum, with a new patch:
viewtopic.php?f=44&t=19489&p=276372#p276372

I have also added a warning to the elinux wiki page stating that the patch fails with recent kernels, with a link to the forum thread above, while waiting for validation of the new patch.

Please give it a try to confirm it works; I don't have the hardware with me right now.

Thanks.
by maddin1234 » Thu Feb 14, 2013 9:07 pm
Hello,
I made a post in the SPI driver latency patch thread
viewtopic.php?f=44&t=19489&start=50
and it looks like we should continue the discussion here in this thread.

Here is a short summary, see other thread for details.
I have a setup with two mcp2515, one connected to SPI_CE0_N and GPIO17 for interrupt
and the other connected to SPI_CE1_N and GPIO18 for interrupt.
I use kernel 3.6, the mcp2515 patch and the spi latency patch.

My problem is that one CAN interface sometimes stops transmitting messages when
running this command:
Code: Select all
cangen can0 -g 2 -i -I 400 & cangen can1 -g 2 -i -I 401


Zeta answered that a possible reason for this is the one-shot interrupt mode (IRQF_ONESHOT), and it looks
like this is right.

When I look at the interrupts, I can see that the time between an interrupt for can0 and one for can1 jitters.
This is normal, because CAN messages with different DLCs are transmitted.
The error occurs when the interrupt line for can0 GOES low while the interrupt line for can1 IS low
(or the other way round).

Then the can0 interrupt line stays low and no more messages are sent on this interface
(until I connect the pin to 3.3 V for a short time; then it starts running again).

I also tried to set IRQF_TRIGGER_LOW, but had the same problem.

I think there are two things that should be discussed:

1.) What is the problem with the one-shot mode, especially regarding the interaction of two different interrupts?
I remember reading in the BCM manual that the GPIO block only has three interrupts: one for each bank and one more for both banks. There is one interrupt named:
Code: Select all
52:          0   ARMCTRL  BCM2708 GPIO catchall handler

I suspected earlier that this handler reacts to the GPIO interrupt line, reads the GPIO pins and raises some kind of "software interrupt".

2.) I am just sending messages with the Raspberry Pi, and the other CAN node just sends acknowledgements.
The interrupts I see must be the "transmission succeeded" interrupt from the MCP2515 (CANINTF.TXnIF).
I wonder whether it is necessary to enable this interrupt in the MCP2515. Perhaps it is faster to poll the TXBnCTRL.TXREQ bits and use the interrupts only for received messages and errors.

Greetings maddin
Posts: 68
Joined: Sat Aug 04, 2012 8:33 pm
by Zeta » Thu Feb 14, 2013 10:42 pm
maddin1234 wrote:I also tried to set IRQ_TRIGGER_LOW, but had the same problem.

Too bad... I have to dig into how this is used; it seemed a good fallback.

maddin1234 wrote:1.) What is the problem with the ONE_SHOT_MODE, especially because of the interaction of two different interrupts.
I remember that I read in the BCM manual, that the GPIO only has three interrupts. One for each bank and one more for both banks. There is one interrupt named:
Code: Select all
52:          0   ARMCTRL  BCM2708 GPIO catchall handler

I suspected before, that this might react to the GPIO interrupt line, reads out the GPIO pin and sets some kind of a "software-interrupt"

I read that somewhere too. So maybe the one-shot behaviour is applied to the catchall handler, which therefore cannot see another pin changing while the handler is executing?
If msperl reads this: he knows far more about the subject than I do, and I hope he can answer.

Do you know if we have access to both banks? If so, you could try putting one interrupt on a pin of the first bank and the other interrupt on a pin of the second bank. That way they would really be on two separate hardware interrupts.

maddin1234 wrote:2.) I am just sending messages with the raspi and the other can-node just sends acknowledges.
The interrupts I see must bee the "transmission succeeded" interrupt from the MCP2515 (CANINTF.TXnIF).
I wonder if it is necessary to activate this interrupt in the MCP2515. Perhaps it is faster to poll the TXBnCTRL.TXREQ bits and just use the interrupts for received messages and errors.

This may work for you, but it still isn't a definitive solution: the problem can still appear at high bus speed with short messages. If the driver doesn't have time to process an interrupt, the interrupt line will stay low and never restart...
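The "line stays low and never restarts" scenario can be illustrated with a toy Python model. This is only a sketch of the general failure mode with a falling-edge-triggered handler, not a diagnosis of this particular setup: `simulate` is a hypothetical function, the INT line is assumed to be low whenever any flag is pending, and the handler is assumed to run only on a high-to-low transition.

```python
def simulate(events, drain_until_clear):
    """Count the events serviced by a falling-edge-triggered handler.

    events is a list of (flag, late_flag) pairs: `flag` is raised by the
    controller, `late_flag` (0 for none) is raised while the handler runs.
    """
    flags = 0      # pending interrupt flags; the INT line is low while != 0
    serviced = 0
    for flag, late_flag in events:
        line_was_low = flags != 0
        flags |= flag
        if line_was_low:
            # No high-to-low transition, so the handler is never called.
            continue
        snapshot = flags        # handler reads the flag register once
        flags |= late_flag      # a second event arrives mid-handler
        flags &= ~snapshot      # handler clears only what it saw
        serviced += bin(snapshot).count("1")
        # Optional fix: keep re-reading until nothing is pending, so the
        # line is guaranteed to go high again before the handler returns.
        while drain_until_clear and flags:
            snapshot = flags
            flags &= ~snapshot
            serviced += bin(snapshot).count("1")
    return serviced


events = [(0x01, 0x02), (0x01, 0), (0x01, 0)]
print(simulate(events, drain_until_clear=False))
print(simulate(events, drain_until_clear=True))
```

In this model the handler that services a single snapshot handles one event and then stalls, because the late flag keeps the line low. The variant that drains until no flag is pending handles all four.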

Have you tried with kernel 3.2, where the one-shot flag was not mandatory, just to check that this flag is really the point at which the system stopped working? If so, we can start to dig into what these flags (along with IRQF_TRIGGER_LOW) actually do, and what the differences between 3.2 and 3.6 are in this regard.

I would normally try this myself, but I currently lack the time to continue my work on this. I should get back to it in a few weeks...
by maddin1234 » Fri Feb 15, 2013 2:10 pm
Hello,
I will try to test with the 3.2 kernel next week.

I mentioned my idea about the GPIO catchall handler earlier in this thread,
so perhaps you read it from me.
To work with two real interrupts, maybe we need three catchall handlers, one for
each GPIO interrupt line.

Greetings maddin
by maddin1234 » Wed Feb 20, 2013 9:31 pm
Hello,
I tested it today with kernel 3.2.27 and had the same problem.

There is very little information about the interrupts in the manual.

The GPIO peripheral has three dedicated interrupt lines. These lines are triggered by the
setting of bits in the event detect status register. Each bank has its’ own interrupt line with the
third line shared between all bits.

The manual does not say how big these banks are. If I assume
that they have the same size, then GPIO 0 - 26 are one bank
and GPIO 27 - 53 are the second bank.
This would mean we only have bank one on the P1 header,
but we have pins from bank two on the P5 header.

Another piece of information I found in the manual is the numbering of the interrupts.
(I don't know why there are 4 interrupts here when they talk about "three dedicated interrupt lines".)
49 gpio_int[0]
50 gpio_int[1]
51 gpio_int[2]
52 gpio_int[3]

Interrupt 52 is the one the gpio_catchall_handler is connected to.

Greetings maddin1234
by Zeta » Wed Feb 20, 2013 10:57 pm
maddin1234 wrote:Hello,
I tested it today with kernel 3.2.27 and had the same problem.

Hello maddin,
I still have no time to look at it, at least until the end of the month...
Did you remove the IRQF_ONESHOT flag when doing the test with the 3.2.27 kernel?