Hi,
Whilst attempting to stream video (using GStreamer) from one leopard board to another leopard board, I have observed that a leopard board's network connectivity is disrupted when it receives an ARP broadcast. I can stream from a leopard to a PC without any problems.
I have read a couple of posts about apparently random network errors occurring on leopard boards. Could this erroneous behaviour be explained by the boards' response to ARP requests? I can demonstrate the behaviour by putting two boards on a network and proving network connectivity by pinging each board from the other; running arp-scan on the host Linux machine a few times will then cause the leopard boards to have problems pinging each other. It is also rare to see the leopard boards respond consistently to arp-scan.
Any ideas guys?
Richard
I tried to reproduce this, but didn't have any luck.
My setup was a leopard board running 2.6.29 from rrsdk with an NFS-mounted filesystem, and a host running Ubuntu 9.04, both connected to each other through a 100 Mbit Ethernet switch.
I ran the following command on the host:
arp-scan --interface=eth1 --localnet
But everything seemed to stay up (NFS and pings). I wasn't actually doing any streaming at the time.
Then to up the ante, I tried nmap (where 192.168.1.10 is the ip of the leopard):
nmap -v -A 192.168.1.10
It gave the expected output, and both NFS and pings to/from the device still seemed to be OK.
Any other suggestions on how I might reproduce your issue?
Thanks,
chris
If the two boards have the same MAC address this might cause the lost connectivity.
Even though I wasn't able to get the ethernet to crash with arp-scan or nmap, there is a real problem with the ethernet going deaf.
For instance, I was streaming and I noticed the streaming stopped. I tried to ping the leopard and it didn't respond. I tried to ping from the leopard, and I see it arp for the destination, and I see the arp response come back, but the leopard never sends out the icmp packet. The tx/rx packet counts from "ifconfig" do increment though, if that helps anyone isolate the issue.
-chris
Got the leopard into the "deaf ethernet" mode again, and I tried a few other things. It's interesting to note that it can't ping itself using either the ethernet's IP address or 127.0.0.1. Wouldn't not being able to ping 127.0.0.1 indicate a stack problem, and not an ethernet driver issue?
Please let me add another observation about a TCP connectivity problem.
I connected a 3G modem to the leopardboard and downloaded several files successfully from an Ubuntu machine.
However, when I try to download from Windows, the download stops after several Mbytes.
ppp0 Link encap:Point-to-Point Protocol
inet addr:188.58.200.43 P-t-P:10.64.64.64 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:7549 errors:0 dropped:0 overruns:0 frame:0
TX packets:13841 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:3
RX bytes:418695 (408.8 KiB) TX bytes:19555452 (18.6 MiB)
When you have problems with the 3g modem, can you still ping out? Do the RX/TX packet counts increase?
I'm able to break the network relatively easily now.
I set the leopard to boot with fs in flash, and then flood ping it from my ubuntu host.
Within a few seconds, it will quit responding to pings, and pretty much go deaf to the network.
Any entries in the leopard's arp table usually change from populated to "incomplete" as well, which may indicate something.
Yes, I can ping 127.0.0.1 and 10.64.64.64
ppp0 Link encap:Point-to-Point Protocol
inet addr:188.56.237.178 P-t-P:10.64.64.64 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:3257 errors:0 dropped:0 overruns:0 frame:0
TX packets:5856 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:3
RX bytes:155457 (151.8 KiB) TX bytes:8308361 (7.9 MiB)
Here is an interesting thread from last year about dm9000/DM355/NFS issues. It looks like it was never fully understood.
linux.omap.com/.../010987.html
The patch that it references does seem to be part of my ridgerun sdk.
I believe I tracked down the ethernet problem which was causing the ethernet to quit receiving and nfs to fail.
If some other people would like to try it and see if it fixes it for them as well, that would be great.
I think it turned out to be a problem with the system thinking the NAPI routine was already running when it wasn't. If you edit /net/core/dev.c and replace the current process_backlog() with the one below (from the davinci git kernel), that should clear it up.
Regards,
static int process_backlog(struct napi_struct *napi, int quota)
{
	int work = 0;
	struct softnet_data *queue = &__get_cpu_var(softnet_data);
	unsigned long start_time = jiffies;

	napi->weight = weight_p;
	do {
		struct sk_buff *skb;

		/* Dequeue with interrupts off so the queue is not modified under us. */
		local_irq_disable();
		skb = __skb_dequeue(&queue->input_pkt_queue);
		if (!skb) {
			/* Queue is empty: mark NAPI as no longer scheduled. */
			__napi_complete(napi);
			local_irq_enable();
			break;
		}
		/* Re-enable interrupts before handing the packet up the stack. */
		local_irq_enable();

		netif_receive_skb(skb);
	} while (++work < quota && jiffies == start_time);

	return work;
}
I tried these changes, and the network issue seems to be solved.
Cristina Murillo
Embedded Software Engineer, RidgeRun