User Journal

squiggleslash's Journal: Another weird-ass networking issue: Huge Ethernet packets

Journal by squiggleslash

Ok, excerpt from a tcpdump. The network works like this:

Erythrocyte<->XEN bridge<->Endothelial<->PPPoE<->Teh Intertubes<->m4e5e36d0.tmodns.net

11:36:20.625832 IP (tos 0x0, ttl 64, id 21777, offset 0, flags [DF], proto TCP (6), length 2880) erythrocyte.squiggleslash.link.www > m4e5e36d0.tmodns.net.9971: . 1:2841(2840) ack 529 win 211
11:36:20.626425 IP (tos 0xc0, ttl 64, id 41691, offset 0, flags [none], proto ICMP (1), length 576) endothelial > erythrocyte.squiggleslash.link: ICMP m4e5e36d0.tmodns.net unreachable - need to frag (mtu 1492), length 556
IP (tos 0x0, ttl 64, id 21777, offset 0, flags [DF], proto TCP (6), length 2880) erythrocyte.squiggleslash.link.www > m4e5e36d0.tmodns.net.9971: . 1:2841(2840) ack 529 win 211[|icmp]
11:36:23.623281 IP (tos 0x0, ttl 64, id 21779, offset 0, flags [DF], proto TCP (6), length 1460) erythrocyte.squiggleslash.link.www > m4e5e36d0.tmodns.net.9971: . 1:1421(1420) ack 529 win 211
11:36:23.731117 IP (tos 0x0, ttl 185, id 48684, offset 0, flags [none], proto TCP (6), length 40) m4e5e36d0.tmodns.net.9971 > erythrocyte.squiggleslash.link.www: ., cksum 0xcae5 (correct), 529:529(0) ack 1421 win 2190

The tcpdump is on Erythrocyte's pseudo-Ethernet device, though for what it's worth I get the same stuff on Endothelial's.

Now, some background:

  1. Erythrocyte and Endothelial are both virtualized servers running on the same physical hardware using Xen. squiggleslash.link is my internal DNS domain for IPv4 NAT addresses (10.x). Just saying. This doesn't involve IPv6 or anything else like that.
  2. The MTU on Erythrocyte's eth0 is 1500. For shits and giggles I did reduce it to 1460, but nothing changed.
  3. The MSS on Erythrocyte's default gateway setup was unset. Again, for shits and giggles, I changed it to 1460. Nothing changed.
  4. Erythrocyte seems, for the most part, to be having problems with Endothelial's requests to send smaller packets. Essentially, I'm tracing all of this because if I try to hit my website from an external address, such as Chloe (my N800) over T-Mobile's Internet connection, it takes ages to load pages, with many objects just not being transferred.
  5. This doesn't appear to be related to PPPoE. The PPPoE driver is correctly notifying Erythrocyte that all packets to Chloe need to have a smaller MSS. And, in any case, I'd expect Endothelial to have a fit about 2880-byte packets regardless of whether the external connection is PPPoE or regular Ethernet.

I may be misreading the log, but it looks like Erythrocyte is somehow able to send a giant 2880-byte packet to Endothelial, despite hardware (or virtualized hardware) that makes that impossible, and OS settings telling it specifically not to do that. Am I misreading it?
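For anyone wanting to check a longer capture for the same symptom, here is a minimal sketch that scans verbose tcpdump text for IP packets larger than the interface MTU. The function name `oversized_packets` and the regular expression are illustrative assumptions, written against the output format shown in the excerpt above, not a general tcpdump parser.

```python
import re

# Matches the "length N)" field inside tcpdump's verbose IP header summary.
IP_LENGTH = re.compile(r"proto \w+ \(\d+\), length (\d+)\)")

def oversized_packets(lines, mtu=1500):
    """Return (timestamp, ip_length) for each captured IP packet larger than mtu."""
    hits = []
    for line in lines:
        match = IP_LENGTH.search(line)
        # Only lines that start with a timestamp are packets in their own right;
        # the quoted header inside an ICMP error starts with "IP" and is skipped.
        if match and line[:2].isdigit():
            length = int(match.group(1))
            if length > mtu:
                hits.append((line.split()[0], length))
    return hits

# Two of the lines from the excerpt above.
capture = [
    "11:36:20.625832 IP (tos 0x0, ttl 64, id 21777, offset 0, flags [DF], proto TCP (6), length 2880) erythrocyte.squiggleslash.link.www > m4e5e36d0.tmodns.net.9971: . 1:2841(2840) ack 529 win 211",
    "11:36:23.623281 IP (tos 0x0, ttl 64, id 21779, offset 0, flags [DF], proto TCP (6), length 1460) erythrocyte.squiggleslash.link.www > m4e5e36d0.tmodns.net.9971: . 1:1421(1420) ack 529 win 211",
]
print(oversized_packets(capture))
```

On these two lines it reports only the 2880-byte packet, which is the one Endothelial rejected with "need to frag".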

This discussion was created by squiggleslash (241428) for no Foes, but now has been archived. No new comments can be posted.

  • Bork bork bork bork [launchpad.net]

    See if that's relevant first.

    • Although I'm pretty sure that if this is your issue, you need to turn off segmentation offload, not tx checksumming:

      ethtool -K eth0 tso off

      • Looks like that fixed it (I didn't try the tx one) - thanks!
        • Glad to be of help.

          The problem is that Xen's virtual NIC drivers are very generic so they work with a wide array of hardware. As such, they don't support segmentation offloading at all (though TCP Segmentation Offloading, TSO, is generally the only one that causes a problem given that other protocols' packets are rarely large). When Dom0 is talking to DomU and vice versa it's no problem, since they can pass large frames to each other, but if DomU is talking to the outside world AND claiming TSO support, the

          • Thanks for the explanation, it's starting to make sense now.

            Also, after Googling for that command, I found this [launchpad.net].

            Thanks again.
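The numbers in the capture appear to fit the TSO explanation: the 2880-byte packet carries a 2840-byte payload, which is exactly two payloads of 1420 bytes, the size of the packet that was eventually retransmitted and acknowledged. A rough sketch of that arithmetic follows; `segment_sizes` is a hypothetical helper, and the 40-byte header figure assumes plain IPv4 and TCP headers without options, which matches 2880 - 2840 in the trace.

```python
IP_TCP_HEADERS = 40  # 20-byte IPv4 header + 20-byte TCP header, no options (assumed)

def segment_sizes(payload_len, mss):
    """Split one oversized TCP payload into the per-packet payload sizes
    that segmentation (in hardware or software) would produce."""
    full, rest = divmod(payload_len, mss)
    return [mss] * full + ([rest] if rest else [])

payload = 2880 - IP_TCP_HEADERS           # 2840 bytes, as shown in the capture
segments = segment_sizes(payload, 1420)   # 1420 = payload of the retransmitted packet
print(segments)                                  # [1420, 1420]
print([s + IP_TCP_HEADERS for s in segments])    # [1460, 1460]
```

In other words, the stack seems to have handed the virtual NIC one double-sized segment and expected something downstream to cut it into two 1460-byte packets, which never happened while TSO was advertised.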
