Hughesnet Community

Why does this work for latency?

lighthope1
Senior


On EchoStar 19, beam 5.

 

Had impossible latency spikes. Wildly swinging between 700ms and 44000ms. Latency swings are by the second.

 

Unable to determine source of latency. Could be HughesNet or could be CenturyLink. There is one hop that returns No Response From Host, but every once in a while will return an address.  I can't remember what it is, but it is a qwest domain.  When it does return, the ping time is something like 4000ms or so.

 

What is strange is that, if I run a program called WinMTR and put into it the destination IP, the latency disappears for the most part.  It still will rear its ugly head, but only on occasion.  Without running that, latency is almost always there.

 

For those who do not know, WinMTR will continuously run a traceroute or some such every second, testing the ping from each hop.

 

If I run it with an interval > 1 second, the latency returns.
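For anyone who wants to repeat the interval experiment without WinMTR, here is a rough sketch of a fixed-interval probe loop. It is not what WinMTR does internally (WinMTR sends ICMP probes to every hop, which needs raw sockets); this version only times a TCP connection to one destination. The function names, the default timeout, and the idea of passing the probe in as a callback are all my own assumptions for illustration.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=5.0):
    """Time one TCP connection attempt in milliseconds; return None if it fails."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return None
    return (time.perf_counter() - start) * 1000.0


def probe_loop(probe, count, interval_s=1.0, sleep=time.sleep):
    """Call probe() `count` times, pausing interval_s between calls.

    `sleep` is injectable so the loop can be exercised without waiting.
    """
    results = []
    for i in range(count):
        results.append(probe())
        if i < count - 1:
            sleep(interval_s)
    return results
```

Something like `probe_loop(lambda: tcp_connect_ms(host, port), 60, interval_s=1.0)` could then be run once with a 1 second interval and once with a longer one, with `host` and `port` replaced by the destination being tested.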

 

Very strange, and I can't figure out why this works.

 

It is imperfect.  Latency still crawls in there, but at least it is something.

23 REPLIES
Michael57
Senior

It can be hard to say.  The most likely scenario is usually the simplest, so it is most likely where your trace is showing it.  The thing is, TCP/IP is a self-clocking protocol, and there's a bit of handshaking and acknowledging that goes on, so when you are spamming with WinMTR you can be generating what appears to be "fast" traffic and lower latency, which is likely why you get different results when you slow the intervals down.  Note: I don't know too much about WinMTR, so I'm just guessing that's what it is doing based on your post.

 

We do a lot of work with throttle proxies (we wrote a custom one, but Charles is sufficient for most purposes) that basically allow you to limit your bandwidth or add latency so you can simulate various scenarios.  It's super helpful when you want to see the way various traffic or URLs are being routed and prioritized, what the bandwidth limits are, and how latency manifests itself on your traffic.  Latency is notoriously weird.

 

I bet if you add latency on your end, you'll make the upstream latency more obvious (which is sort of what you are seeing when you slow your intervals).


@Michael57 wrote:

It can be hard to say.  The most likely scenario is usually the simplest, so it is most likely where your trace is showing it.  The thing is, TCP/IP is a self-clocking protocol, and there's a bit of handshaking and acknowledging that goes on, so when you are spamming with WinMTR you can be generating what appears to be "fast" traffic and lower latency, which is likely why you get different results when you slow the intervals down.  Note: I don't know too much about WinMTR, so I'm just guessing that's what it is doing based on your post.


Ah, I left out an important detail.

 

I am running a second program while I am running WinMTR.  It is with the SECOND program that I am seeing the latency improvement.

 

For example:

 

I run a game called World of Warcraft.

 

Without WinMTR running, I get latency from 2000 to 5000ms.

 

With WinMTR running, I get latency from 900 to 1300ms. (With occasional spikes)

 

Odd, huh?

Oh wow (pun intended), I played World of Warcraft when it first came out through like the 3rd expansion.  That actually makes it a bit more complex, because we don't know if both sets of traffic are following the same prioritization.  If deep packet inspection is happening, they likely are not prioritized the same, but if it's just based on the destination then assuming WinMTR is hitting the same destination, it would be probably the same.

 

Still, I don't really have an explanation for you, other than guesses like the MTR traffic is keeping you cached or hot in some way, though that's not typical behavior...Can you see the difference in gameplay?  I ask that because it could also be false reporting.

 

 


@Michael57 wrote:

Can you see the difference in gameplay?  I ask that because it could also be false reporting.

 


Yes.  There is clearly a difference.  Actions don't take as much time.  Plus there is a monitor in game so you can see your latency.  Latency is absolutely higher when WinMTR is not running.

It's possible this all points to an erratic MTU issue somewhere.


* Disclaimer: I am a HughesNet customer and not a HughesNet employee. All of my comments are my own and do not necessarily represent HughesNet in any way.


@MarkJFine wrote:

It's possible this all points to an erratic MTU issue somewhere.


I don't know what that means.

 

MTU is Maximum Transmission Unit, basically the largest packet size a receiving side will accept. It's possible that CL is dynamically (and incorrectly) dropping theirs depending upon loading patterns. Then they're either rejecting or just flat out dropping whole packets (likely, from what I've seen) waiting for a timeout/resend.
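If someone wants to check the MTU theory, the usual approach is to send pings with the Don't Fragment flag set (on Windows, `ping -f -l <size> <host>`) and find the largest payload that still gets through. A rough sketch of the arithmetic and the search follows. The `passes` callback is hypothetical: it stands in for "run the ping at this size and report whether a reply came back", and the search assumes that if one size gets through, every smaller size does too. That assumption would not hold on a path whose MTU really is changing erratically, so treat a single run as a snapshot only.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def largest_passing_payload(passes, low=0, high=payload_for_mtu(1500)):
    """Binary search for the largest size where passes(size) is True.

    Returns None if no size in [low, high] passes.
    """
    best = None
    while low <= high:
        mid = (low + high) // 2
        if passes(mid):
            best = mid        # mid works, so try something larger
            low = mid + 1
        else:
            high = mid - 1    # mid fails, so try something smaller
    return best
```

Adding 28 back to the result gives the effective path MTU at the moment of the test. For standard Ethernet the starting point is a 1500 byte MTU, which is a 1472 byte payload.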


* Disclaimer: I am a HughesNet customer and not a HughesNet employee. All of my comments are my own and do not necessarily represent HughesNet in any way.


@MarkJFine wrote:

MTU is Maximum Transmission Unit, basically the largest packet size a receiving side will accept. It's possible that CL is dynamically (and incorrectly) dropping theirs depending upon loading patterns. Then they're either rejecting or just flat out dropping whole packets (likely, from what I've seen) waiting for a timeout/resend.


Some node waiting for a timeout was my guess, too.  Just couldn't figure out why.

 

On a terrestrial connection, it would happen so fast it would probably be unnoticeable. But if it is timing out three or four times on satellite, that is monstrous.

 

Wish there was something that could be done about it.

To me it's weird that when you run WinMTR you don't get routed through that node.

 

If it is MTU you can try lowering it or turning off Jumbo Frames if you have it on.  I'm not sure if the HN router lets you adjust MTU size or not, but your ethernet card probably does.


@Michael57 wrote:

To me it's weird that when you run WinMTR you don't get routed through that node.

 

If it is MTU you can try lowering it or turning off Jumbo Frames if you have it on.  I'm not sure if the HN router lets you adjust MTU size or not, but your ethernet card probably does.


I do get routed through that node.  It just doesn't experience the slowdown when I use MTR.  I have absolutely no explanation for that.

 

I have Jumbo Frames disabled on my Ethernet card.

Oh sorry, I misunderstood then.  Well, it's a long shot but you can always try turning on Jumbo Frames.  Jumbo Frames will probably not work with your existing network, so if you do try it, make sure you are ok with your network connection dropping for a minute until you can flip it back.

Yeah, I remember that there was an in game monitor, but I was hoping maybe it was being fooled by something, but if you can see the effects in the game, then it's likely legit. 

 

If you do nslookup to the same WoW servers every couple of seconds versus spamming it via a script, do you get different IPs?  I'm reaching here, but maybe this could highlight if caching or some sort of sticky load balancing was happening.  The challenge is you'd think that WoW traffic would be enough for you to benefit from any of those behaviors; it shouldn't take that additional trace traffic to push it over some threshold.


@Michael57 wrote:

Yeah, I remember that there was an in game monitor, but I was hoping maybe it was being fooled by something, but if you can see the effects in the game, then it's likely legit. 

 

If you do nslookup to the same WoW servers every couple of seconds versus spamming it via a script, do you get different IPs?  I'm reaching here, but maybe this could highlight if caching or some sort of sticky load balancing was happening.  The challenge is you'd think that WoW traffic would be enough for you to benefit from any of those behaviors; it shouldn't take that additional trace traffic to push it over some threshold.


I don't know what nslookup does.

 

I haven't tested this with any other game yet.  I plan to see if it has any effect on Guild Wars 2, since that has an in-game latency monitor as well.  It will be interesting to see the results.

It seems to work in Guild Wars 2 as well, but I've only been able to test it once.

 

It's a bit harder to test in there.  I'll keep trying over a couple days and see what I come up with.


@lighthope1 wrote:


I don't know what nslookup does.

 


No matter, it was a brain fart, it's not going to help you from the public side.  Basically the NS stands for name server, so you can look up the IP address associated with a domain name, like "nslookup google.com" will give you the IP address for google.com.  You can get more advanced info and traverse different types of domain records and CNames, which can allow you to see the IP addresses assigned to specific resources...but you have to be on the private side of that network to do it.

 

Where it can be helpful is that very few resources like google.com are actually a single server. They are often clusters of servers sitting behind load balancers, so you can use nslookup to see the different IPs being used and know that different requests will get routed to different servers based on rules that can change very fast.  So I was trying to determine if spamming the domain address (with your WinMTR running) kept you resolving to the same IP address, while hitting it more slowly would occasionally send you to a different IP address, which would introduce latency.  
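A rough sketch of that repeated-lookup idea in Python, in case it helps anyone reading later. It uses the system resolver through `socket.getaddrinfo` rather than calling nslookup itself, and the function names and the injectable `resolver` argument are just assumptions for illustration. Keep in mind that the operating system or the modem may cache DNS answers, so repeated lookups can return the same addresses even when the server side rotates them.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def distinct_answers(hostname, rounds, resolver=resolve_ipv4):
    """Resolve hostname `rounds` times and return every distinct address seen."""
    seen = set()
    for _ in range(rounds):
        seen.update(resolver(hostname))
    return sorted(seen)
```

If `distinct_answers` comes back with more than one address for the game server's hostname, that would at least show that several servers sit behind the name. It would not by itself show that switching between them is what causes the latency.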

 

Basically, my hunch is that you are encountering some sort of race condition (an issue that depends on the timing of potentially unrelated things) when you see the higher latency, and it seems you hit it more often when your traffic has bigger intervals, but where and why, I don't have a good answer for you, sorry.

The offending hop seems to be at 63-235-42-186.dia.static.qwest.net.  That is the one that -- when it does return something -- is always 2K - 5K latency.

GabeU
Distinguished Professor IV


@lighthope1 wrote:

The offending hop seems to be at 63-235-42-186.dia.static.qwest.net.  That is the one that -- when it does return something -- is always 2K - 5K latency.


With their track record as of late, it's not surprising to see that the problem may lie with something connected to CenturyLink.  😞 


@GabeU wrote:

@lighthope1 wrote:

The offending hop seems to be at 63-235-42-186.dia.static.qwest.net.  That is the one that -- when it does return something -- is always 2K - 5K latency.


With their track record as of late, it's not surprising to see that the problem may lie with something connected to CenturyLink.  😞 


Yeah.  When I saw that, I knew it was never going to be fixed.  Can't even route around because the hop before it is another CenturyLink owned node.


 

And is that server only in the trace when you have high latency?


@Michael57 wrote:
And is that server only in the trace when you have high latency?

Yes.

 

I did a full day of testing yesterday.  No issues with latency as long as I ran WinMTR.

 

Once I stopped running it, the latency started to come back.