Networking-Forums.com

Professional Discussions => Routing and Switching => Topic started by: Dieselboy on June 29, 2016, 11:25:59 PM

Title: IOS SLA monitoring
Post by: Dieselboy on June 29, 2016, 11:25:59 PM
At silly-o'clock in the morning I see that my remote device loses OSPF relationships across the VPN tunnels. It looks like the internet may have gone down at the remote site: from the main site, the nagios and cacti monitoring shows the site as unreachable, along with the upstream ISP router (our gateway). However, since there's no in-house monitoring at the remote site, all this tells me is that the remote site was inaccessible from the main site. There could be a range of reasons for the loss of connectivity; it may not be the internet at the remote site at all, but a transit issue.

I've been meaning to set up a basic IP SLA monitor from the IOS firewall at the remote site, so I can see when things go down and to give a bit more insight.

I found this Cisco doc and will give this a try and see how useful this is on the next occurrence.

https://supportforums.cisco.com/document/11935681/simple-eem-script-alert-ip-sla-failures
Title: Re: IOS SLA monitoring
Post by: deanwebb on June 30, 2016, 06:39:48 AM
I've seen solutions that use the cellular network to send back info on the other side. Really sweet stuff, as they also allow for remote CLI access via mobile.
Title: Re: IOS SLA monitoring
Post by: icecream-guy on June 30, 2016, 07:14:46 AM
Not sure if you are intending to monitor the remote site from local, or run the IP SLA at the remote site.
Just make sure the SLA router has access to a mail relay and a way to send the email alerts offsite. Not getting alerts, or only getting alerts after the site comes back online, is quite useless.
Title: Re: IOS SLA monitoring
Post by: Dieselboy on June 30, 2016, 10:07:15 PM
I tried to find an internet SMTP relay but gave up. I found a website that claimed to list all open internet relays, but I couldn't connect on TCP/25 to the handful of random ones I chose. Or, if I could connect, the relay would not accept mail sent through it.

As the site only has the one internet link anyway, of course it cannot send an email if the internet is down. So I've also got a syslog message written to the local device.

Monitoring is via cacti / nagios at the main site.
IP SLA runs at the remote site to, at the very least, write a syslog message when the IP SLA goes down and again when it comes back. So my monitoring will pick up that the site has gone offline, and I'll just check through the syslogs to see what actually happened once the site returns.
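
For the "check through the syslogs afterwards" step, here is a rough Python sketch (not IOS config) of turning a run of up/down results into outage windows that can be lined up against what cacti / nagios saw from the main site. The `(timestamp, reachable)` tuple format and the `outage_windows` name are assumptions for illustration only; they are not what IOS writes to syslog, so the parsing step is left out.

```python
from datetime import datetime


def outage_windows(samples):
    """Collapse time-ordered (timestamp, reachable) samples into outage windows.

    Returns a list of (down_at, up_at) pairs. up_at is None when the last
    sample is still down, i.e. the outage had not ended when logging stopped.
    """
    windows = []
    down_at = None
    for ts, reachable in samples:
        if not reachable and down_at is None:
            # First failed probe after a healthy period: outage starts here.
            down_at = ts
        elif reachable and down_at is not None:
            # First good probe after an outage: close the window.
            windows.append((down_at, ts))
            down_at = None
    if down_at is not None:
        windows.append((down_at, None))
    return windows


if __name__ == "__main__":
    # Made-up sample data, purely to show the shape of the output.
    demo = [
        (datetime(2016, 6, 29, 3, 0), True),
        (datetime(2016, 6, 29, 3, 1), False),
        (datetime(2016, 6, 29, 3, 2), False),
        (datetime(2016, 6, 29, 3, 9), True),
    ]
    for down, up in outage_windows(demo):
        print("down:", down, "up:", up)
```

If the windows seen locally match the windows seen from the main site, the problem is more likely at the remote site's own link; if the remote device logged nothing, the fault is more likely somewhere in transit.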

Because there's no monitoring at the site itself I couldn't determine if the site was losing the internet connection from the ISP, or whether there had been an ISP issue at the main office end (with regards to routing to the remote site) or if there was an issue on the internet in-path. This was just to give some further insight to narrow the problem down. :)
Title: Re: IOS SLA monitoring
Post by: wintermute000 on June 30, 2016, 11:20:47 PM
You could just set up a $5 a month VPS in DigitalOcean (for example) in the Singapore region and run the internet monitoring from there to rule out Australia to Sri Lanka transit.

Pssst, if you do use DigitalOcean, pls PM me for a referral code!
Title: Re: IOS SLA monitoring
Post by: Dieselboy on July 01, 2016, 12:09:05 AM
One of our AWS sites is located in Singapore. I had an issue earlier this week where average response time from the Sri Lanka office to AWS in Singapore was 3000ms! I used Dean's tcpping.exe tool and found that max latency to the HTTPS port was in fact 6500ms! I raised a case with the ISP and then routed traffic to AWS through our office in Australia across the VPN. Latency that way was just over 500ms, which meant it was ridiculously slow but it was at least loading.
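
The tcpping.exe tool itself isn't shown in this thread. As a minimal sketch of the same idea (timing how long a TCP connect to the HTTPS port takes), something like the Python below would do. The function names `tcp_connect_ms` and `summarize` are made up for this example, and the host in the usage line is a placeholder.

```python
import socket
import time


def tcp_connect_ms(host, port=443, timeout=10.0):
    """Time a single TCP connect in milliseconds; return None if it fails."""
    start = time.perf_counter()
    try:
        # create_connection returns once the TCP handshake has completed.
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return None


def summarize(samples):
    """Return min/avg/max over successful samples plus a count of failures."""
    ok = [s for s in samples if s is not None]
    if not ok:
        return None
    return {
        "min": min(ok),
        "avg": sum(ok) / len(ok),
        "max": max(ok),
        "lost": len(samples) - len(ok),
    }


if __name__ == "__main__":
    # "example.com" is a placeholder; substitute the endpoint being tested.
    results = [tcp_connect_ms("example.com", 443) for _ in range(5)]
    print(summarize(results))
```

Looking at max as well as avg matters here: an average of 3000ms can hide individual connects that take more than twice as long.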

I'm still waiting for the ISP to let me know what the issue was and how it was resolved, but it's now back to between 60ms and 70ms. The traceroute showed that all of the latency was added between 2 hops at Telstra, e.g. at hop #7 latency was 100ms but at hop #8 it was over 3000ms.
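
Spotting where the latency gets added can be done by eye, but as a small sketch, the Python below picks out the biggest increase between consecutive answering hops. The `largest_jump` name and the list-of-tuples input are assumptions for illustration; only the hop #7 and hop #8 figures come from the traceroute described above, and the other hops in the example are invented.

```python
def largest_jump(hop_rtts_ms):
    """Find the biggest RTT increase between consecutive answering hops.

    hop_rtts_ms is a list of (hop_number, rtt_ms) in path order, with rtt_ms
    set to None for hops that did not reply. Returns a tuple
    (previous_hop, hop, increase_ms), or None if fewer than two hops replied.
    """
    answered = [(hop, rtt) for hop, rtt in hop_rtts_ms if rtt is not None]
    best = None
    for (prev_hop, prev_rtt), (hop, rtt) in zip(answered, answered[1:]):
        increase = rtt - prev_rtt
        if best is None or increase > best[2]:
            best = (prev_hop, hop, increase)
    return best


if __name__ == "__main__":
    # Hops 7 and 8 follow the figures quoted in the post; the rest are made up.
    trace = [(5, 80.0), (6, None), (7, 100.0), (8, 3000.0), (9, 3010.0)]
    print(largest_jump(trace))
```

One caveat: a single slow hop can just be a router that is slow to answer ICMP, so a jump is only meaningful if the later hops stay high as well.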

$5 is pretty cheap. I'm more than happy for you to pay that for me
:awesome:

;)


:zomgwtfbbq: