Vendor HA Recommendations vs. Reality

deanwebb (OP)

« on: April 30, 2018, 09:31:17 AM »
"We only support a direct HA connection. If you put a switch between the devices for the HA connection, we don't support that."

^ How many times have you heard that from a vendor? How many vendors will support just one switch in the path, but still require the devices to be in physical proximity (i.e., no WAN or MLAN link) for the HA connection to be considered supported?

I can understand a vendor not supporting the setup if there's another vendor's switch in the path, since a problem with that switch could break HA without being directly related to the vendor's own gear.

But at the same time, I know of more than one organization that requires HA devices to be in separate buildings of the datacenter, just in case one gets taken out by a tornado or fire or flood. The other building, about a mile away, is still standing with all the HA stuff in it. These firms would rather not see both halves of an HA pair taken out in an outage and then have to use the DR unit in the datacenter on another continent.

Or is this simply a misunderstanding of what HA really is? HA means high availability, as in, should the hardware fail, there's another unit right there to keep doing what the failed unit was doing. It covers hardware failing due to a defect, not failure from an external event that would constitute a disaster. Trying to make HA into a localized DR as a step just before using remote DR may not be the right way to go.

Or is it? If your DC buildings are all on 10 Gb/s or faster links, why not stretch that HA? If the vendor hollers, take it under advisement and do it anyway. You get HA and localized DR, all for the same price, right? So long as the link is fast enough to do replication, the latency added by a mile of wire isn't going to be all that noticeable, even to the systems doing the HA work. If they're built to use a 100 Mb/s direct connection and instead get a 10 Gb/s link stretched over 2 km, they're well within speed parameters.
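To put rough numbers on that claim, here is a back-of-the-envelope sketch in Python. It assumes signals travel through fiber at the speed of light divided by a refractive index of about 1.468 (a typical figure for single-mode fiber, not something any vendor stated) and uses a 1500-byte frame as an illustrative size. The function names are made up for this example.

```python
# Rough latency comparison: short 100 Mb/s direct link vs. 10 Gb/s over 2 km.
# Assumptions (not from any vendor): fiber refractive index ~1.468,
# 1500-byte frames, no switch or queuing delay included.

C_VACUUM_KM_PER_S = 299_792.458
FIBER_REFRACTIVE_INDEX = 1.468


def propagation_delay_us(distance_km, refractive_index=FIBER_REFRACTIVE_INDEX):
    """One-way time for a signal to travel distance_km of fiber, in microseconds."""
    speed_km_per_s = C_VACUUM_KM_PER_S / refractive_index
    return distance_km / speed_km_per_s * 1_000_000


def serialization_delay_us(frame_bytes, link_bits_per_s):
    """Time to clock one frame onto the wire, in microseconds."""
    return frame_bytes * 8 / link_bits_per_s * 1_000_000


def one_way_delay_us(distance_km, frame_bytes, link_bits_per_s):
    """Propagation plus serialization delay for a single frame."""
    return (propagation_delay_us(distance_km)
            + serialization_delay_us(frame_bytes, link_bits_per_s))


if __name__ == "__main__":
    direct = one_way_delay_us(0.003, 1500, 100e6)   # 3 m patch cable, 100 Mb/s
    stretched = one_way_delay_us(2.0, 1500, 10e9)   # 2 km fiber, 10 Gb/s
    print(f"direct 100 Mb/s:   {direct:.2f} us")
    print(f"2 km at 10 Gb/s:   {stretched:.2f} us")
```

Under these assumptions, 2 km of fiber adds about 10 microseconds each way, while putting a 1500-byte frame on a 100 Mb/s link takes 120 microseconds by itself. The faster stretched link comes out ahead per frame, although real switches in the path would add their own delay and jitter.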

So... do you get vendors saying no HA over distances greater than from one rack to the next? And if you do get those messages, do you follow them or just go with the localized DR, as planned?

Dieselboy

Re: Vendor HA Recommendations vs. Reality
« Reply #1 on: May 22, 2018, 01:58:14 AM »
It seems like a misunderstanding... This is where I start asking questions and annoying people (as a consequence of asking questions and probing). But how else can you say "What do you mean you only support a direct HA connection? Can you tell me why?" One could take it as implying that they are not using Ethernet to communicate between the two devices, i.e., a switch wouldn't work, or that the keepalive timeout value is so low that they can't tolerate any latency. Either way, the vendor should be able to give parameters that allow a design to be built. One possible problem I can foresee: if there were such an outage event and it then came to light that the equipment was set up in a non-supported way, the vendor would have its "get-out clause"...

There are a few setups at my place running fine that the vendor has said are not supported or not possible, although they have nothing to do with HA. In those cases I try it, test, and see if it performs as expected.
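If a vendor did hand over that kind of parameter, checking a proposed path against it could be as simple as the sketch below. It assumes a request/reply style keepalive where a probe counts as lost if the reply does not arrive within a timeout. The function name, the safety factor, and all the example numbers are hypothetical and not taken from any vendor's documentation.

```python
# Hypothetical check: does a path's round-trip time fit under a keepalive
# probe timeout, with some headroom? All values are illustrative only.

def link_fits_keepalive(rtt_ms, jitter_ms, probe_timeout_ms, safety_factor=2.0):
    """Return True if worst-case reply time, padded by safety_factor,
    still arrives within the probe timeout."""
    worst_case_ms = rtt_ms + jitter_ms
    return worst_case_ms * safety_factor <= probe_timeout_ms


if __name__ == "__main__":
    # Campus path through one switch: small RTT, a little jitter.
    print(link_fits_keepalive(rtt_ms=0.1, jitter_ms=0.2, probe_timeout_ms=1.0))
    # WAN path: the RTT alone already exceeds a 1 ms timeout.
    print(link_fits_keepalive(rtt_ms=40.0, jitter_ms=5.0, probe_timeout_ms=1.0))
```

The point of the safety factor is to leave room for congestion that the measured numbers did not capture. With a published timeout in hand, a check like this turns "not supported" into something that can be measured and tested.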