Networking-Forums.com

General Category => Forum Lobby => Topic started by: LynK on May 26, 2015, 03:23:53 PM

Title: Cisco is at it again!! (Stable releases... NOT)
Post by: LynK on May 26, 2015, 03:23:53 PM
well....

Since we all know Cisco is known for its stable releases, I thought we should create a thread to show the funniest (a.k.a. worst) bugs that you have come across. Make sure you specify the version as well as the bug ID. The second one below is pretty bad... if you are on it... upgrade STAT!

Browsing through the new 2960x IOS release, I decided to go to the caveats, and I found this beauty:


Version: 15.2.3E1 (ED)
Bug ID: CSCuo55798
Headline:  Priority Queue Latency increases significantly during congestion (LOOOOOL)
:zomgwtfbbq:  :developers:




Version: 15.0.2-EX5(ED)
Bug ID: CSCur56395
Headline: SFP issues (link flap on 10G SFP interfaces)

:wall: :wall: :wall:
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: deanwebb on May 26, 2015, 03:57:40 PM
Always be on the lookout for microcode updates: http://tekcert.com/blog/2012/04/07/upgrading-3750x-can-take-longer-you-think
Going from 12.2-53 to 12.2-58 can take an extra 30-45 minutes to finish.

Then there's the cute trick you have to do when upgrading from 12.2-58 to 15.0 on your 4500s... http://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst4500/release/note/OL_24829.html

Quote: If a switch uses a config-register ending in 0x2, it may drop into ROMMON if the bootup is interrupted by a powercycle.

Workaround: Use config-register 0x2101. CSCue19458
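
If you want to check a pile of registers against that note before an upgrade, something like the following might do. This is only a minimal sketch: it assumes "ending in 0x2" means the lowest hex digit (the boot field) equals 0x2, and the function names and sample values are made up for illustration, not taken from the release note.

```python
def boot_field(config_register: int) -> int:
    """Return the low four bits (the boot field) of a config-register value."""
    return config_register & 0xF


def matches_release_note(config_register: int) -> bool:
    # Assumption: "ending in 0x2" means the boot field equals 0x2.
    return boot_field(config_register) == 0x2


if __name__ == "__main__":
    # Sample values only; 0x2101 is the workaround value quoted above.
    for value in (0x2102, 0x2101):
        print(hex(value), "matches note:", matches_release_note(value))
```

Anything the check flags would be a candidate for the 0x2101 workaround; confirm against the release note for your platform before changing it.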

We hit this issue the hard way, with our 4507s going into ROMMON mode when we upgraded them to 15.0. We were all like :rage: and the switch was all like :problem?: which made us all like :rage: even more until we found the above note and then we were all like :developers: when our Cisco rep came by for a visit to see how the upgrade went.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: wintermute000 on May 26, 2015, 08:29:36 PM
The old Cat4000s used to do this all the time - go into ROMMON on a reboot - you just typed boot and it kept going
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: Dieselboy on May 29, 2015, 03:20:02 AM
O.M.G. you'll love my recent bug then...

I have a 2921 router, running as an SSL VPN server amongst other things, and we use remote IP phones (9971 and 8945, mainly) and these use the AnyConnect VPN app on the phone to connect back to the office. We were running 15.3.3M3 and we were hitting a memory leak bug due to HTTPS / SSL VPN. This consumed all I/O pool memory as a symptom. The fix was to upgrade to 15.3.3M4, "but you won't believe what happened next"

The issue is that, in all new IOS versions, the SSL VPN component in IOS expects a DTLS request header from the VPN client to negotiate SSL VPN on DTLS. Since there is absolutely no IP phone firmware whatsoever that will send a response to this DTLS request, no AnyConnect phones at all can establish UDP SSL VPN on newer IOS. The result is voice over TCP, constant phone reboot / lockup / call disconnection and poor audio quality when it does work. Pinging a remote VPN phone which is 150ms away shows latency shooting up to 1000ms and beyond as soon as the call is answered, then timeouts and zero audio.

Affected IOS versions are:
a) 15.3.3M4 onwards (All releases onward)
b) 15.5(1)T onwards (All releases onward)

BUG ID: CSCup56792 (Private / Internal only bug) - although I don't know whether this is a bug for the issue I've mentioned or in fact an enhancement request to have this "feature" implemented.

Who in their right mind would implement a "feature" that breaks the entire SSL VPN side of the telephony handsets without even having a planned firmware release to work with the new feature? I have a TAC case raised, titled "new IOS feature breaks AnyConnect phones", and there is not even a job tasked for the devs to implement this in the IP phones to support the IOS head end.
I've no idea if this feature has been implemented into ASAs...
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: wintermute000 on May 29, 2015, 03:53:15 AM
Wtf!!!!!
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: icecream-guy on May 29, 2015, 08:24:48 AM
we've still got a feature request to be able to run ASDM on an ASA in multicontext mode that allows for two-factor authentication.  not yet in the planning stage.  Status unchanged for about a year now.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: Otanx on May 29, 2015, 09:25:06 AM
Technically not a bug, but a field notice. Still my favorite.

http://www.cisco.com/c/en/us/support/docs/field-notices/636/fn63697.html

Plugging a cable into port 1 may cause the switch to reboot, and wipe the start-up config.

-Otanx
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: NetworkGroover on May 29, 2015, 10:50:37 AM
Quote from: Otanx on May 29, 2015, 09:25:06 AM
Technically not a bug, but a field notice. Still my favorite.

http://www.cisco.com/c/en/us/support/docs/field-notices/636/fn63697.html

Plugging a cable into port 1 may cause the switch to reboot, and wipe the start-up config.

-Otanx

Haha - that's hilarious!
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: deanwebb on May 29, 2015, 12:10:36 PM
 :rofl:

I really enjoyed the laugh. Of course, if it happened to me...

:jackie-chan:
:rage:
:kiwf:
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: Dieselboy on June 02, 2015, 09:16:14 PM
Quote from: Otanx on May 29, 2015, 09:25:06 AM
Technically not a bug, but a field notice. Still my favorite.

http://www.cisco.com/c/en/us/support/docs/field-notices/636/fn63697.html

Plugging a cable into port 1 may cause the switch to reboot, and wipe the start-up config.

-Otanx

Love it!
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: icecream-guy on June 30, 2015, 07:18:14 AM
hit another interesting tidbit this morning.

upgraded our 9K distro switches from 6.1(2)I3(2) to 7.0(3)I1(2)  to enable interface flow control which is available in the new release for an application that needs it.  core 9k's still running 6.1(2)I3(2).

OSPF process broke, caused a summary route from the core not to propagate to the OSPF peer distribution switches. thus leaving an island of unhappy devices that had nowhere to route. :(

TAC case opened and copying show-tech files now.  can't wait to see what they say.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: deanwebb on June 30, 2015, 09:12:47 AM
Current Vegas odds on the outcome of that TAC call:

Upgrade the core switch: 6:5
Reboot the core switch: 5:2
Reboot the distro switches: 3:1
Downgrade the distro switches: 7:2
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: icecream-guy on June 30, 2015, 10:25:48 AM
oh, they were downgraded, had about 20 minutes to troubleshoot or revert before the window closed. rollback decision was made and everything works as expected
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: NetworkGroover on June 30, 2015, 12:11:19 PM
Quote from: ristau5741 on June 30, 2015, 10:25:48 AM
oh, they were downgraded, had about 20 minutes to troubleshoot or revert before the window closed. rollback decision was made and everything works as expected

Yuck.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: LynK on June 30, 2015, 12:43:47 PM
@ristau,


How are you allowed to go ahead with upgrades during normal operational hours (or even off hours)? If I say I am going to upgrade our 7Ks, even with no outage (ISSU <3), the whole company goes nuts. Honestly... when I upgrade my core it is normally a year or 2 before I upgrade to a new-old version that has been tried and true...

I don't know... call me a baby.. but my upper management would flip bricks.


Access/Distrib. A O K. no problems upgrading.... core... DIR gives me the :zomgwtfbbq:
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: icecream-guy on July 01, 2015, 07:53:03 AM
Quote from: LynK on June 30, 2015, 12:43:47 PM
@ristau,


How are you allowed to go ahead with upgrades during normal operational hours (or even off hours)? If I say I am going to upgrade our 7Ks, even with no outage (ISSU <3), the whole company goes nuts. Honestly... when I upgrade my core it is normally a year or 2 before I upgrade to a new-old version that has been tried and true...

I don't know... call me a baby.. but my upper management would flip bricks.


Access/Distrib. A O K. no problems upgrading.... core... DIR gives me the :zomgwtfbbq:


usually same for us, but the mainframe team's been complaining about "slow replication" since the new data center was built.  after much troubleshooting and discussions with IBM and Cisco,  IBM's recommendation was to implement flow control on the interface, for which Cisco recommended the 7.0 release that would support the requirement.  upper management got wind and, in order to stop the complaints and to "fix" the issue, were all in favor. CR went through the full process, notification went out with a "possibility of network interruption", AS did a bug scrub which was good, and we ran the code on a non-production switch for more than a week with no issues.  Goes to show you just never know. monitoring and vigilance are required.

Best thing other than monitoring and vigilance is to document exactly what is occurring and when, with results, from each step. because when I was interrogated by the deputy director and the Operations manager, I would have looked a lot better if I had all my P's and Q's in alignment, and did not have to guess what happened and when.

so as of today, I start logging my device access, using the logging feature in secureCRT, so everything I type is logged for reference. and I changed the PS1 value on my jumpbox through .bashrc, so that I have a date and time stamp at every command prompt.  just need to figure out if I can do the same D&T prompt in the Cisco cli. because when I'm in a terminal server jumping between devices, there are no unix cli date/time stamp prompts for reference.
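
For sessions where the prompt itself can't carry a timestamp, one fallback is to stamp each line as it gets written to the log. A minimal Python sketch of that idea follows; the function names are hypothetical, and it assumes lines are stamped at the moment they pass through the script, which only matches the real command time if the output is fed to it live.

```python
from datetime import datetime


def stamp(line: str, now: datetime) -> str:
    """Prefix one line of session output with a date and time stamp."""
    return f"[{now:%Y-%m-%d %H:%M:%S}] {line}"


def stamp_lines(lines, clock=datetime.now):
    """Yield each line with the time it was processed in front of it.

    clock is injectable so a fixed time can be used when testing.
    """
    for line in lines:
        yield stamp(line.rstrip("\n"), clock())


if __name__ == "__main__":
    # Illustrative input only; in practice the lines would come from a live session.
    for out in stamp_lines(["show ip ospf neighbor", "show ip route summary"]):
        print(out)
```

This does not replace the secureCRT session log; it just adds the when to the what.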

Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: icecream-guy on July 10, 2015, 10:10:48 AM
Quote from: ristau5741 on June 30, 2015, 07:18:14 AM
hit another interesting tidbit this morning.

upgraded our 9K distro switches from 6.1(2)I3(2) to 7.0(3)I1(2)  to enable interface flow control which is available in the new release for an application that needs it.  core 9k's still running 6.1(2)I3(2).

OSPF process broke, caused a summary route from the core not to propagate to the OSPF peer distribution switches. thus leaving an island of unhappy devices that had nowhere to route. :(

TAC case opened and copying show-tech files now.  can't wait to see what they say.


turns out to be a bug
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: wintermute000 on July 11, 2015, 05:54:35 AM
And you're doing it the safe way (NX-OS). You should talk to the guys in my mob doing ACI. The horror.
Even in the lab (training) we were seeing crazy behaviour. Stuff mysteriously not working, then working 2 minutes later with no intervention, etc.

Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: deanwebb on July 11, 2015, 10:41:52 AM
Quote from: wintermute000 on July 11, 2015, 05:54:35 AM
And you're doing it the safe way (NX-OS). You should talk to the guys in my mob doing ACI. The horror.
Even in the lab (training) we were seeing crazy behaviour. Stuff mysteriously not working, then working 2 minutes later with no intervention, etc.



Random stuff that happens without intervention = hardware craziness. Something is making that hardware do things that it does not want to do.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: NetworkGroover on July 11, 2015, 09:24:38 PM
Quote from: wintermute000 on July 11, 2015, 05:54:35 AM
And you're doing it the safe way (NX-OS). You should talk to the guys in my mob doing ACI. The horror.
Even in the lab (training) we were seeing crazy behaviour. Stuff mysteriously not working, then working 2 minutes later with no intervention, etc.

Heh, shocker. 

Any positive feedback been given?  Curious to know where it does well in addition to where it has issues.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: NetworkGroover on July 11, 2015, 10:18:46 PM
Quote from: deanwebb on July 11, 2015, 10:41:52 AM
Quote from: wintermute000 on July 11, 2015, 05:54:35 AM
And you're doing it the safe way (NX-OS). You should talk to the guys in my mob doing ACI. The horror.
Even in the lab (training) we were seeing crazy behaviour. Stuff mysteriously not working, then working 2 minutes later with no intervention, etc.



Random stuff that happens without intervention = hardware craziness. Something is making that hardware do things that it does not want to do.

Software programs the hardware... especially on switches....
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: wintermute000 on July 12, 2015, 02:03:04 AM
I wouldn't be so quick to dismiss the Cisco or Vmware juggernauts, even if Arista is the current leader in programmable hardware switching. In fact Clos designs and SDN may progress to a point where the physical switching is commoditised even further. Vmware is out there pushing the 'fabric is irrelevant just make it L3 multipathing' party line HARD. Who cares if your Arista or Juniper or Chinese knockoff vendor has an API? All leaf switches have the same bog standard OSPF config or whatever.


both Vmware and Cisco ACI are all about the overlay.


Where ACI has a distinct advantage - and disadvantage - is that everything is redefined around the actual software flow. Heck, even a VLAN in ACI is no longer a VLAN (the 'traditional' VLAN is actually what Cisco calls a bridge segment and what everyone else calls a VXLAN ID - but it's still there as an identifier - I could go on, but I'll just say read the book ROFL). The key is that policy drives everything - you have to define what's allowed to pass and what's not - and the ACI managers are not in the control plane at all, unlike every other SDN solution including Vmware. Then you trust the magic overlay special sauce to do its business. Just working out a basic packet flow is like Inception if you want to peel back all the layers and go from the abstraction to the overlay mechanics to the underlay beneath the overlay. But put it this way, you can't think in terms of NICs with IP addresses and MAC addresses in a VLAN routed like XYZ. The entire flow is broken down into policy constructs, and defined accordingly. This potentially enables massive benefits in terms of orchestration and not having to worry about layer 3 design or flow.


The flip side is, it's hard as hell to understand, everything takes 50 times as long unless you're going to replicate it 1000 times via an API call, and you're at the complete mercy of the 'plug and pray' underlay/overlay magic sauce. Also the feature-set is not mature yet - the service insertion (basically allowing 3rd party appliances like FWs, load balancers etc. into the packet flow) is completely stuffed and very basic. Finally their ability to reach into the virtual layer, which is absolutely critical for end to end control/abstraction, is partly stymied by Vmware - it's unclear how the integration will work with vSphere 6 (in 5.5 the ACI actually programs the dvswitching transparently).


Vmware hasn't gone halfway as far, and has an immediately attractive / easy to learn model of basically making virtual facsimiles of current R&S paradigms and removing some of the hairpinning more or less. The underlay/overlay model is still there. But the flip side is they have no input into the physical layer, nor do they enable the potential benefits of software defining all the traffic - you're still basically thinking in terms of connecting virtual NICs to virtual switches.


Disclaimer: have done a lot of training on both platforms and done a VCP-NV, but have yet to work on production.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: deanwebb on July 12, 2015, 07:23:08 AM
If it's incredibly complicated, then one of two things will happen:

1. Lots of people walking in and out of the data center and all around the corporate floor with big thumb drives until the network guys figure out how to give everyone proper access.

2. The GUI is pretty decent and gets everything running... until a code upgrade somewhere leads to a memory leak somewhere else and then, WHAMMO! Network is down and everyone is using guest wireless and dropbox to share files.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: icecream-guy on July 13, 2015, 07:18:33 AM
Quote from: ristau5741 on July 10, 2015, 10:10:48 AM
Quote from: ristau5741 on June 30, 2015, 07:18:14 AM
hit another interesting tidbit this morning.

upgraded our 9K distro switches from 6.1(2)I3(2) to 7.0(3)I1(2)  to enable interface flow control which is available in the new release for an application that needs it.  core 9k's still running 6.1(2)I3(2).

OSPF process broke, caused a summary route from the core not to propagate to the OSPF peer distribution switches. thus leaving an island of unhappy devices that had nowhere to route. :(

TAC case opened and copying show-tech files now.  can't wait to see what they say.


turns out to be a bug

got the id this morning

CSCuv24226.

it is very specific.

Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: NetworkGroover on July 13, 2015, 11:55:39 AM
Quote from: wintermute000 on July 12, 2015, 02:03:04 AM
I wouldn't be so quick to dismiss the Cisco or Vmware juggernauts, even if Arista is the current leader in programmable hardware switching. In fact Clos designs and SDN may progress to a point where the physical switching is commoditised even further. Vmware is out there pushing the 'fabric is irrelevant just make it L3 multipathing' party line HARD. Who cares if your Arista or Juniper or Chinese knockoff vendor has an API? All leaf switches have the same bog standard OSPF config or whatever.


both Vmware and Cisco ACI are all about the overlay.


Where ACI has a distinct advantage - and disadvantage - is that everything is redefined around the actual software flow. Heck, even a VLAN in ACI is no longer a VLAN (the 'traditional' VLAN is actually what Cisco calls a bridge segment and what everyone else calls a VXLAN ID - but it's still there as an identifier - I could go on, but I'll just say read the book ROFL). The key is that policy drives everything - you have to define what's allowed to pass and what's not - and the ACI managers are not in the control plane at all, unlike every other SDN solution including Vmware. Then you trust the magic overlay special sauce to do its business. Just working out a basic packet flow is like Inception if you want to peel back all the layers and go from the abstraction to the overlay mechanics to the underlay beneath the overlay. But put it this way, you can't think in terms of NICs with IP addresses and MAC addresses in a VLAN routed like XYZ. The entire flow is broken down into policy constructs, and defined accordingly. This potentially enables massive benefits in terms of orchestration and not having to worry about layer 3 design or flow.


The flip side is, it's hard as hell to understand, everything takes 50 times as long unless you're going to replicate it 1000 times via an API call, and you're at the complete mercy of the 'plug and pray' underlay/overlay magic sauce. Also the feature-set is not mature yet - the service insertion (basically allowing 3rd party appliances like FWs, load balancers etc. into the packet flow) is completely stuffed and very basic. Finally their ability to reach into the virtual layer, which is absolutely critical for end to end control/abstraction, is partly stymied by Vmware - it's unclear how the integration will work with vSphere 6 (in 5.5 the ACI actually programs the dvswitching transparently).


Vmware hasn't gone halfway as far, and has an immediately attractive / easy to learn model of basically making virtual facsimiles of current R&S paradigms and removing some of the hairpinning more or less. The underlay/overlay model is still there. But the flip side is they have no input into the physical layer, nor do they enable the potential benefits of software defining all the traffic - you're still basically thinking in terms of connecting virtual NICs to virtual switches.


Disclaimer: have done a lot of training on both platforms and done a VCP-NV, but have yet to work on production.

No sweat - good analysis, it will just be interesting to see the results when the rubber hits the road, so to speak.  I said "shocker" because that's all the feedback I've heard so far.  Everyone talks about how wonderful ACI is supposed to be, but all I hear about actual implementations is horror stories.  The devil is always in the details.  Looking forward to hearing more about it and your experiences with it - good and bad.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: icecream-guy on July 23, 2015, 11:01:25 AM
found out the hard way

so on certain releases of IOS code,
when you SNMP poll MIB 1.0.8802.1.1.2.1.5.4795
or in english lldpXMedMIB

it makes your CPU processor go to 100% and stay there.


Bug ID CSCuu05714
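
Until the code is upgraded, one stopgap is to keep that subtree out of whatever list of OIDs the poller walks. A minimal Python sketch of the filtering step, assuming the poller takes a plain list of dotted OID strings; the function names and the sample list are hypothetical and not tied to any particular NMS or SNMP library.

```python
# OID quoted in the post above (lldpXMedMIB).
LLDP_XMED_MIB = "1.0.8802.1.1.2.1.5.4795"


def parse_oid(oid: str) -> tuple:
    """Turn a dotted OID string (with or without a leading dot) into a tuple of ints."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def in_subtree(oid: str, root: str = LLDP_XMED_MIB) -> bool:
    """True if oid is the root itself or sits anywhere beneath it."""
    candidate, base = parse_oid(oid), parse_oid(root)
    # Compare arc by arc so that ...4795 does not match ...47950.
    return candidate[:len(base)] == base


def safe_to_poll(oids):
    """Drop every OID that falls under the affected subtree."""
    return [oid for oid in oids if not in_subtree(oid)]


if __name__ == "__main__":
    # Sample poll list for illustration only.
    poll_list = ["1.3.6.1.2.1.1.3.0", "1.0.8802.1.1.2.1.5.4795.1.1.1"]
    print(safe_to_poll(poll_list))
```

Comparing arc by arc rather than as a string prefix avoids accidentally filtering a neighbouring OID that merely starts with the same digits.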

Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: deanwebb on July 23, 2015, 11:05:40 AM
Quote from: ristau5741 on July 23, 2015, 11:01:25 AM
found out the hard way

so on certain releases of IOS code,
when you SNMP poll MIB 1.0.8802.1.1.2.1.5.4795
or in english lldpXMedMIB

it makes your CPU processor go to 100% and stay there.


Bug ID CSCuu05714


Should I try that on my devices in the data center?
:challenge-considered:
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: wintermute000 on July 23, 2015, 05:19:31 PM
Wtf

Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: icecream-guy on July 24, 2015, 10:54:01 AM
Today's bug is related to the 1000v,  as we were adding licenses 
Bug ID CSCut56474 - I can't see the bug on CCO due to insufficient privileges,

But it's just that when you have version 1 licenses and install a version 3 license, all your version 1 licenses become invalid
and it looks like we'll need to buy upgrade licenses to replace the version 1 licenses with version 3 licenses.
there is a specific license upgrade product (L-N1K-CPU-V3UP-01) to resolve this.

Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: wintermute000 on July 24, 2015, 06:13:08 PM
As a card-carrying VCP, I'll ask again: what earth-shattering improvement do you get from a 1000v over a stock 5.5 dvswitch? Nothing. So the networking guys get control again? That's political not technical. Let them learn vmware is the better solution.
Did I also mention 1000v is not compatible with ACI, nor is it usable with vSphere 6?
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: Dieselboy on July 31, 2015, 02:28:13 AM
Quote from: ristau5741 on July 23, 2015, 11:01:25 AM
found out the hard way

so on certain releases of IOS code,
when you SNMP poll MIB 1.0.8802.1.1.2.1.5.4795
or in english lldpXMedMIB

it makes your CPU processor go to 100% and stay there.


Bug ID CSCuu05714

That's a nice feature.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: NetworkGroover on July 31, 2015, 11:15:58 AM
Quote from: Dieselboy on July 31, 2015, 02:28:13 AM
Quote from: ristau5741 on July 23, 2015, 11:01:25 AM
found out the hard way

so on certain releases of IOS code,
when you SNMP poll MIB 1.0.8802.1.1.2.1.5.4795
or in english lldpXMedMIB

it makes your CPU processor go to 100% and stay there.


Bug ID CSCuu05714

That's a nice feature.

Yes - say thank you!

So... I heard a crazy one the other day on the 9500.  A VLAN was removed and the switch crashed. Called TAC and the solution was to put the VLAN back.  If anyone is familiar with this, I'd LOVE to get the bug id.  Thanks.
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: icecream-guy on August 27, 2015, 10:45:30 AM
So if you have a Cisco 6800 with 10G/40G interfaces running 15.1(2)SY code, and specific modules as noted in the bug scrub,
apparently packet padding does not work. Legit packets of less than 64 bytes get dropped rather than padded to 64 bytes
and sent out the egress interface.

took a long time for the team to figure this one out. (luckily, I wasn't on this task, I'd have pulled my hair out.)

CSCut40421 if you are interested.
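
For anyone wondering what "padded to 64 bytes" should look like, here is a minimal Python sketch of the expected behaviour. It is an illustration only: the function name is made up, zero bytes are assumed as the pad value, and whether the 64-byte figure is counted with or without the frame check sequence is left as the post states it.

```python
MIN_FRAME_LEN = 64  # assumption: the 64-byte minimum quoted in the post


def pad_frame(frame: bytes, min_len: int = MIN_FRAME_LEN) -> bytes:
    """Return the frame unchanged if long enough, otherwise zero-pad it to min_len."""
    if len(frame) >= min_len:
        return frame
    # bytes(n) produces n zero bytes, used here as the pad.
    return frame + bytes(min_len - len(frame))


if __name__ == "__main__":
    short = b"\x01" * 40  # sample 40-byte frame, illustration only
    print(len(short), "->", len(pad_frame(short)))
```

Per the bug description, the affected modules drop the short frame instead of doing the equivalent of the padding step.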



Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: Dieselboy on October 22, 2015, 11:02:33 PM
Just had this bug created from my TAC case: https://tools.cisco.com/bugsearch/bug/CSCuw80259

In short, CUCM 11 / CUPS (IM&P) 11 now uses Active Directory groups so you can bung all your users into certain groups such as "developers" and then your users can populate their contacts list with a group. So if people join or leave the company or change roles in the company, the contacts list is managed by the AD group and not the end user. So everyone's contacts list is updated automatically. However, the problem is that when you have users in the AD group contact, their extension number shows up as "unknown" rather than "Work".

Did Cisco even test it?
Title: Re: Cisco is at it again!! (Stable releases... NOT)
Post by: deanwebb on October 23, 2015, 08:44:06 AM
They probably tested it, but only with the CLI.

:facepalm1: