Hi guys,
I'm writing a paper on BGP in the data center and could use your help understanding something. I've set up a virtual lab in GNS3 and attached a diagram to the post.
My question is about what happens at LEAF2, which has two equal-cost paths to 10.0.2.0/24:
BGPDC-LEAF2(config-router-bgp)#sh ip bgp | i 10.0.2.0/24
* >Ec 10.0.2.0/24 10.255.255.2 0 100 0 64600 65002 i
* ec 10.0.2.0/24 10.255.255.10 0 100 0 64600 65002 i
BGPDC-LEAF2(config-router-bgp)#sh ip route
Codes: C - connected, S - static, K - kernel,
O - OSPF, IA - OSPF inter area, E1 - OSPF external type 1,
E2 - OSPF external type 2, N1 - OSPF NSSA external type 1,
N2 - OSPF NSSA external type2, B I - iBGP, B E - eBGP,
R - RIP, I - ISIS, A B - BGP Aggregate, A O - OSPF Summary,
NG - Nexthop Group Static Route
Gateway of last resort is not set
C 1.1.1.0/31 is directly connected, Vlan4094
C 1.1.1.2/31 is directly connected, Vlan4093
C 10.0.1.0/24 is directly connected, Vlan10
B E 10.0.2.0/24 [200/0] via 10.255.255.2, Ethernet1
via 10.255.255.10, Ethernet2
< .. Omitted for brevity .. >
LEAF2 advertises only the 10.0.2.0/24 path via 10.255.255.2 to LEAF1:
BGPDC-LEAF1(config-if-Et1-2)#sh ip bgp | i 10.0.2.0/24
* > 10.0.2.0/24 10.255.255.2 0 100 0 64600 65002 i
BGPDC-LEAF1(config-if-Et1-2)#sh ip bgp neighbors 1.1.1.3 received-routes | i 10.0.2.0/24
* > 10.0.2.0/24 10.255.255.2 0 100 - 64600 65002 i
BGPDC-LEAF2(config-router-bgp)#sh ip bgp nei 1.1.1.2 advertised-routes | i 10.0.2.0/24
* >Ec 10.0.2.0/24 10.255.255.2 - 100 - 64600 65002 i
Why is it not advertising both? Does it only advertise the "best" of the ECMP routes?
BGPDC-LEAF2(config-router-bgp)#sh ip bgp 10.0.2.0/24
BGP routing table information for VRF default
Router identifier 10.255.254.12, local AS number 65000
BGP routing table entry for 10.0.2.0/24
Paths: 2 available
64600 65002
10.255.255.2 from 10.255.255.2 (10.255.254.1)
Origin IGP, metric 0, localpref 100, weight -, valid, external, ECMP head, best, ECMP contributor
64600 65002
10.255.255.10 from 10.255.255.10 (10.255.254.2)
Origin IGP, metric 0, localpref 100, weight -, valid, external, ECMP, ECMP contributor
Yep. Even though the RIB may be doing ECMP across two BGP paths, BGP still advertises only the best path to neighbors.
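To make the split between "installed" and "advertised" concrete, here is a minimal Python sketch. It is not how any vendor implements it: the `Path` class, the `is_best` flag, and both function names are made up for illustration, and the best-path decision is taken as already made. The addresses come from the LEAF2 output above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str
    as_path: tuple
    is_best: bool  # result of the best-path decision, taken as given here


def rib_next_hops(paths, max_paths):
    """Forwarding view: the best path first, then other ECMP-eligible paths."""
    ordered = sorted(paths, key=lambda p: not p.is_best)
    return [p.next_hop for p in ordered[:max_paths]]


def advertised(paths):
    """Advertisement view: plain BGP sends one path per prefix, the best one."""
    return [p for p in paths if p.is_best]


paths = [
    Path("10.0.2.0/24", "10.255.255.2", (64600, 65002), True),
    Path("10.0.2.0/24", "10.255.255.10", (64600, 65002), False),
]

print(rib_next_hops(paths, max_paths=2))        # ['10.255.255.2', '10.255.255.10']
print([p.next_hop for p in advertised(paths)])  # ['10.255.255.2']
```

Both next hops end up in the forwarding view, but only the path marked best goes out to the neighbor, which matches what LEAF1 sees.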
Quote from: that1guy15 on March 18, 2015, 12:57:08 PM
Yep. Even though the RIB may be doing ECMP across two BGP paths, BGP still advertises only the best path to neighbors.
Gotcha - thanks for the verification. Been a while since I played around with this in depth.
Quote from: that1guy15 on March 18, 2015, 12:57:08 PM
Yep. Even though the RIB may be doing ECMP across two BGP paths, BGP still advertises only the best path to neighbors.
Oh hey, while I have you, and purely being lazy here: what makes that route the best? Is it the lowest router ID (SPINE1's is lower than SPINE2's)?
Note that this can be changed in the latest IOS-XE with BGP additional paths. About time, but it potentially makes life very complicated.
http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-3s/irg-xe-3s-book/irg-additional-paths.html#GUID-4EB13F76-7C14-4B74-AFE0-66BF07976BD5
Service providers aren't sure whether to say YAY or OMG, we have to rip up our entire RD:RT bag of tricks, i.e. the traditional way of getting around this 'problem':
http://blog.ipspace.net/2012/07/bgp-route-replication-in-mplsvpn-pe.html
Quote from: wintermute000 on March 18, 2015, 05:11:27 PM
Note that this can be changed in the latest IOS-XE with BGP additional paths. About time, but it potentially makes life very complicated.
http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_bgp/configuration/xe-3s/irg-xe-3s-book/irg-additional-paths.html#GUID-4EB13F76-7C14-4B74-AFE0-66BF07976BD5
Service providers aren't sure whether to say YAY or OMG, we have to rip up our entire RD:RT bag of tricks, i.e. the traditional way of getting around this 'problem':
http://blog.ipspace.net/2012/07/bgp-route-replication-in-mplsvpn-pe.html
Interesting, but I don't work with IOS. ;) I don't necessarily need to advertise all ECMP routes either, but I'll keep this in mind for future studies - thanks!
Take a look at BGP add-path https://tools.ietf.org/html/draft-ietf-idr-add-paths-10
Quote from: AspiringNetworker on March 18, 2015, 10:45:31 PM
Interesting, but I don't work with IOS. ;)
Not according to Cisco!
:awesome:
Quote from: AspiringNetworker on March 18, 2015, 03:58:13 PM
Gotcha - thanks for the verification. Been a while since I played around with this in depth.
http://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13753-25.html
Quote from: that1guy15 on March 19, 2015, 08:22:39 AM
http://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13753-25.html
Thanks sir. I didn't want to include a vendor doc reference in my paper, so I found this in the RFC as well, though it's a little more wordy: RFC 4271, Section 9.1.2.2, "Breaking Ties (Phase 2)"
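For anyone following along, here is a rough Python sketch of the tie-breaking order in RFC 4271 Section 9.1.2.2, applied to the two paths from the LEAF2 output. It is a simplification: the `Candidate` class and `tie_break_key` are hypothetical names, MED is compared as if every candidate came from the same neighbor AS (true here, both via 64600), AS_SET handling is ignored, and the IGP cost of 0 is an assumption for directly connected next hops. Real implementations add their own steps (weight, oldest path, and so on), so treat it as the RFC order only.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass(frozen=True)
class Candidate:
    as_path: tuple          # AS numbers, leftmost is the neighbor AS
    origin: int             # 0 = IGP, 1 = EGP, 2 = INCOMPLETE
    med: int
    external: bool          # learned over eBGP
    igp_cost: int           # interior cost to the next hop (assumed 0 here)
    router_id: IPv4Address  # BGP Identifier of the advertising peer
    peer_addr: IPv4Address


def tie_break_key(c):
    # Tuple order mirrors steps (a) to (g) of the RFC; lower sorts first.
    return (
        len(c.as_path),  # (a) shortest AS_PATH
        c.origin,        # (b) lowest origin
        c.med,           # (c) lowest MED (same neighbor AS assumed)
        not c.external,  # (d) eBGP preferred over iBGP
        c.igp_cost,      # (e) lowest interior cost to the next hop
        c.router_id,     # (f) lowest BGP Identifier
        c.peer_addr,     # (g) lowest peer address
    )


spine1 = Candidate((64600, 65002), 0, 0, True, 0,
                   IPv4Address("10.255.254.1"), IPv4Address("10.255.255.2"))
spine2 = Candidate((64600, 65002), 0, 0, True, 0,
                   IPv4Address("10.255.254.2"), IPv4Address("10.255.255.10"))

best = min([spine1, spine2], key=tie_break_key)
print(best.peer_addr)  # 10.255.255.2
```

Steps (a) through (e) tie for these two paths, so under the RFC order the decision falls to (f), the lower BGP Identifier, which is SPINE1's 10.255.254.1.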
Add-path and PIC work out surprisingly well. However, I have a similar setup in all my data centers, with NX-OS on the leaves and spines. I have no issues at all: turn on multipathing and we're good to go. I have not had to use add-paths or PIC in those environments.
Quote from: burnyd on March 21, 2015, 02:46:24 PM
Add-path and PIC work out surprisingly well. However, I have a similar setup in all my data centers, with NX-OS on the leaves and spines. I have no issues at all: turn on multipathing and we're good to go. I have not had to use add-paths or PIC in those environments.
Yeah as mentioned before, I don't have a need for it either. Out of curiosity though - what's the use case for it?
It's hard to talk about without drawing it out, but I have 4 internet peerings, all meshed between multiple data centers in one large iBGP/OSPF mesh with full IPv4 tables. When one of the internet circuits failed, outbound traffic was not failing over as quickly as it should, because obviously that next hop would disappear. Add-paths made the failover much faster.
Quote from: burnyd on March 24, 2015, 11:28:19 AM
It's hard to talk about without drawing it out, but I have 4 internet peerings, all meshed between multiple data centers in one large iBGP/OSPF mesh with full IPv4 tables. When one of the internet circuits failed, outbound traffic was not failing over as quickly as it should, because obviously that next hop would disappear. Add-paths made the failover much faster.
I think I follow: with add-paths, routes with still-reachable next hops are already advertised before the failover occurs, instead of only the single current best path, which disappears in the failure scenario. Makes sense - thanks.
EDIT - And actually, I think I have a use case for this with BGP in the DC with an ECMP switch fabric for that same reason - improving failover time... I'll have to dig into that.
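A toy Python sketch of that failover reasoning, purely illustrative: `usable_next_hop` is a made-up helper, and the next hops are borrowed from the lab output earlier in the thread rather than from burnyd's network. It only models what the receiver already holds at the moment of failure, not timers or convergence.

```python
def usable_next_hop(received_next_hops, failed):
    """Return the first received next hop that is still reachable, or None
    if the receiver must wait for its peer to send a replacement path."""
    for nh in received_next_hops:
        if nh not in failed:
            return nh
    return None


failed = {"10.255.255.2"}

best_only = ["10.255.255.2"]                     # plain BGP: one path per prefix
with_add_path = ["10.255.255.2", "10.255.255.10"]  # add-path: backup already known

print(usable_next_hop(best_only, failed))      # None
print(usable_next_hop(with_add_path, failed))  # 10.255.255.10
```

With only the best path received, the receiver has nothing to fall back on until a new update arrives; with the extra path already in hand it can switch locally.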