Tuesday, October 8, 2013

BGP Conditionally Injected Loops

While verifying the information for the previous post on BGP Conditional Route Injection, I got an idea. I thought I saw a way for a loop to form: a router could learn the injected route and start sending traffic in the wrong direction.

It turns out I was right.


This scenario continues exactly where I left off with the BGP Conditional Route Injection post.  To refresh, here’s the topology.



Still not shown is BB2, connected to Sw2, which is the source of all our routes.

The idea here is pretty simple. When I was looking at the BGP table on R1, I noticed that the injected route was being learned by R1, even though this was the router suppressing the route in the aggregate.

R1#sh ip bgp 192.168.3.0/24
BGP routing table entry for 192.168.3.0/24, version 12
Paths: (2 available, best #2, table Default-IP-Routing-Table, Advertisements suppressed by an aggregate.)
  Not advertised to any peer
  2313, (aggregated by 112 1.1.1.1)
    10.1.13.13 from 10.1.13.13 (13.13.13.13)
      Origin incomplete, localpref 100, valid, external, atomic-aggregate
  (12) 2122
    10.21.12.21 (metric 1) from 12.12.12.12 (12.12.12.12)
      Origin IGP, metric 0, localpref 100, valid, confed-external, best

I expected to see the original route suppressed, but I was momentarily confused about the other copy. After reading the output it was obvious that the other path, the one via 10.1.13.13, was being learned from Sw3, who in turn was learning it from R2. This was the injected route circling back around!

Good thing that the suppress-map covers ALL instances of a given route in the BGP table. This was something I hadn’t really contemplated before, but it made sense when I thought about it. But what if R1 wasn’t suppressing the route? What if R1 wasn’t doing the aggregation, and instead the aggregation was being done on Sw2 as the routes entered the AS? And what if Sw2 was still suppressing the route that R2 was injecting?

Time to find out…

First I’ll remove the aggregate on R1 and allow the network to reconverge.

R1#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R1(config)#router bgp 1
R1(config-router)#$s 192.168.0.0 255.255.248.0 suppress-map SUPPRESS_MAP
R1(config-router)#end
R1#sh ip bgp
BGP table version is 14, local router ID is 1.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 192.168.1.0      10.21.12.21              0    100      0 (12) 2122 i
*> 192.168.2.0      10.21.12.21              0    100      0 (12) 2122 i
*> 192.168.3.0      10.21.12.21              0    100      0 (12) 2122 i
*> 192.168.4.0      10.21.12.21              0    100      0 (12) 2122 i
*> 192.168.5.0      10.21.12.21              0    100      0 (12) 2122 i
*> 192.168.6.0      10.21.12.21              0    100      0 (12) 2122 i
*> 192.168.7.0      10.21.12.21              0    100      0 (12) 2122 i
*> 192.168.8.0      10.21.12.21              0    100      0 (12) 2122 i
*> 192.168.9.0      10.21.12.21              0    100      0 (12) 2122 i

Now I’ll paste the exact same aggregate and suppress-map config I had on R1 into Sw2.

ip prefix-list SUPPRESS seq 5 permit 192.168.3.0/24
!
route-map SUPPRESS_MAP permit 10
 match ip address prefix-list SUPPRESS
!
router bgp 12
  aggregate-address 192.168.0.0 255.255.248.0 suppress-map SUPPRESS_MAP

And a quick verification, because CCIEs always verify.

Sw2#sh ip bgp
BGP table version is 14, local router ID is 12.12.12.12
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 192.168.0.0/21   0.0.0.0                            32768 i
*> 192.168.1.0      10.21.12.21              0             0 2122 i
*> 192.168.2.0      10.21.12.21              0             0 2122 i
s> 192.168.3.0      10.21.12.21              0             0 2122 i
*> 192.168.4.0      10.21.12.21              0             0 2122 i
*> 192.168.5.0      10.21.12.21              0             0 2122 i
*> 192.168.6.0      10.21.12.21              0             0 2122 i
*> 192.168.7.0      10.21.12.21              0             0 2122 i
*> 192.168.8.0      10.21.12.21              0             0 2122 i
*> 192.168.9.0      10.21.12.21              0             0 2122 i

We’ll also verify that R2 still sees the aggregate, and is injecting its route.

R2#sh ip bgp 192.168.3.0/24
BGP routing table entry for 192.168.3.0/24, version 18
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to update-groups:
        1
  Local, (aggregated by 112 12.12.12.12), (injected path from 192.168.0.0/21)
    10.1.2.1 from 10.1.2.1 (1.1.1.1)
      Origin incomplete, localpref 100, valid, external, atomic-aggregate, best

Looking good.  You’ll notice that compared to the last post the aggregator has changed to the RID of Sw2.

Shall we see what R1 thinks about this 192.168.3.0/24 route of ours?

R1#sh ip bgp 192.168.3.0
BGP routing table entry for 192.168.3.0/24, version 17
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to update-groups:
        1    2
  2313, (aggregated by 112 12.12.12.12)
    10.1.13.13 from 10.1.13.13 (13.13.13.13)
      Origin incomplete, localpref 100, valid, external, atomic-aggregate, best

I believe this is where one is supposed to yell SUCCESS!!!

R1 now thinks that the next hop for 192.168.3.0/24 is 10.1.13.13, the address of Sw3. We know that Sw3 is learning that route from R2, and R2 is learning the aggregate from R1.  This, ladies and gentlemen, is a routing loop.

As always, we verify. This time from R3.

R3#trace 192.168.3.1

Type escape sequence to abort.
Tracing the route to 192.168.3.1

  1 10.23.13.2 0 msec 0 msec 0 msec
  2 10.1.2.1 4 msec 0 msec 4 msec
  3 10.1.13.13 0 msec 4 msec 0 msec
  4 10.23.13.2 4 msec 0 msec 0 msec
  5 10.1.2.1 0 msec 4 msec 0 msec
  6 10.1.13.13 4 msec 0 msec 4 msec
  7 10.23.13.2 0 msec 4 msec 0 msec
  8 10.1.2.1 4 msec 4 msec 0 msec
  9 10.1.13.13 4 msec 4 msec 0 msec
 10 10.23.13.2 4 msec 0 msec 4 msec
 11 10.1.2.1 0 msec 0 msec 4 msec
 12 10.1.13.13 4 msec 4 msec 8 msec
 13 10.23.13.2 0 msec 4 msec 0 msec
 14 10.1.2.1 4 msec 4 msec 4 msec
 15 10.1.13.13 4 msec 4 msec 4 msec
 16 10.23.13.2 0 msec 4 msec 4 msec
 17 10.1.2.1 4 msec 0 msec 4 msec
 18 10.1.13.13 4 msec 4 msec 4 msec
 19 10.23.13.2 4 msec 0 msec 4 msec
 20 10.1.2.1 4 msec 4 msec 4 msec
 21 10.1.13.13 8 msec 4 msec 4 msec
 22 10.23.13.2 4 msec 0 msec 4 msec
 23 10.1.2.1 4 msec 4 msec 4 msec
 24  *
    10.1.13.13 4 msec 4 msec
 25 10.23.13.2 0 msec 0 msec 4 msec
 26 10.1.2.1 4 msec 4 msec 4 msec
 27 10.1.13.13 4 msec 4 msec 4 msec
 28 10.23.13.2 4 msec 4 msec 4 msec
 29 10.1.2.1 4 msec 4 msec 4 msec
 30 10.1.13.13 4 msec 4 msec 4 msec

Isn’t it beautiful?  My very own BGP Conditionally Injected Loop!

I suppose the next question is: why does this work? The simple answer is that the BGP loop detection mechanism is rendered useless because the injected route doesn’t carry the AS path of the aggregate. Without that information, R1 sees no sign of its own AS in the path, so it gladly accepts the advertisement and assumes it’s valid. If we could copy the AS path attribute to the injected route this issue wouldn’t happen… If only there was a way to do that… Maybe there was something in the last post that could help?

R2(config-router)# bgp inject-map ADVERTISE-MAP-1 exist-map EXIST-MAP-1 ?
  copy-attributes  Copy attributes from aggregate
  <cr>

That seems like a good bet! I’ll go ahead and add it to the bgp inject-map command on R2 and let things reconverge. Once they do, this is what we have on R1.


R1#sh ip bgp 192.168.3.0
BGP routing table entry for 192.168.0.0/21, version 15
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to update-groups:
        2
  (12), (aggregated by 112 12.12.12.12)
    12.12.12.12 (metric 1) from 12.12.12.12 (12.12.12.12)
      Origin IGP, metric 0, localpref 100, valid, confed-external, atomic-aggregate, best

It would seem that our /24 is no longer present; the lookup now falls through to the 192.168.0.0/21 aggregate. A quick check on Sw3 as well:

Sw3#sh ip bgp 192.168.3.0
BGP routing table entry for 192.168.3.0/24, version 28
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to update-groups:
     2
  112, (aggregated by 112 12.12.12.12)
    10.1.2.1 (metric 65) from 3.3.3.3 (3.3.3.3)
      Origin IGP, metric 0, localpref 100, valid, internal, atomic-aggregate, best
      Originator: 2.2.2.2, Cluster list: 3.3.3.3

Yup, our AS path shows that this route originated in AS112, thereby allowing the BGP loop detection mechanism to kick in and force R1 to discard the route. We have unlooped the loop.
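
For reference, the change on R2 boils down to one line. This is only a sketch, and it assumes the inject-map and exist-map names are the same ones shown in the help output earlier:

```
R2(config-router)#bgp inject-map ADVERTISE-MAP-1 exist-map EXIST-MAP-1 copy-attributes
```

With copy-attributes in place, the injected /24 should inherit the aggregate's attributes, AS path included, which is what the Sw3 output above reflects.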

Of course there are many other ways that this could be solved as well. Setting the no-export community on the injected route comes to mind. Simple filtering at the AS boundary would also kill the loop dead. I’m sure there are more. If you can think of any, by all means post them in the comments.
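
As a rough sketch of the first two ideas, something like the following might do it. Treat it as untested. I'm assuming ADVERTISE-MAP-1 is the route-map that sets the injected prefix, the prefix-list names INJECTED-ROUTE and BLOCK-INJECTED are made up for illustration, and R1 peers with Sw3 at 10.1.13.13 under router bgp 1 as seen in the earlier output.

```
! Option 1, on R2: tag the injected route with no-export.
! INJECTED-ROUTE is a hypothetical prefix-list matching 192.168.3.0/24;
! the real ADVERTISE-MAP-1 from the last post may look different.
route-map ADVERTISE-MAP-1 permit 10
 set ip address prefix-list INJECTED-ROUTE
 set community no-export
!
! Option 2, on R1: drop the /24 as it comes back in from Sw3.
! BLOCK-INJECTED is a hypothetical name.
ip prefix-list BLOCK-INJECTED seq 5 deny 192.168.3.0/24
ip prefix-list BLOCK-INJECTED seq 10 permit 0.0.0.0/0 le 32
!
router bgp 1
 neighbor 10.1.13.13 prefix-list BLOCK-INJECTED in
```

Keep in mind that IOS doesn't send communities to neighbors by default, so option 1 would also need send-community configured on each iBGP hop between R2 and Sw3 for the tag to survive that far.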



