Thursday, February 24, 2011

IPv6 Will Make You Think Differently: We Don't Need No Stinking FHRP

I got into a discussion about HSRP for IPv6 being available on higher end Cisco platforms but not the lower ones.  I didn't think too much of the question when I first saw it, but while reading up on IPv6 last night I came across some details in Neighbour Discovery that I thought might make a good replacement.  Deep within RFC 2461 lies section 6.3.6:

6.3.6.  Default Router Selection

   The algorithm for selecting a router depends in part on whether or
   not a router is known to be reachable.  The exact details of how a
   node keeps track of a neighbor's reachability state are covered in
   Section 7.3.  The algorithm for selecting a default router is invoked
   during next-hop determination when no Destination Cache entry exists
   for an off-link destination or when communication through an existing
   router appears to be failing.  Under normal conditions, a router
   would be selected the first time traffic is sent to a destination,
   with subsequent traffic for that destination using the same router as
   indicated in the Destination Cache modulo any changes to the
   Destination Cache caused by Redirect messages.

   The policy for selecting routers from the Default Router List is as
   follows:

     1) Routers that are reachable or probably reachable (i.e., in any
        state other than INCOMPLETE) SHOULD be preferred over routers
        whose reachability is unknown or suspect (i.e., in the
        INCOMPLETE state, or for which no Neighbor Cache entry exists).
        An implementation may choose to always return the same router or
        cycle through the router list in a round-robin fashion as long
        as it always returns a reachable or a probably reachable router
        when one is available.

     2) When no routers on the list are known to be reachable or
        probably reachable, routers SHOULD be selected in a round-robin
        fashion, so that subsequent requests for a default router do not
        return the same router until all other routers have been
        selected.

        Cycling through the router list in this case ensures that all
        available routers are actively probed by the Neighbor
        Unreachability Detection algorithm.  A request for a default
        router is made in conjunction with the sending of a packet to a
        router, and the selected router will be probed for reachability
        as a side effect.

     3) If the Default Router List is empty, assume that all
        destinations are on-link as specified in Section 5.2.
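
The three rules above can be sketched in a few lines of Python.  This is only a rough model under stated assumptions: the class and field names are made up for illustration, the "probably reachable" states are taken to be every Neighbor Cache state other than INCOMPLETE, and rule 1 uses the "always return the same router" option by returning the first reachable entry in the list.

```python
from dataclasses import dataclass
from typing import List, Optional

# Neighbor Cache states treated as "reachable or probably reachable".
# INCOMPLETE, or no cache entry at all (None), counts as unknown/suspect.
REACHABLE_STATES = {"REACHABLE", "STALE", "DELAY", "PROBE"}


@dataclass
class Router:
    address: str
    nc_state: Optional[str]  # e.g. "REACHABLE", "INCOMPLETE", or None


class DefaultRouterList:
    """Hypothetical host-side model of the policy in section 6.3.6."""

    def __init__(self, routers: List[Router]):
        self.routers = routers
        self._next = 0  # round-robin cursor, used only for rule 2

    def select(self) -> Optional[Router]:
        if not self.routers:
            return None  # rule 3: treat every destination as on-link
        for router in self.routers:
            if router.nc_state in REACHABLE_STATES:
                return router  # rule 1: first reachable router wins
        # rule 2: nothing is known to be reachable, so cycle through the list
        router = self.routers[self._next % len(self.routers)]
        self._next += 1
        return router
```

With two routers that are both in the INCOMPLETE state, repeated calls to select() alternate between them, which is what lets Neighbor Unreachability Detection probe each one in turn.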


Let's take a look at how we can use this to our advantage.

For this discussion I'm using the following topology.



TheInternet is configured as an IPv6 host at :10, each of the three routers uses its own router number as the last "chunk" of its address in the relevant subnet, and fc00:12::2 off Router2 is just a loopback.  All the interfaces of the routers are running OSPFv3 and have full reachability.  TheInternet does not have a default gateway configured.

So the test here is to see what kind of failover we can get out of IPv6 ND.

Here's some show command output to set the stage.  TheInternet has learned of two routers via ND Router Advertisements (RAs).  The traceroute then shows that TheInternet can reach Router2, and the routing table shows that there is no default route configured.
TheInternet#sh ipv6 routers
Router FE80::1 on Vlan111, last update 0 min
  Hops 64, Lifetime 1 sec, AddrFlag=0, OtherFlag=0, MTU=1500
  HomeAgentFlag=0, Preference=High
  Reachable time 0 msec, Retransmit time 0 msec
  Prefix FC00:16::/64 onlink autoconfig
    Valid lifetime 2592000, preferred lifetime 604800
Router FE80::3 on Vlan111, last update 0 min
  Hops 64, Lifetime 1 sec, AddrFlag=0, OtherFlag=0, MTU=1500
  HomeAgentFlag=0, Preference=Medium
  Reachable time 0 msec, Retransmit time 0 msec
  Prefix FC00:16::/64 onlink autoconfig
    Valid lifetime 2592000, preferred lifetime 604800

TheInternet#traceroute ipv6 fc00:11::2

Type escape sequence to abort.
Tracing the route to FC00:11::2

  1 FC00:16::1 0 msec 0 msec 0 msec
  2 FC00:11::2 0 msec 0 msec 0 msec
TheInternet#sh ipv6 ro
IPv6 Routing Table - Default - 3 entries
Codes: C - Connected, L - Local, S - Static, U - Per-user Static route
C   FC00:16::/64 [0/0]
     via Vlan111, directly connected
L   FC00:16::10/128 [0/0]
     via Vlan111, receive
L   FF00::/8 [0/0]
     via Null0, receive
As a side note, IPv6 actually specifies that you SHOULD NOT configure a default gateway on IPv6 hosts so that they learn it via RAs.

You'll notice in the above show output that Router1 has a preference of "High" while Router3 has a preference of "Medium".  I manually configured Router1 to have a High preference using the interface command:
(config-if)# ipv6 nd ra router-preference high
The default is medium, and there is also a low option.  This gives you some control over which gateway the hosts will use.  If you leave all the gateways at the default then you're at the mercy of the OS vendors to determine how they want to do it (the RFC says that always returning the same router and cycling through the list round-robin are both OK).
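
As a rough illustration of how a host might combine the advertised preference with reachability, here is a small Python sketch.  The function name and the tuple layout are hypothetical, and breaking ties by list order is just one of the choices an OS vendor could make.

```python
# Higher number means more preferred; labels match the "sh ipv6 routers" output.
PREFERENCE_RANK = {"High": 2, "Medium": 1, "Low": 0}


def pick_by_preference(routers):
    """routers: list of (address, preference, reachable) tuples (hypothetical shape)."""
    if not routers:
        return None
    # Prefer reachable routers; fall back to the full list if none are reachable.
    candidates = [r for r in routers if r[2]] or routers
    # max() keeps the first entry on a tie, i.e. "always return the same router".
    return max(candidates, key=lambda r: PREFERENCE_RANK[r[1]])[0]


print(pick_by_preference([("fe80::1", "High", True), ("fe80::3", "Medium", True)]))
print(pick_by_preference([("fe80::1", "High", False), ("fe80::3", "Medium", True)]))
```

The first call picks fe80::1 because of its High preference; the second picks fe80::3 because a reachable Medium router beats an unreachable High one.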

What's also important to note here is that our host knows about 2 valid gateways.  They don't show up in the routing table, but they are known, and the traceroute proves that our host will use them as required.  Pretty cool.


So how about some failover?  Sure, why the heck not.

In the interest of pushing the limits I've modified the RA timers to VERY aggressive settings. There are two RA timers that we're interested in here.
  • RA Interval: The time between successive RAs sent by a router
  • RA Lifetime: How long a host should keep using the advertising router as a default gateway after receiving an RA
For this test the intervals are set to 50 and 70 msec for Router3 and Router1 respectively (their lowest values) and the lifetime is set to 1 second for both.
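
To put those numbers in perspective, here is a back-of-the-envelope calculation in Python.  It treats the configured interval as a fixed period (real routers randomize the gap between RAs, so this is a simplification), and the function name is made up for illustration.

```python
def ras_per_lifetime(interval_ms: int, lifetime_ms: int) -> int:
    """How many RAs fit inside one router lifetime at a fixed interval."""
    return lifetime_ms // interval_ms


LIFETIME_MS = 1000  # the 1 second lifetime used in this test
for name, interval_ms in (("Router3", 50), ("Router1", 70)):
    print(name, ras_per_lifetime(interval_ms, LIFETIME_MS))
```

That works out to 20 RAs per lifetime for Router3 and 14 for Router1.  A host would have to miss well over a dozen consecutive RAs before a router's entry aged out on its own, and a router that vanished without warning could linger in the list for up to roughly one lifetime (1 second here) after its last RA.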

I've also set up some ACLs on all the interfaces facing the fc00:11::/64 subnet to catch the ICMP counts so we know where things are going. 

Let's run a ping from TheInternet to Router2's loopback, say 1000 of them with a timeout of 0 to really flood them out there.  I'm going to start the ping and then shut down the interface on Router1 facing TheInternet so we can see if things fail over.
TheInternet#ping fc00:12::2 rep 1000 time 0

Type escape sequence to abort.
Sending 1000, 100-byte ICMP Echos to FC00:12::2, timeout is 0 seconds:
!!!!!!!!!!!!!!!!!.!!!!!!.!!!!!.!!!!!!!!!!!.!!!!!.!!!!!.!!!!!!!!!!!.!!!
!!!!!!!!!!!!!!!!!!!.!!!!!.!!!!!!!!!!!!!!!!!!!!.!!!!.!!!!!.!!!!!.!!!!!!
!!!!!!!!!!.!!!!!!!!!!.!!!!!.!!!!!.!!..................................
......................................................................
......................................................................
......................................................................
......................................................................
......................................................................
......................................................................
......................................................................
..............!!!!!.!!!!.!!!!.!!!!!!!!!!!.!!!!.!!!!!!!!.!!!.!!!!!!!!!!
!!!!!!!.!!!!.!!!!!!!!!!!.!!.!!!!.!!!!.!!!!!!!!.!!!!!!!!.!!!!.!!!!!!!!!
!!!!!!!!!!!.!!!.!!!!.!!!.!!!!.!!!!!.!!!!.!!!!!!!!!!!.!!.!!!!!!!!.!!!!!
!!.!.!!!!.!!!!.!!!!!!!!!!!!.!!!!.!!!!!!!!.!.!!!!!!!!!!!!!!!!.!!!!!!!!!
!!!!!!!!!!!!!!!!!!!.
Success rate is 40 percent (409/1000), round-trip min/avg/max = 0/0/9 ms
Remember, the timeout is zero, so a lot of those scattered misses aren't really lost.  The big chunk in the middle is true drops, but not the junk around the edges.


So where did our traffic go?
Router1#sh ipv6 access
IPv6 access list icmp_out
    permit icmp any any echo-request (176 matches) sequence 10
    permit ipv6 any any (2 matches) sequence 20
IPv6 access list icmp_in
    permit icmp any any echo-reply (716 matches) sequence 10
    permit ipv6 any any (1196 matches) sequence 20
Router3#sh ipv6 access
IPv6 access list icmp_out
    permit icmp any any echo-request (824 matches) sequence 10
    permit ipv6 any any sequence 20
IPv6 access list icmp_in
    permit icmp any any echo-reply (286 matches) sequence 10
    permit ipv6 any any (207 matches) sequence 20
Router2#sh ipv6 access
IPv6 access list icmp_in
    permit icmp any any echo-request (1000 matches) sequence 10
    permit icmp any any unreachable (2 matches) sequence 15
    permit ipv6 any any (1383 matches) sequence 20
Looking at our hit counts, all 1000 pings made it to the destination.  We know all the pings didn't make it back due to the big loss block in the middle, but the echo-reply counters add up to 1002 (716 on Router1 plus 286 on Router3)...  1002?  Why 1002?  Where did the extra 2 come from?  They come from the 2 ICMP unreachable packets that R2 received!  That explains that, but what about the perceived loss on TheInternet?  Well, that was actually the time it took for OSPF to converge.  Router2 continued to forward the return traffic to Router1, which in turn dropped it because it no longer had a route to the destination.  Once OSPF re-converged, return traffic flowed through Router3 and life was good.
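
For anyone who wants to check the arithmetic, the counters can be tallied in a few lines of Python.  The numbers are copied from the access list output above; the variable names are just labels chosen for this sketch.

```python
# Hit counts copied from the "sh ipv6 access" output.
icmp_out_echo_request = {"Router1": 176, "Router3": 824}
icmp_in_echo_reply = {"Router1": 716, "Router3": 286}
router2_echo_request = 1000
router2_unreachable = 2

requests_forwarded = sum(icmp_out_echo_request.values())
replies_counted = sum(icmp_in_echo_reply.values())

print(requests_forwarded)                      # 1000
print(replies_counted)                         # 1002
print(replies_counted - router2_echo_request)  # 2
print(409 * 100 // 1000)                       # 40
```

The surplus of 2 echo-reply matches lines up with the 2 unreachable matches on Router2, and the last line reproduces the "40 percent" that IOS reports for 409 successes out of 1000.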

As another side note, I set the OSPF timers to 1 second hello, and 3 second dead.  I wanted to go lower but it turns out OSPF Fast Hellos are not available in OSPFv3 at this time.

What does all this prove?  It proves that you no longer NEED an FHRP in IPv6.  Should you still use one?  Well that depends on your needs and your own network.  There are certainly some limitations here.  For example, I could not for the life of me locate any method with which I could tie in a Tracker to modify the router preference, so if the upstream interface failed you'd still be dead in the water.  You also can't really load balance like you can with GLBP since that part would be entirely up to the OS vendor in IPv6.

To finish off this post I dug just a little bit deeper into the fact that not one single packet was lost on the way to Router2.  That was a very impressive feat for IPv6 (even if all the replies didn't make it back due to OSPF converging).  I decided to 'debug ipv6 nd' on Router1 and then shut down the same interface I was shutting down for my test.
Router1(config-if)#shut
Router1(config-if)#
*Feb 24 21:58:54.651: ICMPv6-ND: Sending Final RA on GigabitEthernet0/0 from FE80::1
*Feb 24 21:58:54.651: ICMPv6-ND: Freeing RA context for FE80::1
*Feb 24 21:58:54.651: ICMPv6-ND: REACH -> DELETE: FC00:16::10
*Feb 24 21:58:54.651: ICMPv6-ND: STALE -> DELETE: FE80::3
*Feb 24 21:58:54.655: %OSPFv3-5-ADJCHG: Process 1, Nbr 3.3.3.3 on GigabitEthernet0/0 from FULL to DOWN, Neighbor Down: Interface down or detached
*Feb 24 21:58:54.655: ICMPv6-ND: Address FC00:16::1/64 is down on GigabitEthernet0/0
*Feb 24 21:58:54.655: ICMPv6-ND: Linklocal FE80::1 on GigabitEthernet0/0, Down
*Feb 24 21:58:54.655: ICMPv6-ND: Address FE80::1/10 is down on GigabitEthernet0/0
Router1(config-if)#
*Feb 24 21:58:54.663: ICMPv6-ND: DELETE -> INCMP: FC00:16::10
Router1(config-if)#
*Feb 24 21:58:56.651: %LINK-5-CHANGED: Interface GigabitEthernet0/0, changed state to administratively down
Router1(config-if)#
*Feb 24 21:58:57.651: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/0, changed state to down
Hah! Look at that.  As soon as I gracefully shut down the port, IPv6 sent a "Final RA" out to the segment informing the hosts that it was shutting down.  This explains why no packets were lost...  TheInternet knew it should start forwarding to Router3 because Router1 told it that it was no longer a valid gateway.
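
Here is a rough Python sketch of what the host side of that exchange might look like.  It assumes the "Final RA" in the debug output is the zero-lifetime advertisement that the Neighbor Discovery RFC describes for a router that stops advertising on an interface; the class and method names are hypothetical.

```python
class DefaultRouters:
    """Hypothetical host-side table: router address -> absolute expiry time (seconds)."""

    def __init__(self):
        self.expiry = {}

    def on_ra(self, address, router_lifetime_s, now):
        if router_lifetime_s == 0:
            # Zero lifetime: stop using this router immediately.
            self.expiry.pop(address, None)
        else:
            # Any other RA refreshes the router's expiry timer.
            self.expiry[address] = now + router_lifetime_s

    def usable(self, now):
        # Routers whose lifetime has not yet run out.
        return sorted(a for a, t in self.expiry.items() if t > now)


hosts_view = DefaultRouters()
hosts_view.on_ra("fe80::1", 1, now=0.0)
hosts_view.on_ra("fe80::3", 1, now=0.0)
hosts_view.on_ra("fe80::1", 0, now=0.6)  # the zero-lifetime "Final RA"
print(hosts_view.usable(now=0.6))
```

After the zero-lifetime RA only fe80::3 is left in the list, without the host having to wait for the 1 second lifetime to expire.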


I'm unable to test this by physically pulling the cable out of the interface (this lab is across town from where I was when I was doing the testing) but I will try to get that test done and post an update to see how things go.

1 comment:

  1. I still haven't been able to follow this up properly re: not gracefully shutting down the interface. However, stretch over at Packetlife.net has done similar testing by actually pulling the cable and posted up the results. I encourage anyone who's interested to head on over and check out his findings.

    http://packetlife.net/blog/2011/apr/18/ipv6-neighbor-discovery-high-availability/
