OSPF inter-area and intra-area routing rules


The following lab focuses on intra-area and inter-area route selection process.

For the sake of clarity, I put the final conclusions first, wrapped in a table form, with some explanations to ponder upon, followed by the different lab cases used to check OSPF route selection rules.

For each case, I used interface costs and states to illustrate OSPF selection rules in action.

 

Order of preference and criteria Rules
1. Intra-area (O)

  • Lowest cost
  • Multipath
– Intra-area routes are always preferred over inter-area ones.

– Intra-area routing to a destination inside a non-backbone area will take the shortest path without traversing the backbone area.- Intra-area routing to a destination inside a backbone area will take the shortest path without traversing a non-backbone area.
– ABR’s advertise only intra-area routes from non-backbone area to the backbone area and advertise intra-area and inter-area routes from backbone area to a non-backbone area.
– ABRs do not take into account in SPF calculations LSAs received from non-backbone areas.
2. Inter-area (IA) – Inter-area route between two non-backbone areas must pass through the backbone area.
– Inter-area route will take the path with the shortest total cost.
3. External routes
3a. Type 1:

  • Lowest total cost
  • Multipath

3b. Type 2:

  • Redistribution cost
  • Total cost
  • Multipath
For more information about comparing OSPF external routes, please refer to the lab OSPF external E1, E2, N1, N2…Who is the winner?

 

  • References from RFCs:

rfc3509

OSPF prevents inter-area routing loops by implementing a split-horizon mechanism, allowing ABRs to inject into the backbone only Summary-LSAs derived from the intra-area routes, and limiting ABRs’ SPF calculation to consider only Summary-LSAs in the backbone area’s link-state database.

 

rfc2328

Routing in the Autonomous System takes place on two levels, depending on whether the source and destination of a packet reside in the same area (intra-area routing is used) or different areas (inter-area routing is used). In intra-area routing, the packet is routed solely on information obtained within the area; no routing information obtained from outside the area can be used.   This protects intra-area routing from the injection of bad routing information.

 

3.2.   Inter-area routingWhen routing a packet between two non-backbone areas the backbone is used. The path that the packet will travel can be broken up into three contiguous pieces: an intra-area path from the source to an area border router, a backbone path between the source and destination areas, and then another intra-area path to the destination. The algorithm finds the set of such paths that have the smallest cost.The topology of the backbone dictates the backbone paths used between areas.

 


There are four possible types of paths used to route traffic to the destination, listed here in decreasing order of preference:
intra-area, inter-area, type 1 external or type 2 external.

To understand OSPF mechanism of loop prevention, think conceptually of OSPF areas as nodes in a loop-free tree with depth never bigger than 2.

 

OSPF tree: loop-free

OSPF tree: loop-free

You can visually see why 2 non-backbone areas cannot directly exchange routes and they must have area0 as an intermediate area to avoid loops:

 

OSPF tree: loop

OSPF tree: loop

Important notes:

  • Throughout the lab, I am using cost to manipulate route selection.

  • OSPF takes into account the cost of output interface toward the destination, so be careful when you change the cost on one end of a link, this can cause unwanted asymmetric routing.

  • IGP protocols split the router (advertise routes through interfaces) whereas BGP splits the link between routers, this fundamental difference should be clearly depicted in the topology to avoid confusion.

  • If you are advertising your loopback networks with mask less than 32 you will have to to set their ospf network type point-to-point (refer to this lab for more information).

  • Observe the ospf database inf. for LSA3 “Routing Bit Set on this LSA“, this is a Cisco-specific implementation of OSPF protocol, indicating that a specific LSA is taken into account in the calculation of the best route.

  • Multipath selection is considered locally through FIB and provided by CEF load balancing mechanism, if there next-hops leading to the same destination.

 

Low-level lab design topology

Here is the lab topology used for testing:

Figure3: Low Level Design Lab topology

Figure3: Low Level Design Lab topology

 

Test cases

Case1:

  • Traffic between R1 10.10.0.1 (area 123) to R5 50.10.0.5 (area0)
  • Default interface ospf costs
Figure4: Case1

Figure4: Case1

R1#Ping 50.10.0.5 source 10.10.0.1 repeat 5

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 50.10.0.5, timeout is 2 seconds:
Packet sent with a source address of 10.10.0.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/27/40 ms
R1#trace 50.10.0.5 source 10.10.0.1

Type escape sequence to abort.
Tracing the route to 50.10.0.5

  1 192.168.31.3 8 msec
    192.168.21.2 12 msec
    192.168.31.3 16 msec
  2 192.168.42.4 16 msec
    192.168.43.4 16 msec
    192.168.42.4 32 msec
  3 192.168.54.5 28 msec 40 msec 40 msec
R1#sh ip route 50.10.0.5

Routing entry for 50.10.0.5/32

  Known via "ospf 666", distance 110, metric 4, type inter area

  Last update from 192.168.12.2 on FastEthernet1/0, 00:42:05 ago

  Routing Descriptor Blocks:

  * 192.168.13.3, from 3.3.3.3, 00:42:15 ago, via FastEthernet1/1

      Route metric is 4, traffic share count is 1

    192.168.12.2, from 2.2.2.2, 00:42:05 ago, via FastEthernet1/0

      Route metric is 4, traffic share count is 1

R1#
R1#sh ip ospf database summary 50.10.0.5

            OSPF Router with ID (1.1.1.1) (Process ID 666)

        Summary Net Link States (Area 123)

  Routing Bit Set on this LSA

  LS age: 543

  Options: (No TOS-capability, DC, Upward)

  LS Type: Summary Links(Network)

  Link State ID: 50.10.0.5 (summary Network Number)

  Advertising Router: 2.2.2.2

  LS Seq Number: 80000002

  Checksum: 0x32BD

  Length: 28

  Network Mask: /32

    TOS: 0     Metric: 3 

  Routing Bit Set on this LSA

  LS age: 587

  Options: (No TOS-capability, DC, Upward)

  LS Type: Summary Links(Network)

  Link State ID: 50.10.0.5 (summary Network Number)

  Advertising Router: 3.3.3.3

  LS Seq Number: 80000002

  Checksum: 0x14D7

  Length: 28

  Network Mask: /32

    TOS: 0     Metric: 3 

R1#

R1#

 

Case2:

  • Traffic from R1 10.10.0.1 (area123) to R5 50.20.0.5 (backbone)
  • R1 fa1/0 cost = 10
  • R2 fa1/1 cost = 10
Figure5: Case2

Figure5: Case2

Making two inter-area paths with unequal total costs, (unequal intra-area costs)

R1#trace 50.10.0.5 source 10.10.0.1

Type escape sequence to abort.
Tracing the route to 50.10.0.5

  1  *
    192.168.13.3 12 msec 28 msec
  2  *
    192.168.34.4 16 msec 16 msec
  3  *
    192.168.45.5 44 msec 44 msec
R1#sh ip route 50.10.0.5
Routing entry for 50.10.0.5/32
  Known via "ospf 666", distance 110, metric 4, type inter area
  Last update from 192.168.13.3 on FastEthernet1/1, 00:48:22 ago
  Routing Descriptor Blocks:
  * 192.168.13.3, from 3.3.3.3, 01:06:54 ago, via FastEthernet1/1
      Route metric is 4, traffic share count is 1

R1#

R1#sh ip ospf database summary 50.10.0.5

            OSPF Router with ID (1.1.1.1) (Process ID 666)

        Summary Net Link States (Area 123)

  LS age: 827
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 50.10.0.5 (summary Network Number)
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000007
  Checksum: 0x825F
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 12 

  Routing Bit Set on this LSA
  LS age: 90
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 50.10.0.5 (summary Network Number)
  Advertising Router: 3.3.3.3
  LS Seq Number: 8000000A
  Checksum: 0x4DF
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 3 

R1#

 

R5#trace 10.10.0.1 source 50.10.0.5

Type escape sequence to abort.
Tracing the route to 10.10.0.1

  1 192.168.45.4 8 msec 4 msec 8 msec
  2 192.168.34.3 16 msec *  32 msec
  3  *
    192.168.13.1 44 msec *
R5#

R5#sh ip ospf database summ 10.10.0.1

            OSPF Router with ID (5.5.5.5) (Process ID 666)

        Summary Net Link States (Area 0)

  LS age: 194
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 10.10.0.1 (summary Network Number)
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000007
  Checksum: 0x50C7
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 2 

  Routing Bit Set on this LSA
  LS age: 691
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 10.10.0.1 (summary Network Number)
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000008
  Checksum: 0x30E2
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 2 

        Summary Net Link States (Area 25)

  LS age: 198
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 10.10.0.1 (summary Network Number)
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000007
  Checksum: 0x50C7
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 2 

  LS age: 203
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 10.10.0.1 (summary Network Number)
  Advertising Router: 5.5.5.5
  LS Seq Number: 80000007
  Checksum: 0xAFF
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 4 

R5#

Note that, for the return traffic R5 will receive both summary LSA3 from R2 and R3, but will take into account only R3 because of the ABR’s router ID = 3.3.3.3

Multipath is not considered because there is only one next-hop (R4) in the FIB.

Case3:

  • Traffic from R1 10.10.0.1 (area 123) to R5 50.10.0.2 (backbone)
  • R1 fa1/0 cost = 10
  • R3 fa1/2 cost = 100
Figure6: Case3

Figure6: Case3

R1#sh ip ospf database summ 50.10.0.5

            OSPF Router with ID (1.1.1.1) (Process ID 666)

        Summary Net Link States (Area 123)

  Routing Bit Set on this LSA
  LS age: 697
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 50.10.0.5 (summary Network Number)
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000004
  Checksum: 0x2EBF
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 3

  LS age: 46
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 50.10.0.5 (summary Network Number)
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000002
  Checksum: 0xF592
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 102

R1#      
R1#sh ip route 50.10.0.5             
Routing entry for 50.10.0.5/32
  Known via "ospf 666", distance 110, metric 13, type inter area
  Last update from 192.168.12.2 on FastEthernet1/0, 00:01:22 ago
  Routing Descriptor Blocks:
  * 192.168.12.2, from 2.2.2.2, 00:01:22 ago, via FastEthernet1/0
      Route metric is 13, traffic share count is 1

R1#
R1#trace 50.10.0.5 source 10.10.0.1         

Type escape sequence to abort.
Tracing the route to 50.10.0.5

  1 192.168.12.2 20 msec 20 msec 20 msec
  2 192.168.24.4 28 msec 20 msec 24 msec
  3 192.168.45.5 28 msec 36 msec 40 msec
R1#

 

With unequal costs to ABRs and unequal costs advertised by ABRs, R1 OSPF has chosen the path with the lowest total cost to destination: cost to ABRs + cost of LSA3 summary advertised by each ABR.

Case4:

  • Traffic from R1 10.10.0.1 (area 123) to R5 50.10.0.2 (backbone)
  • R1 fa1/0 cost = 10
  • R3 fa1/2 cost = 10
Figure7: Case4

Figure7: Case4

R1#sh ip ospf database summ 50.10.0.5    

            OSPF Router with ID (1.1.1.1) (Process ID 666)

        Summary Net Link States (Area 123)

  Routing Bit Set on this LSA
  LS age: 1072
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 50.10.0.5 (summary Network Number)
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000004
  Checksum: 0x2EBF
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 3

  Routing Bit Set on this LSA
  LS age: 12
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 50.10.0.5 (summary Network Number)
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000003
  Checksum: 0x6C75
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 12

R1#
R1#sh ip route 50.10.0.5                 
Routing entry for 50.10.0.5/32
  Known via "ospf 666", distance 110, metric 13, type inter area
  Last update from 192.168.13.3 on FastEthernet1/1, 00:01:21 ago
  Routing Descriptor Blocks:
    192.168.13.3, from 3.3.3.3, 00:01:21 ago, via FastEthernet1/1
      Route metric is 13, traffic share count is 1
  * 192.168.12.2, from 2.2.2.2, 00:08:09 ago, via FastEthernet1/0
      Route metric is 13, traffic share count is 1

R1#
R1#trace 50.10.0.5 source 10.10.0.1  

Type escape sequence to abort.
Tracing the route to 50.10.0.5

  1 192.168.13.3 8 msec
    192.168.12.2 8 msec
    192.168.13.3 8 msec
  2 192.168.24.4 20 msec
    192.168.34.4 24 msec
    192.168.24.4 16 msec
  3 192.168.45.5 20 msec 32 msec 24 msec
R1#

 

With unequal costs to ABRs and unequal costs advertised by ABRs, R1 OSPF has chosen multipath because of the equal total cost to destination: cost to ABRs + cost of LSA3 summary advertised by each ABR.

Case5:

  • Traffic from R5 50.10.0.5 (backbone) to R1 10.10.0.1 (area 123)
  • R3 fa1/1 cost = 10
Figure8: Case5

Figure8: Case5

R5#sh ip ospf database summary 10.10.0.1

            OSPF Router with ID (50.10.0.5) (Process ID 666)

        Summary Net Link States (Area 0)

  Routing Bit Set on this LSA
  LS age: 1906
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 10.10.0.1 (summary Network Number)
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000011
  Checksum: 0x3CD1
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 2

  LS age: 19
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 10.10.0.1 (summary Network Number)
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000003
  Checksum: 0x947A
  Length: 28
  Network Mask: /32
        TOS: 0     Metric: 11
          
...
R5#
R5#sh ip route 10.10.0.1                
Routing entry for 10.10.0.1/32
  Known via "ospf 666", distance 110, metric 4, type inter area
  Last update from 192.168.45.4 on FastEthernet1/0, 00:02:53 ago
  Routing Descriptor Blocks:
  * 192.168.45.4, from 2.2.2.2, 00:02:53 ago, via FastEthernet1/0
      Route metric is 4, traffic share count is 1

R5#
R5#trace 10.10.0.1 source 50.10.0.5     

Type escape sequence to abort.
Tracing the route to 10.10.0.1

  1 192.168.45.4 4 msec 12 msec 8 msec
  2 192.168.24.2 24 msec 20 msec 20 msec
  3 192.168.12.1 20 msec 28 msec 20 msec
R5#

 

With equal paths to ABRs R2 and R3, R5 ospf choose the path with the lowest total cost (cost to ABR + cost advertised by ABR)

Case6:

  • Traffic from R5 50.10.0.5 (backbone) to R1 10.10.0.1 (area 123)
  • R3 fa1/1 cost = 10
  • R4 fa1/1 cost = 5
Figure9: Case6

Figure9: Case6

R5#sh ip ospf database summary 10.10.0.1

            OSPF Router with ID (50.10.0.5) (Process ID 666)

        Summary Net Link States (Area 0)

  Routing Bit Set on this LSA
  LS age: 573
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 10.10.0.1 (summary Network Number)
  Advertising Router: 2.2.2.2
  LS Seq Number: 80000012
  Checksum: 0x3AD2
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 2

  LS age: 710
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 10.10.0.1 (summary Network Number)
  Advertising Router: 3.3.3.3
  LS Seq Number: 80000003
  Checksum: 0x947A
  Length: 28
  Network Mask: /32
        TOS: 0     Metric: 11
          
...   
R5#
R5#sh ip route 10.10.0.1                
Routing entry for 10.10.0.1/32
  Known via "ospf 666", distance 110, metric 8, type inter area
  Last update from 192.168.45.4 on FastEthernet1/0, 00:02:49 ago
  Routing Descriptor Blocks:
  * 192.168.45.4, from 2.2.2.2, 00:02:49 ago, via FastEthernet1/0
      Route metric is 8, traffic share count is 1

R5#
R5#trace 10.10.0.1 source 50.10.0.5     

Type escape sequence to abort.
Tracing the route to 10.10.0.1

  1 192.168.45.4 16 msec 12 msec 8 msec
  2 192.168.24.2 20 msec 20 msec 20 msec
  3 192.168.12.1 28 msec 24 msec 20 msec
R5#

 

Note that OSPF on R5 did not choose the shortest path to ABR (R3), but the total cost.

==> The same from area0 to non-backbone area, the router looks at the total cost of LSA3 + cost of the route inside area0

Case7:

  • Traffic from R1 10.10.0.1 (area123) to R2 20.10.0.2 (area 123)
  • R1 fa1/0 cost = 100
Figure10: Case7

Figure10: Case7

R1#sh ip route 20.10.0.2
Routing entry for 20.10.0.2/32
  Known via "ospf 666", distance 110, metric 101, type intra area
  Last update from 192.168.12.2 on FastEthernet1/0, 00:00:11 ago
  Routing Descriptor Blocks:
  * 192.168.12.2, from 2.2.2.2, 00:00:11 ago, via FastEthernet1/0
      Route metric is 101, traffic share count is 1

R1#trace 20.10.0.2 source 10.10.0.1

Type escape sequence to abort.
Tracing the route to 20.10.0.2

  1 192.168.12.2 16 msec 12 msec 8 msec
R1#

 

R3#sh ip route 20.10.0.2
Routing entry for 20.10.0.2/32
  Known via "ospf 666", distance 110, metric 102, type intra area
  Last update from 192.168.13.1 on FastEthernet1/1, 00:01:24 ago
  Routing Descriptor Blocks:
  * 192.168.13.1, from 2.2.2.2, 00:01:24 ago, via FastEthernet1/1
      Route metric is 102, traffic share count is 1

R3#

 

Case8:

  • Traffic from R1 10.10.0.1 (area123) to R2 20.10.0.2 (area 123)
  • R1-R2 link down (no inter-area route to 20.10.0.2)
Figure11: Case8

Figure11: Case8

R1#sh ip route 20.10.0.2
% Subnet not in table
R1#
R1#
R1#sh ip ospf database summ
R1#sh ip ospf database summary 20.10.0.2

            OSPF Router with ID (1.1.1.1) (Process ID 666)
R1#

 

R1 can no more reach the destination in the same area, though it is reachable from R3 which is itself reachable to R1

R3#sh ip route 20.10.0.2
Routing entry for 20.10.0.2/32
  Known via "ospf 666", distance 110, metric 3, type inter area
  Last update from 192.168.34.4 on FastEthernet1/2, 00:01:12 ago
  Routing Descriptor Blocks:
  * 192.168.34.4, from 2.2.2.2, 00:01:12 ago, via FastEthernet1/2
      Route metric is 3, traffic share count is 1

R3#ping 20.10.0.2

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 20.10.0.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 24/27/32 ms
R3#trace 20.10.0.2

Type escape sequence to abort.
Tracing the route to 20.10.0.2

  1 192.168.34.4 12 msec 8 msec 12 msec
  2 192.168.24.2 16 msec 24 msec 16 msec
R3#

 

OSPF will always choose the intra-area path without crossing area 0

Case9:

  • Intra-area traffic from R4 40.10.0.4 (backbone) to R2 20.10.0.2 (backbone)
  • R4 f1/1 cost = 100
Figure12: Case9

Figure12: Case9

R4#sh ip route 20.20.0.2
Routing entry for 20.20.0.2/32
  Known via "ospf 666", distance 110, metric 101, type intra area
  Last update from 192.168.24.2 on FastEthernet1/1, 00:01:51 ago
  Routing Descriptor Blocks:
  * 192.168.24.2, from 2.2.2.2, 00:01:51 ago, via FastEthernet1/1
      Route metric is 101, traffic share count is 1

R4#trace 20.20.0.2 source 40.10.0.4

Type escape sequence to abort.
Tracing the route to 20.20.0.2

  1 192.168.24.2 20 msec 12 msec 8 msec
R4#

 

R3#sh ip route 20.20.0.2
Routing entry for 20.20.0.2/32
  Known via "ospf 666", distance 110, metric 102, type intra area
  Last update from 192.168.34.4 on FastEthernet1/2, 00:02:44 ago
  Routing Descriptor Blocks:
  * 192.168.34.4, from 2.2.2.2, 00:02:44 ago, via FastEthernet1/2
      Route metric is 102, traffic share count is 1

R3#

 

R4 chose the worse path through R2 inside the backbone without crossing non-backbone area.

Case10:

  • Traffic from R1 10.10.0.2 (area123) to R2 20.20.0.2 (backbone)
  • R4-R2 link down (no inter-area route to 20.20.0.2)
Figure13: Case10

Figure13: Case10

R1#sh ip route 20.20.0.2
Routing entry for 20.20.0.2/32
  Known via "ospf 666", distance 110, metric 2, type inter area
  Last update from 192.168.12.2 on FastEthernet1/0, 00:00:02 ago
  Routing Descriptor Blocks:
  * 192.168.12.2, from 2.2.2.2, 00:00:02 ago, via FastEthernet1/0
      Route metric is 2, traffic share count is 1

R1#trace 20.20.0.2 source 10.10.0.2

Type escape sequence to abort.
Tracing the route to 20.20.0.2

  1 192.168.12.2 12 msec 8 msec 8 msec
R1#

R4#sh ip route 20.20.0.2
% Network not in table
R4#
R4#sh ip ospf data summ 20.20.0.2  

            OSPF Router with ID (4.4.4.4) (Process ID 666)
R4#
R3#sh ip route 20.20.0.2
% Network not in table
R3#sh ip ospf data summary  20.20.0.2

            OSPF Router with ID (3.3.3.3) (Process ID 666)

        Summary Net Link States (Area 123)

  LS age: 3429
  Options: (No TOS-capability, DC, Upward)
  LS Type: Summary Links(Network)
  Link State ID: 20.20.0.2 (summary Network Number)
  Advertising Router: 2.2.2.2
  LS Seq Number: 8000001C
  Checksum: 0x17D7
  Length: 28
  Network Mask: /32
    TOS: 0     Metric: 1

R3#

Though R3 has received the summary LSA3 from R2 though the non-backbone area 123, it did not include it in the routing table, even if it is reachable from R1

Case11:

  • Traffic between two non-backbone areas. From area123 to area25.
  • Default interface costs
Figure14: Case11

Figure14: Case11

R1#sh ip route 50.20.0.5
Routing entry for 50.20.0.5/32
  Known via "ospf 666", distance 110, metric 3, type inter area
  Last update from 192.168.12.2 on FastEthernet1/0, 00:02:54 ago
  Routing Descriptor Blocks:
  * 192.168.12.2, from 2.2.2.2, 00:02:54 ago, via FastEthernet1/0
      Route metric is 3, traffic share count is 1

R1#trace 50.20.0.5 source 10.10.0.1

Type escape sequence to abort.
Tracing the route to 50.20.0.5

  1 192.168.12.2 16 msec 0 msec 8 msec
  2 192.168.25.5 20 msec 24 msec 32 msec
R1#

From R1, OSPF will choose the path with the lowest total cost within area 123, the backbone and area 25. This happens to be the path through R2, which is directly connected to area25. This seems to defeat the rule B, but it doesn’t, because the ABR R2 has an interface in the backbone.

Case12:

  • Traffic generated from R2: 20.10.0.2 (area 123) to R5 50.20.0.5 (area 25).
  • R2 fa1/2 cost = 100
Figure15: Case12

Figure15: Case12

R2(config-if)#do sh ip route 50.20.0.5           
Routing entry for 50.20.0.5/32
  Known via "ospf 666", distance 110, metric 101, type intra area
  Last update from 192.168.25.5 on FastEthernet1/2, 00:04:03 ago
  Routing Descriptor Blocks:
  * 192.168.25.5, from 5.5.5.5, 00:04:03 ago, via FastEthernet1/2
      Route metric is 101, traffic share count is 1

R2(config-if)#
R2(config-if)#do trace 50.20.0.5 source 20.10.0.2

Type escape sequence to abort.
Tracing the route to 50.20.0.5

  1 192.168.25.5 20 msec 24 msec 20 msec
R2(config-if)#

Even though inter-area link cost is made worse (higher cost), R2 ospf will choose the shortest path without crossing the backbone.

Case13:

  • R2 fa1/1 Down
Figure16: Case13

Figure16: Case13

R2#sh ip route 50.20.0.2
% Subnet not in table
R2#
R1#sh ip route 50.20.0.5           
Routing entry for 50.20.0.5/32
  Known via "ospf 666", distance 110, metric 4, type inter area
  Last update from 192.168.13.3 on FastEthernet1/1, 00:08:28 ago
  Routing Descriptor Blocks:
  * 192.168.13.3, from 3.3.3.3, 00:12:15 ago, via FastEthernet1/1
      Route metric is 4, traffic share count is 1

R1#trace 50.20.0.5 source 10.10.0.1

Type escape sequence to abort.
Tracing the route to 50.20.0.5

  1 192.168.13.3 12 msec 8 msec 8 msec
  2 192.168.34.4 16 msec 16 msec 20 msec
  3 192.168.45.5 20 msec 28 msec 28 msec
R1#

Note that, as soon as R2 interface connected to the backbone is down, R2 can no more reach area25. And R1 will turn to the path advertised through R3.

Case14:

  • R2 fa1/1 Down
  • R1 fa1/1 Down
Figure17: Case14

Figure17: Case14

R1#sh ip route 50.20.0.5           
% Network not in table
R1#t  

Even though R1 link to R2 is up and R2 link (area 25) to R5 is up, R1 will not be able to use the inter-area path, because it doesn’t cross the backbone (not even a connected interface to the backbone).

 

 

GET VPN, it is all about group.


GET VPN uses a group security paradigm comparing to the traditional point-to-point security paradigm like DMVPN, GRE IPSec or SSL. Do not confuse with any-to-any mesh which is the result of n(n-1)/2 point-to-point security associations between n peers.

We are talking about group security association (SA), group states and group keys.

Because each group member doesn’t know in advance about other group members, a central entity named key server (KS) need to manage GET VPN control plane: group member registration, distribute group keys and do the rekeys (updating security policy and group keys).

Once a group member has registered with the KS to a group, it can take care of the data plane encrypted communication with any of the other group members using the received group key and group security policy.

GET VPN doesn’t create the VPN perimeter (like Frame Relay for example), all group members and the key server are supposed to be already in the same VPN.

This lab is not supposed to be a comprehensive GET-VPN reference. Because it is all about a stories and process dynamics, I opted for a more visual approach: less words, more pictures.

I included a table summarizing the key characteristics of GET-VPN comparing with traditional IPSec, followed by a step-by-step animation of the main processes of GET-VPN.

Only then, I provide some configurations of the core MPLS-VPN and GET-VPN key server and group member.

And finally the offline lab with a myriad of useful commands collected during a running session.

By rushing to the CLI, you can easily drown yourself in a cup of water 🙂 Better take the time to understand the overall process, then configuration and troubleshooting will be straightforward and more instructive.

Outline:

  • Comparison of GETVPN vs traditional IPSec.
  • Summary of the main concepts
  • Step-by-step GETVPN process
  • Lab details
    • MPLS-VPN configuration
      • Core
      • PE_CE
    • GET VPN configuration
      • Basic Group member template
      • Basic key server
  • Offline lab
  • Troubleshooting

Comparison table (IPSec <> GET vpn)

IPSec Crypto GET Crypto
Main concept Per link point-to-point security paradigm: each pair of devices establishes a separate security association.(figure3) Group-based security paradigm: allows a set of network entities to connect through a single group security association.
Encapsulation Requires the establishment of tunnel to secure the traffic between encrypting gateways.- a new header is added to the packet (figure4).- Limited QoS preservation (only ToS). Tuneless: does not use tunnels and preserves the original src/dst IP in the header of the encrypted packet for optimal routing (figure6).- Preserve same established routing path.- Preserve entire QoS parameter set.
Multicast – Traditional IPSec encapsulate multicast into encrypted PTP unicast GRE tunnels.- Alternatively, inefficient multicast replication at the HUB using DMVPN with many multicast deployment restrictions (RP, MA and multicast source location at the HUB). GET VPN is particularly suited for multicast traffic encryption with native support. The WAN multicast infrastructure is used to transport encrypted GET-VPN multicast traffic.
Infrastructure IPSec creates Overlay VPN and secures it over the Internet. • Relies on MPLS/VPLS Any-to-Any VPN.• Direct Flows between all CE.• Leverages Existing VPNs.- GET VPN suited for enterprises running over MPLS private networks.- GET-VPN ONLY secure an already established VPN (MPLS-VPN), it doesn’t create it
Management – Per-link security management: Each pair holds a security association for each other.- Resource consumption.- Alternative centralized HUB-based management with DMVPN though heterogeneous protocol suite (mGRE, NHRP)- Less scalable. – Full mesh direct communications between sites.Same group members hold a single security association; don’t care about others on the group.- Relies on centralized membership management through key servers :A centralized entity (Key Server manage the control plane) and not involved in the data plane communication.Data plane communications don’t require transport through a hub HUB entity.- Low latency/jitter.

– More scalable.

Security Perimeter Per-link All Sites within the same group.
Routing Two routing levels(figure5a):Core routing+Overlay routing between encrypting gateways Single end-to-end routing level with MPLS-VPN as the underlying VPN infrastructure(figure6a).
Traditional IPSec PTP SAs between GMs

Figure3- Traditional IPSec PTP SAs between GMs

Figure4- Traditional IPSec PTP Packet

Figure4- Traditional IPSec PTP Packet

 

Figure5a-Traditional IPSec routing overlay

Figure5a-Traditional IPSec routing overlay

 

Figure6- Get-VPN IPSec PTP Packet + TEK

Figure6- Get-VPN IPSec PTP Packet + TEK

 

Figure6a- GETVPN end-to-end routing

Figure6a- GETVPN end-to-end routing

 

 Summary of the main concepts

The key server manages the control operations (group member registration and rekey and policy distribution) on a different plane, initially, under the protection of IKE, after that, protected by a common key (KEK) (figure7/8).

Rekey process can be done using unicast (default) or multicast.

Any-to-any GM communications are performed on a separate data plane using the received TEK (Traffic Encryption Key) (figure9).

COOP (Co-operative key server protocol)

  • A Key Server redundancy feature allowing key server communication to synchronize the keying information.

GDOI protocol (Group Domain of Interpretation)

  • Used for policy & key deployment with group members.

Make sure to differentiate between registration vs rekey and between authentication vs authorization processes.

A-Registration process:

1- There is two types of KS-GM authentications):

  • Shared key (deployed on the lab): a simple secret key shared (hopefully out-of-band) by the KS and all GMs (or per-pair).
  • RSA (CA-based authentication): uses asymmetric encryption with public/private key pair and requires PKI infrastructure:
    • Enrollment phase:
      • Client (GM) gets CA certificate (CA public key signed with CA private key)
      • Client enrollment with CA
        • Client sends its own certificate (client public key signed with client private key)
        • CA signs the client certificate with its own (CA) public key and sends it to the client.
    • Peer-to-peer (GM-to-KS) authentication:
      • Peers exchanges and verify each other signatures using CA public key.

2- Authorization (during registration):

  • KS verifies the group id number of the GM: If this id number is a valid and the GM has provided valid Internet Key Exchange (IKE) credentials, the key server sends the SA policy and the Keys to the group member.

B-Rekey process (after registration): uses asymmetric encryption with public/private key pair:

  • The Primary KS uses the private & public keys (asymmetric) generated before the registration and export them to other secondary KSs.
  • Primary KS generates Group control / data plane keys, TEK and KEK (symmetric keys)
  • The KS signs TEK/KEK + policy with its own private key and sends them to GMs(figure7/8/9)
  • The KS distributes public key to all GMs.
  • GMs check the rekeys with the received KS public key

 

Figure7- Control plane: GM registration

Figure7- Control plane: GM registration

Figure8- Control plane: rekey using KEK key

Figure8- Control plane: rekey using KEK key

 

Figure9- GM data plane communications using TEK key

Figure9- GM data plane communications using TEK key

 

 

Step-by-step GET-VPN process

Lab details

Here is the lab topology, pretty straight forward, three group members and one key server with MPLS-VPN as the underlying VPN infrastructure.

 

Figure1- High-level lab topology

Figure1- High-level lab topology

Figure2- Low-level lab topology

Figure2- Low-level lab topology

 

GET VPN:

  • Customer equipments (CE) R9, R10 and R12 play the role of GETVPN group members (GM),
  • R11 plays the role of the key server.

MPLS_VPN:

  • R2, R3 and R5 are the MPLS-VPN PEs (Provider Edge equipment)

MPLS-VPN configuration:

Make sure the underlying MPLS-VPN topology works fine and all connectivity are successful before touching GET VPN.

For the sake of simplicity, a single MPLS core router (label switching router) R4 is used and static routing between CE_PE.

 

Figure2- Low-level MPLS-VPN lab infrastructure (MPLS LSR + PEs)

Figure2- Low-level MPLS-VPN lab infrastructure (MPLS LSR + PEs)

 

MPLS-VPN basic configuration with PE-CE static routing

  • R4 the MPLS core router (LSR) is configured as BGP route reflector.
  • Make sure PE equipments (wan interfaces and loopback rid’s) are reachable though MPLS core OSPF.
  • Enable MPLS on the interfaces facing PE’s.
  • Each PE (R2, R3 and R5) has similar configuration and a template can be used to scale client numbers.

Because this lab is not focused on MPLS-VPN, a simple PE-CE static routing is deployed. Each client has its interface and static route in the appropriate VRF. Generally multiple clients (VRFs) separately coexist in a single PE and the core transport is done through vpnv4 to and from other PE’s (figure2a)

 

Figure2a- MPLS-VPN PE Virtual Router Forwarding

Figure2a- MPLS-VPN PE Virtual Router Forwarding

With PE-CE static routing, only one-side redistribution is done, from static to vpnv4 address family.

You will find all details in the offline lab.

R4 (MPLS LSR router)

interface Loopback0ip address 10.0.0.4 255.255.255.255ip ospf 2345 area 0!

interface FastEthernet0/0

ip address 192.168.45.4 255.255.255.0

ip ospf 2345 area 0

mpls label protocol ldp

mpls ip

!

interface FastEthernet0/1

ip address 192.168.24.4 255.255.255.0

ip ospf 2345 area 0

mpls label protocol ldp

mpls ip

!

interface FastEthernet1/0

ip address 192.168.34.4 255.255.255.0

ip ospf 2345 area 0

mpls label protocol ldp

mpls ip

!

router ospf 2345

!

router bgp 2345

no synchronization

bgp router-id 4.4.4.4

network 192.168.24.0

network 192.168.34.0

network 192.168.45.0

neighbor 10.0.0.2 remote-as 2345

neighbor 10.0.0.2 route-reflector-client

neighbor 10.0.0.3 remote-as 2345

neighbor 10.0.0.3 route-reflector-client

neighbor 10.0.0.5 remote-as 2345

neighbor 10.0.0.5 route-reflector-client

no auto-summary

!

address-family vpnv4

neighbor 10.0.0.2 activate

neighbor 10.0.0.2 send-community extended

neighbor 10.0.0.2 route-reflector-client

neighbor 10.0.0.3 activate

neighbor 10.0.0.3 send-community extended

neighbor 10.0.0.3 route-reflector-client

neighbor 10.0.0.5 activate

neighbor 10.0.0.5 send-community extended

neighbor 10.0.0.5 route-reflector-client

exit-address-family

R4#sh ip bgp summBGP router identifier 4.4.4.4, local AS number 2345…10.0.0.2 4 2345 134 137 4 0 0 02:09:53 0

10.0.0.3 4 2345 134 137 4 0 0 02:09:43 0

10.0.0.5 4 2345 136 137 4 0 0 02:09:52 0

R4#

GET-VPN configuration:

Key server:

crypto isakmp policy 1encr aesauthentication pre-sharegroup 2!crypto isakmp key cisco address 192.168.103.10

crypto isakmp key cisco address 192.168.29.9

crypto isakmp key cisco address 192.168.125.12

!

crypto ipsec transform-set transform128 esp-aes esp-sha-hmac

!

crypto ipsec profile profile1

set security-association lifetime seconds 7200

set transform-set transform128

!

crypto gdoi group gdoi-group1

identity number 91011

server local

rekey algorithm aes 128

rekey retransmit 10 number 2

rekey authentication mypubkey rsa rekey-rsa

rekey transport unicast

sa ipsec 1

profile profile1

match address ipv4 getvpn-acl

address ipv4 192.168.115.11

!

interface FastEthernet0/0

ip address 192.168.115.11 255.255.255.0

!

ip access-list extended getvpn-acl

deny icmp any any

deny udp any eq 848 any eq 848

deny tcp any any eq 22

deny tcp any eq 22 any

deny tcp any any eq tacacs

deny tcp any eq tacacs any

deny tcp any any eq bgp

deny tcp any eq bgp any

deny ospf any any

deny eigrp any any

deny udp any any eq ntp

deny udp any eq ntp any

deny udp any any eq snmp

deny udp any eq snmp any

deny udp any any eq snmptrap

deny udp any eq snmptrap any

deny udp any any eq syslog

deny udp any eq syslog any

permit ip any any

You can recognize usual IPSec components: crypto policy, shared keys, transform-sets wrapped in an ipsec profile, followed by GETVPN GDOI group policy ( ipsec profile and group policy (group-id, symm. encr. Algorithm, asymmetric keys to use during rekey authentication and getvpn acl)

The GETVPN ACL defines what will not be encrypted; consider your own topology needs:

  • Exclude already encrypted traffic (https, ssh, etc.)
  • Exclude PE-CE routing protocols.
  • Exclude GETVPN control plane protocols( gdoi udp 848, tacacs, radius, etc.)
  • Exclude Management/monitoring protocols (snmp. syslog, device remote access, icmp, etc.)

Group member template:

crypto isakmp policy 1encr aesauthentication pre-sharegroup 2crypto isakmp key cisco address 192.168.115.11!

crypto gdoi group gdoi-group1

identity number 91011

server address ipv4 192.168.115.11

!

crypto map map-group1 10 gdoi

set group gdoi-group1

!

interface FastEthernet0/0

crypto map map-group1

All control plane parameters are controlled by the KS, GMs are dumb devices. All they have to know is how to initiate IKE phase1 for preshared or RSA authentication with the KSs to register and get the keys, which groups they want to join, to which KS they will talk (if many configured) and finally, the get crypto-map assigned to the outbound interface.

Troubleshooting:

When practicing or during tenacious data plane issues, a good idea is to use ESP-NULL, the null encryption, where the encryption process is engaged, but packets are not encrypted.

This will make much easier the packet content inspection and marking between GMs.

To facilitate the deployment and the troubleshooting, think about the topology in terms of separate technology sub-domains; make sure everything works fine before moving to the next sub-domain.

  • Layer2 MPLS
  • MPLS-VPN
  • Core routing
  • PE-CE routing
  • Core SSM multicast (if not unicast rekey)
  • CA infrastructure (if you are not using shared keys to authenticate GMs to the KS)
Possible issues Possible causes
key servers doesn’t exchange all announcements Bad buffer – default buffer is not enough for large nbr of gm
Reassembly timed out Some announcement fragments are dropped or lostICMP (packet too big) denied in the pathNo fragmentation because df bit setPresence of MTU bottlenecks, multipaths, firewalls.Not tuned tcp mss “ip tcp adjust-mss”
ike failure- failed negotiation Preshared key mismatchIKE parameters mismatchTransport issues between GMs and KSs
Nothing works just after GM registration. GETVPN acl policy cause routing traffic encryption: traffic not symmetrically excluded
KS not synchronizing Check KS-KS transport: avoid making KS-KS communications dependent on GET-VPN successful transport.Keys not exported or not imported
Rekey not delivered Issue with multicast infrastructure.Issue with mVPN service.

For more detailed GETVPN troubleshooting, I urge you to watch the excellent CiscoLive session by Wen Zhang https://www.ciscolive.com/online/connect/sessionDetail.ww?SESSION_ID=78699&tclass=popup

Offline lab:

References:

http://www.cisco.com/c/en/us/products/security/group-encrypted-transport-vpn/index.html

http://www.cisco.com/c/en/us/products/collateral/security/group-encrypted-transport-vpn/deployment_guide_c07_554713.html

https://www.ciscolive.com/online/connect/sessionDetail.ww?SESSION_ID=78699&tclass=popup

http://www.cisco.com/c/dam/en/us/td/docs/solutions/CVD/Aug2013/CVD-GETVPNDesignGuide-AUG13.pdf

https://www.ciscolive.com/online/connect/sessionDetail.ww?SESSION_ID=7907&backBtn=true

http://www.cisco.com/c/en/us/td/docs/ios/12_4t/12_4t11/htgetvpn.html

http://tools.ietf.org/html/rfc3547

http://tools.ietf.org/rfc/rfc4534.txt

IPv4 and IPv6 dual-stack PPPoE


The lab covers a scenario of adding basic IPv6 access to an existing PPPoE (PPP for IPv4).

PPPoE is established between CPE (Client Premise Equipment) the PPPoE client and the PPPoE server also known as BNG (Broadband Network Gateway).

ipv4 and IPv6 dual-stack PPPoe

Figure1: ipv4 and IPv6 dual-stack PPPoe

PPPoE server plays the role of the authenticator (local AAA) as well as the authentication and address pool server (figure1). Obviously, a higher centralized prefix assignment and authentication architecture (using AAA RADIUS) is more scalable for broadband access scenarios (figure2).

For more information about RADIUS attributes for IPv6 access networks, start from rfc6911 (http://www.rfc-editor.org/rfc/rfc6911.txt).

Figure2: PPPoE with RADIUS

Figure2: PPPoE with RADIUS

PPPoE for IPv6 is based on the same PPP model as for PPPoE over IPv4. The main difference in deployment is related to the nature of the routed protocol assignment to CPEs (PPPoE clients).

  • IPv4 in routed mode, each CPE gets its WAN interface IP centrally from the PPPoE server and it’s up to the customer to deploy an rfc1918 prefix to the local LAN through DHCP.
  • PPPoE client gets its WAN interface IPv6 address through SLAAC and a delegated prefix to be used for the LAN segment though DHCPv6.

Animation: PPP encapsulation model

Let’s begin with a quick reminder of a basic configuration of PPPoE for IPv4.

PPPoE for IPv4

pppoe-client WAN address assignment

The main steps of a basic PPPoE configuration are:

  • Create a BBAG (BroadBand Access Group).
  • Tie the BBAG to virtual template interface
  • Assign a loopback interface IP (always UP/UP) to the virtual template.
  • Create and assign the address pool (from which client will get their IPs) to the virtual template interface.
  • Create local user credentials.
  • Set the authentication type (chap)
  • Bind the virtual template interface to a physical interface (incoming interface for dial-in).
  • The virtual template interface will be used as a model to generate instances (virtual access interfaces) for each dial-in session.
Figure3: PPPoE server

Figure3: PPPoE server model

pppoe-server

ip local pool PPPOE_POOL 172.31.156.1 172.31.156.100
!
bba-group pppoe BBAG
virtual-template 1
!
interface Virtual-Template1
ip unnumbered Loopback0
ip mtu 1492
peer default ip address pool PPPOE_POOL
ppp authentication chap callin

!

interface FastEthernet0/0

pppoe enable group BBAG

pppoe-client

interface FastEthernet0/1
pppoe enable group global
pppoe-client dial-pool-number 1
!
interface FastEthernet1/0
ip address 192.168.0.201 255.255.255.0
!
interface Dialer1
mtu 1492
ip address negotiated

encapsulation ppp

dialer pool 1

dialer-group 1

ppp authentication chap callin

ppp chap hostname pppoe-client

ppp chap password 0 cisco

Figure4: PPPoE client model

Figure4: PPPoE client model


As mentioned in the beginning, DHCPv4 is deployed at the CPE device to assign rfc1819 addresses to LAN clients and then translated, generally using PAT (Port Address Translation) with the assigned IPv4 to the WAN interface.

You should have the possibility to configure static NAT or static port-mapping to give public access to internal services.

Address translation

interface Dialer1
ip address negotiated
ip nat outside
!
interface FastEthernet0/0
ip address 192.168.4.1 255.255.255.224
ip nat inside
!
ip nat inside source list NAT_ACL interface Dialer1 overload
!

ip access-list standard NAT_ACL

permit any

pppoe-client LAN IPv4 address assignment

pppoe-client

ip dhcp excluded-address 192.168.4.1
!
ip dhcp pool LAN_POOL
network 192.168.4.0 255.255.255.224
domain-name cciethebeginning.wordpress.com
default-router 192.168.4.1
!
interface FastEthernet0/0
ip address 192.168.4.1 255.255.255.224

PPPoE for IPv6

pppoe-client WAN address assignment

All IPv6 prefixes are planned from the 2001:db8::

Pppoe-server

ipv6 local pool PPPOE_POOL6 2001:DB8:5AB:10::/60 64
!
bba-group pppoe BBAG
virtual-template 1
!
interface Virtual-Template1
ipv6 address FE80::22 link-local
ipv6 enable
ipv6 nd ra lifetime 21600
ipv6 nd ra interval 4 3


peer default ipv6 pool PPPOE_POOL6

ppp authentication chap callin

!

interface FastEthernet0/0

pppoe enable group BBAG

IPCP (IPv4) negotiates the IPv4 address to be assigned to the client, where IPC6CP negotiates only the interface identifier, the prefix information is performed through SLAAC.

pppoe-client

interface FastEthernet0/1
pppoe enable group global
pppoe-client dial-pool-number 1
!
interface Dialer1
mtu 1492
dialer pool 1
dialer-group 1
ipv6 address FE80::10 link-local

ipv6 address autoconfig default

ipv6 enable

ppp authentication chap callin

ppp chap hostname pppoe-client

ppp chap password 0 cisco

The CPE (PPPoE client) is assigned an IPv6 address through SLAAC along with a static default route: ipv6 address autoconfig default

pppoe-client#sh ipv6 interface dialer 1
Dialer1 is up, line protocol is up
IPv6 is enabled, link-local address is FE80::10
No Virtual link-local address(es):

Stateless address autoconfig enabled
Global unicast address(es):

2001:DB8:5AB:10::10, subnet is 2001:DB8:5AB:10::/64 [EUI/CAL/PRE]
valid lifetime 2587443 preferred lifetime 600243

Note from the below traffic capture (figure5) that both IPv6 and IPv4 use the same PPP session (layer2 model) (same session ID=0x0006) because the Link Control Protocol is independent of the network layer.

Figure5: Wireshark capture of common PPP layer2 model

Figure5: Wireshark capture of common PPP layer2 model


pppoe-client LAN IPv6 assignment

The advantage of using DHCPv6 PD (Prefix Delegation is that the PPPoE will automatically add a static route to the assigned prefix, very handy!

pppoe-server

ipv6 dhcp pool CPE_LAN_DP
prefix-delegation 2001:DB8:5AB:2000::/56
00030001CA00075C0008 lifetime infinite infinite
!
interface Virtual-Template1

ipv6 dhcp server CPE_LAN_DP

Now the PPPoE client can use the delegated prefix to assign an IPv6 address (::1) to its own interface (fa0/0) and the remaining for SLAAC advertisement.

No NAT needed for the delegated prefixes to be used publically, so no translation states on the PPPoE server. The prefix is directly accessible from outside.

For more information about the client ID used for DHCPv6 assignment, please refer to the prior post about DHCPv6. https://cciethebeginning.wordpress.com/2012/01/18/ios-dhcpv6-deployment-schemes/

pppoe-client

pppoe-client#sh ipv6 dhcp
This device’s DHCPv6 unique identifier(DUID): 00030001CA00075C0008
pppoe-client#
interface Dialer1

ipv6 dhcp client pd PREFIX_FROM_ISP
!
interface FastEthernet0/0
ipv6 address FE80::2000:1 link-local

ipv6 address PREFIX_FROM_ISP ::1/64
ipv6 enable
pppoe-client#sh ipv6 dhcp interface
Dialer1 is in client mode
Prefix State is OPEN
Renew will be sent in 3d11h
Address State is IDLE
List of known servers:
Reachable via address: FE80::22
DUID: 00030001CA011F780008
Preference: 0
Configuration parameters:

IA PD: IA ID 0x00090001, T1 302400, T2 483840

Prefix: 2001:DB8:5AB:2000::/56

preferred lifetime INFINITY, valid lifetime INFINITY

Information refresh time: 0

Prefix name: PREFIX_FROM_ISP

Prefix Rapid-Commit: disabled

Address Rapid-Commit: disabled

client-LAN

Now the customer LAN is assigned globally available IPv6 from the CPE (PPPoE client).

client-LAN#sh ipv6 interface fa0/0
FastEthernet0/0 is up, line protocol is up
IPv6 is enabled, link-local address is FE80::2000:F
No Virtual link-local address(es):

Stateless address autoconfig enabled
Global unicast address(es):

2001:DB8:5AB:2000::2000:F, subnet is 2001:DB8:5AB:2000::/64 [EUI/CAL/PRE]
client-LAN#sh ipv6 route

S ::/0 [2/0]

via FE80::2000:1, FastEthernet0/0

C 2001:DB8:5AB:2000::/64 [0/0]

via FastEthernet0/0, directly connected

L 2001:DB8:5AB:2000::2000:F/128 [0/0]

via FastEthernet0/0, receive

L FF00::/8 [0/0]

via Null0, receive

client-LAN#

End-to-end dual-stack connectivity check

client-LAN#ping 2001:DB8:5AB:3::100
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001:DB8:5AB:3::100, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/45/88 ms
client-LAN#trace 2001:DB8:5AB:3::100
Type escape sequence to abort.
Tracing the route to 2001:DB8:5AB:3::100

1 2001:DB8:5AB:2000::1 28 msec 20 msec 12 msec

2 2001:DB8:5AB:2::FF 44 msec 20 msec 32 msec

3 2001:DB8:5AB:3::100 48 msec 20 msec 24 msec

client-LAN#

client-LAN#ping 192.168.3.100
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.3.100, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 52/63/96 ms
client-LAN#trace 192.168.3.100
Type escape sequence to abort.
Tracing the route to 192.168.3.100

1 192.168.4.1 32 msec 44 msec 20 msec

2 192.168.2.1 56 msec 68 msec 80 msec

3 192.168.3.100 72 msec 56 msec 116 msec

client-LAN#

I assigned PREFIX_FROM_ISP as locally significant name for the delegated prefix, no need to match the name on the DHCPv6 server side.

Finally, the offline lab with all the commands needed for more detailed inspection:

 

References

http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/bbdsl/configuration/15-mt/bba-15-mt-book/bba-ppoe-client.html

http://www.cisco.com/en/US/docs/ios-xml/ios/bbdsl/configuration/15-mt/ip6-adsl_external_docbase_0900e4b182dbdf4f_4container_external_docbase_0900e4b182dc25f3.html

http://www.broadband-forum.org/technical/download/TR-187.pdf

https://tools.ietf.org/html/rfc5072

https://tools.ietf.org/html/rfc5072

http://www.bortzmeyer.org/6911.html (french)

http://packetsize.net/cisco-pppoe-ipv4-ipv6-mppe.htm

     

Embedded Packet Capture, let’s go fishing for some packets!


EPC (Embedded Packet Capture) is another useful troubleshooting tool to occasionally capture traffic to be analyzed locally or exported to remote device. Occasionally, in contrast with RITE (Router IP Traffic Export) or SPAN on switches which are meant to have permanent flow of copied traffic directed to a traffic analyzer or IDS (Intrusion Detection System).

The configuration workflow is straightforward, but I would like to make a conceptual graphical analogy to illustrate it.

Let’s imagine traffic flowing through a router interface like the following:

Embedded Packet Capture

1- Capture point:


Specify the protocol to capture, the interface and the direction, this is the Here you indicate which IP protocol you need to capture.

monitor capture point ip cef CAPTURE_POINT fastEthernet 0/0 both
monitor capture point ipv6 cef CAPTURE_POINT fastEthernet 0/0 both

2- Packet buffer:


Memory area where the frames are stored once captured. 

monitor capture buffer CAPTURE_BUFFER

 

Embedded Packet Capture

3- ACL:


If needed you can filter a specific type of traffic, available only for IPv4. 

(config)#access-list 100 permit icmp host 192.168.0.1 host 172.16.1.1#monitor capture buffer CAPTURE_BUFFER filter access-list 100 

 

Except the optional IPv4 ACL, configured at the global configuration mode, everything else is configured at the privileged EXEC mode.

Embedded Packet Capture

4- Associate capture point with capture buffer

monitor capture point associate CAPTURE_POINT CAPTURE_BUFFER

You can associate multiple capture points (on the same or multiple interfaces) to the same buffer.

Embedded Packet Capture

5- Start and stop capture process

monitor capture point start CAPTURE_POINTmonitor capture point stop CAPTURE_POINT

 concept6

If you are familiar with wireshark, it will be easier to remember the steps needed to capture traffic.

Wireshark analogy

wireshark and Embedded Packet Capture

Deployment 1

Two capture points are created to capture IPv4 and IPv6 traffic into separate capture buffers. 

monitor capture point ipv6 cef CAPTURE_POINT6 fa0/0 bothmonitor capture buffer CAPTURE_BUFFER6monitor capture point associate CAPTURE_POINT6 CAPTURE_BUFFER6

!

monitor capture point ip cef CAPTURE_POINT4 fa0/0 both

monitor capture buffer CAPTURE_BUFFER4

monitor capture point associate CAPTURE_POINT4 CAPTURE_BUFFER4

Following is the result on the router

Deployment 2

Two capture points are created to capture IPv4 and IPv6 traffic into single capture buffer. 

monitor capture point ipv6 cef CAPTURE_POINT6 fa0/0 bothmonitor capture point ip cef CAPTURE_POINT4 fa0/0 both!monitor capture buffer CAPTURE_BUFFER46

!

monitor capture point associate CAPTURE_POINT6 CAPTURE_BUFFER46

monitor capture point associate CAPTURE_POINT4 CAPTURE_BUFFER46

 

Following is the result on the router

Exporting

!Example of export to tftpR1#monitor capture buffer CAPTURE_BUFFER46 export ftp://login:password@192.168.0.32/Volume_1/ecp.pcapWriting Volume_1/ecp.pcap

R1#

!Example of export to tftp

R1# monitor capture buffer CAPTURE_BUFFER46 export tftp://192.168.0.145/ecp.pcap

!

R1#

And the file opened in wireshark:

EPC traffic opened with wireshark

wireshark

That’s all folks!

DMVPN animation


Here is an interactive animation of DMVPN (Dynamic Multipoint VPN), followed by a detailed offline lab (a snapshot of the topology under test with hopefully all commands needed for analysis and study).

Finally, check your understanding of the fundamental concepts by taking a small quiz.

Studied topology:

DMVPN animation

Animation

Offline Lab

You might consider the following key points for troubleshooting:

Routing protocols:

To avoid RPF failure, announce routing protocols only through tunnel interfaces.

EIGRP

  • Turn off “next-hop-self” to makes spokes speak directly. Without it traffic between spokes will always pass through the HUB and NHRP resolution will not occur.
  • Turn off “split-horizon” to allow eigrp to advertise a received route from one spoke to another spoke through the same interface.
  • Turn off sumarization
  • Pay attention to the bandwidth required for EIGRP communication. requires BW > tunnel default BW “bandwidth 1000”

OSPF

  • “ip ospf network point-to-multipoint”, allows only phase1 (Spokes Data plane communication through the HUB)
  • “ip ospf broadcast” on all routers allows Phase2 (Direct Spoke-to-spoke Data plane communication)
  • Set the ospf priority on the HUBs (DR/BDR) to be bigger than the priority on spokes (“ip ospf priority 0”).
  • Make sure OSPF timers match if spokes and the HUB use different OSPF types.
  • Because spokes are generally low-end devices, they probably can’t cope with LSA flooding generated within the OSPF domain. Therefore, it’s recommended to make areas Stubby (filter-in LSA5 from external areas) or totally stubby (neither LSA5 nor inter-area LSA3 are accepted)

Make sure appropriate MTU value matches between tunnel interfaces (“ip mtu 1400 / ip tcp mss-adjust 1360”)

Consider the OSPF scalability limitation (50 routers per area). OSPF requires much more tweekening for large scale deployments.

Layered approach:

DMVPN involves multiple layers of technologies (mGRE, routing, NHRP, IPSec), troubleshooting an issue can be very tricky.

To avoid cascading errors, test your configuration after each step and move forward only when the current step works fine. For example: IPSec encryption is not required to the functioning of DMVPN, so make sure your configuration works without it and only then you add it (set IPSEc parameters and just add “tunnel protection ipsec profile” to the tunnel interface).

Quiz

Read more of this post

IPv6 multicast over IPv6 IPSec VTI


IPv4 IPSec doesn’t support multicast, we need to use GRE (unicast) to encapsulate multicast traffic and encrypt it. As a consequence, more complication and an additional level of routing, so less performance.

One of the advantages of IPv6 is the support of IPSec authentication and encryption (AH, ESP) right in the extension headers, which makes it natively support IPv6 multicast.

In this lab we will be using IPv6 IPSec site-to-site protection using VTI to natively support IPv6 multicast.

The configuration involves three topics: IPv6 routing, IPv6 IPSec and IPv6 multicast. Each process is built on the top the previous one, so before touching IPsec, make sure you have local connectivity for each segment of the network and complete reachability through IPv6 routing.

Next step, you can move to IPv6 IPSec and change routing configuration accordingly (through VTI).

IPv6 multicast relies on a solid foundation of unicast reachability, so once you have routes exchanged between the two sides through the secure connection you can start configuring IPv6 multicast (BSR, RP, client and server simulation).

Picture1: Lab topology

IPv6 multicast over IPv6 IPSec VTI

Lab outline

  • Routing
    • OSPFv3
    • EIGRP for IPv6
  • IPv6 IPSec
    • Using IPv6 IPSec VTI
    • Using OSPFv3 IPSec security feature
  • IPv6 Multicast
    • IPv6 PIM BSR
  • Offline lab
  • Troubleshooting cases
  • Performance testing

Routing

Note:
IPv6 Routing relies on link-local addresses, so for troubleshooting purpose, link-local IPs are configured to be similar to their respective global addresses, so they are easily recognisable. This will be of a tremendous help during troubleshooting. Otherwise you will find yourself trying to decode the matrix : )

OSPFv3

Needs an interface configured with IPv4 address for Router-id

OSPFv3 offloads security to IPv6 native IPv6, so you can secure OSPFv3 communications on purpose: per- interface or per-area basis.
  Table1: OSPFv3 configuration

  R2 R1
IPv6 routing processes need IPv4-format router ids ipv6 router ospf 12
router-id 2.2.2.2
 ipv6 router ospf 12
router-id 1.1.1.1
Announce respective LAN interfaces interface FastEthernet0/1
ipv6 ospf 12 area 22
interface FastEthernet0/1
ipv6 ospf 12 area 11 
Disable routing on the physical BTB connection to avoid RPF failure interface FastEthernet0/0
 ipv6 ospf 12 … 
interface FastEthernet0/0
 ipv6 ospf 12 …
IPv6 gateways exchange routes through the VTI encrypted interface interface Tunnel12
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
interface Tunnel12
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
Set the ospf network type on loopback interfaces if you want to advertise masks other that 128-length interface Loopback0
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
interface Loopback0
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
Table2: EIGRP for IPv6 configuration
  R2 R1
IPv6 routing processes need IPv4-format router ids ipv6 router eigrp 12
eigrp router-id 2.2.2.2
ipv6 router eigrp 12
eigrp router-id 1.1.1.1
Announce respective LAN interfaces interface FastEthernet0/1
ipv6 eigrp 12
interface FastEthernet0/1
ipv6 eigrp 12
Disable routing on the physical BTB connection to avoid RPF failure interface FastEthernet0/0
ipv6 eigrp 12
interface FastEthernet0/0
ipv6 eigrp 12
IPv6 gateways exchange routes through the VTI encrypted interface interface Tunnel12
ipv6 eigrp 12
interface Tunnel12
ipv6 eigrp 12
Set the ospf network type on loopback interfaces if you want to advertise masks other that 128-length interface Loopback0
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
interface Loopback0
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
Enable EIGRP process ipv6 router eigrp 12
no shutdown
ipv6 router eigrp 12
no shutdown

In case you want to configure EIGRP for IPv6:

– No shutdown inside EIGRP configuration mode

– Similarly to OSPFv3, we need an interface configured with IPv4 address for Router-id

IPv6 IPSec

  • Using IPv6 IPSec VTI
Table3: IPSec configuration
  R1 R2
Set the type of ISAKMP authentication crypto keyring keyring1
pre-shared-key address ipv6 2001:DB8::2/128 key cisco
crypto keyring keyring1
pre-shared-key address ipv6 2001:DB8::1/128 key cisco
  crypto isakmp key cisco address ipv6 2001:DB8::2/128 crypto isakmp key cisco address ipv6 2001:DB8::1/128
ISAKMP profile crypto isakmp policy 10
encr 3des
hash md5
authentication pre-share
lifetime 3600
crypto isakmp policy 10
encr 3des
hash md5
authentication pre-share
lifetime 3600
Transform sets: symmetric encryption and signed hash algorithms crypto ipsec transform-set 3des ah-sha-hmac esp-3des crypto ipsec transform-set 3des ah-sha-hmac esp-3des
  crypto ipsec profile profile0
set transform-set 3des
crypto ipsec profile profile0
set transform-set 3des
Tunnel mode and bind the ipsec profile interface Tunnel12
ipv6 address FE80::DB8:12:1 link-local
ipv6 address 2001:DB8:12::1/64
tunnel source FastEthernet0/0
tunnel destination 2001:DB8::2
tunnel mode ipsec ipv6
tunnel protection ipsec profile profile0
interface Tunnel12
ipv6 address FE80::DB8:12:2 link-local
ipv6 address 2001:DB8:12::2/64
tunnel source FastEthernet0/0
tunnel destination 2001:DB8::1
tunnel mode ipsec ipv6
tunnel protection ipsec profile profile0
Make sure to not advertise the routes through the physical interface to avoid RPF failures (when the source of the multicast traffic is reached from an different interface than the one provided by the RIB) interface FastEthernet0/0
ipv6 address FE80::DB8:1 link-local
ipv6 address 2001:DB8::1/64
ipv6 enable
interface FastEthernet0/0
ipv6 address FE80::DB8:2 link-local
ipv6 address 2001:DB8::2/64
ipv6 enable
 

Here is a capture of the traffic (secured) between R1 and R2 gateways

Picture2: Wireshark IPv6 IPSec trafic capture

IPv6-IPSec-VTI

What could go wrong?

– Encryption doesn’t match

– Shared key doesn’t match

– Wrong ISAKMP peers

– ACL in the path between the 2 gateways blocking gateways IPs or protocol 500

– IPSec profile no assigned to the tunnel int ( tunnel protection ipsec profile < …>)

– Ipsec Encryption and/or signed hashes don’t match.

  • Using OSPFv3 IPSec security feature

You still can use IPv6 IPSec to encrypt and authenticate only OSPF per-interface basis.

OSPFv3 will use the IPv6-enabled IP Security (IPsec) secure socket API.

R1

interface FastEthernet0/0
ipv6 ospf 12 area 0
ipv6 ospf encryption ipsec spi 256 esp 3des 123456789A123456789A123456789A123456789A12345678 md5 123456789A123456789A123456789A12

R2

interface FastEthernet0/0
ipv6 ospf 12 area 0
ipv6 ospf encryption ipsec spi 256 esp 3des 123456789A123456789A123456789A123456789A12345678 md5 123456789A123456789A123456789A12

Picture4: Wireshark traffic capture – OSPFv3 IPSec feature :

ipv6-ospf-feature

Note only OSPFv3 traffic is encrypted

IPv6 Multicast

IPv6 PIM BSR

The RP (Rendez-vous point) is the point where multicast server offer meets member’s demand.

First hop routers build (S,G) source trees with candidate RPs and register directly connected multicast sources.

Candidate- RPs announce themselves to candidate-BSRs, and the latter announce the inf. to all PIM routers.

All PIM routers looking for a particular multicast group learn Candidate RP IP addresses from BSR and build (*, G) shared trees.

Table4: Multicast configuration

  R1(candidate RP) R2(candidate BSR)
Enable multicast routing ipv6 multicast-routing ipv6 multicast-routing
R1 announced as BSR candidate ipv6 pim bsr candidate bsr 2001:DB8:10::1  
R2 announced as RP candidate   ipv6 pim bsr candidate rp 2001:DB8:20::2
Everything should be routed through the tunnel interface, to be encrypted ipv6 route ::/0 Tunnel12 FE80::DB8:12:2 ipv6 route ::/0 Tunnel12 FE80::DB8:12:1
For testing purpose, make one router join a multicast traffic and ping it from a LAN router on the other side or you can opt for more fun by running VLC on one host to read a network stream and stream a video from a host on the other side.   interface FastEthernet0/1
ipv6 mld join-group ff0E::5AB

Make sure that:

  • At least one router is manually configured as a candidate RP
  • At least one router is manually configured as a candidate BSR
During multicasting of the traffic, sll PIM routers knows about the RP and the BSR

– (*,G) shared tree is spread over PIM routers from the last hop router (connected to multicast members).

– (S,G) source tree is established between the first hop router (connected to the multicast server) and the RP.

– The idea behind IPv6 PIM BSR is the same as in IPv4; here an animation explaining the process for IPv4.

Let’s check end-to-end multicast streaming:

Before going to troubleshooting here is the offline lab with all commands:

Troubleshooting

If something doesn’t work and you are stuck, isolate the area of work and inspect each process separately step by step.

Check each step using “show…” commands, so you know each time what you are looking for to spot what is wrong.

“sh run” and script comparison technique is limited by the visual perception capability which is illusory and far from being reliable.

Common routing issues

– Make sure you have successful back-to-back connectivity everywhere.

– With EIGRP for IPv6 make sure the process is enabled.

– If routing neighbors are connected through NBMA network, make sure to enable pseudo broadcasting and manually set neighbor commands.

Common IPSec issues

– ISAKMP phase

– Wrong peer

– Wrong shared password

– Not matching isakmp profile

– IPSec phase

– Not matching ipsec profile

Common PIM issues

– If routing neighbors are connected through NBMA network, make sure C-RPs and C-BSRs are locate on the main site.

– Issue with the client: => no (*,G)

– MLD query issue with the last hop.

– Last hop PIM router cannot build the shared tree.

– Issue with RP registration  => no (S,G)

– Multicast server MLD issue with the 1st hop router

– 1st hop router cannot register with th RP.

– Issue with C-BSR candidate doesn’t advertise RP inf. to PIM routers (BSRs collect all candidate RPs and announce them to all PIM routers to choose the best RP for each group)

– Issue with C-RP candidate doesn’t announce themselves to C-BSRs (RPs announce to C-BSRs which multicast groups they are responsible for)

-RPF failure (the interface used to reach the multicast source, through RIB, is not the interface sourcing the multicast traffic)

Picture5: RPF Failure

Replace test case 6 with RPF failure (enable PIM & routing through physical int.)

Table5: troubleshooting cases
Case Description Simulated wrong configuration Correct configuration
ISAKMP policy, encryption key mismatch crypto isakmp policy 10
encr aes
crypto isakmp policy 10
encr 3des
2 ISAKMP policy, Hash algorithm mismatch crypto isakmp policy 10
Hash sha
crypto isakmp policy 10
Hash md5 
3 Wrong ISAKMP peer crypto isakmp key cisco address ipv6 2001:DB8::3/128 crypto isakmp key cisco address ipv6 2001:DB8::2/128 
4 Wrong ISAKMP key crypto isakmp key cisco1 address ipv6 2001:DB8::2/128 crypto isakmp key cisco address ipv6 2001:DB8::2/128 
5 Wrong tunnel destination interface Tunnel12
tunnel destination 2001:DB8::3
interface Tunnel12
tunnel destination 2001:DB8::2 
6 Wrong tunnel source interface Tunnel12
tunnel source FastEthernet0/1
interface Tunnel12
tunnel source FastEthernet0/0
 

For more details about each case, refer to the offline lab below, you will find an extensive coverage of all important commands along with debug for each case:

Performance testing

Three cases are tested: multicast traffic between R1 and R2 is routed through:

– Physical interfaces (serial connection): MTU=1500 bytes

– IPv6 GRE: MTU=1456 bytes

– IPv6 IPSec VTI: MTU=1391 bytes

The following tests are performed using iperf in GNS3 lab environment, so results are to keep relative.

Picture6: Iperf testing

perfs

References

http://www.faqs.org/rfcs/rfc6226.html

http://tools.ietf.org/html/rfc5059

http://www.cisco.com/en/US/docs/ios/ipv6/configuration/guide/ip6-multicast.html#wp1055997

https://supportforums.cisco.com/docs/DOC-27971

http://www.cisco.com/en/US/docs/ios/ipv6/configuration/guide/ip6-tunnel_external_docbase_0900e4b1805a3c71_4container_external_docbase_0900e4b181b83f78.html

http://www.cisco.com/web/learning/le21/le39/docs/TDW_112_Prezo.pdf

http://networklessons.com/multicast/ipv6-pim-mld-example/

http://www.gogo6.com/profiles/blogs/ietf-discusses-deprecating-ipv6-fragments

http://tools.ietf.org/html/draft-taylor-v6ops-fragdrop-01

https://datatracker.ietf.org/doc/draft-bonica-6man-frag-deprecate

http://blog.initialdraft.com/archives/1648/

Let’s 6rd!


6rd mechanism belongs to the same family as automatic 6to4, in which IPv6 traffic is encapsulated inside IPv4.

The key difference is that with 6rd, Service Providers use their own 6rd prefix and control the transition of their access-aggregation IPv4-only part of their networks to native IPv6. In the same time, SPs transparently provide IPv6 availability service to their customers.

6rd is generally referred as stateless transition mechanism.

Stateless
In stateless mechanisms an algorithm is used to automatically map between addresses, the scope of mechanism is limited to a local domain in which devices, mapping device (6rd BR) and devices that need mapping (6rd CE), share a common elements of the configuration.
Stateful
On a gateway device, we need to specify a specific address or a range of addresses (not used elsewhere) that will represent another range of addresses.
For example IPv4 NAT on Cisco (NAT44):
ip nat source …
The router relies on the configured statement which address (all bits) to translate to which address (all bits). Which is done independently of devices whose address needs to be translated (inside local/outside global).
For redundancy we need additional configuration to synchronize connection state information between devices. for example SNAT(Stateful NAT failover).

Customer CE routers generate their own IPv6 from the delegated 6rd prefix from BRs (Border relays).


Both CEs and BRs encapsulate IPv6 traffic into IPv4 traffic by automatically reconstructing the header IPv4 addresses from IPv6.


  • Lab topology

top1

For end-to-end testing I am using Ubunu Server version for client host behind CE and Internet host.

Here, is a brief and I hope concise explanation of the main 6rd operations:


6rd configuration

6rd address planning depends on each SP. IPv4 bits must be unique to each CE to show the flexibility of the configuration, I fixed the first 16 bits (10.1) as prefix and the last octets (.1) as suffix and attributed the third octet to CEs.

6rd domain configured parameters:

Tunnel source interface fa0/0
6rd prefix 2001:DEAD::/32
IPv4 prefix length 16
IPv4 bits 8
IPv4 suffix 8
Tunnel source interface IP 10.1.4.1

BR1

ipv6 general-prefix 6RD-PREFIX 6rd Tunnel0
!
interface Tunnel0
ipv6 address 6RD-PREFIX 2001:DEAD::/128 anycast
ipv6 enable

tunnel source FastEthernet0/0
tunnel mode ipv6ip 6rd
tunnel 6rd ipv4 prefix-len 16 suffix-len 8


tunnel 6rd prefix 2001:DEAD::/32

interface FastEthernet0/0

ip address 10.1.4.1 255.255.0.0

We need a couple of static routes to make 6rd work in lab conditions; generally, BR announces client assigned IPv4 to clients to Internet.

  • Default ipv4 static route to outside
  • Static route to SP 6rd prefix pointing to the tunnel
  • Default ipv6 static route to outside
ip route 0.0.0.0 0.0.0.0 192.168.20.100
ipv6 route 2001:DEAD::/32 Tunnel0
ipv6 route ::/0 2001:DB9:5AB::100

CE1

The same 6rd parameters are configured on CE:

  • IPv4 affixes
  • 6rd domain global prefix
  • BR IPv4 address (remote tunnel end point)
interface Tunnel0
ipv6 enable
tunnel source 10.1.1.1
tunnel mode ipv6ip 6rd

tunnel 6rd ipv4 prefix-len 16 suffix-len 8

tunnel 6rd prefix 2001:DEAD::/32

tunnel 6rd br 10.1.4.1
!

interface FastEthernet0/0

ip address 10.1.1.1 255.255.0.0

ipv6 enable

!

interface FastEthernet0/1

ip address 192.168.10.100 255.255.255.0


ipv6 address 6RD-PREFIX ::/64 eui-64 ! * <<<

* Note the CE WAN interface fa0/0 is only enabled for IPv6 to be attributed a link-local address.

Fa0/0 IPv4 address is generally assigned by IPv4 DHCP. If the ISP assigns private addresses, CGN NAT44 is needed at the BR to translate them into global IPv4.

6rd prefix is delegated not to CE fa0/0 WAN interface but CE inside LAN interface fa0/1.

This way the customer LAN can benefit directly from the globally IPv6 address without interrupting IPv6 address continuity and the same prefix can be assigned to client IPv6 network using SLAAC (stateless auto configuration).

A recursive (output interface + next-hop) IPv6 default route points to the BR tunnel interface.

ipv6 route ::/0 Tunnel0 2001:DEAD:400::1

Debugging 6rd tunnel

CE1

CE1#debug tunnel
Tunnel Interface debugging is on
CE1#
Tunnel0: IPv6/IP adjacency fixup, 10.1.1.1->10.1.4.1, tos set to 0x0
Tunnel0: IPv6/IP (PS) to decaps 10.1.4.1->10.1.1.1 (tbl=0, “default”, len=124, ttl=254)
Tunnel0: decapsulated IPv6/IP packet (len 124)

BR1

BR1#debug tunnel
Tunnel Interface debugging is on
BR1#
Tunnel0: IPv6/IP to classify 10.1.1.1->10.1.4.1 (tbl=0,”default” len=124 ttl=254 tos=0x0) ok, oce_rc=0x0
Tunnel0: IPv6/IP adjacency fixup, 10.1.4.1->10.1.1.1, tos set to 0x0
BR1#

As shown by the debug, the end-to-end IPv6 traffic is encapsulated into IPv4 packets between CE and BR.

$iperf -u -t -i1 -V -c 2001:db9:5ab::10 -b 10K
WARNING: delay too large, reducing from 1.2 to 1.0 seconds.
————————————————————
Client connecting to 2001:db9:5ab::100, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 112 KByte (default)
————————————————————
[ 3] local 2001:dead:100:0:a00:27ff:fe0f:20e9 port 39710 connected with 2001:db9:5ab::100 port 5001

[ ID] Interval Transfer Bandwidth

[ 3] 0.0- 1.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 1.0- 2.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 2.0- 3.0 sec 4.00 GBytes 34.4 Gbits/sec

[ 3] 3.0- 4.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 4.0- 5.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 5.0- 6.0 sec 4.00 GBytes 34.4 Gbits/sec

[ 3] 6.0- 7.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 7.0- 8.0 sec 4.00 GBytes 34.4 Gbits/sec

[ 3] 8.0- 9.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 9.0-10.0 sec 4.00 GBytes 34.4 Gbits/sec

[ 3] 0.0-11.0 sec 16.0 GBytes 12.5 Gbits/sec

[ 3] Sent 11 datagrams

read failed: Connection refused

[ 3] WARNING: did not receive ack of last datagram after 4 tries.

Following, is a wireshark traffic capture of the previous iperf testing

6rd-iperf-wireshark

Verification commands

BR1:

BR1#sh tunnel 6rd tunnel 0
Interface Tunnel0:
Tunnel Source: 10.1.4.1
6RD: Operational, V6 Prefix: 2001:DEAD::/32
V4 Prefix, Length: 16, Value: 10.1.0.0
V4 Suffix, Length: 8, Value: 0.0.0.1
General Prefix: 2001:DEAD:400::/40
BR1#
BR1#sh tunnel 6rd destination 2001:dead:100:: tunnel0
Interface: Tunnel0
6RD Prefix: 2001:DEAD:100::
Destination: 10.1.1.1
BR1#
BR1#sh tunnel 6rd prefix 10.1.1.1 tunnel 0
Interface: Tunnel0
Destination: 10.1.1.1
6RD Prefix: 2001:DEAD:100::
BR1#

CE1:

CE1#sh tunnel 6rd tunnel 0
Interface Tunnel0:
Tunnel Source: 10.1.1.1
6RD: Operational, V6 Prefix: 2001:DEAD::/32
V4 Prefix, Length: 16, Value: 10.1.0.0
V4 Suffix, Length: 8, Value: 0.0.0.1
Border Relay address: 10.1.4.1
General Prefix: 2001:DEAD:100::/40

CE1#

CE1#sh tunnel 6rd destination 2001:dead:100:: tunnel0
Interface: Tunnel0
6RD Prefix: 2001:DEAD:100::
Destination: 10.1.1.1
CE1#
CE1#sh tunnel 6rd prefix 10.1.4.1 tunnel 0
Interface: Tunnel0
Destination: 10.1.4.1
6RD Prefix: 2001:DEAD:400::
CE1#

We can use “mtr” command to check the performance of the end-to-end (linux-to-linux) communication.

router@router1:~$ mtr 2001:db9:5ab::100
HOST: router1 Loss% Snt Last Avg Best Wrst StDev
1.|– 2001:dead:100:0:c801:3dff 0.0% 30 27.7 25.2 9.7 34.4 5.9
2.|– 2001:db9:5ab::1 0.0% 30 181.3 126.2 99.1 181.3 19.3
3.|– 2001:db9:5ab::100 0.0% 30 67.3 82.8 67.3 121.6 14.1
router@router1:~$

Customer internal network


6rd and MTU

The default MTU on IOS is 1480 bytes, so the maximum IPv4 packet size encapsulating IPv6 is 1500 bytes.

userver1 end-to-end MTU

router@router1:~$ tracepath6 2001:db9:5ab::100
1?: [LOCALHOST] 0.051ms pmtu 1500
1: 2001:dead:100:0:c801:3dff:fe5c:6 27.130ms
1: 2001:dead:100:0:c801:3dff:fe5c:6 57.536ms
2: 2001:dead:100:0:c801:3dff:fe5c:6 30.005ms pmtu 1480
2: 2001:db9:5ab::1 135.158ms
3: 2001:db9:5ab::100 79.603ms reached

Resume: pmtu 1480 hops 3 back 253

router@router1:~$

Here is an animation explaining 6rd and fragmentation:

MTU recommendations

  • Using a redundant BR, there is no guarantee that traffic will be handled by the same BR, so fragmented packets are lost between BRs è BR anycast IPv4 + IPv4 fragmentation is not recommended.
  • Configure the same IPv4 MTU everywhere within the IPv4 segment and (DF=1) to disable fragmentation.
    • make sure the IPv4 MTU is coordinated with IPv6 MTU (IPv4 MTU < IPv6 MTU + 20 bytes)
  • Enable PMTUD to choose the smallest MTU in the path of CE-to-BR communication.
  • DO NOT Filter ICMP messages “Packet Too Big” and “Destination Unreachable” at routers and end-hosts, they provide inf. about transport issues, worse than traffic black hole is a silent traffic black hole.

Offline Lab

Finally, the offline lab with comprehensive command output during the lab:

%d bloggers like this: