Load balancing GRE tunneled multicast traffic
July 26, 2010
This lab is part of a bundle illustrating different methods of load balancing multicast traffic over multiple equal-cost paths. For each individual lab the topology may be slightly modified to match the appropriate conditions:
- Direct multicast Load Balancing through ECMP.
- Load balancing GRE tunneled multicast traffic.
- Load Balancing Multicast traffic through EtherChannel.
- Load Balancing Multicast traffic through MLP.
Instead of load balancing multicast directly over ECMP (Equal-Cost Multi-Path), it is possible to offload the load sharing to unicast traffic processing by encapsulating the multicast into GRE tunnels, in such a way that the multipath topology (ramification nodes and parallel paths) is aware only of the unicast tunnel sessions.
Outline
Configuration
Routing
CEF
Tunneling and RPF check
Configuration check:
Layer3 Load balancing
Testing unicast CEF Load sharing
Simulate a path failure with each of the three paths
Increasing the number of sessions
Conclusion
Picture1 illustrates the topology used here; note the two ramification nodes R10 and R1 delimiting three parallel paths.
Picture1: Lab topology
Configuration
Routing
The routing protocol deployed is EIGRP; by default it allows four equal-cost paths (up to six are configurable). If needed, use the "maximum-paths <max>" command to allow more.
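As a reminder, the relevant EIGRP knobs look like this (a generic sketch; the AS number and value shown are illustrative, not taken from the lab configs):

```
router eigrp 50
 no auto-summary
 ! allow up to six equal-cost paths instead of the default four
 maximum-paths 6
```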
R2#sh ip pro
Routing Protocol is "eigrp 10"
…
  Automatic network summarization is not in effect
  Maximum path: 4
  Routing for Networks:
…
The same autonomous system 50 is used everywhere with auto-summarization disabled and advertisement of the directly connected segments.
CEF
CEF is a key feature here, because load balancing at the data plane uses the FIB, which is derived directly from the control-plane RIB, together with the adjacency table.
CEF allows:
Per-destination load-sharing
- The default for CEF
- Balances individual (src, dst) sessions over multiple paths
- More appropriate for a LARGE number of (src, dst) sessions
Per-packet load-sharing (round-robin distribution)
- Balances individual packets of a given session over multiple paths
- Not recommended for real-time traffic such as VoIP because of packet re-ordering.
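If per-packet load sharing were ever needed, it is configured per interface; a minimal sketch (the interface name is illustrative, not from the lab):

```
ip cef
!
interface Vlan102
 ! override the CEF default (per-destination) on this interface;
 ! beware of packet re-ordering for real-time traffic
 ip load-sharing per-packet
```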
Tunneling and RPF check:
– The unicast routing protocol (EIGRP) processes the tunnel outer header, whose IP addresses come from the interfaces used as tunnel source and destination; that is why only these subnets are advertised.
– The multicast routing protocol (PIM) processes the tunnel inner header; for that reason PIM must be enabled on the tunnel interface itself, not on the tunnel source/destination interfaces.
– These two routing levels and their corresponding interfaces must be strictly separated to avoid RPF failures.
Tunneling:
On the router "vhost_member", the single physical interface fa0/0 cannot be used as the source of all the tunnels, because incoming traffic is de-multiplexed to the appropriate tunnel based on the tunnel source; that is why three loopbacks are created, one as the source for each tunnel interface.
GRE tunnels are formed between the multicast source routers (SRC1, SRC2 and SRC3) and the last-hop PIM router "vhost_member".
First tunnel:

                    SRC1 router    vhost_member router
tunnel interface    Tunnel1        Tunnel1
ip                  1.1.10.1/24    1.1.10.6/24
tunnel source       fa0/0          Loopback1
tunnel destination  6.6.6.10       10.0.1.1
mode                GRE            GRE
Second tunnel:

                    SRC2 router    vhost_member router
tunnel interface    Tunnel1        Tunnel2
ip                  1.1.20.2/24    1.1.20.6/24
tunnel source       fa0/0          Loopback2
tunnel destination  6.6.6.20       10.0.2.2
mode                GRE            GRE
Third tunnel:

                    SRC3 router    vhost_member router
tunnel interface    Tunnel1        Tunnel3
ip                  1.1.30.3/24    1.1.30.6/24
tunnel source       fa0/0          Loopback3
tunnel destination  6.6.6.30       10.0.3.3
mode                GRE            GRE
Picture2: tunneling
SRC1 router :
interface Tunnel1
 ip address 1.1.10.1 255.255.255.0
 tunnel source FastEthernet0/0
 tunnel destination 6.6.6.10
 tunnel mode gre ip
SRC2 router :
interface Tunnel1
 ip address 1.1.20.2 255.255.255.0
 tunnel source FastEthernet0/0
 tunnel destination 6.6.6.20
 tunnel mode gre ip
SRC3 router :
interface Tunnel1
 ip address 1.1.30.3 255.255.255.0
 tunnel source FastEthernet0/0
 tunnel destination 6.6.6.30
 tunnel mode gre ip
vhost_member (Multicast last hop):
interface Tunnel1
 ip address 1.1.10.6 255.255.255.0
 tunnel source Loopback1
 tunnel destination 10.0.1.1
!
interface Tunnel2
 ip address 1.1.20.6 255.255.255.0
 tunnel source Loopback2
 tunnel destination 10.0.2.2
!
interface Tunnel3
 ip address 1.1.30.6 255.255.255.0
 tunnel source Loopback3
 tunnel destination 10.0.3.3
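For completeness, the three loopbacks used as tunnel sources on "vhost_member" would carry the addresses that the SRC routers point their tunnel destinations at; a sketch inferred from the tunnel tables above (not taken verbatim from the lab configs):

```
interface Loopback1
 ip address 6.6.6.10 255.255.255.255
!
interface Loopback2
 ip address 6.6.6.20 255.255.255.255
!
interface Loopback3
 ip address 6.6.6.30 255.255.255.255
```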
Multicast source routers
Enable PIM ONLY on the tunnel interfaces:
SRC3:
ip multicast-routing
!
interface Tunnel1
 ip pim dense-mode
SRC2:
ip multicast-routing
!
interface Tunnel1
 ip pim dense-mode
SRC1:
ip multicast-routing
!
interface Tunnel1
 ip pim dense-mode
Router “vhost_member”:
ip multicast-routing
!
interface Tunnel1
 ip pim dense-mode
 ip igmp join-group 239.1.1.1
!
interface Tunnel2
 ip pim dense-mode
 ip igmp join-group 239.2.2.2
!
interface Tunnel3
 ip pim dense-mode
 ip igmp join-group 239.3.3.3
PIM Check:
Gmembers#sh ip pim interface

Address          Interface  Ver/   Nbr    Query  DR     DR
                            Mode   Count  Intvl  Prior
1.1.10.6         Tunnel1    v2/D   1      30     1      0.0.0.0
1.1.20.6         Tunnel2    v2/D   1      30     1      0.0.0.0
1.1.30.6         Tunnel3    v2/D   1      30     1      0.0.0.0
Gmembers#
The multicast sources (on the incoming interfaces) are reachable through the same tunnel interfaces (the unicast routing outbound interfaces) from which the multicast is received:
Gmembers#sh ip route
…
Gateway of last resort is not set

     1.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
D       1.1.1.1/32 [90/156160] via 10.0.0.21, 00:37:18, FastEthernet0/0
C       1.1.10.0/24 is directly connected,
C       1.1.20.0/24 is directly connected,
C       1.1.30.0/24 is directly connected,
…

Gmembers#sh ip mroute
IP Multicast Routing Table
…
(1.1.10.1, 239.1.1.1), 00:02:29/00:02:58, flags: LT
  Incoming interface: Tunnel1, RPF nbr 0.0.0.0
  Outgoing interface list:
    Tunnel2, Forward/Dense, 00:02:29/00:00:00
    Tunnel3, Forward/Dense, 00:02:29/00:00:00
…
(1.1.20.2, 239.2.2.2), 00:02:11/00:02:58, flags: LT
  Incoming interface: Tunnel2, RPF nbr 0.0.0.0
  Outgoing interface list:
    Tunnel1, Forward/Dense, 00:02:11/00:00:00
    Tunnel3, Forward/Dense, 00:02:11/00:00:00
…
(1.1.30.3, 239.3.3.3), 00:01:54/00:02:59, flags: LT
  Incoming interface: Tunnel3, RPF nbr 0.0.0.0
  Outgoing interface list:
    Tunnel1, Forward/Dense, 00:01:54/00:00:00
    Tunnel2, Forward/Dense, 00:01:54/00:00:00
Gmembers#
Depending on the complexity of your topology you may need to route statically or dynamically some subnets through the tunnel.
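For example, a static route pinning a receiver subnet to the first tunnel could look like this (the subnet is purely illustrative, not part of the lab):

```
! 192.168.10.0/24 is a hypothetical subnet used for illustration
ip route 192.168.10.0 255.255.255.0 Tunnel1
```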
Configuration check:
First, let's start the multicast traffic:
SRC3#p 239.3.3.3 repeat 1000
Type escape sequence to abort.
Sending 1000, 100-byte ICMP Echos to 239.3.3.3, timeout is 2 seconds:

Reply to request 0 from 1.1.30.6, 84 ms
Reply to request 1 from 1.1.30.6, 84 ms
…
SRC2#ping 239.2.2.2 repeat 1000
Type escape sequence to abort.
Sending 1000, 100-byte ICMP Echos to 239.2.2.2, timeout is 2 seconds:

Reply to request 0 from 1.1.20.6, 128 ms
Reply to request 1 from 1.1.20.6, 104 ms
…
SRC1#p 239.1.1.1 repeat 1000
Type escape sequence to abort.
Sending 1000, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:

Reply to request 0 from 1.1.10.6, 120 ms
Reply to request 1 from 1.1.10.6, 140 ms
…
From the multicast point of view (source and members), traffic is forwarded through distinct point-to-point links, the tunnels.
Picture3: Logical multicast topology
Note that the multicast path is not aware of any topology outside of the tunnels through which it is advertised:
Gmembers#mtrace 1.1.10.1 1.1.10.6
Type escape sequence to abort.
Mtrace from 1.1.10.1 to 1.1.10.6 via RPF
From source (?) to destination (?)
Querying full reverse path…
 0  1.1.10.6
-1  1.1.10.6 PIM  [1.1.10.0/24]
-2  1.1.10.1
Gmembers#
Gmembers#mtrace 1.1.20.2 1.1.20.6
Type escape sequence to abort.
Mtrace from 1.1.20.2 to 1.1.20.6 via RPF
From source (?) to destination (?)
Querying full reverse path…
 0  1.1.20.6
-1  1.1.20.6 PIM  [1.1.20.0/24]
-2  1.1.20.2
Gmembers#
Gmembers#mtrace 1.1.30.6 1.1.30.3
Type escape sequence to abort.
Mtrace from 1.1.30.6 to 1.1.30.3 via RPF
From source (?) to destination (?)
Querying full reverse path…
 0  1.1.30.3
-1  1.1.30.3 PIM  [1.1.30.0/24]
-2  1.1.30.6
Gmembers#
The following picture4 illustrates how the intermediate routers (forming the parallel paths) see only unicast sessions.
Picture4: unicast sessions
Layer3 Load balancing
R10#sh ip route
…
     6.0.0.0/32 is subnetted, 3 subnets
D       6.6.6.10 [90/158976] via 10.0.40.4, 02:13:11, Vlan104
                 [90/158976] via 10.0.30.3, 02:13:11, Vlan103
                 [90/158976] via 10.0.20.2, 02:13:11, Vlan102
D       6.6.6.20 [90/158976] via 10.0.40.4, 02:13:16, Vlan104
                 [90/158976] via 10.0.30.3, 02:13:16, Vlan103
                 [90/158976] via 10.0.20.2, 02:13:16, Vlan102
D       6.6.6.30 [90/158976] via 10.0.40.4, 02:13:17, Vlan104
                 [90/158976] via 10.0.30.3, 02:13:17, Vlan103
                 [90/158976] via 10.0.20.2, 02:13:17, Vlan102

CEF load balancing according to the FIB (from the RIB):

R10#sh ip cef 6.6.6.20 internal
6.6.6.20/32, version 57, epoch 0, per-destination sharing
0 packets, 0 bytes
  via 10.0.40.4, Vlan104, 0 dependencies
    traffic share 1
    next hop 10.0.40.4, Vlan104
    valid adjacency
  via 10.0.30.3, Vlan103, 0 dependencies
    traffic share 1
    next hop 10.0.30.3, Vlan103
    valid adjacency
  via 10.0.20.2, Vlan102, 0 dependencies
    traffic share 1
    next hop 10.0.20.2, Vlan102
    valid adjacency

  0 packets, 0 bytes switched through the prefix
  tmstats: external 0 packets, 0 bytes
           internal 0 packets, 0 bytes
  Load distribution: 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 (refcount 1)

  Hash  OK  Interface   Address      Packets
  1     Y   Vlan104     10.0.40.4    0
  2     Y   Vlan103     10.0.30.3    0
  3     Y   Vlan102     10.0.20.2    0
  4     Y   Vlan104     10.0.40.4    0
  5     Y   Vlan103     10.0.30.3    0
  6     Y   Vlan102     10.0.20.2    0
  7     Y   Vlan104     10.0.40.4    0
  8     Y   Vlan103     10.0.30.3    0
  9     Y   Vlan102     10.0.20.2    0
  10    Y   Vlan104     10.0.40.4    0
  11    Y   Vlan103     10.0.30.3    0
  12    Y   Vlan102     10.0.20.2    0
  13    Y   Vlan104     10.0.40.4    0
  14    Y   Vlan103     10.0.30.3    0
  15    Y   Vlan102     10.0.20.2    0
R10#
So far everything seems to work as expected, but the two ramification routers, R10 and R1, show that the per-destination distribution is not exactly even (picture5).
Observe the unicast path of each session using traceroute:
Gmembers#trace 10.0.1.1 source 6.6.6.10
Type escape sequence to abort.
Tracing the route to 10.0.1.1

  1 10.0.0.21 44 msec 64 msec 24 msec
  2 10.0.0.41 16 msec 24 msec 20 msec
  3 10.0.30.30 20 msec 32 msec 20 msec
  4 10.0.1.1 64 msec * 72 msec
Gmembers#
Gmembers#trace 10.0.2.2 source 6.6.6.20
Type escape sequence to abort.
Tracing the route to 10.0.2.2

  1 10.0.0.21 40 msec 32 msec 16 msec
  2 10.0.0.41 8 msec 88 msec 20 msec
  3 10.0.30.30 80 msec 48 msec 12 msec
  4 10.0.2.2 140 msec * 68 msec
Gmembers#
Gmembers#trace 10.0.3.3 source 6.6.6.30
Type escape sequence to abort.
Tracing the route to 10.0.3.3

  1 10.0.0.21 56 msec 32 msec 24 msec
  2 10.0.0.37 32 msec 120 msec 16 msec
  3 10.0.40.40 56 msec 16 msec 100 msec
  4 10.0.3.3 48 msec * 56 msec
Gmembers#
Picture5: traffic distribution for the three sessions
R10#sh ip cef exact-route 10.0.1.1 6.6.6.10 internal
10.0.1.1 -> 6.6.6.10 : Vlan102 (next hop 10.0.20.2)
Bucket 5 from 15, total 3 paths
R10#sh ip cef exact-route 10.0.2.2 6.6.6.20 internal
10.0.2.2 -> 6.6.6.20 : Vlan102 (next hop 10.0.20.2)
Bucket 5 from 15, total 3 paths
R10#sh ip cef exact-route 10.0.3.3 6.6.6.30 internal
10.0.3.3 -> 6.6.6.30 : Vlan104 (next hop 10.0.40.4)
Bucket 6 from 15, total 3 paths
R10#
R1#sh ip cef exact-route 6.6.6.10 10.0.1.1
6.6.6.10 -> 10.0.1.1 : FastEthernet0/0 (next hop 10.0.0.41)
R1#sh ip cef exact-route 6.6.6.20 10.0.2.2
6.6.6.20 -> 10.0.2.2 : FastEthernet0/0 (next hop 10.0.0.41)
R1#sh ip cef exact-route 6.6.6.30 10.0.3.3
6.6.6.30 -> 10.0.3.3 : FastEthernet2/0 (next hop 10.0.0.37)
R1#
According to the Cisco documentation, per-destination load balancing depends on the statistical distribution of traffic and is more appropriate for a large number of sessions.
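As an aside (not used in this lab), the router's CEF hash seed can be changed, which reshuffles the (src, dst)-to-bucket mapping and may yield a different spread for the same small set of sessions:

```
! the ID value is illustrative; each ID seeds the universal
! hash differently, changing which bucket a session lands in
ip cef load-sharing algorithm universal 99
```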
Resisting the confirmation bias (the temptation to accept this explanation as-is to justify the results), I decided to conduct a series of tests and see what they lead to:
1) – Simulate a path failure with each of the three paths.
2) – Progressively increase the number of sessions.
Three paths are available for each destination prefix used by the tunnels.
Testing unicast CEF Load sharing:
1) – Simulate a path failure on each of the three paths.
Normal situation with no failures:
R10#sh ip cef exact-route 10.0.1.1 6.6.6.10
10.0.1.1 -> 6.6.6.10 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.20
10.0.2.2 -> 6.6.6.20 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.30
10.0.3.3 -> 6.6.6.30 : Vlan104 (next hop 10.0.40.4)
R10#
Picture6: NO failure
Situation with R3 failure:
R10#sh ip cef exact-route 10.0.1.1 6.6.6.10
10.0.1.1 -> 6.6.6.10 : Vlan104 (next hop 10.0.40.4)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.20
10.0.2.2 -> 6.6.6.20 : Vlan104 (next hop 10.0.40.4)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.30
10.0.3.3 -> 6.6.6.30 : Vlan102 (next hop 10.0.20.2)
R10#
Picture7: failure of R3 path
Situation with R2 failure:
R10#sh ip cef exact-route 10.0.1.1 6.6.6.10
10.0.1.1 -> 6.6.6.10 : Vlan104 (next hop 10.0.40.4)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.20
10.0.2.2 -> 6.6.6.20 : Vlan104 (next hop 10.0.40.4)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.30
10.0.3.3 -> 6.6.6.30 : Vlan103 (next hop 10.0.30.3)
R10#
Picture8: failure of R2 path
Situation with R4 failure:
R10#sh ip cef exact-route 10.0.1.1 6.6.6.10
10.0.1.1 -> 6.6.6.10 : Vlan103 (next hop 10.0.30.3)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.20
10.0.2.2 -> 6.6.6.20 : Vlan103 (next hop 10.0.30.3)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.30
10.0.3.3 -> 6.6.6.30 : Vlan102 (next hop 10.0.20.2)
R10#
Picture9: failure of R4 path
2) – Increasing the number of sessions:
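Each additional session is simply a new (source, destination) unicast pair crossing R10; for instance, the 4th session (10.0.2.2 to 6.6.6.30) could be generated with an extended ping, a sketch of how it might be created (the exact method is not shown in the lab):

```
SRC2#ping 6.6.6.30 source 10.0.2.2 repeat 1000
```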
With 4 sessions:
R10#sh ip cef exact-route 10.0.1.1 6.6.6.10
10.0.1.1 -> 6.6.6.10 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.20
10.0.2.2 -> 6.6.6.20 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.30
10.0.3.3 -> 6.6.6.30 : Vlan104 (next hop 10.0.40.4)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.30
10.0.2.2 -> 6.6.6.30 : Vlan102 (next hop 10.0.20.2)
R10#
Picture10: Distribution with a 4th session between 10.0.2.2 and 6.6.6.30
With 5 sessions:
R10#sh ip cef exact-route 10.0.1.1 6.6.6.10
10.0.1.1 -> 6.6.6.10 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.20
10.0.2.2 -> 6.6.6.20 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.30
10.0.3.3 -> 6.6.6.30 : Vlan104 (next hop 10.0.40.4)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.30
10.0.2.2 -> 6.6.6.30 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.10
10.0.3.3 -> 6.6.6.10 : Vlan103 (next hop 10.0.30.3)
R10#
Picture11: Distribution with a 5th session between 10.0.3.3 and 6.6.6.10
With 6 sessions:
R10#sh ip cef exact-route 10.0.1.1 6.6.6.10
10.0.1.1 -> 6.6.6.10 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.20
10.0.2.2 -> 6.6.6.20 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.30
10.0.3.3 -> 6.6.6.30 : Vlan104 (next hop 10.0.40.4)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.30
10.0.2.2 -> 6.6.6.30 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.10
10.0.3.3 -> 6.6.6.10 : Vlan103 (next hop 10.0.30.3)
R10#sh ip cef exact-route 10.0.1.1 6.6.6.20
10.0.1.1 -> 6.6.6.20 : Vlan103 (next hop 10.0.30.3)
R10#
Picture12: Distribution with a 6th session between 10.0.1.1 and 6.6.6.20
With 7 sessions:
R10#sh ip cef exact-route 10.0.1.1 6.6.6.10
10.0.1.1 -> 6.6.6.10 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.20
10.0.2.2 -> 6.6.6.20 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.30
10.0.3.3 -> 6.6.6.30 : Vlan104 (next hop 10.0.40.4)
R10#sh ip cef exact-route 10.0.2.2 6.6.6.30
10.0.2.2 -> 6.6.6.30 : Vlan102 (next hop 10.0.20.2)
R10#sh ip cef exact-route 10.0.3.3 6.6.6.10
10.0.3.3 -> 6.6.6.10 : Vlan103 (next hop 10.0.30.3)
R10#sh ip cef exact-route 10.0.1.1 6.6.6.20
10.0.1.1 -> 6.6.6.20 : Vlan103 (next hop 10.0.30.3)
R10#sh ip cef exact-route 10.0.1.1 6.6.6.30
10.0.1.1 -> 6.6.6.30 : Vlan104 (next hop 10.0.40.4)
R10#
Picture13: Distribution with a 7th session between 10.0.1.1 and 6.6.6.30
Conclusion
The results of the tests confirm that, with destination-based CEF load sharing, the more sessions there are, the better the load distribution.