Deploying F5 BIG-IP LTM VE within GNS3 (part-1)


One of the advantages of deploying VMware (or VirtualBox) machines inside GNS3 is the rich networking environment it provides. There is no need to worry about interface types: vmnet or private? Shared or ad-hoc?

In GNS3 it is as simple and intuitive as choosing a node interface and connecting it to any other node interface.

In this lab, we are testing a basic F5 BIG-IP LTM VE deployment within GNS3. The only virtual machine used in this lab is the F5 BIG-IP; all other devices are Docker containers:

  • Nginx Docker containers for the internal web servers.
  • An ab (Apache Benchmark) Docker container as the client used for performance testing.
  • gns3/webterm containers used as Firefox browsers for client testing and F5 web management.

 

Outline:

  1. Docker image import
  2. F5 Big-IP VE installation and activation
  3. Building the topology
  4. Setting F5 Big-IP interfaces
  5. Connectivity check between devices
  6. Load balancing configuration
  7. Generating client http queries
  8. Monitoring Load balancing

Devices used:

Environment:

  • Debian host GNU/Linux 8.5 (jessie)
  • GNS3 version 1.5.2 on Linux (64-bit)

System requirements:

  • F5 BIG-IP VE requires 2 GB of RAM (>= 8 GB recommended)
  • VT-x / AMD-V support


 

1. Docker image import

Create a new Docker template in GNS3: Edit > Preferences > Docker > Docker containers, then “New”.

Choose the “New image” option and type the Docker image name in the format given in the “devices used” section (<account>/<repo>), then choose a name (without the slash “/”).

Note:

By default, GNS3 derives a name in the same format as the Docker registry (<account>/<repo>), which can cause an error in some earlier versions. In the latest GNS3 versions, the slash “/” is removed from the derived name.


Installing gns3/openvswitch:

selection_293
selection_278

Set the number of interfaces to eight and accept the default parameters with “Next” until “Finish”.

– Repeat the same procedure for gns3/webterm

selection_372

Choose a name for the image (without slash “/”)

selection_340

Choose vnc as the console type to allow Firefox (GUI) browsing

selection_373

And keep the remaining default parameters.

– Repeat the same procedure for the image ajnouri/nginx.

Create a new image with name ajnouri/nginx

selection_339

Name it as you like

selection_340

And keep the remaining default parameters.

2. F5 Big-IP VE installation and activation

 

– Download the file BIG-11.3.0.0-scsi.ova from the F5 site: https://www.f5.com/trial/big-ip-ltm-virtual-edition.php

selection_341

You’ll have to request a registration key for the trial version, which you’ll receive by email.

selection_342

Open the .ova file in VMware Workstation.

– To register the trial version, bridge the first interface to your network connected to the Internet.

selection_343

– Start the VM and log in to the console with root/default

selection_344

– Type “config” to access the text user interface.

selection_345

– Choose “No” to configure a static IP address, a mask and a default gateway in the same subnet as the bridged network, or “Yes” if you have a DHCP server and want to get a dynamic IP.

– Check the interface IP.

selection_346

– Ping an Internet host (ex: gns3.com) to verify connectivity and name resolution.

– Browse to the BIG-IP interface IP and accept the certificate.

selection_347

Use admin/admin for login credentials

selection_348

Enter the key you received by email in the Base Registration Key field, click “Next”, wait a couple of seconds for activation and you are good to go.

selection_349
Now you can shut down the F5 VM.

 

3. Building the topology

 

Importing F5 VE Virtual machine to GNS3

From GNS3 “Preferences”, import a new VMware virtual machine.

selection_350

Choose the BIG-IP VM we have just installed and activated.

selection_351

Make sure to set a minimum of 3 adapters and to allow GNS3 to use any of the VM interfaces.

selection_352

Topology

Now we can build our topology using BIG-IP VM and the Docker images installed.

Below is an example topology into which we import the F5 VM and a few containers.

Internal:

– 3 nginx containers

– 1 Openvswitch

Management:

– GUI browser webterm container

– 1 Openvswitch

– Cloud mapped to host interface tap0

External:

– Apache Benchmark (ab) container

– GUI browser webterm container

selection_353

Notice that the BIG-IP VM interface e0, the one previously bridged to the host network, is now connected to a browser container for management.

I attached the host interface “tap0” to the management switch because, for some reason, ARP doesn’t work on that segment without it.

 

Address configuration:

– Assign each of the nginx containers an IP in a subnet of your choice (ex: 192.168.10.0/24)

selection_354

In the same manner:

192.168.10.2/24 for ajnouri/nginx-2

192.168.10.3/24 for ajnouri/nginx-3
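If you prefer setting the addresses from each container’s console rather than through the GNS3 node editor, a sketch for nginx-1 (adjust the address per container; eth0 is assumed to be the interface wired to the internal switch):

```shell
# Assign the internal address to nginx-1 (run inside the container).
# Note: settings made this way are lost on restart unless written to
# /etc/network/interfaces, which GNS3 lets you edit per node.
ip addr add 192.168.10.1/24 dev eth0
ip link set eth0 up
```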

 

On all three nginx containers, start nginx and php servers:

service php5-fpm start

service nginx start

 

– Assign an IP to the management browser in the same subnet as BIG-IP management IP

192.168.0.222/24 for gns3/webterm-2

– Assign addresses, default gateway and DNS server to the ab and webterm-1 containers

selection_355

selection_356

And make sure both client devices resolve the host ajnouri.local to the BIG-IP address 192.168.20.254:

echo "192.168.20.254   ajnouri.local" >> /etc/hosts

– The Openvswitch containers don’t need any configuration; each acts like a single VLAN.

– Start the topology

 

4. Setting F5 Big-IP interfaces

 

To manage the load balancer from webterm-2, open a console to the container; this opens Firefox from inside the container.

selection_357

Browse to the VM management IP https://192.168.0.151, add an exception for the certificate and log in with the F5 BIG-IP default credentials admin/admin.

selection_358

Go through the initial configuration steps

– You will have to set the hostname (ex: ajnouri.local) and change the root and admin account passwords.

selection_359

You will be logged out for the password changes to take effect; log back in.

– For the purposes of this lab, neither redundancy nor high availability is configured.
selection_360

– Now you will have to configure internal (real servers) and external (client side) vlans and associated interfaces and self IPs.

(Self IPs are the equivalent of VLAN interface IP in Cisco switching)

Internal VLAN (connected to web servers):

selection_361

External VLAN (facing clients):

selection_362

5. Connectivity check between devices

 

Now make sure you have successful connectivity from each container to the corresponding Big-IP interface.

Ex: from ab container

selection_363

Ex: from nginx-1 server container

selection_364

selection_365

The interface connected to your host network will get its IP parameters (address, gateway and DNS) from your DHCP server.

 

6. Load balancing configuration

 

Back to the browser webterm-2

For BIG-IP to load balance http requests from client to the servers, we need to configure:

  • Virtual Server: the single entry point visible to clients
  • Pool: associated with the virtual server; contains the list of real web servers to balance between
  • Algorithm: used to load balance between the members of the pool

– Create a pool of web servers “Pool0” with “Round Robin” as the algorithm and http as the health monitor for the members.

selection_366

– Associate the virtual server “VServer0” with the pool “Pool0”.

selection_367
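The same objects can also be created from the BIG-IP command line (tmsh) instead of the GUI; a sketch assuming the lab’s addresses and names (Pool0, VServer0, port 80):

```text
# Hypothetical tmsh equivalents of the GUI steps above (v11.x syntax):
tmsh create ltm pool Pool0 \
    members add { 192.168.10.1:80 192.168.10.2:80 192.168.10.3:80 } \
    monitor http load-balancing-mode round-robin
tmsh create ltm virtual VServer0 \
    destination 192.168.20.254:80 ip-protocol tcp \
    profiles add { http } pool Pool0
```

You can then verify the pool state with `tmsh show ltm pool Pool0 members`.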

Check the network map to verify that everything is configured correctly and that monitoring shows everything OK (green).

selection_368

From the client container webterm-1, you can start a Firefox browser (console to the container) and test the server name “ajnouri.local”.

selection_369

If everything is OK, you’ll see the PHP page showing the real server IP used, the client IP and the DNS name used by the client.

Every time you refresh the page, you’ll see a different server IP.

 

7. Performance testing

 

With the Apache Benchmark container ajnouri/ab, we can generate client requests to the load balancer virtual server by its hostname (ajnouri.local).

Let’s open an aux console to the ajnouri/ab container and generate 50,000 requests with 200 concurrent connections to the URL ajnouri.local/test.php:

ab -n 50000 -c 200 ajnouri.local/test.php

selection_370

 

8. Monitoring load balancing

 

Monitoring the load balancer shows a peak of connections corresponding to the Apache Benchmark generated requests.

selection_371


In the upcoming part-2, the 3 web server containers are replaced with a single container in which we can spawn as many servers as we want (Docker-in-Docker), and we test a custom Python client script container that generates HTTP traffic from spoofed IP addresses, as opposed to Apache Benchmark, which generates traffic from a single source IP.


IOS server load balancing with mininet server farm


The idea is to play with the IOS load balancing mechanism using a large number of “real” servers (50 servers), and observe the difference in behavior between different load balancing algorithms.

Due to resource scarcity in the lab environment, I use mininet to emulate “real” servers.

I will stick to the general definition for load balancing:

A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. Load balancers are used to increase capacity (concurrent users) and reliability of applications.

The publicly announced IP address of the server is called the Virtual IP (VIP). Behind the scenes, the services are provided not by a single server but by a cluster of “real” servers, with their real IPs (RIP) hidden from the outside world.

The load balancer, IOS SLB in our case, distributes user connections sent to the VIP to the real servers according to the load balancing algorithm.

Figure1: Generic load balancing

 

Figure2: High-Level-Design Network topology

 

Load balancing algorithms:

The tested load balancing algorithms are:

  • Weighted Round Robin (with equal weights for all real servers): new connections to the virtual server are directed to the real servers equally, in a circular fashion (default weight = 8 for all servers).
  • Weighted Round Robin (with unequal weights): new connections to the virtual server are directed to the real servers proportionally to their weights.
  • Weighted Least Connections: new connections to the virtual server are directed to the real server with the fewest active connections.
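The three algorithms above can be sketched in a few lines of Python (a simplified model, not the actual IOS implementation; a real WRR scheduler interleaves servers rather than repeating each one back-to-back):

```python
import itertools

def weighted_round_robin(servers):
    """servers: list of (name, weight) pairs. Yields real servers
    circularly, proportionally to their weight (IOS default weight: 8).
    Simplified: repeats each server `weight` times per cycle."""
    expanded = [name for name, weight in servers for _ in range(weight)]
    return itertools.cycle(expanded)

def least_connections(active):
    """active: dict real_server -> number of active connections.
    Returns the server currently holding the fewest connections."""
    return min(active, key=active.get)

# 10.0.0.1 carries twice the weight, so it gets twice the connections:
rr = weighted_round_robin([("10.0.0.1", 2), ("10.0.0.2", 1)])
picks = [next(rr) for _ in range(6)]
assert picks.count("10.0.0.1") == 4
assert least_connections({"10.0.0.1": 7, "10.0.0.2": 3}) == "10.0.0.2"
```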

 

Session redirection modes:

  • Dispatched: the VIP is configured on ALL real servers as a loopback or secondary address. Real servers are layer-2 adjacent to the SLB, which redirects traffic to them at the MAC layer.
  • Directed: the VIP can be unknown to the real servers; NAT translates VIP => RIP. No FTP/firewall support; server NAT is supported for ESP/GRE virtual servers.
  • Server NAT: the VIP is translated to the RIP and vice versa. Real servers are not required to be directly connected.
  • Client NAT: used with multiple SLBs. Replaces the client IP with one of the SLB IPs to guarantee that returning traffic is handled by the same SLB.
  • Static NAT: traffic from real servers responding to clients uses static NAT; real servers (ex: on the same Ethernet) use their own IPs.

The lab deploys Directed session redirection with server NAT.

 

IOS SLB configuration:

The configuration of load balancing in Cisco IOS is pretty straightforward:

  • Server farm (required): ip slb serverfarm <serverfarm-name>
  • Load-balancing algorithm (optional): predictor [roundrobin | leastconns]
  • Real server (required): real <ip-address>
  • Enabling the real server for service (required): inservice
  • Virtual server (required): ip slb vserver <virtserver-name>
  • Associating a virtual server with a server farm (required): serverfarm <serverfarm-name>
  • Virtual server attributes (required; specifies the virtual server IP address, type of connection, port number and optional service coupling): virtual <ip-address> {tcp | udp} <port-number> [service <service-name>]
  • Enabling the virtual server for service (required): inservice

 

GNS3 lab topology

The lab runs on GNS3, with a mininet VM providing the server farm and the host generating client traffic.

Figure3: GNS3 topology

 

Building mininet VM server farm

mininet VM preparation:

  • Bridge and attach guest mininet VM interface to the SLB device.
  • Bring up the VM interface, without configuring any IP address.

Routing:

Because I am generating user traffic from the host machine, I need to configure an address on the tap interface and static routes pointing to the GNS3 subnets and the VIP:

sudo ip addr add 192.168.10.121/24 dev tap2
sudo ip route add 192.168.20.0/24 via 192.168.10.201
sudo ip route add 66.66.66.66/32 via 192.168.10.201

mininet python API script:

The script builds the mininet machines, sets their default gateway to the GNS3 IOS SLB device IP and starts a UDP server on port 5555 using the netcat utility:

ip route add default via 10.0.0.254
nc -lu 5555 &

Here is the python mininet API script:

https://github.com/AJNOURI/Software-Defined-Networking/blob/master/Mininet-Scripts/mininet-dc.py

#!/usr/bin/python

import re
from mininet.net import Mininet
from mininet.node import Controller
from mininet.cli import CLI
from mininet.link import Intf
from mininet.log import setLogLevel, info, error
from mininet.util import quietRun

def checkIntf( intf ):
    "Make sure intf exists and is not configured."
    if ( ' %s:' % intf ) not in quietRun( 'ip link show' ):
        error( 'Error:', intf, 'does not exist!\n' )
        exit( 1 )
    ips = re.findall( r'\d+\.\d+\.\d+\.\d+', quietRun( 'ifconfig ' + intf ) )
    if ips:
        error( 'Error:', intf, 'has an IP address and is probably in use!\n' )
        exit( 1 )

def myNetwork():

    net = Mininet( topo=None, build=False)

    info( '*** Adding controller\n' )
    net.addController(name='c0')

    info( '*** Add switches\n')
    s1 = net.addSwitch('s1')

    max_hosts = 50
    newIntf = 'eth1'

    host_list = {}

    info( '*** Add hosts\n')
    for i in xrange(1,max_hosts+1):
        host_list[i] = net.addHost('h'+str(i))
        info( '*** Add links between ',host_list[i],' and s1 \r')
        net.addLink(host_list[i], s1)

    info( '*** Checking the interface ', newIntf, '\n' )
    checkIntf( newIntf )

    switch = net.switches[ 0 ]
    info( '*** Adding', newIntf, 'to switch', switch.name, '\n' )
    brintf = Intf( newIntf, node=switch )

    info( '*** Starting network\n')
    net.start()

    for i in xrange(1,max_hosts+1):
        info( '*** setting default gateway & udp server on ', host_list[i], '\r' )
        host_list[i].cmd('ip r a default via 10.0.0.254')
        host_list[i].cmd('nc -lu 5555 &')

    CLI(net)
    net.stop()

if __name__ == '__main__':
    setLogLevel( 'info' )
    myNetwork()

 

 

UDP traffic generation using scapy

I used scapy to emulate client connections from random IP addresses

Sticky connections:

Sticky connections are connections from the same client IP address (or subnet) that, for a given period of time, should be assigned to the same real server as before.

The sticky objects created to track client assignments are kept in the database for a period of time defined by the sticky timer.

If both of the following conditions are met:

  • A connection from the same client already exists.
  • The time between the end of the previous connection from that client and the start of the new one is within the timer duration.

then the load balancer assigns the client connection to the same real server.
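The sticky behavior can be modeled in a few lines of Python (a toy sketch of the idea, not IOS SLB's actual data structures):

```python
import time

class StickyTable:
    """Toy model of SLB sticky assignment: remember which real server
    a client IP was given, for `duration` seconds after its last
    connection (cf. 'sticky <duration>' under the IOS SLB vserver)."""
    def __init__(self, duration):
        self.duration = duration
        self.entries = {}  # client_ip -> (real_server, last_seen)

    def assign(self, client_ip, pick_server):
        now = time.monotonic()
        entry = self.entries.get(client_ip)
        if entry and now - entry[1] <= self.duration:
            server = entry[0]        # sticky hit: reuse previous server
        else:
            server = pick_server()   # new or expired: load balance
        self.entries[client_ip] = (server, now)
        return server

table = StickyTable(duration=5)
s1 = table.assign("43.149.57.102", lambda: "10.0.0.3")
s2 = table.assign("43.149.57.102", lambda: "10.0.0.4")
assert s1 == s2 == "10.0.0.3"  # second connection stuck to the same server
```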

Router(config-slb-vserver)# sticky duration [group group-id]

A FIFO queue is used to emulate sticky connections; the process is triggered randomly.

If the queue is not full, the randomly generated source IP address is pushed into the queue; otherwise, an IP is pulled from the queue and used a second time as the source of the generated packet.

Figure4: Random generation of sticky connections

 

https://github.com/AJNOURI/traffic-generator/blob/master/gen_udp_sticky.py

#! /usr/bin/env python

import random
from scapy.all import *
import time
import Queue

# (2014) AJ NOURI ajn.bin@gmail.com

dsthost = '66.66.66.66'

q = Queue.Queue(maxsize=5)

for i in xrange(1000):
    rint = random.randint(1,10)
    if rint % 5 == 0:
        print '==> Random queue processing'
        if not q.full():
            ipin = ".".join(map(str, (random.randint(0, 255) for _ in range(4))))
            q.put(ipin)
            srchost = ipin
            print ipin,' into the queue'
        else:
            ipout = q.get()
            srchost = ipout
            print ' *** This is sticky src IP',ipout
    else:
        srchost = ".".join(map(str, (random.randint(0, 255) for _ in range(4))))
        print 'one time src IP', srchost
    #srchost = scapy.RandIP()
    p = IP(src=srchost,dst=dsthost) / UDP(dport=5555)
    print 'src= ',srchost, 'dst= ',dsthost
    send(p, iface='tap2')
    print 'sending packet\n'
    time.sleep(1)

 

Randomly, the generated source IP is used for the packet and at the same time pushed to the queue, if the queue is not yet full:

one time src IP 48.235.35.122
src=  48.235.35.122 dst=  66.66.66.66
.
Sent 1 packets. 

one time src IP 48.235.35.122
src=  48.235.35.122 dst=  66.66.66.66
.
Sent 1 packets.
...

==> Random queue processing
40.147.224.72  into the queue
src=  40.147.224.72 dst=  66.66.66.66
.
Sent 1 packets.

otherwise, an IP (previously generated) is pulled out from the queue and reused as source IP.

==> Random queue processing
 *** This is sticky src IP 88.27.24.177
src=  88.27.24.177 dst=  66.66.66.66
.
Sent 1 packets.

Building Mininet server farm

ajn@ubuntu:~$ sudo python mininet-dc.py
[sudo] password for ajn:
Sorry, try again.
[sudo] password for ajn:
*** Adding controller
*** Add switches
*** Add hosts
*** Checking the interface eth1 1
*** Adding eth1 to switch s1
*** Starting network
*** Configuring hosts
h1 h2 h3 h4 h5 h6 h7 h8 h9 h10 h11 h12 h13 h14 h15 h16 h17 h18 h19 h20 h21
 h22 h23 h24 h25 h26 h27 h28 h29 h30 h31 h32 h33 h34 h35 h36 h37 h38 h39
 h40 h41 h42 h43 h44 h45 h46 h47 h48 h49 h50
*** Starting controller
*** Starting 1 switches
s1
*** Starting CLI:lt gateway & udp server on h50
mininet>

 

Weighted Round Robin (with equal weights):

 

IOS router configuration
ip slb serverfarm MININETFARM
 nat server
 real 10.0.0.1
 inservice
 real 10.0.0.2
 inservice
 real 10.0.0.3
 inservice
…
 real 10.0.0.50
 inservice
!
ip slb vserver VSRVNAME
 virtual 66.66.66.66 udp 5555
 serverfarm MININETFARM
 sticky 5
 idle 300
 inservice

 

Starting traffic generator
ajn:~/coding/python/scapy$ sudo python udpqueue.py
one time src IP 142.124.66.30
src= 142.124.66.30 dst= 66.66.66.66
.
Sent 1 packets.

sending packet
one time src IP 11.125.212.0
src= 11.125.212.0 dst= 66.66.66.66
.
Sent 1 packets.

sending packet
one time src IP 148.97.164.124
src= 148.97.164.124 dst= 66.66.66.66
.
Sent 1 packets.

sending packet
one time src IP 101.234.155.254
src= 101.234.155.254 dst= 66.66.66.66
.
Sent 1 packets.

sending packet
==&gt; Random queue processing
78.19.5.190 into the queue
src= 78.19.5.190 dst= 66.66.66.66
.
Sent 1 packets.

...

The router has already started associating incoming UDP connections to real servers according to the LB algorithm.

Router IOS SLB
SLB#sh ip slb stick 

client netmask group real conns
-----------------------------------------------------------------------
43.149.57.102 255.255.255.255 4097 10.0.0.3 1
78.159.83.228 255.255.255.255 4097 10.0.0.3 1
160.130.143.14 255.255.255.255 4097 10.0.0.3 1
188.26.251.226 255.255.255.255 4097 10.0.0.3 1
166.43.203.95 255.255.255.255 4097 10.0.0.3 1
201.49.188.108 255.255.255.255 4097 10.0.0.3 1
230.46.94.201 255.255.255.255 4097 10.0.0.4 1
122.139.198.227 255.255.255.255 4097 10.0.0.3 1
219.210.19.107 255.255.255.255 4097 10.0.0.4 1
155.53.69.23 255.255.255.255 4097 10.0.0.3 1
196.166.41.76 255.255.255.255 4097 10.0.0.4 1
…
Result: (accelerated video)

Weighted Round Robin (with unequal weights):

Let’s suppose we need to assign a weight of 16, twice the default weight, to every 5th server: 1, 5, 10, 15…

 

IOS router configuration
ip slb serverfarm MININETFARM
 nat server
 real 10.0.0.1
 weight 16
 inservice
 real 10.0.0.2
 inservice
 real 10.0.0.3
 inservice
 real 10.0.0.4
 inservice
 real 10.0.0.5
 weight 16
…
Result: (accelerated video)

Least connection:

 

IOS router configuration

ip slb serverfarm MININETFARM
 nat server
 predictor leastconns
 real 10.0.0.1
 weight 16
 inservice
 real 10.0.0.2
 inservice
 real 10.0.0.3
…
Result: (accelerated video)

 

Stopping Mininet Server farm
mininet> exit
*** Stopping 1 switches
s1 ..................................................
*** Stopping 50 hosts
h1 h2 h3 h4 h5 h6 h7 h8 h9 h10 h11 h12 h13 h14 h15 h16 h17 h18 h19 h20
 h21 h22 h23 h24 h25 h26 h27 h28 h29 h30 h31 h32 h33 h34 h35 h36 h37
 h38 h39 h40 h41 h42 h43 h44 h45 h46 h47 h48 h49 h50
*** Stopping 1 controllers
c0
*** Done
ajn@ubuntu:~$

References
http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/slb/configuration/15-s/slb-15-s-book.html

Load balancing GRE tunneled multicast traffic


This lab is part of a bundle aimed at illustrating different methods used to load balance multicast traffic across multiple equal-cost paths. For each individual lab, the topology may be slightly modified to match the appropriate conditions:

  • Direct multicast Load Balancing through ECMP.
  • Load balancing GRE tunneled multicast traffic.
  • Load Balancing Multicast traffic through EtherChannel.
  • Load Balancing Multicast traffic through MLP.

To avoid direct load balancing using ECMP (Equal-Cost Multi-Path), it is possible to offload the load sharing to unicast traffic processing by encapsulating the multicast into GRE tunnels, in such a way that the multipath topology (ramification nodes and parallel paths) is aware only of the unicast tunnel sessions.

Outline

Configuration

Routing

CEF

Tunneling and RPF check

Configuration check:

Layer3 Load balancing

Testing unicast CEF Load sharing

Simulate a path failure with each of the three paths

Increasing the number of sessions

Conclusion

Picture1 illustrates the topology used here; note the two ramification nodes R10 and R1 delimiting three parallel paths.

Picture1: Lab topology

Configuration

Routing

The routing protocol deployed is EIGRP; by default it allows four equal-cost paths (configurable up to six); if needed, use the “maximum-paths <max>” command to allow more.

R2#sh ip pro

Routing Protocol is “eigrp 10”


Automatic network summarization is not in effect

Maximum path: 4

Routing for Networks:

The same autonomous system 50 is used everywhere with auto-summarization disabled and advertisement of the directly connected segments.
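A minimal EIGRP configuration matching that description might look like the following (a sketch; the AS number follows the sentence above, and the network statements depend on each router's connected subnets):

```text
router eigrp 50
 network 10.0.0.0
 no auto-summary
 maximum-paths 6
```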

CEF

CEF is a key feature, because load balancing at the data plane uses the FIB, which is derived directly from the control plane RIB, in addition to the adjacency table of course.

CEF allows:

Per-destination load-sharing

  • More appropriate for a large number of (src, dst) sessions
  • Load balances individual sessions (src, dst) over multiple paths
  • The default for CEF

Per-Packet (Round-Robin distribution) load-sharing

  • Load balance individual packets for a given session over multiple paths
  • Not recommended for VoIP because of packet re-ordering.
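The per-destination mechanism can be sketched as hashing each (src, dst) pair into a small set of buckets mapped round-robin onto the equal-cost paths, as in the “Load distribution: 0 1 2 0 1 2 …” line of “show ip cef … internal” (a simplified model, not Cisco's actual hash):

```python
def cef_path(src, dst, paths, buckets=15):
    """Pick the outgoing path for a (src, dst) session.
    Buckets are distributed over the paths round-robin; every packet
    of the same session hashes to the same bucket, hence the same
    path (per-destination sharing)."""
    distribution = [paths[i % len(paths)] for i in range(buckets)]
    return distribution[hash((src, dst)) % buckets]

paths = ["Vlan104", "Vlan103", "Vlan102"]
# The same session always takes the same path:
assert cef_path("10.0.1.1", "6.6.6.10", paths) == cef_path("10.0.1.1", "6.6.6.10", paths)
```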

Tunneling and RPF check:

– The unicast routing protocol (EIGRP) processes the tunnel outer header, whose IPs come from the interfaces used as tunnel source and destination; that’s why only these subnets are advertised.

– The multicast routing protocol (PIM) processes the tunnel inner header; for that reason PIM must be enabled on the tunnel interface itself, not on the tunnel source/destination interfaces.

– These two routing levels and their corresponding interfaces must be strictly separated to avoid RPF failure.

Tunneling:

On the router “vhost_member”, the unique physical interface fa0/0 cannot be used for tunneling, because incoming traffic is de-multiplexed to the appropriate tunnel using the tunnel source; that’s why three loopbacks are created, to be used as the source of each tunnel interface.

GRE tunnels are formed between the multicast source routers (SRC1, SRC2 and SRC3) and the last-hop PIM router “vhost_member”.

First tunnel:

                       SRC1 router         vhost_member router
  tunnel interface     Tunnel1             Tunnel1
  ip address           1.1.10.1/24         1.1.10.6/24
  tunnel source        fa0/0               Loopback1
  tunnel destination   6.6.6.10            10.0.1.1
  mode                 GRE                 GRE

Second tunnel:

                       SRC2 router         vhost_member router
  tunnel interface     Tunnel1             Tunnel2
  ip address           1.1.20.2/24         1.1.20.6/24
  tunnel source        fa0/0               Loopback2
  tunnel destination   6.6.6.20            10.0.2.2
  mode                 GRE                 GRE

Third tunnel:

                       SRC3 router         vhost_member router
  tunnel interface     Tunnel1             Tunnel3
  ip address           1.1.30.3/24         1.1.30.6/24
  tunnel source        fa0/0               Loopback3
  tunnel destination   6.6.6.30            10.0.3.3
  mode                 GRE                 GRE

Picture2 tunneling:

SRC1 router :

interface Tunnel1

ip address 1.1.10.1 255.255.255.0

tunnel source FastEthernet0/0

tunnel destination 6.6.6.10

tunnel mode gre ip

SRC2 router :

interface Tunnel1

ip address 1.1.20.2 255.255.255.0

tunnel source FastEthernet0/0

tunnel destination 6.6.6.20

tunnel mode gre ip

SRC3 router :

interface Tunnel1

ip address 1.1.30.3 255.255.255.0

tunnel source FastEthernet0/0

tunnel destination 6.6.6.30

tunnel mode gre ip

vhost_member (Multicast last hop):

interface Tunnel1

ip address 1.1.10.6 255.255.255.0

tunnel source Loopback1

tunnel destination 10.0.1.1

interface Tunnel2

ip address 1.1.20.6 255.255.255.0

tunnel source Loopback2

tunnel destination 10.0.2.2

!

interface Tunnel3

ip address 1.1.30.6 255.255.255.0

tunnel source Loopback3

tunnel destination 10.0.3.3

Multicast source routers

Enable pim ONLY on tunnel interfaces:

SRC3:

ip multicast-routing

!

interface Tunnel1

ip pim dense-mode

SRC2:

ip multicast-routing

!

interface Tunnel1

ip pim dense-mode

SRC1:

ip multicast-routing

!

interface Tunnel1

ip pim dense-mode

Router “vhost_member”:

ip multicast-routing

interface Tunnel1

ip pim dense-mode

ip igmp join-group 239.1.1.1

!

interface Tunnel2

ip pim dense-mode

ip igmp join-group 239.2.2.2

!

interface Tunnel3

ip pim dense-mode

ip igmp join-group 239.3.3.3

PIM Check:

Gmembers#sh ip pim interface

Address          Interface                Ver/   Nbr    Query  DR     DR
                                          Mode   Count  Intvl  Prior
1.1.10.6         Tunnel1                  v2/D   1      30     1      0.0.0.0
1.1.20.6         Tunnel2                  v2/D   1      30     1      0.0.0.0
1.1.30.6         Tunnel3                  v2/D   1      30     1      0.0.0.0

Gmembers#

Multicast sources (incoming interfaces) are reachable through the tunnel interfaces (unicast routing outbound interfaces) from which the multicast is received:

Gmembers#sh ip route


Gateway of last resort is not set

1.0.0.0/8 is variably subnetted, 4 subnets, 2 masks

D       1.1.1.1/32 [90/156160] via 10.0.0.21, 00:37:18, FastEthernet0/0

C       1.1.10.0/24 is directly connected, Tunnel1

C       1.1.20.0/24 is directly connected, Tunnel2

C       1.1.30.0/24 is directly connected, Tunnel3


Gmembers#sh ip mroute

IP Multicast Routing Table


(1.1.10.1, 239.1.1.1), 00:02:29/00:02:58, flags: LT

Incoming interface: Tunnel1, RPF nbr 0.0.0.0

Outgoing interface list:

Tunnel2, Forward/Dense, 00:02:29/00:00:00

Tunnel3, Forward/Dense, 00:02:29/00:00:00


(1.1.20.2, 239.2.2.2), 00:02:11/00:02:58, flags: LT

Incoming interface: Tunnel2, RPF nbr 0.0.0.0

Outgoing interface list:

Tunnel1, Forward/Dense, 00:02:11/00:00:00

Tunnel3, Forward/Dense, 00:02:11/00:00:00


(1.1.30.3, 239.3.3.3), 00:01:54/00:02:59, flags: LT

Incoming interface: Tunnel3, RPF nbr 0.0.0.0

Outgoing interface list:

Tunnel1, Forward/Dense, 00:01:54/00:00:00

Tunnel2, Forward/Dense, 00:01:54/00:00:00

Gmembers#

Depending on the complexity of your topology, you may need to route some subnets statically or dynamically through the tunnel.

Configuration check:

First let’s start multicast advertisement:

SRC3#p 239.3.3.3 repeat 1000

Type escape sequence to abort.

Sending 1000, 100-byte ICMP Echos to 239.3.3.3, timeout is 2 seconds:

Reply to request 0 from 1.1.30.6, 84 ms

Reply to request 1 from 1.1.30.6, 84 ms

SRC2#ping 239.2.2.2 repeat 1000

Type escape sequence to abort.

Sending 1000, 100-byte ICMP Echos to 239.2.2.2, timeout is 2 seconds:

Reply to request 0 from 1.1.20.6, 128 ms

Reply to request 1 from 1.1.20.6, 104 ms

SRC1#p 239.1.1.1 repeat 1000

Type escape sequence to abort.

Sending 1000, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:

Reply to request 0 from 1.1.10.6, 120 ms

Reply to request 1 from 1.1.10.6, 140 ms

From the multicast point of view (multicast source and members), traffic is forwarded through distinct point-to-point links (tunnels).

Picture3: Logical multicast topology

Note that the multicast path is not aware of any topology outside the tunnels through which it is advertised:

Gmembers#mtrace 1.1.10.1 1.1.10.6

Type escape sequence to abort.

Mtrace from 1.1.10.1 to 1.1.10.6 via RPF

From source (?) to destination (?)

Querying full reverse path…

0  1.1.10.6

-1  1.1.10.6 PIM  [1.1.10.0/24]

-2  1.1.10.1

Gmembers#

Gmembers#mtrace 1.1.20.2 1.1.20.6

Type escape sequence to abort.

Mtrace from 1.1.20.2 to 1.1.20.6 via RPF

From source (?) to destination (?)

Querying full reverse path…

0  1.1.20.6

-1  1.1.20.6 PIM  [1.1.20.0/24]

-2  1.1.20.2

Gmembers#

Gmembers#mtrace 1.1.30.6 1.1.30.3

Type escape sequence to abort.

Mtrace from 1.1.30.6 to 1.1.30.3 via RPF

From source (?) to destination (?)

Querying full reverse path…

0  1.1.30.3

-1  1.1.30.3 PIM  [1.1.30.0/24]

-2  1.1.30.6

Gmembers#

The following picture4 illustrates how the intermediate routers (forming the parallel paths) consider only unicast sessions.

Picture4: unicast sessions


Layer3 Load balancing

R10#sh ip route


6.0.0.0/32 is subnetted, 3 subnets

D       6.6.6.10 [90/158976] via 10.0.40.4, 02:13:11, Vlan104

[90/158976] via 10.0.30.3, 02:13:11, Vlan103

[90/158976] via 10.0.20.2, 02:13:11, Vlan102

D       6.6.6.20 [90/158976] via 10.0.40.4, 02:13:16, Vlan104

[90/158976] via 10.0.30.3, 02:13:16, Vlan103

[90/158976] via 10.0.20.2, 02:13:16, Vlan102

D       6.6.6.30 [90/158976] via 10.0.40.4, 02:13:17, Vlan104

[90/158976] via 10.0.30.3, 02:13:17, Vlan103

[90/158976] via 10.0.20.2, 02:13:17, Vlan102

At layer 2, CEF load balances according to the FIB (derived from the RIB):

R10#

R10#

R10#sh ip cef 6.6.6.20 internal

6.6.6.20/32, version 57, epoch 0, per-destination sharing

0 packets, 0 bytes

via 10.0.40.4, Vlan104, 0 dependencies

traffic share 1

next hop 10.0.40.4, Vlan104

valid adjacency

via 10.0.30.3, Vlan103, 0 dependencies

traffic share 1

next hop 10.0.30.3, Vlan103

valid adjacency

via 10.0.20.2, Vlan102, 0 dependencies

traffic share 1

next hop 10.0.20.2, Vlan102

valid adjacency

0 packets, 0 bytes switched through the prefix

tmstats: external 0 packets, 0 bytes

internal 0 packets, 0 bytes

Load distribution: 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 (refcount 1)

Hash  OK  Interface                 Address         Packets

1     Y   Vlan104                   10.0.40.4             0

2     Y   Vlan103                   10.0.30.3             0

3     Y   Vlan102                   10.0.20.2             0

4     Y   Vlan104                   10.0.40.4             0

5     Y   Vlan103                   10.0.30.3             0

6     Y   Vlan102                   10.0.20.2             0

7     Y   Vlan104                   10.0.40.4             0

8     Y   Vlan103                   10.0.30.3             0

9     Y   Vlan102                   10.0.20.2             0

10    Y   Vlan104                   10.0.40.4             0

11    Y   Vlan103                   10.0.30.3             0

12    Y   Vlan102                   10.0.20.2             0

13    Y   Vlan104                   10.0.40.4             0

14    Y   Vlan103                   10.0.30.3             0

15    Y   Vlan102                   10.0.20.2             0

R10#

So far everything seems to work as expected, but the two ramification routers, R10 and R1, show that the per-destination distribution is not exactly even (picture5).

Observe unicast path using traceroute

Gmembers#trace 10.0.1.1 source 6.6.6.10

Type escape sequence to abort.

Tracing the route to 10.0.1.1

  1 10.0.0.21 44 msec 64 msec 24 msec

  2 10.0.0.41 16 msec 24 msec 20 msec

  3 10.0.30.30 20 msec 32 msec 20 msec

  4 10.0.1.1 64 msec *  72 msec

Gmembers#

Gmembers#trace 10.0.2.2 source 6.6.6.20

Type escape sequence to abort.

Tracing the route to 10.0.2.2

  1 10.0.0.21 40 msec 32 msec 16 msec

  2 10.0.0.41 8 msec 88 msec 20 msec

  3 10.0.30.30 80 msec 48 msec 12 msec

  4 10.0.2.2 140 msec *  68 msec

Gmembers#

Gmembers#trace 10.0.3.3 source 6.6.6.30

Type escape sequence to abort.

Tracing the route to 10.0.3.3

  1 10.0.0.21 56 msec 32 msec 24 msec

  2 10.0.0.37 32 msec 120 msec 16 msec

  3 10.0.40.40 56 msec 16 msec 100 msec

  4 10.0.3.3 48 msec *  56 msec

Gmembers#

Picture5: traffic distribution for the three sessions


R10#sh ip cef exact-route 10.0.1.1 6.6.6.10 internal

10.0.1.1        -> 6.6.6.10       : Vlan102 (next hop 10.0.20.2)

Bucket 5 from 15, total 3 paths

R10#sh ip cef exact-route 10.0.2.2 6.6.6.20 internal

10.0.2.2        -> 6.6.6.20       : Vlan102 (next hop 10.0.20.2)

Bucket 5 from 15, total 3 paths

R10#sh ip cef exact-route 10.0.3.3 6.6.6.30 internal

10.0.3.3        -> 6.6.6.30       : Vlan104 (next hop 10.0.40.4)

Bucket 6 from 15, total 3 paths

R10#

R1#sh ip cef exact-route 6.6.6.10 10.0.1.1

6.6.6.10        -> 10.0.1.1       : FastEthernet0/0 (next hop 10.0.0.41)

R1#sh ip cef exact-route 6.6.6.20 10.0.2.2

6.6.6.20        -> 10.0.2.2       : FastEthernet0/0 (next hop 10.0.0.41)

R1#sh ip cef exact-route 6.6.6.30 10.0.3.3

6.6.6.30        -> 10.0.3.3       : FastEthernet2/0 (next hop 10.0.0.37)

R1#

According to Cisco documentation, per-destination load balancing depends on the statistical distribution of the traffic and is more appropriate for a large number of sessions.

Resisting the confirmation bias to simply accept these results, I decided to conduct a series of tests:

1) – Simulate a path failure with each of the three paths.

2) – Progressively increase the number of sessions.

Three paths are available for each destination prefix (used in tunneling)

Testing unicast CEF Load sharing:

1) – simulate a path failure with each of the three paths.

Normal situation with no failures:

R10#sh ip cef exact-route 10.0.1.1 6.6.6.10

10.0.1.1        -> 6.6.6.10       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.20

10.0.2.2        -> 6.6.6.20       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.30

10.0.3.3        -> 6.6.6.30       : Vlan104 (next hop 10.0.40.4)

R10#

Picture6: NO failure


Situation with R3 failure:

R10#sh ip cef exact-route 10.0.1.1 6.6.6.10

10.0.1.1        -> 6.6.6.10       : Vlan104 (next hop 10.0.40.4)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.20

10.0.2.2        -> 6.6.6.20       : Vlan104 (next hop 10.0.40.4)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.30

10.0.3.3        -> 6.6.6.30       : Vlan102 (next hop 10.0.20.2)

R10#

Picture7: failure of R3 path


Situation with R2 failure:

R10#sh ip cef exact-route 10.0.1.1 6.6.6.10

10.0.1.1        -> 6.6.6.10       : Vlan104 (next hop 10.0.40.4)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.20

10.0.2.2        -> 6.6.6.20       : Vlan104 (next hop 10.0.40.4)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.30

10.0.3.3        -> 6.6.6.30       : Vlan103 (next hop 10.0.30.3)

R10#

Picture8: failure of R2 path


Situation with R4 failure:

R10#sh ip cef exact-route 10.0.1.1 6.6.6.10

10.0.1.1        -> 6.6.6.10       : Vlan103 (next hop 10.0.30.3)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.20

10.0.2.2        -> 6.6.6.20       : Vlan103 (next hop 10.0.30.3)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.30

10.0.3.3        -> 6.6.6.30       : Vlan102 (next hop 10.0.20.2)

R10#

Picture9: failure of R4 path


2) – Increasing the number of sessions:

With 4 sessions:

R10#sh ip cef exact-route 10.0.1.1 6.6.6.10

10.0.1.1        -> 6.6.6.10       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.20

10.0.2.2        -> 6.6.6.20       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.30

10.0.3.3        -> 6.6.6.30       : Vlan104 (next hop 10.0.40.4)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.30

10.0.2.2        -> 6.6.6.30       : Vlan102 (next hop 10.0.20.2)

R10#

Picture10: Distribution with a 4th session between 10.0.2.2 and 6.6.6.30


With 5 sessions:

R10#sh ip cef exact-route 10.0.1.1 6.6.6.10

10.0.1.1        -> 6.6.6.10       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.20

10.0.2.2        -> 6.6.6.20       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.30

10.0.3.3        -> 6.6.6.30       : Vlan104 (next hop 10.0.40.4)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.30

10.0.2.2        -> 6.6.6.30       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.10

10.0.3.3        -> 6.6.6.10       : Vlan103 (next hop 10.0.30.3)

R10#

Picture11: Distribution with a 5th session between 10.0.3.3 and 6.6.6.10


With 6 sessions:

R10#sh ip cef exact-route 10.0.1.1 6.6.6.10

10.0.1.1        -> 6.6.6.10       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.20

10.0.2.2        -> 6.6.6.20       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.30

10.0.3.3        -> 6.6.6.30       : Vlan104 (next hop 10.0.40.4)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.30

10.0.2.2        -> 6.6.6.30       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.10

10.0.3.3        -> 6.6.6.10       : Vlan103 (next hop 10.0.30.3)

R10#sh ip cef exact-route 10.0.1.1 6.6.6.20

10.0.1.1        -> 6.6.6.20       : Vlan103 (next hop 10.0.30.3)

R10#

Picture12: Distribution with a 6th session between 10.0.1.1 and 6.6.6.20

With 7 sessions:

R10#sh ip cef exact-route 10.0.1.1 6.6.6.10

10.0.1.1        -> 6.6.6.10       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.20

10.0.2.2        -> 6.6.6.20       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.30

10.0.3.3        -> 6.6.6.30       : Vlan104 (next hop 10.0.40.4)

R10#sh ip cef exact-route 10.0.2.2 6.6.6.30

10.0.2.2        -> 6.6.6.30       : Vlan102 (next hop 10.0.20.2)

R10#sh ip cef exact-route 10.0.3.3 6.6.6.10

10.0.3.3        -> 6.6.6.10       : Vlan103 (next hop 10.0.30.3)

R10#sh ip cef exact-route 10.0.1.1 6.6.6.20

10.0.1.1        -> 6.6.6.20       : Vlan103 (next hop 10.0.30.3)

R10#sh ip cef exact-route 10.0.1.1 6.6.6.30

10.0.1.1        -> 6.6.6.30       : Vlan104 (next hop 10.0.40.4)

R10#

Picture13: Distribution with a 7th session between 10.0.1.1 and 6.6.6.30


Conclusion

The results of the tests confirm that with destination-based CEF load sharing, the more sessions there are, the better the load distribution.
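The conclusion can be illustrated with a quick simulation (a statistical sketch, not Cisco's hash): as the number of flows grows, hashing them across three equal-cost paths converges toward a one-third share per path.

```python
# Sketch: per-destination sharing evens out as the number of flows grows.
# Each flow landing on a uniformly random path models a well-behaved hash.
import random
from collections import Counter

random.seed(1)

def shares(num_flows: int, paths: int = 3) -> list:
    """Hash each flow onto one of `paths` equal-cost paths; return each path's share."""
    counts = Counter(random.randrange(paths) for _ in range(num_flows))
    return [counts[p] / num_flows for p in range(paths)]

for n in (3, 30, 3000):
    print(n, [round(s, 2) for s in shares(n)])
```

With only 3 flows the split can easily be 2/1/0, which is what the lab observed; with thousands of flows the shares approach 0.33 each.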

BGP link-bw & multipath Load Balancing


 

An autonomous system can be connected to another through multiple links, and depending on the company's business and redundancy requirements, different schemes can be used:

          Primary/secondary: the second link is used only when the first link fails.

          Symmetric load sharing: the traffic is distributed equally among multiple links at the same time, which provides a high level of redundancy for the enterprise.

But it is not always possible to provision equal-bandwidth links, whether for financial reasons or because such an offer is simply not available; hence the need to engineer traffic through these links according to their bandwidth capacity.

Here comes the BGP link bandwidth (link-bw) solution.

With BGP multipath deployed, the decision to use multiple paths to deliver traffic is generally made inside the autonomous system by the iBGP speakers according to multiple criteria that exclude the eBGP link bandwidth.

BGP link-bw advertises the bandwidth of an autonomous system exit link as an extended community to the iBGP peers.

Some requirements are to be considered:

          Link-bw works only between directly connected eBGP peers.

          The BGP extended community must be exchanged between iBGP peers.

          CEF must be enabled everywhere.
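For reference, the bandwidth itself travels in the Link Bandwidth extended community, which per the IETF idr link-bandwidth draft carries the AS number and the bandwidth as a 4-byte IEEE float in bytes per second. A minimal encoding sketch (the helper name is mine):

```python
# Sketch: encoding the (non-transitive) Link Bandwidth extended community:
# type 0x40, subtype 0x04, 2-byte AS number, 4-byte IEEE float in bytes/second.
import struct

def link_bw_community(asn: int, bw_bytes_per_sec: float) -> bytes:
    """Build the 8-byte Link Bandwidth extended community value."""
    return struct.pack(">BBHf", 0x40, 0x04, asn, bw_bytes_per_sec)

# A 100 Mb/s link = 12_500_000 bytes/s, which IOS displays as "DMZ-Link Bw 12500 kbytes"
print(link_bw_community(64540, 12_500_000).hex())
```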

Figure 1 illustrates the lab topology used to implement BGP link-bw

Figure1: Topology

Inside AS 64540, R1, R2 and R3 establish full-mesh iBGP sessions; likewise inside AS 64550, R4, R5, R6 and R7 establish full-mesh iBGP sessions.

The links R2-R4, R5-R3 and R6-R3 carry direct eBGP sessions, using the interface IP addresses as sources and destinations.

 

Network default behavior

The network default configuration is as follows:

AS 64540:

R1:

R1(config-router)#do sh ip bgp

BGP table version is 3, local router ID is 10.10.10.1

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

              r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

 

   Network          Next Hop            Metric LocPrf Weight Path

*> 10.10.10.0/24    0.0.0.0                  0         32768 i

* i70.70.70.0/24    3.3.3.3                  0    100      0 64550 i

*>i                 2.2.2.2                  0    100      0 64550 i

R1(config-router)#

 

R1(config-router)#do sh ip bgp 70.70.70.0

BGP routing table entry for 70.70.70.0/24, version 3

Paths: (2 available, best #2, table Default-IP-Routing-Table)

  Not advertised to any peer

  64550

    3.3.3.3 (metric 2297856) from 3.3.3.3 (3.3.3.3)

      Origin IGP, metric 0, localpref 100, valid, internal

  64550

    2.2.2.2 (metric 2297856) from 2.2.2.2 (2.2.2.2)

      Origin IGP, metric 0, localpref 100, valid, internal, best

R1(config-router)#

The default path chosen goes through the R2-R4 link:

R1(config-router)#do traceroute 70.70.70.1 source 10.10.10.1

 

Type escape sequence to abort.

Tracing the route to 70.70.70.1

 

  1 192.168.12.2 24 msec 320 msec 452 msec

  2 192.168.24.2 1004 msec 716 msec 484 msec

  3 192.168.47.2 292 msec *  556 msec

R1(config-router)#

So the traffic from R1 to R7 takes the path R1-R2-R4-R7.

 Table1: best path selection for 70.70.70.1/24 from R1

 

#    Attribute                     Path1      Path2
1    weight                        0          0
2    local preference              100        100
3    originated locally            No         No
4    AS_PATH                       64550      64550
5    ORIGIN                        i          i
6    MED                           0          0
7    eBGP<>iBGP                    iBGP       iBGP
8    Best IGP metric to NEXT-HOP   2297856    2297856
9    Multipath                     No         No
10   oldest path                   No         No
11   Lowest neighbor router-ID     3.3.3.3    2.2.2.2  <<<

 

AS 64550:

R7:

R7(config-router)#do sh ip bgp

BGP table version is 3, local router ID is 70.70.70.1

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

              r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

 

   Network          Next Hop            Metric LocPrf Weight Path

* i10.10.10.0/24    5.5.5.5                  0    100      0 64540 i

*>i                 4.4.4.4                  0    100      0 64540 i

* i                 6.6.6.6                  0    100      0 64540 i

*> 70.70.70.0/24    0.0.0.0                  0         32768 i

R7(config-router)#

 

R7(config-router)#do traceroute 10.10.10.1 source 70.70.70.1

 

Type escape sequence to abort.

Tracing the route to 10.10.10.1

 

  1 192.168.47.1 8 msec 268 msec 104 msec

  2 192.168.24.1 164 msec 348 msec 136 msec

  3 192.168.12.1 276 msec *  260 msec

R7(config-router)#

So the traffic from R7 to R1 takes the path R7-R4-R2-R1

R7(config-router)#do sh ip bgp 10.10.10.0

BGP routing table entry for 10.10.10.0/24, version 3

Paths: (3 available, best #2, table Default-IP-Routing-Table)

  Not advertised to any peer

  64540

    5.5.5.5 (metric 2297856) from 5.5.5.5 (5.5.5.5)

      Origin IGP, metric 0, localpref 100, valid, internal

  64540

    4.4.4.4 (metric 2297856) from 4.4.4.4 (4.4.4.4)

      Origin IGP, metric 0, localpref 100, valid, internal, best

  64540

    6.6.6.6 (metric 2297856) from 6.6.6.6 (6.6.6.6)

      Origin IGP, metric 0, localpref 100, valid, internal

R7(config-router)#

 

The R4-R2 link is chosen as the best path to reach the prefix 10.10.10.0/24:

Table2: best path selection for 10.10.10.1/24 from R7

 

#    Attribute                     Path1      Path2         Path3
1    weight                        0          0             0
2    local preference              100        100           100
3    originated locally            No         No            No
4    AS_PATH                       64540      64540         64540
5    ORIGIN                        i          i             i
6    MED                           0          0             0
7    eBGP<>iBGP                    iBGP       iBGP          iBGP
8    Best IGP metric to NEXT-HOP   2297856    2297856       2297856
9    Multipath                     No         No            No
10   oldest path                   No         No            No
11   Lowest neighbor router-ID     5.5.5.5    4.4.4.4  <<<  6.6.6.6

 

BGP Link-BW deployment

The best way to utilize the bandwidth resources is to load-share the traffic among the three eBGP links according to their bandwidth.

Let's recall the requirements for using BGP link-bw:

– Requires BGP multipath configured.

– Enable BGP ext. community between iBGP.

– Enable CEF everywhere.

General configuration:

On each iBGP speaker behind the multilink branching point, enable BGP multipath:

router bgp <ASnbr>

 maximum-paths <n>

 maximum-paths ibgp <n>

 

router bgp <ASnbr>

 address-family ipv4

 neighbor <iBGP_peer> activate

 neighbor <iBGP_peer> send-community extended

!iBGP peer to which the extended community is to be sent.

 

 neighbor <eBGP_peer> activate

 neighbor <eBGP_peer> dmzlink-bw

!Allow eBGP bandwidth to be propagated through link-bw extended community

 

 bgp dmzlink-bw

!"bgp dmzlink-bw" is configured on any router whose eBGP link bandwidth
!will be used for load balancing.

 exit-address-family

 

AS 64540:

R1(iBGP):

router bgp 64540

 address-family ipv4

 neighbor 2.2.2.2 activate

 neighbor 3.3.3.3 activate

 

 maximum-paths 3

 maximum-paths ibgp 3

 

 exit-address-family

eBGP speaker R2:

router bgp 64540

 address-family ipv4

 neighbor 1.1.1.1 activate

 neighbor 1.1.1.1 send-community extended

 neighbor 1.1.1.1 next-hop-self

 

 neighbor 3.3.3.3 activate

 neighbor 3.3.3.3 next-hop-self

 

 neighbor 192.168.24.2 activate

 neighbor 192.168.24.2 dmzlink-bw

 bgp dmzlink-bw

 exit-address-family

eBGP speaker R3:

router bgp 64540

 

 address-family ipv4

 neighbor 1.1.1.1 activate

 neighbor 1.1.1.1 send-community extended

 neighbor 1.1.1.1 next-hop-self

 

 neighbor 2.2.2.2 activate

 neighbor 2.2.2.2 next-hop-self

 

 neighbor 192.168.35.2 activate

 neighbor 192.168.35.2 dmzlink-bw

 

 neighbor 192.168.36.2 activate

 neighbor 192.168.36.2 dmzlink-bw

 

 maximum-paths 2

 maximum-paths ibgp 2

 

 bgp dmzlink-bw

 

 exit-address-family

 

Verification:

R1#sh ip route

Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP

       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area

       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2

       E1 - OSPF external type 1, E2 - OSPF external type 2

       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2

       ia - IS-IS inter area, * - candidate default, U - per-user static route

       o - ODR, P - periodic downloaded static route

 

Gateway of last resort is not set

 

     192.168.12.0/30 is subnetted, 1 subnets

C       192.168.12.0 is directly connected, Serial1/0

     1.0.0.0/32 is subnetted, 1 subnets

C       1.1.1.1 is directly connected, Loopback0

     192.168.13.0/30 is subnetted, 1 subnets

C       192.168.13.0 is directly connected, Serial1/1

     2.0.0.0/32 is subnetted, 1 subnets

D       2.2.2.2 [90/2297856] via 192.168.12.2, 03:20:35, Serial1/0

     70.0.0.0/24 is subnetted, 1 subnets

B       70.70.70.0 [200/0] via 3.3.3.3, 01:11:12

                   [200/0] via 2.2.2.2, 01:11:12

     3.0.0.0/32 is subnetted, 1 subnets

D       3.3.3.3 [90/2297856] via 192.168.13.2, 03:20:29, Serial1/1

     10.0.0.0/24 is subnetted, 1 subnets

C       10.10.10.0 is directly connected, Loopback1

R1#

 

R1#sh ip route 70.70.70.1

Routing entry for 70.70.70.0/24

  Known via "bgp 64540", distance 200, metric 0

  Tag 64550, type internal

  Last update from 2.2.2.2 01:08:48 ago

  Routing Descriptor Blocks:

    3.3.3.3, from 3.3.3.3, 01:08:48 ago

      Route metric is 0, traffic share count is 1

      AS Hops 1

      Route tag 64550

  * 2.2.2.2, from 2.2.2.2, 01:08:48 ago

      Route metric is 0, traffic share count is 1

      AS Hops 1

      Route tag 64550

 

R1#

R1:

R1#sh ip bgp 70.70.70.1

BGP routing table entry for 70.70.70.0/24, version 7

Paths: (2 available, best #2, table Default-IP-Routing-Table)

Multipath: eBGP iBGP

  Not advertised to any peer

  64550

    3.3.3.3 (metric 2297856) from 3.3.3.3 (3.3.3.3)

      Origin IGP, metric 0, localpref 100, valid, internal, multipath

      DMZ-Link Bw 1443 kbytes

  64550

    2.2.2.2 (metric 2297856) from 2.2.2.2 (2.2.2.2)

      Origin IGP, metric 0, localpref 100, valid, internal, multipath, best

      DMZ-Link Bw 12500 kbytes

R1#

Note the proportion of the link BW of path 2 (through 2.2.2.2, 12500 kbytes) against the link BW of path 1 (through 3.3.3.3, 1443 kbytes).
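CEF shares the traffic in proportion to the advertised DMZ-Link bandwidths; a quick sketch of the expected split for the two values shown by R1 (12500 and 1443 kbytes):

```python
# Sketch: expected traffic split proportional to the DMZ-Link bandwidths learned by R1.
bw = {"via 2.2.2.2 (R2)": 12500, "via 3.3.3.3 (R3)": 1443}  # kbytes, from the output above

total = sum(bw.values())
shares = {path: b / total for path, b in bw.items()}
for path, s in shares.items():
    print(f"{path}: {s:.1%} of the traffic")  # roughly 89.7% vs 10.3%
```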

 Table3: best path selection for 70.70.70.1/24 from R1 after BGP Link-bw

 

#    Attribute                     Path1      Path2
1    weight                        0          0
2    local preference              100        100
3    originated locally            No         No
4    AS_PATH                       64550      64550
5    ORIGIN                        i          i
6    MED                           0          0
7    eBGP<>iBGP                    iBGP       iBGP
8    Best IGP metric to NEXT-HOP   2297856    2297856
9    Multipath                     2 <<<<     2 <<<<
10   oldest path                   No         No
11   Lowest neighbor router-ID     3.3.3.3    2.2.2.2

 

R3:

R3#sh ip route

Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP

       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area

       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2

       E1 - OSPF external type 1, E2 - OSPF external type 2

       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2

       ia - IS-IS inter area, * - candidate default, U - per-user static route

       o - ODR, P - periodic downloaded static route

 

Gateway of last resort is not set

 

     192.168.12.0/30 is subnetted, 1 subnets

D       192.168.12.0 [90/2681856] via 192.168.13.1, 03:21:04, Serial1/0

     1.0.0.0/32 is subnetted, 1 subnets

D       1.1.1.1 [90/2297856] via 192.168.13.1, 03:21:04, Serial1/0

     192.168.13.0/30 is subnetted, 1 subnets

C       192.168.13.0 is directly connected, Serial1/0

     2.0.0.0/32 is subnetted, 1 subnets

D       2.2.2.2 [90/2809856] via 192.168.13.1, 03:21:04, Serial1/0

     70.0.0.0/24 is subnetted, 1 subnets

B       70.70.70.0 [20/0] via 192.168.35.2, 01:11:47

                   [20/0] via 192.168.36.2, 01:11:47

     3.0.0.0/32 is subnetted, 1 subnets

C       3.3.3.3 is directly connected, Loopback0

     10.0.0.0/24 is subnetted, 1 subnets

B       10.10.10.0 [200/0] via 1.1.1.1, 01:18:16

     192.168.36.0/30 is subnetted, 1 subnets

C       192.168.36.0 is directly connected, Serial1/1

     192.168.35.0/30 is subnetted, 1 subnets

C       192.168.35.0 is directly connected, Ethernet0/0

R3#

 

R3#sh ip route 70.70.70.1

Routing entry for 70.70.70.0/24

  Known via "bgp 64540", distance 20, metric 0

  Tag 64550, type external

  Last update from 192.168.36.2 01:09:28 ago

  Routing Descriptor Blocks:

  * 192.168.35.2, from 192.168.35.2, 01:09:28 ago

      Route metric is 0, traffic share count is 1

      AS Hops 1

      Route tag 64550

    192.168.36.2, from 192.168.36.2, 01:09:28 ago

      Route metric is 0, traffic share count is 1

      AS Hops 1

      Route tag 64550

 

R3#

 

R3#sh ip bgp 70.70.70.1

BGP routing table entry for 70.70.70.0/24, version 6

Paths: (3 available, best #1, table Default-IP-Routing-Table)

Multipath: eBGP iBGP

  Advertised to update-groups:

     1          2          3

  64550

    192.168.35.2 from 192.168.35.2 (5.5.5.5)

      Origin IGP, localpref 100, valid, external, multipath, best

      DMZ-Link Bw 1250 kbytes

  64550

    2.2.2.2 (metric 2809856) from 2.2.2.2 (2.2.2.2)

      Origin IGP, metric 0, localpref 100, valid, internal

  64550

    192.168.36.2 from 192.168.36.2 (6.6.6.6)

      Origin IGP, localpref 100, valid, external, multipath

      DMZ-Link Bw 193 kbytes

R3#

Note the proportion of the link BW of path 1 (through 192.168.35.2, 1250 kbytes) against the link BW of path 3 (through 192.168.36.2, 193 kbytes).

AS 64550:

The same configuration can be applied in AS 64550 to obtain a symmetric traffic flow between the two autonomous systems:

R4:

R4#bgpcf

router bgp 64550

 address-family ipv4

 neighbor 5.5.5.5 activate

 

 neighbor 6.6.6.6 activate

 

 neighbor 7.7.7.7 activate

 

 neighbor 7.7.7.7 send-community extended

 

 neighbor 192.168.24.1 activate

 

 neighbor 192.168.24.1 dmzlink-bw

 

 bgp dmzlink-bw

 exit-address-family

R5:

router bgp 64550

 address-family ipv4

 neighbor 4.4.4.4 activate

 

 neighbor 6.6.6.6 activate

  

 neighbor 7.7.7.7 activate

 

 neighbor 7.7.7.7 send-community extended

 

 neighbor 192.168.35.1 activate

 

 neighbor 192.168.35.1 dmzlink-bw

 

 bgp dmzlink-bw

 

 exit-address-family

R6:

router bgp 64550

 address-family ipv4

 neighbor 4.4.4.4 activate

  

 neighbor 5.5.5.5 activate

 

 neighbor 7.7.7.7 activate

 neighbor 7.7.7.7 send-community extended

  

 neighbor 192.168.36.1 activate

 

 neighbor 192.168.36.1 dmzlink-bw

 

 bgp dmzlink-bw

 

 exit-address-family

R7:

router bgp 64550

 address-family ipv4

 neighbor 4.4.4.4 activate

 neighbor 5.5.5.5 activate

 neighbor 6.6.6.6 activate

  

 maximum-paths 3

 maximum-paths ibgp 3

 

 exit-address-family

 

R7#sh ip bgp 10.10.10.1

BGP routing table entry for 10.10.10.0/24, version 9

Paths: (3 available, best #3, table Default-IP-Routing-Table)

Multipath: eBGP iBGP

Flag: 0x800

  Not advertised to any peer

  64540

    5.5.5.5 (metric 2297856) from 5.5.5.5 (5.5.5.5)

      Origin IGP, metric 0, localpref 100, valid, internal, multipath

      DMZ-Link Bw 1250 kbytes

  64540

    6.6.6.6 (metric 2297856) from 6.6.6.6 (6.6.6.6)

      Origin IGP, metric 0, localpref 100, valid, internal, multipath

      DMZ-Link Bw 193 kbytes

  64540

    4.4.4.4 (metric 2297856) from 4.4.4.4 (4.4.4.4)

      Origin IGP, metric 0, localpref 100, valid, internal, multipath, best

      DMZ-Link Bw 12500 kbytes

R7#

 

Table4: best path selection for 10.10.10.1/24 from R7 after configuring BGP link-bw

 

#    Attribute                     Path1      Path2      Path3
1    weight                        0          0          0
2    local preference              100        100        100
3    originated locally            No         No         No
4    AS_PATH                       64540      64540      64540
5    ORIGIN                        i          i          i
6    MED                           0          0          0
7    eBGP<>iBGP                    iBGP       iBGP       iBGP
8    Best IGP metric to NEXT-HOP   2297856    2297856    2297856
9    Multipath                     3 <<<<     3 <<<<     3 <<<<
10   oldest path                   No         No         No
11   Lowest neighbor router-ID     5.5.5.5    4.4.4.4    6.6.6.6

 

CONCLUSION

BGP link-bw provides an optimal way to use link bandwidth resources between autonomous systems. Make sure CEF is enabled (it is by default), that BGP multipath is configured, and that propagation of the extended community to the iBGP neighbors is enabled.

GLBP (Gateway Load Balancing Protocol)


After HSRP and VRRP, it is time to discover GLBP, my favorite, because unlike the others GLBP is a native first-hop gateway load balancer.

Let’s take a brief look at GLBP main characteristics:

Router’s roles

Active AVG (Active Virtual Gateway): there is only one AVG per GLBP group (the member with the highest priority), with the following responsibilities:

- Assign a virtual MAC (vMAC) address to itself and to each member of the group.

- Respond to ARP requests from clients with the vMAC of one of the group members; the method used to select the vMAC determines the load balancing algorithm.

Standby AVG: the router with the highest priority among the remaining members is elected standby AVG and takes over if the active AVG goes down.

AVF (Active Virtual Forwarder): there are up to four AVFs per group, and the active and standby AVGs are also AVFs:

    - Each AVF is active and responsible for forwarding the traffic destined to its own vMAC.

    - Each AVF listens to the others; if one AVF can no longer forward traffic, the listening AVFs compete to take responsibility for the failed AVF's vMAC along with their own (the AVF with the highest weighting wins).

Timers

Holdtime: used to monitor the presence of a group member. If the holdtime for a particular member expires, the owner of the timer considers that member unavailable and sends its hello packet to take part in the election of the router that will become responsible for that member's vMAC.

Redirect timer: when it expires, the AVG stops handing out the failed AVF's vMAC in ARP replies, even if another AVF is still forwarding the traffic destined to that vMAC.

Secondary holdtime: when it expires, all AVFs drop that vMAC and no one is responsible for it anymore.

Tracking: can be configured to monitor interface conditions such as line-protocol and IP routing. If a tracked interface fails to meet the configured condition, the weighting is reduced by a configured penalty; if the weight falls below the lower threshold, the router stops being an AVF.
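The weighting arithmetic can be sketched as follows (hypothetical helpers; the numbers mirror a configured weighting of 70, a track decrement of 50 and the default lower threshold of 1, as used in the lab below):

```python
# Sketch of GLBP weighting: the current weight is the configured weight minus the
# decrements of all failed tracked objects; an AVF keeps forwarding as long as
# its weight stays at or above the lower threshold.
def avf_weight(configured: int, failed_decrements: list) -> int:
    """Current weight after applying the decrements of failed tracked objects."""
    return configured - sum(failed_decrements)

def still_avf(weight: int, lower_threshold: int = 1) -> bool:
    """An AVF relinquishes its role once its weight drops below the lower threshold."""
    return weight >= lower_threshold

print(avf_weight(70, []), still_avf(avf_weight(70, [])))      # tracked interface up
print(avf_weight(70, [50]), still_avf(avf_weight(70, [50])))  # tracked interface down
```

Note that with the default lower threshold of 1, a weight of 70 minus a decrement of 50 still leaves the router an AVF; to force a failover, the lower threshold must be set above the decremented weight.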

Preempt

Whether AVG or AVF, after recovering from a failure, a router can reclaim its role after a configured time period.

Protocol

GLBP group members use the multicast address 224.0.0.102 on UDP port 3222 to communicate with each other.

Load Balancing algorithm

Round-robin: the default LB algorithm; the AVG alternates through the available vMACs in its ARP replies.

Weighted: the weight of each AVF determines the percentage of clients that will use its vMAC.

Host-dependent: a given client always receives the same vMAC.
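The three algorithms can be modeled as different ways for the AVG to pick a vMAC for each ARP reply (a simplified sketch; the function names and the weight-to-share mapping are mine, not IOS internals):

```python
# Simplified model of the AVG answering ARP requests for the virtual IP.
import itertools
import zlib

VMACS = ["0007.b400.0a01", "0007.b400.0a02", "0007.b400.0a03"]
WEIGHTS = {"0007.b400.0a01": 100, "0007.b400.0a02": 70, "0007.b400.0a03": 30}

_rr = itertools.cycle(VMACS)

def round_robin(client_ip: str) -> str:
    """Alternate through the available vMACs (the default algorithm)."""
    return next(_rr)

def host_dependent(client_ip: str) -> str:
    """A given client always receives the same vMAC."""
    return VMACS[zlib.crc32(client_ip.encode()) % len(VMACS)]

_seq = itertools.count()

def weighted(client_ip: str) -> str:
    """Each AVF answers a share of requests proportional to its weight."""
    pool = [m for m in VMACS for _ in range(WEIGHTS[m] // 10)]
    return pool[next(_seq) % len(pool)]
```

In each case, the load balancing decision is taken once per client ARP resolution; all subsequent traffic from that client flows to whichever AVF owns the returned vMAC.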

Recommendations

All the efforts of GLBP (and of HSRP and VRRP, by the way) concentrate on providing redundancy and load balancing upstream, from the clients to the gateways. Keep in mind that the return (downstream) traffic is asymmetric and depends on the routing protocol decision; this can create CAM table aging issues on multilayer switches used as gateways, so it is recommended to match the CAM table MAC aging time with the ARP timeout.

For optimal utilization of GLBP, it is also recommended to match the GLBP client-cache timeout with the ARP timeout.

Figure 1 illustrates the lab topology: R4 is a multilayer switch whose upstream interface Fa0/0 is configured as a routed interface (tracked by GLBP), with an SVI (switched virtual interface) VLAN10 configured for GLBP.

Figure1: Lab topology

The lab is organized as follows:

-I- GLBP configuration

Verification

-II- Testing

Primary AVG failure

Primary AVG recovery

Active AVF failure

AVF tracked interface failure

AVF tracked interface recovery

GLBP configuration

 

R2:

track 1 interface FastEthernet0/0 line-protocol

interface FastEthernet1/0

ip address 10.20.20.2 255.255.255.0

speed auto

full-duplex

glbp 10 ip 10.20.20.1

glbp 10 preempt

glbp 10 weighting 70

glbp 10 weighting track 1 decrement 50

R3:

track 1 interface FastEthernet0/0 line-protocol

interface FastEthernet1/0

ip address 10.20.20.3 255.255.255.0

speed auto

full-duplex

glbp 10 ip 10.20.20.1

glbp 10 priority 200

glbp 10 preempt

glbp 10 weighting track 1 decrement 50 

R4:

track 1 interface FastEthernet0/0 line-protocol

interface Vlan10

ip address 10.20.20.4 255.255.255.0

glbp 10 ip 10.20.20.1

glbp 10 priority 50

glbp 10 preempt

glbp 10 weighting 30 

 

Verification

#sh track 1

Track 1

Interface FastEthernet0/0 line-protocol

Line protocol is Up

1 change, last change 02:30:20

Tracked by:

GLBP FastEthernet1/0 10 

The same interface is tracked in all routers R2, R3 and R4.

R2:

R2#sh glbp brief

Interface   Grp  Fwd Pri State    Address         Active router   Standby router

Fa1/0       10   -   100 Standby  10.20.20.1      10.20.20.3      local

Fa1/0       10   1   7   Listen   0007.b400.0a01  10.20.20.3      -

Fa1/0       10   2   7   Active   0007.b400.0a02  local           -

Fa1/0       10   3   7   Listen   0007.b400.0a03  10.20.20.4      -

R2#

 

R2#sh glbp

FastEthernet1/0 – Group 10


State is Standby

1 state change, last state change 02:30:47

Virtual IP address is 10.20.20.1

Hello time 3 sec, hold time 10 sec

Next hello sent in 0.912 secs

Redirect time 600 sec, forwarder time-out 14400 sec

Preemption enabled, min delay 0 sec


Active is 10.20.20.3, priority 200 (expires in 7.220 sec)


Standby is local

Priority 100 (default)

Weighting 70 (configured 70), thresholds: lower 1, upper 70

Track object 1 undefined

Load balancing: round-robin

Group members:

cc00.026c.0010 (10.20.20.2) local

cc01.026c.0010 (10.20.20.3)

cc02.1514.0000 (10.20.20.4)

There are 3 forwarders (1 active)


Forwarder 1


State is Listen

MAC address is 0007.b400.0a01 (learnt)

Owner ID is cc01.026c.0010

Time to live: 14398.920 sec (maximum 14400 sec)

Preemption enabled, min delay 30 sec


Active is 10.20.20.3 (primary), weighting 100 (expires in 8.912 sec)


Forwarder 2


State is Active

1 state change, last state change 02:30:55

MAC address is 0007.b400.0a02 (default)

Owner ID is cc00.026c.0010

Preemption enabled, min delay 30 sec


Active is local, weighting 70


Forwarder 3


State is Listen

MAC address is 0007.b400.0a03 (learnt)

Owner ID is cc02.1514.0000

Time to live: 14399.268 sec (maximum 14400 sec)

Preemption enabled, min delay 30 sec


Active is 10.20.20.4 (primary), weighting 30 (expires in 9.260 sec)

R2#

R2 is the standby AVG with priority 100; the active AVG is R3 with priority 200.

The group members section lists the real MAC addresses of the routers participating in the GLBP group.

All three routers R2, R3 and R4 are AVFs, each forwarding the traffic destined to its own virtual MAC address:

R2 virtual MAC = 0007.b400.0a02

R3 virtual MAC = 0007.b400.0a01

R4 virtual MAC = 0007.b400.0a03

Each AVF also listens for the other vMACs, ready to take over a vMAC should its owner fail; priority is given to the AVF with the highest weighting first.

This is confirmed by the outputs of R3 and R4.

R3:

R3#sh glbp brief

Interface   Grp  Fwd Pri State    Address         Active router   Standby router

Fa1/0       10   -   200 Active   10.20.20.1      local           10.20.20.2

Fa1/0       10   1   7   Active   0007.b400.0a01  local           -

Fa1/0       10   2   7   Listen   0007.b400.0a02  10.20.20.2      -

Fa1/0       10   3   7   Listen   0007.b400.0a03  10.20.20.4      -

R3# 

 

R3#sh glbp

FastEthernet1/0 – Group 10

State is Active

2 state changes, last state change 02:45:37

Virtual IP address is 10.20.20.1

Hello time 3 sec, hold time 10 sec

Next hello sent in 0.860 secs

Redirect time 600 sec, forwarder time-out 14400 sec

Preemption enabled, min delay 0 sec

Active is local

Standby is 10.20.20.2, priority 100 (expires in 8.556 sec)

Priority 200 (configured)

Weighting 100 (default 100), thresholds: lower 1, upper 100

Track object 1 state Up decrement 50

Load balancing: round-robin

Group members:

cc00.026c.0010 (10.20.20.2)

cc01.026c.0010 (10.20.20.3) local

cc02.1514.0000 (10.20.20.4)

There are 3 forwarders (1 active)

Forwarder 1

State is Active

1 state change, last state change 02:45:27

MAC address is 0007.b400.0a01 (default)

Owner ID is cc01.026c.0010

Redirection enabled

Preemption enabled, min delay 30 sec

Active is local, weighting 100

Forwarder 2

State is Listen

MAC address is 0007.b400.0a02 (learnt)

Owner ID is cc00.026c.0010

Redirection enabled, 597.784 sec remaining (maximum 600 sec)

Time to live: 14397.784 sec (maximum 14400 sec)

Preemption enabled, min delay 30 sec

Active is 10.20.20.2 (primary), weighting 70 (expires in 7.788 sec)

Forwarder 3

State is Listen

MAC address is 0007.b400.0a03 (learnt)

Owner ID is cc02.1514.0000

Redirection enabled, 597.512 sec remaining (maximum 600 sec)

Time to live: 14397.512 sec (maximum 14400 sec)

Preemption enabled, min delay 30 sec

Active is 10.20.20.4 (primary), weighting 30 (expires in 7.512 sec)

R3# 

R4:

R4#sh glbp brief

Interface   Grp  Fwd Pri State    Address         Active router   Standby router

Vl10        10   -   50  Listen   10.20.20.1      10.20.20.3      10.20.20.2

Vl10        10   1   7   Listen   0007.b400.0a01  10.20.20.3      -

Vl10        10   2   7   Listen   0007.b400.0a02  10.20.20.2      -

Vl10        10   3   7   Active   0007.b400.0a03  local           -

R4# 

 

R4#sh glbp

Vlan10 – Group 10


State is Listen

Virtual IP address is 10.20.20.1

Hello time 3 sec, hold time 10 sec

Next hello sent in 1.640 secs

Redirect time 600 sec, forwarder time-out 14400 sec

Preemption enabled, min delay 0 sec


Active is 10.20.20.3, priority 200 (expires in 8.192 sec)


Standby is 10.20.20.2, priority 100 (expires in 8.900 sec)

Priority 50 (configured)

Weighting 30 (configured 30), thresholds: lower 1, upper 30

Load balancing: round-robin

Group members:

cc00.026c.0010 (10.20.20.2)

cc01.026c.0010 (10.20.20.3)

cc02.1514.0000 (10.20.20.4) local

There are 3 forwarders (1 active)


Forwarder 1


State is Listen

MAC address is 0007.b400.0a01 (learnt)

Owner ID is cc01.026c.0010

Time to live: 14398.188 sec (maximum 14400 sec)

Preemption enabled, min delay 30 sec


Active is 10.20.20.3 (primary), weighting 100 (expires in 7.500 sec)


Forwarder 2


State is Listen

MAC address is 0007.b400.0a02 (learnt)

Owner ID is cc00.026c.0010

Time to live: 14398.212 sec (maximum 14400 sec)

Preemption enabled, min delay 30 sec


Active is 10.20.20.2 (primary), weighting 70 (expires in 8.212 sec)


Forwarder 3


State is Active

1 state change, last state change 02:34:56

MAC address is 0007.b400.0a03 (default)

Owner ID is cc02.1514.0000

Preemption enabled, min delay 30 sec


Active is local, weighting 30

R4#
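The outputs above can be reproduced with a base configuration along these lines (a sketch only: the virtual IP, group number and priorities are taken from the show outputs, while the interface names and exact command set are assumptions, not the original lab files):

```
! R3 (sketch): highest priority, becomes the AVG
interface FastEthernet1/0
 ip address 10.20.20.3 255.255.255.0
 glbp 10 ip 10.20.20.1
 glbp 10 priority 200
 glbp 10 preempt
 glbp 10 load-balancing round-robin
!
! R2 keeps the default priority (100); R4 is configured with "glbp 10 priority 50"
```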

 

Figure 2 summarizes the GLBP states:

 

Figure2: GLBP states

Virtual MAC assignment:

R22:

R22#ping 10.20.20.1

 

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 10.20.20.1, timeout is 2 seconds:

.!!!!

Success rate is 80 percent (4/5), round-trip min/avg/max = 32/62/88 ms

R22#sh arp

Protocol Address Age (min) Hardware Addr Type Interface

Internet 10.20.20.1 0 0007.b400.0a01 ARPA FastEthernet0/0

Internet 10.20.20.4 37 cc02.1514.0000 ARPA FastEthernet0/0

Internet 10.20.20.22 – cc03.1514.0000 ARPA FastEthernet0/0

R22#

Note: the first ping after clearing the ARP table generally loses the first ICMP echo; this corresponds to the ARP exchange needed to learn the MAC address of the destination IP.

In response to its ARP request, R22 received from the virtual gateway (the AVG, R3) the virtual MAC owned by R3, so all traffic from R22 is forwarded by R3, as confirmed by the output of the trace command. Note the structure of the GLBP virtual MAC, 0007.b400.XXYY: XX is the group number in hex (0x0a = group 10) and YY is the forwarder number.

R22#trace 10.10.10.1

 

Type escape sequence to abort.

Tracing the route to 10.10.10.1

 

1 10.20.20.3 68 msec 64 msec 96 msec

2 10.10.10.1 68 msec 76 msec 36 msec

R22#

R33:

R33#ping 10.20.20.1

 

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 10.20.20.1, timeout is 2 seconds:

..!!!

Success rate is 60 percent (3/5), round-trip min/avg/max = 28/64/120 ms

R33#sh arp

Protocol Address Age (min) Hardware Addr Type Interface

Internet 10.20.20.2 0 cc00.026c.0010 ARPA FastEthernet0/0

Internet 10.20.20.1 0 0007.b400.0a02 ARPA FastEthernet0/0

Internet 10.20.20.4 37 cc02.1514.0000 ARPA FastEthernet0/0

Internet 10.20.20.33 – cc04.1514.0000 ARPA FastEthernet0/0

R33#

In response to its ARP request, R33 received from the AVG (R3) the virtual MAC owned by R2, so all traffic from R33 is forwarded by R2, as confirmed by the output of the trace command:

 

R33#trace 10.10.10.1

 

Type escape sequence to abort.

Tracing the route to 10.10.10.1

 

1 10.20.20.2 76 msec 108 msec 32 msec

2 10.10.10.1 64 msec 36 msec 72 msec

R33#

R44:

R44#ping 10.20.20.1

 

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 10.20.20.1, timeout is 2 seconds:

.!!!!

Success rate is 80 percent (4/5), round-trip min/avg/max = 20/45/76 ms

R44#sh arp

Protocol Address Age (min) Hardware Addr Type Interface

Internet 10.20.20.1 0 0007.b400.0a03 ARPA FastEthernet0/0

Internet 10.20.20.4 37 cc02.1514.0000 ARPA FastEthernet0/0

Internet 10.20.20.44 – cc05.1514.0000 ARPA FastEthernet0/0

R44#

In response to its ARP request, R44 received from the AVG (R3) the virtual MAC owned by R4, so all traffic from R44 is forwarded by R4, as confirmed by the output of the trace command:

R44#trace 10.10.10.1

 

Type escape sequence to abort.

Tracing the route to 10.10.10.1

 

1 10.20.20.4 40 msec 80 msec 28 msec

2 10.10.10.1 212 msec 76 msec 104 msec

R44#

 

TESTING

Primary AVG failure

 

At this point we stop the active AVG, R3.

R2:

*Mar 1 00:02:30.759: %SYS-5-CONFIG_I: Configured from console by admin on console

*Mar 1 00:03:08.003: GLBP: Fa1/0 10 Standby: g/Active timer expired (10.20.20.3)

*Mar 1 00:03:08.007: GLBP: Fa1/0 10 Active router IP is local, was 10.20.20.3

*Mar 1 00:03:08.007: GLBP: Fa1/0 10 Standby router is unknown, was local

*Mar 1 00:03:08.011: GLBP: Fa1/0 10 Standby -> Active

*Mar 1 00:03:08.011: %GLBP-6-STATECHANGE: FastEthernet1/0 Grp 10 state Standby -> Active

 

R2#sh glbp brief

Interface Grp Fwd Pri State Address Active router Standby route

Fa1/0 10 – 100 Active 10.20.20.1 local 10.20.20.4

Fa1/0 10 1 7 Listen 0007.b400.0a01 10.20.20.4 –

Fa1/0 10 2 7 Active 0007.b400.0a02 local –

Fa1/0 10 3 7 Listen 0007.b400.0a03 10.20.20.4 –

R2#

R4 (10.20.20.4) is now honoring the memory of the defunct R3 🙂 by forwarding all traffic destined to R3's virtual MAC (0007.b400.0a01).

R4:

*Mar 1 05:30:33.074: GLBP: Vl10 10 Listen: g/Active timer expired (10.20.20.3)

*Mar 1 05:30:33.078: GLBP: Vl10 10 Active router IP is unknown, was 10.20.20.3

*Mar 1 05:30:33.082: GLBP: Vl10 10 Listen -> Speak

*Mar 1 05:30:33.086: GLBP: Vl10 10.1 Listen: g/Active timer expired

*Mar 1 05:30:33.090: GLBP: Vl10 10.1 Listen -> Active

*Mar 1 05:30:33.094: %GLBP-6-FWDSTATECHANGE: Vlan10 Grp 10 Fwd 1 state Listen -> Active

*Mar 1 05:30:33.110: GLBP: Vl10 10 Active router IP is 10.20.20.2

*Mar 1 05:30:33.114: GLBP: Vl10 10 Standby router is unknown, was 10.20.20.2

*Mar 1 05:30:33.118: GLBP: Vl10 10.1 Active: j/Hello rcvd from lower pri Active router (135/10.20.20.2)

*Mar 1 05:30:43.078: GLBP: Vl10 10 Speak: f/Standby timer expired (unknown)

*Mar 1 05:30:43.082: GLBP: Vl10 10 Standby router is local

*Mar 1 05:30:43.082: GLBP: Vl10 10 Speak -> Standby

 

R4#sh glbp brief

Interface Grp Fwd Pri State Address Active router Standby route

Vl10 10 – 50 Standby 10.20.20.1 10.20.20.2 local

Vl10 10 1 7 Active 0007.b400.0a01 local –

Vl10 10 2 7 Listen 0007.b400.0a02 10.20.20.2 –

Vl10 10 3 7 Active 0007.b400.0a03 local –

R4#

R4 became the standby AVG for the new active AVG R2.

And during this time R33 is still pointing at R3's virtual MAC, but that traffic is now forwarded by R4.

To see whether R3's vMAC is still present in the AVG's vMAC list, we clear R33's ARP table to renew the vMAC entry for 10.20.20.1:

R33#clear arp 

 

R33#ping 10.10.10.1

 

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 10.10.10.1, timeout is 2 seconds:

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 44/75/112 ms

R33#sh arp

Protocol Address Age (min) Hardware Addr Type Interface

Internet 10.20.20.2 0 cc00.026c.0010 ARPA FastEthernet0/0

Internet 10.20.20.1 0 0007.b400.0a01 ARPA FastEthernet0/0

Internet 10.20.20.4 0 cc02.1514.0000 ARPA FastEthernet0/0

Internet 10.20.20.33 – cc04.1514.0000 ARPA FastEthernet0/0

R33#

And the result is that R3's vMAC is still used by the AVG to answer ARP requests.

Primary AVG recovery

 

Now we bring R3 back to life:

R2:

*Mar 1 01:25:00.195: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 10: Neighbor 10.10.10.3 (FastEthernet0/0) is up: new adjacency

*Mar 1 01:25:00.199: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 10: Neighbor 10.20.20.3 (FastEthernet1/0) is up: new adjacency

*Mar 1 01:25:14.175: GLBP: Fa1/0 Grp 10 Hello in 10.20.20.3 VG Active pri 200 vIP 10.20.20.1 hello 3000, hold 10000

*Mar 1 01:25:14.179: GLBP: Fa1/0 10 Active router IP is 10.20.20.3, was local

*Mar 1 01:25:14.183: GLBP: Fa1/0 10 Active: k/Hello rcvd from higher pri Active router (200/10.20.20.3)

*Mar 1 01:25:14.183: GLBP: Fa1/0 10 Active -> Speak

*Mar 1 01:25:14.187: %GLBP-6-STATECHANGE: FastEthernet1/0 Grp 10 state Active -> Speak

*Mar 1 01:25:24.183: GLBP: Fa1/0 10 Speak: f/Standby timer expired (10.20.20.4)

*Mar 1 01:25:24.187: GLBP: Fa1/0 10 Standby router is local, was 10.20.20.4

*Mar 1 01:25:24.187: GLBP: Fa1/0 10 Speak -> Standby

 

R2# sh glbp brief

Interface Grp Fwd Pri State Address Active router Standby route

Fa1/0 10 – 100 Standby 10.20.20.1 10.20.20.3 local

Fa1/0 10 1 7 Listen 0007.b400.0a01 10.20.20.3 –

Fa1/0 10 2 7 Active 0007.b400.0a02 local –

Fa1/0 10 3 7 Listen 0007.b400.0a03 10.20.20.4 –

R2#

As soon as R3 is back in business, it takes back the active AVG role thanks to its higher priority (200).

The ARP entry on R33 still holds the previously learned vMAC because it hasn't timed out yet; let's help it along by clearing the ARP table:

R33#clear arp

R33#

R33#ping 10.20.20.1

 

Type escape sequence to abort.

Sending 5, 100-byte ICMP Echos to 10.20.20.1, timeout is 2 seconds:

.!!!!

Success rate is 80 percent (4/5), round-trip min/avg/max = 28/71/128 ms

R33# 

With the new ARP request, R33 has received R3's vMAC, and traffic destined to 0007.b400.0a01 (R3's vMAC) is now forwarded by R3 itself.

R33#sh arp | i 10.20.20.1

Internet 10.20.20.1 1 0007.b400.0a01 ARPA FastEthernet0/0

R33#

 

R33#trace 10.10.10.1

 

Type escape sequence to abort.

Tracing the route to 10.10.10.1

 

1 10.20.20.3 36 msec 88 msec 60 msec

2 * 10.10.10.1 188 msec 92 msec

R33#

 

Active AVF failure

 

First, let's take a look at the actual distribution of vMACs by the active AVG:

R22:

R22#sh arp | i 10.20.20.1

Internet 10.20.20.1 0 0007.b400.0a02 ARPA FastEthernet0/0

R22#

R33:

R33#sh arp | i 10.20.20.1

Internet 10.20.20.1 0 0007.b400.0a03 ARPA FastEthernet0/0

R33#

R44:

R44#sh arp | i 10.20.20.1

Internet 10.20.20.1 0 0007.b400.0a01 ARPA FastEthernet0/0

R44#

– R22 uses R2's vMAC as its gateway MAC.

– R33 uses R4's vMAC as its gateway MAC.

– R44 uses R3's vMAC as its gateway MAC.

Now let's shut down R4:

R3:

*Mar 1 00:38:14.503: GLBP: Fa1/0 10.3 Listen: g/Active timer expired

*Mar 1 00:38:14.507: GLBP: Fa1/0 10.3 Listen -> Active

*Mar 1 00:38:14.507: %GLBP-6-FWDSTATECHANGE: FastEthernet1/0 Grp 10 Fwd 3 state Listen -> Active

*Mar 1 00:38:14.587: GLBP: Fa1/0 10.3 Active: j/Hello rcvd from lower pri Active router (135/10.20.20.2)

R4 was responsible for its vMAC 0007.b400.0a03; now R3 takes over responsibility for it (R2 having the lower priority).

R33#sh arp | i 10.20.20.1

Internet 10.20.20.1 16 0007.b400.0a03 ARPA FastEthernet0/0

R33#

R33 still uses R4's vMAC, but its traffic is now forwarded by R3.

R33#trace 10.10.10.1

 

Type escape sequence to abort.

Tracing the route to 10.10.10.1

 

1 10.20.20.3 124 msec 128 msec 64 msec

2 10.10.10.1 120 msec 88 msec 32 msec

R33#

 

AVF tracked interface failure

 

Next, starting with the following configuration:

R2:

R2#sh glbp brief

Interface Grp Fwd Pri State Address Active router Standby route

Fa1/0 10 – 100 Standby 10.20.20.1 10.20.20.3 local

Fa1/0 10 1 7 Listen 0007.b400.0a01 10.20.20.3 –

Fa1/0 10 2 7 Listen 0007.b400.0a02 10.20.20.4 –

Fa1/0 10 3 7 Active 0007.b400.0a03 local –

R2#

R2 plays the following roles:

– the standby AVG.

– the active AVF for its own vMAC 0007.b400.0a03.

– a listening AVF for R3's vMAC 0007.b400.0a01 and R4's vMAC 0007.b400.0a02.

R3:

R3(config-if)#do sh glbp brief

Interface Grp Fwd Pri State Address Active router Standby route

Fa1/0 10 – 200 Active 10.20.20.1 local 10.20.20.2

Fa1/0 10 1 7 Active 0007.b400.0a01 local –

Fa1/0 10 2 7 Listen 0007.b400.0a02 10.20.20.4 –

Fa1/0 10 3 7 Listen 0007.b400.0a03 10.20.20.2 –

R3(config-if)#

R3 plays the following roles:

– the active AVG.

– the active AVF for its own vMAC 0007.b400.0a01.

– a listening AVF for R2's vMAC 0007.b400.0a03 and R4's vMAC 0007.b400.0a02.

R4:

R4(config)#do sh glbp brief

Interface Grp Fwd Pri State Address Active router Standby route

Vl10 10 – 50 Listen 10.20.20.1 10.20.20.3 10.20.20.2

Vl10 10 1 7 Listen 0007.b400.0a01 10.20.20.3 –

Vl10 10 2 7 Active 0007.b400.0a02 local –

Vl10 10 3 7 Listen 0007.b400.0a03 10.20.20.2 –

R4(config)#

R4 plays the following roles:

– a listening AVG.

– the active AVF for its own vMAC 0007.b400.0a02.

– a listening AVF for R2's vMAC 0007.b400.0a03 and R3's vMAC 0007.b400.0a01.

R4(config)#do sh glbp

Vlan10 – Group 10

State is Listen

2 state changes, last state change 00:11:20

Virtual IP address is 10.20.20.1

Hello time 3 sec, hold time 10 sec

Next hello sent in 1.016 secs

Redirect time 600 sec, forwarder time-out 14400 sec

Preemption enabled, min delay 0 sec

Active is 10.20.20.3, priority 200 (expires in 7.012 sec)

Standby is 10.20.20.2, priority 100 (expires in 8.180 sec)

Priority 50 (configured)


Weighting 30 (configured 30), thresholds: lower 1, upper 30


Track object 1 state Up decrement 30

Load balancing: round-robin

Group members:

cc00.097c.0010 (10.20.20.2)

cc01.097c.0010 (10.20.20.3)

cc02.15e8.0000 (10.20.20.4) local

There are 3 forwarders (1 active)

Forwarder 1

State is Listen

2 state changes, last state change 00:10:43

MAC address is 0007.b400.0a01 (learnt)

Owner ID is cc01.097c.0010

Time to live: 14398.316 sec (maximum 14400 sec)

Preemption enabled, min delay 30 sec

Active is 10.20.20.3 (primary), weighting 100 (expires in 8.308 sec)

Forwarder 2

State is Active

1 state change, last state change 00:26:38

MAC address is 0007.b400.0a02 (default)

Owner ID is cc02.15e8.0000

Preemption enabled, min delay 30 sec

Active is local, weighting 30

Forwarder 3

State is Listen

MAC address is 0007.b400.0a03 (learnt)

Owner ID is cc00.097c.0010

Time to live: 14399.508 sec (maximum 14400 sec)

Preemption enabled, min delay 30 sec

Active is 10.20.20.2 (primary), weighting 70 (expires in 9.504 sec)

R4(config)#

Note that R4 has a weighting of 30 and incurs a penalty (decrement) of 30 if the tracked interface's line protocol goes down; note also the lower threshold of 1, below which the router will not act as AVF even for its own vMAC.
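A weighting configuration consistent with this output would look something like the following (a sketch: the track object number, interface and values are read from the show output, while the exact original commands are assumptions):

```
! Track the upstream interface's line protocol (sketch)
track 1 interface FastEthernet0/0 line-protocol
!
interface Vlan10
 glbp 10 weighting 30 lower 1 upper 30
 glbp 10 weighting track 1 decrement 30
```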

Let's shut down the upstream tracked interface of R4 (the AVF):

R4:

R4(config)#int fa 0/0

R4(config-if)#sh

*Mar 1 00:43:09.359: %LINK-5-CHANGED: Interface FastEthernet0/0, changed state to administratively down

*Mar 1 00:43:10.359: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/0, changed state to down

*Mar 1 00:43:42.827: %GLBP-6-FWDSTATECHANGE: Vlan10 Grp 10 Fwd 2 state Active -> Listen

R4(config-if)#

 

R4(config-if)#do sh glbp

Vlan10 – Group 10

State is Listen

2 state changes, last state change 00:31:31

Virtual IP address is 10.20.20.1

Hello time 3 sec, hold time 10 sec

Next hello sent in 2.772 secs

Redirect time 600 sec, forwarder time-out 14400 sec

Preemption enabled, min delay 0 sec

Active is 10.20.20.3, priority 200 (expires in 8.488 sec)

Standby is 10.20.20.2, priority 100 (expires in 9.668 sec)

Priority 50 (configured)


Weighting 0, low (configured 30), thresholds: lower 1, upper 30

Track object 1 state Down decrement 30

Load balancing: round-robin

The initial weighting of 30 is decremented by 30; with a resulting weighting of 0, below the lower threshold of 1, R4's GLBP status changes from active AVF for its own vMAC to listening AVF.

R3:

*Mar 1 00:27:12.595: GLBP: Fa1/0 10.2 Preemption delayed, 30 secs remaining

*Mar 1 00:27:24.627: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 10: Neighbor 10.10.10.4 (FastEthernet0/0) is down: holding time expired

*Mar 1 00:27:45.607: GLBP: Fa1/0 10.2 Listen: k/Hello rcvd from lower pri Active router (39/10.20.20.4)

*Mar 1 00:27:45.611: GLBP: Fa1/0 10.2 Listen -> Active

*Mar 1 00:27:45.611: %GLBP-6-FWDSTATECHANGE: FastEthernet1/0 Grp 10 Fwd 2 state Listen -> Active

*Mar 1 00:27:45.623: GLBP: Fa1/0 10.2 Active: j/Hello rcvd from lower pri Active router (135/10.20.20.2)

R3 and R2 are informed of R4's new status and compete for its vMAC; R3 wins the forwarder election (its weighting of 100 beats R2's 70) and becomes responsible for forwarding traffic sent to R4's vMAC 0007.b400.0a02.

 

Let's check this from R33:

R33#sh arp | i 10.20.20.1

Internet 10.20.20.1 141 0007.b400.0a02 ARPA FastEthernet0/0

R33#

 

R33#trace 10.10.10.1

 

Type escape sequence to abort.

Tracing the route to 10.10.10.1

 

1 * 10.20.20.3 84 msec 56 msec

2 * 10.10.10.1 108 msec 132 msec

R33#

Traffic sent by R33 to the gateway virtual IP 10.20.20.1 with the virtual MAC 0007.b400.0a02 (R4's) is forwarded by R3.

 

AVF tracked interface recovery

*Mar 1 01:02:08.971: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 10: Neighbor 10.10.10.3 (FastEthernet0/0) is up: new adjacency

*Mar 1 01:02:09.619: %LINK-3-UPDOWN: Interface FastEthernet0/0, changed state to up

*Mar 1 01:02:12.627: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/0, changed state to up

*Mar 1 01:02:13.827: %DUAL-5-NBRCHANGE: IP-EIGRP(0) 10: Neighbor 10.10.10.2 (FastEthernet0/0) is up: new adjacency

*Mar 1 01:02:38.523: %GLBP-6-FWDSTATECHANGE: Vlan10 Grp 10 Fwd 2 state Listen -> Active

Thirty seconds (the default AVF preempt delay) after R4's interface comes back up, its weighting is incremented by 30 and the router takes back the active AVF role for its vMAC.
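The 30-second delay seen here is the default forwarder preempt delay; it could also be set explicitly along these lines (a sketch, interface name assumed):

```
interface Vlan10
 glbp 10 forwarder preempt delay minimum 30
```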

R4(config-if)#do sh glbp brief

Interface Grp Fwd Pri State Address Active router Standby route

Vl10 10 – 50 Listen 10.20.20.1 10.20.20.3 10.20.20.2

Vl10 10 1 7 Listen 0007.b400.0a01 10.20.20.3 –

Vl10 10 2 7 Active 0007.b400.0a02 local –

Vl10 10 3 7 Listen 0007.b400.0a03 10.20.20.2 –

R4(config-if)#

 

GLBP is a better solution for gateway (first-hop) load balancing than HSRP or VRRP. For more information about the differences between these protocols, refer to the post entitled "First Hop Redundancy protocol comparison": https://cciethebeginning.wordpress.com/2008/08/23/router-high-availability-protocol-comparison-2/

First Hop Redundancy protocol comparison (HSRP, VRRP, GLBP)

Router roles:

– HSRP (Hot Standby Router Protocol): 1 active router, 1 standby router, 1 or more listening routers. Uses a virtual IP address.

– VRRP (Virtual Router Redundancy Protocol): 1 master router, 1 or more backup routers. Can use a real router IP address; if it does not, the router with the highest priority becomes master.

– GLBP (Gateway Load Balancing Protocol): 1 AVG (Active Virtual Gateway), up to 4 AVFs (Active Virtual Forwarders) passing traffic per group, and up to 1024 virtual routers (GLBP groups) per physical interface. Uses a virtual IP address.

Scope:

– HSRP: Cisco proprietary.

– VRRP: open standard (IETF).

– GLBP: Cisco proprietary.

Election (HSRP active router, VRRP master router (*), GLBP AVG):

1. Highest priority.

2. Highest IP address (tiebreaker).

Optimization features: all three protocols support tracking, preemption and timer adjustments.

Traffic type:

– HSRP: 224.0.0.2, UDP 1985 (version 1); 224.0.0.102, UDP 1985 (version 2).

– VRRP: 224.0.0.18, IP protocol 112.

– GLBP: 224.0.0.102, UDP 3222.

Timers:

– HSRP: hello 3 seconds, hold 10 seconds.

– VRRP: advertisement 1 second; master down interval = 3 × advertisement + skew time, with skew time = (256 − priority) / 256.

– GLBP: hello 3 seconds, hold 10 seconds.

Load-balancing functionality:

– HSRP and VRRP: multiple groups per interface/SVI/routed interface; optimal load balancing requires an appropriate distribution of the virtual gateway IPs among the clients (generally through DHCP).

– GLBP: load-balancing oriented, with weighted, host-dependent and round-robin (default) algorithms; clients are transparently handed a virtual MAC according to the load-balancing algorithm when they ARP for a single virtual gateway.

(*) If the VRRP group's virtual IP is the real IP address configured on one of the routers, IOS makes that router (the IP address owner) the master by setting its priority to 255, the configurable range being [1-254].
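As a quick worked example of the VRRP timers in the table above (assuming the default 1-second advertisement interval and a backup priority of 100):

```
skew time            = (256 - 100) / 256 ≈ 0.609 s
master down interval = 3 × 1 + 0.609 ≈ 3.609 s
```

So a backup router with priority 100 declares the master down after roughly 3.6 seconds of silence; a higher priority yields a smaller skew time, letting the best backup react first.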
