BGP link-bw & multipath Load Balancing
January 19, 2009
An autonomous system can be connected to another through multiple links, and depending on the company's business and redundancy requirements, different schemes can be used:
– Primary/secondary: the second link is used only when the first link fails.
– Symmetric load-sharing: traffic is distributed equally among multiple links at the same time, which provides a high level of redundancy for the enterprise.
However, it is not always possible to provide links of equal bandwidth, either because of financial limits or because such links are not available. Hence the need to engineer traffic across these links according to their bandwidth capacity.
This is where BGP link bandwidth comes in.
With BGP multipath deployed, the decision to use multiple paths to deliver traffic is generally made inside the autonomous system by an iBGP speaker, according to several criteria that do not include the bandwidth of the eBGP links.
BGP link-bw advertises the bandwidth of an autonomous system exit link to iBGP peers as an extended community.
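As a rough illustration of what travels in that extended community, the sketch below packs a link bandwidth value the way I understand the IETF link-bandwidth draft to describe it: a non-transitive two-octet-AS extended community (type 0x40, sub-type 0x04) whose last four octets carry the bandwidth in bytes per second as an IEEE float. The type codes, the helper names and the interface speeds are assumptions made for illustration and are not taken from the lab.

```python
import struct

LINK_BW_TYPE = 0x40     # assumed: non-transitive, two-octet AS specific
LINK_BW_SUBTYPE = 0x04  # assumed: link bandwidth sub-type


def encode_link_bw(asn: int, kbps: int) -> bytes:
    """Pack an AS number and a link speed (kbit/s) into an 8-byte community."""
    bytes_per_sec = kbps * 1000 / 8
    return struct.pack("!BBHf", LINK_BW_TYPE, LINK_BW_SUBTYPE, asn, bytes_per_sec)


def decode_link_bw(community: bytes) -> tuple:
    """Return (AS number, bandwidth in kbytes per second)."""
    _type, _subtype, asn, bytes_per_sec = struct.unpack("!BBHf", community)
    return asn, round(bytes_per_sec / 1000)


if __name__ == "__main__":
    # Hypothetical link speeds: 100 Mbit/s, 10 Mbit/s and a T1 serial link.
    for kbps in (100_000, 10_000, 1_544):
        community = encode_link_bw(64540, kbps)
        print(kbps, community.hex(), decode_link_bw(community))
```

With these assumed speeds the decoded values come out as 12500, 1250 and 193 kbytes, the same figures that appear as DMZ-Link Bw in the verification outputs further down, which suggests (although the post does not state it) that the lab links run at those rates.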
Some requirements have to be considered:
– The feature works only between directly connected eBGP peers.
– Sending of BGP extended communities should be enabled between iBGP peers.
– CEF should be enabled everywhere.
Figure 1 illustrates the lab topology used to implement BGP link-bw.
Figure 1: Topology
Inside AS 64540, R1, R2 and R3 establish full-mesh iBGP sessions. The same applies to AS 64550, where R4, R5, R6 and R7 establish full-mesh iBGP sessions.
The links R2-R4, R3-R5 and R3-R6 carry direct eBGP sessions that use the interface IP addresses as source and destination.
Network default behavior
The default behavior of the network is as follows:
AS 64540:
R1:
R1(config-router)#do sh ip bgp
BGP table version is 3, local router ID is 10.10.10.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.10.10.0/24    0.0.0.0                  0         32768 i
* i70.70.70.0/24    3.3.3.3                  0    100      0 64550 i
*>i                 2.2.2.2                  0    100      0 64550 i
R1(config-router)#

R1(config-router)#do sh ip bgp 70.70.70.0
BGP routing table entry for 70.70.70.0/24, version 3
Paths: (2 available, best #2, table Default-IP-Routing-Table)
  Not advertised to any peer
  64550
    3.3.3.3 (metric 2297856) from 3.3.3.3 (3.3.3.3)
      Origin IGP, metric 0, localpref 100, valid, internal
  64550
    2.2.2.2 (metric 2297856) from 2.2.2.2 (2.2.2.2)
      Origin IGP, metric 0, localpref 100, valid, internal, best
R1(config-router)#
The default path chosen is through the R2-R4 link:
R1(config-router)#do traceroute 70.70.70.1 source 10.10.10.1

Type escape sequence to abort.
Tracing the route to 70.70.70.1

  1 192.168.12.2 24 msec 320 msec 452 msec
  2 192.168.24.2 1004 msec 716 msec 484 msec
  3 192.168.47.2 292 msec * 556 msec
R1(config-router)#
So the traffic from R1 to R7 takes the path R1-R2-R4-R7.
Table 1: Best path selection for 70.70.70.0/24 from R1

| #  | Attribute                   | Path 1  | Path 2      |
|----|-----------------------------|---------|-------------|
| 1  | Weight                      | 0       | 0           |
| 2  | Local preference            | 100     | 100         |
| 3  | Originated locally          | No      | No          |
| 4  | AS_PATH                     | 64550   | 64550       |
| 5  | ORIGIN                      | i       | i           |
| 6  | MED                         | 0       | 0           |
| 7  | eBGP<>iBGP                  | iBGP    | iBGP        |
| 8  | Best IGP metric to NEXT-HOP | 2297856 | 2297856     |
| 9  | Multipath                   | No      | No          |
| 10 | Oldest path                 | No      | No          |
| 11 | Lowest neighbor router-ID   | 3.3.3.3 | 2.2.2.2 <<< |
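The walk through Table 1 can be sketched as a sort key: each attribute is compared in order and the first difference decides. This is a simplified illustration, not the router's actual implementation. It leaves out the "originated locally" and "oldest path" steps and ignores the rule that MED is normally compared only between paths from the same neighboring AS. The dictionary field names are made up for the example.

```python
from ipaddress import IPv4Address

ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}  # IGP < EGP < incomplete


def preference_key(path: dict) -> tuple:
    """Smaller key means more preferred path (simplified comparison order)."""
    return (
        -path["weight"],                      # 1. highest weight
        -path["local_pref"],                  # 2. highest local preference
        len(path["as_path"]),                 # 4. shortest AS_PATH
        ORIGIN_RANK[path["origin"]],          # 5. lowest origin type
        path["med"],                          # 6. lowest MED
        0 if path["ebgp"] else 1,             # 7. eBGP preferred over iBGP
        path["igp_metric"],                   # 8. lowest IGP metric to next hop
        int(IPv4Address(path["router_id"])),  # 11. lowest neighbor router ID
    )


def best_path(paths: list) -> dict:
    """Pick the most preferred path among the candidates."""
    return min(paths, key=preference_key)


# The two paths R1 holds for 70.70.70.0/24, taken from the output above.
PATHS_R1 = [
    {"next_hop": "3.3.3.3", "weight": 0, "local_pref": 100, "as_path": [64550],
     "origin": "i", "med": 0, "ebgp": False, "igp_metric": 2297856,
     "router_id": "3.3.3.3"},
    {"next_hop": "2.2.2.2", "weight": 0, "local_pref": 100, "as_path": [64550],
     "origin": "i", "med": 0, "ebgp": False, "igp_metric": 2297856,
     "router_id": "2.2.2.2"},
]

if __name__ == "__main__":
    print(best_path(PATHS_R1)["next_hop"])
```

Because every attribute up to the IGP metric is identical, only the last element of the key differs, which is why the path through 2.2.2.2 wins on router ID alone.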
AS 64550:
R7:
R7(config-router)#do sh ip bgp
BGP table version is 3, local router ID is 70.70.70.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
* i10.10.10.0/24    5.5.5.5                  0    100      0 64540 i
*>i                 4.4.4.4                  0    100      0 64540 i
* i                 6.6.6.6                  0    100      0 64540 i
*> 70.70.70.0/24    0.0.0.0                  0         32768 i
R7(config-router)#

R7(config-router)#do traceroute 10.10.10.1 source 70.70.70.1

Type escape sequence to abort.
Tracing the route to 10.10.10.1

  1 192.168.47.1 8 msec 268 msec 104 msec
  2 192.168.24.1 164 msec 348 msec 136 msec
  3 192.168.12.1 276 msec * 260 msec
R7(config-router)#
So the traffic from R7 to R1 takes the path R7-R4-R2-R1
R7(config-router)#do sh ip bgp 10.10.10.0
BGP routing table entry for 10.10.10.0/24, version 3
Paths: (3 available, best #2, table Default-IP-Routing-Table)
  Not advertised to any peer
  64540
    5.5.5.5 (metric 2297856) from 5.5.5.5 (5.5.5.5)
      Origin IGP, metric 0, localpref 100, valid, internal
  64540
    4.4.4.4 (metric 2297856) from 4.4.4.4 (4.4.4.4)
      Origin IGP, metric 0, localpref 100, valid, internal, best
  64540
    6.6.6.6 (metric 2297856) from 6.6.6.6 (6.6.6.6)
      Origin IGP, metric 0, localpref 100, valid, internal
R7(config-router)#
The R4-R2 link is chosen as the best path to reach the prefix 10.10.10.0/24:
Table 2: Best path selection for 10.10.10.0/24 from R7

| #  | Attribute                   | Path 1  | Path 2      | Path 3  |
|----|-----------------------------|---------|-------------|---------|
| 1  | Weight                      | 0       | 0           | 0       |
| 2  | Local preference            | 100     | 100         | 100     |
| 3  | Originated locally          | No      | No          | No      |
| 4  | AS_PATH                     | 64540   | 64540       | 64540   |
| 5  | ORIGIN                      | i       | i           | i       |
| 6  | MED                         | 0       | 0           | 0       |
| 7  | eBGP<>iBGP                  | iBGP    | iBGP        | iBGP    |
| 8  | Best IGP metric to NEXT-HOP | 2297856 | 2297856     | 2297856 |
| 9  | Multipath                   | No      | No          | No      |
| 10 | Oldest path                 | No      | No          | No      |
| 11 | Lowest neighbor router-ID   | 5.5.5.5 | 4.4.4.4 <<< | 6.6.6.6 |
BGP Link-BW deployment
The best way to use the bandwidth resources is to load-share the traffic among the three eBGP links according to their bandwidth.
Let's recall the requirements for using BGP link-bw:
– BGP multipath must be configured.
– Sending of BGP extended communities must be enabled between iBGP peers.
– CEF must be enabled everywhere.
General configuration:
On each BGP speaker that receives several paths to the same destination, enable multipath:
router bgp <ASnbr>
 maximum-paths <n>
 maximum-paths ibgp <n>

router bgp <ASnbr>
 address-family ipv4
  neighbor <iBGP_peer> activate
  neighbor <iBGP_peer> send-community extended
  !iBGP peer to which the extended community is to be sent.
  neighbor <eBGP_peer> activate
  neighbor <eBGP_peer> dmzlink-bw
  !Allow the eBGP link bandwidth to be propagated through the link-bw extended community.
  bgp dmzlink-bw
  !"bgp dmzlink-bw" is configured on any router whose eBGP link bandwidth
  !will be used for load-balancing.
 exit-address-family
AS 64540:
R1 (iBGP):
router bgp 64540
 address-family ipv4
  neighbor 2.2.2.2 activate
  neighbor 3.3.3.3 activate
  maximum-paths 3
  maximum-paths ibgp 3
 exit-address-family
eBGP speaker R2:
router bgp 64540
 address-family ipv4
  neighbor 1.1.1.1 activate
  neighbor 1.1.1.1 send-community extended
  neighbor 1.1.1.1 next-hop-self
  neighbor 3.3.3.3 activate
  neighbor 3.3.3.3 next-hop-self
  neighbor 192.168.24.2 activate
  neighbor 192.168.24.2 dmzlink-bw
  bgp dmzlink-bw
 exit-address-family
eBGP speaker R3:
router bgp 64540
 address-family ipv4
  neighbor 1.1.1.1 activate
  neighbor 1.1.1.1 send-community extended
  neighbor 1.1.1.1 next-hop-self
  neighbor 2.2.2.2 activate
  neighbor 2.2.2.2 next-hop-self
  neighbor 192.168.35.2 activate
  neighbor 192.168.35.2 dmzlink-bw
  neighbor 192.168.36.2 activate
  neighbor 192.168.36.2 dmzlink-bw
  maximum-paths 2
  maximum-paths ibgp 2
  bgp dmzlink-bw
 exit-address-family
Verification:
R1:
R1#sh ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

     192.168.12.0/30 is subnetted, 1 subnets
C       192.168.12.0 is directly connected, Serial1/0
     1.0.0.0/32 is subnetted, 1 subnets
C       1.1.1.1 is directly connected, Loopback0
     192.168.13.0/30 is subnetted, 1 subnets
C       192.168.13.0 is directly connected, Serial1/1
     2.0.0.0/32 is subnetted, 1 subnets
D       2.2.2.2 [90/2297856] via 192.168.12.2, 03:20:35, Serial1/0
     70.0.0.0/24 is subnetted, 1 subnets
B       70.70.70.0 [200/0] via 3.3.3.3, 01:11:12
                   [200/0] via 2.2.2.2, 01:11:12
     3.0.0.0/32 is subnetted, 1 subnets
D       3.3.3.3 [90/2297856] via 192.168.13.2, 03:20:29, Serial1/1
     10.0.0.0/24 is subnetted, 1 subnets
C       10.10.10.0 is directly connected, Loopback1
R1#

R1#sh ip route 70.70.70.1
Routing entry for 70.70.70.0/24
  Known via "bgp 64540", distance 200, metric 0
  Tag 64550, type internal
  Last update from 2.2.2.2 01:08:48 ago
  Routing Descriptor Blocks:
    3.3.3.3, from 3.3.3.3, 01:08:48 ago
      Route metric is 0, traffic share count is 1
      AS Hops 1
      Route tag 64550
  * 2.2.2.2, from 2.2.2.2, 01:08:48 ago
      Route metric is 0, traffic share count is 1
      AS Hops 1
      Route tag 64550
R1#

R1#sh ip bgp 70.70.70.1
BGP routing table entry for 70.70.70.0/24, version 7
Paths: (2 available, best #2, table Default-IP-Routing-Table)
Multipath: eBGP iBGP
  Not advertised to any peer
  64550
    3.3.3.3 (metric 2297856) from 3.3.3.3 (3.3.3.3)
      Origin IGP, metric 0, localpref 100, valid, internal, multipath
      DMZ-Link Bw 1443 kbytes
  64550
    2.2.2.2 (metric 2297856) from 2.2.2.2 (2.2.2.2)
      Origin IGP, metric 0, localpref 100, valid, internal, multipath, best
      DMZ-Link Bw 12500 kbytes
R1#
Note the ratio between the link bandwidth of path 2 (through 2.2.2.2, 12500 kbytes) and the link bandwidth of path 1 (through 3.3.3.3, 1443 kbytes).
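To put a number on that ratio, here is a small sketch that turns the advertised DMZ-Link Bw values into the fraction of traffic each path would carry if load were split strictly in proportion to bandwidth. That proportional split is an assumption made for the example. The share counts a router actually installs are integers chosen by the platform and may not match these fractions exactly.

```python
from fractions import Fraction


def bandwidth_shares(link_bw: dict) -> dict:
    """Map each next hop to its fraction of the total advertised bandwidth."""
    total = sum(link_bw.values())
    return {next_hop: Fraction(bw, total) for next_hop, bw in link_bw.items()}


if __name__ == "__main__":
    # DMZ-Link Bw values (kbytes) seen on R1 for 70.70.70.0/24.
    shares = bandwidth_shares({"2.2.2.2": 12500, "3.3.3.3": 1443})
    for next_hop, share in shares.items():
        print(f"{next_hop}: {float(share):.1%}")
```

Under that assumption roughly nine tenths of the traffic would leave through R2 and about one tenth through R3.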
Table 3: Best path selection for 70.70.70.0/24 from R1 after BGP link-bw

| # | Attribute                   | Path 1  | Path 2  |
|---|-----------------------------|---------|---------|
| 1 | Weight                      | 0       | 0       |
| 2 | Local preference            | 100     | 100     |
| 3 | Originated locally          | No      | No      |
| 4 | AS_PATH                     | 64550   | 64550   |
| 5 | ORIGIN                      | i       | i       |
| 6 | MED                         | 0       | 0       |
| 7 | eBGP<>iBGP                  | iBGP    | iBGP    |
| 8 | Best IGP metric to NEXT-HOP | 2297856 | 2297856 |
| 9 | Multipath                   | 2 <<<<  | 2 <<<<  |
R3:
R3#sh ip route
Codes: C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route

Gateway of last resort is not set

     192.168.12.0/30 is subnetted, 1 subnets
D       192.168.12.0 [90/2681856] via 192.168.13.1, 03:21:04, Serial1/0
     1.0.0.0/32 is subnetted, 1 subnets
D       1.1.1.1 [90/2297856] via 192.168.13.1, 03:21:04, Serial1/0
     192.168.13.0/30 is subnetted, 1 subnets
C       192.168.13.0 is directly connected, Serial1/0
     2.0.0.0/32 is subnetted, 1 subnets
D       2.2.2.2 [90/2809856] via 192.168.13.1, 03:21:04, Serial1/0
     70.0.0.0/24 is subnetted, 1 subnets
B       70.70.70.0 [20/0] via 192.168.35.2, 01:11:47
                   [20/0] via 192.168.36.2, 01:11:47
     3.0.0.0/32 is subnetted, 1 subnets
C       3.3.3.3 is directly connected, Loopback0
     10.0.0.0/24 is subnetted, 1 subnets
B       10.10.10.0 [200/0] via 1.1.1.1, 01:18:16
     192.168.36.0/30 is subnetted, 1 subnets
C       192.168.36.0 is directly connected, Serial1/1
     192.168.35.0/30 is subnetted, 1 subnets
C       192.168.35.0 is directly connected, Ethernet0/0
R3#

R3#sh ip route 70.70.70.1
Routing entry for 70.70.70.0/24
  Known via "bgp 64540", distance 20, metric 0
  Tag 64550, type external
  Last update from 192.168.36.2 01:09:28 ago
  Routing Descriptor Blocks:
  * 192.168.35.2, from 192.168.35.2, 01:09:28 ago
      Route metric is 0, traffic share count is 1
      AS Hops 1
      Route tag 64550
    192.168.36.2, from 192.168.36.2, 01:09:28 ago
      Route metric is 0, traffic share count is 1
      AS Hops 1
      Route tag 64550
R3#

R3#sh ip bgp 70.70.70.1
BGP routing table entry for 70.70.70.0/24, version 6
Paths: (3 available, best #1, table Default-IP-Routing-Table)
Multipath: eBGP iBGP
  Advertised to update-groups:
     1 2 3
  64550
    192.168.35.2 from 192.168.35.2 (5.5.5.5)
      Origin IGP, localpref 100, valid, external, multipath, best
      DMZ-Link Bw 1250 kbytes
  64550
    2.2.2.2 (metric 2809856) from 2.2.2.2 (2.2.2.2)
      Origin IGP, metric 0, localpref 100, valid, internal
  64550
    192.168.36.2 from 192.168.36.2 (6.6.6.6)
      Origin IGP, localpref 100, valid, external, multipath
      DMZ-Link Bw 193 kbytes
R3#
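Note the ratio between the link bandwidth of path 1 (through 192.168.35.2, 1250 kbytes) and the link bandwidth of path 3 (through 192.168.36.2, 193 kbytes). Their sum, 1250 + 193 = 1443 kbytes, is the DMZ-Link Bw value that R1 sees for the path through 3.3.3.3.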
AS 64550:
The same configuration can be applied in AS 64550 to obtain a symmetric traffic flow between the two ASes:
R4:
R4#bgpcf
router bgp 64550
 address-family ipv4
  neighbor 5.5.5.5 activate
  neighbor 6.6.6.6 activate
  neighbor 7.7.7.7 activate
  neighbor 7.7.7.7 send-community extended
  neighbor 192.168.24.1 activate
  neighbor 192.168.24.1 dmzlink-bw
  bgp dmzlink-bw
 exit-address-family
R5:
router bgp 64550
 address-family ipv4
  neighbor 4.4.4.4 activate
  neighbor 6.6.6.6 activate
  neighbor 7.7.7.7 activate
  neighbor 7.7.7.7 send-community extended
  neighbor 192.168.35.1 activate
  neighbor 192.168.35.1 dmzlink-bw
  bgp dmzlink-bw
 exit-address-family
R6:
router bgp 64550
 address-family ipv4
  neighbor 4.4.4.4 activate
  neighbor 5.5.5.5 activate
  neighbor 7.7.7.7 activate
  neighbor 7.7.7.7 send-community extended
  neighbor 192.168.36.1 activate
  neighbor 192.168.36.1 dmzlink-bw
  bgp dmzlink-bw
 exit-address-family
R7:
router bgp 64550
 address-family ipv4
  neighbor 4.4.4.4 activate
  neighbor 5.5.5.5 activate
  neighbor 6.6.6.6 activate
  maximum-paths 3
  maximum-paths ibgp 3
 exit-address-family
R7#sh ip bgp 10.10.10.1
BGP routing table entry for 10.10.10.0/24, version 9
Paths: (3 available, best #3, table Default-IP-Routing-Table)
Multipath: eBGP iBGP
Flag: 0x800
  Not advertised to any peer
  64540
    5.5.5.5 (metric 2297856) from 5.5.5.5 (5.5.5.5)
      Origin IGP, metric 0, localpref 100, valid, internal, multipath
      DMZ-Link Bw 1250 kbytes
  64540
    6.6.6.6 (metric 2297856) from 6.6.6.6 (6.6.6.6)
      Origin IGP, metric 0, localpref 100, valid, internal, multipath
      DMZ-Link Bw 193 kbytes
  64540
    4.4.4.4 (metric 2297856) from 4.4.4.4 (4.4.4.4)
      Origin IGP, metric 0, localpref 100, valid, internal, multipath, best
      DMZ-Link Bw 12500 kbytes
R7#
Table 4: Best path selection for 10.10.10.0/24 from R7 after configuring BGP link-bw

| # | Attribute                   | Path 1  | Path 2  | Path 3  |
|---|-----------------------------|---------|---------|---------|
| 1 | Weight                      | 0       | 0       | 0       |
| 2 | Local preference            | 100     | 100     | 100     |
| 3 | Originated locally          | No      | No      | No      |
| 4 | AS_PATH                     | 64540   | 64540   | 64540   |
| 5 | ORIGIN                      | i       | i       | i       |
| 6 | MED                         | 0       | 0       | 0       |
| 7 | eBGP<>iBGP                  | iBGP    | iBGP    | iBGP    |
| 8 | Best IGP metric to NEXT-HOP | 2297856 | 2297856 | 2297856 |
| 9 | Multipath                   | 3 <<<<  | 3 <<<<  | 3 <<<<  |
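When checking a deployment like this one on several routers, it can help to pull the per-path bandwidth out of the show output automatically. The sketch below is one possible way to do it with regular expressions. It assumes the output is laid out one field per line as IOS prints it, and the function name and sample text are only illustrative.

```python
import re

# Path section of "sh ip bgp 10.10.10.1" on R7, as captured in the post.
SAMPLE = """\
  64540
    5.5.5.5 (metric 2297856) from 5.5.5.5 (5.5.5.5)
      Origin IGP, metric 0, localpref 100, valid, internal, multipath
      DMZ-Link Bw 1250 kbytes
  64540
    6.6.6.6 (metric 2297856) from 6.6.6.6 (6.6.6.6)
      Origin IGP, metric 0, localpref 100, valid, internal, multipath
      DMZ-Link Bw 193 kbytes
  64540
    4.4.4.4 (metric 2297856) from 4.4.4.4 (4.4.4.4)
      Origin IGP, metric 0, localpref 100, valid, internal, multipath, best
      DMZ-Link Bw 12500 kbytes
"""

# A path line starts with the next-hop address and contains "from".
PATH_RE = re.compile(r"^\s*(\d+\.\d+\.\d+\.\d+) .*from ")
BW_RE = re.compile(r"DMZ-Link Bw (\d+) kbytes")


def parse_dmz_link_bw(text: str) -> dict:
    """Return {next hop: DMZ-Link Bw in kbytes} for paths that carry one."""
    result = {}
    next_hop = None
    for line in text.splitlines():
        path_match = PATH_RE.match(line)
        if path_match:
            next_hop = path_match.group(1)
            continue
        bw_match = BW_RE.search(line)
        if bw_match and next_hop is not None:
            result[next_hop] = int(bw_match.group(1))
    return result


if __name__ == "__main__":
    bandwidths = parse_dmz_link_bw(SAMPLE)
    total = sum(bandwidths.values())
    for next_hop, bw in bandwidths.items():
        print(f"{next_hop}: {bw} kbytes, {bw / total:.1%} of the total")
```

For R7 this gives three next hops whose bandwidths add up to 13943 kbytes, with the path through R4 accounting for roughly nine tenths of the total.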
CONCLUSION
BGP link-bw provides a way to use the link bandwidth resources between autonomous systems in proportion to their capacity. Make sure that CEF is enabled (it is by default), that iBGP multipath is configured, and that the extended community is propagated to the iBGP neighbors.
Which RFC covers load sharing with BGP multipath? Is this a standard or not?
Thanks and regards,
Hi Maurício,
It looks like there is an RFC about how to use the extended community attribute for this: BGP Link Bandwidth Extended Community.
Hope it helps.
AJN
I’ve been attempting unequal cost load balancing over bgp but have run into a snag.
R3#sh ip bgp 0.0.0.0
BGP routing table entry for 0.0.0.0/0, version 485
Paths: (2 available, best #1, table Default-IP-Routing-Table)
Multipath: eBGP iBGP
Not advertised to any peer
999, (received & used)
10.1.1.2 (metric 130816) from 10.1.1.2 (10.1.1.2)
Origin IGP, metric 0, localpref 400, valid, internal, multipath, best
DMZ-Link Bw 250000 kbytes
888, (received & used)
10.1.1.1 (metric 130816) from 10.1.1.1 (10.10.1.1)
Origin IGP, metric 0, localpref 400, valid, internal, multipath
DMZ-Link Bw 1250000 kbytes
R3#sh ip route 0.0.0.0 0.0.0.0
Routing entry for 0.0.0.0/0, supernet
Known via “bgp 111”, distance 200, metric 0, candidate default path
Tag 999, type internal
Last update from 10.1.1.1 00:14:42 ago
Routing Descriptor Blocks:
10.1.1.2, from 10.1.1.2, 00:14:42 ago
Route metric is 0, traffic share count is 1
AS Hops 1
Route tag 999
* 10.1.1.1, from 10.1.1.1, 00:14:42 ago
Route metric is 0, traffic share count is 5
AS Hops 1
Route tag 999
It looks good so far… BUT:
R3#sh ip cef 0.0.0.0 0.0.0.0 int
0.0.0.0/0, epoch 0, RIB[B], refcount 6, per-destination sharing
sources: RIB, D/N, DRH
subblocks:
DefNet source: 0.0.0.0/0
ifnums:
GigabitEthernet0/2(134): 190.10.10.1
GigabitEthernet0/1(146): 190.10.10.5
path 028518F0, path list 02848C5C, share 1/1, type recursive nexthop, for IPv4, flags resolved
recursive via 10.1.1.2[IPv4:Default], fib 0288E290, 1 terminal fib
path 0318BAD8, path list 028471F0, share 1/1, type attached nexthop, for IPv4
nexthop 190.10.10.1 GigabitEthernet0/2, adjacency IP adj out of GigabitEthernet0/2, addr 190.10.10.1 02CC7940
path 0318BBC0, path list 02848C5C, share 5/5, type recursive nexthop, for IPv4, flags resolved
recursive via 10.1.1.1[IPv4:Default], fib 028998B8, 1 terminal fib
path 0318BB4C, path list 030E696C, share 1/1, type attached nexthop, for IPv4
nexthop 190.10.10.5 GigabitEthernet0/1, adjacency IP adj out of GigabitEthernet0/1, addr 190.10.10.5 02CC44C0
output chain:
loadinfo 028347D4, per-session, 2 choices, flags 0003, 5 locks
flags: Per-session, for-rx-IPv4
16 hash buckets
IP adj out of GigabitEthernet0/2, addr 190.10.10.1 02CC7940
IP adj out of GigabitEthernet0/1, addr 190.10.10.5 02CC44C0
IP adj out of GigabitEthernet0/2, addr 190.10.10.1 02CC7940
IP adj out of GigabitEthernet0/1, addr 190.10.10.5 02CC44C0
IP adj out of GigabitEthernet0/2, addr 190.10.10.1 02CC7940
IP adj out of GigabitEthernet0/1, addr 190.10.10.5 02CC44C0
IP adj out of GigabitEthernet0/2, addr 190.10.10.1 02CC7940
IP adj out of GigabitEthernet0/1, addr 190.10.10.5 02CC44C0
IP adj out of GigabitEthernet0/2, addr 190.10.10.1 02CC7940
IP adj out of GigabitEthernet0/1, addr 190.10.10.5 02CC44C0
IP adj out of GigabitEthernet0/2, addr 190.10.10.1 02CC7940
IP adj out of GigabitEthernet0/1, addr 190.10.10.5 02CC44C0
IP adj out of GigabitEthernet0/2, addr 190.10.10.1 02CC7940
IP adj out of GigabitEthernet0/1, addr 190.10.10.5 02CC44C0
IP adj out of GigabitEthernet0/2, addr 190.10.10.1 02CC7940
IP adj out of GigabitEthernet0/1, addr 190.10.10.5 02CC44C0
Subblocks:
None
The routing table shows traffic share ratio of 5:1, but the CEF table still shows 1:1 traffic sharing. The end result is the router does not perform unequal load balancing.
Topology is:
ISP1      ISP2
 |          |
eBGP      eBGP
 |          |
R1 ------- R2
 |          |
iBGP      iBGP
  \        /
     R3
wow this site is lovely i like he comments
Very good. Thanks
nice article but complicated. Here is a simple post about eBGP load balancing with single-homed BGP environment & two ISP connected through Static route.
Hi Shahed, nice post.
The purpose of this one is to provide an appropriate topology that allows to influence BGP choice of routes, by manipulating all BGP attributes, taking into account their relative order of preference, including the criterion multipath (load balancing).