Routing between Docker containers using GNS3.


The idea is to route (IPv4 and IPv6) between Docker containers using GNS3 and to use them as end-hosts instead of virtual machines.

Containers use only the resources necessary for the application they run. They use an image of the host file system and can share the same environment (binaries and libraries).

On the other hand, virtual machines require an entire OS each, with reserved RAM and disk space.

Virtual machines vs Docker containers


 

If you are not familiar with Docker, I urge you to take a look at the excellent short introduction below and at some additional explanation on the Docker site:

 

 

As of now, Docker has limited networking functionality. This is where pipework comes to the rescue. Pipework allows more advanced networking settings: adding new interfaces, IPs from different subnets, setting gateways and much more.

To route between containers using your own GNS3 topology (the sky is the limit!), pipework creates a new interface inside a running container, connects it to a host bridge interface, gives it an IP/mask in any subnet you want and sets a default gateway pointing to a device in GNS3. Consequently, all egress traffic from the container is routed into your GNS3 topology.
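That per-container sequence can be sketched as a dry-run shell script (echo-only, so nothing is executed; the image name, bridge name and addressing are illustrative placeholders borrowed from the lab below):

```shell
#!/bin/bash
# Dry-run sketch of wiring one container into a GNS3 topology with pipework.
# The names (mybimage, br4, 192.168.44.0/24) are illustrative placeholders.

run() { echo "+ $*"; }   # print the command instead of executing it

# 1. spawn a container from an image
run sudo docker run -t -i --name c1 mybimage /bin/bash
# 2. create host bridge br4, add eth1 inside the container with an IP
#    and a default gateway pointing at a GNS3 router bound to br4
run sudo pipework br4 -i eth1 c1 192.168.44.1/24@192.168.44.100
```

Dropping the `run` prefix gives the real commands used later in the lab script.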

 

GNS3 connection to a Docker container


 

How pipework connects and exposes the container network


Lab requirements:

Docker:
https://docs.docker.com/installation/ubuntulinux/#docker-maintained-package-installation
Pipework:

sudo bash -c "curl https://raw.githubusercontent.com/jpetazzo/pipework/master/pipework\
 > /usr/local/bin/pipework"
sudo chmod +x /usr/local/bin/pipework

For each container, we will generate a Docker image, run a container with an interactive terminal and set its networking parameters (IP and default gateway).

To demonstrate Docker's flexibility, we will use four containers in four different subnets:

 

 

This is how containers are built for this lab:

 


Here is the general workflow for each container.

1- Build an image from a Dockerfile (https://docs.docker.com/reference/builder/):

An image is read-only.

sudo docker build -t <image-tag> .

Or (Docker v1.5+): sudo docker build -t <image-tag> -f <Dockerfile> .

2- Run the built image:

Spawn and run a writable container with an interactive console.

The parameters of this command may differ slightly for the GUI containers.

sudo docker run -t -i <image id from `sudo docker images`> /bin/bash

3- Set container networking:

Create a host bridge interface and link it to a new interface inside the container, then assign an IP and a new default gateway to it.

sudo pipework <bridge> -i <int> <container id from `sudo docker ps`> <ip/mask>@<gateway-ip>

 

To avoid manipulating image IDs and container IDs for each image and container, I use a bash script to build and run all containers automatically:

https://github.com/AJNOURI/Docker-files/blob/master/gns3-docker.sh

 

#!/bin/bash
IMGLIST="$(sudo docker images | awk '{ print $1; }')"
[[ $IMGLIST =~ "mybimage" ]] || sudo docker build -t mybimage -f phusion-dockerbase .
[[ $IMGLIST =~ "myapache" ]] || sudo docker build -t myapache -f apache-docker .
[[ $IMGLIST =~ "myfirefox" ]] || sudo docker build -t myfirefox -f firefox-docker .

BASE_I1="$(sudo docker images | grep mybimage | awk '{ print $3; }')"
lxterminal -e "sudo docker run -t -i --name baseimage1 $BASE_I1 /bin/bash"
sleep 2
BASE_C1="$(sudo docker ps | grep baseimage1 | awk '{ print $1; }')"
sudo pipework br4 -i eth1 $BASE_C1 192.168.44.1/24@192.168.44.100 

BASE_I2="$(sudo docker images | grep mybimage | awk '{ print $3; }')"
lxterminal -e "sudo docker run -t -i --name baseimage2 $BASE_I2 /bin/bash"
sleep 2
BASE_C2="$(sudo docker ps | grep baseimage2 | awk '{ print $1; }')"
sudo pipework br5 -i eth1 $BASE_C2 192.168.55.1/24@192.168.55.100 

APACHE_I1="$(sudo docker images | grep myapache | awk '{ print $3; }')"
lxterminal -t "Base apache" -e "sudo docker run -t -i --name apache1 $APACHE_I1 /bin/bash"
sleep 2
APACHE_C1="$(sudo docker ps | grep apache1 | awk '{ print $1; }')"
sudo pipework br6 -i eth1 $APACHE_C1 192.168.66.1/24@192.168.66.100 

lxterminal -t "Firefox" -e "sudo docker run -ti --name firefox1 --rm -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix myfirefox"
sleep 2
FIREFOX_C1="$(sudo docker ps | grep firefox1 | awk '{ print $1; }')"
sudo pipework br7 -i eth1 $FIREFOX_C1 192.168.77.1/24@192.168.77.100
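One fragile spot in the script is the `grep <name>` lookup: `grep apache1` would also match a container named, say, apache10. An exact-match variant of the container-ID lookup, demonstrated offline on sample `docker ps`-style output (the sample IDs are the ones from the cleanup run shown later):

```shell
#!/bin/bash
# Exact-name lookup of a container ID, instead of a substring grep.
# Sample `docker ps`-style output lets the function be demonstrated
# without a running Docker daemon.

cid_by_name() {               # $1 = container name, stdin = docker ps output
  awk -v n="$1" '$NF == n { print $1 }'
}

sample='CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bf3d37220391 mybimage "/bin/bash" 2m Up 2m  baseimage1
f8ad6f5c354f myapache "/bin/bash" 1m Up 1m  apache1'

printf '%s\n' "$sample" | cid_by_name apache1   # -> f8ad6f5c354f
```

In the script above, `APACHE_C1="$(sudo docker ps | cid_by_name apache1)"` would then be immune to name-prefix collisions.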

 

And we end up with the following containers:

Containers, images and dependencies.



 

GNS3

All you have to do is bind a separate cloud to each bridge interface (br4, br5, br6 and br7) created by pipework, then connect them to the appropriate segment in your topology.

 

Lab topology


Note that the GNS3 topology is already configured for IPv6, so as soon as you start the routers, the Docker containers will be assigned IPv6 addresses from the routers through SLAAC (Stateless Address Autoconfiguration), which makes them reachable over IPv6.
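With SLAAC, each container builds its global address by combining the advertised /64 prefix with an interface identifier derived from its MAC address (modified EUI-64). A minimal sketch of that derivation, assuming the standard EUI-64 rules rather than any specific MAC from this lab:

```shell
#!/bin/bash
# Modified EUI-64: flip the universal/local bit of the first MAC octet
# and insert ff:fe between the two MAC halves.

eui64() {                                  # $1 = MAC like aa:bb:cc:dd:ee:ff
  local m=${1//:/}                         # strip colons
  local first=$(( 0x${m:0:2} ^ 0x02 ))     # invert the U/L bit
  printf '%02x%s:%sff:fe%s:%s\n' \
    "$first" "${m:2:2}" "${m:4:2}" "${m:6:2}" "${m:8:4}"
}

eui64 ca:00:07:5c:00:08    # -> c800:07ff:fe5c:0008
```

A router advertising 2001:db8:44::/64 would then produce the global address 2001:db8:44:0:c800:7ff:fe5c:8 (compressed form) for that MAC.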

 

Here is a video on how to launch the lab:


 

Cleaning up

To clean your host from all containers and images use the following bash script:

https://github.com/AJNOURI/Docker-files/blob/master/clean_docker.sh which uses the below docker commands:

Stop running containers:

  • sudo docker stop <container IDs from `sudo docker ps`>

Remove the stopped containers:

  • sudo docker rm <container IDs from `sudo docker ps -a`>

Remove the images:

  • sudo docker rmi <image IDs from `sudo docker images`>
sudo ./clean_docker.sh
Stopping all running containers...
bf3d37220391
f8ad6f5c354f
Removing all stopped containers...
bf3d37220391
f8ad6f5c354f
Erasing all images...
Make sure you are generating image from a Dockerfile
or have pushed your images to DockerHub.
*** Do you want to continue? No

I answered “No” because I still need those images to spawn containers. You can answer “Yes” if you no longer need the images or if you need to rebuild them.
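For reference, the logic of such a cleanup script can be sketched as a dry-run (the real script is the one linked above; `DRYRUN`, `docker_cmd` and `cleanup` are names invented here for illustration):

```shell
#!/bin/bash
# Dry-run sketch of the container cleanup logic. With DRYRUN=1 the docker
# commands are printed instead of executed, so the flow can be shown
# without a Docker daemon.

DRYRUN=1
docker_cmd() {
  if [ "$DRYRUN" = 1 ]; then echo "docker $*"; else sudo docker "$@"; fi
}

cleanup() {                      # args: container IDs to stop and remove
  local id
  for id in "$@"; do
    docker_cmd stop "$id"        # stop the running container
    docker_cmd rm "$id"          # then remove it
  done
}

cleanup bf3d37220391 f8ad6f5c354f
```

Image removal (`docker rmi`) would follow the same pattern, gated behind the yes/no prompt shown above.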


 

References:

Docker:

pipework for advanced Docker networking:

Running firefox inside Docker container:

Baseimage-Docker:

3D model shipping container:

IPv4 and IPv6 dual-stack PPPoE


The lab covers a scenario of adding basic IPv6 access to an existing PPPoE (PPP for IPv4).

PPPoE is established between the CPE (Customer Premises Equipment), acting as the PPPoE client, and the PPPoE server, also known as the BNG (Broadband Network Gateway).


Figure1: IPv4 and IPv6 dual-stack PPPoE

The PPPoE server plays the role of the authenticator (local AAA) as well as the authentication and address-pool server (figure1). Obviously, a more centralized prefix-assignment and authentication architecture (using AAA RADIUS) is more scalable for broadband access scenarios (figure2).

For more information about RADIUS attributes for IPv6 access networks, start with RFC 6911 (http://www.rfc-editor.org/rfc/rfc6911.txt).

Figure2: PPPoE with RADIUS


PPPoE for IPv6 is based on the same PPP model as for PPPoE over IPv4. The main difference in deployment is related to the nature of the routed protocol assignment to CPEs (PPPoE clients).

  • In IPv4 routed mode, each CPE gets its WAN interface IP centrally from the PPPoE server, and it is up to the customer to deploy an RFC 1918 prefix on the local LAN through DHCP.
  • In IPv6, the PPPoE client gets its WAN interface address through SLAAC and a delegated prefix to be used on the LAN segment through DHCPv6-PD.

Animation: PPP encapsulation model

Let’s begin with a quick reminder of a basic configuration of PPPoE for IPv4.

PPPoE for IPv4

pppoe-client WAN address assignment

The main steps of a basic PPPoE configuration are:

  • Create a BBAG (BroadBand Access Group).
  • Tie the BBAG to virtual template interface
  • Assign a loopback interface IP (always UP/UP) to the virtual template.
  • Create and assign the address pool (from which clients will get their IPs) to the virtual template interface.
  • Create local user credentials.
  • Set the authentication type (CHAP).
  • Bind the virtual template interface to a physical interface (incoming interface for dial-in).
  • The virtual template interface will be used as a model to generate instances (virtual access interfaces) for each dial-in session.

Figure3: PPPoE server model

pppoe-server

ip local pool PPPOE_POOL 172.31.156.1 172.31.156.100
!
bba-group pppoe BBAG
virtual-template 1
!
interface Virtual-Template1
ip unnumbered Loopback0
ip mtu 1492
peer default ip address pool PPPOE_POOL
ppp authentication chap callin

!

interface FastEthernet0/0

pppoe enable group BBAG

pppoe-client

interface FastEthernet0/1
pppoe enable group global
pppoe-client dial-pool-number 1
!
interface FastEthernet1/0
ip address 192.168.0.201 255.255.255.0
!
interface Dialer1
mtu 1492
ip address negotiated

encapsulation ppp

dialer pool 1

dialer-group 1

ppp authentication chap callin

ppp chap hostname pppoe-client

ppp chap password 0 cisco

Figure4: PPPoE client model



As mentioned in the beginning, DHCPv4 is deployed on the CPE device to assign RFC 1918 addresses to LAN clients, which are then translated, generally using PAT (Port Address Translation), to the IPv4 address assigned to the WAN interface.

You can also configure static NAT or static port-mapping to give public access to internal services.

Address translation

interface Dialer1
ip address negotiated
ip nat outside
!
interface FastEthernet0/0
ip address 192.168.4.1 255.255.255.224
ip nat inside
!
ip nat inside source list NAT_ACL interface Dialer1 overload
!

ip access-list standard NAT_ACL

permit any

pppoe-client LAN IPv4 address assignment

pppoe-client

ip dhcp excluded-address 192.168.4.1
!
ip dhcp pool LAN_POOL
network 192.168.4.0 255.255.255.224
domain-name cciethebeginning.wordpress.com
default-router 192.168.4.1
!
interface FastEthernet0/0
ip address 192.168.4.1 255.255.255.224

PPPoE for IPv6

pppoe-client WAN address assignment

All IPv6 prefixes are planned from the 2001:db8::/32 documentation prefix.

Pppoe-server

ipv6 local pool PPPOE_POOL6 2001:DB8:5AB:10::/60 64
!
bba-group pppoe BBAG
virtual-template 1
!
interface Virtual-Template1
ipv6 address FE80::22 link-local
ipv6 enable
ipv6 nd ra lifetime 21600
ipv6 nd ra interval 4 3


peer default ipv6 pool PPPOE_POOL6

ppp authentication chap callin

!

interface FastEthernet0/0

pppoe enable group BBAG
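The statement `ipv6 local pool PPPOE_POOL6 2001:DB8:5AB:10::/60 64` hands each client a /64 carved out of the /60, i.e. one of sixteen prefixes that differ only in the last nibble of the fourth group. A quick sketch of that arithmetic:

```shell
#!/bin/bash
# Enumerate the sixteen /64 prefixes inside 2001:DB8:5AB:10::/60.
# A /60 leaves 64 - 60 = 4 free bits before the /64 boundary -> 2^4 prefixes.

pool64() {
  local i
  for i in $(seq 0 15); do
    printf '2001:DB8:5AB:1%X::/64\n' "$i"
  done
}

pool64 | head -3
# 2001:DB8:5AB:10::/64
# 2001:DB8:5AB:11::/64
# 2001:DB8:5AB:12::/64
```

The client output shown later (subnet 2001:DB8:5AB:10::/64) is simply the first prefix drawn from this pool.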

IPCP (IPv4) negotiates the IPv4 address to be assigned to the client, whereas IPV6CP negotiates only the interface identifier; the prefix information is delivered through SLAAC.

pppoe-client

interface FastEthernet0/1
pppoe enable group global
pppoe-client dial-pool-number 1
!
interface Dialer1
mtu 1492
dialer pool 1
dialer-group 1
ipv6 address FE80::10 link-local

ipv6 address autoconfig default

ipv6 enable

ppp authentication chap callin

ppp chap hostname pppoe-client

ppp chap password 0 cisco

The CPE (PPPoE client) is assigned an IPv6 address through SLAAC, along with a static default route: ipv6 address autoconfig default

pppoe-client#sh ipv6 interface dialer 1
Dialer1 is up, line protocol is up
IPv6 is enabled, link-local address is FE80::10
No Virtual link-local address(es):

Stateless address autoconfig enabled
Global unicast address(es):

2001:DB8:5AB:10::10, subnet is 2001:DB8:5AB:10::/64 [EUI/CAL/PRE]
valid lifetime 2587443 preferred lifetime 600243

Note from the traffic capture below (figure5) that both IPv6 and IPv4 share the same PPP session (same session ID = 0x0006): the layer-2 model is common because the Link Control Protocol is independent of the network-layer protocol.

Figure5: Wireshark capture of common PPP layer2 model



pppoe-client LAN IPv6 assignment

The advantage of using DHCPv6 PD (Prefix Delegation) is that the PPPoE server automatically installs a static route to the delegated prefix, very handy!

pppoe-server

ipv6 dhcp pool CPE_LAN_DP
prefix-delegation 2001:DB8:5AB:2000::/56
00030001CA00075C0008 lifetime infinite infinite
!
interface Virtual-Template1

ipv6 dhcp server CPE_LAN_DP

Now the PPPoE client can use the delegated prefix to assign an IPv6 address (::1) to its own LAN interface (fa0/0) and advertise the rest through SLAAC.

No NAT is needed for the delegated prefixes to be used publicly, so there are no translation states on the PPPoE server; the prefix is directly reachable from outside.

For more information about the client ID used for DHCPv6 assignment, please refer to the prior post about DHCPv6. https://cciethebeginning.wordpress.com/2012/01/18/ios-dhcpv6-deployment-schemes/
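Reading the client ID configured in the pool (00030001CA00075C0008) as a DUID-LL gives: a 2-byte DUID type 0003 (link-layer address), a 2-byte hardware type 0001 (Ethernet), then the MAC address. A small decoder sketch, assuming that layout:

```shell
#!/bin/bash
# Decode a DHCPv6 DUID-LL hex string: type(2B) + hardware type(2B) + MAC.

duid_decode() {               # $1 = hex DUID, e.g. 00030001CA00075C0008
  local d=$1
  printf 'type=%d hw=%d mac=%s:%s:%s:%s:%s:%s\n' \
    "$(( 0x${d:0:4} ))" "$(( 0x${d:4:4} ))" \
    "${d:8:2}" "${d:10:2}" "${d:12:2}" "${d:14:2}" "${d:16:2}" "${d:18:2}"
}

duid_decode 00030001CA00075C0008
# type=3 hw=1 mac=CA:00:07:5C:00:08
```

The MAC embedded here matches the client's interface, which is how the server recognizes the CPE and hands it the same /56 every time.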

pppoe-client

pppoe-client#sh ipv6 dhcp
This device’s DHCPv6 unique identifier(DUID): 00030001CA00075C0008
pppoe-client#
interface Dialer1

ipv6 dhcp client pd PREFIX_FROM_ISP
!
interface FastEthernet0/0
ipv6 address FE80::2000:1 link-local

ipv6 address PREFIX_FROM_ISP ::1/64
ipv6 enable
pppoe-client#sh ipv6 dhcp interface
Dialer1 is in client mode
Prefix State is OPEN
Renew will be sent in 3d11h
Address State is IDLE
List of known servers:
Reachable via address: FE80::22
DUID: 00030001CA011F780008
Preference: 0
Configuration parameters:

IA PD: IA ID 0x00090001, T1 302400, T2 483840

Prefix: 2001:DB8:5AB:2000::/56

preferred lifetime INFINITY, valid lifetime INFINITY

Information refresh time: 0

Prefix name: PREFIX_FROM_ISP

Prefix Rapid-Commit: disabled

Address Rapid-Commit: disabled

client-LAN

Now the customer LAN is assigned globally available IPv6 from the CPE (PPPoE client).

client-LAN#sh ipv6 interface fa0/0
FastEthernet0/0 is up, line protocol is up
IPv6 is enabled, link-local address is FE80::2000:F
No Virtual link-local address(es):

Stateless address autoconfig enabled
Global unicast address(es):

2001:DB8:5AB:2000::2000:F, subnet is 2001:DB8:5AB:2000::/64 [EUI/CAL/PRE]
client-LAN#sh ipv6 route

S ::/0 [2/0]

via FE80::2000:1, FastEthernet0/0

C 2001:DB8:5AB:2000::/64 [0/0]

via FastEthernet0/0, directly connected

L 2001:DB8:5AB:2000::2000:F/128 [0/0]

via FastEthernet0/0, receive

L FF00::/8 [0/0]

via Null0, receive

client-LAN#

End-to-end dual-stack connectivity check

client-LAN#ping 2001:DB8:5AB:3::100
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001:DB8:5AB:3::100, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/45/88 ms
client-LAN#trace 2001:DB8:5AB:3::100
Type escape sequence to abort.
Tracing the route to 2001:DB8:5AB:3::100

1 2001:DB8:5AB:2000::1 28 msec 20 msec 12 msec

2 2001:DB8:5AB:2::FF 44 msec 20 msec 32 msec

3 2001:DB8:5AB:3::100 48 msec 20 msec 24 msec

client-LAN#

client-LAN#ping 192.168.3.100
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.3.100, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 52/63/96 ms
client-LAN#trace 192.168.3.100
Type escape sequence to abort.
Tracing the route to 192.168.3.100

1 192.168.4.1 32 msec 44 msec 20 msec

2 192.168.2.1 56 msec 68 msec 80 msec

3 192.168.3.100 72 msec 56 msec 116 msec

client-LAN#

I assigned PREFIX_FROM_ISP as a locally significant name for the delegated prefix; there is no need to match the name on the DHCPv6 server side.

Finally, the offline lab with all the commands needed for more detailed inspection:

 

References

http://www.cisco.com/c/en/us/td/docs/ios-xml/ios/bbdsl/configuration/15-mt/bba-15-mt-book/bba-ppoe-client.html

http://www.cisco.com/en/US/docs/ios-xml/ios/bbdsl/configuration/15-mt/ip6-adsl_external_docbase_0900e4b182dbdf4f_4container_external_docbase_0900e4b182dc25f3.html

http://www.broadband-forum.org/technical/download/TR-187.pdf

https://tools.ietf.org/html/rfc5072


http://www.bortzmeyer.org/6911.html (french)

http://packetsize.net/cisco-pppoe-ipv4-ipv6-mppe.htm

     

Embedded Packet Capture, let’s go fishing for some packets!


EPC (Embedded Packet Capture) is another useful troubleshooting tool, used to occasionally capture traffic to be analyzed locally or exported to a remote device. The emphasis is on “occasionally”, in contrast with RITE (Router IP Traffic Export) or SPAN on switches, which are meant to provide a permanent flow of copied traffic to a traffic analyzer or an IDS (Intrusion Detection System).

The configuration workflow is straightforward, but I would like to make a conceptual graphical analogy to illustrate it.

Let’s imagine traffic flowing through a router interface like the following:

Embedded Packet Capture

1- Capture point:


Specify the protocol to capture, the interface and the direction. This is where you indicate which IP protocol you need to capture.

monitor capture point ip cef CAPTURE_POINT fastEthernet 0/0 both
monitor capture point ipv6 cef CAPTURE_POINT fastEthernet 0/0 both

2- Packet buffer:


Memory area where the frames are stored once captured. 

monitor capture buffer CAPTURE_BUFFER

 


3- ACL:


If needed, you can filter a specific type of traffic; this is available only for IPv4.

(config)#access-list 100 permit icmp host 192.168.0.1 host 172.16.1.1
#monitor capture buffer CAPTURE_BUFFER filter access-list 100

 

Except the optional IPv4 ACL, configured at the global configuration mode, everything else is configured at the privileged EXEC mode.


4- Associate capture point with capture buffer

monitor capture point associate CAPTURE_POINT CAPTURE_BUFFER

You can associate multiple capture points (on the same or multiple interfaces) to the same buffer.


5- Start and stop capture process

monitor capture point start CAPTURE_POINT
monitor capture point stop CAPTURE_POINT


If you are familiar with Wireshark, it will be easier to remember the steps needed to capture traffic.

Wireshark analogy

wireshark and Embedded Packet Capture

Deployment 1

Two capture points are created to capture IPv4 and IPv6 traffic into separate capture buffers. 

monitor capture point ipv6 cef CAPTURE_POINT6 fa0/0 both
monitor capture buffer CAPTURE_BUFFER6
monitor capture point associate CAPTURE_POINT6 CAPTURE_BUFFER6

!

monitor capture point ip cef CAPTURE_POINT4 fa0/0 both

monitor capture buffer CAPTURE_BUFFER4

monitor capture point associate CAPTURE_POINT4 CAPTURE_BUFFER4

Following is the result on the router

Deployment 2

Two capture points are created to capture IPv4 and IPv6 traffic into a single capture buffer.

monitor capture point ipv6 cef CAPTURE_POINT6 fa0/0 both
monitor capture point ip cef CAPTURE_POINT4 fa0/0 both
!
monitor capture buffer CAPTURE_BUFFER46

!

monitor capture point associate CAPTURE_POINT6 CAPTURE_BUFFER46

monitor capture point associate CAPTURE_POINT4 CAPTURE_BUFFER46

 

Following is the result on the router

Exporting

!Example of export to ftp
R1#monitor capture buffer CAPTURE_BUFFER46 export ftp://login:password@192.168.0.32/Volume_1/ecp.pcap
Writing Volume_1/ecp.pcap

R1#

!Example of export to tftp

R1# monitor capture buffer CAPTURE_BUFFER46 export tftp://192.168.0.145/ecp.pcap

!

R1#

And the file opened in wireshark:

EPC traffic opened with Wireshark


That’s all folks!

IPv6 multicast over IPv6 IPSec VTI


IPv4 IPsec doesn't support multicast, so we need GRE (unicast) to encapsulate multicast traffic before encrypting it. The consequence is more complexity and an additional level of encapsulation and routing, hence less performance.

One of the advantages of IPv6 is the support of IPSec authentication and encryption (AH, ESP) right in the extension headers, which makes it natively support IPv6 multicast.

In this lab we will be using IPv6 IPSec site-to-site protection using VTI to natively support IPv6 multicast.

The configuration involves three topics: IPv6 routing, IPv6 IPsec and IPv6 multicast. Each process is built on top of the previous one, so before touching IPsec, make sure you have local connectivity on each segment of the network and complete reachability through IPv6 routing.

Next step, you can move to IPv6 IPSec and change routing configuration accordingly (through VTI).

IPv6 multicast relies on a solid foundation of unicast reachability, so once you have routes exchanged between the two sides through the secure connection you can start configuring IPv6 multicast (BSR, RP, client and server simulation).

Picture1: Lab topology


Lab outline

  • Routing
    • OSPFv3
    • EIGRP for IPv6
  • IPv6 IPSec
    • Using IPv6 IPSec VTI
    • Using OSPFv3 IPSec security feature
  • IPv6 Multicast
    • IPv6 PIM BSR
  • Offline lab
  • Troubleshooting cases
  • Performance testing

Routing

Note:
IPv6 routing relies on link-local addresses, so for troubleshooting purposes the link-local IPs are configured to resemble their respective global addresses, making them easily recognisable. This is a tremendous help during troubleshooting; otherwise you will find yourself trying to decode the matrix : )

OSPFv3

OSPFv3 needs an interface configured with an IPv4 address for the router ID.

OSPFv3 offloads security to native IPv6 IPsec, so you can secure OSPFv3 communications selectively, on a per-interface or per-area basis.
  Table1: OSPFv3 configuration

  R2 R1
IPv6 routing processes need IPv4-format router ids ipv6 router ospf 12
router-id 2.2.2.2
 ipv6 router ospf 12
router-id 1.1.1.1
Announce respective LAN interfaces interface FastEthernet0/1
ipv6 ospf 12 area 22
interface FastEthernet0/1
ipv6 ospf 12 area 11 
Disable routing on the physical BTB connection to avoid RPF failure interface FastEthernet0/0
 ipv6 ospf 12 … 
interface FastEthernet0/0
 ipv6 ospf 12 …
IPv6 gateways exchange routes through the VTI encrypted interface interface Tunnel12
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
interface Tunnel12
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
Set the OSPF network type on loopback interfaces if you want to advertise masks other than /128 interface Loopback0
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
interface Loopback0
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
Table2: EIGRP for IPv6 configuration
  R2 R1
IPv6 routing processes need IPv4-format router ids ipv6 router eigrp 12
eigrp router-id 2.2.2.2
ipv6 router eigrp 12
eigrp router-id 1.1.1.1
Announce respective LAN interfaces interface FastEthernet0/1
ipv6 eigrp 12
interface FastEthernet0/1
ipv6 eigrp 12
Disable routing on the physical BTB connection to avoid RPF failure interface FastEthernet0/0
ipv6 eigrp 12
interface FastEthernet0/0
ipv6 eigrp 12
IPv6 gateways exchange routes through the VTI encrypted interface interface Tunnel12
ipv6 eigrp 12
interface Tunnel12
ipv6 eigrp 12
Set the OSPF network type on loopback interfaces if you want to advertise masks other than /128 interface Loopback0
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
interface Loopback0
ipv6 ospf network point-to-point
ipv6 ospf 12 area 0
Enable EIGRP process ipv6 router eigrp 12
no shutdown
ipv6 router eigrp 12
no shutdown

In case you want to configure EIGRP for IPv6:

– No shutdown inside EIGRP configuration mode

– Similarly to OSPFv3, we need an interface configured with IPv4 address for Router-id

IPv6 IPSec

  • Using IPv6 IPSec VTI
Table3: IPSec configuration
  R1 R2
Set the type of ISAKMP authentication crypto keyring keyring1
pre-shared-key address ipv6 2001:DB8::2/128 key cisco
crypto keyring keyring1
pre-shared-key address ipv6 2001:DB8::1/128 key cisco
  crypto isakmp key cisco address ipv6 2001:DB8::2/128 crypto isakmp key cisco address ipv6 2001:DB8::1/128
ISAKMP profile crypto isakmp policy 10
encr 3des
hash md5
authentication pre-share
lifetime 3600
crypto isakmp policy 10
encr 3des
hash md5
authentication pre-share
lifetime 3600
Transform sets: symmetric encryption and signed hash algorithms crypto ipsec transform-set 3des ah-sha-hmac esp-3des crypto ipsec transform-set 3des ah-sha-hmac esp-3des
  crypto ipsec profile profile0
set transform-set 3des
crypto ipsec profile profile0
set transform-set 3des
Tunnel mode and bind the ipsec profile interface Tunnel12
ipv6 address FE80::DB8:12:1 link-local
ipv6 address 2001:DB8:12::1/64
tunnel source FastEthernet0/0
tunnel destination 2001:DB8::2
tunnel mode ipsec ipv6
tunnel protection ipsec profile profile0
interface Tunnel12
ipv6 address FE80::DB8:12:2 link-local
ipv6 address 2001:DB8:12::2/64
tunnel source FastEthernet0/0
tunnel destination 2001:DB8::1
tunnel mode ipsec ipv6
tunnel protection ipsec profile profile0
Make sure not to advertise the routes through the physical interface, to avoid RPF failures (when the source of the multicast traffic is reached from a different interface than the one provided by the RIB) interface FastEthernet0/0
ipv6 address FE80::DB8:1 link-local
ipv6 address 2001:DB8::1/64
ipv6 enable
interface FastEthernet0/0
ipv6 address FE80::DB8:2 link-local
ipv6 address 2001:DB8::2/64
ipv6 enable
 

Here is a capture of the traffic (secured) between R1 and R2 gateways

Picture2: Wireshark IPv6 IPSec traffic capture


What could go wrong?

– Encryption doesn’t match

– Shared key doesn’t match

– Wrong ISAKMP peers

– An ACL in the path between the two gateways blocking the gateway IPs or ISAKMP (UDP port 500)

– IPSec profile not assigned to the tunnel interface (tunnel protection ipsec profile <…>)

– IPsec encryption and/or signed hashes don't match.

  • Using OSPFv3 IPSec security feature

You can still use IPv6 IPsec to encrypt and authenticate only OSPF, on a per-interface basis.

OSPFv3 will use the IPv6-enabled IP Security (IPsec) secure socket API.

R1

interface FastEthernet0/0
ipv6 ospf 12 area 0
ipv6 ospf encryption ipsec spi 256 esp 3des 123456789A123456789A123456789A123456789A12345678 md5 123456789A123456789A123456789A12

R2

interface FastEthernet0/0
ipv6 ospf 12 area 0
ipv6 ospf encryption ipsec spi 256 esp 3des 123456789A123456789A123456789A123456789A12345678 md5 123456789A123456789A123456789A12

Picture4: Wireshark traffic capture – OSPFv3 IPSec feature


Note that only OSPFv3 traffic is encrypted.

IPv6 Multicast

IPv6 PIM BSR

The RP (Rendezvous Point) is the point where the multicast server's offer meets the members' demand.

First hop routers build (S,G) source trees with candidate RPs and register directly connected multicast sources.

Candidate RPs announce themselves to candidate BSRs, and the latter announce the information to all PIM routers.

All PIM routers looking for a particular multicast group learn Candidate RP IP addresses from BSR and build (*, G) shared trees.

Table4: Multicast configuration

  R1 (candidate BSR) R2 (candidate RP)
Enable multicast routing ipv6 multicast-routing ipv6 multicast-routing
R1 announced as BSR candidate ipv6 pim bsr candidate bsr 2001:DB8:10::1  
R2 announced as RP candidate   ipv6 pim bsr candidate rp 2001:DB8:20::2
Everything should be routed through the tunnel interface, to be encrypted ipv6 route ::/0 Tunnel12 FE80::DB8:12:2 ipv6 route ::/0 Tunnel12 FE80::DB8:12:1
For testing purposes, make one router join a multicast group and ping it from a LAN router on the other side, or opt for more fun by running VLC on one host to read a network stream streamed from a host on the other side.   interface FastEthernet0/1
ipv6 mld join-group ff0E::5AB

Make sure that:

  • At least one router is manually configured as a candidate RP
  • At least one router is manually configured as a candidate BSR
While the traffic is being multicast, all PIM routers know about the RP and the BSR:

– (*,G) shared tree is spread over PIM routers from the last hop router (connected to multicast members).

– (S,G) source tree is established between the first hop router (connected to the multicast server) and the RP.

– The idea behind IPv6 PIM BSR is the same as in IPv4; here is an animation explaining the process for IPv4.

Let’s check end-to-end multicast streaming:

Before going to troubleshooting here is the offline lab with all commands:

Troubleshooting

If something doesn’t work and you are stuck, isolate the area of work and inspect each process separately step by step.

Check each step using “show…” commands, so you know each time what you are looking for to spot what is wrong.

The technique of comparing “sh run” output with a script is limited by visual perception, which is illusory and far from reliable.

Common routing issues

– Make sure you have successful back-to-back connectivity everywhere.

– With EIGRP for IPv6 make sure the process is enabled.

– If routing neighbors are connected through NBMA network, make sure to enable pseudo broadcasting and manually set neighbor commands.

Common IPSec issues

– ISAKMP phase

– Wrong peer

– Wrong shared password

– Not matching isakmp profile

– IPSec phase

– Not matching ipsec profile

Common PIM issues

– If routing neighbors are connected through an NBMA network, make sure C-RPs and C-BSRs are located on the main site.

– Issue with the client: => no (*,G)

– MLD query issue with the last hop.

– Last hop PIM router cannot build the shared tree.

– Issue with RP registration  => no (S,G)

– Multicast server MLD issue with the 1st hop router

– The 1st hop router cannot register with the RP.

– Issue with a C-BSR: the candidate doesn't advertise RP information to PIM routers (BSRs collect all candidate RPs and announce them to all PIM routers, which choose the best RP for each group)

– Issue with a C-RP: candidates don't announce themselves to C-BSRs (RPs announce to C-BSRs which multicast groups they are responsible for)

– RPF failure (the interface used to reach the multicast source through the RIB is not the interface receiving the multicast traffic)

Picture5: RPF Failure


Table5: troubleshooting cases
Case Description Simulated wrong configuration Correct configuration
1 ISAKMP policy, encryption algorithm mismatch crypto isakmp policy 10
encr aes
crypto isakmp policy 10
encr 3des
2 ISAKMP policy, hash algorithm mismatch crypto isakmp policy 10
hash sha
crypto isakmp policy 10
hash md5
3 Wrong ISAKMP peer crypto isakmp key cisco address ipv6 2001:DB8::3/128 crypto isakmp key cisco address ipv6 2001:DB8::2/128 
4 Wrong ISAKMP key crypto isakmp key cisco1 address ipv6 2001:DB8::2/128 crypto isakmp key cisco address ipv6 2001:DB8::2/128 
5 Wrong tunnel destination interface Tunnel12
tunnel destination 2001:DB8::3
interface Tunnel12
tunnel destination 2001:DB8::2 
6 Wrong tunnel source interface Tunnel12
tunnel source FastEthernet0/1
interface Tunnel12
tunnel source FastEthernet0/0
 

For more details about each case, refer to the offline lab below, you will find an extensive coverage of all important commands along with debug for each case:

Performance testing

Three cases are tested: multicast traffic between R1 and R2 is routed through:

– Physical interfaces (serial connection): MTU=1500 bytes

– IPv6 GRE: MTU=1456 bytes

– IPv6 IPSec VTI: MTU=1391 bytes
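The GRE figure is pure header arithmetic (a 40-byte IPv6 delivery header plus the 4-byte base GRE header); the IPsec VTI figure additionally depends on the transform set (ESP/AH overhead, cipher block padding), so 1391 holds for this lab's transform set rather than universally. The GRE case:

```shell
#!/bin/bash
# GRE-over-IPv6 tunnel MTU: physical MTU minus the 40-byte IPv6 delivery
# header and the 4-byte base GRE header.
phys_mtu=1500
ipv6_hdr=40
gre_hdr=4
echo "GRE MTU: $(( phys_mtu - ipv6_hdr - gre_hdr ))"   # -> GRE MTU: 1456
```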

The following tests were performed using iperf in a GNS3 lab environment, so the results should be taken as relative.

Picture6: Iperf testing


References

http://www.faqs.org/rfcs/rfc6226.html

http://tools.ietf.org/html/rfc5059

http://www.cisco.com/en/US/docs/ios/ipv6/configuration/guide/ip6-multicast.html#wp1055997

https://supportforums.cisco.com/docs/DOC-27971

http://www.cisco.com/en/US/docs/ios/ipv6/configuration/guide/ip6-tunnel_external_docbase_0900e4b1805a3c71_4container_external_docbase_0900e4b181b83f78.html

http://www.cisco.com/web/learning/le21/le39/docs/TDW_112_Prezo.pdf

http://networklessons.com/multicast/ipv6-pim-mld-example/

http://www.gogo6.com/profiles/blogs/ietf-discusses-deprecating-ipv6-fragments

http://tools.ietf.org/html/draft-taylor-v6ops-fragdrop-01

https://datatracker.ietf.org/doc/draft-bonica-6man-frag-deprecate

http://blog.initialdraft.com/archives/1648/

Let’s 6rd!


The 6rd mechanism belongs to the same family as automatic 6to4, in which IPv6 traffic is encapsulated inside IPv4.

The key difference is that with 6rd, Service Providers use their own 6rd prefix and control the transition of the IPv4-only access-aggregation part of their networks to native IPv6. At the same time, SPs transparently provide IPv6 service to their customers.

6rd is generally referred to as a stateless transition mechanism.

Stateless
In stateless mechanisms, an algorithm automatically maps between addresses. The scope of the mechanism is limited to a local domain in which the mapping device (6rd BR) and the devices that need mapping (6rd CEs) share common configuration elements.
Stateful
In stateful mechanisms, we need to specify on a gateway device a specific address or range of addresses (not used elsewhere) that will represent another range of addresses.
For example IPv4 NAT on Cisco (NAT44):
ip nat source …
The router relies on the configured statement to decide which address (all bits) to translate to which address (all bits), independently of the devices whose addresses need to be translated (inside local/outside global).
For redundancy, we need additional configuration to synchronize connection state information between devices, for example SNAT (Stateful NAT failover).

Customer CE routers derive their own IPv6 prefix from the 6rd prefix delegated by the BRs (Border Relays).


Both CEs and BRs encapsulate IPv6 traffic into IPv4 by automatically deriving the IPv4 header addresses from the IPv6 addresses.


  • Lab topology


For end-to-end testing I am using Ubuntu Server for both the client host behind the CE and the Internet host.

Here is a brief and, I hope, concise explanation of the main 6rd operations:


6rd configuration

6rd address planning depends on each SP; the IPv4 bits embedded in the prefix must be unique to each CE. To show the flexibility of the configuration, I fixed the first 16 bits (10.1) as the prefix and the last octet (.1) as the suffix, and allocated the third octet to the CEs.

6rd domain configured parameters:

  • Tunnel source interface: fa0/0
  • 6rd prefix: 2001:DEAD::/32
  • IPv4 prefix length: 16
  • IPv4 embedded bits: 8
  • IPv4 suffix length: 8
  • Tunnel source interface IP: 10.1.4.1
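
The stateless mapping these parameters define can be sketched in Python (a hypothetical helper, not an IOS feature; it reproduces the delegated prefixes shown by the verification commands later in this post):

```python
import ipaddress

def sixrd_delegated_prefix(sixrd_prefix, v4_prefix_len, v4_suffix_len, ce_ipv4):
    """Derive a CE's delegated IPv6 prefix from its IPv4 address:
    append the 'middle' IPv4 bits (those not covered by the common
    prefix or suffix) to the SP's 6rd prefix."""
    p = ipaddress.ip_network(sixrd_prefix)
    v4 = int(ipaddress.ip_address(ce_ipv4))
    embedded_bits = 32 - v4_prefix_len - v4_suffix_len
    # keep only the embedded bits, dropping the fixed prefix and suffix
    embedded = (v4 >> v4_suffix_len) & ((1 << embedded_bits) - 1)
    new_len = p.prefixlen + embedded_bits
    base = int(p.network_address) | (embedded << (128 - new_len))
    return ipaddress.ip_network((base, new_len))

print(sixrd_delegated_prefix("2001:DEAD::/32", 16, 8, "10.1.1.1"))  # 2001:dead:100::/40
print(sixrd_delegated_prefix("2001:DEAD::/32", 16, 8, "10.1.4.1"))  # 2001:dead:400::/40
```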

BR1

ipv6 general-prefix 6RD-PREFIX 6rd Tunnel0
!
interface Tunnel0
ipv6 address 6RD-PREFIX 2001:DEAD::/128 anycast
ipv6 enable

tunnel source FastEthernet0/0
tunnel mode ipv6ip 6rd
tunnel 6rd ipv4 prefix-len 16 suffix-len 8


tunnel 6rd prefix 2001:DEAD::/32

interface FastEthernet0/0

ip address 10.1.4.1 255.255.0.0

We need a couple of static routes to make 6rd work in lab conditions; in production, the BR announces the clients' assigned IPv4 space to the Internet.

  • Default ipv4 static route to outside
  • Static route to SP 6rd prefix pointing to the tunnel
  • Default ipv6 static route to outside
ip route 0.0.0.0 0.0.0.0 192.168.20.100
ipv6 route 2001:DEAD::/32 Tunnel0
ipv6 route ::/0 2001:DB9:5AB::100

CE1

The same 6rd parameters are configured on CE:

  • IPv4 affixes
  • 6rd domain global prefix
  • BR IPv4 address (remote tunnel end point)
interface Tunnel0
ipv6 enable
tunnel source 10.1.1.1
tunnel mode ipv6ip 6rd

tunnel 6rd ipv4 prefix-len 16 suffix-len 8

tunnel 6rd prefix 2001:DEAD::/32

tunnel 6rd br 10.1.4.1
!

interface FastEthernet0/0

ip address 10.1.1.1 255.255.0.0

ipv6 enable

!

interface FastEthernet0/1

ip address 192.168.10.100 255.255.255.0


ipv6 address 6RD-PREFIX ::/64 eui-64 ! * <<<

* Note: the CE WAN interface fa0/0 is enabled for IPv6 only so that it gets a link-local address.

The fa0/0 IPv4 address is generally assigned by IPv4 DHCP. If the ISP assigns private addresses, CGN NAT44 is needed at the BR to translate them into global IPv4.

The 6rd prefix is delegated not to the CE WAN interface fa0/0 but to the CE inside LAN interface fa0/1.

This way the customer LAN benefits directly from globally routable IPv6 addressing without interrupting IPv6 address continuity, and the same prefix can be advertised to the client IPv6 network using SLAAC (stateless address autoconfiguration).
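
The EUI-64 interface ID the client builds from its MAC address can be reproduced with a short sketch (the MAC below is reverse-engineered from the tracepath output later in this post, so treat it as illustrative):

```python
import ipaddress

def slaac_eui64(prefix, mac):
    """SLAAC EUI-64: flip the U/L bit of the first MAC byte, insert
    ff:fe in the middle, and append the result to the /64 prefix."""
    b = bytes.fromhex(mac.replace(":", ""))
    iid = bytes([b[0] ^ 0x02]) + b[1:3] + b"\xff\xfe" + b[3:]
    net = ipaddress.ip_network(prefix)
    return ipaddress.ip_address(int(net.network_address) | int.from_bytes(iid, "big"))

# assumed MAC of the client interface (illustrative)
print(slaac_eui64("2001:dead:100::/64", "ca:01:3d:5c:00:06"))
```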

A fully specified (output interface + next-hop) IPv6 default route points to the BR through the tunnel interface.

ipv6 route ::/0 Tunnel0 2001:DEAD:400::1

Debugging 6rd tunnel

CE1

CE1#debug tunnel
Tunnel Interface debugging is on
CE1#
Tunnel0: IPv6/IP adjacency fixup, 10.1.1.1->10.1.4.1, tos set to 0x0
Tunnel0: IPv6/IP (PS) to decaps 10.1.4.1->10.1.1.1 (tbl=0, “default”, len=124, ttl=254)
Tunnel0: decapsulated IPv6/IP packet (len 124)

BR1

BR1#debug tunnel
Tunnel Interface debugging is on
BR1#
Tunnel0: IPv6/IP to classify 10.1.1.1->10.1.4.1 (tbl=0,”default” len=124 ttl=254 tos=0x0) ok, oce_rc=0x0
Tunnel0: IPv6/IP adjacency fixup, 10.1.4.1->10.1.1.1, tos set to 0x0
BR1#

As shown by the debug, the end-to-end IPv6 traffic is encapsulated into IPv4 packets between CE and BR.

$iperf -u -t -i1 -V -c 2001:db9:5ab::100 -b 10K
WARNING: delay too large, reducing from 1.2 to 1.0 seconds.
————————————————————
Client connecting to 2001:db9:5ab::100, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 112 KByte (default)
————————————————————
[ 3] local 2001:dead:100:0:a00:27ff:fe0f:20e9 port 39710 connected with 2001:db9:5ab::100 port 5001

[ ID] Interval Transfer Bandwidth

[ 3] 0.0- 1.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 1.0- 2.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 2.0- 3.0 sec 4.00 GBytes 34.4 Gbits/sec

[ 3] 3.0- 4.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 4.0- 5.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 5.0- 6.0 sec 4.00 GBytes 34.4 Gbits/sec

[ 3] 6.0- 7.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 7.0- 8.0 sec 4.00 GBytes 34.4 Gbits/sec

[ 3] 8.0- 9.0 sec 1.44 KBytes 11.8 Kbits/sec

[ 3] 9.0-10.0 sec 4.00 GBytes 34.4 Gbits/sec

[ 3] 0.0-11.0 sec 16.0 GBytes 12.5 Gbits/sec

[ 3] Sent 11 datagrams

read failed: Connection refused

[ 3] WARNING: did not receive ack of last datagram after 4 tries.

Below is a Wireshark capture of the traffic from the previous iperf test:

6rd-iperf-wireshark

Verification commands

BR1:

BR1#sh tunnel 6rd tunnel 0
Interface Tunnel0:
Tunnel Source: 10.1.4.1
6RD: Operational, V6 Prefix: 2001:DEAD::/32
V4 Prefix, Length: 16, Value: 10.1.0.0
V4 Suffix, Length: 8, Value: 0.0.0.1
General Prefix: 2001:DEAD:400::/40
BR1#
BR1#sh tunnel 6rd destination 2001:dead:100:: tunnel0
Interface: Tunnel0
6RD Prefix: 2001:DEAD:100::
Destination: 10.1.1.1
BR1#
BR1#sh tunnel 6rd prefix 10.1.1.1 tunnel 0
Interface: Tunnel0
Destination: 10.1.1.1
6RD Prefix: 2001:DEAD:100::
BR1#

CE1:

CE1#sh tunnel 6rd tunnel 0
Interface Tunnel0:
Tunnel Source: 10.1.1.1
6RD: Operational, V6 Prefix: 2001:DEAD::/32
V4 Prefix, Length: 16, Value: 10.1.0.0
V4 Suffix, Length: 8, Value: 0.0.0.1
Border Relay address: 10.1.4.1
General Prefix: 2001:DEAD:100::/40

CE1#

CE1#sh tunnel 6rd destination 2001:dead:100:: tunnel0
Interface: Tunnel0
6RD Prefix: 2001:DEAD:100::
Destination: 10.1.1.1
CE1#
CE1#sh tunnel 6rd prefix 10.1.4.1 tunnel 0
Interface: Tunnel0
Destination: 10.1.4.1
6RD Prefix: 2001:DEAD:400::
CE1#

We can use the “mtr” command to check the performance of the end-to-end (Linux-to-Linux) communication.

router@router1:~$ mtr 2001:db9:5ab::100
HOST: router1 Loss% Snt Last Avg Best Wrst StDev
1.|– 2001:dead:100:0:c801:3dff 0.0% 30 27.7 25.2 9.7 34.4 5.9
2.|– 2001:db9:5ab::1 0.0% 30 181.3 126.2 99.1 181.3 19.3
3.|– 2001:db9:5ab::100 0.0% 30 67.3 82.8 67.3 121.6 14.1
router@router1:~$

Customer internal network


6rd and MTU

The default MTU of the 6rd tunnel interface on IOS is 1480 bytes, so the maximum IPv4 packet size after encapsulation is 1500 bytes.

userver1 end-to-end MTU

router@router1:~$ tracepath6 2001:db9:5ab::100
1?: [LOCALHOST] 0.051ms pmtu 1500
1: 2001:dead:100:0:c801:3dff:fe5c:6 27.130ms
1: 2001:dead:100:0:c801:3dff:fe5c:6 57.536ms
2: 2001:dead:100:0:c801:3dff:fe5c:6 30.005ms pmtu 1480
2: 2001:db9:5ab::1 135.158ms
3: 2001:db9:5ab::100 79.603ms reached

Resume: pmtu 1480 hops 3 back 253

router@router1:~$

Here is an animation explaining 6rd and fragmentation:

MTU recommendations

  • With redundant BRs, there is no guarantee that traffic will be handled by the same BR, so fragments can be lost between BRs: BR anycast IPv4 combined with IPv4 fragmentation is not recommended.
  • Configure the same IPv4 MTU everywhere within the IPv4 segment and set DF=1 to disable fragmentation.
    • Make sure the IPv4 MTU is coordinated with the IPv6 MTU (IPv4 MTU >= IPv6 MTU + 20 bytes).
  • Enable PMTUD to discover the smallest MTU on the CE-to-BR path.
  • DO NOT filter ICMP “Packet Too Big” and “Destination Unreachable” messages at routers and end-hosts; they provide information about transport issues, and the only thing worse than a traffic black hole is a silent traffic black hole.

Offline Lab

Finally, the offline lab with comprehensive command output:

IPv6 over AToM pseudowire


The purpose of this lab is to show the flexibility of Layer2 VPN technology AToM (Any Transport over MPLS), which allows service providers to smoothly transit the core network from legacy layer2 technologies into a single MPLS infrastructure ready for customer IPv6 transport.

Customer transition from IPv4 to dual stack is as easy as adding an IPv6 configuration to a point-to-point segment.

The lab is organized as follows:

  • Lab topology overview
  • MPLS Core
  • Pseudowire circuit establishment over MPLS
  • Customer configuration
  • MTU
  • Offline Lab
  • Quiz
  • Conclusion

Lab topology overview

Let’s consider the following lab topology: one core MPLS and three customers. Customer1 uses Ethernet port-to-port layer2 circuit to connect to Provider Edge access router, Customer2 uses Ethernet VLAN layer2 circuit and customer3 uses Frame Relay layer2 circuit.

Picture1: High-level Lab topology

AToM pseudowire High level design

Picture2: Low-level Lab topology

AToM pseudowire low level design

MPLS Core

The MPLS core is configured independently from any pseudowire configuration.

Picture3: MPLS core

AToM-pseudowire-MPLS-core

In the core MPLS, there is practically nothing special to do. IGP and LDP configuration is straightforward. The goal is to guarantee core stability.

Make sure LDP router id is forced to a loopback interface.

Pseudowire circuit establishment over MPLS

Picture4: AToM Pseudowire establishment

AToM Pseudowire establishment

The configuration to establish the different pseudowires does not depend on the client configuration.

Note that for each virtual circuit, a targeted LDP session is established between the PEs connecting the customer sites. Each PE uses a /32 loopback IP.

Ethernet port-to-port layer2 circuit

PE2:

interface FastEthernet0/0
no cdp enable
xconnect 22.2.2.2 24 encapsulation mpls

PE1:

interface FastEthernet0/0
no cdp enable
xconnect 44.4.4.4 24 encapsulation mpls

Ethernet VLAN layer2 circuit

PE2:

interface FastEthernet1/0
no cdp enable
!
interface FastEthernet1/0.10
encapsulation dot1Q 10
xconnect 22.2.2.2 242 encapsulation mpls

PE1:

interface FastEthernet1/0
no cdp enable
!
interface FastEthernet1/0.10
encapsulation dot1Q 10
xconnect 44.4.4.4 242 encapsulation mpls

Frame Relay layer2 circuit

PE2:

connect fratom Serial2/0 501 l2transport
xconnect 22.2.2.2 241 encapsulation mpls
!
interface Serial2/0
encapsulation frame-relay
frame-relay lmi-type cisco
frame-relay intf-type dce

PE1:

connect fratom Serial2/0 105 l2transport
xconnect 44.4.4.4 241 encapsulation mpls
!
interface Serial2/0
encapsulation frame-relay
frame-relay lmi-type cisco
frame-relay intf-type dce

Now, let’s take the east side as an example of the configuration between the clients and the provider edge.

Customer configuration

Picture5: Customer circuits

AToM pseudowire customer circuits

Note the provider edge is configured independently of the client layer3 protocol IPv4/IPv6.

Customer devices are configured in dual stack.

Ethernet port-to-port pseudowire

East C1 (dual stack):

interface FastEthernet0/0
ip address 192.168.15.1 255.255.255.0
ip ospf 15 area 0
ipv6 address FE80::15:5 link-local
ipv6 address 2001:DB8:15::5/64
ipv6 ospf 15 area 0

PE1:

interface FastEthernet0/0
no cdp enable
xconnect 44.4.4.4 24 encapsulation mpls

Ethernet vlan pseudowire

East C3 (dual stack):

interface Vlan10
ip address 192.168.152.1 255.255.255.0
ip ospf 15 area 0
ipv6 address FE80::152:1 link-local
ipv6 address 2001:DB8:152::1/64
ipv6 ospf 15 area 0
!
interface FastEthernet1/0
switchport access vlan 10
switchport mode trunk

PE1:

interface FastEthernet1/0
no cdp enable
!
interface FastEthernet1/0.10
encapsulation dot1Q 10
xconnect 44.4.4.4 242 encapsulation mpls

Use a sub-interface on the PE and disable CDP on the main interface.

FR pseudowire

East C2 (dual stack):

interface Serial1/0
encapsulation frame-relay
ip address 192.168.151.1 255.255.255.0
ip ospf network broadcast
ipv6 address FE80::151:5 link-local
ipv6 address 2001:DB8:151::5/64
ipv6 enable
ipv6 ospf network broadcast
ipv6 ospf 15 area 0
frame-relay map ipv6 FE80::151:1 105
frame-relay map ipv6 2001:DB8:151::1 105 broadcast
frame-relay interface-dlci 105

PE1:

connect fratom Serial2/0 105 l2transport
xconnect 44.4.4.4 241 encapsulation mpls
!
interface Serial2/0
encapsulation frame-relay
clock rate 2016000
frame-relay lmi-type cisco
frame-relay intf-type dce

For the west side the client configuration is mirrored.

The offline lab provides complete access to the output of a large range of commands related to AToM.

The resulting virtual topology, typical point-to-point circuits between the client devices, looks as follows:

Picture6: Logical connections

AToM pseudowire Logical connections

MTU

Let’s analyse the overhead added along the path for each type of AToM:

  • VC label (identify the pseudowire) = 4 bytes
  • LDP Core switching label = 4 bytes
  • AToM header for Ethernet = 4 bytes (empty)
  • AToM header for Frame Relay = 4 bytes
  • Ethernet port-to-port = 14 bytes
  • Ethernet VLAN = 14 bytes (Ethernet port-to-port) + 4 bytes (VLAN tag) = 18 bytes
  • Frame Relay encapsulation (Cisco) = 2 bytes.

Ethernet port-to-port AToM

1500 (edge MTU) + 4 (AToM header, empty) + 14 (Ethernet port-to-port) + 4 (LDP core switching label) + 4 (VC label) = 1526 bytes

Ethernet VLAN AToM

1500 (edge MTU) + 4 (AToM header, empty) + 18 (Ethernet VLAN) + 4 (LDP core switching label) + 4 (VC label) = 1530 bytes

FR AToM

1500 (edge MTU) + 4 (AToM header for FR) + 2 (FR encapsulation) + 4 (LDP core switching label) + 4 (VC label) = 1514 bytes
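
The same arithmetic can be captured in a few lines of Python (a sketch of the figures above, not a configuration tool):

```python
# Core MTU needed by an AToM circuit: edge MTU + transported layer2
# header + AToM control word + LDP core switching label + VC label.
CONTROL_WORD = LDP_LABEL = VC_LABEL = 4  # bytes each

def atom_core_mtu(edge_mtu, l2_header_len):
    return edge_mtu + l2_header_len + CONTROL_WORD + LDP_LABEL + VC_LABEL

print(atom_core_mtu(1500, 14))  # Ethernet port-to-port: 1526
print(atom_core_mtu(1500, 18))  # Ethernet VLAN: 1530
print(atom_core_mtu(1500, 2))   # Frame Relay (Cisco encapsulation): 1514
```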

Let’s set the maximum calculated MTU (1530 bytes) as the interface MTU on the P/PE routers.

On PE1 (Fa0/1), PE2 (FA0/1) and P (Fa0/0, Fa0/1)

(config-if)# mtu 1530
#sh int fa1/0 | i MTU
MTU 1530 bytes, BW 100000 Kbit/sec, DLY 100 usec,
#

MPLS MTU <= Core interfaces MTU

It is very important to distinguish between the IOS commands for setting MTU:

Hardware MTU

(config-if)# mtu <>: The maximum packet length the interface can support.

IP MTU

(config-if)# ip mtu <>: The maximum size of a non-labelled IP packet without fragmentation.

MPLS MTU

(config-if)# mpls mtu <>: The maximum size of a labelled IP packet without fragmentation (<= Hardware MTU).

Ivan Pepelnjak provides an excellent article about the difference between different MTU commands.

PE2#sh mpls int fa0/1 detail
Interface FastEthernet0/1:
IP labeling enabled (ldp):
Interface config
LSP Tunnel labeling not enabled
BGP labeling not enabled
MPLS operational

MTU = 1500
PE2#

Testing end-to-end MTU

Note: Lab limitation

This lab was performed on GNS3 and I had some difficulties building the MPLS core using the C7200 platform with IOS 12.4(24) as the P router; the 3700 platform IOS did not allow me to change the hardware MTU.

So the following test assumes a maximum MTU through the MPLS core of 1500 bytes.

The ping test is performed from a client site using the EoMPLS VLAN circuit to probe the MTU limit.

WestC3#ping
Protocol [ip]: ipv6
Target IPv6 address: 2001:db8:52::1
Repeat count [5]:
Datagram size [100]: 1400
Timeout in seconds [2]:
Extended commands? [no]: yes
Source address or interface: 2001:db8:12::1
UDP protocol? [no]:

Verbose? [no]:

Precedence [0]:

DSCP [0]:

Include hop by hop option? [no]:

Include destination option? [no]:

Sweep range of sizes? [no]: yes

Sweep min size [1400]:

Sweep max size [18024]: 1530

Sweep interval [1]: 4

Type escape sequence to abort.

Sending 165, [1400..1530]-byte ICMP Echos to 2001:DB8:52::1, timeout is 2 seconds:

Packet sent with a source address of 2001:DB8:12::1

!!!!!!!!!!!!!!!!C!. (size 1472)

. (size 1476)

. (size 1480)

. (size 1484)

. (size 1488)

. (size 1492)

. (size 1496)

. (size 1500)

. (size 1504)

. (size 1508)

. (size 1512)

. (size 1516)

. (size 1520)

. (size 1524)

. (size 1528)

… <output omitted>

Success rate is 53 percent (88/165), round-trip min/avg/max = 28/136/344 ms

WestC3#

Note that ping fails starting from a packet size of 1472.

The EoMPLS VLAN pseudowire adds 30 bytes to the 1472-byte packet, which makes it bigger than 1500 bytes (the lab MTU limitation).

In fact, on top of the configured MTU in the MPLS core there is an implicit 18 bytes of underlying Ethernet header.
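
The failure threshold can be checked with the same overhead arithmetic (a sketch: 30 bytes of pseudowire overhead on a 1500-byte core):

```python
CORE_MTU = 1500              # lab-limited core interface MTU
OVERHEAD = 18 + 4 + 4 + 4    # VLAN Ethernet frame + control word + LDP and VC labels

max_payload = CORE_MTU - OVERHEAD
print(max_payload)  # 1470: a 1472-byte ping no longer fits, hence the drops
```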

The following illustration hopefully clarifies the relationship between the different MTUs:

AToM pseudowire MTU

Interactive illustration of Wireshark captured MPLS core transport packet

Offline Lab

Quiz

Conclusion

  • AToM pseudowires are configured independently of IPv4/IPv6, which makes the client transition from IPv4 to IPv6 transparent.
  • From the client point of view it is a directly connected point-to-point circuit.
  • Make sure the MPLS core interface MTUs are configured for the maximum packet size and that the MPLS MTU is not bigger than the interface hardware MTU, to avoid unnecessary fragmentation.

6VPE (IPv6 VPN Provider Edge Router)


6VPE is an easy solution to connect IPv6 customers through an existing stable IPv4 MPLS infrastructure.

All clients have to do is connect to a Provider Edge (configured with IPv6 VRFs) using IPv6.

6VPE MPLS

I hope this post will provide you with a brief and concise explanation of 6VPE.

Let’s start with a short animation summarizing the 6VPE forwarding process:

The main configuration steps follow.

Lab topology

6VPE lab topology

Core IGP

For the sake of backbone stability, we need to configure the Core IGP (OSPF) to use loopback interfaces (always UP/UP) on all P and PE routers. {2.2.2.2, 3.3.3.3, 4.4.4.4}

22.2.2.2, 33.3.3.3, 44.4.4.4 loopback interfaces are used for MPLS router-id and need to be advertised through Core OSPF.

22.22.2.2, 44.4.4.4 loopback interfaces are used for MP-iBGP neighbor relationships and need to be advertised through Core OSPF.

By default OSPF advertises a loopback as a /32 host route regardless of the configured mask; to advertise the actual mask, we configure the interface as OSPF network type point-to-point. For more details refer to this post…

6VPE2

interface Loopback0
ip address 4.4.4.4 255.255.255.255
!
!
interface Loopback2
ip address 44.4.4.4 255.255.255.0
!
!
interface Loopback3
ip address 44.44.4.4 255.255.255.255
ip ospf network point-to-point
!
router ospf 234
router-id 4.4.4.4
network 44.4.4.4 0.0.0.0 area 0
network 44.44.4.4 0.0.0.0 area 0
network 192.0.0.0 0.255.255.255 area 0

Core

interface Loopback0
ip address 3.3.3.3 255.255.255.255
!
router ospf 234
router-id 3.3.3.3
network 33.3.3.3 0.0.0.0 area 0
network 192.0.0.0 0.255.255.255 area 0

6VPE1

interface Loopback0
ip address 2.2.2.2 255.255.255.255
!
!
interface Loopback2
ip address 22.2.2.2 255.255.255.255
!
!
interface Loopback3
ip address 22.22.2.2 255.255.255.255
ip ospf network point-to-point
!
mpls ldp router-id Loopback2 force
!
router ospf 234
router-id 2.2.2.2
network 22.2.2.2 0.0.0.0 area 0
network 22.22.2.2 0.0.0.0 area 0
network 192.0.0.0 0.255.255.255 area 0

MPLS-LDP

MPLS LDP establishes sessions between directly connected peers for label exchange in the control plane and label swapping in the forwarding plane.

Just configure MPLS LDP on the appropriate interfaces and force a loopback interface as the MPLS LDP router-id.

Core

interface FastEthernet0/0
ip address 192.168.23.3 255.255.255.0
mpls label protocol ldp
mpls ip
!
interface FastEthernet0/1
ip address 192.168.34.3 255.255.255.0
mpls label protocol ldp
mpls ip
!
mpls ldp router-id Loopback2 force

6VPE1

interface FastEthernet1/0
ip address 192.168.23.2 255.255.255.0
mpls label protocol ldp
mpls ip
!
mpls ldp router-id Loopback2 force

6VPE2

interface FastEthernet1/0
ip address 192.168.34.4 255.255.255.0
mpls label protocol ldp
mpls ip
!
mpls ldp router-id Loopback2 force

Provider Edge VRFs

Make sure IPv6 routing and IPv6 CEF are enabled.

6VPE1

vrf definition west-c1
rd 100:100
!
address-family ipv6
route-target export 100:100
route-target import 100:100
exit-address-family
interface FastEthernet0/0
vrf forwarding west-c1
ipv6 address FE80::12:2 link-local
ipv6 address 2001:DB8:12::2/64
router bgp 65234
no synchronization
bgp router-id 22.2.2.2
bgp log-neighbor-changes
!
address-family ipv6 vrf west-c1
redistribute connected
no synchronization
neighbor 2001:DB8:12::1 remote-as 65010
neighbor 2001:DB8:12::1 activate
exit-address-family

West6

router bgp 65010
bgp router-id 1.1.1.1
bgp log-neighbor-changes
neighbor 2001:DB8:12::2 remote-as 65234
!
address-family ipv6
neighbor 2001:DB8:12::2 activate
network 2001:DB8:10::/64
network 2001:DB8:12::/64
exit-address-family

6VPE2

vrf definition east-c1
rd 100:100
!
address-family ipv6
route-target export 100:100
route-target import 100:100
exit-address-family
interface FastEthernet0/0
vrf forwarding east-c1
ipv6 address FE80::45:4 link-local
ipv6 address 2001:DB8:45::4/64
router bgp 65234
no synchronization
bgp router-id 44.4.4.4
bgp log-neighbor-changes
!
address-family ipv6 vrf east-c1
redistribute connected
no synchronization
neighbor 2001:DB8:45::5 remote-as 65050
neighbor 2001:DB8:45::5 activate
exit-address-family

Note: with BGP used as the customer protocol, no redistribution is needed between the PE-CE routing protocol (BGP) and MP-BGP.

East6

router bgp 65050
bgp router-id 5.5.5.5
bgp log-neighbor-changes
neighbor 2001:DB8:45::4 remote-as 65234
!
address-family ipv6
neighbor 2001:DB8:45::4 activate
network 2001:DB8:45::/64
network 2001:DB8:50::/64
exit-address-family

Provider Edge-to-Provider Edge VPNv6

To understand the difference between PE-PE and PE-CE interactions, think about the difference between a routing protocol and a routed protocol:

  • BGP, OSPF, EIGRP, RIP are routing protocols.
  • IPv4, IPv6, IPX, AppleTalk are routed protocols.

So routing protocols exchange routed protocol information. In our particular case:

  • PE-CE routing protocol is BGP and PE-CE routed protocol is IPv6.
  • PE-PE routing protocol is MP-BGP and PE-PE routed protocol is vpnv6.
  • Core routing protocol is OSPF and the routed protocol is IPv4.

Vpnv4 = RD + VRF IPv4 prefix

Vpnv6 = RD + VRF IPv6 prefix

RD (Route Distinguisher) uniquely identifies the VRF on the PE and allows having multiple customer VPNs with overlapping address schemas.

RT (Route Target) is a BGP extended community attribute (enabled with “send-community extended”) used to control the installation of routes exchanged between PEs into the correct VRF.

PE-PE (MP-BGP) updates carry MP_REACH_NLRI information:

vpnv4/vpnv6 prefix + (BGP attributes + RT extended community) + label
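
A minimal sketch of why the RD lets overlapping prefixes coexist (hypothetical RD values; not how IOS stores routes internally):

```python
# MP-BGP keys VPN routes by (RD, prefix), so two customers can reuse
# the same IPv6 prefix without colliding in the provider's tables.
def vpnv6_route(rd, prefix):
    return (rd, prefix)

table = {
    vpnv6_route("100:100", "2001:DB8:10::/64"): "vrf west-c1",
    vpnv6_route("200:200", "2001:DB8:10::/64"): "vrf other-customer",  # hypothetical
}
print(len(table))  # 2 distinct VPNv6 routes despite identical IPv6 prefixes
```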

VPNv6 route exchange (using MP-BGP)

The MPLS network, autonomous system 65234, transits traffic between customer autonomous systems 65010 and 65050.

In our case all MPLS routers (P and PE) belong to the same AS, so we need to configure next-hop-self on each PE; otherwise customer prefixes will appear in the BGP table with unreachable next-hops.

Another solution is to enable MPLS IP on the interfaces facing the clients to include them in MPLS updates; in that case, don’t forget to filter LDP (UDP/TCP 646) and TDP (TCP 711) traffic with the clients.

6VPE1

router bgp 65234
no synchronization
bgp router-id 22.2.2.2
neighbor 44.44.4.4 remote-as 65234
neighbor 44.44.4.4 update-source Loopback3
neighbor 44.44.4.4 next-hop-self
no auto-summary

The address-family vpnv4 is used to exchange customer IPv4 prefixes between PEs (through IPv4 core)

The address-family vpnv6 is used to exchange customer IPv6 prefixes between PEs (through IPv4 core)

router bgp 65234
address-family vpnv4
neighbor 44.44.4.4 activate
neighbor 44.44.4.4 send-community extended
exit-address-family
!
address-family vpnv6
neighbor 44.44.4.4 activate
neighbor 44.44.4.4 send-community extended
exit-address-family

6VPE2

router bgp 65234
no synchronization
bgp router-id 44.4.4.4
neighbor 22.22.2.2 remote-as 65234
neighbor 22.22.2.2 update-source Loopback3
neighbor 22.22.2.2 next-hop-self
no auto-summary
router bgp 65234
address-family vpnv4
neighbor 22.22.2.2 activate
neighbor 22.22.2.2 send-community extended
exit-address-family
!
address-family vpnv6
neighbor 22.22.2.2 activate
neighbor 22.22.2.2 send-community extended
exit-address-family

And a small QUIZ to check the very basics.

The offline lab provides you with more information about the network behavior and its states in different test cases.

An extensive range of commands is provided.

I hope you will find it useful. Suggestions and criticism are welcome.
