OVN (Open Virtual Network) Introduction

OVN (Open Virtual Network) is an open source software that provides virtual networking for features like L2, L3 and ACL, etc. OVN works with OVS (Open vSwitch) which has been adopted widely. OVN has been developed by the same community as that of OVS and it is treated almost like a sub-project of OVS. Just like OVS, the development of OVN is 100% open and discussion is being made on a public mailing list and IRC. While VMware and RedHat are the primary contributors today, it is open to everybody who wishes to contribute to OVN.

The target platform for OVN is Linux-derived hypervisors such as KVM and Xen and containers. DPDK is also supported. As of this writing, there is no ESXi support. Because OVS has been ported to Hyper-V (still in progress, though), OVN may support Hyper-V in the future.

It is important to note that OVN is CMS (Cloud Management System) agnostic and takes OpenStack, Docker and Mesos into the scope. Among these CMPs, OpenStack would have the highest significance for OVN. A better integration with OpenStack (compared to Neutron OVS plugin as of today) is probably one of the driving factors of OVN development.

OVN share the same goal as OVS, that is, supporting large scale deployment consisting of thousands of hypervisors with production quality.

Please see the diagram depicted below showing the high level architecture of OVN.

OVN Architecture
OVN Architecture

As you can see, OVN has two new components (processes). One is “ovn-northd” and another is “ovn-controller”. As its name implies, ovn-northd provides a northbound interface to CMS. As of this writing, only one ovn-northd exists in one OVN deployment. However, this part will be enhanced in the future to support some sort of clustering for redundancy and the scale-out.

One of the most unique architectures of OVN is probably the fact that the integration point with CMS is a database (DB). Although many people would expect that RESTful API is the integration point as a northbound interface, ovn-northd is designed to interact with CMS via DB.

ovn-northd uses two databases. One is called “Northbound DB”, which holds a “desired state” of virtual network. More specifically, it holds information about logical switch, logical port, logical router, logical router port, and ACL. It is worth noting that Northbound DB doesn’t hold any “physical” information. An entity sitting north side of ovn-northd (typically CMS) writes information to this Northbound DB to interact with ovn-northd. Another database that ovn-northd takes care of is “Southbound DB” which holds runtime state such as physical, logical and binding information. Specifically, Southbound DB includes information related to chassis, datapath binding, encapsulation port binding, and logical flows. One of the important roles of ovn-northd is to read Northbound DB, translates it to logical flows and write them to Southbound DB.

On the other hand, ovn-controller is a distributed controller, which will be installed on every single hypervisor. ovn-controller reads information from Southbound DB and configures ovsdb-server and ovs-vswitchd running on the same host accordingly. ovn-controller translates logical flow information populated in Southbound DB to physical flow information and then install it to OVS.

Today OVN uses OVSDB as a database system. Inherently database can be anything and doesn’t have to be OVSDB. However, as OVSDB is an intrinsic part of OVS, on which OVN always depends, and developers of OVN know the characteristics of OVSDB very well, they decided to use OVSDB as a database system for OVN for the time being.

Basically OVN provides three features; L2, L3 and ACL (Security Group).

As an L2 feature, OVN provides a logical switch. Specifically, OVN creates an L2-over-L3 overlay tunnel between hosts automatically. OVN uses Geneve as a default encapsulation. Considering the necessity of metadata support, multipath capability and hardware acceleration, Geneve would be the most desired encapsulation of choice. In case where hardware acceleration on the NIC for Geneve is not available, OVN allows to use STT as the second choice. In general, HW-VTEP doesn’t support Geneve/STT today, OVN uses VXLAN when talking to HW-VTEPs.

In terms of L3 feature, OVN provides so called a “distributed logical routing”. L3 features provided by OVN are not centralized meaning each host executes L3 function autonomously. L3 topology that OVN supports today is very simple. It routes traffic between logical switches directly connected to it and to the default gateway. As of this writing it is not possible to configure a static route other than the default route. It is simple but sufficient to support basic L3 function that OpenStack Neutron requires. NAT will be supported soon.

OVS plugin in OpenStack Neutron today implements Security Group by applying iptables to tap interface (vnet) on Linux Bridge. Having both OVS and Linux Bridge at the same time makes the architecture somewhat complex.

under-the-hood-scenario-1-ovs-compute
Conventional OpenStack Neutron OVS plugin Architecture (An excerpt from http://docs.ocselected.org/openstack-manuals/kilo/networking-guide/content/figures/6/a/a/common/figures/under-the-hood-scenario-1-ovs-compute.png)

Since OVS 2.4, OVS has been integrated with “conntrack” feature available on Linux, so it is possible to implement stateful ACL by OVS without relying on iptables. OVN takes advantages of this OVS & conntrack integration to implement the ACL.

Since conntrack integration is an OVS feature, one can use OVS+conntrack without OVN. However, OVN allows you to use stateful ACL without explicit awareness of conntrack because OVN compiles logical ACL rules to conntrack-based rules automatically, which would be appreciated by many people.

I will go into a bit more detail about L2, L3 and ACL features of OVN in the subsequent posts.

NetFlow on Open vSwitch

Open vSwitch (OVS) has been supporting NetFlow for a long time (since 2009). To enable NetFlow on OVS, you can use the following command for example:

[shell]# ovs-vsctl — set Bridge br0 netflow=@nf — –id=@nf create NetFlow targets=\”10.127.1.67\”[/shell]

When you want to disable NetFlow, you can do it in the following way:

[shell]# ovs-vsctl — clear Bridge br0 netflow[/shell]

NetFlow has several versions. V5 and V9 are the most commonly used versions today. OVS supports NetFlow V5 only. NetFlow V9 is not supported as of this writing (and very unlikely to be supported because OVS already supports IPFIX, a direct successor of NetFlow V9).

NetFlow V5 packet consists of a header and flow records (up to 30) following the header (see below).

NetFlow V5 header format NetFlow V5 header format

NetFlow V5 flow record format NetFlow V5 flow record format

NetFlow V5 cannot handle IPv6 flow records. If you need to monitor IPv6 traffic, you must use sFlow or IPFIX.

Comparing to NetFlow implementation on typical routers and switches, the one on OVS has a couple of unique points that you should keep in mind. I will describe such points below.

Most NetFlow-capable switches and routers support so called a “sampling”, where only subset of packets are processed for NetFlow (there are couple of ways how to sample the packet but it is beyond the scope of this blog post). NetFlow on OVS doesn’t support sampling. If you need to sample the traffic, you need to use sFlow or IPFIX instead.

Somewhat related to the fact that NetFlow on OVS doesn’t do sampling, it is worth noting that “byte count (dOctets)” and “packet count (dPkts)” fields in a NetFlow flow record (both are 32bit fields) may wrap around in case of elephant flows. In order to circumvent this issue, OVS sends multiple flow records when bytes or packets count exceeds 32bit maximum so that it can tell the collector with the accurate bytes and packets count.

Typically, NetFlow-capable switches and routers have a per-interface configuration to enable/disable NetFlow in addition to a global NetFlow configuration. OVS on the other hand doesn’t have a per-interface configuration. Instead it is a per-bridge configuration that allows you to enable/disable NetFlow on a per-bridge basis.

Most router/switch-based NetFlow exporters allow to configure the source IP address of NetFlow packet to be exported (and if it is the case, loopback address is a reasonable choice for this address). OVS, however, doesn’t have this capability. The source IP address of NetFlow packet is determined by the IP stack of host operating system and it is usually an IP address associated with the outgoing interface. Since NetFlow V5 doesn’t have a concept like “agent address” in sFlow, most collectors distinguish the exporters by the source IP address of NetFlow packets. Because OVS doesn’t allow us to configure the source IP address of NetFlow packets explicitly, we’d better be aware that there is a chance for the source IP address of NetFlow packets to change when the outgoing interface changes.

Although it is not clearly described in the document, OVS, in fact, supports multiple collectors as shown in the example below. This configuration provides redundancy of collectors.

[shell]# ovs-vsctl — set Bridge br0 netflow=@nf — –id=@nf create NetFlow targets=\[\”10.127.1.67:2055\”,\”10.127.1.68:2055\”\][/shell]

When flow-based network management is adopted, In/Out interface number included in a flow record has a significant importance because it is often used to fitter the traffic of interest. Most commercial collector products have such a sophisticated filtering capability. Router/switch-based NetFlow exporter uses SNMP’s IfIndex to report In/Out interface number. NetFlow on OVS on the other hand uses OpenFlow port number instead to report In/Out interface number in the flow record. OpenFlow port number can be found by ovs-ofctl command as follows:

[shell]# ovs-ofctl show br0
OFPT_FEATURES_REPLY (xid=0x2): dpid:0000000c29eed295
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst
1(eth1): addr:00:0c:29:ee:d2:95
config: 0
state: 0
current: 1GB-FD COPPER AUTO_NEG
advertised: 10MB-HD 10MB-FD 100MB-HD 100MB-FD 1GB-FD COPPER AUTO_NEG
supported: 10MB-HD 10MB-FD 100MB-HD 100MB-FD 1GB-FD COPPER AUTO_NEG
speed: 1000 Mbps now, 1000 Mbps max
LOCAL(br0): addr:00:0c:29:ee:d2:95
config: 0
state: 0
speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0[/shell]

In this example, OpenFlow port number of “eth1” is 1. Some In/Out interface number have a special meaning in OVS. An interface local to the host (interfaces that is labeled as “LOCAL” in the example above) is represented as 65534. Output interface number 65535 is used for broadcast/multicast packets whereas most router/switch-based NetFlow exporters use “0” for both these two cases.

Those who are aware that IfIndex information was added to “Interface” table in OVS relatively recently may think that using IfIndex instead of OpenFlow port number is the right thing to do. May be true, but it is not that simple. For example, tunnel interface created by OVS doesn’t have an IfIndex, so it is not possible to export a flow record for traffic that traverses over tunnel interfaces if we simply chose IfIndex as NetFlow In/Out interface number.

NetFlow V5 header has a field called “Engine ID” and “Engine Type”. How these fields are set in OVS by default depends on the type of datapath. If OVS is run in a user space using netdev, Engine Type and Engine ID are derived from a hash value based on the datapath name and Engine ID and Engine Type are the most significant and least significant 8 bit of the hash value respectively. On the other hand, in the case of Linux kernel datapath using netlink, IfIndex of datapath is set to both Engine ID and Engine Type. You can find the IfIndex of OVS datapath by the following command:

[shell]# cat /proc/sys/net/ipv4/conf/ovs-system[/shell]

If you don’t like the default value of these fields, you can configure them explicitly as shown below:

[shell]# ovs-vsctl set Bridge br0 netflow=@nf — –id=@nf create NetFlow targets=\”10.127.1.67:2055\” engine_id=10 engine_type=20[/shell]

The example above shows the case where Engine Type and Engine ID are set at the same time when NetFlow configuration is created for the first time. You can also change NetFlow-related parameters after NetFlow configuration is created by doing like:

[shell]# ovs-vsctl clear NetFlow br0 engine_id=10 engine_type=20[/shell]

In general, typical use case of Engine Type and Engine ID is to distinguish logically separated but physically a single exporter. Good example of such a case is Cisco 6500, which has MSFC and PFC in a single chassis but having separate NetFlow export engines. In OVS case, it can be used to distinguish two or more bridges that is generating NetFlow flow records. As I mentioned earlier, the source IP address of NetFlow packet that OVS exports is determined by the standard IP stack (and it is usually not the IP address associated to the bridge interface in OVS that is NetFlow-enabled.) Therefore it is not possible to use the source IP address of NetFlow packets to tell which bridge is exporting the flow records. By setting a distinct Engine Type and Engine ID on each bridge, then you can tell it to the collector. To my knowledge, not many collectors can use Engine Type and/or Engine ID to distinguish multiple logical exporters however.

There is another use case for Engine ID. As I already explained that OVS uses OpenFlow port number as an In/Out interface number in NetFlow flow record. Because OpenFlow port number is a per bridge unique number, there is a chance for these numbers to collide across bridges. To get around this problem, you can set “add_to_interface” to true.

[shell]# ovs-vsctl set Bridge br0 netflow=@nf — –id=@nf create NetFlow targets=\”10.127.1.67:2055\” add_to_interface=true[/shell]

When this parameter is set to true, the 7 most significant bits of In/Out interface number is replaced with the 7 least significant bits of Engine ID. This will help interface number collision happen less likely.

Similar to typical router/switch-based NetFlow exporters, OVS also has a concept of active and inactive timeout. You can explicitly configure active timeout (in seconds) using the following command:

[shell]# ovs-vsctl set Bridge br0 netflow=@nf — –id=@nf create NetFlow targets=\”10.127.1.67:2055\” active_timeout=60[/shell]

If it is not explicitly specified, it defaults to 600 seconds. If it is specified as -1, then active timeout will be disabled.

While OVS has an inactive timeout mechanism for NetFlow, it doesn’t have an explicit configuration knob for it. When flow information that OVS maintains is removed from the datapath, information about those flows will also be exported via NetFlow. This timeout is dynamic; it varies depending on many factors like the version of OVS, CPU and memory load etc., but typically it is 1 to 2 seconds in recent OVS. This duration is considerably shorter than that of typical router/switch-based NetFlow exporters which is 15 seconds in most cases.

As with most router/switch-based NetFlow exporters, OVS exports flow records for ICMP packet by filling the number calculated by “ICMP Type * 256 + Code” to source port number in a flow record. Destination port number for ICMP packets is always set to 0.

NextHop, source and destination of AS number, source and destination of Netmask are always set to 0. This is an expected behavior as OVS is inherently a “switch”.

While there are some caveats as I described above, NetFlow on OVS is a very useful tool if you want to monitor traffic handled by OVS. One advantage of NetFlow over sFlow or IPFIX is that there are so many open source or commercial collectors available today. Whatever flow collector you choose most likely supports NetFlow V5. Please give it a try. It will give you a great visibility about the traffic of your network.

Geneve on Open vSwitch

A few weeks back, I posted a blog (sorry it was done only in Japanese) about a new encapsulation called “Geneve” which is being proposed to IETF as an Internet-Draft. Recently the first implementation of Geneve became available for Open vSwitch (OVS) contributed by Jesse Gross, a main author of Geneve Internet Draft, and the patch was upstream to a master branch on github where the latest OVS code resides. I pulled the latest source code of OVS from github and played with Geneve encapsulation. The following part of this post explains how I tested it. Since this effort is purely for testing Geneve and nothing else, I didn’t use KVM this time. Instead I used two Ubuntu 14.04 VM instances (host-1 and host-2) running on VMware Fusion with the latest OVS installed. In terms of VMware Fusion configuration, I assigned 1 Ethernet NIC on each VM which obtains an IP address from DHCP provided by Fusion by default. In the following example, let’s assume that host-1 and host-2 obtained an IP address 192.168.203.151 and 192.168.203.149 respectively. Next, two bridges are created (called br0 and br1), where br0 connecting to the network via eth0 while br1 on each VM talking with each other using Geneve encapsulation.

Geneve Test with Open vSwitch Geneve Test with Open vSwitch

OVS configuration for host-1 and host-2 are shown below:

[shell]mshindo@host-1:~$ sudo ovs-vsctl add-br br0
mshindo@host-1:~$ sudo ovs-vsctl add-br br1
mshindo@host-1:~$ sudo ovs-vsctl add-port br0 eth0
mshindo@host-1:~$ sudo ifconfig eth0 0
mshindo@host-1:~$ sudo dhclient br0
mshindo@host-1:~$ sudo ifconfig br1 10.0.0.1 netmask 255.255.255.0
mshindo@host-1:~$ sudo ovs-vsctl add-port br1 geneve1 — set interface geneve1 type=geneve options:remote_ip=192.168.203.149[/shell]

[shell]mshindo@host-2:~$ sudo ovs-vsctl add-br br0
mshindo@host-2:~$ sudo ovs-vsctl add-br br1
mshindo@host-2:~$ sudo ovs-vsctl add-port br0 eth0
mshindo@host-2:~$ sudo ifconfig eth0 0
mshindo@host-2:~$ sudo dhclient br0
mshindo@host-2:~$ sudo ifconfig br1 10.0.0.2 netmask 255.255.255.0
mshindo@host-2:~$ sudo ovs-vsctl add-port br1 geneve1 — set interface geneve1 type=geneve options:remote_ip=192.168.203.151[/shell]

Once this configuration has been done, now ping should work between br1 on each VM and those ping packets are encapsulated by Geneve.

[shell]mshindo@host-1:~$ ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.759 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.486 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=0.514 ms
64 bytes from 10.0.0.2: icmp_seq=4 ttl=64 time=0.544 ms
64 bytes from 10.0.0.2: icmp_seq=5 ttl=64 time=0.527 ms
^C
— 10.0.0.2 ping statistics —
5 packets transmitted, 5 received, 0% packet loss, time 3998ms
rtt min/avg/max/mdev = 0.486/0.566/0.759/0.098 ms
mshindo@host-1:~$ [/shell]

Let’s take a closer look how Genve encapsulated packets look like using Wireshark. A Geneve dissector for Wireshark became available recently (this is also a contribution from Jesse, thanks again!) and merged into the latest master branch. Using this latest Wireshark, we can see how Geneve packet looks like as follows:

Geneve Frame by Wireshark
Geneve Frame by Wireshark

As you can see, Geneve uses 6081/udp as its port number. This is a port number officially assigned by IANA on Mar.27, 2014. Just to connect two bridges together by Geneve tunnel, there’s no need to specify a VNI (Virtual Network Identifier) specifically. If VNI is not specified, VNI=0 will be used as you can see in this Wireshark capture.

On the other hand if you need to multiplex more than 1 virtual networks over a single Geneve tunnel, VNI needs to be specified. In such a case, you can designate VNI using a parameter called “key” as an option to ovs-vsctl command as shown below:

[shell]mshindo@host-1:~$ sudo ovs-vsctl add-port br1 geneve1 — set interface geneve1 type=geneve options:remote_ip=192.168.203.149 options:key=5000[/shell]

The following is a Wireshark capture when VNI was specified as 5000 (0x1388):

Geneve Frame with VNI 5000 by Wireshark
Geneve Frame with VNI 5000 by Wireshark

Geneve is capable of encapsulating not only Ethernet frame but also arbitrary frame types. For this purpose Geneve header has a field called “Protocol Type”. In this example, Ethernet frames are encapsulated so this filed is specified as 0x6558 meaning “Transparent Ethernet Bdiging”.

As of this writing, Geneve Options are not supported (more specifically, there is no way which Geneve Options to be added to Geneve header). Please note that Geneve Options are yet to be defined in Geneve’s Internet Draft. Most likely a separate Internet Draft will be submitted to define Geneve Options sooner or later. As such a standardization process progresses, Geneve implementation in OVS will also evolve for sure.

Although Geneve-aware NIC which can perform TSO against Geneve encapsulated packets is not available on the market yet, OVS is at least “Geneve Ready” now. Geneve code is only included in the latest master branch of OVS at this point, but it will be included in the subsequent official release of OVS (hopefully 2.2). When that happens you can play with it more easily. Enjoy!