Feature #2623

Package and provide OpenVPN and tun/tap drivers manageable by dladm

Added by Jim Klimov over 8 years ago. Updated almost 4 years ago.

Status: Rejected
Priority: Low
Assignee:
Category: OI-Userland
Target version: -
Start date: 2012-04-14
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

SHORT PITCH:

Many users have implemented the free OpenSSL-based VPN solution in their networks - OpenVPN servers and clients. This software is easily installable on small router boxes with Linux inside, on servers and PCs to connect into their networks from "the scary outside", as well as to unite several LANs (branch offices etc.) with a VPN over untrusted WANs, using an L3 routing or L2 bridging approach. In some ways this solution might be easier to use than GRE or PPTP based VPNs, and it is somewhat more "standardized" being a single-vendor's open product running on countless hardware and OS platforms ;)

Pre-packaged OpenVPN and TUN/TAP tunnel driver software can be downloaded from repositories like Blastwave, or compiled from source by the user; however, having a standard solution in the OpenIndiana distribution out-of-the-box seems like a better option. This is especially so since there is some work to be done on OpenVPN and/or its drivers to make the solution not "just able to work", but also more performant (think network tunnel driver changes) and more tightly integrated with the OS (think SMF).

Unfortunately, the solution often performs poorly. The most notable example is CIFS/SMBfs over OpenVPN, as compared to native VPN-less CIFS, or to rsync or scp (using the same OpenSSL), between the same two hosts over the same WAN or even over the same OpenVPN tunnels across this WAN.
However, there are vast performance differences between implementations and platforms (e.g. uploads from a Windows OpenVPN client to the secured server behind an OpenSolaris OpenVPN router are a lot faster than downloads from that server over the same VPN), so changes in (or replacement of) the TUN/TAP drivers and/or updates to OpenVPN might solve the Solaris-side problems. Further detail will be posted below.

MORE DETAIL:
I will post further thoughts on this matter as an update to the RFE, because otherwise the whole flood of text is mailed on every update to the ticket ;)

EXPECTED DELIVERABLES:
I think this project is complex in both research and integration, so it can be phased.

Phase 1 - Package the 3rd party software as-is, and include SMFization for OpenVPN instances:
  • Check that current OpenVPN (http://openvpn.net/index.php/open-source/downloads.html) and TUN/TAP drivers for Solaris (http://www.whiteboard.ne.jp/~admin2/tuntap/) can be compiled for illumos/OpenIndiana by both SS12 and GCC3.4.3 or whatever compiler stacks are officially supported at that moment.
  • Prepare the several packages for 32/64-bit platforms - Compile TUN/TAP drivers, Bridge drivers, their management utilities; Compile OpenVPN software; Create SMF manifests and methods to manage OpenVPN instances; prepare packaging manifests; publish to repo.

At this point, end users can use OpenVPN in OpenIndiana at least as well as with packages from other sources; it is integrated with SMF and available from the default repositories.
Now let's try to make it better.

Phase 2 - Research the (CIFS) performance problems with TUN/TAP drivers and OpenVPN as they are:
  • determine whether TUN/TAP, OpenVPN or OpenSSL are at fault in CIFS-over-OpenVPN performance degradation
  • see if something can be done in the default stack to improve (CIFS) performance
  • see if existing OpenSolaris interfaces for IP tunneling (ClearView/CrossBow) can be used with OpenVPN, out-of-the-box or after some OpenVPN patching, or after some OpenSolaris patching, and if the resulting VPN performs better than TUN/TAP-based VPN ;)
    The positive result includes that these tunnels should work correctly with TUN/TAP drivers on the other side of the tunnel.
  • If any of the attempted solutions does indeed improve the VPN performance, work with upstream of the opensource projects to integrate the patches for everyone's benefit.

Phase 3 depends on results of Phase 2 research, and may involve some or all of the following:
  • if TUN/TAP and bridge drivers are here to stay, as-is or improved by patching - integrate them tighter with OpenSolaris network management (dladm configuration, flow controls and bandwidth limits per-link, maybe add dtrace visibility, etc.)
  • if OpenVPN can work with existing OpenSolaris IP tunnels, document how that can be done
  • optionally, for the CIFS problem in particular, it might be possible to create a proxy service, which would somehow streamline the IP dialog within the VPN (so that its encryption is more optimal and fast) and "serve" the end-client on the other side, while the proxy would act as a client to (its local) CIFS server and interact with it at wire-speed.

At this point, if the project succeeds, the illumos OpenVPN tunnels work better and faster than original ones (after Phase 1), while keeping interoperability with TUN/TAP/bridge on the other side - either by improving TUN/TAP/bridge drivers for OpenSolaris and expanding their integration with illumos, or by using native OpenSolaris tunnels and improving OpenVPN to use them. Maybe different improvement solutions spring up and are implemented instead.

Files

openvpn.init (8.72 KB) - My init-script for managing OpenVPN servers and clients in one blow on OpenSolaris SXCE, evolved from the stock script in OpenVPN 2.1rc15. Jim Klimov, 2012-04-14 04:44 PM

History

#1

Updated by Jim Klimov over 8 years ago

SOME FURTHER DETAILS

This post further describes the IP tunneling driver used by OpenVPN on various platforms, and details on the problem I've seen with slowdowns over OpenVPN as compared to other OpenSSL-encrypted interactions (rsync).

=== SMF

There are no out-of-the-box SMF wrappers for OpenVPN that I know of, and they would be useful since a typical OpenVPN installation can have many separate instances running (a TCP and a UDP server, perhaps several copies for different categories of external clients with different settings, as well as client instances to connect to external servers). An example to get started with SMFization of OpenVPN can be seen in J. Moellenkamp's blog: http://www.c0t0d0s0.org/archives/4147-Solaris-Features-Service-Management-Facility-Part-4-Developing-for-SMF.html

A heavy modification of original OpenVPN initscript which properly works for me on an OpenSolaris SXCE server/client as well as in RHEL is attached to this post. This script properly shuts down the TUN interfaces when stopping OpenVPN and adds some other tweaks for Solaris vs. Linux.
This script might be used to bootstrap development of the SMF method script ;)

=== TUNNEL DRIVERS

OpenVPN relies on special "Universal TUN/TAP" networking interface drivers, originally written for Linux (http://vtun.sourceforge.net/tun/faq.html), to establish IP (TUN) or Ethernet (TAP) tunnels between two hosts. The original driver was patched for Solaris by Kazuyoshi Aizawa (http://www.whiteboard.ne.jp/~admin2/tuntap/); his project also provides a "bridge" module and management tools, which may have been obsoleted by OpenSolaris dladm bridges.
These interfaces seem different from the tunnels, bridges, IPSec interfaces and etherstubs provided by OpenSolaris Clearview.
Determining whether they are different in fact, and how much (i.e. can a VPN server's dladm tunnel endpoint interact with a TUN/TAP on the client), is part of this RFE.

OpenVPN authenticates the server and client hosts to each other (using PKI certificates) and shares the desired networking config, then uses TUN/TAP to establish a tunnel between the server and client hosts over WAN, assigns private IP addresses in a /30 subnet dedicated for these two hosts bound to these new tunnel interface instances, sets up routing for VPN-protected subnets, and then apparently uses OpenSSL to encrypt each packet transferred in the tunnel.
Two OpenVPN hosts (client and server) can create tunnels and interact over either TCP/IP or UDP/IP. It is possible to configure two server instances with different client IP address ranges dedicated to support the two types of clients. It is also possible to redirect or NAT the (TCP/IP) service on the firewall, just like you would publish an internal HTTP server with NAT or even inetd+netcat.
Being an ordinary TCP/IP or UDP/IP session, OpenVPN does not conflict with firewalls that block other IP protocols, and it works over NAT and through internet service providers that deliver their generic internet service over an IP VPN (PPTP, GRE, PPPoE and so on).
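The per-client /30 addressing mentioned above can be illustrated with a small sketch. It only shows how a pool splits into /30 subnets with two usable addresses each; the pool 10.8.0.0/24 and the function name are assumptions made for the example, and it does not claim to reproduce OpenVPN's actual allocation order.

```python
import ipaddress


def carve_p2p_subnets(pool: str, count: int):
    """Return the first `count` /30 subnets of `pool` with their two usable hosts."""
    network = ipaddress.ip_network(pool)
    result = []
    for subnet in network.subnets(new_prefix=30):
        if len(result) == count:
            break
        # A /30 has four addresses: network, two usable hosts, broadcast.
        first, second = subnet.hosts()
        result.append((subnet, first, second))
    return result


if __name__ == "__main__":
    # 10.8.0.0/24 is a hypothetical VPN pool chosen for illustration.
    for subnet, a, b in carve_p2p_subnets("10.8.0.0/24", 3):
        print(f"{subnet}: endpoints {a} and {b}")
```

Since each tunnel consumes four addresses, a /24 pool of this kind has room for 64 such point-to-point subnets.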

Quoting from the TUN/TAP project FAQ,
1.1 The TUN is Virtual Point-to-Point network device. TUN driver was designed as low level kernel support for IP tunneling. It provides to userland application two interfaces:
- /dev/tunX - character device;
- tunX - virtual Point-to-Point interface.
Userland application can write IP frame to /dev/tunX and kernel will receive this frame from tunX interface. In the same time every frame that kernel writes to tunX interface can be read by userland application from /dev/tunX device.

1.2 The TAP is a Virtual Ethernet network device. TAP driver was designed as low level kernel support for Ethernet tunneling. It provides to userland application two interfaces:
- /dev/tapX - character device;
- tapX - virtual Ethernet interface.
Userland application can write Ethernet frame to /dev/tapX and kernel will receive this frame from tapX interface. In the same time every frame that kernel writes to tapX interface can be read by userland application from /dev/tapX device.

1.6 What is the difference between TUN driver and TAP driver?
TUN works with IP frames. TAP works with Ethernet frames.

1.5 How does Virtual network device actually work?
Virtual network device can be viewed as a simple Point-to-Point or Ethernet device, which instead of receiving packets from a physical media, receives them from user space program and instead of sending packets via physical media sends them to the user space program.

And paraphrasing the original example into VPN terms:
Let's say that you configured OpenVPN on tun0; then whenever the kernel sends any packet into the VPN network via tun0, it is passed to the application (the OpenVPN daemon). The application compresses and encrypts it, and sends it to the other side over TCP or UDP. (In the case of OpenVPN it is apparently OpenVPN's job to determine the internet client based on its virtual address.) The application on the other side decrypts and decompresses the packet and writes it to its local TUN device, and the remote kernel handles the packet as if it came from a real physical device.
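The packet path just described can be mimicked in user space without any real driver. The following is only a sketch under loud assumptions: local datagram socket pairs stand in for /dev/tunX and for the WAN link, and an XOR routine stands in for encryption (it is not secure and is not what OpenVPN does). The sketch compresses before "encrypting" and reverses the steps in the opposite order on the receiving side.

```python
import socket
import zlib

KEY = b"not-a-real-key"  # placeholder only; real keys come from the TLS handshake


def toy_cipher(data: bytes) -> bytes:
    """XOR with a repeating key: a stand-in for real encryption, NOT secure."""
    return bytes(b ^ KEY[i % len(KEY)] for i, b in enumerate(data))


def tunnel_out(tun_dev: socket.socket, wan: socket.socket) -> None:
    """Read one packet routed into the tunnel, compress, 'encrypt', send to peer."""
    packet = tun_dev.recv(65535)
    wan.send(toy_cipher(zlib.compress(packet)))


def tunnel_in(wan: socket.socket, tun_dev: socket.socket) -> None:
    """Receive one datagram from the peer, 'decrypt', decompress, hand to kernel."""
    datagram = wan.recv(65535)
    tun_dev.send(zlib.decompress(toy_cipher(datagram)))


def demo() -> bytes:
    # Each pair is (kernel side, daemon side); they imitate the tun device files.
    kernel_a, tun_a = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    kernel_b, tun_b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    wan_a, wan_b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        kernel_a.send(b"pretend this is an IP packet")  # kernel A routes into tun0
        tunnel_out(tun_a, wan_a)                        # daemon on host A
        tunnel_in(wan_b, tun_b)                         # daemon on host B
        return kernel_b.recv(65535)                     # kernel B receives it
    finally:
        for s in (kernel_a, tun_a, kernel_b, tun_b, wan_a, wan_b):
            s.close()


if __name__ == "__main__":
    print(demo())
```

Every packet crosses the kernel/user boundary twice on each host in this model, which is one reason the driver implementation can matter for throughput.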

The Solaris-patched TUN/TAP interfaces are STREAMS2, and so they can be snooped, they can be IP-routed and filtered/NATed by ipfilter.
However, there is no separate Solaris interface for each individual server-client tunnel - there is only a single interface with a larger network range dedicated for further micromanagement of dedicated subnets by OpenVPN (i.e. my management tools can only see the tun0 and tun1 interfaces on my server with two OpenVPN instances, one providing TCP and another providing UDP). As I said above, apparently it is OpenVPN's job to determine the internet counterpart's address based on its virtual address in the tunnel, and send packets into the correct socket. As a consequence, it is problematic to enforce dladm bandwidth controls on a certain client's tunnel.
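To make the "one interface, many clients" point concrete, here is a hedged sketch of the kind of lookup the daemon would need: a table from virtual subnets to the real internet address of the connected peer. The class name, the addresses (taken from documentation ranges) and the longest-prefix rule are all assumptions for illustration, not OpenVPN's internal data structures.

```python
import ipaddress
from typing import Dict, Optional, Tuple

Peer = Tuple[str, int]  # (public IP, port) of a connected client


class PeerTable:
    """Maps virtual tunnel subnets to the real address of the remote peer."""

    def __init__(self) -> None:
        self._routes: Dict[ipaddress.IPv4Network, Peer] = {}

    def learn(self, virtual_subnet: str, peer: Peer) -> None:
        """Record that `virtual_subnet` is reachable through `peer`."""
        self._routes[ipaddress.ip_network(virtual_subnet)] = peer

    def lookup(self, virtual_dst: str) -> Optional[Peer]:
        """Pick the most specific subnet containing `virtual_dst`, if any."""
        addr = ipaddress.ip_address(virtual_dst)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self._routes[best]


if __name__ == "__main__":
    table = PeerTable()
    table.learn("10.8.0.4/30", ("203.0.113.10", 40001))   # hypothetical client 1
    table.learn("10.8.0.8/30", ("198.51.100.7", 40002))   # hypothetical client 2
    print(table.lookup("10.8.0.6"))
    print(table.lookup("10.9.0.1"))
```

Because this mapping lives inside the daemon rather than in the kernel, the OS only ever sees the aggregate traffic on tun0, which is why per-client link controls have nothing to attach to.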

=== Solaris TUN/TAP performance and CIFS

There are some performance problems (most notably in SMB/CIFS transfers), possibly related to these drivers, and the theoretical ability to use the new optimized OS-standard interfaces for tunneling might improve the situation. Alternatively, perhaps something in the original stack can be improved to speed things up. If the TUN/TAP interfaces are to stay, at least they could become managed by dladm and/or nwam to improve overall administerability and standardization of illumos/OpenIndiana networking.

One problematic use-case is detailed below:

I'm using OpenVPN 2.1_rc15 with Kazuyoshi's TUN/TAP driver 1.1 for Solaris on OpenSolaris (SXCE b129) as a router for a few years now, and it works reliably for a number of remote clients (staff from home, etc.).

One noted problem of our setup is the abysmal performance, most notably of CIFS downloads to the client from the secured network (Samba/Linux, Solaris/kCIFS and Windows servers).
For example, I can rsync files onto a Windows notebook at about 800-900KBps from Solaris servers behind this router's firewall without OpenVPN, which roughly matches the client's internet speeds (marketed as "10Mbps DL/4Mbps UL", so around 1Mbyte/sec is expected). Using the OpenVPN tunnel, however, rsync speeds drop to 100-300KBps. CIFS over OpenVPN is limited even further, down to 60-100KBps, both when browsing the shares and when downloading files, even single large ones (this was somewhat expected after reading the forums, blogs and docs, which all note CIFS' dependency on small packets).
Overall this is a 10x-15x degradation compared to "wirespeed" of the internet link.

In some sources it was implied that while some performance degradation may be due to OpenSSL operating on small packets, or TCP/UDP flow control mismatches, or MTU and fragmentations, much of the rest can be blamed on the TUN/TAP interface software - which is made differently for different OSes, often by different authors, and performs - well - differently. We've tried playing with MTU and TCP/UDP tunnels, but to no significant avail.

One interesting part of the experiment was the upload of files from the same remote client to the same server. Both the major methods - rsync over WAN, and CIFS over OpenVPN, performed roughly the same, at about 400KBps (client's upload speed provided by the ISP). With all other parts of the equation being the same, including the almost-idle CPUs of the client (Core DUO 2GHz) and the server (Dual-core AMD Opteron 2214), this means that pushing the packets into the tunnel and/or OpenSSL encryption on the Windows client notebook works a lot faster (at least 4x-7x) than on a better-sized OpenSolaris server.

In fact, a clumsy workaround was found to be better than direct CIFS: the Solaris fileserver automounts needed shares from the Linux and Windows Samba/CIFS services, and those "virtually local paths" are rsync'ed from the Solaris server to the notebook client bypassing the VPN. This also works at internet speed (about 1Mbyte/sec). It also shows that the server network per se is not the bottleneck... The same rsync session to the server's private IP address (over VPN) runs at roughly 180-300KBps - better than CIFS, but 5x worse than "raw" rsync. CPUs are not busy during transfer, the openvpn process consuming about 1%, and 90% idling overall.
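The figures quoted in this use-case can be put side by side with a few lines of arithmetic. The sketch below only restates the measurements from this post (all in KBps); the helper names are made up for the example, and the ranges are treated as simple (low, high) pairs.

```python
def mbps_to_kbytes_per_s(mbps: float) -> float:
    """Convert a link rate in megabits/s to kilobytes/s (decimal units)."""
    return mbps * 1000 / 8


def slowdown(baseline, measured):
    """Return (min, max) slowdown factors of a measured range vs. a baseline range."""
    return (baseline[0] / measured[1], baseline[1] / measured[0])


# Measurements quoted in this post, as (low, high) in KBps.
RAW_RSYNC = (800, 900)      # rsync over the WAN, no VPN
VPN_RSYNC = (100, 300)      # rsync through the OpenVPN tunnel
VPN_CIFS = (60, 100)        # CIFS through the OpenVPN tunnel

if __name__ == "__main__":
    print("nominal download:", mbps_to_kbytes_per_s(10), "KBps")
    print("nominal upload:  ", mbps_to_kbytes_per_s(4), "KBps")
    for name, measured in (("rsync over VPN", VPN_RSYNC), ("CIFS over VPN", VPN_CIFS)):
        low, high = slowdown(RAW_RSYNC, measured)
        print(f"{name}: {low:.1f}x to {high:.1f}x slower than raw rsync")
```

Against the raw rsync baseline, CIFS over the VPN comes out between 8x and 15x slower, in line with the rough "10x-15x" estimate given earlier.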

OpenSolaris ClearView/CrossBow tunneling interfaces, which have been around for a while (since OpenSolaris build 125, that's about 3 years now), also present IP-in-IP tunnels. They require knowledge of the endpoints' IP addresses, and are configured by commands like this:
  dladm create-iptun -t -T ipv4 -s $LocalIP -d $RemoteIP
It is possible to separately configure IPSec security for the tunnel.

Perhaps it is possible to change OpenVPN to utilize these tunnels on OpenSolaris (via command-line or library API calls) instead of using TUN/TAP, and perhaps that would be more performant. Subject to research and testing...

#2

Updated by Bayard Bell over 8 years ago

  • Status changed from New to Rejected
  • Priority changed from Normal to Low

The scope of this issue is all over the place. We don't really do big RFEs, and this has a bunch of touch points in different gates. Slim this down to something a developer is likely to read and circulate it for discussion before putting it into a tracker. Start by talking to the networking list.

#3

Updated by Jim Klimov almost 4 years ago

FWIW, enhanced support for OpenVPN, including multiple SMF instances for different setups, landed in OI/Hipster this autumn. Also "vpnc" and "openconnect" were added, all using tuntap drivers and different userspace logic to connect to various VPN protocols (IPsec and VPN/SSL).

It is still a different technology from dladm-manageable tunnels, and perhaps should stay this way since assignment of IP addresses etc. is done differently.

Sometimes there may be issues disconnecting a tunnel (that can block its later reuse if the addresses remain attached). Usually this can be solved by explicit "tunctl" management (also packaged). A few times, at least during initial integration, reboots were needed to untangle this.
