Linux Network Address Translation

Network Address Translation (NAT) is a vitally important Internet technology for a variety of reasons. It can provide load balancing for parallel processing, it can provide several types of strong access security, and it can provide fault-tolerance and high-availability. Finally, it can simplify some basic network administration functions. Below, we sketch the possible uses, and then follow up with Linux-specific applications.

Internet Traffic Load Balancing
IBM describes a router which they used for parallelizing web server queries for the Olympic Games web server. If I understand the trade press properly, this router takes a TCP/IP connection request, re-labels it, and redistributes it to one of many web servers ("mirrors") operating at different IP addresses. Each server maintains an identical set of web pages. The user is unaware of the existence of multiple web servers/mirrors, as they (i.e. their browser) connect to the externally published, well-known domain name. The mirrors were geographically distributed (Atlanta, New York, California), and requests were routed to the least-busy and/or ping-time closest server. Although the above description is technically vague, the utility of such a technology is obvious.

Intranet Compute-Server Load Balancing
The web is not the only client-server technology that can potentially have trouble when there are too many clients trying to access the same server. As an example, a database server may be trying to fulfill database queries. Since database queries are much, much more CPU intensive than simple web-page queries, it is easier to overload a database. If, however, the database queries are all "read-only" queries (they do not modify the database), then it is possible to distribute the load to several machines. Network Address Translation (NAT) can provide the mechanism. With NAT, the headers of IP packets that come into one machine are re-written, and forwarded to the least-busy database server in the cluster. The reply packets from the servers are again re-written and returned to the client, thus making it appear that there was only one database server with only one IP address.

Note that such a scheme provides not only load-balancing and improved performance, but it also provides fault-tolerance: individual servers can be taken off-line and serviced, while the overall system continues to operate without stopping.
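The rewrite-and-forward idea described above can be sketched in a few lines of Python. This is only an illustration, not how any particular kernel or product does it: packets are plain dictionaries, and the names Backend, NatBalancer, inbound and outbound are made up for this example. "Least busy" is approximated by counting the connections assigned to each server.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    address: str
    active: int = 0       # connections currently assigned to this server
    online: bool = True   # set to False while the server is being serviced


@dataclass
class NatBalancer:
    virtual_ip: str       # the single address that clients see
    backends: list
    # (client_ip, client_port) -> Backend chosen for that connection
    connections: dict = field(default_factory=dict)

    def inbound(self, packet):
        """Rewrite the destination of a client packet to a real server."""
        key = (packet["src"], packet["sport"])
        backend = self.connections.get(key)
        if backend is not None and not backend.online:
            # The server was taken off-line: release it and pick another.
            backend.active -= 1
            backend = None
        if backend is None:
            candidates = [b for b in self.backends if b.online]
            if not candidates:
                raise RuntimeError("no database server is online")
            backend = min(candidates, key=lambda b: b.active)
            backend.active += 1
            self.connections[key] = backend
        return {**packet, "dst": backend.address}

    def outbound(self, packet):
        """Rewrite a reply so that it appears to come from the virtual IP."""
        return {**packet, "src": self.virtual_ip}
```

Marking a Backend as off-line is enough for new packets to be steered to the remaining servers, which is the fault-tolerance property noted above. A real implementation would also have to expire finished connections and fix up checksums.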

Firewall Security through Masquerading
One important security concept is that it is much easier to guard a single point of entry than it is to guard many points. This is the principle behind the Internet firewall: a single machine that divides the network into the "inside" and the "outside", with all traffic passing through the firewall. By protecting the single network firewall, the entire internal network can be protected. Masquerading allows insiders to get out, without allowing outsiders in. Masquerading re-writes the IP headers of internal packets going out, making it appear that they all came from the firewall. Reply packets coming back are translated back, and forwarded to the appropriate internal machine. Thus, inside machines are allowed to connect to the outside world. However, outside machines cannot: in fact, they cannot even *find* the internal machines, since they are aware of only *one* IP address, that of the firewall. Thus, they cannot attack the internal machines directly.

Besides providing this type of basic security, Masquerading also simplifies network administration: The admin of the internal network can choose reserved IP addresses, e.g. in the 10.x.x.x range, or the 192.168.x.x range. These addresses do not have to be registered with the InterNIC, and can be used however the sysadmin wants, as long as they are not used on the external network. Note that this also alleviates the shortage of IP addresses that ISPs are facing: A site with hundreds of computers can get by with a mere 8 or 16 Internet IP addresses, without denying any of its users Internet access.
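A minimal sketch of the masquerading bookkeeping, assuming packets are plain dictionaries. The class name Masquerader, the helper is_reserved_inside and the starting port 61000 are choices made for this illustration only; a real implementation also checks the remote address and expires idle entries.

```python
import ipaddress

# The two reserved ranges mentioned in the text.
RESERVED = [ipaddress.ip_network("10.0.0.0/8"),
            ipaddress.ip_network("192.168.0.0/16")]


def is_reserved_inside(address):
    """True if the address falls in one of the reserved inside ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RESERVED)


class Masquerader:
    def __init__(self, public_ip, first_port=61000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.by_inside = {}   # (inside_ip, inside_port) -> firewall port
        self.by_port = {}     # firewall port -> (inside_ip, inside_port)

    def outgoing(self, packet):
        """Make an inside packet appear to come from the firewall."""
        key = (packet["src"], packet["sport"])
        port = self.by_inside.get(key)
        if port is None:
            port = self.next_port
            self.next_port += 1
            self.by_inside[key] = port
            self.by_port[port] = key
        return {**packet, "src": self.public_ip, "sport": port}

    def incoming(self, packet):
        """Translate a reply back, or return None if nobody inside asked."""
        inside = self.by_port.get(packet["dport"])
        if inside is None:
            return None
        ip, port = inside
        return {**packet, "dst": ip, "dport": port}
```

The None returned by incoming is the "outsiders cannot get in" property: a packet arriving on a port that no inside machine has opened has nowhere to go.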

Interactive Web Site Security
An increasing number of web sites are becoming interactive by having cgi-bins or Java applets that access some database or other service. However, this sort of access can be a big security problem: the database typically has to be hidden behind a firewall, where it cannot be attacked, while the web server and cgi-bins/Java applets must obviously be outside the firewall, so that web surfers can get to them. This is particularly true if the database contains customer information, financial information or other sensitive, confidential information, or if the database runs on a mainframe or other internal server that cannot or should not be connected directly to the Internet.

NAT in the form of Port Forwarding can provide an almost ideal solution to this access problem. On the firewall, IP packets that come in to a specific port number can be re-written and forwarded to the internal server providing the actual service. The reply packets from the internal server are re-written to make it appear that they came from the firewall. Thus, Port Forwarding is becoming, and will remain, a very important Internet technology.
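Port forwarding is the mirror image of masquerading: the rule table is fixed in advance by the admin rather than learned from outgoing traffic. A rough sketch, with packets as dictionaries and the name PortForwarder invented for this example:

```python
class PortForwarder:
    def __init__(self, firewall_ip, rules):
        self.firewall_ip = firewall_ip
        # firewall port -> (internal_ip, internal_port)
        self.rules = dict(rules)
        # (internal_ip, internal_port) -> firewall port, for the replies
        self.reverse = {target: port for port, target in self.rules.items()}

    def inbound(self, packet):
        """Forward a packet that arrived on a published firewall port."""
        target = self.rules.get(packet["dport"])
        if target is None:
            return None       # no rule for this port: do not forward
        ip, port = target
        return {**packet, "dst": ip, "dport": port}

    def reply(self, packet):
        """Make the internal server's reply appear to come from the firewall."""
        fw_port = self.reverse.get((packet["src"], packet["sport"]))
        if fw_port is None:
            return None
        return {**packet, "src": self.firewall_ip, "sport": fw_port}
```

Only the ports listed in the rule table are reachable from outside; everything else on the internal server stays hidden.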

Mobile Employees
An increasing number of corporate employees have gone mobile: they are roving about, with their laptops, doing work at customer locations. However, these employees need access to internal servers, and they need that access to be secure and encrypted. One solution for such access requirements is to run an encryption technology, such as SKIP or IPSEC, to the firewall, with a configuration that gives users access to important internal servers and networks. Currently, the leading IPSEC implementation on Linux is FreeS/WAN; it co-exists just fine with the standard Linux iptables network filtering code. Note that there are IPsec clients available for many versions of Windows, and that Microsoft ships an IPsec implementation with Windows XP, although the XP license may prevent you from legally using this in interesting ways; in particular, using it together with VNC.

Alternately, mobile employees may want to publish servers on their laptops, and make those servers findable and available despite a dynamically assigned IP address. Technology that enables mobile IP while allowing servers on the mobile platform is referred to as RAT -- Reverse Address Translation.

Related Topics

Attention!

The last major revision to this page was in 1998, and many of the links and references on this page are rather outdated. Things have moved on since this page was first written. A fairly flexible and sophisticated NAT has been implemented in the 2.4.x Linux kernel, through a set of highly flexible IP filter tables. The iptables utility is the preferred way of configuring Linux network translation. You should check to see if iptables meets your needs; if it does not, then you should probably investigate some of the references below.

Alternatives, Hints & Solutions

Netfilter/IP Tables
Netfilter/iptables is the de facto standard NAT/packet-filtering/firewall tool for Linux-2.4 and later kernels. Chances are excellent that your favorite Linux distribution has packet-filtering/firewalling correctly enabled in the default kernel, and includes the iptables utilities as a separately installable package. Thus, you need merely to install the tools, and then read/understand the FAQs, HOWTOs and Tutorials. There are also several graphical tools for configuring the filter rules; however, they always seemed to be a bit underwhelming, failing to significantly simplify the (somewhat arduous and complex) task of setting up the filter rules.

IPChains
IPChains is an older Linux firewall/packet-filtering tool for Linux-2.2 kernels. It has been replaced by IP Tables (above). ipchains is a standard network utility that should come with all Linux distributions of that era, and so the place to search for documentation is on your own computer: man ipchains. See also the ipchains HOWTO. ipchains replaces the older ipfwadm and ipfw utilities. If you are running the older Linux 2.0 or 2.2 kernels, then the IP Masquerade HOWTO is the appropriate place to start. Those interested in NAT for firewall and security purposes should review the Linux Firewall Tools web page.

The Eddie Project
The Eddie Project offers a broad and powerful set of Open Source tools for solving a variety of cluster management and server farm load balancing problems. This project, supported by the telecom giant Ericsson, appears to be the most comprehensive, well-balanced package of offerings out there. It provides support for four major subsystems:

RFC 1631
RFC 1631 describes the "traditional" NAT (Network Address Translation) that can be used for this kind of task. Basically, the idea behind NAT is to re-write the IP headers and substitute one numeric address for another. This document discusses some basic implementation issues, such as computing header checksums, and mentions problems with packet encryption, and ICMP. It does not discuss load-balancing or masquerading issues.

Some limitations of this traditional approach are discussed in The Linux IP NAT theory of operation, including masquerading, load-balancing, fragmentation and keeping kernel state information.
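To make the checksum issue concrete: whenever an address in the IP header is substituted, the header checksum must be brought up to date. The sketch below is my own illustration, not code taken from the RFC. It assumes a 20-byte IPv4 header without options, computes the standard 16-bit one's-complement checksum, and also shows the incremental form HC' = ~(~HC + ~m + m') from RFC 1624, which avoids re-summing the whole header. The function names are invented for this example.

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over data (padded to even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                      # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_source(header: bytes, new_src: str) -> bytes:
    """Substitute the source address and recompute the header checksum."""
    hdr = bytearray(header)
    hdr[12:16] = ipaddress.IPv4Address(new_src).packed
    hdr[10:12] = b"\x00\x00"                # checksum field is zero while summing
    struct.pack_into("!H", hdr, 10, internet_checksum(bytes(hdr)))
    return bytes(hdr)


def incremental_update(old_checksum, old_words, new_words):
    """Update a checksum given only the 16-bit words that changed."""
    total = ~old_checksum & 0xFFFF
    for old, new in zip(old_words, new_words):
        total += (~old & 0xFFFF) + new
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

Note that TCP and UDP checksums cover the IP addresses as well (via the pseudo-header), so a NAT box has to adjust those too, not just the IP header checksum.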

Masquerading
One variation of NAT, called masquerading, is already available in stock Linux kernels. The theory, tools and installation procedure are discussed in the IP Masquerade mini-HOWTO. Masquerading is designed to provide security. It is intended for use as a type of a firewall, hiding many hosts behind one IP address, and relabeling all packets from behind the firewall so that they appear to be coming from one location, the firewall itself. IP Masq is very powerful and flexible in this respect, and the filter & accounting rules can be configured to handle complex network topologies. However, it does not currently support the inverse operation of distributing incoming packets to multiple servers.

Note that Linux Masq performs not only "pure" NAT (i.e. re-writing IP packet headers), but also "impure" packet re-writing in order to handle the use of services such as FTP, IRC, quake, RealAudio, CUSeeMe, VDO Live, Microsoft PPTP, etc. from behind the firewall.

Linux IP Network Address Translation
Currently, there are more NAT implementations for Linux than one can shake a stick at. They vary in features supported, design choices, popular appeal, and more. Personally, I believe that the situation is ripe for consolidation and collaboration.

NAPT (Network Address and Port Translation) in general is becoming a *very* important web technology, since more and more web sites are trying to be interactive. To be interactive, they have to have cgi-bins or Java applets or whatever that have to chat with one or more databases. The database may contain things like customer info, credit card numbers, or other financial/personal/confidential stuff, and so you want to hide it behind the firewall. NAPT is *crucial* for allowing client access while maintaining security. (The firewall, since it presents a single point of entry, is easier to monitor and guard than trying to have many machines exposed on the Internet).

As far as I know, no one has tried any of these NAPT implementations with ENskip or IPSEC. Note that this is another important application: with ENskip, clients can talk in an encrypted fashion with the firewall. After decoding on the firewall, it would be nice to forward the packets to the appropriate service behind the firewall. Anyone who has tried port forwarding with ENskip or IPSEC, please let me know.

Linux Virtual Server
The Linux Virtual Server Project aims to build a scalable virtual server from a cluster of real servers by using IP traffic load balancing mechanisms. The virtual server is implemented as a kernel module, based on the Linux IP masquerading code and Steven Clarke's port forwarding code. It can dynamically forward an arbitrary IP connection on a given port on the firewall to a server chosen from a cluster. Dispatch uses a weighted round-robin scheduling algorithm.

The goal of this technology is to enable scalable servers, such as scalable web servers, to be built from a cluster of real servers, while providing the security, filtering rules, and IP hiding/translation aspects of a NAT-style firewall. In this sense, it provides a far greater level of masking/hiding than the IBM Network Dispatcher, and resembles Cisco LocalRedirector more closely in operation.
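For readers unfamiliar with weighted round-robin dispatch, here is one common formulation as a Python generator. I have not checked it against the kernel module's source, so treat it as an illustration of the scheduling idea only; the function name and the server names in the usage note are made up.

```python
from functools import reduce
from math import gcd


def weighted_round_robin(weights):
    """Yield server names forever, in proportion to their integer weights.

    weights maps a server name to a positive integer weight.
    """
    servers = list(weights)
    if not servers or any(weights[s] <= 0 for s in servers):
        raise ValueError("need at least one server, all weights positive")
    step = reduce(gcd, weights.values())   # amount the threshold drops per pass
    top = max(weights.values())
    index, threshold = -1, 0
    while True:
        index = (index + 1) % len(servers)
        if index == 0:
            # Start of a new pass over the servers: lower the threshold,
            # wrapping back to the highest weight when it runs out.
            threshold -= step
            if threshold <= 0:
                threshold = top
        if weights[servers[index]] >= threshold:
            yield servers[index]
```

With weights {"big": 2, "small": 1} the generator hands out big, big, small, and then repeats, so the heavier server receives twice as many connections.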

ONE-IP
The ONE-IP Project implements network clustering using techniques that in many ways are superior to traditional NAT as described above. One method of achieving distribution is with packets routed to a gateway which then dispatches based on hardware address, rather than IP address. Thus, all servers on the LAN segment have the same IP address, and reply to clients with that single IP address, thus avoiding the overhead of NAT re-writing. Furthermore, since dispatching is stateless, and the router, gateway and servers sit on the same segment, failover of the dispatcher is considerably simplified. Another method eliminates the need for a dispatcher by broadcasting on the local segment, and having servers respond selectively based on a hash of the source address. Both of these techniques seem to be quite robust to me; I've bought into the theory. The code is for NetBSD kernels.
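The dispatcher-less method is easy to picture: every server sees every packet, and each one decides independently whether the packet is "its own". The hash below (the client address taken modulo the number of servers) is purely an assumption for illustration; I do not know which hash the ONE-IP code actually uses, and the function names are invented.

```python
import ipaddress


def owner_index(client_ip, server_count):
    """Map a client address to the index of the one server that answers it."""
    return int(ipaddress.ip_address(client_ip)) % server_count


def should_respond(client_ip, my_index, server_count):
    """Each server runs this locally on every broadcast packet it sees."""
    return owner_index(client_ip, server_count) == my_index
```

Because every server computes the same function on the same source address, exactly one of them answers, and no per-connection state has to be shared between the machines.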

RAT Reverse Address Translation
RAT or RAPT (Reverse Address and Port Translation) allows a host whose real IP address is changing from time to time to remain reachable as a server via a fixed home IP address. In principle, this should allow setting up servers on DHCP-run networks. While not a perfect mobility solution, RAPT, together with upcoming protocols like DHCP-DDNS, may end up becoming another useful tool in the network admin's arsenal.

Like most Mobile IP research, RAT is being done on Linux first. The Mobile Computing Group at the National University of Singapore has a number of Mobile IP projects underway, including an early stage GPL'ed implementation of RAPT called Raptile. Testers and developers are solicited.

HTTP/WWW Load Balancing
Several URL-based load balancing technologies are generally available for Linux, either as open source, or as products.

Redundant, Load Balanced Firewalls
A product review entitled A Solution to Redundant, Load Balanced Firewall Systems discusses some of the issues surrounding fault tolerance in relation to firewalls.

Coyote Point
Coyote Point Systems, Inc. offers an EIA rack mount box that provides fault tolerant load balancing of IP traffic.

RFC 1794 (DNS Support for Load Balancing)
RFC 1794 provides a description of a way of answering DNS requests for a single, well-known domain name with multiple, different IP addresses. This allows connections to a single domain to be routed to geographically separate servers, without forcing all IP traffic through one NAT-enabled router. As such, it is a geographically robust load-balancing solution. However, the mechanics of doing the actual load balancing are considerably more difficult.

A description from the net (edited):

>Date: Wed, 4 Sep 96 17:34:13 CDT
>From: subbu@htc.honeywell.com (Subburajan Ponnuswamy)
>Subject: Re: Parallel IP Routing
>
>When you look for a machine
>over the Internet, say www.xxx.com,  DNS resolves this name by
>contacting other DNS servers (unless it's already in its cache). 
>However, at the other end, www.xxx.com may have multiple IP addresses 
>(i.e multiple machines, multiple interfaces, etc). Whenever a request
>comes in for an address with multiple IPs, the DNS server sends all the
>valid IPs back in the reply. However, the order of the IP addresses
>play an important role as the requesting machine will start from the 
>IP address which appears first..
>
>You can implement your own algorithm, to shuffle the addresses, 
>add/delete the addresses,  etc., whenever a query comes in for 
>www.xxx.com. Your algorithm may  send IP addresses back to the 
>requesting machine, such that the least  loaded or geographically 
>closest machine appears first. Or you can  do a simple round 
>robin of IP addresses. The TTL value and domain update interval
>should be set accordingly.
>
>The recent release of BIND has the code for this feature (very 
>simple extension of current named) under contrib. 
>
>--Subbu* -- subbu@htc.honeywell.com  ------  Opinions are mine only..
One serious limitation of DNS-based load balancing is that many/most implementations of BIND view TTLs of less than 300 seconds, and zone refreshes more often than every 15 minutes, as being irrational and the symptom of a broken configuration. Given that most HTTP transfers only last a few seconds, and socket keep-alives for minutes, it is impossible to do load balancing on a fine grain. At best, DNS can support only a "stochastic" load balancing, redirecting clients to servers randomly, as various caches in various resolvers expire at random (although small) intervals.
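The address shuffling described in the quoted message can be sketched as follows. This is not BIND code; the class RoundRobinRecord and its methods are hypothetical, the addresses in the usage note are documentation addresses, and the default TTL of 300 seconds simply reflects the lower bound discussed above.

```python
from collections import deque


class RoundRobinRecord:
    """Answers for one domain name that has several IP addresses."""

    def __init__(self, name, addresses, ttl=300):
        self.name = name
        self.ttl = ttl                    # seconds a resolver may cache the answer
        self._ring = deque(addresses)

    def answer(self):
        """Return every address, rotating which one appears first."""
        reply = list(self._ring)
        self._ring.rotate(-1)             # the next query starts one address later
        return reply

    def answer_by_load(self, load):
        """Return every address, least-loaded first (load: address -> number)."""
        return sorted(self._ring, key=lambda address: load[address])
```

A record created with the addresses 192.0.2.1, 192.0.2.2 and 192.0.2.3 puts 192.0.2.1 first in its first answer, 192.0.2.2 first in its second, and so on. Since resolvers cache each answer for the TTL, the server that a given client ends up on changes far less often than the rotation suggests.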

Address Failover, HA
A related topic occurs in discussions of High Availability (HA), which deals with issues of fault-tolerance and redundancy. One important component of this is address takeover, where the backup machine takes over the IP address(es) of the failed machine.

Firewall-1 from checkpoint.com
See http://www.checkpoint.com/products/contnav_concon.html for another simple description & overview of the same concepts. The product runs on NT and a number of Unixes. Unconfirmed rumours of a Linux port. Similar to Linux NAT, it is integrated with a firewall.

Shock Absorber
News (Sept 1996): IBM has made a beta-test version of their software available, the ShockAbsorber, from the IBM AlphaWorks web site. It is available for the IBM RS/6000 only, although their FAQ indicates that it can distribute to other boxes, such as Linux, without modification.

If the above URLs do not work for you, try entering the AlphaWorks site, and search for "shock-absorber". That site uses funky URLs that are not valid when put into bookmarks or web pages ...


Misc



Copyright (c) 1996, 1997, 1998, 2002 Linas Vepstas All Rights Reserved.
Created 31 August 1996 -- Linas Vepstas
Last updated November 2002 -- by Linas Vepstas
(linas@linas.org)

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included at the URL http://www.linas.org/fdl.html, the web page titled "GNU Free Documentation License".
