Fire Protection-2.2

by Frank Bernard
translation by S.Czerny

Mechanisms and Possibilities for Firewalls of the new Linux-Kernel

No other IT market is booming like computer security. The growing popularity of company intranets and Internet connections means the company network must be secured not only against viruses introduced offline via diskettes, but also against hackers who work online. Even decision makers have recognized this by now.

When a company is connected directly to the Internet, it seldom takes long until the first intrusion attempt. The author's record is 4 hours from connecting a company to the first intrusion attempt from the Internet. Without a firewall, companies do not even notice that they have been invaded.

A firewall today must perform many different duties, of which protecting the local network is the most important but by far not the only one. It must
protect its own network against intrusions from the Internet
announce intrusion attempts
prevent successful attacks
regulate internet access with regard to time, users or workstations
support NAT (network address translation) and/or masquerading
check and filter data for viruses and other unwelcome contents
block non-productive access, e.g. to sex sites
cache HTML and FTP access
supply statistics about used services and e.g. web access
allow distribution of costs among departments of a company
be transparent to the users
allow VPN (virtual private network) with several sites
allow the boss access to the network from his home-PC
support any number of network segments
allow remote administration
be reliable and stable
perform well and without delay
make administration and updates easy

One thing is absolutely clear from this list, which makes no claim to completeness: a classic packet filter by itself is not enough. When you buy a firewall you can choose among hundreds. If you substitute the word "car" for the word "firewall" (I want to buy a new car), you see why firewall prices range from that of a ten-year-old Fiat Uno to that of a brand new Rolls-Royce. And as when buying a new car, you have to look closely at the standard equipment. A firewall with all the features mentioned, if such a thing exists at all, will cost a six-digit sum.

But you can get it for much less money, and without any quality loss: with Linuxwall.

Why Linux?

Well, first of all because of stability. Linux runs like the VW Beetle of yore and does not need to be rebooted at least once a week like other operating systems. There are said to be administrators who put off a really necessary hardware upgrade so as not to endanger the uptime record (cat /proc/uptime). The networking code is mature and stands comparison with other Unix operating systems.

The next argument is availability. No other operating system has as many kernel features and (free) tools as Linux. Each and every application has already been thought through by someone in the Linux community, who has written something about it or even published a program. Sometimes enthusiasm may outweigh competence, but the search for a starting point that saves work still pays off.

The Linux kernel offers a wealth of possibilities for realising a firewall. That was already true for 2.0; 2.2 brings changes.

At the moment driver support lags a little behind that of the "program loader with GUI". You had better not use the latest hardware unless you are sure that there is a Linux driver for it. Waiting one or two months sometimes works wonders. On the other hand, commercial firewalls have been developed on other Unix derivatives and run only on special (read: older) hardware platforms. That can soon lead to a spare-parts problem, which does not arise with Linux thanks to its wide circulation.

Scalability, on the other hand, is unmatched. There is an unofficial contest over who can run Linux on the smallest and on the largest machine. The span reaches from a 2-MB 386 to the clone project (Linux-Magazine 2/99). The author developed his first Linux firewall on a cast-off diskless 8-MB 386 that boots from diskette and obtains its root file system via NFS from another Linux server.

And then there is readiness to help. Other operating systems are said to have the advantage that the manufacturer answers for the operating system and remedies errors as soon as they are discovered. Depending on the manufacturer that costs more or less money, and when errors occur, a lot of time and nerves as well. In most cases the result is little or nothing. Linux has no manufacturer that I, as the person responsible for IT, could hold accountable (CYA: Cover Your A..), and so most decision makers find it suspect. But with Linux it has happened to the author several times that in really hopeless situations, after three days of testing and a first call for help with all the facts in the newsgroups, several working solutions arrived within one day, some from the developers themselves. (Footnote: once somebody asked in a newsgroup how to find out the version number under bash. After 5000 answers he posted "Thank you, no more answers please!")

Errors are corrected faster in Linux than in any other operating system. For an illegal Pentium instruction that froze the machine from user mode, a Linux patch was available within 12 hours. Intel confirmed the correctness of the patch, and all other operating systems just copied it.

Another important factor is transparency. In contrast to all other firewall suppliers, it is possible to prove that there are no back doors in the firewall.

And last but not least, the price. The entire operating system, the utilities and almost all other parts are free. The only expenditure, which can be quite large for the tasks mentioned, is the work you have to spend on collecting and patching all the free parts.

Kernels

The new Linux kernel offers a variety of features that support firewalling. In addition there are external kernel patches for special applications like VPN.
Packet filters, one of the lower levels of a firewall and absolutely necessary.
The ipfwadm package by Jos Vos was replaced by the more flexible ipchains package by Paul Russell.
Packet defragmentation, makes all packets filterable.
Masquerading, important if you connect an entire LAN with only one Internet IP address.
IP spoofing detection, if you want to detect faked Internet addresses.
Transparent proxy, so that you need to change no settings on the browsers.
Forwarding, can be switched on or off for different interfaces.
ISDN support, makes a separate ISDN router unnecessary.
Online tuning via the /proc file system.
Pseudo network interfaces, like dummy interfaces, important for testing without real network cards.
(IPIP) tunnel, absolutely necessary for VPN.
Ethertap, for rerouting and channelling routes.
Netlink, for observing and editing certain packets or protocols.
Policy routing, for setting priorities among protocols.
Traffic shaper, to limit used network bandwidth.
Linux socket filter, to filter on any socket.
Aliasing, enables a network card to have several IP addresses with different attributes.

Since a Linux firewall is simultaneously also a router, this combination can be tuned more finely than separate machines. It is generally advisable to bring together the different network segments of a LAN, up to 16 segments with today's multi-port Ethernet cards. Other firewalls limit the number of network interfaces to two or three, so you do not have this degree of freedom.

Packet filter

The program ipchains is the user interface to the packet filter in the kernel. Each call of the program specifies one rule for IP packets; other protocols (e.g. IPX or NetBEUI) cannot be handled. A rule defines the properties of a packet and an action that is taken if the packet matches all specified properties. Properties include, among others, the interface, the protocol, and the source and target addresses and ports.

The specification of properties is followed by an action, such as ACCEPT, DENY, MASQ or REDIRECT.

In addition, a packet that matches all specified properties can be handed to the kernel logging and/or forwarded to the netlink device (see below).

By combining properties and an action, individual rules can be specified, e.g. accept an incoming TCP packet at interface eth0 from any machine and any port to the machine 123.234.1.2, target port 80. As a command:
ipchains -A input -i eth0 -p tcp -d 123.234.1.2 80 -j ACCEPT

A normal TCP connection across the firewall needs six rules, three each for the way out and the way back. If, for example, we want to allow web access from the local net to the Internet, with the local network 212.212.212.0/24 at eth0 and the Internet at eth1, the rules are:

LAN=212.212.212.0/24
ipchains -A input -i eth0 -p tcp -s $LAN -d 0.0.0.0/0 80 -j ACCEPT     # input from the local network
ipchains -A forward -i eth1 -p tcp -s $LAN -d 0.0.0.0/0 80 -j ACCEPT   # transfer to another interface
ipchains -A output -i eth1 -p tcp -s $LAN -d 0.0.0.0/0 80 -j ACCEPT    # output to the Internet

ipchains -A input -i eth1 -p tcp -s 0.0.0.0/0 80 -d $LAN -j ACCEPT     # input from the Internet
ipchains -A forward -i eth0 -p tcp -s 0.0.0.0/0 80 -d $LAN -j ACCEPT   # transfer to another interface
ipchains -A output -i eth0 -p tcp -s 0.0.0.0/0 80 -d $LAN -j ACCEPT    # output to the local network
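
Because the same six-rule pattern recurs for every permitted service, it can be generated instead of typed. The following is only a sketch of that idea: the function name tcp_service_rules and its parameters are invented for this illustration, and it merely prints command lines in the form used above without calling ipchains.

```python
def tcp_service_rules(lan, lan_if, inet_if, port):
    """Return the six ipchains command lines for one outbound TCP service.

    Hypothetical helper: it only builds strings and never runs ipchains.
    """
    any_addr = "0.0.0.0/0"
    # (chain, interface) pairs for the way out and for the way back.
    way_out = [("input", lan_if), ("forward", inet_if), ("output", inet_if)]
    way_back = [("input", inet_if), ("forward", lan_if), ("output", lan_if)]
    rules = []
    for chain, iface in way_out:
        # Requests: from the LAN to any machine, target port = service port.
        rules.append(
            f"ipchains -A {chain} -i {iface} -p tcp "
            f"-s {lan} -d {any_addr} {port} -j ACCEPT"
        )
    for chain, iface in way_back:
        # Replies: from any machine with source port = service port to the LAN.
        rules.append(
            f"ipchains -A {chain} -i {iface} -p tcp "
            f"-s {any_addr} {port} -d {lan} -j ACCEPT"
        )
    return rules


if __name__ == "__main__":
    for rule in tcp_service_rules("212.212.212.0/24", "eth0", "eth1", 80):
        print(rule)
```

Generating the rules from one template keeps the two directions symmetric, so a typing error in a single rule is less likely.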

[Figure: single packet]

A typical packet filter layer of a firewall consists of hundreds of rules. The first matching rule with a real action is taken and executed. One single wrongly stated rule can be a fatal security hole. It is therefore always advisable to state rules as narrowly as possible and to widen them only when the need arises.

For this reason I always put a "log rule" last in each of the three chains input, forward and output. It applies to all packets that have made it to the end of the chain without matching any other rule. They are rejected and at the same time entered in the syslog, so that a suitable rule can then be added.
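
The first-match behaviour and the closing log rule can be modelled in a few lines. This is a simplified illustration, not the kernel's implementation: the Rule fields, the evaluate function and the example addresses are assumptions made for the sketch, and only interface, addresses and target port are compared.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    iface: str            # interface name, "*" matches any interface
    src: str              # source network in CIDR notation
    dst: str              # target network in CIDR notation
    dport: Optional[int]  # target port, None matches any port
    action: str           # e.g. "ACCEPT" or "REJECT"
    log: bool = False     # also record the packet when this rule matches


def matches(rule, pkt):
    """True if the packet has all properties the rule specifies."""
    return (
        rule.iface in ("*", pkt["iface"])
        and ipaddress.ip_address(pkt["src"]) in ipaddress.ip_network(rule.src)
        and ipaddress.ip_address(pkt["dst"]) in ipaddress.ip_network(rule.dst)
        and rule.dport in (None, pkt["dport"])
    )


def evaluate(chain, pkt, log):
    """Walk the chain top to bottom; the first matching rule decides."""
    for rule in chain:
        if matches(rule, pkt):
            if rule.log:
                log.append(pkt)
            return rule.action
    return "DENY"  # assumed fallback; never reached with a catch-all last rule


# One narrow rule, then the catch-all "log rule" at the end of the chain.
EXAMPLE_CHAIN = [
    Rule("eth0", "0.0.0.0/0", "123.234.1.2/32", 80, "ACCEPT"),
    Rule("*", "0.0.0.0/0", "0.0.0.0/0", None, "REJECT", log=True),
]

if __name__ == "__main__":
    log = []
    web = {"iface": "eth0", "src": "192.0.2.7", "dst": "123.234.1.2", "dport": 80}
    telnet = dict(web, dport=23)
    print(evaluate(EXAMPLE_CHAIN, web, log))
    print(evaluate(EXAMPLE_CHAIN, telnet, log))
    print(log)
```

Because evaluation stops at the first match, the order of the rules matters: a wide rule placed early hides every narrower rule behind it.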

Let us now consider the case that we have only one real (static) Internet address. The local network is "masqueraded", which means that the network connections are translated on the firewall. An observer in the Internet sees one single machine that has many connections to different targets; an observer in the local network sees direct connections to the target machines. Masquerading realises an n:1 relationship; its generalisation to an m:n relationship is called NAT (Network Address Translation). The firewall maintains a table that relates the supposed connections between client and server, the connections between client and firewall, and the masqueraded connections between firewall and server. With masquerading the firewall is transparent to the client, but the server sees a connection to the firewall. The masqueraded packets are altered in the firewall in such a way that the two other machines concerned are presented with what they should see.

Our rules therefore change as follows:
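
The translation table can be pictured with a small model. This is a sketch for illustration only, not the kernel's data structure: the class MasqTable, its method names and the starting port are assumptions, and addresses are plain strings.

```python
class MasqTable:
    """Toy model of an n:1 masquerading table (illustrative only)."""

    def __init__(self, fw_addr, first_port=61000):
        self.fw_addr = fw_addr
        self.next_port = first_port  # assumed start of the port range used
        self.out = {}    # (client, cport, server, sport) -> firewall port
        self.back = {}   # (firewall port, server, sport) -> (client, cport)

    def outbound(self, client, cport, server, sport):
        """Rewrite a packet leaving the LAN: the source becomes the firewall."""
        key = (client, cport, server, sport)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[(self.next_port, server, sport)] = (client, cport)
            self.next_port += 1
        return (self.fw_addr, self.out[key], server, sport)

    def inbound(self, server, sport, fw_port):
        """Rewrite a reply arriving at the firewall, or None if unknown."""
        client = self.back.get((fw_port, server, sport))
        if client is None:
            return None
        return (server, sport, client[0], client[1])


if __name__ == "__main__":
    table = MasqTable("212.212.212.212")
    print(table.outbound("10.1.2.3", 1025, "192.0.2.80", 80))
    print(table.inbound("192.0.2.80", 80, 61000))
```

The server only ever sees the firewall address and port, while the reply is mapped back to the client that opened the connection.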

LAN=10.1.2.0/24            # private address space according to RFC 1918
FWADDR=212.212.212.212
ipchains -A input -i eth0 -p tcp -s $LAN -d 0.0.0.0/0 80 -j ACCEPT       # input from the local network
ipchains -A forward -i eth1 -p tcp -s $LAN -d 0.0.0.0/0 80 -j MASQ       # transfer to another interface
ipchains -A output -i eth1 -p tcp -s $FWADDR -d 0.0.0.0/0 80 -j ACCEPT   # output to the Internet

ipchains -A input -i eth1 -p tcp -s 0.0.0.0/0 80 -d $FWADDR -j ACCEPT    # input from the Internet
# transfer to another interface: demasquerading is done by the kernel, so this rule is omitted
ipchains -A output -i eth0 -p tcp -s 0.0.0.0/0 80 -d $LAN -j ACCEPT      # output to the local network

With dynamic IP address allocation, which is usual for dial-in connections, the rules that contain the firewall address cannot be specified statically. Instead they have to be inserted in the ip-up script (with -I instead of -A), using the assigned IP address, and removed again in the ip-down script (with -D instead of -A).
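
One way to keep the inserted and the removed rules identical is to derive both from the same templates. The sketch below only builds the command lines; the function dynamic_rules, the template list and the interface name ppp0 are assumptions for the illustration, and how the script learns the assigned address is left out.

```python
# Rule templates for the two rules that mention the firewall's own address.
RULE_TEMPLATES = [
    "{op} output -i {iface} -p tcp -s {addr} -d 0.0.0.0/0 80 -j ACCEPT",
    "{op} input -i {iface} -p tcp -s 0.0.0.0/0 80 -d {addr} -j ACCEPT",
]


def dynamic_rules(event, iface, addr):
    """Build ipchains command lines for a link going 'up' or 'down'.

    'up' inserts the rules with -I, 'down' deletes the same rules with -D.
    """
    op = {"up": "-I", "down": "-D"}[event]
    return [
        "ipchains " + template.format(op=op, iface=iface, addr=addr)
        for template in RULE_TEMPLATES
    ]


if __name__ == "__main__":
    for line in dynamic_rules("up", "ppp0", "212.212.212.212"):
        print(line)
    for line in dynamic_rules("down", "ppp0", "212.212.212.212"):
        print(line)
```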

Many publications discuss the problems that arise with FTP and packet filtering. Passive FTP was invented for this reason; browsers use it by default, but interactive sessions do not. In general, packet filters have potential security problems whenever a session uses several connections, often also with UDP. Known candidates are e.g. FTP, IRC and pcAnywhere. To allow these connections statically, the rules have to be wider than originally intended. Only a so-called proxy process can insert or delete rules dynamically, by interpreting the underlying protocol (e.g. FTP) and performing the necessary actions.

Proxy

As the FTP example shows, a packet filter alone is insufficient for acceptable security in the LAN. Proxy means just that: a representative. A proxy in the firewall represents the actual service that it hides, an FTP proxy for the FTP server, an HTTP proxy for the web server and so on.

A proxy process can perform many tasks, such as checking authorisations, logging and content filtering.

Proxies can be cascaded, for instance for HTTP access. The proxy on the client's side for example checks authorisations and transfers the request to a special HTTP proxy, e.g. Squid or Apache.

When a proxy is used, the forward rules are omitted. Masquerading is not necessary, because the packets from both networks end at the proxy, which is thus the target address for the client and the source address for the server. The problem with this kind of proxy is that the user is aware of it and that additional steps are necessary, e.g. changing settings in the browser.

Transparent Proxy

A real highlight in the kernel, albeit not a new one, is the possibility of transparent proxies. For this, an incoming packet can be redirected to the firewall itself with the ipchains option "-j REDIRECT". The Linux kernel provides mechanisms that let the original target be recovered. Transparent proxies automatically mask the source address, so all connections seem to originate from one machine. It is just this property that allows the use of ENskip, a package for building VPNs that was developed at the ETH Zurich.

TCP

For TCP connections you implement a proxy that accepts the redirected connection. getsockname(), which on a redirected connection does not return the proxy's own address, tells you which machine was originally to be reached. A second socket is connected to this machine. The rest is simple copying to and fro. Client and server do not become aware of the firewall, q.e.d.
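
The steps just described (accept, look up the original target, open a second socket, copy in both directions) might be sketched as follows. This is a minimal illustration, not the author's proxy: the function names pump, relay, original_target and serve are invented here, and the sketch relies on the getsockname() behaviour on redirected connections that the text describes. On a connection that was not redirected, getsockname() returns the proxy's own address and the proxy would connect to itself.

```python
import socket
import threading


def pump(src, dst):
    """Copy bytes from src to dst until src reports end of stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end of stream on
        except OSError:
            pass


def original_target(conn):
    """Address the client originally wanted, as described in the text."""
    return conn.getsockname()[:2]


def relay(client, target):
    """Open the second socket to the target and copy to and fro."""
    server = socket.create_connection(target)
    upstream = threading.Thread(target=pump, args=(client, server), daemon=True)
    upstream.start()
    pump(server, client)
    upstream.join()
    server.close()
    client.close()


def serve(listen_port):
    """Accept redirected connections forever, one thread per connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("0.0.0.0", listen_port))
    listener.listen()
    while True:
        client, _peer = listener.accept()
        threading.Thread(
            target=relay, args=(client, original_target(client)), daemon=True
        ).start()
```

A call such as serve(8080) would be paired with a REDIRECT rule that sends the filtered connections to that port; the port number is only an example.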

Cascaded proxies can also be realised. This becomes interesting for proxy web caches like Squid or Apache. In this interesting special case the HTTP proxy server is configured so that it is bound to the address 127.0.0.1 instead of 0.0.0.0. The serving outer proxies, on the other hand, are bound to the addresses of the network interfaces, one process per network interface. A connection request reaches the network interface, is redirected from there to the address of this network interface, and is thus received and accepted by the outer proxy process. A second connection is established to the HTTP proxy process (Squid or Apache), and only this process establishes the actual connection to the target machine. To keep the HTTP processes from recursively connecting to themselves (painful own experience), the HTTP header has to be modified in this case, namely to "http://localhost/..." instead of "http://firewall/...". For this purpose the address of the actual target is compared with all addresses of the network interfaces. The disadvantage of this construction is that no proxy may be entered in the browser; otherwise all requests end at the local web server. Again the rest is simple copying to and fro.

UDP

If you need neither VPNs with ENskip nor additional properties like authentication checks or logging information, a transparent UDP proxy can be emulated through masquerading. The kernel then takes care of everything else. The author had to use ENskip and so could not use this possibility.

With UDP the situation is more complicated than with TCP, because there is no connection. Each packet can stand by itself or be the answer to another packet. Many applications, starting with DNS and traceroute, use UDP. Since you do not know whether a certain packet is the answer to another one, there is no other way than to remember all UDP packets and define expiry times for quasi UDP connections (e.g. DNS request and DNS reply). When an answer packet arrives, the time limit is reset to the maximum value. As with TCP, one process is needed for each observed UDP port (and network interface, if necessary). The proxy process keeps a table that contains the original source and target, the UDP ports involved, the time limit and the file descriptor of the answer socket. When select() reports that a UDP packet has arrived at the bound UDP port, the table is searched for the real target address (the trick here is the undocumented recvfrom(...,MSG_PROXY,...) and reading the kernel sources). If no entry is found, an answer socket is opened; the packet is then delivered through this answer socket. A packet arriving at this socket is given a new header using the information in the table and delivered to the original source.
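
The bookkeeping for such quasi connections can be illustrated with a small table model. This sketch covers only the expiry logic; sockets, select() and MSG_PROXY are left out, and the class UdpSessionTable, its method names and the 30-second default are assumptions for the illustration.

```python
import time


class UdpSessionTable:
    """Remember outgoing UDP packets so that answers can be recognised."""

    def __init__(self, timeout=30.0, clock=time.monotonic):
        self.timeout = timeout   # assumed lifetime of a quasi connection
        self.clock = clock       # injectable so the logic can be tested
        self.sessions = {}       # (client, cport, server, sport) -> expiry

    def outgoing(self, client, cport, server, sport):
        """Record (or refresh) the quasi connection for an outgoing packet."""
        self.sessions[(client, cport, server, sport)] = self.clock() + self.timeout

    def reply(self, server, sport, client, cport):
        """True if the packet answers a live session; resets its time limit."""
        key = (client, cport, server, sport)
        expiry = self.sessions.get(key)
        if expiry is None or expiry < self.clock():
            self.sessions.pop(key, None)
            return False
        self.sessions[key] = self.clock() + self.timeout
        return True

    def expire(self):
        """Drop all sessions whose time limit has passed; return how many."""
        now = self.clock()
        dead = [key for key, expiry in self.sessions.items() if expiry < now]
        for key in dead:
            del self.sessions[key]
        return len(dead)


if __name__ == "__main__":
    table = UdpSessionTable(timeout=10)
    table.outgoing("10.1.2.3", 1025, "192.0.2.53", 53)       # DNS request
    print(table.reply("192.0.2.53", 53, "10.1.2.3", 1025))   # DNS reply
    print(table.reply("192.0.2.99", 53, "10.1.2.3", 1025))   # unknown sender
```

In a real proxy each entry would also carry the answer socket, and expire() would close the sockets of the entries it drops.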

The procedure described here is not the last word and has a number of weak points, among others the large number of processes and files (traceroute!), but it runs very reliably in practice. Suggestions for improvement and flashes of genius are still always welcome.

ICMP

A transparent proxy like those for TCP and UDP cannot be built with the means described so far, because the necessary kernel support is missing. New ideas are therefore needed if we want to work with the ping command, which all customers expect. Fortunately the ipchains command has the option -o, which delivers packets to the netlink device 36,3. Since there is no name for this device, not even in standard documents, I have called it /dev/firewall to simplify matters. Our proxy process accesses this device. First, unlike with TCP and UDP, the packets are not redirected but discarded and at the same time sent to /dev/firewall, that is "-o -j DENY". The proxy process opens this device and reads from it. All packets are written there as they arrive, preceded by a header that specifies the total length. The further procedure is again like that of the transparent UDP proxy: a table is maintained, and source, target, type, code and ID are compared. In the case of a match, a new ICMP packet with adapted source and target is issued by the proxy. Since there is probably more than one route, the correct gateway has to be determined in advance for sendto().
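
For the ping case, the comparison of type, code and ID can be sketched on the standard ICMP echo header. The sketch is restricted to echo request and echo reply and does not touch the netlink device or its extra header, whose layout is not given here; the names build_echo, parse_echo and PingTable are invented for the illustration.

```python
import struct

ECHO_REQUEST, ECHO_REPLY = 8, 0


def icmp_checksum(data):
    """Internet checksum over the ICMP header and payload."""
    if len(data) % 2:
        data += b"\x00"                       # pad to whole 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                        # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo(icmp_type, ident, seq, payload=b""):
    """Build an ICMP echo message: type, code 0, checksum, ID, sequence."""
    header = struct.pack("!BBHHH", icmp_type, 0, 0, ident, seq)
    checksum = icmp_checksum(header + payload)
    return struct.pack("!BBHHH", icmp_type, 0, checksum, ident, seq) + payload


def parse_echo(message):
    """Return (type, code, ID, sequence) from the first eight bytes."""
    icmp_type, code, _checksum, ident, seq = struct.unpack("!BBHHH", message[:8])
    return icmp_type, code, ident, seq


class PingTable:
    """Match echo replies to the client that sent the request."""

    def __init__(self):
        self.pending = {}   # (target, code, ID) -> original source

    def request(self, source, target, message):
        icmp_type, code, ident, _seq = parse_echo(message)
        if icmp_type == ECHO_REQUEST:
            self.pending[(target, code, ident)] = source

    def reply(self, target, message):
        """Original source for an echo reply from target, or None."""
        icmp_type, code, ident, _seq = parse_echo(message)
        if icmp_type != ECHO_REPLY:
            return None
        return self.pending.get((target, code, ident))
```

The sketch ignores expiry and the case of two clients using the same ID towards the same target; a real table would need both.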

The amazing thing is: if all incoming packets are re-routed with "-o -j DENY", you have a monolithic approach, where one single process inspects all packets (also TCP and UDP) and decides what is done to them. This approach is opposed to the layer model, it is called "stateful inspection" and is followed in the firewall of Checkpoint, the market leader at this time, product name Firewall-1.

Other IP protocols

There are a number of interesting protocols, such as SKIP, IPIP, GRE, IGMP and EGP, that deserve a transparent proxy. The multicast protocols and the encapsulation of IPv6 in IPv4 in particular will gain importance in the near future. In principle these protocols can be handled only through direct interpretation of packet contents, with a procedure like that for ICMP.

Web and FTP caching

Even though almost every browser today has an integrated cache, it cannot help when different clients call the same web page. Applications like Squid or Apache, which can be configured as web cache and proxy, are a frequent use for cascaded transparent proxies. Apache has the additional advantage that it has an integrated HTTP server and, with additional modules, can be upgraded to an HTTPS server. This also makes encrypted remote configuration possible, provided the appropriate web pages are available.

Machine A calls an Internet web page at machine B across the firewall. The call (port 80/www) is redirected by REDIRECT to the proxy process (A), which rewrites the HTTP header and addresses the firewall HTTP proxy (B) (Squid or Apache). The HTTP proxy (B) gets the web page, transfers it to the proxy process (A) and also puts it in the local cache. The proxy process (A) performs additional checks. With so-called content filtering, for example, viruses are searched for, JavaScript, Java or DCOM (ActiveX) objects are masked, cookies are blocked, and so on. Then this page (or, in the case of a virus, a replacement page) is given back to the client (machine A).

Use as an FTP cache is simple, too. As a URL, the FTP call is ftp://the.server/path.to.the.file. You write a transparent FTP proxy that handles the FTP protocol up to the moment when a file is requested. Instead of now fetching the file via FTP, it makes an HTTP request to the firewall HTTP proxy, whose cache is thus used automatically. This makes repeated downloads of large files possible without really putting load on the Internet bandwidth.
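
The hand-over from the FTP proxy to the HTTP proxy comes down to asking the HTTP proxy for an ftp:// URL. The following sketch shows one plausible form of that request; the function names and the use of a bare HTTP/1.0 request are assumptions, and the returned data is the raw proxy response including its HTTP headers.

```python
import socket


def proxy_request_for_ftp(server, path):
    """Build the HTTP request that asks a proxy to fetch an FTP file."""
    url = "ftp://%s/%s" % (server, path.lstrip("/"))
    return ("GET %s HTTP/1.0\r\n\r\n" % url).encode("ascii")


def fetch_via_proxy(proxy_addr, server, path):
    """Send the request to the HTTP proxy and return its raw response."""
    chunks = []
    with socket.create_connection(proxy_addr) as conn:
        conn.sendall(proxy_request_for_ftp(server, path))
        while True:
            data = conn.recv(4096)
            if not data:          # the proxy closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    print(proxy_request_for_ftp("the.server", "path.to.the.file"))
```

A transparent FTP proxy would call something like fetch_via_proxy(("127.0.0.1", 3128), ...) when the client requests a file; the address and port are placeholders for wherever the HTTP proxy listens.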

Ethertap

The ipchains command had the option -o to redirect packets to a pseudo device. When they are read, these packets carry an additional header in front of the actual packet. Ethertap devices (/dev/tap0 to /dev/tap15) are interfaces between the IP layer and a device. The network interface tapN can, like any other interface, be given an address, net mask, etc. and be used for routing; the device /dev/tapN can be used like a normal device (which will, however, only process formally correct IP packets). All IP packets that are written to an ethertap device /dev/tapN are issued as packets on the network interface tapN, and all packets that arrive at the network interface tapN can be read from the ethertap device /dev/tapN.

The difference from the ipchains option is that not certain packets of all network interfaces are redirected, but all packets of a certain network interface.

With these devices for example it is possible to redirect routes to the virtual ethertap network interfaces and to realise proxies on this level up to "stateful inspection" on the user level.

Linux Socket Filter

The Berkeley Packet Filter (BPF) was godfather to this mechanism. You open any socket, specify a little piece of program code for filtering on this socket and inject this code into the kernel with a setsockopt() call (SO_ATTACH_FILTER). The program itself then only serves to keep the socket open and to remove the filter again. This method is surely the fastest kind of filtering and useful for high throughput. Its disadvantage lies in programming at kernel level, with poor possibilities for debugging and tracing and the risk of a total failure of the firewall.

/proc File System

Using the pseudo file system /proc, system variables can be queried or changed during operation without a reboot or recompilation. The variables can be queried with cat filename and, if allowed, set with echo 'value' >filename. Important values can be found e.g. in /proc/net/ and /proc/sys/net/ipv4. With these the system can be tuned very finely as a firewall, for example with one forward option per interface (for classic firewall approaches with a secure administrative interface). For known attacks like SYN flooding, IP spoofing or source routing, the level of security and the talkativeness of the kernel can be set through these variables for each network interface. The acceptance of known router protocols, which can be a point of attack on firewalls, can be restricted to certain gateways or denied entirely.

Documentation for the individual files can be found in the subdirectory Documentation/networking of the Linux source.
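
Reading and setting such variables from a program is ordinary file access. The sketch below assumes a handful of entries under /proc/sys/net/ipv4 and an external interface called eth1; which entries exist depends on the kernel configuration, the chosen values are examples rather than recommendations, and writing requires root.

```python
import os

PROC_ROOT = "/proc/sys/net/ipv4"

# Example values only; names and interface are assumptions for this sketch.
EXAMPLE_SETTINGS = {
    "ip_forward": 1,                      # act as a router
    "tcp_syncookies": 1,                  # defence against SYN flooding
    "conf/eth1/rp_filter": 1,             # source address check (IP spoofing)
    "conf/eth1/accept_source_route": 0,   # refuse source-routed packets
    "conf/eth1/log_martians": 1,          # log packets with impossible addresses
}


def read_setting(name, root=PROC_ROOT):
    """Equivalent of: cat filename"""
    with open(os.path.join(root, name)) as handle:
        return handle.read().strip()


def write_setting(name, value, root=PROC_ROOT):
    """Equivalent of: echo 'value' >filename"""
    with open(os.path.join(root, name), "w") as handle:
        handle.write("%s\n" % value)


def apply_settings(settings, root=PROC_ROOT):
    """Write every entry of a settings dictionary below root."""
    for name, value in settings.items():
        write_setting(name, value, root)


if __name__ == "__main__":
    print(read_setting("ip_forward"))
```

The root parameter exists so that the functions can be tried against an ordinary directory before they are pointed at the live /proc tree.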

Traffic Shaper and Scheduling

A restriction of bandwidth can be realised with the traffic shaper. Its functioning is interface oriented, and cannot differentiate services or source/target pairs. It is in the alpha stage and only usable in a limited way. The development seems to stagnate at the moment.

At the moment the focus of development is on scheduling algorithms, that is, on the behaviour when more packets are to be sent than fit the available bandwidth. Here you can choose from a number of algorithms. With the ipchains command (as algorithm), a mark can be applied to the packet, which is summed up for each rule met. The higher this value, the more likely the packet is processed first.

VPN

With the module ENskip, a Sun SKIP compatible clone from the ETH Zurich, the tunnelling utilities developed by the routing guru Alexey Kuznetsov, and several patches from the author, VPNs (Virtual Private Networks) can be realised, even with dynamic dial-in addresses. VPNs are networks that use the Internet, secured by encryption, to build a company-wide, uniform network between locations. Instead of expensive leased lines, only local charges for the connection to the local Internet provider are due, and the Internet can be used as well. The target address decides whether a packet is encrypted and sent to a target firewall (to be unwrapped there and conveyed further) or goes on to the Internet. By using standards (IPsec), Linux can interoperate with other manufacturers here, too. A description of how VPN works with a Linux firewall, and which packet filter rules have to be specified, would go beyond the scope of this article and provide enough subject matter for another one ( :-) ). The valued reader can request a LinuxWall demo CD from the author, where the technical details are described.

The Enemy from Within: Observations during NT Setup

One day, while NT systems were being installed for a customer, illegal packets were reported in the firewall log. It turned out that each system being installed sent exactly three SNMP trap packets to a machine on the Internet. The attempt to resolve the target address to a name led nowhere. Traceroute eventually failed at a firewall that did not let us through. The machine did not answer a ping. Analysis of the packets yielded no result, either. Was this coincidence, or is there yet another (of course unintentional) registration by micro-soft behind it? The author would appreciate hints of similar experiences.

In principle this incident also proves the importance of not only blocking access from the outside to the inside, but also of regulating access from the inside to the outside. Otherwise information could unintentionally fall into the wrong hands. Against intentional transfer, only pulling the network plug and taking out the floppy drive helps, anyway.

Upshot

Linux is a first-class and secure platform for realising firewalls. As in the other areas where Linux is successful, everything you need will work, but there is no click-and-go user interface that makes configuration somewhat easier. Here commercial products have the edge, even if they do not offer everything that Linux is able to achieve. Full NAT would come in handy; work on it began in the latest kernel versions, from 2.2.3 on. The main criticism is the isolation of the various solution approaches: there is no co-ordinating redesign that would also include other protocols (e.g. IPv6). There is still much work to do in the kernel for tunnelling and ENskip. The author would gladly give technical assistance and coaching to academic projects and diploma theses.

Further Literature and References

By studying parts of the Linux source code, even if I did not understand everything, I was able to realise quite a few parts of the firewall, and I learned many things. The subdirectory Documentation, although brief, is one of the most important sources of information. The HOWTOs of the Linux Documentation Project are the next level of information. For further particulars and links on firewalls, especially Linux firewalls, we refer you to the homepage of the author, http://www.linuxwall.de.

The Author

Frank Bernard is a graduate computer scientist and has been working with firewalls for several years. He has been a Linux fan and enthusiast since kernel version 0.99.12 (1992) and wants to furnish every machine with a "real" operating system. He works freelance and has developed the LinuxWall. He can be contacted at frankb@fbit.de.