Firewall Configuration Prerequisites
By Jay Beale, Lead Developer, Bastille Linux Project (email@example.com), Principal Consultant, JJB Security Consulting and Training. (C) 2000, Jay Beale
A few weeks ago, I wrote a piece on the improvements in Linux firewalling. In preparing to write the follow-up article showing you how to actually build a SOHO firewall with Linux 2.4, I realized that firewall configuration requires at least a minimal understanding of TCP/IP fundamentals. This article, then, is a prerequisite for next Monday's follow-up. We'll look at firewall placement, default deny/allow policies, and very basic TCP/IP fundamentals. Finally, while the follow-up will be fairly detail-oriented, this article should be useful and accessible to everyone from the system administrator to the network administrator to the CTO.
With that said, let's start to consider the issues. First, placement:
Where Do I Put the Firewall?
Before we get into the firewall design, we should consider where this box fits into your network architecture. A firewall generally serves as a point of connection between two networks. It's often called a "choke point" because it stops selected data from crossing over from one network to another. In most SOHO environments, the two networks involved are: 1) the Internet and 2) your company's internal network.
In any environment even slightly more complex, you might place these firewalls in additional locations: between your financial/HR network and the rest of your company's networks, perhaps. You might also use a single firewall to mediate connections between three or more networks. Many companies like a "DMZ" situation, where the firewall mediates connections between 1) their internal network, 2) their public network of web servers, mail servers, and DNS servers and 3) the Internet. This article by Kurt Seifried discusses this "placement" decision in greater detail. My SOHO firewall will cover the simplest case, because it is the most common in a SOHO environment, but it is good to keep the others in mind.
So, given a knowledge of where you're placing your firewall, you now need to talk about the general firewall policy decision: default allow or default deny.
Getting Into the Design
Every firewall ruleset design begins with a decision between two basic security stances: default deny and default allow. Under default allow, you allow all types of traffic that you don't specifically block. This is often the stance of choice for universities and other extremely open and flexible environments. At the potential cost of weaker security, you gain a huge amount of flexibility. It also models the kinds of interactions that your employees/bosses are most accustomed to: there is a set of rules that cannot be broken, but all other communication/behavior is permitted. Now, what do I mean by weaker security? Well, you can only write a rule against a network application, vulnerability or attack after you learn about it, and until then you are exposed. That opens you up to windows of vulnerability that you might not be happy with. On the other hand, you can look at default deny.
Default deny lends greater security, at the cost of flexibility and your time. I prefer this latter type for a secure environment, like a server farm. For a large network of users, it can prove very, very difficult to get right. Expect that you will "break" application access for someone in your first trial run of any default deny firewall. The problem is this: you don't yet know about every network-accessible program that your organization uses. Here's an example: when I designed my first large-scale firewall, I didn't realize we were using Instant Messaging clients for communication between sysadmins. This was caught in a group review of the firewall before we put it in place. These kinds of misses are very, very common and quite expected - this is why you have group review of firewall policies under default deny!
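To make the two stances concrete, here is a minimal Python sketch of a first-match rule evaluator. It is an illustration only, not how any particular firewall is implemented: the Rule class, the decide function and the sample rulesets are hypothetical names invented for this example. The only thing that changes between the two stances is what happens when no rule matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule format, invented for this sketch."""
    action: str               # "allow" or "deny"
    protocol: str             # "tcp" or "udp"
    dest_port: Optional[int]  # None matches any destination port


def decide(rules, default_policy, protocol, dest_port):
    """Return the action of the first matching rule, else the default policy."""
    for rule in rules:
        if rule.protocol == protocol and rule.dest_port in (None, dest_port):
            return rule.action
    return default_policy


# Default allow: the ruleset lists only what is forbidden.
default_allow_rules = [Rule("deny", "tcp", 23)]

# Default deny: the ruleset lists only what is permitted.
default_deny_rules = [Rule("allow", "tcp", 25), Rule("allow", "tcp", 80)]

# A service nobody thought about (here TCP port 6667) passes under
# default allow and is silently dropped under default deny.
forgotten_under_allow = decide(default_allow_rules, "allow", "tcp", 6667)
forgotten_under_deny = decide(default_deny_rules, "deny", "tcp", 6667)
```

The forgotten service shows the trade-off: under default allow the miss is a security gap, while under default deny it is a broken application that someone will report.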
The best way to understand the difference between these two policies is this: in "default allow," you tend to err on the side of forgetting to filter against a given attack. In "default deny," your errors tend to create temporary breaks in functionality. So, which should you choose? It really depends on your organization. If you have very high security needs, you'll go with default deny and accept the time investment. If you have mostly networks of users with high flexibility requirements, you'll choose default allow. If your organization has reached the right size or level of complexity, you'll apply different policies to different parts of an organization. The server farm will have a default deny policy, while the users will have a default allow policy. This allows maximum flexibility for users at their desks, while tightly locking down the servers. As a side question, by the way, let's talk about Rule Direction.
Most of the time, people talk about filtering out incoming packets, but never consider filters on outgoing packets. That is, they allow all outgoing connections from their "trusted" internal network to the Internet, but filter the connections that are initiated by outside Internet hosts. There are several important reasons to filter outgoing packets/connections:
Set policy, e.g. "our employees cannot connect to external telnet servers."
Break exploits that initiate a new connection with the attacker's machine
Stop trojan horse-type programs
We'll look at this topic more when we create our SOHO firewall. For now, let's talk a little bit about the Internet protocols, so we have a better idea about how to restrict them.
Understanding Data Protocols: TCP, UDP, ICMP
This material is very basic, as it's intended to introduce the concept of packets and connections. It should make a good introduction to someone who has never considered these protocols before. At the same time, it serves as a good "refresher course" to read over before constructing a packet filtering ruleset.
OK, so a communication between two hosts on the Internet breaks the data into "bite-sized" chunks called packets. The size of the "bite" mostly depends on the characteristics of the intervening wire. Each packet has a small section at the beginning, called a header, which tells the routers and network computers which computer it's destined for. Further in, it also carries information that tells the destination computer what program the data is destined for.
The Internet Protocol, the IP in TCP/IP, is responsible for the former - it is used simply to get a packet from one computer to another. The headers carry very little specific data. The only data in the IP header that's important for most filtering is the source and destination IP address.
Now, what about the TCP part? Well, as we said, IP just gets information, via packets, from one computer to another. To get data from one program on one computer to another program on a second computer, we need more. TCP and UDP are transport-level protocols that ride along inside IP packets. How? Well, through the magic of encapsulation, the TCP (or UDP or ICMP) header is placed at the start of the data section of the IP packet, so the packet on the wire reads: IP header, then TCP header, then the application's data.
These protocols provide extra capabilities and establish the possibility for a connection between two programs rather than just two computers. UDP is used for sending short messages, with no real guarantees, while TCP maintains an entire connection, complete with error detection, packet re-ordering and retransmission of missed or corrupted packets. But how does the computer use these protocols to facilitate communications between pairs of programs, rather than pairs of computers?
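The nesting described above can be sketched in a few lines of Python with the standard struct module. This is a simplified illustration under stated assumptions: the helper names build_packet and parse_packet are invented here, the checksums and most header fields are left at zero, and the result is not a packet you could put on a real network. It only shows that the TCP header sits inside the data section of the IP packet, and that a filter reads the addresses from one header and the ports from the other.

```python
import socket
import struct


def build_packet(src_ip, dst_ip, src_port, dst_port, payload):
    """Build a toy IPv4 packet carrying a TCP segment (checksums left at 0)."""
    # 20-byte TCP header: ports, seq, ack, data offset, flags, window,
    # checksum, urgent pointer. Only the ports matter for this sketch.
    tcp_header = struct.pack("!HHIIBBHHH",
                             src_port, dst_port, 0, 0, 5 << 4, 0, 0, 0, 0)
    tcp_segment = tcp_header + payload
    # 20-byte IPv4 header: version/length, TOS, total length, ID,
    # flags/fragment, TTL, protocol, checksum, source, destination.
    ip_header = struct.pack("!BBHHHBBH4s4s",
                            (4 << 4) | 5, 0, 20 + len(tcp_segment), 0, 0,
                            64, socket.IPPROTO_TCP, 0,
                            socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    # Encapsulation: the whole TCP segment is the IP packet's data section.
    return ip_header + tcp_segment


def parse_packet(packet):
    """Pull out the fields a packet filter usually cares about."""
    ip_header_len = (packet[0] & 0x0F) * 4   # header length in bytes
    src_port, dst_port = struct.unpack(
        "!HH", packet[ip_header_len:ip_header_len + 4])
    return {
        "protocol": packet[9],
        "src_ip": socket.inet_ntoa(packet[12:16]),
        "dst_ip": socket.inet_ntoa(packet[16:20]),
        "src_port": src_port,
        "dst_port": dst_port,
    }


packet = build_packet("192.168.1.10", "10.0.0.5", 40000, 23, b"hi")
fields = parse_packet(packet)
```

The addresses and port numbers are arbitrary sample values chosen for the illustration.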
Well, TCP and UDP both accomplish this with a port number. This 16-bit number identifies the sending and receiving programs on each machine. The operating system allows each program to check out, or "bind to," one or more of these port numbers and then keeps track of which program has bound to which port, to make sure that the right data gets to the right program.
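As a small illustration of "checking out" a port, the following Python sketch asks the operating system for two ports and reports what it handed back. The function name bind_two_sockets is made up for this example. Binding to port 0 is the standard sockets way of saying "pick any free port for me", and the loopback address 127.0.0.1 is used so that nothing is exposed to the network.

```python
import socket


def bind_two_sockets():
    """Bind two TCP sockets to OS-chosen ports and return both port numbers."""
    sockets = []
    ports = []
    try:
        for _ in range(2):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.bind(("127.0.0.1", 0))          # port 0: let the OS choose
            sockets.append(sock)
            ports.append(sock.getsockname()[1])  # the port we were given
    finally:
        for sock in sockets:
            sock.close()
    return ports


ports = bind_two_sockets()
```

Because both sockets are held open at the same time, the operating system has to hand out two different port numbers. That bookkeeping is what lets it deliver each incoming packet to the right program.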
OK, so what is most critical for packet filtering is the following basic fact.
Each TCP or UDP connection is uniquely defined by the following four numbers:
source IP address (source computer)
source port (the program on said source computer)
destination IP address (destination computer)
destination port (the program on said destination computer)
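A quick way to see why these four numbers matter is to treat them as a lookup key. The Python sketch below is only an illustration: the Connection tuple, the reverse helper and the sample addresses are assumptions made for this example. It shows two telnet sessions between the same pair of machines staying distinct because their source ports differ.

```python
from collections import namedtuple

# Hypothetical record type for this sketch: the four identifying numbers.
Connection = namedtuple("Connection", "src_ip src_port dst_ip dst_port")


def reverse(conn):
    """Return the key that reply packets for this connection will carry."""
    return Connection(conn.dst_ip, conn.dst_port, conn.src_ip, conn.src_port)


first = Connection("192.168.1.10", 40000, "10.0.0.5", 23)
second = Connection("192.168.1.10", 40001, "10.0.0.5", 23)

# Same two computers, same server program, but two separate connections.
table = {first: "telnet session 1", second: "telnet session 2"}

reply = reverse(first)
```

Note that replies travel with source and destination swapped, which is why a filter has to think about both directions of a connection.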
Now, certain kinds of connections use "statically allocated" ports. For example, the telnet server basically binds to port 23 on its host machine. This means that a telnet session always looks like this: a TCP connection between two computers, from some port on the client (usually in the range 1024-65535) to port 23 on the server. Because of this fact, we can block all outgoing client connections to external telnet servers by simply blocking all outgoing TCP packets with destination port 23. We can block all incoming client connections to our internal telnet servers in a similar manner - block all incoming TCP packets with destination port 23.
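The telnet rule just described can be written down as a one-line test. This Python sketch is hypothetical (the packet dictionaries and the is_blocked function are invented for illustration and do not correspond to any firewall's configuration syntax), but it shows why filtering on destination port 23 is enough in either direction.

```python
TELNET_PORT = 23


def is_blocked(packet):
    """Drop any TCP packet addressed to port 23, whichever way it is headed."""
    return packet["protocol"] == "tcp" and packet["dst_port"] == TELNET_PORT


# An internal client trying to reach an external telnet server.
outgoing_request = {"protocol": "tcp", "src_port": 40000, "dst_port": 23}

# An external client trying to reach an internal telnet server.
incoming_request = {"protocol": "tcp", "src_port": 51000, "dst_port": 23}

# A telnet server's reply travels from port 23 to the client's high port,
# so this rule does not match it.
server_reply = {"protocol": "tcp", "src_port": 23, "dst_port": 40000}

# An ordinary web request is untouched.
web_request = {"protocol": "tcp", "src_port": 40002, "dst_port": 80}
```

The rule never matches the server's replies, and it does not need to: the client's very first packet is dropped, so no connection is established and no replies are ever sent.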
Here are some other common server-side ports:
FTP - TCP ports 20 and 21
SSH - TCP port 22
SMTP (E-Mail between computers) - TCP port 25
DNS - TCP and UDP port 53
HTTP (Web) - TCP port 80
POP (remote E-mail retrieval) - TCP port 110
IMAP (remote E-mail retrieval) - TCP port 143
HTTPS (Encrypted Web) - TCP port 443
EXEC (rexec) - TCP port 512
LOGIN (rlogin) - TCP port 513
SHELL (rsh) - TCP port 514
IRC (Internet Relay Chat) - TCP port 6667
Now, what do we use that long (yet far from comprehensive) port list for? Well, we use it to block people on one side of the firewall from making a particular kind of connection to a machine on the other side of the firewall. We've seen how to block telnet connections and we can use the above port list to block others. There are some exceptions, in that some protocols don't follow this traditional "client initiates connection with server, some high port on the client to some fixed low port on the server" model. FTP is the most notable exception.
FTP has many problems. One of those is that it can be difficult to firewall with non-stateful filters, like most routers and Linux 2.0-2.2 machines. Here's why. When an FTP client connects to an FTP server, the first part of the connection is quite standard. The client binds to a high port (1024-65535) and initiates a connection to the FTP server, which is bound to and listening on port 21.
What's strange now is that all data is sent back via a second connection! In "active" mode, which most command-line FTP clients use by default, it works like this:
The FTP client binds to a high port, which we'll call the client data port, as determined by the Operating System. It communicates this port number via the primary "control" connection (client:high -> server:21) and then waits. The FTP server then opens a connection from port 20 back to the client data port and transmits data on this second connection.
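For the curious, the way the client "communicates this port number" is the FTP PORT command, which writes the client's IP address and data port as six comma-separated numbers: four for the address and two for the port (high byte, then low byte). The Python sketch below encodes and decodes that command. The function names and sample values are assumptions made for this example.

```python
def encode_port_command(ip, port):
    """Format the PORT command a client sends on the control connection."""
    high, low = divmod(port, 256)   # the port is sent as two bytes
    return "PORT {},{},{}\r\n".format(ip.replace(".", ","), high, low)


def decode_port_command(line):
    """Recover (ip, port) from a PORT command line."""
    numbers = [int(n) for n in line.strip().split(" ", 1)[1].split(",")]
    ip = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]
    return ip, port


command = encode_port_command("192.168.1.10", 40000)
decoded = decode_port_command(command)
```

Since the data port travels inside the control connection's data rather than in any header, a filter that looks only at headers cannot know which port the server is about to connect back to.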
This is very strange, if you look at our discussion of ordinary TCP application connections. See, the client is supposed to initiate all connections! The reason this deviation is a problem is that you can normally filter connections based on who is initiating them. With personal computers, you'd normally want to allow them to initiate connections to external machines, since they're running the applications, but not allow external machines to initiate connections to the PCs, which should be running no servers. The primary weakness that comes out of this is that, in a non-stateful firewall, you basically have to allow all TCP connections from outside machines originating from port 20! So, you end up taking more risk on the client-side. Well, there's a partial solution to this, in that you can force everyone's clients to use "passive" mode FTP, which works like this:
The FTP client asks the server for data. The server then binds to a high port, which we'll call the server data port, as determined by the Operating System. It communicates this second port number back to the client, via the primary "control connection" (client:high -> server:21) and then waits. The client opens a data connection from a new high port, which it must request from the operating system, to the server data port. The server then transmits data on this second connection.
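In passive mode, the server announces its data port in its reply to the client's PASV command, a line beginning with the code 227 that carries the same six-number encoding of address and port. Here is a hedged Python sketch of how a client might read that reply. The function name, the regular expression and the sample reply text are illustrative assumptions, since the wording around the numbers varies between servers.

```python
import re

# Six comma-separated numbers: four address bytes, then two port bytes.
SIX_NUMBERS = re.compile(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)")


def parse_pasv_reply(line):
    """Return (ip, port) of the server data port named in a 227 reply."""
    match = SIX_NUMBERS.search(line)
    if not line.startswith("227") or match is None:
        raise ValueError("not a passive mode reply: %r" % line)
    numbers = [int(n) for n in match.groups()]
    ip = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]
    return ip, port


server_data = parse_pasv_reply("227 Entering Passive Mode (10,0,0,5,175,200)")
```

The client then opens its second connection to that address and port, so both connections are initiated from the client's side.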
So, this is more normal. The client is opening that second connection, albeit to an arbitrary high (1024-65535) port on the server. This is better, though it now opens the server up to greater risk. See, now the firewall on the server end has to allow all connections to high ports on the FTP server machines. Now, a knowledgeable admin can reduce this port range, from 1024-65535, to something more manageable like 40,000-45,000, but this still leaves a wide port range that has to be allowed in the server-side firewall. So, is there any hope?
Well, barring killing off FTP, there is. Stateful firewalls can watch the data stream and understand the port negotiation. Unlike non-stateful firewalls, which have to allow every potential port, stateful firewalls can allow through packets destined for the specific additional data port, at the specific "right time" in the connection.
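A toy version of that idea, for active-mode FTP, might look like the Python sketch below. It is a deliberately simplified illustration and not how any real stateful firewall is built: the FtpDataTracker class, its method names and the stateless comparison function are all invented here. The tracker reads PORT commands on the control connection and then admits exactly one matching data connection from port 20.

```python
class FtpDataTracker:
    """Hypothetical sketch of stateful handling for active-mode FTP."""

    def __init__(self):
        # Data connections we are currently waiting for, stored as
        # (server_ip, client_ip, client_data_port).
        self.expected = set()

    def observe_client_command(self, client_ip, server_ip, line):
        """Inspect one line the client sent on the control connection."""
        if not line.upper().startswith("PORT "):
            return
        numbers = [int(n) for n in line.split(" ", 1)[1].strip().split(",")]
        data_ip = ".".join(str(n) for n in numbers[:4])
        data_port = numbers[4] * 256 + numbers[5]
        # Only honour a PORT command that names the client's own address.
        if data_ip == client_ip:
            self.expected.add((server_ip, client_ip, data_port))

    def allow_inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Admit an inbound connection only if it was negotiated, and only once."""
        key = (src_ip, dst_ip, dst_port)
        if src_port == 20 and key in self.expected:
            self.expected.remove(key)
            return True
        return False


def stateless_allow_inbound(src_port):
    """The non-stateful alternative: anything from port 20 gets in."""
    return src_port == 20


tracker = FtpDataTracker()
tracker.observe_client_command("192.168.1.10", "10.0.0.5",
                               "PORT 192,168,1,10,156,64\r\n")
negotiated = tracker.allow_inbound("10.0.0.5", 20, "192.168.1.10", 40000)
replayed = tracker.allow_inbound("10.0.0.5", 20, "192.168.1.10", 40000)
stranger = tracker.allow_inbound("203.0.113.9", 20, "192.168.1.10", 6000)
```

The stateless rule would have admitted the stranger's connection simply because it came from port 20. The tracker admits only the connection that was actually negotiated, and only at the moment it is expected.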
What do you do about things like this? Basically: read up on the protocols that you're trying to firewall. This is much more necessary when you're using non-stateful firewalls, but is always a good idea.
OK, so we've covered firewall placement, default allow/deny stances, and very basic TCP/IP fundamentals. We've also noticed that much of this is aided by good information. You can get more by looking at Kurt Seifried's firewall links page, my upcoming book, or Bob Ziegler's "Linux Firewalls" book. In just a few days, SecurityPortal.com will publish the follow-up article on building a SOHO firewall with Linux 2.4, so watch for it!
Jay Beale is the Security Team Director at MandrakeSoft, creators of Mandrake Linux. He is also the Lead Developer of the Bastille Linux Project, which creates a hardening program which should later do much of what we talk about here. Jay is the author of a number of articles on Unix/Linux security, along with the upcoming book "Securing Linux the Bastille Way," to be published by Addison Wesley. You can learn more about his articles, talks and favorite security links via http://www.bastille-linux.org/jay.