| LaForge | I'm going to talk this evening about netfilter | |
| LaForge | netfilter is the packet filtering / packet mangling / NAT framework of the Linux 2.4 kernel series | |
| LaForge | Some slides and a script for this talk are available at http://www.gnumonks.org/papers/netfilter-lk2000 | |
| LaForge | I expect the audience to be familiar with TCP/IP basics, as well as with ipchains and packet filtering in general | |
| LaForge | in other words: You should know how the 2.2 packet filtering (ipchains) works ;) | |
| LaForge | first a bit of an introduction: | |
| LaForge | What is netfilter? | |
| LaForge | Netfilter is a generalized framework of hooks in the network stack | |
| LaForge | any kernel module can plug into one or more of these hooks and will receive each packet traversing this hook | |
| LaForge | the netfilter hooks are currently implemented for IPv4, IPv6 and DECnet. | |
| LaForge | (i've heard recently that somebody wants to implement them for IPX, too) | |
| LaForge | these hooks are placed in well-chosen points of the protocol stack. | |
| LaForge | The traditional packet filtering, as well as all kinds of network address translation and packet mangling are implemented on top of these hooks. | |
| LaForge | so netfilter is definitely more than a firewalling subsystem - it's a superset of that. | |
| LaForge | The next introductory question is: | |
| LaForge | Why did we need netfilter? | |
| LaForge | Because the old 2.2 code was way too complex | |
| LaForge | it was scattered around the whole IPv4 code | |
| LaForge | there were about 25 places in the IPv4 code, where we had a #ifdef CONFIG_IP_FIREWALL ... #else ... #endif | |
| LaForge | which is quite bad. | |
| LaForge | furthermore, all packet handling had to be done in kernel | |
| LaForge | masquerading was a hack to the packet filtering code | |
| LaForge | and the filtering rules are bound to interface addresses. | |
| LaForge | The 2.2 code was not very extensible, you could only write masquerading helper modules like ip_masq_irc / ip_masq_ftp / ... | |
| LaForge | ... now to the last part of the introduction: | |
| LaForge | Who did netfilter? | |
| LaForge | The main part of netfilter design and implementation was done by Rusty Russell | |
| LaForge | Rusty is also the co-author of ipchains and has been the Linux Kernel Firewall Maintainer for the last few years. | |
| LaForge | He got sponsored for one year to concentrate on the firewalling code - and the result was netfilter | |
| LaForge | Some other people joined him in different stages of the development: Marc Boucher, James Morris and the last one is myself. | |
| LaForge | The 'formal' core team consists of the four of us. Of course there are numerous other contributors, you can see them at http://netfilter.kernelnotes.org/scoreboard.html | |
| LaForge | So let's begin the main part of this presentation: | |
| LaForge | PART 1 - netfilter basics | |
| LaForge | I was talking about these hooks at particular points in the network stack. | |
| LaForge | I'm going to concentrate on IPv4, as this seems to be the most important case :) | |
| LaForge | --->[1]--->[ROUTE]--->[3]--->[4]---> | |
| LaForge |               |            ^ | |
| LaForge |               |            | | |
| LaForge |               |         [ROUTE] | |
| LaForge |               v            | | |
| LaForge |              [2]          [5] | |
| LaForge |               |            ^ | |
| LaForge |               |            | | |
| LaForge |               v            | | |
| LaForge | ||
| LaForge | on the left hand, you have incoming packets, coming from the network | |
| LaForge | on the right hand, outgoing packets are leaving to the network | |
| LaForge | on the bottom of the picture is our local machine, the local userspace processes. | |
| LaForge | the 5 hooks are called: | |
| LaForge | 1 NF_IP_PRE_ROUTING | |
| LaForge | 2 NF_IP_LOCAL_IN | |
| LaForge | 3 NF_IP_FORWARD | |
| LaForge | 4 NF_IP_POST_ROUTING | |
| LaForge | 5 NF_IP_LOCAL_OUT | |
| LaForge | so let's look at the path a packet takes while being forwarded by our machine: | |
| LaForge | First it comes off the wire and passes hook #1 (pre_routing). The routing decision is made, | |
| LaForge | it passes hook #3 (forward), passes hook #4 (post_routing) and leaves off to the network again. | |
| LaForge | If we look at packets which have a local destination (are locally terminated and not routed), they take the following path: | |
| LaForge | packet comes off the wire | |
| LaForge | packet hits hook #1 (pre_routing) | |
| LaForge | routing decision decides that packet is local | |
| LaForge | packet hits hook #2 (local_in) | |
| LaForge | packet hits local process | |
| LaForge | ||
| LaForge | If we look at a locally-originated packet: | |
| LaForge | packet is generated by local process at the bottom | |
| LaForge | packet hits hook #5 (local_out) | |
| LaForge | routing code decides where to route the packet | |
| LaForge | packet passes hook #4 (post_routing) | |
| LaForge | packet hits the wire of the network | |
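The three paths above can be modelled in a few lines of Python. This is only an illustrative sketch, not kernel code: the hook names come from the talk, while the function `hooks_traversed` and its two boolean arguments are invented here. It also ignores the loopback case, where a locally generated packet is delivered back to the same machine.

```python
from __future__ import annotations

# Toy model of the five IPv4 netfilter hooks; the names follow the talk.
PRE_ROUTING = "NF_IP_PRE_ROUTING"    # hook 1
LOCAL_IN = "NF_IP_LOCAL_IN"          # hook 2
FORWARD = "NF_IP_FORWARD"            # hook 3
POST_ROUTING = "NF_IP_POST_ROUTING"  # hook 4
LOCAL_OUT = "NF_IP_LOCAL_OUT"        # hook 5


def hooks_traversed(locally_generated: bool, destination_is_local: bool) -> list[str]:
    """Return the hooks a packet passes, in order (hypothetical helper)."""
    if locally_generated:
        # local process -> hook 5 -> routing -> hook 4 -> wire
        return [LOCAL_OUT, POST_ROUTING]
    if destination_is_local:
        # wire -> hook 1 -> routing says "local" -> hook 2 -> local process
        return [PRE_ROUTING, LOCAL_IN]
    # wire -> hook 1 -> routing says "forward" -> hook 3 -> hook 4 -> wire
    return [PRE_ROUTING, FORWARD, POST_ROUTING]


if __name__ == "__main__":
    print("forwarded:         ", hooks_traversed(False, False))
    print("locally terminated:", hooks_traversed(False, True))
    print("locally originated:", hooks_traversed(True, False))
```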
| LaForge | (btw: i'd like to handle questions after the talk, this way i can concentrate on the talk itself...) | |
| LaForge | (anyway, you can collect the questions at #qc, if you want) | |
| LaForge | Now we know how packets traverse the netfilter hooks | |
| LaForge | As I said, any kernel module may register on one or more of these hooks, and a callback-function is called for each packet passing this particular hook | |
| LaForge | the module may then return a verdict about the packet's future: | |
| LaForge | NF_ACCEPT = continue traversal as normal | |
| LaForge | NF_DROP = drop the packet silently, do not continue | |
| LaForge | NF_STOLEN = I (as the hook-registered module) have taken over the packet, do not continue | |
| LaForge | NF_QUEUE = enqueue packet to userspace (i'm going to say more about this later) | |
| LaForge | NF_REPEAT = please call this hook again | |
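A rough way to picture how the verdicts steer traversal is the dispatch loop below. It is a simplified sketch in Python rather than the kernel's C implementation: the `Verdict` enum, `run_hook`, the `max_repeats` safety cap and the example callback `drop_telnet` are all hypothetical names introduced for illustration, and the packet is just a dictionary.

```python
from enum import Enum
from typing import Callable, Dict, List


class Verdict(Enum):
    ACCEPT = "NF_ACCEPT"   # continue traversal as normal
    DROP = "NF_DROP"       # drop silently, do not continue
    STOLEN = "NF_STOLEN"   # the callback took over the packet
    QUEUE = "NF_QUEUE"     # hand the packet to userspace
    REPEAT = "NF_REPEAT"   # call this callback again


HookFn = Callable[[Dict], Verdict]


def run_hook(callbacks: List[HookFn], packet: Dict, max_repeats: int = 8) -> Verdict:
    """Call each registered callback in order until one stops the traversal."""
    for callback in callbacks:
        verdict = callback(packet)
        repeats = 0
        while verdict is Verdict.REPEAT and repeats < max_repeats:
            verdict = callback(packet)
            repeats += 1
        if verdict is Verdict.REPEAT:
            # Safety cap of this sketch only, so a callback cannot loop forever.
            return Verdict.DROP
        if verdict is not Verdict.ACCEPT:
            return verdict  # DROP, STOLEN or QUEUE end the traversal here
    return Verdict.ACCEPT


def drop_telnet(packet: Dict) -> Verdict:
    """Example callback: drop anything addressed to port 23."""
    return Verdict.DROP if packet.get("dport") == 23 else Verdict.ACCEPT


if __name__ == "__main__":
    print(run_hook([drop_telnet], {"dport": 23}))
    print(run_hook([drop_telnet], {"dport": 80}))
```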
| LaForge | packet filtering / NAT / packet mangling is implemented using IP tables on each of these netfilter hooks. | |
| LaForge | IP TABLES: | |
| LaForge | IP tables are tables of rules, which a packet traverses from top to bottom | |
| LaForge | each rule in an IP table consists of matches, which specify what the packet must look like if it is to match this rule | |
| LaForge | and one target, which tells us what to do if this particular rule matches. | |
| LaForge | IP tables are implemented as a reusable component - in fact, netfilter itself currently uses three instances of IP tables. | |
| LaForge | But any other kernel module may also use IP tables (for example as an IPsec SPDB) | |
| LaForge | The three tables implemented in netfilter itself are: 'filter', 'nat' and 'mangle' | |
| LaForge | Connection tracking: | |
| LaForge | Connection tracking is another part, which is implemented on top of the netfilter hooks. | |
| LaForge | conntrack enables us to do stateful firewalling. That is: Decide upon the fate of a packet not only by data from this packet, but also by information about the state of the connection the packet belongs to. | |
| LaForge | i'm going to say more about connection tracking later. | |
| LaForge | First I want to talk about the three IP tables: | |
| LaForge | PART II - packet filtering | |
| LaForge | Packet filtering is implemented using the three hooks NF_IP_LOCAL_IN | |
| LaForge | NF_IP_FORWARD and NF_IP_LOCAL_OUT | |
| LaForge | each packet passes only one of these three hooks: | |
| LaForge | locally originated packets traverse only NF_IP_LOCAL_OUT | |
| LaForge | locally terminated packets traverse only NF_IP_LOCAL_IN | |
| LaForge | and forwarded packets traverse only NF_IP_FORWARD | |
| LaForge | the 'filter' table connects one chain to each of these three hooks: | |
| LaForge | NF_IP_LOCAL_IN = INPUT chain | |
| LaForge | NF_IP_LOCAL_OUT = OUTPUT chain | |
| LaForge | NF_IP_FORWARD = FORWARD chain | |
| LaForge | (the names are the same as in 2.2 - only uppercase) | |
| LaForge | but BE AWARE: the behaviour which packet traverses which chain has changed from the 2.2 behaviour | |
| LaForge | i.e. a forwarded packet only hits the FORWARD chain, _not_ INPUT and OUTPUT also | |
| LaForge | to know how we insert filtering rules in the chains of the 'filter' table, we have to examine the IP tables a bit further | |
| LaForge | As I said, the IP tables are implemented very generically, so there's one userspace tool, which is able to configure/modify all kinds of tables/chains | |
| LaForge | each rule in a chain consists of | |
| LaForge | - match(es) which specify things like source address, destination address, port numbers, ... | |
| LaForge | - target (what to do if this rule matches) | |
| LaForge | To configure these rules, we have the tool called 'iptables' | |
| LaForge | I'm going to explain some of the iptables commands: | |
| LaForge | To fully specify an iptables command, we need the following information: | |
| LaForge | - which table to work on | |
| LaForge | - which chain in this table to use | |
| LaForge | - the operation (append, insert, delete, modify, ...) | |
| LaForge | - at least one match | |
| LaForge | - and exactly one target | |
| LaForge | the syntax is something like: | |
| LaForge | iptables -t table -Operation chain -j target match(es) | |
| LaForge | to give a very basic example: | |
| LaForge | iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp | |
| LaForge | which -A(ppend)s a rule to the INPUT chain of the 'filter' table | |
| LaForge | and the rule itself ACCEPTs all tcp packets which have a destination port of 25 (smtp) | |
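To make the "rules are traversed from top to bottom, first match decides" idea concrete, here is a minimal Python sketch. It is not how iptables is implemented: `Rule`, `rule_matches` and `traverse` are hypothetical names, packets are plain dictionaries, only terminating targets are modelled (no LOG, no user-defined chains), and the chain policy used when nothing matches is passed in as a parameter.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Rule:
    matches: Dict[str, object]  # e.g. {"proto": "tcp", "dport": 25}
    target: str                 # e.g. "ACCEPT" or "DROP"


def rule_matches(rule: Rule, packet: Dict[str, object]) -> bool:
    """A rule matches only if every one of its matches agrees with the packet."""
    return all(packet.get(key) == value for key, value in rule.matches.items())


def traverse(chain: List[Rule], packet: Dict[str, object], policy: str = "ACCEPT") -> str:
    """Walk the chain top to bottom; the first matching rule supplies the target."""
    for rule in chain:
        if rule_matches(rule, packet):
            return rule.target
    return policy  # nothing matched: fall back to the chain policy


# Rough equivalent of: iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
INPUT = [Rule({"proto": "tcp", "dport": 25}, "ACCEPT")]

if __name__ == "__main__":
    print(traverse(INPUT, {"proto": "tcp", "dport": 25}, policy="DROP"))
    print(traverse(INPUT, {"proto": "tcp", "dport": 80}, policy="DROP"))
```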
| LaForge | now we have to know what matches and targets we have available | |
| LaForge | as targets, we have : | |
| LaForge | ACCEPT - accept the packet | |
| LaForge | DROP - silently drop the packet (this is the 2.2 DENY) | |
| LaForge | QUEUE - queue the packet to a userspace process | |
| LaForge | RETURN - return to previous (calling) chain | |
| LaForge | foobar - jump to a user-defined chain | |
| LaForge | REJECT - drop the packet and inform the sender about it | |
| LaForge | LOG - log the packet via syslog, continue traversal | |
| LaForge | ULOG - send the packet to a userspace logging process | |
| LaForge | MIRROR - swap source/destination IP and resend the packet (for testing purposes) | |
| LaForge | now the available matches: | |
| LaForge | -p protocol (tcp/udp/icmp/...) | |
| LaForge | -s source address | |
| LaForge | -d destination address | |
| LaForge | -i incoming interface | |
| LaForge | -o outgoing interface | |
| LaForge | --dport destination port | |
| LaForge | --sport source port | |
| LaForge | --state (NEW,ESTABLISHED,RELATED,INVALID) (i'm coming back to that) | |
| LaForge | --mac-source source MAC address | |
| LaForge | --mark nfmark value | |
| LaForge | --tos TOS value of the packet | |
| LaForge | --ttl ttl value of the packet | |
| LaForge | --limit (limit the rate of this packet to a certain amount of pkts/timeframe) | |
| LaForge | ||
| LaForge | knowing about the matches and targets, you are now able to configure your packet filter. | |
| LaForge | I'm coming back to the connection tracking stuff | |
| LaForge | this is a real advantage of the new 2.4 code: | |
| LaForge | stateful firewalling | |
| LaForge | the connection tracking code keeps track of all current connections going through our router/firewall | |
| LaForge | each packet is assigned one of the state values: | |
| LaForge | NEW (packet would establish a new connection, if we let it pass) | |
| LaForge | ESTABLISHED (packet is part of an already established connection) | |
| LaForge | RELATED (packet is somehow related to an already established connection) | |
| LaForge | INVALID (packet is multicast or something else where we really don't know what it is) | |
| LaForge | so now we could do something like: | |
| LaForge | iptables -A FORWARD -j ACCEPT -m state --state ESTABLISHED,RELATED | |
| LaForge | which lets through only packets that belong to an already established connection, plus the related ones. | |
| LaForge | if we now block all NEW packets from the 'outer' interface (internet) | |
| LaForge | and allow NEW packets from the inside interface, we'll have the basic config of most firewalls | |
| LaForge | so how does this differ from blocking packets which have the SYN flag set? | |
| LaForge | connection tracking is generic and currently handles TCP, UDP and ICMP | |
| LaForge | so for example we don't accept icmp echo replies, if we didn't send an icmp echo request before | |
| LaForge | the connection tracking is extensible in two ways: | |
| LaForge | - application helper modules (like ip_conntrack_ftp, ip_conntrack_irc) for specific protocols | |
| LaForge | - protocol helper modules (for tracking the state of other protocols than tcp/udp/icmp) | |
| LaForge | the ip_conntrack_ftp module for example marks all incoming ftp data connections as RELATED | |
| LaForge | now we can do active ftp through a firewall which doesn't have to accept all connections to internal ip's with ports > 1024 anymore! | |
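The basic firewall policy described above (accept ESTABLISHED, accept NEW only from the inside) can be sketched as follows. This is a heavily simplified Python model, not the real conntrack code: `ToyConntrack` and `forward` are hypothetical names, the interface name `eth1` for the inside network is an assumption, and RELATED/INVALID states, helpers and timeouts are left out entirely.

```python
from typing import Set, Tuple

# (protocol, source ip, destination ip, source port, destination port)
Flow = Tuple[str, str, str, int, int]


class ToyConntrack:
    """Remembers flows that were let through and recognises their replies."""

    def __init__(self) -> None:
        self._flows: Set[Flow] = set()

    @staticmethod
    def _reverse(flow: Flow) -> Flow:
        proto, src, dst, sport, dport = flow
        return (proto, dst, src, dport, sport)

    def state(self, flow: Flow) -> str:
        if flow in self._flows or self._reverse(flow) in self._flows:
            return "ESTABLISHED"
        return "NEW"

    def confirm(self, flow: Flow) -> None:
        self._flows.add(flow)


def forward(ct: ToyConntrack, flow: Flow, in_iface: str, inside_iface: str = "eth1") -> str:
    """Accept known connections, and new ones only if they start on the inside."""
    if ct.state(flow) == "ESTABLISHED":
        return "ACCEPT"
    if in_iface == inside_iface:
        ct.confirm(flow)  # remember the connection so that replies are accepted
        return "ACCEPT"
    return "DROP"


if __name__ == "__main__":
    ct = ToyConntrack()
    request = ("tcp", "10.0.0.5", "192.0.2.10", 3456, 80)
    reply = ("tcp", "192.0.2.10", "10.0.0.5", 80, 3456)
    print(forward(ct, request, in_iface="eth1"))
    print(forward(ct, reply, in_iface="eth0"))
```

Unlike a plain SYN-flag filter, the same lookup works for any protocol for which a flow tuple can be defined, which is the point made in the talk about UDP and ICMP.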
| LaForge | ok... time for the next part: | |
| LaForge | PART III - NAT | |
| LaForge | in 2.2 we only had the masquerading code, which deals with a special case of NAT (network address translation) | |
| LaForge | in 2.4 we have all kinds of different NAT: | |
| LaForge | SNAT (source address NAT), and MASQUERADE as a special case of that | |
| LaForge | DNAT (destination address NAT), and REDIRECT as a special case | |
| LaForge | source nat is done at the POST_ROUTING hook | |
| LaForge | destination nat is done at the PRE_ROUTING hook | |
| LaForge | i'll begin with a small example of SNAT: | |
| LaForge | iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -o eth0 | |
| LaForge | this will NAT all packets to be sent out on eth0 to the new source address of 1.2.3.4 | |
| LaForge | (it of course does the inverse mapping for the reply packets) | |
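The forward and inverse mapping can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the kernel's NAT engine: `ToySnat`, `outgoing` and `incoming` are hypothetical names, the internal address `10.0.0.5` and the server address `192.0.2.10` are made up, and source-port clashes between two internal hosts are not handled.

```python
from typing import Dict, Optional, Tuple

# (source ip, source port, destination ip, destination port)
Packet = Tuple[str, int, str, int]


class ToySnat:
    """Rewrites the source address on the way out and undoes it for replies."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        # reply packet as seen on the wire -> original internal (ip, port)
        self._replies: Dict[Packet, Tuple[str, int]] = {}

    def outgoing(self, pkt: Packet) -> Packet:
        src, sport, dst, dport = pkt
        # Remember what the matching reply will look like.
        self._replies[(dst, dport, self.public_ip, sport)] = (src, sport)
        return (self.public_ip, sport, dst, dport)

    def incoming(self, pkt: Packet) -> Optional[Packet]:
        original = self._replies.get(pkt)
        if original is None:
            return None  # not a reply to anything we translated
        src, sport, _, _ = pkt
        return (src, sport, original[0], original[1])


if __name__ == "__main__":
    nat = ToySnat("1.2.3.4")
    print(nat.outgoing(("10.0.0.5", 3456, "192.0.2.10", 80)))
    print(nat.incoming(("192.0.2.10", 80, "1.2.3.4", 3456)))
```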
| LaForge | SNAT is useful for NAT cases, where you have a statically assigned IP address. | |
| LaForge | If your outgoing interface has a dynamically assigned IP address, you may use the MASQUERADE target. | |
| LaForge | iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0 | |
| LaForge | is an example for masqing all traffic on interface ppp0. | |
| LaForge | the address to which the packets are nat'ed is the interface address of ppp0 | |
| LaForge | it's always the current address of ppp0, so IP address changes don't need any special handling. | |
| LaForge | The next part is DNAT: | |
| LaForge | small example: | |
| LaForge | iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth0 | |
| LaForge | which NATs all tcp packets coming in through interface eth0 and going to a webserver (port 80) to 1.2.3.4:8080 | |
| LaForge | this is quite useful if you want to do transparent www proxying | |
| LaForge | REDIRECT is a special case of DNAT: | |
| LaForge | iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80 | |
| LaForge | all tcp packets from eth1 going to any webserver on port 80 are redirected to a proxy running on the local machine | |
| LaForge | PART IV - Packet mangling | |
| LaForge | this is something really new, which 2.2.x code didn't have at all | |
| LaForge | the 'mangle' table lets you mangle any arbitrary information inside the packets while they pass our local machine | |
| LaForge | currently we have only three targets implemented: | |
| LaForge | TOS - change the TOS bit field in the header | |
| LaForge | TTL - change the TTL field in the header (increment/decrement/set) | |
| LaForge | MARK - set the packet's skb->nfmark field to a particular value | |
| LaForge | of course you can again use all the matches available for packet filtering and nat. | |
| LaForge | a simple example: | |
| LaForge | iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80 | |
| LaForge | which sets the nfmark field of each packet's skb to 10, if it is tcp and has a destination port of 80 | |
| LaForge | all matches and targets are implemented as separate modules, so you can at any time write new match and/or target modules | |
| LaForge | There are two more 'advanced concepts' of netfilter I want to introduce: | |
| LaForge | - Queuing | |
| LaForge | if you have a rule, which has the target QUEUE, the packet is inserted into a special queue inside netfilter | |
| LaForge | the packets in this queue are transmitted over a netlink socket to a userspace process. | |
| LaForge | this userspace process can now do whatever it wants with the packet (including its data) and re-inject it at exactly the place it came from | |
| LaForge | the process can (of course) also set the verdict of this packet (like: DROP this packet, ACCEPT the other one) | |
| LaForge | this enables people to write some firewalling code in userspace, and (hopefully) keeps the kernel clean from too complex code. | |
| LaForge | - Userspace logging | |
| LaForge | Very similar to queuing, although it is unidirectional | |
| LaForge | if you insert a rule with the ULOG target, the packet is copied and sent through a netlink multicast socket | |
| LaForge | one or more userspace processes may listen to this netlink multicast socket and receive the copy of the packet | |
| LaForge | the userspace process may now gather all information it needs and log it to a logfile/database/whatever | |
| LaForge | we've already implemented ulogd, which is a plugin-extensible logging daemon attaching to the ULOG target | |
| LaForge | So.... we are heading towards the end of my talk.... last chapter: | |
| LaForge | Current development and future: | |
| LaForge | - full TCP sequence number tracking | |
| LaForge | - port more matches/targets to IPv6 | |
| LaForge | - support for more application protocol helpers for NAT (RPC, SMB, SNMP, ...) | |
| LaForge | - more matches (like 'accept all packets as long as the number of connections to this port doesn't rise above N') | |
| LaForge | - multicast support | |
| LaForge | - infrastructure for having conntrack and nat helpers in userspace | |
| LaForge | ||
| LaForge | At the end some useful links: | |
| LaForge | This presentation: | |
| LaForge | http://www.gnumonks.org/papers/netfilter-lk2000 | |
| LaForge | netfilter homepage: http://netfilter.kernelnotes.org | |
| LaForge | links to the mailinglist(s) and the archives, as well as the iptables userspace tool are on the netfilter homepage | |
| LaForge | we also have a bunch of documents you might be interested in: The 2.4 packet filtering howto, the 2.4 NAT howto, the netfilter hacking howto, and some more stuff | |
| LaForge | everything should be linked from the netfilter homepage | |
| Blu3 | Thank you, it was very informative :) | |
| LaForge | Thanks for your interest in this talk... I'll deal with questions now |
