I'm going to talk this evening about netfilter [[BR]] netfilter is the packet filtering / packet mangling / NAT framework of the Linux 2.4 kernel series[[BR]] Some slides and a script for this talk are available at [http://www.gnumonks.org/papers/netfilter-lk2000][[BR]] I expect the audience to be familiar with TCP/IP basics, as well as with ipchains and packet filtering in general[[BR]] in other words: You should know how the 2.2 packet filtering (ipchains) works ;)[[BR]] first a bit of an introduction:[[BR]] What is netfilter?[[BR]] Netfilter is a generalized framework of hooks in the network stack[[BR]] any kernel module can plug into one or more of these hooks and will receive each packet traversing that hook[[BR]] the netfilter hooks are currently implemented for IPv4, IPv6 and DECnet.[[BR]] (I've heard recently that somebody wants to implement them for IPX, too)[[BR]] these hooks are placed at well-chosen points of the protocol stack.[[BR]] The traditional packet filtering, as well as all kinds of network address translation and packet mangling, are implemented on top of these hooks.[[BR]] so netfilter is definitely more than a firewalling subsystem - it's a superset of that.[[BR]] The next introductory question is: [[BR]] Why did we need netfilter?[[BR]] Because the old 2.2 code was way too complex[[BR]] it was scattered around the whole IPv4 code[[BR]] there were about 25 places in the IPv4 code where we had an #ifdef CONFIG_IP_FIREWALL ... #else ... #endif [[BR]] which is quite bad.[[BR]] furthermore, all packet handling had to be done in the kernel[[BR]] masquerading was a hack on top of the packet filtering code[[BR]] and the filtering rules were bound to interface addresses.[[BR]] The 2.2 code was not very extensible, you could only write masquerading helper modules like ip_masq_irc / ip_masq_ftp / ...[[BR]] ... 
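(A toy sketch of the hook idea from the 'What is netfilter?' part above. This is plain userspace Python, not kernel code: the class `HookRegistry`, the hook names "incoming"/"outgoing" and the packet dictionaries are made up for illustration and are not part of netfilter. It only shows the principle that several modules can register on one hook and each of them then sees every packet traversing that hook.)

```python
from collections import defaultdict


class HookRegistry:
    """Hypothetical stand-in for a set of named hooks in a network stack."""

    def __init__(self):
        # hook name -> list of callbacks, kept in registration order
        self._hooks = defaultdict(list)

    def register(self, hook, callback):
        """A 'module' plugs a callback into one hook."""
        self._hooks[hook].append(callback)

    def traverse(self, hook, packet):
        """Hand the packet to every callback registered on this hook."""
        for callback in self._hooks[hook]:
            callback(packet)


seen = []  # records which module saw which packet
registry = HookRegistry()

# Two illustrative modules on the same hook, a third one on another hook.
registry.register("incoming", lambda pkt: seen.append(("filter", pkt["dst"])))
registry.register("incoming", lambda pkt: seen.append(("logger", pkt["dst"])))
registry.register("outgoing", lambda pkt: seen.append(("nat", pkt["dst"])))

# One packet passes the "incoming" hook: only the two modules there see it.
registry.traverse("incoming", {"src": "10.0.0.1", "dst": "10.0.0.2"})
print(seen)
```

The point of the design is that the stack only has to call `traverse` at a few fixed places, and everything else lives in the registered modules, instead of being scattered through the stack behind #ifdefs.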
now to the last part of the introduction:[[BR]] Who did netfilter?[[BR]] The main part of the netfilter design and implementation was done by Rusty Russell[[BR]] Rusty is also the co-author of ipchains and has been the Linux kernel firewall maintainer for the last few years.[[BR]] He got sponsored for one year to concentrate on the firewalling code - and the result was netfilter[[BR]] Some other people joined him at different stages of the development: Marc Boucher, James Morris, and the last one is myself.[[BR]] The 'formal' core team consists of the four of us. Of course there are numerous other contributors, you can see them at [http://netfilter.kernelnotes.org/scoreboard.html][[BR]] So let's begin the main part of this presentation:[[BR]] PART I - netfilter basics[[BR]] I was talking about these hooks at particular points in the network stack.[[BR]] I'm going to concentrate on IPv4, as this seems to be the most important case :)[[BR]] --->[1]--->[ROUTE]--->[3]--->[4]--->[[BR]] | ^[[BR]] | |[[BR]] | [ROUTE][[BR]] v |[[BR]] [2] [5][[BR]] | ^[[BR]] | |[[BR]] v |[[BR]] [[BR]] on the left-hand side, you have incoming packets, coming from the network[[BR]] on the right-hand side, outgoing packets are leaving to the network[[BR]] at the bottom of the picture is our local machine, the local userspace processes.[[BR]] the 5 hooks are called:[[BR]] 1 NF_IP_PRE_ROUTING[[BR]] 2 NF_IP_LOCAL_IN[[BR]] 3 NF_IP_FORWARD[[BR]] 4 NF_IP_POST_ROUTING[[BR]] 5 NF_IP_LOCAL_OUT[[BR]] so let's look at the path a packet takes while being forwarded by our machine:[[BR]] First it comes off the wire and passes hook #1 (pre_routing). 
The routing decision is made,[[BR]] it passes hook #3 (forward), passes hook #4 (post_routing) and leaves to the network again.[[BR]] If we look at packets which have a local destination (are locally terminated and not routed), they take the following path:[[BR]] packet comes off the wire[[BR]] packet hits hook #1 (pre_routing)[[BR]] routing decision decides that packet is local[[BR]] packet hits hook #2 (local_in)[[BR]] packet hits local process[[BR]] [[BR]] If we look at a locally-originated packet:[[BR]] packet is generated by local process at the bottom[[BR]] packet hits hook #5 (local_out)[[BR]] routing code decides where to route the packet[[BR]] packet passes hook #4 (post_routing)[[BR]] packet hits the wire of the network[[BR]] (btw: I'd like to handle questions after the talk, this way I can concentrate on the talk...)[[BR]] (anyway, you can collect the questions at #qc, if you want)[[BR]] Now we know how packets traverse the netfilter hooks[[BR]] As I said, any kernel module may register on one or more of these hooks, and a callback function is called for each packet passing this particular hook[[BR]] the module may then return a verdict about the packet's future:[[BR]] NF_ACCEPT = continue traversal as normal[[BR]] NF_DROP = drop the packet silently, do not continue[[BR]] NF_STOLEN = I (as the hook-registered module) have taken over the packet, do not continue[[BR]] NF_QUEUE = enqueue packet to userspace (I'm going to say more about this later)[[BR]] NF_REPEAT = please call this hook again[[BR]] packet filtering / NAT / packet mangling is implemented using IP tables on each of these netfilter hooks.[[BR]] IP TABLES:[[BR]] IP tables are tables of rules, which a packet traverses from top to bottom[[BR]] each rule in an IP table consists of matches, which specify what the packet must look like if it is to match this rule[[BR]] and one target, which tells us what to do if this particular rule matches.[[BR]] IP tables are 
implemented as a reusable component - in fact, netfilter itself currently uses three instances of IP tables.[[BR]] But any other kernel module may also use IP tables (for example as an IPsec SPDB)[[BR]] The three tables implemented in netfilter itself are: 'filter', 'nat' and 'mangle'[[BR]] Connection tracking:[[BR]] Connection tracking is another part which is implemented on top of the netfilter hooks.[[BR]] conntrack enables us to do stateful firewalling. That is: Decide upon the fate of a packet not only by data from this packet, but also by information about the state of the connection the packet belongs to.[[BR]] I'm going to say more about connection tracking later.[[BR]] First I want to talk about the three IP tables:[[BR]] PART II - packet filtering[[BR]] Packet filtering is implemented using the three hooks NF_IP_LOCAL_IN, NF_IP_FORWARD and NF_IP_LOCAL_OUT[[BR]] each packet passes only one of these three hooks:[[BR]] locally originated packets traverse only NF_IP_LOCAL_OUT[[BR]] locally terminated packets traverse only NF_IP_LOCAL_IN[[BR]] and forwarded packets traverse only NF_IP_FORWARD[[BR]] the 'filter' table connects one chain to each of these three hooks:[[BR]] NF_IP_LOCAL_IN = INPUT chain[[BR]] NF_IP_LOCAL_OUT = OUTPUT chain[[BR]] NF_IP_FORWARD = FORWARD chain[[BR]] (the names are the same as in 2.2 - only uppercase)[[BR]] but BE AWARE: which packet traverses which chain has changed compared to the 2.2 behaviour[[BR]] i.e. 
a forwarded packet only hits the FORWARD chain, _not_ INPUT and OUTPUT as well[[BR]] to know how we insert filtering rules in the chains of the 'filter' table, we have to examine the IP tables a bit further[[BR]] As I said, the IP tables are implemented in a very generic way, so there's one userspace tool which is able to configure/modify all kinds of tables/chains[[BR]] each rule in a chain consists of [[BR]] - match(es) which specify things like source address, destination address, port numbers, ...[[BR]] - target (what to do if this rule matches)[[BR]] To configure these rules, we have the tool called 'iptables'[[BR]] I'm going to explain some of the iptables commands:[[BR]] To fully specify an iptables command, we need the following information:[[BR]] - which table to work on[[BR]] - which chain in this table to use[[BR]] - the operation (append, insert, delete, modify, ...)[[BR]] - at least one match [[BR]] - and exactly one target[[BR]] the syntax is something like:[[BR]] iptables -t table -Operation chain -j target match(es)[[BR]] to give a very basic example:[[BR]] iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp[[BR]] which -A(ppend)s a rule to the INPUT chain of the 'filter' table [[BR]] and the rule itself ACCEPTs all tcp packets which have a destination port of 25 (smtp)[[BR]] now we have to know what matches and targets we have available[[BR]] as targets, we have:[[BR]] ACCEPT - accept the packet[[BR]] DROP - silently drop the packet (this is the 2.2 DENY)[[BR]] QUEUE - queue the packet to a userspace process [[BR]] RETURN - return to previous (calling) chain[[BR]] foobar - jump to a user-defined chain[[BR]] REJECT - drop the packet and inform the sender about it[[BR]] LOG - log the packet via syslog, continue traversal[[BR]] ULOG - send the packet to a userspace logging process [[BR]] MIRROR - swap source/destination IP and resend the packet (for testing purposes)[[BR]] now the available matches:[[BR]] -p protocol (tcp/udp/icmp/...)[[BR]] -s 
source address[[BR]] -d destination address[[BR]] -i incoming interface[[BR]] -o outgoing interface[[BR]] --dport destination port[[BR]] --sport source port[[BR]] --state (NEW,ESTABLISHED,RELATED,INVALID) (I'll come back to that)[[BR]] --mac-source source MAC address[[BR]] --mark nfmark value[[BR]] --tos TOS value of the packet[[BR]] --ttl ttl value of the packet[[BR]] --limit (limit the rate of this packet to a certain amount of pkts/timeframe)[[BR]] [[BR]] knowing about the matches and targets, you are now able to configure your packet filter.[[BR]] I'm coming back to the connection tracking stuff[[BR]] this is a real advantage of the new 2.4 code:[[BR]] stateful firewalling[[BR]] the connection tracking code keeps track of all current connections going through our router/firewall[[BR]] each packet is assigned one of the state values:[[BR]] NEW (packet would establish a new connection, if we let it pass)[[BR]] ESTABLISHED (packet is part of an already established connection)[[BR]] RELATED (packet is somehow related to an already established connection)[[BR]] INVALID (packet is multicast or something else where we really don't know what it is)[[BR]] so now we could do something like:[[BR]] iptables -A FORWARD -j ACCEPT -m state --state ESTABLISHED,RELATED[[BR]] which lets all packets belonging to an already established connection, and the related ones, pass.[[BR]] if we now block all NEW packets from the 'outer' interface (internet)[[BR]] and allow NEW packets from the inside interface, we'll have the basic config of most firewalls[[BR]] so how does this differ from blocking packets which have the SYN flag set?[[BR]] connection tracking is generic and currently handles TCP, UDP and ICMP[[BR]] so for example we don't accept icmp echo replies if we didn't send an icmp echo request before[[BR]] the connection tracking is extensible in two ways:[[BR]] - application helper modules (like ip_conntrack_ftp, ip_conntrack_irc) for specific protocols[[BR]] - protocol helper 
modules (for tracking the state of protocols other than tcp/udp/icmp)[[BR]] the ip_conntrack_ftp module for example marks all incoming ftp data connections as RELATED[[BR]] now we can do active ftp through a firewall which doesn't have to accept all connections to internal IPs with ports > 1024 anymore![[BR]] ok... time for the next part:[[BR]] PART III - NAT[[BR]] in 2.2 we only had the masquerading code, which deals with a special case of NAT (network address translation)[[BR]] in 2.4 we have all different kinds of NAT:[[BR]] SNAT (source address NAT), and MASQUERADE as a special case of that[[BR]] DNAT (destination address NAT), and REDIRECT as a special case [[BR]] source NAT is done at the POST_ROUTING hook[[BR]] destination NAT is done at the PRE_ROUTING hook[[BR]] I'll begin with a small example of SNAT:[[BR]] iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -o eth0[[BR]] this will NAT all packets to be sent out on eth0 to the new source address of 1.2.3.4[[BR]] (it of course does the inverse mapping for the reply packets)[[BR]] SNAT is useful for NAT cases where you have a statically assigned IP address.[[BR]] If your outgoing interface has a dynamically assigned IP address, you may use the MASQUERADE target.[[BR]] iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0[[BR]] is an example of masquerading all traffic on interface ppp0.[[BR]] the address to which the packets are NATed is the interface address of ppp0[[BR]] it's always the current address of ppp0, so IP address changes don't need any special handling.[[BR]] The next part is DNAT:[[BR]] small example:[[BR]] iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth0[[BR]] which NATs all tcp packets coming in through interface eth0 and going to a webserver (port 80) to 1.2.3.4:8080[[BR]] this is quite useful if you want to do transparent www proxying[[BR]] REDIRECT is a special case of DNAT:[[BR]] iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 
-i eth1 -p tcp --dport 80[[BR]] all tcp packets from eth1 going to any webserver on port 80 are redirected to a proxy running on the local machine[[BR]] PART IV - Packet mangling[[BR]] this is something really new, which the 2.2.x code didn't have at all[[BR]] the 'mangle' table lets you mangle arbitrary information inside the packets while they pass our local machine[[BR]] currently we have only three targets implemented:[[BR]] TOS - change the TOS bit field in the header[[BR]] TTL - change the TTL field in the header (increment/decrement/set)[[BR]] MARK - set the packet's skb->nfmark field to a particular value[[BR]] of course you can again use all the matches available for packet filtering and NAT.[[BR]] a simple example:[[BR]] iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80[[BR]] which sets the nfmark field of each packet's skb to 10, if it is tcp and has a destination port of 80[[BR]] all matches and targets are implemented as separate modules, so you can at any time write new match and/or target modules [[BR]] There are two more 'advanced concepts' of netfilter I want to introduce:[[BR]] - Queuing[[BR]] if you have a rule which has the target QUEUE, the packet is inserted into a special queue inside netfilter[[BR]] the packets in this queue are transmitted over a netlink socket to a userspace process.[[BR]] this userspace process can now do whatever it wants with the packet (including its data) and re-inject it at exactly the place it came from[[BR]] the process can (of course) also set the verdict of this packet (like: DROP this packet, ACCEPT the other one)[[BR]] this enables people to write some firewalling code in userspace, and (hopefully) keeps the kernel clean of overly complex code.[[BR]] - Userspace logging[[BR]] Very similar to queuing, although it is unidirectional[[BR]] if you insert a rule with the ULOG target, the packet is copied and sent through a netlink multicast socket[[BR]] one or more userspace 
processes may listen to this netlink multicast socket and receive the copy of the packet[[BR]] the userspace process may now gather all information it needs and log it to a logfile/database/whatever[[BR]] we've already implemented ulogd, which is a plugin-extensible logging daemon attaching to the ULOG target[[BR]] So.... we are heading towards the end of my talk.... last chapter:[[BR]] Current development and future:[[BR]] - full TCP sequence number tracking[[BR]] - port more matches/targets to IPv6[[BR]] - support for more application protocol helpers for NAT (RPC, SMB, SNMP, ...)[[BR]] - more matches (like 'accept all packets as long as the number of connections to this port doesn't rise above N')[[BR]] - multicast support[[BR]] - infrastructure for having conntrack and NAT helpers in userspace[[BR]] [[BR]] At the end some useful links:[[BR]] This presentation: [[BR]] [http://www.gnumonks.org/papers/netfilter-lk2000][[BR]] netfilter homepage: [http://netfilter.kernelnotes.org][[BR]] links to the mailing list(s) and the archives, as well as the iptables userspace tool, are on the netfilter homepage[[BR]] we also have a bunch of documents you might be interested in: The 2.4 packet filtering howto, the 2.4 NAT howto, the netfilter hacking howto, and some more stuff[[BR]] everything should be linked from the netfilter homepage[[BR]] Thank you, it was very informative :)[[BR]] Thanks for your interest in this talk... I'll deal with questions now[[BR]] ---- CategoryDocs