KernelNewbies:

<LaForge> I'm going to talk this evening about netfilter
<LaForge> netfilter is the packet filtering / packet mangling / NAT framework of the Linux 2.4 kernel series
<LaForge> Some slides and a script for this talk are available at http://www.gnumonks.org/papers/netfilter-lk2000
<LaForge> I expect the auditorium to be familiar with TCP/IP basics, as well as being familiar with iptables and packet filtering in general
<LaForge> oops... ipchains of course :)
<LaForge> in other words: You should know how the 2.2 packet filtering (ipchains) works ;)
<LaForge> first a bit of an introduction:
<LaForge> What is netfilter?
<LaForge> Netfilter is a generalized framework of hooks in the network stack
<LaForge> any kernel module can plug into one or more of these hooks and will receive each packet traversing this hook
<LaForge> the netfilter hooks are currently implemented for IPv4, IPv6 and DECnet.
<LaForge> (i've heard recently that somebody wants to implement them for IPX, too)
<LaForge> these hooks are placed at well-chosen points of the protocol stack.
<LaForge> The traditional packet filtering, as well as all kinds of network address translation and packet mangling, are implemented on top of these hooks.
<LaForge> so netfilter is definitely more than a firewalling subsystem - it's a superset of that.
<LaForge> The next introductory question is:
<LaForge> Why did we need netfilter?
<LaForge> Because the old 2.2 code was way too complex
<LaForge> it was scattered around the whole IPv4 code
<LaForge> there were about 25 places in the IPv4 code where we had a #ifdef CONFIG_IP_FIREWALL ... #else ... #endif
<LaForge> which is quite bad.
<LaForge> furthermore, all packet handling had to be done in the kernel
<LaForge> masquerading was a hack to the packet filtering code
<LaForge> and the filtering rules were bound to interface addresses.
<LaForge> The 2.2 code was not very extensible, you could only write masquerading helper modules like ip_masq_irc / ip_masq_ftp / ...
<LaForge> ... now to the last part of the introduction:
<LaForge> Who did netfilter?
<LaForge> The main part of netfilter design and implementation was done by Rusty Russell
<LaForge> Rusty is also the co-author of ipchains and has been the Linux Kernel Firewall Maintainer for the last years.
<LaForge> He got sponsored for one year to concentrate on the firewalling code - and the result was netfilter
<LaForge> Some other people joined him in different stages of the development: Marc Boucher, James Morris and the last one is myself.
<LaForge> The 'formal' core team consists of the four of us. Of course there are numerous other contributors, you can see them at http://netfilter.kernelnotes.org/scoreboard.html
<LaForge> So let's begin the main part of this presentation:
<LaForge> PART I - netfilter basics
<LaForge> I was talking about these hooks at particular points in the network stack.
<LaForge> I'm going to concentrate on IPv4, as this seems to be the most important case :)
<LaForge> --->[1]--->[ROUTE]--->[3]--->[4]--->
<LaForge>               |            ^
<LaForge>               |            |
<LaForge>               |         [ROUTE]
<LaForge>               v            |
<LaForge>              [2]          [5]
<LaForge>               |            ^
<LaForge>               |            |
<LaForge>               v            |
<LaForge>
<LaForge> on the left hand, you have incoming packets, coming from the network
<LaForge> on the right hand, outgoing packets are leaving to the network
<LaForge> on the bottom of the picture is our local machine, the local userspace processes.
<LaForge> the 5 hooks are called:
<LaForge> 1 NF_IP_PRE_ROUTING
<LaForge> 2 NF_IP_LOCAL_IN
<LaForge> 3 NF_IP_FORWARD
<LaForge> 4 NF_IP_POST_ROUTING
<LaForge> 5 NF_IP_LOCAL_OUT
<LaForge> so let's look at the path a packet takes while being forwarded by our machine:
<LaForge> First it comes off the wire, it passes hook #1. The routing decision is made,
<LaForge> it passes hook #3 (forward), passes hook #4 (post_routing) and leaves off to the network again.
<LaForge> If we look at packets which have a local destination (are locally terminated and are not routed), they take the following path:
<LaForge> packet comes off the wire
<LaForge> packet hits hook #1 (pre_routing)
<LaForge> routing decision decides that packet is local
<LaForge> packet hits hook #2 (local_in)
<LaForge> packet hits local process
<LaForge>
<LaForge> If we look at a locally-originated packet:
<LaForge> packet is generated by local process at the bottom
<LaForge> packet hits hook #5 (local_out)
<LaForge> routing code decides where to route the packet
<LaForge> packet passes hook #4 (post_routing)
<LaForge> packet hits the wire of the network
<LaForge> (btw: i want to handle questions after the talk, this way i can concentrate on the talk...)
<LaForge> (anyway, you can collect the questions at #qc, if you want)
<LaForge> Now we know how packets traverse the netfilter hooks
<LaForge> As I said, any kernel module may register on one or more of these hooks, and a callback function is called for each packet passing this particular hook
<LaForge> the module may then return a verdict about the packet's future:
<LaForge> NF_ACCEPT = continue traversal as normal
<LaForge> NF_DROP = drop the packet silently, do not continue
<LaForge> NF_STOLEN = I (as the hook-registered module) have taken over the packet, do not continue
<LaForge> NF_QUEUE = enqueue packet to userspace (i'm going to say more about this later)
<LaForge> NF_REPEAT = please call this hook again
<LaForge> packet filtering / NAT / packet mangling is implemented using IP tables on each of these netfilter hooks.
<LaForge> IP TABLES:
<LaForge> IP tables are tables of rules, which a packet traverses from top to bottom
<LaForge> each rule in an IP table consists of matches, which specify what the packet must look like if it is to match this rule
<LaForge> and one target, which tells us what to do if this particular rule matches.
<LaForge> IP tables are implemented as a reusable component - in fact, netfilter itself currently uses three instances of IP tables.
<LaForge> But any other kernel module may also use IP tables (for example as an IPsec SPDB)
<LaForge> The three tables implemented in netfilter itself are: 'filter', 'nat' and 'mangle'
<LaForge> Connection tracking:
<LaForge> Connection tracking is another part which is implemented on top of the netfilter hooks.
<LaForge> conntrack enables us to do stateful firewalling. That is: Decide upon the fate of a packet not only by data from this packet, but also by information about the state of the connection the packet belongs to.
<LaForge> i'm going to say more about connection tracking later.
<LaForge> First I want to talk about the three IP tables:
<LaForge> PART II - packet filtering
<LaForge> Packet filtering is implemented using the three hooks NF_IP_LOCAL_IN,
<LaForge> NF_IP_FORWARD and NF_IP_LOCAL_OUT
<LaForge> each packet passes only one of these three hooks:
<LaForge> locally originated packets traverse only NF_IP_LOCAL_OUT
<LaForge> locally terminated packets traverse only NF_IP_LOCAL_IN
<LaForge> and forwarded packets traverse only NF_IP_FORWARD
<LaForge> the 'filter' table connects one chain to each of these three hooks:
<LaForge> NF_IP_LOCAL_IN = INPUT chain
<LaForge> NF_IP_LOCAL_OUT = OUTPUT chain
<LaForge> NF_IP_FORWARD = FORWARD chain
<LaForge> (the names are the same as in 2.2 - only uppercase)
<LaForge> but BE AWARE: which packet traverses which chain has changed from the 2.2 behaviour
<LaForge> i.e. a forwarded packet only hits the FORWARD chain, _not_ INPUT and OUTPUT as well
<LaForge> to know how we insert filtering rules in the chains of the 'filter' table, we have to examine the IP tables a bit further
<LaForge> As I said, the IP tables are implemented very generically, so there's one userspace tool which is able to configure/modify all kinds of tables/chains
<LaForge> each rule in a chain consists of
<LaForge> - match(es) which specify things like source address, destination address, port numbers, ...
<LaForge> - target (what to do if this rule matches)
<LaForge> To configure these rules, we have the tool called 'iptables'
<LaForge> I'm going to explain some of the iptables commands:
<LaForge> To fully specify an iptables command, we need the following information:
<LaForge> - which table to work on
<LaForge> - which chain in this table to use
<LaForge> - the operation (append, insert, delete, modify)
<LaForge> - at least one match
<LaForge> - and exactly one target
<LaForge> the syntax is something like:
<LaForge> iptables -t table -Operation chain -j target match(es)
<LaForge> to give a very basic example:
<LaForge> iptables -t filter -A INPUT -j ACCEPT -p tcp --dport smtp
<LaForge> which -A(ppend)s a rule to the INPUT chain of the 'filter' table
<LaForge> and the rule itself ACCEPTs all tcp packets which have a destination port of 25 (smtp)
<LaForge> now we have to know what matches and targets we have available
<LaForge> as targets, we have:
<LaForge> ACCEPT - accept the packet
<LaForge> DROP - silently drop the packet (this is the 2.2 DENY)
<LaForge> QUEUE - queue the packet to a userspace process
<LaForge> RETURN - return to previous (calling) chain
<LaForge> foobar - jump to a user-defined chain
<LaForge> REJECT - drop the packet and inform the sender about it
<LaForge> LOG - log the packet via syslog, continue traversal
<LaForge> ULOG - send the packet to a userspace logging process
<LaForge> MIRROR - swap source/destination IP and resend the packet (for testing purposes)
<LaForge> now the available matches:
<LaForge> -p protocol (tcp/udp/icmp/...)
<LaForge> -s source address
<LaForge> -d destination address
<LaForge> -i incoming interface
<LaForge> -o outgoing interface
<LaForge> --dport destination port
<LaForge> --sport source port
<LaForge> --state (NEW,ESTABLISHED,RELATED,INVALID) (i'm coming back to that)
<LaForge> --mac-source source MAC address
<LaForge> --mark nfmark value
<LaForge> --tos TOS value of the packet
<LaForge> --ttl ttl value of the packet
<LaForge> --limit (limit the rate of this packet to a certain amount of pkts/timeframe)
<LaForge>
<LaForge> knowing about the matches and targets, you are now able to configure your packet filter.
<LaForge> I'm coming back to the connection tracking stuff
<LaForge> this is a real advantage of the new 2.4 code:
<LaForge> stateful firewalling
<LaForge> the connection tracking code keeps track of all current connections going through our router/firewall
<LaForge> each packet is assigned one of the state values:
<LaForge> NEW (packet would establish a new connection, if we let it pass)
<LaForge> ESTABLISHED (packet is part of an already established connection)
<LaForge> RELATED (packet is somehow related to an already established connection)
<LaForge> INVALID (packet is multicast or something else where we really don't know what it is)
<LaForge> so now we could do something like:
<LaForge> iptables -A FORWARD -j ACCEPT -m state --state ESTABLISHED,RELATED
<LaForge> which lets through only packets belonging to an already established connection, and the related ones.
<LaForge> if we now block all NEW packets from the 'outer' interface (internet)
<LaForge> and allow NEW packets from the inside interface, we'll have the basic config of most firewalls
<LaForge> so how does this differ from blocking packets which have the SYN flag set?
<LaForge> connection tracking is generic and currently handles TCP, UDP and ICMP
<LaForge> so for example we don't accept icmp echo replies if we didn't send an icmp echo request before
<LaForge> the connection tracking is extensible in two ways:
<LaForge> - application helper modules (like ip_conntrack_ftp, ip_conntrack_irc) for specific protocols
<LaForge> - protocol helper modules (for tracking the state of protocols other than tcp/udp/icmp)
<LaForge> ip_conntrack_ftp for example marks all incoming ftp data connections as RELATED
<LaForge> now we can do active ftp through a firewall which doesn't have to accept all connections to internal IPs with ports > 1024 anymore!
<LaForge> ok... time for the next part:
<LaForge> PART III - NAT
<LaForge> in 2.2 we only had the masquerading code, which deals with a special case of NAT (network address translation)
<LaForge> in 2.4 we have all kinds of different nat:
<LaForge> SNAT (source address NAT), and MASQUERADE as a special case of that
<LaForge> DNAT (destination address NAT), and REDIRECT as a special case
<LaForge> source nat is done at the POST_ROUTING hook
<LaForge> destination nat is done at the PRE_ROUTING hook
<LaForge> i'll begin with a small example of SNAT:
<LaForge> iptables -t nat -A POSTROUTING -j SNAT --to-source 1.2.3.4 -o eth0
<LaForge> this will NAT all packets to be sent out on eth0 to the new source address of 1.2.3.4
<LaForge> (it of course does the inverse mapping for the reply packets)
<LaForge> SNAT is useful for NAT cases where you have a statically assigned IP address.
<LaForge> If your outgoing interface has a dynamically assigned IP address, you may use the MASQUERADE target.
<LaForge> iptables -t nat -A POSTROUTING -j MASQUERADE -o ppp0
<LaForge> is an example for masqing all traffic on interface ppp0.
<LaForge> the address to which the packets are nat'ed is the interface address of ppp0
<LaForge> it's always the current address of ppp0, so IP address changes don't need any special handling.
<LaForge> The next part is DNAT:
<LaForge> small example:
<LaForge> iptables -t nat -A PREROUTING -j DNAT --to-destination 1.2.3.4:8080 -p tcp --dport 80 -i eth0
<LaForge> which NATs all tcp packets coming in through interface eth0 and going to a webserver to 1.2.3.4:808
<LaForge> 8080 of course
<LaForge> this is quite useful if you want to do transparent www proxying
<LaForge> REDIRECT is a special case of DNAT:
<LaForge> iptables -t nat -A PREROUTING -j REDIRECT --to-port 3128 -i eth1 -p tcp --dport 80
<LaForge> all tcp packets from eth1 going to any webserver on port 80 are redirected to a proxy running on the local machine
<LaForge> PART IV - Packet mangling
<LaForge> this is something really new, which the 2.2.x code didn't have at all
<LaForge> the 'mangle' table lets you mangle any arbitrary information inside the packets while they pass our local machine
<LaForge> currently we have only three targets implemented:
<LaForge> TOS - change the TOS bit field in the header
<LaForge> TTL - change the TTL field in the header (increment/decrement/set)
<LaForge> MARK - set the packet's skb->nfmark field to a particular value
<LaForge> of course you can again use all the matches available for packet filtering and nat.
<LaForge> a simple example:
<LaForge> iptables -t mangle -A PREROUTING -j MARK --set-mark 10 -p tcp --dport 80
<LaForge> which sets the nfmark field of each packet's skb to 10, if it is tcp and has a destination port of 9-
<LaForge> 80
<LaForge> all matches and targets are implemented as separate modules, so you can at any time write new match and/or target modules
<LaForge> There are two more 'advanced concepts' of netfilter I want to introduce:
<LaForge> - Queuing
<LaForge> if you have a rule which has the target QUEUE, the packet is inserted into a special queue inside netfilter
<LaForge> the packets in this queue are transmitted over a netlink socket to a userspace process.
<LaForge> this userspace process can now do whatever it wants with the packet (including its data) and re-inject it at exactly the place it came from
<LaForge> the process can (of course) also set the verdict of this packet (like: DROP this packet, ACCEPT the other one)
<LaForge> this enables people to write some firewalling code in userspace, and (hopefully) keeps the kernel clean from too complex code.
<LaForge> - Userspace logging
<LaForge> Very similar to queuing, although it is unidirectional
<LaForge> if you insert a rule with the ULOG target, the packet is copied and sent through a netlink multicast socket
<LaForge> one or more userspace processes may listen to this netlink multicast socket and receive the copy of the packet
<LaForge> the userspace process may now gather all information it needs and log it to a logfile/database/whatever
<LaForge> we've already implemented ulogd, which is a plugin-extensible logging daemon attaching to the ULOG target
<LaForge> So.... we are heading towards the end of my talk.... last chapter:
<LaForge> Current development and future:
<LaForge> - full TCP sequence number tracking
<LaForge> - port more matches/targets to IPv6
<LaForge> - support for more application protocol helpers for NAT (RPC, SMB, SNMP, ...)
<LaForge> - more matches (like 'accept all packets as long as the number of connections to this port doesn't rise above N')
<LaForge> - multicast support
<LaForge> - infrastructure for having conntrack and nat helpers in userspace
<LaForge>
<LaForge> At the end some useful links:
<LaForge> This presentation:
<LaForge> http://www.gnumonks.org/papers/netfilter-lk2000
<LaForge> netfilter homepage: http://netfilter.kernelnotes.org
<LaForge> links to the mailinglist(s) and the archives, as well as the iptables userspace tool, are on the netfilter homepage
<LaForge> we also have a bunch of documents you might be interested in: The 2.4 packet filtering howto, the 2.4 NAT howto, the netfilter hacking howto, and some more stuff
<LaForge> everything should be linked from the netfilter homepage
<Blu3> Thank you, it was very informative :)
<LaForge> Thanks for your interest in this talk... I'll deal with questions now
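
The packet paths through the five IPv4 hooks, and the hook-to-chain mapping of the 'filter' table described in the talk, can be sketched as a small Python model. This is only a toy illustration of what the talk says, not kernel code: the function names `hooks_traversed` and `filter_chains_traversed` and their boolean parameters are invented for this sketch and do not exist in netfilter.

```python
"""Toy model of the IPv4 netfilter hook traversal described in the talk.

Illustration only: plain Python, no kernel interaction. The function
names and parameters are hypothetical and chosen for this sketch.
"""

# Hook names as listed in the talk (numbers 1-5 in the diagram).
PRE_ROUTING = "NF_IP_PRE_ROUTING"    # 1
LOCAL_IN = "NF_IP_LOCAL_IN"          # 2
FORWARD = "NF_IP_FORWARD"            # 3
POST_ROUTING = "NF_IP_POST_ROUTING"  # 4
LOCAL_OUT = "NF_IP_LOCAL_OUT"        # 5

# The 'filter' table connects one chain to each of these three hooks.
FILTER_CHAINS = {LOCAL_IN: "INPUT", FORWARD: "FORWARD", LOCAL_OUT: "OUTPUT"}


def hooks_traversed(from_network, to_local):
    """Return the hooks a packet passes, in order.

    from_network: True if the packet came off the wire, False if a local
                  process generated it.
    to_local:     True if the routing decision says the packet is for this
                  machine (ignored for locally generated packets).
    """
    if not from_network:
        # Locally originated: local_out, routing, post_routing, wire.
        return [LOCAL_OUT, POST_ROUTING]
    if to_local:
        # Locally terminated: pre_routing, routing, local_in, process.
        return [PRE_ROUTING, LOCAL_IN]
    # Forwarded: pre_routing, routing, forward, post_routing, wire.
    return [PRE_ROUTING, FORWARD, POST_ROUTING]


def filter_chains_traversed(from_network, to_local):
    """Return the 'filter' table chains that the packet hits."""
    return [FILTER_CHAINS[hook]
            for hook in hooks_traversed(from_network, to_local)
            if hook in FILTER_CHAINS]


if __name__ == "__main__":
    cases = [("forwarded", (True, False)),
             ("locally terminated", (True, True)),
             ("locally originated", (False, False))]
    for label, args in cases:
        print(label, hooks_traversed(*args), filter_chains_traversed(*args))
```

In this model every packet hits exactly one 'filter' chain, and a forwarded packet hits only FORWARD, which is the change from the 2.2 behaviour that the talk warns about.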



KernelNewbies: Documents/Netfilter (last edited 2006-08-15 04:50:46 by h-64-105-74-181)