The Funtoo Linux project has ended. This is an historical archive. Please read the official statement for details on the path to Funtoo's successor.
Linux's traffic control functionality offers many capabilities for influencing the rate of flow, as well as the latency, of primarily outgoing but in some cases also incoming network traffic. It is designed as a "construction kit" rather than a turn-key system: complex network traffic policing and shaping decisions can be made using a variety of algorithms. The Linux traffic control code is also often used in academia for research purposes, where it can be a useful mechanism to simulate and explore the impact of a variety of different network behaviors. See netem for an example of a simulation framework that can be used for this purpose.
Of course, Linux traffic control can also be extremely useful in an IT context, and this document is intended to focus on the practical, useful applications of Linux traffic control, where these capabilities can be applied to solve problems that are often experienced on modern networks.
One common use of Linux traffic control is to configure a Linux system as a router or bridge, so that it sits between two networks, or between the "inside" of the network and the real router, and can shape traffic going to local machines as well as out to the Internet. This provides a way to prioritize, shape and police both incoming (from the Internet) and outgoing (from local machines) network traffic. It is easiest to create traffic control rules for traffic flowing out of an interface, since we can control when the system sends data; controlling when we receive data requires an additional intermediate queue to buffer incoming traffic. When a Linux system is configured as a firewall or router with a physical interface for each part of the network, shaping can be done on the egress of each interface, and intermediate queues can be avoided.
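For the cases where an intermediate queue for incoming traffic is unavoidable, the usual mechanism is the kernel's ifb (Intermediate Functional Block) device: ingress packets are redirected to an ifb interface, where ordinary egress qdiscs can be applied to them. The sketch below uses example interface names (eth0, ifb0) and requires root and ifb support in the kernel:

```shell
# Redirect ingress traffic on eth0 through ifb0 so that normal
# egress shaping can be applied to "incoming" packets.
modprobe ifb numifbs=1
ip link set dev ifb0 up

# Attach the special ingress qdisc to the real interface...
tc qdisc add dev eth0 handle ffff: ingress

# ...and redirect all incoming IP packets to ifb0.
tc filter add dev eth0 parent ffff: protocol ip u32 match u32 0 0 \
    action mirred egress redirect dev ifb0

# Incoming traffic can now be shaped as egress traffic on ifb0.
tc qdisc add dev ifb0 root handle 1: htb default 12
```
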
A simple way to set up a layer 2 bridge using Linux involves creating a bridge device with brctl, adding two Ethernet ports to this bridge (again using brctl), and then applying prioritization, shaping and policing rules to both interfaces. The rules will apply to outgoing traffic on each interface. One physical interface will be connected to an upstream router on the same network, while the other network port will be connected to a layer 2 access switch to which local machines are connected. This allows powerful egress shaping policies to be created on both interfaces, controlling the flows in and out of the network.
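The bridge setup described above can be sketched as follows. Interface names here are examples: eth0 faces the upstream router, eth1 faces the local access switch. brctl comes from the bridge-utils package; on newer systems the same can be done with "ip link add ... type bridge":

```shell
# Create the bridge and add both Ethernet ports to it.
brctl addbr br0
brctl addif br0 eth0
brctl addif br0 eth1

# Bring everything up.
ip link set dev eth0 up
ip link set dev eth1 up
ip link set dev br0 up

# Shaping rules are then attached to each member port, not to br0;
# each port's rules govern traffic egressing that port, e.g.:
#   tc qdisc add dev eth0 root handle 1: htb default 12
```
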
Daniel Robbins has had very good results with the HTB queuing discipline: it has a strong feature set, good documentation (which is just as important), and is designed to deliver useful results in a production environment. If you use traffic control under Funtoo Linux, use the HTB queuing discipline as the root queuing discipline; you will get good results in very little time. Avoid using any other queuing discipline as the root queuing discipline on any interface. If you are creating a tree of classes and qdiscs, HTB should be at the top, and you should avoid hanging classes under any other qdisc unless you have plenty of time to experiment and verify that your QoS rules are working as expected. Please see State of the Code for more info on what Daniel Robbins considers to be the current state of the traffic control implementation in Linux.
If you are using enterprise kernels, especially any RHEL5-based kernels, be aware that the traffic control code in these kernels is roughly five years old and contains many significant bugs. In general, it is possible to avoid these bugs by using HTB as your root queueing discipline and testing things carefully to ensure that you are getting the proper behavior. Some queueing disciplines are known to not work reliably at all in RHEL5 kernels. See Broken Traffic Control for more information on known bugs with older kernels.
If you are using a more modern kernel, Linux traffic control should be fairly robust. The examples below should work with RHEL5 as well as newer kernels.
If you are implementing Linux traffic control, you should run the following commands frequently to monitor the behavior of your queuing discipline. Replace $wanif with the actual network interface name:
tc -s qdisc ls dev $wanif
tc -s class ls dev $wanif
Here are some examples you can use as the basis for your own filters/classifiers:
modemif=eth4

iptables -t mangle -A POSTROUTING -o $modemif -p tcp -m tos --tos Minimize-Delay -j CLASSIFY --set-class 1:10
iptables -t mangle -A POSTROUTING -o $modemif -p tcp --dport 53 -j CLASSIFY --set-class 1:10
iptables -t mangle -A POSTROUTING -o $modemif -p tcp --dport 80 -j CLASSIFY --set-class 1:10
iptables -t mangle -A POSTROUTING -o $modemif -p tcp --dport 443 -j CLASSIFY --set-class 1:10

tc qdisc add dev $modemif root handle 1: htb default 12
tc class add dev $modemif parent 1: classid 1:1 htb rate 1500kbit ceil 1500kbit burst 10k
tc class add dev $modemif parent 1:1 classid 1:10 htb rate 700kbit ceil 1500kbit prio 1 burst 10k
tc class add dev $modemif parent 1:1 classid 1:12 htb rate 800kbit ceil 800kbit prio 2
tc filter add dev $modemif protocol ip parent 1:0 prio 1 u32 match ip protocol 0x11 0xff flowid 1:10
tc qdisc add dev $modemif parent 1:10 handle 20: sfq perturb 10
tc qdisc add dev $modemif parent 1:12 handle 30: sfq perturb 10
The code above is a working traffic control script that is even compatible with RHEL5 kernels, for a 1500kbit outbound link (T1, cable or similar). In this example, eth4 is part of a bridge, but the code works regardless of whether the interface is bridged or not -- just make sure that $modemif is set to the interface on which traffic flows out and to which you wish to apply traffic control.
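When experimenting, it is useful to be able to wipe the configuration and start from scratch. Deleting the root qdisc removes all classes, filters and child qdiscs along with it (again assuming $modemif is your outbound interface):

```shell
# Remove the entire traffic control tree from the interface; it
# reverts to the kernel's default qdisc.
tc qdisc del dev $modemif root

# Flush the classification rules from the mangle table as well.
iptables -t mangle -F POSTROUTING
```
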
This script uses the tc command to create two priority classes, 1:10 and 1:12. By default, all traffic goes into the low-priority class, 1:12. 1:10 has priority over 1:12 (prio 1 vs. prio 2), so if there is any traffic in 1:10 ready to be sent, it will be sent ahead of 1:12. 1:10 has a guaranteed rate of 700kbit but can use up to the full outbound bandwidth of 1500kbit by borrowing unused bandwidth from 1:12.
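The borrowing arithmetic can be illustrated with a toy model. The function below is a hypothetical sketch, not the kernel's actual HTB algorithm (which works packet-by-packet using quantums), but the steady-state numbers it produces match the rate/ceil configuration above:

```shell
#!/bin/sh
# Toy steady-state model of HTB borrowing for classes 1:10 and 1:12.
min() { if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi; }

# htb_share DEMAND_HI DEMAND_LO  ->  "SENT_HI SENT_LO"  (all in kbit)
htb_share() {
    demand_hi=$1; demand_lo=$2
    rate_hi=700;  ceil_hi=1500   # class 1:10
    rate_lo=800                  # class 1:12 (rate == ceil: cannot borrow)
    parent=1500                  # class 1:1

    # Each class is first guaranteed up to its own rate.
    hi=$(min "$demand_hi" "$rate_hi")
    lo=$(min "$demand_lo" "$rate_lo")

    # Leftover parent bandwidth may be borrowed by 1:10, up to its ceil.
    spare=$(( parent - hi - lo ))
    want=$(( demand_hi - hi ))
    extra=$(min "$want" "$spare")
    extra=$(min "$extra" "$(( ceil_hi - hi ))")
    echo "$(( hi + extra )) $lo"
}

htb_share 1500 0     # 1:10 alone borrows the whole link: 1500 0
htb_share 1500 800   # both saturated, guarantees hold:   700 800
htb_share 100 1200   # 1:12 is capped by its own ceil:    100 800
```
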
UDP traffic (anything matching the u32 filter on ip protocol 0x11, UDP's IP protocol number) will be put in the high-priority class 1:10. This can be good for things like FPS games, ensuring that latency stays low and is not drowned out by lower-priority traffic.
If we stopped here, however, we would get somewhat worse results than if we used no traffic control at all. We have essentially created two outgoing sub-channels of different priorities. The higher-priority class can drown out the lower-priority class; this is intentional, so it isn't the issue -- in this case we want that behavior. The problem is that both the high-priority and low-priority classes can be dominated by high-bandwidth flows, causing other flows of the same priority to be starved. To fix this, an sfq (Stochastic Fairness Queueing) qdisc is attached to each class; sfq identifies individual traffic flows and gives each a fair shot at sending data out of its class. This prevents starvation within the classes themselves.
First, note that we are adding netfilter rules to the POSTROUTING chain of the mangle table. This table allows us to modify packets right before they are queued to be sent out of an interface, which is exactly what we want. At that point, the packets may have been locally generated or forwarded -- as long as they are on their way out of $modemif (eth4 in this case), the POSTROUTING chain will see them and we can classify them and perform other useful tweaks.
The iptables rules put all traffic with the "Minimize-Delay" TOS value (interactive SSH traffic, for example) into the high-priority class. In addition, all HTTP, HTTPS and DNS TCP traffic is classified as high-priority. Remember that all UDP traffic is already classified as high-priority via the u32 filter described above, so DNS-over-UDP is taken care of automatically.
iptables -t mangle -N tosfix
iptables -t mangle -A tosfix -p tcp -m length --length 0:512 -j RETURN
# allow screen redraws under interactive SSH sessions to be fast:
iptables -t mangle -A tosfix -m hashlimit --hashlimit 20/sec --hashlimit-burst 20 \
  --hashlimit-mode srcip,srcport,dstip,dstport --hashlimit-name minlat -j RETURN
iptables -t mangle -A tosfix -j TOS --set-tos Maximize-Throughput
iptables -t mangle -A tosfix -j RETURN
iptables -t mangle -A POSTROUTING -p tcp -m tos --tos Minimize-Delay -j tosfix
To use this code, place it near the top of the script, just below the modemif=eth4 line but before the main iptables and tc rules. These rules will apply to all packets about to be queued to any interface, but this is not necessarily a bad thing, since the TOS values being set are not specific to our traffic control setup. To make these rules specific to $modemif, add "-o $modemif" after "-A POSTROUTING" on the last line above. As-is, the rules above will adjust the TOS field on packets flowing out of all interfaces, but the traffic control rules will only take effect for $modemif, because they are only configured for that interface.
SSH is a tricky protocol. By default, all outgoing SSH traffic is marked "Minimize-Delay", which would cause all of it to flow into our high-priority class, even a bulk transfer running in the background. This code grabs all "Minimize-Delay" traffic such as SSH and telnet and routes it through some special rules. Individual keystrokes (small packets) are left as "Minimize-Delay" packets. For everything else, we use the iptables hashlimit module, which identifies individual outbound flows and allows small bursts of traffic (even big packets) to remain "Minimize-Delay". These settings have been specifically tuned so that most screen changes (^A^N) when logged into your server(s) remotely will be fast. Any traffic over these burst limits is re-marked "Maximize-Throughput" and thus drops to our lower-priority class 1:12. Combined with the traffic control rules, this allows you to have very responsive SSH sessions into your servers, even while they are doing some kind of bulk outbound copy, like rsync over SSH.
Code in our main iptables rules will ensure that any "Minimize-Delay" traffic is tagged into the high-priority 1:10 class.
What this does is keep interactive SSH and telnet keystrokes in the high-priority class, allow GNU screen full redraws and reasonable full-screen editor scrolling to remain in the high-priority class, while forcing bulk transfers into the lower-priority class.
iptables -t mangle -N ack
iptables -t mangle -A ack -m tos ! --tos Normal-Service -j RETURN
iptables -t mangle -A ack -p tcp -m length --length 0:128 -j TOS --set-tos Minimize-Delay
iptables -t mangle -A ack -p tcp -m length --length 128: -j TOS --set-tos Maximize-Throughput
iptables -t mangle -A ack -j RETURN
iptables -t mangle -A POSTROUTING -p tcp -m tcp --tcp-flags SYN,RST,ACK ACK -j ack
To use this code, place it near the top of the script, just below the modemif=eth4 line but before the main iptables and tc rules.
ACK optimization is another useful thing to do. If we prioritize small ACKs heading out to the modem, it will allow TCP traffic to flow more smoothly without unnecessary delay. The lines above accomplish this.
This code basically sets the "Minimize-Delay" TOS value on small ACKs. Code in our main iptables rules will then tag these packets so they enter high-priority traffic class 1:10.
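To confirm that packets are actually hitting these rules, inspect the per-rule packet and byte counters in the mangle table, and check that the corresponding tc classes are seeing traffic (both commands require root):

```shell
# Show per-rule packet/byte counters for the POSTROUTING chain;
# counters that stay at zero indicate a rule that never matches.
iptables -t mangle -nvL POSTROUTING

# Confirm that traffic is landing in the expected classes (1:10 vs 1:12).
tc -s class ls dev $modemif
```
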