Linux through Dummynet to the Outside World

Originally written by Neil Spring

As far as I know, there is no dummynet port to Linux, but NIST Net might serve the same purpose. I haven't tried it.

There is a similar project for Solaris that I also haven't tried: ONE, the Ohio Network Emulator.

To set up a tunnel from a systems cluster machine running Linux 2.2.x to a FreeBSD box running dummynet, and then route to the outside world through it, do the following.

Build my modified version of nos-tun.c, originally written by Nickolay Dudorov. My version uses protocol number 4 (ipip) instead of protocol 94 (which is what nos-tun thinks is IP over IP... I don't understand why). This is the program that runs on the FreeBSD side. It seems very robust: once I started it, it ran without any problems.

Get a copy of the ip-route2+tc package. It contains the program ip, which is really, really useful. The primary ftp site alleges to be here.

I referred to someone else's guide and someone else's reference often when setting this up.

On the Linux box, do the following (you pretty much have to do everything here as root):

echo 1 > /proc/sys/net/ipv4/ip_forward
insmod ipip
./ip tunnel add tunl1 mode ipip remote bsd_ip_addr local linux_ip_addr
./ip addr add linux_tunnel_ip_addr dev tunl1
./ip link set tunl1 up
./ip route add bsd_tunnel_ip_addr dev tunl1
where bsd_ip_addr and linux_ip_addr are the real IP addresses bound to the ethernet interfaces, and linux_tunnel_ip_addr and bsd_tunnel_ip_addr are the fake addresses used on the tunnel. At the very least, linux_tunnel_ip_addr should be on the same subnet as the real addresses.
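The Linux-side steps above can be sketched as a small script that fills in the four addresses. This is only an illustration: the function name linux_tunnel_cmds, its argument order, and the example addresses (taken from the 192.0.2.0/24 documentation range) are made up here, and the tunnel name tunl1 is taken from the commands above. The function only prints the commands, so nothing changes until you review the output and feed it to sh as root.

```shell
#!/bin/sh
# Print the Linux-side tunnel commands for a given set of addresses.
# Nothing is executed here; the output is meant to be reviewed first.
# Arguments (illustrative order):
#   $1 bsd_ip_addr, $2 linux_ip_addr,
#   $3 linux_tunnel_ip_addr, $4 bsd_tunnel_ip_addr
linux_tunnel_cmds() {
    bsd_ip=$1; linux_ip=$2; linux_tun_ip=$3; bsd_tun_ip=$4
    cat <<EOF
echo 1 > /proc/sys/net/ipv4/ip_forward
insmod ipip
./ip tunnel add tunl1 mode ipip remote $bsd_ip local $linux_ip
./ip addr add $linux_tun_ip dev tunl1
./ip link set tunl1 up
./ip route add $bsd_tun_ip dev tunl1
EOF
}

# Example with placeholder addresses.
linux_tunnel_cmds 192.0.2.1 192.0.2.2 192.0.2.102 192.0.2.101
```

Once the printed commands look right, something like "sh linux-tunnel.sh | sh" (the script name is hypothetical) would run them.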

Now, on the BSD box:

./nos-tun -t /dev/tun0 -s bsd_tunnel_ip_addr -d linux_tunnel_ip_addr linux_ip_addr
arp -s linux_tunnel_ip_addr xx:xx:xx:xx:xx:xx temp pub
sysctl -w net.inet.ip.forwarding=1
ipfw add pipe n ip from any to linux_tunnel_ip_addr out
ipfw add pipe n+1 ip from linux_tunnel_ip_addr to any in
This sets up the tunnel, then tells the local ethernet segment that, to reach linux_tunnel_ip_addr, it should send to the BSD box's ethernet hardware address (fill in the x's with the output of ifconfig xl0 | grep ether). Next, we tell the box to forward packets (we had to do the same on Linux above). Finally, we add two rules that send traffic through dummynet pipes, one for each direction. n is some arbitrary integer not already in use; there's no magic, but I use two that are close together. Use the following to get information about what is going on:
ipfw show
ipfw pipe n show
ipfw pipe n+1 show
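The BSD-side steps can be sketched the same way, with the n and n+1 pipe numbers computed from a single argument. Again, this is an illustration rather than a tested tool: the function name bsd_dummynet_cmds, its argument order, and the example addresses and hardware address are placeholders. It prints the commands and runs nothing.

```shell
#!/bin/sh
# Print the FreeBSD-side commands for one tunnel and its pair of pipes.
# Nothing is executed. Arguments (illustrative order):
#   $1 bsd_tunnel_ip_addr, $2 linux_tunnel_ip_addr, $3 linux_ip_addr,
#   $4 ethernet hardware address of the BSD box, $5 first pipe number (n)
bsd_dummynet_cmds() {
    n=$5
    cat <<EOF
./nos-tun -t /dev/tun0 -s $1 -d $2 $3
arp -s $2 $4 temp pub
sysctl -w net.inet.ip.forwarding=1
ipfw add pipe $n ip from any to $2 out
ipfw add pipe $((n + 1)) ip from $2 to any in
ipfw pipe $n show
ipfw pipe $((n + 1)) show
EOF
}

# Example with placeholder addresses and pipe numbers 10 and 11.
bsd_dummynet_cmds 192.0.2.101 192.0.2.102 192.0.2.2 00:00:5e:00:53:01 10
```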
Go back to the Linux box, set up some routing fun, and maybe change the MTU:
route add -host guinea_pig gw bsd_tunnel_ip_addr
ifconfig tunl1 mtu 576
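The ipip encapsulation adds an outer IP header to every packet, which affects both the size of what goes on the wire and how many bytes a packet capture needs to keep. A rough sketch of the arithmetic follows. It assumes 20-byte IP headers (no options), a 20-byte TCP header (no options), and a 14-byte ethernet header; the function names are made up for illustration.

```shell
#!/bin/sh
# Header sizes in bytes, assuming no IP options and no TCP options.
ETH_HDR=14
IP_HDR=20
TCP_HDR=20

# Bytes a capture must keep to see through the TCP header of a tunneled
# packet: ethernet + outer IP + inner IP + TCP.
ipip_snaplen() {
    echo $((ETH_HDR + IP_HDR + IP_HDR + TCP_HDR))
}

# Size of the outer IP packet when the inner packet fills the tunnel MTU.
# Argument: tunnel MTU in bytes.
outer_packet_size() {
    echo $(($1 + IP_HDR))
}

echo "capture size needed: $(ipip_snaplen) bytes"
echo "outer packet for mtu 576: $(outer_packet_size 576) bytes"
```

Under these assumptions a capture needs 74 bytes per packet, more than the 68-byte default of older tcpdump versions, and TCP options push it higher still. A full-size packet on a 576-byte tunnel MTU becomes 596 bytes once encapsulated.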
Now, maybe ping guinea_pig and run tcpdump -p proto 4 to see what's happening. tcpdump can be run on either box, but make sure that the Linux tunnel side doesn't go promiscuous; this is what the "-p" is for. If it does, I think it may respond to ARP queries for its tunneled IP address, which pretty much breaks the tunnel, since other hosts won't necessarily send through it. I don't understand this very well; I just noticed screwy behavior when I accidentally put it into this mode. You also might need to increase the capture size, since the ipip encapsulation headers plus the normal headers are longer than what tcpdump normally captures (-s 100 works for me).

Now for dummynet fun, on the BSD box. For my modem, the following works OK (I'm not at all sure about the loss rate):
ipfw pipe n config bw 32Kbit/s delay 70 queue 40 plr 0.0005
ipfw pipe n+1 config bw 32Kbit/s delay 70 queue 10 plr 0.0005
And for Neal's DSL connection, these are the params:
ipfw pipe n config bw 512Kbit/s delay 11 queue 10 
ipfw pipe n+1 config bw 256Kbit/s delay 11 queue 10 
This time, when you ping through the tunnel, you should see higher latency.
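To get a feel for how much latency to expect, here is a back-of-the-envelope sketch. It assumes an 84-byte ping packet (56 data bytes plus the ICMP and IP headers), that Kbit/s means 1000 bits per second, that dummynet sees the decapsulated packet, and that the queues are empty and nothing is dropped. With the rules above, the echo request from the Linux box crosses pipe n+1 and the reply crosses pipe n. The function names are made up for illustration.

```shell
#!/bin/sh
# One-way time through a pipe in microseconds: the time to clock the
# packet out at the pipe's bandwidth plus the configured delay.
# Args: packet size in bytes, bandwidth in Kbit/s, delay in ms.
pipe_time_us() {
    echo $(( $1 * 8 * 1000 / $2 + $3 * 1000 ))
}

# Added round-trip time in ms (rounded down) for a packet that crosses
# one pipe on the way out and the other on the way back.
# Args: size, outbound bw, outbound delay, return bw, return delay.
added_rtt_ms() {
    out_us=$(pipe_time_us "$1" "$2" "$3")
    back_us=$(pipe_time_us "$1" "$4" "$5")
    echo $(( (out_us + back_us) / 1000 ))
}

echo "modem: about $(added_rtt_ms 84 32 70 32 70) ms added"
echo "dsl:   about $(added_rtt_ms 84 256 11 512 11) ms added"
```

Under these assumptions the modem settings add about 182 ms to a ping round trip and the DSL settings about 25 ms. Queueing and the plr loss rate are ignored, so treat these as a lower bound on what ping reports.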

As a final note, if you want to monitor the queue length at the dummynet router, use my modified version of ipfw.c, queue-monitor.c. (The modifications were to gut main() and list() of their rich features; the rest is just there to make the code compile.) I make no guarantees about its intrusiveness. It calls getsockopt to find the queue length once every 0.1 ms, and prints a line, with a timestamp, any time it changes. A set of scripts that start queue-monitor (run as a.out) and tcpdump simultaneously, then resolve the traces into a nifty jgraph form, are here. My model for these is: if you use 'em, let me know, and we'll talk about how to improve 'em. I'm considering merging with the CVS architecture of joy.

If you have trouble with this guide, or it doesn't work for you, or I missed some crucial detail, or somebody screwed with Linux again to make these things not work, let me know.