Dave's Blog

This is updated when I feel like it.

Entries

24 March 2005.  I have a job for next year: a one-year visiting position at Vassar College.  Yay!

7 December 2004.  Here is today's Bourne shell brain teaser: How do you pass the value of a variable containing the "*" character as an argument to a command without using quote characters?

  1. Escape it with a backslash
  2. You can't do that

If you answered #2, give yourself a round of applause!  Here's a script that demonstrates the problem:

#! /bin/sh

arg1='The java.lang.* package'
arg2='The java.lang.\* package'

touch java.lang.Foo

/bin/echo $arg1
/bin/echo $arg2

Here's an example of running it from the command line:

[daveho@noir]$ sh bourneshellsucks.sh
The java.lang.Foo package
The java.lang.\* package

22 November 2004.  Why Java generics are good: they allow you to write robust reusable code.  Case in point: in 2000 I wrote a graph library as part of some network class loading research I was working on.  That version of the library was a bit ugly: lots of casts were needed to recover the concrete class types representing edges and vertices from the generic algorithms that operated on them.  Later on, I rewrote the library to use generics, as part of FindBugs.  Each graph object (graph, vertex, and edge) is fully parameterized with the actual types of the other graph objects.  This allows easy implementation of generic graph algorithms (which don't care about the concrete vertex and edge types) while still allowing callers to access those concrete types without casts.  So, I can use the same graph implementation to represent

  • control flow graphs
  • call graphs
  • analysis ordering constraints
  • etc.
without any type casts.  In general, I can reuse the add new graph algorithms and use the graph library in other programs without diminishing its usefulness.

4 October 2004  I've been learning XSLT in order to transform the XML output produced by FindBugs into HTML.  I bought a copy of Beginning XSLT by Jeni Tennison, which I've been fairly impressed with: most of the important concepts were explained in a very clear way, and there are lots of examples (possibly too many).  One complaint I do have is that I'm not convinced that the HTML examples in the book were tested with any browser other than Internet Explorer.  In any case, I was able to produce reasonably good HTML after only a couple days work.  Well, actually, I cheated by adding new elements to the FindBugs XML output to make generating human-readable output easier.  But good programmers should always cheat when they can.

A peculiarity of XSLT is that it really wants to produce an XML tree as output.  So, the easiest route to generating HTML is to produce XHTML, which is just HTML using XML syntax.  I realize that XHTML has been around for a long time, but my web skills were learned circa 1995, and I'm slowly catching up now.  Figuring out what incantations I needed to give to the XSLT program to produce something that my browser would recognize as XHTML was the most difficult part of the entire process.  But eventually it sort of made sense.  Here is a link to the FindBugs XML to HTML stylesheet.

WRT job search: Lafayette College is hiring.  I will be sending them an application.

3 October 2004  I have decoded the meaning of the spiciness classifications printed on jars of salsa.  Here is what they mean:

Classification Meaning
MildMild
MediumMild
HotMild

28 September 2004  It's been a busy few months!  I got married (hi Kate!), went to Bar Harbor and Acadia National Park for our honeymoon, submitted a variety of papers, wrote some gnarly Java servlet code, and otherwise have had lots to do.  Now I'm starting my job search (since I'm going to graduate next year).  My goal is to get a tenure-track faculty position in Computer Science at a liberal arts college.  As of right now, I'm planning on applying at Moravian College and Hobart and William Smith Colleges.

8 July 2004  I am the moderator for a couple GNU Mailman mailing lists.  The domain name of one of the subscribers changed: mail delivered to the old address still got there, but the MTA kept sending me email saying that the old domain is no longer valid, and to switch to using the new domain.  Fine.  I log into the mailman admin interface, go to "Membership Management" screen, find the user, and click on the offending email address.  I get the screen where the user can change his or her own email address.  Slight problem: this is a form intended for the user, not the administrator.  I fill in the new email address, and click "Change My Address and Name".  Well s**t, it tells me that the user needs to confirm the change within 3 days.

[long sequence of profanities deleted]

Sorry about that, had to get it off my chest.  If anyone knows how to change an email address for someone subscribed to a mailman mailing list, please let me know.

In other news:

  • JavaOne 2004 was fun.
  • The new Morrissey album, "You Are The Quarry", is very good.

20 June 2004  It's been a busy few weeks!

If you haven't heard of Wegmans, it's the nation's #2 (in quality) grocery store chain (according to Consumer Reports, a few years ago anyway).  I worked at the Wegmans in Fayetteville, NY when I was in high school.  It's changed quite a bit since then.  Originally, it was just a really nice grocery store.  Now, it's a shopping extravaganza that is hard to describe.  If you live within an hour or two of one, you should go see what it's about.  Trust me.

24 May 2004.  Weirdest song transition playing OggVorbis files in random shuffle mode using xmms:

  • Karl Richter playing "Wachet auf, ruft uns die Stimme", followed by
  • Mekons, "Whiskey Sex Shack"

9 May 2004.  The Bourne shell sucks.  Here is the problem: I want to define a shell variable that contains some, but not all, of the parameters defined in the $@ variable, and then pass them verbatim to another command.  Sounds easy, right?.  Wrong.  The problem is that if the shell parameters contain space characters, they will get separated into multiple arguments when passed to the second command.  One potential solution would be to put double quotes around the value of the parameter, e.g.:

while [ $# -gt 0 ]; do
    if [ some condition ]; then
        args="$args \"$1\""
    fi
    shift
done

command $args

Unfortunately, this fails because the shell doesn't respect the embedded double quote characters when the expanded value of $args is tokenized.  So, if $1 contained the single value

a b
then command would get passed two arguments
"a
and
b"
Suck, suck, suck.

What makes this problem especially irritating is that the Bourne shell treats the $@ variable specially, so that when it is expanded in double quotes, it logically treats its contents as though each argument were double quoted, preserving internal spaces.  This option is not available to any other variable, though, and it is impossible to assign or modify $@.

On an unrelated note: SCO are a bunch of litigious bastards.

6 April 2004.  Yesterday I installed a new Linux distribution on my laptop: Debian Testing (Sarge).  I started by booting the netinst image for the Beta3 Debian Installer and then running apt-get to fetch additional packages.

In the past, I've shied away from Debian because the stable release is always so far behind the rest of the world in incorporating the latest versions of the kernel and other software.  At the moment, Debian stable is still using a 2.2 kernel!  This time around, I thought I'd give the testing distribution a try, since it is much more up to date.  For example, it uses the 2.4.25 kernel.  As noted in the previous blog entry, earlier 2.4 kernels have some trouble with PCI based Prism2 wireless cards.

Overall, I'm extremely impressed with Debian.  Installing packages using Apt is an absolute pleasure after years of fighting with RPMs and Redhat's up2date.  The APT Howto was very helpful and easy to understand.  For example: to install XFree86, I ran the command

apt-get install x-window-system
It really was that simple!

One think I really like about Apt is that it allows me to get just the packages I need, when I need them.  Ususally, when I install Linux, I have to spend ages going through lists of packages trying to figure out which ones I am going to need later.  I really like the idea of pulling them in on demand.

Overall, the install process was pretty smooth, but not very user friendly.  Redhat's Anaconda is a much nicer interface, and I think people who aren't very familiar with Linux and Unix will have a hard time with the Debian installer.

There were a few loose ends after the base system was installed.  For example: the installer did not create /etc/hosts, so there is no host entry for "localhost" or for the machine name.  In addition, the init scripts did not create a route for the loopback interface.  This may be because I did not have the machine connected to a network when I ran the installer, but it's still annoying.  Also, there is no entry for the cdrom drive in /etc/fstab.  I added one by hand, but this is another snag that will cause problems for inexperienced users.

Another post-install problem is that KDE simply does not work.  The progress window simply hangs while initializing "system services".  Although I normally use KDE as my desktop, I am really rather agnostic about UI stuff, so I decided to give Gnome a try.  So far, I really like Gnome!  It has made significant progress since the last time I used it (which was in summer of 2002).  I especially like gnome-terminal, which is much better than KDE's konsole.

One problem I did notice with Gnome is that the volume control applet thought that the path to the mixer was /dev/sound/mixer, while in reality it's /dev/mixer.  A quick symlink from /dev/sound to /dev fixed the problem.  What I found highly redeeming here was that the volume control applet graphically alerted me to the problem by displaying a useful error message as a tooltip when I moved the mouse over it.  Knowing the cause of the problem is about 80% of fixing it.

4 April 2004.  For a long time, the wireless card in my IBM Thinkpad R32 laptop has been misbehaving in Linux.  The wireless card is a IBM High Rate Wireless LAN Mini-PCI Adapter (part number 22P7711): the lspci command reports it as

02:07.0 Network controller: Harris Semiconductor Prism 2.5 Wavelan chipset (rev 01)
The symptoms of the problem are that occasionally, especially during large data transfers, the network would freeze, and the message
Error -110 writing Tx descriptor to BAP
would appear over and over again on the console.

Anyway, long story short, the version of the orinoco driver included in the default Redhat 9 kernel is buggy.  I upgraded the kernel to 2.4.25 (using the default Linus kernel on kernel.org), using the instructions on this web page.  Configuring the kernel was, as always, a pain, but the build process went smoothly, and the new kernel appears to have solved the problem.

In other news, there appears to be a serious Java performance anomaly in Redhat Enterprise Linux 3.  I submitted a bug report to Redhat's bugzilla, so we'll see what happens.  The problem is really pissing me off because it makes it impossible for me to do any Java work on my workstation at school.

11 Mar 2004.  A few days ago I got back from SIGCSE 2004, which was a lot of fun.  I highly recommend it to anyone interested in computer science education.

My computer at school was recently upgraded from Redhat 9 to Redhat Enterprise Linux.  So far, it seems pretty nice.  However, everything is set up to use UTF8 as the default encoding, including terminal applications such as mutt.  Unfortunately, I had been using rxvt as my terminal program, which doesn't understand that encoding.

I started out by switching to konsole, the KDE terminal emulator.  Initially, it seemed to work pretty well.  It handles UTF8 easily, and has a nice configuration interface for setting colors, font, font size, etc.  It also does font antialiasing, which it a big win.  Unfortunately, konsole takes forever to load, and in general is just incredibly slow.  Given the amount of time I spend creating and using terminal sessions, poor responsiveness is not acceptable.

Eventually I stumbled across rxvt-unicode.  In addition to having support for Unicode, it can use xft for antialiased fonts.  Here's the invocation I use to start it:

urxvt -bg black -fg gray -cr green -sb -sl 1000 -fn "xft:Luxi Mono:pixelsize=12"

15 Feb 2004.  The Willem EPROM Programmer arrived, and it works!  So far I have only tested it with 28C256 EEPROMs, but it was able to successfully write and verify them.  I should have some 27C128 EPROMs (UV-erasable) arriving soon.  They require a higher programming voltage (12.5V instead of 5V), and will be a somewhat better test of the programmer, but I expect it to work.

A few notes about the Willem EPROM Software:

  1. It accesses the parallel port hardware through port I/O instructions.  On modern OSes such as Windows 2000 and XP, this requires a special driver: the SST Port I/O Driver.  This must be installed before the EPROM software will run.  Frankly, this is a really stupid way to access the parallel port - the software should be using I/O services provided by the operating system, and not mucking about with the parallel port directly.
  2. You need to make sure that the programming software knows the I/O port address of the parallel port.  For LPT1 (the first parallel port) on standard PC hardware, this should be 0x378.  On my laptop, an IBM Thinkpad R32, the parallel port was set to some funky address.  To get the programmer to work, I had to use the BIOS configuration to set the port to a standard address.  I have also read that the programmer software will not work if the port is in bidirectional mode.  I set mine to EPP mode, and everything worked fine.
  3. There are at least two common PCB designs for the Willem programmer.  The original design is described at the website.  My programmer is the "PCB3" design from Gitti Ieo.  The PCB3 design has the advantage of being able to generate 12.5V and 21V programming voltages from a single 9V power supply.  While the two designs are functionally equivalent, there are important differences to be aware of as a user.  To program a specific device using the programmer, there are a number of configuration DIP switches and jumpers that need to be set on the programmer.  The original Willem design and the PCB3 design place these switches and jumpers at different places on the PCB board.  The programming software has a button near the top of the Window that will read either "Willem" or "PCB3".  Make sure it is configured to match your hardware.

In general, the programming software is pretty easy to use.  Once I got my parallel port configured correctly, I was able to erase, program, and verify EEPROMs in a matter of minutes.  Some day I would like to write improved programming software that would work under Linux.  Very minimal Willem EPROM Software for Linux does exist, and could serve as a starting point.

11 Feb 2004.  I bought a Willem EPROM Programmer on Ebay for $53, which included the power supply and parallel cable.  Now I'm just waiting for it to arrive :-)  I have also bid on a lot of 27c128 EPROMs.  My goal is to use Etherboot on a 3c509 ethernet card.  The Willem programmer doesn't do GAL devices, but at about 1/4th the cost of a commercial programmer, I think it's a reasonable compromise.

Note: the Etherboot project has a companion site, rom-o-matic.net, which automatically generates Etherboot images, based on your ethernet hardware and a host of other configuration options.  This is one of the coolest web sites I have seen!

15 Jan 2004.  Why are EPROM programmers so expensive?  There are about a million different companies selling them, they all have a tiny share of the market, and the software generally sucks.  Some vendors are still shipping software that runs under MS-DOS.  Yes, DOS.  What the fuck?  The Willem EPROM Programmer is probably the closest thing to a cheap general purpose programmer out there, and the software doesn't look too horrible, but it doesn't program GALs, and it needs an adapter to do 80x51 family microcontrollers.  Needhams seems to have some pretty nice programmers and good device support (including GALs), but they run about $200-$300 for the entry level models.

I want an EPROM programmer for two reasons.  First, I want to burn an EPROM for Etherboot so I can test GeekOS easily on real hardware without having to boot from a floppy.  Second, I enjoy tinkering with microcontrollers, and a general purpose programmer would be a really useful thing to have.  I have plans for a cheap 8051 development board that I would like to turn into a PCB some day.  I suppose real engineers use JTAG and in-circuit programming.

5 Jan 2004.  Here is how to write an OS kernel in 13 easy steps.  Actually, some of these steps are not all that easy, but if you are persistent you will definitely learn a lot.

  1. Write a floppy boot sector.  It should load a secondary setup program followed by a flat kernel image.  The setup program does just enough hardware initialization to start executing the kernel image.  On the x86 architecture, this means reprogramming the interrupt controllers to route hardware IRQs so they don't conflict with processor exceptions, setting up a GDT, enabling the A20 gate, entering protected mode, and jumping to the kernel's entry point.

  2. Print "Hello, world" by writing directly to video memory.  This is what your first kernel should do.  VGA text mode makes this very easy.  Getting 32 bit C code to execute on the bare hardware is a surprisingly satisfying accomplishment.

  3. Write console output routines so that you have a printf() function.  Printing messages to the console is the fundamental kernel debugging technique.  You should also implement an assert() macro.  Put lots of assertions in your code to check that the pre- and post-conditions of the functions you write are satisfied.  Be paranoid.  An ounce of prevention is worth a pound of cure when debugging a kernel.

  4. Write a page memory allocator.  This will allow you to allocate and free memory at the granularity of single page frames.  It's not a bad idea to write a kernel heap (malloc/free) implementation at this point, too.  You can make the kernel heap separate from page allocation.

  5. Write a kernel thread library.  Thread queues keep track of threads that are either ready to run (the "ready queue") or are waiting for some event to occur ("wait queues").  Each thread will have its own stack.  Thread context switches can be implemented by saving the general purpose CPU registers on the thread stack, switching to the stack of another (suspended) thread, and then restoring CPU registers from the new thread's stack.  On a uniprocessor system, you can make a region of code atomic (uninterruptible) by disabling interrupts during the block.  This will become important eventually when your kernel has multiple threads running concurrently.  You can disable interrupts to protect important data structures from being modified/accessed by multiple threads at the same time.

  6. Implement interrupt support.  When an interrupt occurs, save the CPU registers on the stack of the interrupted thread.  Use the same stack layout for interrupts as you do for voluntary thread switches; in other words, when a thread voluntarily yields the CPU, its stack should look like it was suspended by an interrupt.  By doing this, your interrupt handling code can easily choose a new thread, because all saved threads have their registers saved the same way as the current (interrupted) thread.  In general, interrupt handlers will need a way to communicate to the thread dispatch code that a new thread should be chosen for execution.  Interrupts for hardware devices (IRQs) will probably require some extra work to handle.  You will need primitives to mask (inhibit) and unmask (allow) particular IRQs.

  7. Implement a device driver for the keyboard.  PC keyboards generate IRQ 1.  All the keyboard interrupt handler needs to do is save generated key codes to a queue.  You can use a wait queue to allow threads to wait for a key event.  When a keyboard interrupt occurs, it should wake up any threads waiting in the keyboard wait queue, to notify them of the availability of new keyboard input.  Change your "Hello, world" kernel to act as a typewriter, echoing pressed keys to the screen.

  8. Install a handler for the timer interrupt.  Use the timer interrupt to keep track of how long threads have been executing.  When a thread has executed for a certain number of ticks (the scheduling quantum), suspend it and choose a new thread to run.  This is preemptive scheduling.  Keep in mind that any part of your kernel that runs with interrupts enabled can be preempted at any time.

  9. Implement user mode processes.  A simple way to get this going is to represent user executables as flat binary images stored in arrays within the kernel data segment.  This avoids the need to write executable loading code (i.e., for ELF executables), and also the need to have a filesystem.  You can use segmentation and local descriptor tables (LDTs) on the x86 as a simple way of implementing user memory protection.  Or, you can bite the bullet and use virtual memory (paging).  Paging actually simplifies kernel development once it is working, because many invalid pointer dereferences can be trapped by the kernel instead of being silently allowed to occur.  Implement a system call interface using software interrupts.  You will need to think carefully about how data is transferred to and from user memory.  Start out by implementing system calls to print text to the screen, and to read key events from the keyboard.  Implement a typewriter as a user process.  Have it run concurrently with CPU-bound programs to demonstrate that preemptive multitasking works.

  10. Implement block device drivers.  This will probably be the most difficult task so far, mostly because block I/O devices on PCs are weird and poorly documented.  Implementing a floppy driver would be a good starting point, and a programmed-IO IDE driver would also be a good choice.  If you're really feeling lazy, implement a RAM disk driver.  RAM disk images can be loaded by the setup program, and once loaded, I/O is performed by calling memcpy().  Whatever you do, make sure your block device abstraction layer has the notion of queued requests, so that kernel threads can block while waiting for I/O to complete.

  11. Implement a filesystem.  Microsoft's FAT filesystem might be a good starting point, if only because it's fairly well documented and is supported by pretty much every known OS.  You will want to design a virtual filesystem (VFS) layer to allow multiple filesystem implementations, as well as interoperability of files and interprocess communication mechanisms.

  12. Implement an executable loader.  Allow execution of programs stored in ELF or PECOFF format on a filesystem.  If your user mode memory protection uses paging, you can demand-page the executable into memory.

  13. Implement pipes and a console device.  Now you can start implementing some simple user mode tools, such as a command interpreter (shell), file copy program, etc.

As part of work I'm doing on GeekOS, the only remaining steps I have to complete are filesystem, console I/O, and pipes.  Once those are done, I'm hoping to get a respectable collection of user programs ported, and eventually get the system to the point where it is self hosting.

25 August 2003.  I think the world would be a better place if everyone owned a Gang of Four album.  First, it would throw the utter soullessness of the current music scene into sharp relief.  Second, the need for relief from the relentness corruption of human values by government and industry has never been greater.

What we think
Changes how we act
"Why Theory?", from Solid Gold

22 August 2003.  My personal philosophy:

Don't make things more complicated than they need to be.
It works as well in real life as it does in programming.

20 August 2003.  Dear SCO, I use Linux.  Go ahead and sue me.  My contact information can be found here.  Signed, David Hovemeyer.

15 August 2003.  Song lyric of the day:

Got a call last month from an old friend
Said they're hiring on the road down in old South Bend
The ad said "lookin' for a few good men"
It seems I wasn't good enough for them
"South Bend", Waco Brothers

2 August 2003.  You're designing a GUI tooklit, and you're deciding how hard it should be to customize the text color used in the items in a tree widget.  Should it be

  1. Easy?
  2. Moderately difficult?
  3. Extremely difficult?
If you answered "Extremely difficult", you may have been involved in the design of Java's Swing toolkit.

29 July 2003.  The correct place to listen to the album Republic by New Order is the state of New Jersey.  If you must listen to it somewhere else, try to enter a New Jersey frame of mind first.  Spooky!

10 July 2003.  I was born to program.  By this, I mean that I enjoy it a lot, am pretty good at it, and would like to get even better at it.  Seeing poorly written code makes me upset---you get the idea.  What I would like to know how programming skill fits in to human psychology.  I don't do it to impress anyone other than myself, so it doesn't help me advance socially.  (As a nerdy kid, I was sometimes ridiculed for my love of computers.)

Lately I've been reading The Moral Animal by Robert Wright (a fascinating book, BTW) and it seems that most human behavior is driven, ultimately, by behaviors that in ancestral times would be most likely to progagate one's genes.  Computer programming does not seem to fit any reasonable criteria for a behavior that is likely to lead to reproductive success.  It is generally a solitary activity, and so does not tend to elevate social status.  Maybe that's precisely the point, though.  The tendency for hackers to be shy and withdrawn is well established.  Perhaps programming is a means to excel at something when social advancement through more ordinary means isn't available.

Lately, though, society in general has gotten a bit more receptive to the whole hacker culture.  So, perhaps we're due for a new generation of more socially well-adjusted hackers.  Anyway, who cares.  I don't need any reason to enjoy what I do.

9 July 2003.  It struck me today that 802.11 ("Wi-Fi") networking has created a social transformation that, in a subtle way, is almost as profound as the internet itself.  Specifically, browsing the web is now a social activity, and at times, practically a team sport.  In my house, everyone tends to congregate in the living room with their laptops.  Sometimes it's to work, but just as often, it's to look for cool stuff on Ebay, be publically outraged by the latest politcal developments in the Post, research obscure facts, etc.  This phenomenon has even managed to unseat the television and the Game Cube as the locus of social activity.

8 July 2003.  I'm sure this has been said before, but slang is basically a means of social group identification, and therefore, exclusion.  That's why I don't use slang.  Well, actually, I do use obsolete slang for the entertainment value.  Word!

Why the somber color scheme?