Stop Teaching Java Already

29 Sep 2014

An open letter to university computer science departments:

Teaching your introductory computer science courses with the Java programming language is a disservice to your students.

This is not to say that Java is a language without merit. Relative to other languages, Java is a stable, secure, high-performance (especially considering that it is garbage-collected) language. The “write once, run anywhere” promise is invaluable for anyone trying to deploy code to multiple platforms, and there is a rich collection of mature libraries.

The problem is that none of these benefits are of particular value in teaching new students how to abstractly reason about code.

Consider some of the disadvantages of Java, relative to other languages:

No read-eval-print loop (due to ahead-of-time compilation)
Verbose syntax
A non-unified (due to primitive types), complex class hierarchy
A lack of first-class functions and closures
A manifest type system weaker than full Hindley-Milner

It’s not the case that these disadvantages are never worth accepting (presumably in exchange for some other language benefit, as part of a well-reasoned trade-off). But each of these disadvantages is a significant barrier for new students trying to learn the basic, abstract concepts involved in programming.

Scheme would let students explore first-class functions, recursion and syntactic macros early on and is simple enough for even first-year students to implement a meta-circular interpreter.

Haskell would let students explore first-class functions, recursion and a type system that allows programmers to express and enforce non-trivial correctness constraints in code.

Standard ML would be about the same as Haskell, but cuts out the monadic I/O and non-strict evaluation.

All of these languages offer simpler alternatives to Java’s verbose syntax and complicated class hierarchy, which virtually require programmers to use an IDE for development.

Note that idiomatically managing a Cartesian point in 2D space requires a class definition, two private fields, and four methods to read and update what is, fundamentally, an ordered pair of floating-point numbers.

Actually printing out that point requires another class with a main function as the entry point of the program, and a compilation step.
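For comparison, here is a minimal sketch of the same point in Python, assuming a plain tuple is an acceptable representation. The names `point`, `moved` and `describe` are illustrative only and do not come from any particular course.

```python
# A 2D point as an ordered pair of floats: no class, private fields,
# or accessor methods are needed to hold two numbers.
point = (1.5, -2.0)


def describe(p):
    """Return a readable string for a 2D point (hypothetical helper)."""
    x, y = p  # tuple unpacking reads both coordinates at once
    return f"({x}, {y})"


# Tuples are immutable, so "updating" a point builds a new pair.
moved = (point[0] + 1.0, point[1])

print(describe(point))
print(describe(moved))
```

The whole program is a single file with no separate entry-point class, and the same lines can be typed one at a time at the interactive prompt.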

An IDE can help manage the complexity, but does not remove any of it. All of the code is still there, and all of it has to be correct. Whether a new student fully understands what is happening is questionable. “F5” in Eclipse becomes the magic “make it run” button.

The lack of a read-eval-print loop may be the most serious defect of all. It is not uncommon for confused students to ask their TA, “What happens when I run this line of code?” The problem is not that students are lazy (granted, many are, but that’s a separate issue). The problem is that in a language like Java, asking an authority what is happening in a piece of code is easier than actually running it and experimenting enough to find out for themselves.

What this all adds up to is an introductory course that produces students who can shuffle enough sections of code around to (eventually) make a program work, at the cost of turning the details into an opaque mess. The analogy here is training an apprentice artisan or engineer to connect different parts together to build a system, without teaching the apprentice what their tools are doing or how those tools work.

It would be disingenuous to suggest that the other programming languages mentioned are without their own set of disadvantages. But for the purposes of teaching new students how to code, Python is a strictly superior alternative to Java. A typical list of concepts to be taught in an introductory computer science course might look like the following:

Programming Basics: Variables, Operators, Expressions, Statements, Methods
Text Input/Output
Conditionals
Loops
Testing and Debugging
Arrays
Polymorphism

All of these can be taught in Python with minimal changes to the course. Switching to Python gives a simpler, cleaner syntax, a read-eval-print loop and true closures.
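As a rough illustration of what closures look like in Python, consider the sketch below. The names `make_counter` and `increment` are made up for this example; the point is only that a function can be created at run time, returned like any other value, and keep access to the variables of the scope that created it.

```python
def make_counter(start=0):
    """Return a function that remembers and updates its own running count."""
    count = start

    def increment(step=1):
        # 'nonlocal' lets the inner function rebind the enclosing variable.
        nonlocal count
        count += step
        return count

    # The function itself is the return value: functions are first-class.
    return increment


counter = make_counter(10)
first = counter()    # count goes from 10 to 11
second = counter(5)  # count goes from 11 to 16

# Each call to make_counter creates an independent closure.
other = make_counter()
third = other()      # starts from 0, so this is 1

print(first, second, third)
```

A student can paste `make_counter` into the interactive prompt and call the result a few times to see the captured state change, with no class or compilation step involved.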

Switching to Python loses nothing.

Teaching in Java makes it artificially difficult to learn computer science. In a very real sense, any curriculum that continues to do so is failing its students.