Incremental Java
What is Programming?

What is programming? What's it all about? There are many ways of describing what programming is. Each definition looks at a different aspect of programming.

Simply put, programming is telling the computer what to do. This is done through a series of instructions or commands.

You write these instructions using a programming language. Maybe you remember taking an English class, where you were taught "correct" grammar. Programming languages also have a grammar, and you must follow it very strictly. Otherwise, your program won't work.

Programming languages fall into two categories: compiled and interpreted. Languages like Pascal, FORTRAN, C, C++, Java, and C# are usually compiled.

Compiling is another word for "translating". Computers run a language called assembly language (more accurately, machine code). Assembly language is said to be low-level. Low-level means it's basically what the CPU does. CPUs run assembly language instructions. Each instruction in an assembly language is very simple, too simple to program with easily (even though people programmed in assembly much more often in the 1970s).

The set of instructions in an assembly language is called an instruction set. Manufacturers of central processing units (such as Intel and AMD) build chips that can run all the instructions of an instruction set. Unfortunately, there are many different instruction sets. IA32 is the instruction set for Intel and AMD chips. There's an instruction set for PowerPC chips (which Macs run on). There's an instruction set for Sparc chips (which Sun workstations run on).

These are not compatible. This means a program written in assembly language that runs on a PC won't run on a Mac (well, it can be emulated) or vice versa.

It's just like going to Japan: you can't speak English and expect to be fully understood, and if people speak Japanese to you, you may not understand them. (Emulation is like having a translator translate what you say.)

Or to use a more technological example, you can't use a Betamax video tape (a competitor to VHS video tape in the 1980s, developed by Sony, which was about 2/3 the size of a VHS tape) in a VHS machine or vice versa.

Why is this a problem?

Suppose you've worked many hours to write a great Tetris program. You've written it in some assembly language on a Sparc CPU. You want to run this program on a PC which runs a different assembly language. It won't work! You've worked really hard, and the program only runs on a Sparc? Why can't these programs run anywhere?

One solution to this problem is to translate the program from the Sparc instruction set to the Intel instruction set. This can work.

However, it's not a great solution. The biggest reason is inconvenience. It's painful (for most of us) to program in assembly. Many of us use cars (or buses, at least) to get where we need to go. If we had to walk, we'd get a lot less done. A ten-minute drive may be an hour's walk.

Similarly, we can write a program in assembly language in 1000 lines, but do the same work in 100 lines in a high-level language.

That's why we'll learn Java. Java is a high-level language. High-level languages let you think about programs in a more powerful way. However, this comes at a price. You can do more, in less time, but a high-level language is more complicated to learn.

Think about the difference between a word processor and a typewriter. Which is easier to learn? A typewriter is easier. But it does a lot less. It can't spell check. It can't do a search and replace. It can't create backups. You can't write italicized lines. You can't change fonts. A word processor can do all of these things, but you pay a price: you must learn how to get to all of these features in the word processor program. Most people aren't afraid of computers, and are willing to learn, but you still have to learn.

Java (like other high-level languages) is like a word processor. It's powerful, but it's harder to learn. On the other hand, it's usually far less tedious to write a Java program than to write an assembly language program.
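To give a feel for how compact a high-level program can be, here is a minimal sketch in Java that adds the whole numbers from 1 to 100. The class name SumExample and the method name sumUpTo are made up for this illustration. In assembly language, each small step (fetching a value, adding, comparing, jumping back to repeat) would typically be its own instruction.

```java
// A small sketch: add up the whole numbers from 1 to n.
// SumExample and sumUpTo are hypothetical names chosen for this example.
class SumExample {
    static int sumUpTo(int n) {
        int total = 0;
        // One readable loop replaces a series of low-level
        // load, add, compare, and jump instructions.
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumUpTo(100)); // prints 5050
    }
}
```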

To run on a computer, Java is translated or compiled to assembly language, which is then run on the CPU (because CPUs can only run assembly language).

(This isn't really true. Java is compiled to something called bytecode. Bytecode is something like an assembly language, but it's not the real assembly language of any CPU. The Java Virtual Machine then "runs" this fake assembly language, effectively translating it to a real assembly language. Why is it done this way? One reason is that the same bytecode can run on any computer that has a Java Virtual Machine, whether it's a PC, a Mac, or a Sparc, which takes care of the Tetris problem above. Another is that companies want to provide you Java programs without letting you see how they did it. Bytecode is hard to read, but easy to run. So they send you bytecode instead.)
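As a rough illustration of the compile-then-run steps just described, here is a minimal sketch. The class name Hello and the method name greeting are made up for this example. The javac and java commands mentioned in the comments are the standard tools that come with the Java Development Kit; how you start them depends on your setup.

```java
// Save this in a file named Hello.java, then, from a command line:
//   javac Hello.java   compiles the source into bytecode (a file named Hello.class)
//   java Hello         asks the Java Virtual Machine to run that bytecode
// Hello and greeting are hypothetical names chosen for this example.
class Hello {
    static String greeting() {
        return "Hello, world!";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

If you open Hello.class in a text editor, you'll mostly see unreadable characters. That's the bytecode: hard for people to read, but easy for the Java Virtual Machine to run.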

Other Definitions

One definition of programming is running instructions. Java lets you write more powerful "instructions".

Another definition of a program is that it manipulates data. To give you an idea of what this means, imagine you have a spreadsheet. A spreadsheet consists of cells which hold data. You can add numbers in a column, or pick the largest number, or find numbers in a range.

You can think of a program as something that looks up data in cells, does some computation, then changes the values in other cells.

This view of programming is about managing the data in many cells.
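Here is a small sketch of this view, assuming a column of cells is modeled as a Java array of whole numbers. The class name Cells and its method names are made up for this example. The program looks up data in one set of cells, does some computation, and stores the answers in other cells.

```java
// A sketch of the "cells" view of programming.
// Cells, sum, largest, and countInRange are hypothetical names.
class Cells {
    // Adds up every cell in the column.
    static int sum(int[] cells) {
        int total = 0;
        for (int cell : cells) {
            total += cell;
        }
        return total;
    }

    // Picks the largest number (assumes the column has at least one cell).
    static int largest(int[] cells) {
        int best = cells[0];
        for (int cell : cells) {
            if (cell > best) {
                best = cell;
            }
        }
        return best;
    }

    // Counts the cells whose value is between low and high, inclusive.
    static int countInRange(int[] cells, int low, int high) {
        int count = 0;
        for (int cell : cells) {
            if (cell >= low && cell <= high) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] column = {4, 9, 2, 7};   // the data cells
        int[] results = new int[3];    // other cells that receive the answers

        results[0] = sum(column);
        results[1] = largest(column);
        results[2] = countInRange(column, 3, 8);

        System.out.println(results[0]); // prints 22
        System.out.println(results[1]); // prints 9
        System.out.println(results[2]); // prints 2
    }
}
```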

Another way to think about programs is to think of them as mathematical functions. You might have a function that adds a bunch of numbers. Or another one that finds their average. Or finds the maximum. A program is basically a function which can itself use other functions. For example, you may have square(double(x)). This doubles the value of x, then squares it.
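The square(double(x)) idea can be sketched in Java as follows. One wrinkle: double is a reserved word in Java, so this sketch uses the made-up name doubleOf instead. The class name Functions and the method name squareOfDouble are also just chosen for this example.

```java
// A sketch of the "functions" view of programming.
// Functions, doubleOf, and squareOfDouble are hypothetical names;
// "double" itself is a reserved word in Java and can't be a method name.
class Functions {
    static int doubleOf(int x) {
        return 2 * x;
    }

    static int square(int x) {
        return x * x;
    }

    // A function built out of other functions: double x first, then square.
    static int squareOfDouble(int x) {
        return square(doubleOf(x));
    }

    public static void main(String[] args) {
        System.out.println(squareOfDouble(3)); // prints 36
    }
}
```

With x equal to 3, doubling gives 6, and squaring 6 gives 36.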

Many different people have come up with many different programming languages. You may have to think about programming in a very different way depending on which language you use.

Why Java?

There are probably a dozen languages popular enough that at least a million people use each one and know it somewhat well. Why pick Java?

Java is an object-oriented language. Objects are an idea that has been around since at least the 1970s. However, it wasn't until C++ was used somewhat widely in the late 1980s that many computer science departments started teaching object-oriented languages.

C++ is a very complex language. It's not hard to get through the basics, but fully understanding it can be extremely difficult. Worse still, C++ had many features that were in the language, but you couldn't get a compiler that supported them! It's like a car manufacturer that says its cars should fly, but none of them do. It was a kind of false advertising. Compiler writers eventually produced compilers that could manage these new features, but often the features didn't work well.

The folks who developed Java decided C++ was too messy, and they wanted to get rid of the nastier aspects of C++ and keep features they liked, and then add features that they felt would be better than C++. Many things make Java easier to learn. A few make Java harder to learn.

A very similar language to Java is C#. This language was developed at Microsoft. At one point, Microsoft did some development on Java. However, Sun Microsystems originated and controlled Java, and eventually wanted Microsoft to stop its work on Java. So Microsoft made its own language, C#, which is very similar to Java. It should be fairly easy to learn C# if you know Java.

There are some languages that are comparatively obscure. Scheme is a language that first-year students at M.I.T. use. Scheme is nice and easy to use, which is surprising, considering it is taught at M.I.T. Scheme is a kind of functional language. For many years, people believed functional languages were too slow for the real world. Unfortunately, you need a lot of hype to convince industry to use a programming language, and Scheme never had that kind of hype.

O'Caml and ML are also functional programming languages (they are very similar to each other). These languages use functions as the basis of programming.

There are a bunch of scripting languages such as Perl, Python, and Ruby. Some of these are very powerful to use, but can be messy (especially Perl) to read. Scripting languages were once thought to be much less powerful than languages like Java. However, each of the three scripting languages listed above is powerful enough to write substantial programs.

These languages are usually interpreted. It's a little difficult to explain what it means for something to be interpreted.

Basically, when you write a program in Java, you type in many lines of code. Then, you must compile it. If there are errors, you must correct them, and then compile again. This process can be slow.

Interpreted languages usually have an interactive programming environment. You can type one line at a time, and then see what each line does. You don't need the whole program to get some results. Interpreted programs often run more slowly than compiled programs, but they offer the convenience of getting results with less typing.

Python, in particular, has an interactive, interpreted environment.

The closest analogy I can think of is being able to tell a chef to cook something small (e.g., boil water, add vegetables, put in eggs, stir). You give individual commands and the chef responds. This is basically interpreting.

On the other hand, you might give an entire recipe to the chef. However, the recipe has to have the right format, and all the directions must already be written down. This is basically like compiling.

Compiling usually means you need a complete program to run, while interpreting can work with a partial program, and "remembers" things you've typed in. Interpreters allow you to play around and get results more quickly, but interpreted programs usually run more slowly than compiled ones.
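To make the idea of an interpreter that "remembers" things a little more concrete, here is a toy sketch written in Java. It is not how any real interpreter is built. The class name TinyInterpreter and the tiny two-command language it understands are invented for this example. Each line is run as soon as it is handed over, like giving the chef one command at a time.

```java
import java.util.HashMap;
import java.util.Map;

// A toy line-at-a-time interpreter (hypothetical, for illustration only).
// It understands two kinds of lines:
//   name = number    remembers a value under that name
//   a + b            adds two remembered names or whole numbers
class TinyInterpreter {
    // The interpreter's memory of what has been typed in so far.
    private final Map<String, Integer> variables = new HashMap<>();

    // Runs a single line right away and returns its result.
    int run(String line) {
        if (line.contains("=")) {
            String[] parts = line.split("=");
            int value = valueOf(parts[1].trim());
            variables.put(parts[0].trim(), value);
            return value;
        }
        if (line.contains("+")) {
            String[] parts = line.split("\\+");
            return valueOf(parts[0].trim()) + valueOf(parts[1].trim());
        }
        return valueOf(line.trim());
    }

    // Looks up a remembered name, or else reads the text as a whole number.
    private int valueOf(String text) {
        if (variables.containsKey(text)) {
            return variables.get(text);
        }
        return Integer.parseInt(text);
    }

    public static void main(String[] args) {
        TinyInterpreter chef = new TinyInterpreter();
        System.out.println(chef.run("x = 5"));  // prints 5
        System.out.println(chef.run("y = 7"));  // prints 7
        System.out.println(chef.run("x + y"));  // prints 12
    }
}
```

Notice that the third line only works because the interpreter remembered x and y from the earlier lines. There was never a complete program, just one command after another.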