The Thumb Extension

The ARM processor has found its greatest uses in embedded systems, hand-held computers, and other systems that must do a lot of work with limited resources. The Thumb extension was created to address some of these resource constraints, and it has become a standard extension on nearly all ARM chips produced today.

One of the resources that is limited on small systems is instruction memory. Limited instruction memory limits the size of the program you can run on your processor, so you want to look for ways to reduce the size of your code. Compile-time optimizations are one obvious way to achieve this, when such optimizations can be found. Increasing the size of the instruction set is another way to do it, since a richer set can express a program in fewer instructions. However, a larger instruction set normally makes every individual instruction wider, which increases the storage needed per instruction. That increase may not be offset by the reduction in the number of instructions needed to write the program.

The quality we're looking for is called code density: doing the same amount of work with a program that takes up less space. This is where the Thumb extension comes in. Thumb tries to get the best of both worlds by keeping the large (32-bit) instruction set while providing an alternate, small (16-bit) instruction set that can do the bulk of the work with instructions that take up only half the space. This concept is called "code compression", the idea being that the small Thumb instructions are "decompressed" to their equivalent full-size 32-bit ARM instructions before they are run.
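The trade-off can be put into numbers with a small sketch. The function names and the instruction counts in the usage note are hypothetical, chosen only to illustrate the arithmetic; they are not measurements of any real program. The sketch assumes every ARM instruction occupies 4 bytes and every Thumb instruction occupies 2 bytes.

```python
ARM_BYTES = 4    # a 32-bit ARM instruction
THUMB_BYTES = 2  # a 16-bit Thumb instruction


def code_size(instruction_count, bytes_per_instruction):
    """Bytes of instruction memory needed for a routine."""
    return instruction_count * bytes_per_instruction


def thumb_saving(arm_count, thumb_count):
    """Fraction of space saved by the Thumb version of a routine.

    arm_count and thumb_count are the number of instructions the same
    routine needs in each instruction set. A negative result means the
    Thumb version is larger.
    """
    arm_size = code_size(arm_count, ARM_BYTES)
    thumb_size = code_size(thumb_count, THUMB_BYTES)
    return (arm_size - thumb_size) / arm_size
```

If a routine needs the same number of instructions in both sets, thumb_saving(100, 100) gives 0.5, the "half the space" case. If the Thumb version needs twice as many instructions, thumb_saving(100, 200) gives 0.0 and nothing is gained.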

What do Thumb instructions look like, and how do they compare to their 32-bit counterparts?

The illustration below shows an example of how the ADD instruction is converted from Thumb to ARM. Notice how the immediate operand, 8 bits in Thumb, is padded with zeroes in its ARM equivalent. Note also that the ADD instruction takes an additional operand when decompressed: the two-operand Thumb form becomes a three-operand ARM form, with the destination register repeated as the first source register.
(Illustration from Kang, Bong-Ho's paper.)
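The same expansion can be sketched in code. This is a minimal model, not a description of the actual decoder hardware, and the function name is hypothetical. It assumes the original Thumb encoding of ADD Rd, #imm8 (top five bits 00110, then a 3-bit register number, then the 8-bit immediate) and expands it to the ARM instruction ADDS Rd, Rd, #imm8.

```python
def thumb_add_imm_to_arm(thumb):
    """Expand 16-bit Thumb 'ADD Rd, #imm8' to 32-bit ARM 'ADDS Rd, Rd, #imm8'."""
    if not 0 <= thumb <= 0xFFFF:
        raise ValueError("a Thumb instruction is 16 bits wide")
    if thumb >> 11 != 0b00110:
        raise ValueError("not a Thumb ADD Rd, #imm8 encoding")

    rd = (thumb >> 8) & 0x7   # 3-bit register field: r0 through r7 only
    imm8 = thumb & 0xFF       # 8-bit immediate operand

    cond_always = 0xE << 28   # ARM condition field: always execute
    adds_imm = 0x029 << 20    # immediate operand, ADD opcode, set flags

    # Rd is written twice: once as the first source (bits 19-16) and once
    # as the destination (bits 15-12). The rotate field (bits 11-8) stays
    # zero, so the immediate is simply padded with zeroes.
    return cond_always | adds_imm | (rd << 16) | (rd << 12) | imm8
```

For example, hex(thumb_add_imm_to_arm(0x3001)) gives '0xe2900001': the Thumb instruction ADD r0, #1 becomes the ARM instruction ADDS r0, r0, #1.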

A smaller instruction means you must have smaller opcodes, and fewer or smaller (or both) operands. Thumb ensures smaller operands in part by restricting most of its instructions to 8 general-purpose registers (r0 through r7) in place of the usual 15, so a register number fits in 3 bits. A few instructions, such as MOV, can access the full register set, which makes it possible to work around some of the limitations of the smaller register set.
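As an illustration of how a 16-bit instruction can still reach the full register set, here is a sketch of the high-register form of MOV in the original Thumb encoding. The function name is hypothetical. Each 4-bit register number is split into a one-bit "high" flag and a 3-bit field, and the sketch assumes that at least one of the two registers must be a high register (r8 through r15) for this form.

```python
def thumb_mov_hi(rd, rs):
    """Encode Thumb 'MOV Rd, Rs' where at least one register is r8-r15."""
    if not (0 <= rd <= 15 and 0 <= rs <= 15):
        raise ValueError("register numbers must be in the range 0-15")
    if rd < 8 and rs < 8:
        raise ValueError("this form needs at least one high register")

    h1 = rd >> 3   # 1 if the destination is a high register
    h2 = rs >> 3   # 1 if the source is a high register

    # 0x4600 carries the fixed bits of the format and the MOV opcode.
    # The low three bits of each register number go in the usual 3-bit fields.
    return 0x4600 | (h1 << 7) | (h2 << 6) | ((rs & 0x7) << 3) | (rd & 0x7)
```

For example, hex(thumb_mov_hi(0, 8)) gives '0x4640', which is MOV r0, r8: a value held in a high register is copied into a low register where the rest of the Thumb instructions can work on it.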

How does the ARM know whether the instruction it's running is a Thumb instruction or a regular ARM instruction?

The ARM chip contains a special state bit that tells the CPU whether to expect a compressed Thumb instruction or a standard ARM instruction. This bit is changed with its own instruction, BX (Branch and Exchange), which must be inserted into the code every time a programmer or compiler wishes to switch between Thumb mode and standard ARM mode. An obvious result of this is that there is some overhead to switching between modes, so it is probably not a good idea to switch to Thumb unless it will save you more than two instructions of equivalent ARM code.
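The effect of BX can be modelled with a short sketch. The Cpu class and its method names are hypothetical and model only the state bit and the program counter. BX branches to the address held in a register and takes the new state from the lowest bit of that address: 1 selects Thumb, 0 selects ARM. Alignment checks on the target address are left out to keep the sketch short.

```python
class Cpu:
    """Toy model of the ARM/Thumb state bit (not a full emulator)."""

    PC = 15  # r15 is the program counter

    def __init__(self):
        self.regs = [0] * 16
        self.thumb = False  # the state bit: False = ARM, True = Thumb

    def bx(self, rn):
        """Branch to the address in register rn and exchange state."""
        target = self.regs[rn]
        self.thumb = bool(target & 1)     # bit 0 selects the new state
        self.regs[self.PC] = target & ~1  # bit 0 is not part of the address

    def instruction_size(self):
        """Bytes fetched per instruction in the current state."""
        return 2 if self.thumb else 4


cpu = Cpu()
cpu.regs[0] = 0x8001   # odd address: enter Thumb state at 0x8000
cpu.bx(0)
entered_thumb = (cpu.thumb, cpu.regs[Cpu.PC], cpu.instruction_size())

cpu.regs[14] = 0x4000  # even address: return to ARM state at 0x4000
cpu.bx(14)
back_in_arm = (cpu.thumb, cpu.regs[Cpu.PC], cpu.instruction_size())
```

In this model a round trip costs two BX instructions, one to enter Thumb and one to leave it, which is the overhead the paragraph above refers to.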