High-level languages, for the most part, try to make you as unaware of the hardware as possible. Clearly, this isn't entirely true, because efficiency is still a major consideration for some programming languages.
C, in particular, was created to make it easier to write operating systems. Rather than write UNIX in assembly, which is slow process (because assembly code is tedious to write), and not very portable (because assembly is specific to an ISA), the goal was to have a language that provided good control-flow, some abstractions (structures, function calls), and could be efficiently compiled and run quickly.
Writing operating systems requires the manipulation of data at addresses, and this requires manipulating individual bits or groups of bits.
That's where two sets of operators are useful: bitwise operators and bitshift operators.
You can find these operators in C, C++, and Java (and presumably C#, since it's basically Java). Bitwise operators allow you to read and manipulate bits in variables of certain types.
Even though such features are available in C, they aren't often taught in an introductory level programming course. That's because intro level courses prefer to emphasize abstraction. With many departments using Java, there's a trend to increase what's abstract, and not get into the representation.
For example, some languages have support for stacks, queues, hashtables, and so forth. These "canned" data structures are meant to provide you, the programmer, with objects that perform certain tasks, while relieving you of the tedium and detail of understanding how the data is represented.
Nevertheless, if you intend to do some work in systems programming, or other forms of low-level coding (operating systems, device drivers, socket programming, network programming), knowing how to access and manipulate bits is important.
It turns out there's more than one kind of int. In particular, there's unsigned int, there's short int, there's long int, and then unsigned versions of those ints.
The "C" language does not specify the difference between a
short int, an int and a long int, except to state that:
sizeof( short int ) <= sizeof( int ) <= sizeof( long )
You will find that these sizes vary from ISA to ISA, and possibly
even compiler to compiler. The sizes do not have to be distinct.
That means all three sizes could be the same, or two of three could
be the same, provided that the above restrictions are held.
Bitwise operators fall into two categories: binary bitwise operators and unary bitwise operators. Binary operators take two arguments, while unary operators only take one.
We're going to discuss binary operators first, and we'll do in the context of a fake binary operator called @. Thus, we write x @ y to perform bitwise @ on x and y.
Bitwise operators, like arithmetic operators, do not change the value of the arguments. Instead, a temporary value is created. This can then be assigned to a variable. For example, when you write x + y, neither x nor y have their value changed.
Let's assume that we are doing bitwise @ on two unsigned int
variables. Let's assume that unsigned ints use 32 bits of memory.
We define two variables X and Y where
X = x31x30....x0
Y = y31y30....y0
We want to be able to refer to each bit of X and Y, which
we do so by writing out the bits by writing the variable name in lowercase
and adding the appropriate subscript for thie bit number.
The subscripts follow the convention we've used so far. The least significant bit has a subscript of 0, and is the rightmost bit. The most significant bit has a subscript of N-1 (for N bits), and is the leftmost bit. In this case N = 32 so the MSb has a subscript of 31.
Let's define @1 to be the @ operation performed on two individual bits. @ is performed on all 32 bits simultaneously.
Furthermore, let's call the temporary value created, T. Normally, this temporary value does not have a name. It is generated while the program is running, and used in other computations. Thus, when you write c = a + b, a temporary value is created for the sum a + b, and this temporary value then gets copied into c.
Of course, for efficiency, the temporary value might not be generated, and the value written directly to c. For more complex statements, such as c = (a + b) - d, temporary values are often created to store intermediate results. These temporary values are created by the runtime system, and are not given variable names.
We define Z = X @ Y as:
zi = xi @1 yi, for 0 <= i <= N - 1
In words, to compute the ith bit of Z, you
should perform the operation on the ith bit of
X and ith bit of Y.
|xi||yi||xi &1 yi|
We can do an example of bitwise &. It's easiest to do this on 4 bit numbers, however.
|z = x & y||1||0||0||0|
|xi||yi||xi |1 yi|
We can do an example of bitwise |. It's easiest to do this on 4 bit numbers, however.
|z = x | y||1||1||1||0|
The following is a chart that defines ^1, defining XOR on individual bits.
|xi||yi||xi ^1 yi|
We can do an example of bitwise ^. It's easiest to do this on 4 bit numbers, however.
|z = x ^ y||0||1||1||0|
There's not that much to say about it, other than it's not the same operation as unary minus.
The following is a chart that defines ~1, defining NOT on an individual bit.
We can do an example of bitwise ~. It's easiest to do this on 4 bit numbers (although only 2 bits are necessary to show the concept).
|z = ~x||0||0||1||1|
A 32-bit int can be used to stored 32 Boolean variables. Normally, the minimum size for one Boolean variable is one byte. All types in C must have sizes that are multiples of bytes. However, only one bit is necessary to represent a Boolean value.
You can also use bits to represent elements of a (small) set. If a bit is 1, then element i is in the set, otherwise it's not.
You can use bitwise AND to implement set intersection, bitwise OR to implement set union.
In upcoming notes, you'll learn how to find the value of individual bits of a char or int.
Most built-in binary operators do not modify the values of the arguments. This applies to logical operators too. They don't modify their arguments.
There are operators that do assignment such as +=, -=, *=, and so on. They apply to logical operators too. For example, |=, &=, ^=. Nearly all binary operators have a version with = after it.