High-level languages, for the most part, try to make you as unaware of the hardware as possible. Clearly, this isn't entirely true, because efficiency is still a major consideration for some programming languages.

C, in particular, was created to make it easier to write operating systems. Rather than write UNIX in assembly, which is slow process (because assembly code is tedious to write), and not very portable (because assembly is specific to an ISA), the goal was to have a language that provided good control-flow, some abstractions (structures, function calls), and could be efficiently compiled and run quickly.

Writing operating systems requires the manipulation of data at addresses, and this requires manipulating individual bits or groups of bits.

That's where two sets of operators are useful: *bitwise*
operators and *bitshift* operators.

You can find these operators in C, C++, and Java (and presumably C#, since it's basically Java). Bitwise operators allow you to read and manipulate bits in variables of certain types.

Even though such features are available in C, they aren't often taught in an introductory level programming course. That's because intro level courses prefer to emphasize abstraction. With many departments using Java, there's a trend to increase what's abstract, and not get into the representation.

For example, some languages have support for stacks, queues, hashtables, and so forth. These "canned" data structures are meant to provide you, the programmer, with objects that perform certain tasks, while relieving you of the tedium and detail of understanding how the data is represented.

Nevertheless, if you intend to do some work in systems programming, or other forms of low-level coding (operating systems, device drivers, socket programming, network programming), knowing how to access and manipulate bits is important.

It turns out there's more than one kind of **int**. In
particular, there's **unsigned int**, there's **short int**,
there's **long int**, and then unsigned versions of those ints.

The "C" language does not specify the difference between a short int, an int and a long int, except to state that:

sizeof( short int ) <= sizeof( int ) <= sizeof( long )You will find that these sizes vary from ISA to ISA, and possibly even compiler to compiler. The sizes do not have to be distinct. That means all three sizes could be the same, or two of three could be the same, provided that the above restrictions are held.

Bitwise operators fall into two categories: binary bitwise operators and unary bitwise operators. Binary operators take two arguments, while unary operators only take one.

We're going to discuss binary operators first, and we'll do
in the context of a fake binary operator called @. Thus, we
write **x @ y** to perform bitwise @ on **x** and **y**.

Bitwise operators, like arithmetic operators, do not change
the value of the arguments. Instead, a temporary value is created.
This can then be assigned to a variable. For example, when you
write **x + y**, neither **x** nor **y** have their value
changed.

Let's assume that we are doing bitwise @ on two unsigned int
variables. Let's assume that unsigned ints use 32 bits of memory.
We define two variables **X** and **Y** where

X = xWe want to be able to refer to each bit of_{31}x_{30}....x_{0}Y = y_{31}y_{30}....y_{0}

The subscripts follow the convention we've used so far. The least
significant bit has a subscript of 0, and is the rightmost bit. The
most significant bit has a subscript of **N-1** (for N bits), and
is the leftmost bit. In this case **N = 32** so the MSb has a subscript
of 31.

Let's define @^{1} to be the @ operation performed on two
individual bits. @ is performed on all 32 bits simultaneously.

Furthermore, let's call the temporary value created, **T**.
Normally, this temporary value does not have a name. It is generated
while the program is running, and used in other computations. Thus,
when you write **c = a + b**, a temporary value is created for the
sum **a + b**, and this temporary value then gets copied into
**c**.

Of course, for efficiency, the temporary value might not be
generated, and the value written directly to **c**. For more
complex statements, such as **c = (a + b) - d**, temporary values
are often created to store intermediate results. These temporary
values are created by the runtime system, and are not given variable
names.

We define **Z = X @ Y** as:

zIn words, to compute the_{i}= x_{i}@^{1}y_{i}, for0 <= i <= N - 1

x _{i} |
y _{i} |
x _{i} &^{1} y_{i} |

0 | 0 | 0 |

0 | 1 | 0 |

1 | 0 | 0 |

1 | 1 | 1 |

We can do an example of bitwise &. It's easiest to do this on 4 bit numbers, however.

Variable | b _{3} |
b _{2} |
b _{1} |
b _{0} |

x |
1 | 1 | 0 | 0 |

y |
1 | 0 | 1 | 0 |

z = x & y |
1 | 0 | 0 | 0 |

x _{i} |
y _{i} |
x _{i} |^{1} y_{i} |

0 | 0 | 0 |

0 | 1 | 1 |

1 | 0 | 1 |

1 | 1 | 1 |

We can do an example of bitwise |. It's easiest to do this on 4 bit numbers, however.

Variable | b _{3} |
b _{2} |
b _{1} |
b _{0} |

x |
1 | 1 | 0 | 0 |

y |
1 | 0 | 1 | 0 |

z = x | y |
1 | 1 | 1 | 0 |

The following is a chart that defines ^^{1}, defining XOR
on individual bits.

x _{i} |
y _{i} |
x _{i} ^^{1} y_{i} |

0 | 0 | 0 |

0 | 1 | 1 |

1 | 0 | 1 |

1 | 1 | 0 |

We can do an example of bitwise ^. It's easiest to do this on 4 bit numbers, however.

Variable | b _{3} |
b _{2} |
b _{1} |
b _{0} |

x |
1 | 1 | 0 | 0 |

y |
1 | 0 | 1 | 0 |

z = x ^ y |
0 | 1 | 1 | 0 |

There's not that much to say about it, other than it's not the same operation as unary minus.

The following is a chart that defines ~^{1}, defining NOT
on an individual bit.

x _{i} |
~ ^{1} x_{i} |

0 | 1 |

1 | 0 |

We can do an example of bitwise ~. It's easiest to do this on 4 bit numbers (although only 2 bits are necessary to show the concept).

Variable | b _{3} |
b _{2} |
b _{1} |
b _{0} |

x |
1 | 1 | 0 | 0 |

z = ~x |
0 | 0 | 1 | 1 |

A 32-bit int can be used to stored 32 Boolean variables. Normally, the minimum size for one Boolean variable is one byte. All types in C must have sizes that are multiples of bytes. However, only one bit is necessary to represent a Boolean value.

You can also use bits to represent elements of a (small) set. If a
bit is 1, then element **i** is in the set, otherwise it's not.

You can use bitwise AND to implement set intersection, bitwise OR to implement set union.

In upcoming notes, you'll learn how to find the value of individual
bits of a **char** or **int**.

Most built-in binary operators do not modify the values of the arguments. This applies to logical operators too. They don't modify their arguments.

There are operators that do assignment such as +=, -=, *=, and so on. They apply to logical operators too. For example, |=, &=, ^=. Nearly all binary operators have a version with = after it.