Bitshift operator and bit counting
One subject I've always found a bit tricky (in particular during interviews, since it seems to be a very common question) is bit representation and bit counting. This is a rather simple stuff, even if, as we will see, it can be quite tricky if we want to go beyond the naive method.
To get into the subject, let's assume I am interviewing you and that I ask you to write a function that returns the number of pairs of set bits 1 in a given 32-bit integer. Ok, so now you're thinking: well, I have a function that takes an integer argument, easy, I simply need to loop through the bits, determine if the previous bit for any 1 was also 1 and increment my counter by one, that's it! But wait, how will I get this bit representation of my integer??
Here comes the bitshift operator to the rescue. The bitshift operators >> and << are defined in almost every programming language and basically ... shift the bits! Let's assume you have the following bit pattern: 0100110. By applying << 1 to this number will shift the bits one position to the left and lead to 1001100. Similarly, applying >> 1 would give us 0010011. Got it? (Before going on, it's important to note that the bitshift operator does not actually modify the given number. If you want to modify the number, you should use >>= and <<= instead. It is the same distinction as applying + and += to a number).
So, you can actually multiply an int by 2 by simply bit shifting once to the left. Obviously, the number has to be smaller than $2^{31}$ if we want to be able to shift bits once to the left. In case there is a 1 as the leftmost bit of the binary representation, it will simply be discarded. Similarly, the first number from the right will be filled with a 0. This pattern can change in case we are dealing with signed numbers but for the sake of simplicity, we will assume that we are dealing with unsigned integers here.
So, bit shifting the following 32-bit integer $2^{31}$ 10000000000000000000000000000000 will result in 00000000000000000000000000000000 which is simply 0.
The reason is that the largest number that one can represent with a 32-bit integer is $2^{32}-1$ or 4294967295 in case 10. By shifting bits to the left by one position for $2^{31}$, we actually multiply the integer by 2 and obtain $2^{32}$ which is to large to be represented with a 32-bit integer.
Alright, so first of all, let's introduce the naive method for bit counting.
// function that counts the number of 1 in a given integer
int numberOfOnes(unsigned int num)
{
int counter = 0 ;
while(num)
{
counter += num & 0x1 ;
num >>= 1 ;
}
return counter ;
}
This method will indeed return the number of set bits in a given integer. However, it has to loop n times, n being the position of the left-most set bit in the integer. So, in the worst-case scenario, $2^{31}$, we will have to loop 32 times in order to reach the only bit set in this integer! Let's analyze what is happening in this for loop. We first check whether the rightmost bit is 1 by bitwise comparing the integer with 0x1 (hexadecimal representation of 1). This returns another integer where each bit is set to 1 only if the corresponding bit in both number is 1.
For example, assuming we input the number 13: 001101 --> 13 in base 10 000001 --> 1 in base 10 gives 000001 --> 1 in base 10
but inputing 14: 001110 --> 14 in base 10 000001 --> 1 in base 10 gives 000000 --> 0 in base 10
So basically, "num & 0x1" returns 1 if num has a 1 on the rightmost bit and 0 if it has a 0. By incrementing the counter variable by this value, we simply add 1 every time we come across a 1 in the integer and simply add 0 otherwise. So far, we understood how to detect whether the rightmost bit is 1 or 0 but we have to loop through every bit in that integer. We could easily compare the integer with all powers of two until $2^{31}$ but we do not really want to have to loop 32 times every time, even if the number to check is 1 or 2. We would rather loop through the integer we input until we reach the leftmost bit that is set. To do that, we use the fact that we can easily right shift all bits in the integer and we keep doing that until no 1 is left in the integer (resulting in an integer of value 0 in base 10). Hence, we keep right shifting the integer by one position until it results in 0.
Similarly, we can count the number of pairs of 1 in a given integer by using the following function.
int numberOfOnePairs(unsigned int num)
{
int counter = 0 ;
bool lastWasOne = false ;
while (num)
{
if ((num & 0x1))
{
if (lastWasOne)
counter++ ;
lastWasOne = true ;
}
else
lastWasOne = false ;
num >>= 1 ;
}
return counter ;
}
A handy tool in order to understand better what's happening here is another function that simply prints the binary representation of any unsigned integer.
void printBitRepresentation(unsigned int num)
{
// sizeof returns bytes, so need to multiply by 8 to get bits
int size = sizeof(num)*8 ;
char bits[size+1] ;
for (int i = 0 ; i < size ; i++)
{
if (num & (1 << i))
bits[size-1-i] = '1' ;
else
bits[size-1-i] = '0' ;
}
bits[size] = '\0' ;
printf("%s\n", bits) ;
}
In another post, I'll talk about methods that go beyond the naive method and do not require to loop through every bit.