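inline int countBits (int i)
{
    // Work on an unsigned copy: right-shifting a negative int is
    // implementation-defined in C, and in C++ before C++20.
    // Assumes a 32-bit int, as the masks in the original do.
    unsigned int u = (unsigned int) i;
    u = (u & 0x55555555u) + ((u >> 1) & 0x55555555u);  // sums of bit pairs
    u = (u & 0x33333333u) + ((u >> 2) & 0x33333333u);  // sums per nibble
    u = (u & 0x0F0F0F0Fu) + ((u >> 4) & 0x0F0F0F0Fu);  // sums per byte
    u = (u & 0x00FF00FFu) + ((u >> 8) & 0x00FF00FFu);  // sums per 16 bits
    return (int) ((u & 0x0000FFFFu) + (u >> 16));      // total for all 32 bits
}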
That looks quite similar to the "Nifty Parallel Count" method on the
page by Gurmeet Singh Manku that I referred to earlier. In his tests,
this method came out a lot slower than the pre-computed array method
I mentioned.
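
For comparison, here is a rough sketch of what a pre-computed array method can look like. I have not copied this from Manku's page, so the names makeBitTable and countBitsTable, the 256-entry table size, and the use of std::uint32_t are all my own assumptions for illustration; his version may well differ.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: builds a table holding the bit count of every byte value.
inline std::array<std::uint8_t, 256> makeBitTable()
{
    std::array<std::uint8_t, 256> table{};   // table[0] starts at 0
    for (int v = 1; v < 256; ++v)
        // Bits in v = lowest bit of v + bits in v with that bit shifted out.
        table[v] = static_cast<std::uint8_t>((v & 1) + table[v / 2]);
    return table;
}

// Hypothetical counterpart to countBits: one table lookup per byte.
inline int countBitsTable(std::uint32_t x)
{
    static const std::array<std::uint8_t, 256> table = makeBitTable();
    return table[x & 0xFF]
         + table[(x >> 8) & 0xFF]
         + table[(x >> 16) & 0xFF]
         + table[(x >> 24) & 0xFF];
}
```

The table is built once on first use, after which each call costs four lookups and three additions. Whether that beats the shift-and-mask version probably depends on the compiler, the CPU, and whether the table stays in cache, so it is worth timing both on your own hardware.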