We often take things for granted. While preparing to embark on a crazy project (description in French here and Google translation in English here), I wanted to benchmark the bit manipulation operations in both Squeak and Pharo, for the 32bit and 64bit images (I am on Windows, so the 64bit VM is not available for testing yet, but it’ll come!). Essentially, it was just a test to compare the VM-image-environment combinations.
To make a long story short, I was interested in testing the speed of 64bit operations on positive integers for my chess program. I quickly found cases where LargePositiveInteger operations were 7 to 12 times slower than their SmallInteger equivalents, which seemed like a lot, so I became curious. After more testing and discussions (both offline and online), someone suggested that some LargePositiveInteger operations might be slow because they are not inlined by the JIT. It was then recommended that I override those methods in LargePositiveInteger (with primitives 34 to 37), thus bypassing the slow default methods in Integer (which use the corresponding named primitives primDigitBitAnd, primDigitBitOr, primDigitBitXor and primDigitBitShiftMagnitude in the LargeIntegers module). I immediately got a 2-3x speedup for LargePositiveInteger but…
Things have obviously changed in the Squeak 64bit image, since some of the original methods (in class Integer) like #bitAnd: and #bitOr: are much faster there than the overrides (in class LargePositiveInteger)! Is there special code in the VM that checks for 32bit vs 64bit (more precisely, 30bit vs 60bit integers)? Is it in the LargeIntegers module?
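One thing worth checking before looking for special VM code is which class the benchmark operands actually end up in. The sketch below is written in Python rather than Smalltalk and only models the SmallInteger limits; it assumes a maximum of 2^30 - 1 in the 32bit image and 2^60 - 1 in the 64bit image (the "30bit vs 60bit" distinction mentioned above). The function name and the sample bitboards are hypothetical and not taken from the benchmark code, and this is a possible factor to rule out, not a confirmed explanation.

```python
# Assumed SmallInteger limits (not taken from the post's benchmark code):
#   32bit image: largest SmallInteger is 2**30 - 1
#   64bit image: largest SmallInteger is 2**60 - 1
SMALL_INTEGER_MAX = {"32bit": 2**30 - 1, "64bit": 2**60 - 1}


def integer_class(value: int, image: str) -> str:
    """Return the class a non-negative integer would have in the given image."""
    if value < 0:
        raise ValueError("only non-negative values are modelled here")
    if value <= SMALL_INTEGER_MAX[image]:
        return "SmallInteger"
    return "LargePositiveInteger"


# Hypothetical chess bitboards: bit 0 is square a1, bit 63 is square h8.
samples = {
    "a1 only": 1 << 0,
    "rank 2": 0xFF << 8,
    "rank 6": 0xFF << 40,
    "h8 only": 1 << 63,
    "full board": (1 << 64) - 1,
}

for name, board in samples.items():
    print(f"{name:<12} 32bit: {integer_class(board, '32bit'):<21}"
          f" 64bit: {integer_class(board, '64bit')}")
```

Under these assumptions, a bitboard that only uses the lower 60 squares never becomes a LargePositiveInteger in the 64bit image, so a benchmark using such values would not be exercising the overrides at all there. Any bitboard that touches the top four squares is still a LargePositiveInteger in both images.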
Here are two typical runs, one for the Squeak 5.1 32bit image (by the way, the Pharo 32bit image performs similarly) and one for the Squeak 5.1 64bit image:
Squeak 5.1 32bit
Number of #allMask: per second: 7.637M
Number of #anyMask: per second: 8.333M
Number of #bitAnd: per second: 17.877M
Number of #bitAnd2: per second: 42.105M
Method #bitAnd2: seems to work properly!
Overide of #bitAnd: in LargeInteger works!
Number of #bitAt: per second: 12.075M
Number of #bitAt:put: per second: 6.287M
Number of #bitClear: per second: 6.737M
Number of #bitInvert per second: 5.536M
Number of #bitOr: per second: 15.764M
Number of #bitOr2: per second: 34.409M
Method #bitOr2: seems to work properly!
Overide of #bitOr: in LargeInteger works!
Method #bitShift2: (left & right shifts) seems to work properly!
Overide of #bitShift: in LargeInteger works!
Number of #bitXor: per second: 15.385M
Number of #bitXor2: per second: 34.043M
Method #bitXor2: seems to work properly!
Overide of #bitXor: in LargeInteger works!
Number of #highBit per second: 12.451M
Number of #<< per second: 6.517M
Number of #bitLeftShift2: per second: 8.399M
Number of #lowBit per second: 10.702M
Number of #noMask: per second: 7.064M
Number of #>> per second: 7.323M
Number of #bitRightShift2: per second: 29.358M
Squeak 5.1 64bit
Number of #allMask: per second: 36.782M
Number of #anyMask: per second: 41.026M
Number of #bitAnd: per second: 139.130M
Number of #bitAnd2: per second: 57.143M
Method #bitAnd2: seems to work properly!
Overide of #bitAnd: in LargeInteger works!
Number of #bitAt: per second: 23.358M
Number of #bitAt:put: per second: 8.649M
Number of #bitClear: per second: 38.554M
Number of #bitInvert per second: 29.630M
Number of #bitOr: per second: 139.130M
Number of #bitOr2: per second: 58.182M
Method #bitOr2: seems to work properly!
Overide of #bitOr: in LargeInteger works!
Method #bitShift2: (left & right shifts) seems to work properly!
Overide of #bitShift: in LargeInteger works!
Number of #bitXor: per second: 55.172M
Number of #bitXor2: per second: 74.419M
Method #bitXor2: seems to work properly!
Overide of #bitXor: in LargeInteger works!
Number of #highBit per second: 7.921M
Number of #<< per second: 10.127M
Number of #bitLeftShift2: per second: 12.800M
Number of #lowBit per second: 6.823M
Number of #noMask: per second: 39.024M
Number of #>> per second: 23.188M
Number of #bitRightShift2: per second: 56.140M
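To make the two runs easier to compare, here is a small Python sketch that computes the override/original ratio for each operation, using only the figures printed above. The names RUNS and speedup are mine, and pairing #<< with #bitLeftShift2: and #>> with #bitRightShift2: is an assumption based on the order of the output.

```python
# Throughput in millions of operations per second, copied from the two runs.
# Each pair is (original method, "2"-suffixed override variant).
# Pairing << / >> with the bitLeftShift2: / bitRightShift2: lines is assumed.
RUNS = {
    "Squeak 5.1 32bit": {
        "bitAnd:": (17.877, 42.105),
        "bitOr:": (15.764, 34.409),
        "bitXor:": (15.385, 34.043),
        "<<": (6.517, 8.399),
        ">>": (7.323, 29.358),
    },
    "Squeak 5.1 64bit": {
        "bitAnd:": (139.130, 57.143),
        "bitOr:": (139.130, 58.182),
        "bitXor:": (55.172, 74.419),
        "<<": (10.127, 12.800),
        ">>": (23.188, 56.140),
    },
}


def speedup(original: float, override: float) -> float:
    """Ratio above 1 means the override is faster than the original."""
    return override / original


for image, operations in RUNS.items():
    print(image)
    for selector, (original, override) in operations.items():
        ratio = speedup(original, override)
        verdict = "override faster" if ratio > 1 else "original faster"
        print(f"  {selector:<8} {ratio:5.2f}x  ({verdict})")
```

With these numbers, every override comes out ahead in the 32bit run, while in the 64bit run the ratios for #bitAnd: and #bitOr: drop below 1 and the others stay above it.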
So now, I’m left with two questions:
- Why exactly does the override work (in 32bit images)?
- What changed so that things are different in the Squeak 5.1 64bit image, where the overrides only help for some operations?
If you’re curious or interested, the code I used for testing is here.
Leave me a comment (or email) if you have an explanation!
To be continued…