My name is Jon Skeet

Performance difference in two equal methods

I am reading the samples for the JMH framework, and I have a question about the code from the sample called JMHSample_12_Forking. After running this code I get the following results (just as the author predicted):

testJavaUtilConcurrency.JMHSample_12_Forking.measure_1_c1         avgt    5   3.314 ±  0.200  ns/op
testJavaUtilConcurrency.JMHSample_12_Forking.measure_2_c2         avgt    5  22.403 ±  1.023  ns/op
...

This result is explained as follows:

Note that C1 is faster, C2 is slower, but the C1 is slow again! This is because ...

But my question is: why is C2 slower than C1? The code in both classes and both methods looks exactly the same, so what is the source of the performance difference?

Update:

I tried adding a third implementation of Counter and obtained the following results:

testJavaUtilConcurrency.JMHSample_12_Forking.measure_1_c1         avgt    5   3.328 ±  0.073  ns/op
testJavaUtilConcurrency.JMHSample_12_Forking.measure_2_c2         avgt    5  22.437 ±  0.552  ns/op
testJavaUtilConcurrency.JMHSample_12_Forking.measure_2_c3         avgt    5  44.614 ±  5.080  ns/op
testJavaUtilConcurrency.JMHSample_12_Forking.measure_3_c1_again   avgt    5  43.535 ±  1.154  ns/op

During the very first test, only one implementation of Counter has been used. The JIT compiler is able to assume that anything calling measure(Counter) is using that same implementation, so it can inline the code from inc().

In the second test, we introduce a second implementation - now the call needs to either inline both implementations, or perform dynamic dispatch on each iteration. This is much slower than the first test due to that uncertainty (with either choice).
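To make those two choices concrete, here is a hand-written Java analogy of what each one roughly amounts to. It is only a sketch: the real transformation happens inside the JIT on compiled code, not in source form. The class names Counter1 and Counter2, the field x, and both helper method names are assumptions made up for illustration; only Counter and inc() come from the discussion above.

```java
public class DispatchOptions {
    interface Counter {
        int inc();
    }

    // Two implementations with identical bodies (names are hypothetical).
    static final class Counter1 implements Counter {
        int x;
        @Override public int inc() { return x++; }
    }

    static final class Counter2 implements Counter {
        int x;
        @Override public int inc() { return x++; }
    }

    // Analogy for "inline both implementations": test the concrete type,
    // then run the body of inc() directly instead of calling it.
    static int incWithTypeGuards(Counter c) {
        if (c instanceof Counter1) {
            return ((Counter1) c).x++;
        } else if (c instanceof Counter2) {
            return ((Counter2) c).x++;
        }
        return c.inc(); // fallback for a type not covered by the guards
    }

    // Analogy for "dynamic dispatch on each iteration": an ordinary
    // interface call, resolved from the receiver's class at run time.
    static int incWithDynamicDispatch(Counter c) {
        return c.inc();
    }

    public static void main(String[] args) {
        Counter c1 = new Counter1();
        Counter c2 = new Counter2();
        System.out.println(incWithTypeGuards(c1));
        System.out.println(incWithTypeGuards(c2));
        System.out.println(incWithDynamicDispatch(c1));
    }
}
```

Both helpers return the same values; the difference is only in how much work sits between the caller and the body of inc(). With a single known implementation, neither the type test nor the dispatch is needed.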

In the third test, we're using the same implementation as for the first test - but the state of the world is different to the first test, because the JIT still knows that the second implementation exists... it can't go back to believing there's only one implementation of Counter... so it still has to execute inc() in a slower fashion than the first test.
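The three phases can be sketched in plain Java, without JMH, to show how the same measure(Counter) call site sees different histories within one JVM. This is a rough illustration under assumptions, not the JMH sample itself: Counter1, Counter2, timePhase, the loop bound of 10 and the call count are all made up here, and a hand-rolled System.nanoTime() loop is not a reliable benchmark, so the printed timings may or may not show the effect on a given JVM.

```java
public class ProfilePollutionSketch {
    interface Counter {
        int inc();
    }

    // Two implementations with identical code (names are hypothetical).
    static class Counter1 implements Counter {
        private int x;
        @Override public int inc() { return x++; }
    }

    static class Counter2 implements Counter {
        private int x;
        @Override public int inc() { return x++; }
    }

    // The shared call site: every phase goes through this c.inc() call,
    // so what the JIT has learned about it carries over between phases.
    static int measure(Counter c) {
        int s = 0;
        for (int i = 0; i < 10; i++) {
            s += c.inc();
        }
        return s;
    }

    // Times repeated calls to measure() and prints a rough per-call average.
    // The accumulated sink keeps the results in use.
    static long timePhase(String label, Counter c, int calls) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += measure(c);
        }
        long elapsed = System.nanoTime() - start;
        System.out.println(label + ": ~" + (elapsed / calls)
                + " ns per measure() call (sink=" + sink + ")");
        return sink;
    }

    public static void main(String[] args) {
        int calls = 5_000_000;
        timePhase("c1", new Counter1(), calls);       // only Counter1 seen so far
        timePhase("c2", new Counter2(), calls);       // a second type appears
        timePhase("c1 again", new Counter1(), calls); // same code as phase one, different history
    }
}
```

Running each phase in a fresh JVM instead would give every implementation the same clean starting state, which is what the sample's name (Forking) refers to.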

The moral of the story is that it's not just the code that influences performance - it's the state of the world. The state of the world in the first test is much better (from an optimization perspective) than the state of the world in the second and third tests.

See more on this question at Stack Overflow