So, I made a mistake. I have been saying for a while that I’m not so sure Java is really a good fit for microservices, because of the JIT warm-up time and the high memory overhead. What I realized at JAX London was that what I actually meant is that I’m not so sure Java is a good fit for the cloud, for those same reasons. But apparently everyone is aware of that by now.

So my mistake was basically to look at pure request performance, which actually does not matter anymore, because Cloud. You can scale horizontally as far as you want. The real question is how low you can go with the memory.

I spent some time over the summer hand-optimizing JVM memory limits for small microservices. Even then I realized that, with the way HotSpot sizes its heap regions, you always have a lot of memory that is just sitting there idle. Even if you can serve all the requests from your young generation and have a pretty static set of objects in the old generation, you cannot really tell HotSpot that. And the issue is that, in the cloud, this memory is just wasted money. You provision by memory and CPU; the CPU is pretty easy to over-commit, but the memory is not. It is reserved. It is there in your Docker memory limit. It determines how many machines you actually have to run.
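
To make the "idle memory" point a bit more concrete, here is a minimal sketch of how the gap between committed and used heap could be inspected from inside a running JVM. It uses the standard `java.lang.management` API; the class name `HeapRegionReport` and its helper methods are made up for illustration and are not part of my benchmark. Pool names differ per collector and per JVM, so the output is not directly comparable between HotSpot and OpenJ9.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.Locale;

// Hypothetical helper, not part of the benchmarked service.
class HeapRegionReport {

    // Memory the JVM has committed for a pool but is not currently using.
    static long idleBytes(long committed, long used) {
        return Math.max(0L, committed - used);
    }

    // Formats a byte count as MiB; the JMX API reports -1 for "undefined".
    static String toMiB(long bytes) {
        if (bytes < 0) {
            return "undefined";
        }
        return String.format(Locale.ROOT, "%.1f MiB", bytes / (1024.0 * 1024.0));
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip metaspace, code cache and other non-heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.printf(Locale.ROOT, "%-28s used=%s committed=%s idle=%s max=%s%n",
                    pool.getName(),
                    toMiB(usage.getUsed()),
                    toMiB(usage.getCommitted()),
                    toMiB(idleBytes(usage.getCommitted(), usage.getUsed())),
                    toMiB(usage.getMax()));
        }
    }
}
```

Run standalone, this only reports on its own tiny heap. To say anything about a real service, the loop would have to run inside that service, for example behind a debug endpoint.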

So I took the example that I have been benchmarking and was really mean to it: I gave it just 128 MB of RAM. These are the results for HotSpot JDK 11:

Summary:
Total: 10.0060 secs
Slowest: 0.1229 secs
Fastest: 0.0007 secs
Average: 0.0284 secs
Requests/sec: 1758.2388


Response time histogram:
0.001 [1] |
0.013 [11039] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.025 [2177] |■■■■■■■■
0.037 [25] |
0.050 [1] |
0.062 [0] |
0.074 [0] |
0.086 [1092] |■■■■
0.098 [2806] |■■■■■■■■■■
0.111 [447] |■■
0.123 [5] |


Latency distribution:
10% in 0.0033 secs
25% in 0.0048 secs
50% in 0.0090 secs
75% in 0.0238 secs
90% in 0.0927 secs
95% in 0.0959 secs
99% in 0.1019 secs

I didn’t check, but I’m pretty sure these outliers are GC pauses, because with only 128 MB of visible memory the JVM will size its regions accordingly. I also bound it to only three of my hardware threads, in order to reduce the effects of Redis, which is bound to the fourth hardware thread. So this brings us down to ~1700 rps.
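
A rough sketch of how the GC suspicion could be checked: the standard `GarbageCollectorMXBean` interface exposes the accumulated collection count and collection time per collector, so comparing the totals before and after a load run would show how much of the run was spent collecting. The class `GcPauseCheck` and its methods are hypothetical names, and the code would need to run inside the service under test. Simply starting the JVM with `-verbose:gc` is the lower-effort alternative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the benchmarked service.
class GcPauseCheck {

    // Sum of the accumulated collection time over all collectors, in milliseconds.
    // Collectors that report -1 (undefined) are ignored.
    static long totalGcMillis() {
        long total = 0L;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Fraction of a time window that was spent in GC.
    static double gcShare(long gcMillis, long windowMillis) {
        if (windowMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / windowMillis;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-24s collections=%d time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC share of uptime: %.2f%%%n",
                100.0 * gcShare(totalGcMillis(), uptime));
    }
}
```

If the second hump in the histogram really is GC, I would expect the collection count over the 10-second run to be in the same ballpark as the number of slow bursts, but that is an expectation and not something I measured.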

So for OpenJ9, which is supposed to use less memory, I would actually expect faster results, but that is not what I’m seeing:

Summary:
Total: 10.0754 secs
Slowest: 0.4198 secs
Fastest: 0.0361 secs
Average: 0.1554 secs
Requests/sec: 319.7897


Response time histogram:
0.036 [1] |
0.074 [270] |■■■■■■■■■■■
0.113 [843] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.151 [226] |■■■■■■■■■
0.190 [642] |■■■■■■■■■■■■■■■■■■■■■■■■■
0.228 [1015] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.266 [159] |■■■■■■
0.305 [46] |■■
0.343 [9] |
0.381 [4] |
0.420 [7] |


Latency distribution:
10% in 0.0786 secs
25% in 0.0884 secs
50% in 0.1731 secs
75% in 0.2012 secs
90% in 0.2208 secs
95% in 0.2362 secs
99% in 0.2853 secs

With 256 MB the results look different. For OpenJ9 we are back to a more reasonable number of requests:

Summary:
Total: 10.0068 secs
Slowest: 0.0500 secs
Fastest: 0.0007 secs
Average: 0.0086 secs
Requests/sec: 5813.5583


Response time histogram:
0.001 [1] |
0.006 [13041] |■■■■■■■■■■■■■■■■■
0.011 [31274] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.016 [10533] |■■■■■■■■■■■■■
0.020 [2466] |■■■
0.025 [592] |■
0.030 [197] |
0.035 [30] |
0.040 [35] |
0.045 [4] |
0.050 [2] |


Latency distribution:
10% in 0.0045 secs
25% in 0.0059 secs
50% in 0.0078 secs
75% in 0.0104 secs
90% in 0.0135 secs
95% in 0.0160 secs
99% in 0.0222 secs

For HotSpot:

Summary:
Total: 10.0102 secs
Slowest: 0.0655 secs
Fastest: 0.0006 secs
Average: 0.0068 secs
Requests/sec: 7344.8957


Response time histogram:
0.001 [1] |
0.007 [46179] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
0.014 [24111] |■■■■■■■■■■■■■■■■■■■■■
0.020 [2889] |■■■
0.027 [284] |
0.033 [43] |
0.040 [10] |
0.046 [1] |
0.053 [0] |
0.059 [1] |
0.066 [5] |


Latency distribution:
10% in 0.0033 secs
25% in 0.0045 secs
50% in 0.0061 secs
75% in 0.0084 secs
90% in 0.0112 secs
95% in 0.0133 secs
99% in 0.0176 secs

And HotSpot smashed it again. So I guess I have to dig deeper, because I would expect OpenJ9 to perform better. I’ll keep you posted.