Adaptive Replica Selection benchmarks

Author Lee Hinman (lee@elastic.co)
Date 2017-08-30 14:58:46

Introduction

Benchmarks run on the adaptive replica selection branch based on a recent master branch. Since ARS is configurable there are tests with and without it.

This uses six machines running on Google Compute Engine; each machine has 16 CPUs and 60 GB of RAM. Elasticsearch is given 31 GB of RAM; all other performance-related settings are unchanged.

The machines are set up as follows:

ars-setup.png

Each test is 40,000 queries per lap, with 5 laps for a total of 200,000 queries. Rally is used with 100 clients simultaneously sending requests as quickly as possible. Rally is configured to send requests to the client node (esclient).

The query is based on a user's test case where they were seeing nodes with uneven response times, usually due to a problem with a particular node in the cluster. For more information, check out the track at rally-internal-tracks/3890.

It consists of 4 different benchmarks.

For the "load" scenarios, load was introduced on the es3 node with stress -i 8 -c 8 -m 8 -d 8.

1 replica, no-load version

No ARS, 5 laps with the cluster started fresh before running the benchmarks

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   72.756 71.452 70.484 71.909 72.147 s 358.748
Total Old Gen GC   0 0 0 0 0 s 0
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 75.046 86.2669 95.5561 94.4944 94.4099 ops/s 75.046
Median Throughput complex-query 94.9815 96.5402 96.6245 95.0058 95.3971 ops/s 95.7866
Max Throughput complex-query 96.0466 97.2942 99.1457 98.5228 98.0666 ops/s 99.1457
50th percentile latency complex-query 1000.23 993.939 1008.95 1012.77 1000.85 ms 1003.29
90th percentile latency complex-query 1334.58 1329.26 1343.43 1350.75 1340.92 ms 1339.69
99th percentile latency complex-query 1644.43 1634.48 1648.19 1658.76 1655.41 ms 1648.34
99.9th percentile latency complex-query 1877.37 1866.27 1883.54 1889.51 1906.72 ms 1890.48
99.99th percentile latency complex-query 2136.38 2131.63 2288.19 2120.27 2117.74 ms 2144.03
100th percentile latency complex-query 2339.11 2304.25 2513.49 2275.12 2398.82 ms 2513.49
50th percentile service time complex-query 1000.23 993.939 1008.95 1012.77 1000.85 ms 1003.29
90th percentile service time complex-query 1334.58 1329.26 1343.43 1350.75 1340.92 ms 1339.69
99th percentile service time complex-query 1644.43 1634.48 1648.19 1658.76 1655.41 ms 1648.34
99.9th percentile service time complex-query 1877.37 1866.27 1883.54 1889.51 1906.72 ms 1890.48
99.99th percentile service time complex-query 2136.38 2131.63 2288.19 2120.27 2117.74 ms 2144.03
100th percentile service time complex-query 2339.11 2304.25 2513.49 2275.12 2398.82 ms 2513.49
error rate complex-query 1.04 1.06 0.86 0.99 1.03 % 0.99

ARS, 5 laps with cluster started fresh before running the benchmarks

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   104.813 99.567 92.415 93.434 92.012 s 482.241
Total Old Gen GC   0 0 0.068 0 0.075 s 0.143
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 71.6274 96.4191 97.3081 94.654 95.4668 ops/s 71.6274
Median Throughput complex-query 93.9667 96.828 98.461 97.712 97.8372 ops/s 97.6061
Max Throughput complex-query 95.904 99.7006 101.285 98.5563 98.4476 ops/s 101.285
50th percentile latency complex-query 995.373 995.155 977.072 984.347 977.937 ms 985.724
90th percentile latency complex-query 1366.45 1356.61 1333.67 1337.76 1337.2 ms 1346.43
99th percentile latency complex-query 1696.72 1673.96 1660.46 1653.38 1651.95 ms 1670.29
99.9th percentile latency complex-query 1980.4 1954.44 1903.94 1951.45 1938.64 ms 1949.28
99.99th percentile latency complex-query 2256.85 2260.59 2171.71 2216.76 2314.91 ms 2256.85
100th percentile latency complex-query 2621.21 2399.68 2387.3 2562.51 2568.04 ms 2621.21
50th percentile service time complex-query 995.373 995.155 977.072 984.347 977.937 ms 985.724
90th percentile service time complex-query 1366.45 1356.61 1333.67 1337.76 1337.2 ms 1346.43
99th percentile service time complex-query 1696.72 1673.96 1660.46 1653.38 1651.95 ms 1670.29
99.9th percentile service time complex-query 1980.4 1954.44 1903.94 1951.45 1938.64 ms 1949.28
99.99th percentile service time complex-query 2256.85 2260.59 2171.71 2216.76 2314.91 ms 2256.85
100th percentile service time complex-query 2621.21 2399.68 2387.3 2562.51 2568.04 ms 2621.21
error rate complex-query 1.06 0.98 1.02 0.94 1 % 1

1 replica, no load summary

Metric No ARS ARS Change %
Median Throughput (ops/s) 95.7866 97.6061 1.8995350
50th percentile latency (ms) 1003.29 985.724 -1.7508397
90th percentile latency (ms) 1339.69 1346.43 0.50310146
99th percentile latency (ms) 1648.34 1670.29 1.3316427

Without any load, the median throughput and the latency percentiles differ by less than 2% (in either direction) when hitting only the client node.
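The Change % column in the summary tables is consistent with a plain relative difference against the No ARS value. A minimal Python sketch of that arithmetic, where `percent_change` is a name chosen here for illustration and the inputs are copied from the table above:

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# (No ARS, ARS) pairs from the "1 replica, no load" summary table.
rows = {
    "Median Throughput (ops/s)": (95.7866, 97.6061),
    "50th percentile latency (ms)": (1003.29, 985.724),
    "90th percentile latency (ms)": (1339.69, 1346.43),
    "99th percentile latency (ms)": (1648.34, 1670.29),
}

for metric, (no_ars, ars) in rows.items():
    print(f"{metric}: {percent_change(no_ars, ars):+.2f}%")
```

A positive value is good for throughput and bad for latency, which is why the sign has to be read per row.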

Here are the results after applying Adrien's suggestions from the PR to the code and re-testing:

Metric Operation Value Unit
Total Young Gen GC   567.738 s
Total Old Gen GC   0.262 s
Min Throughput complex-query 77.2156 ops/s
Median Throughput complex-query 98.4153 ops/s
Max Throughput complex-query 103.568 ops/s
50th percentile latency complex-query 976.994 ms
90th percentile latency complex-query 1333.18 ms
99th percentile latency complex-query 1657.75 ms
99.9th percentile latency complex-query 1969.66 ms
99.99th percentile latency complex-query 2449.54 ms
100th percentile latency complex-query 3094.64 ms
50th percentile service time complex-query 976.994 ms
90th percentile service time complex-query 1333.18 ms
99th percentile service time complex-query 1657.75 ms
99.9th percentile service time complex-query 1969.66 ms
99.99th percentile service time complex-query 2449.54 ms
100th percentile service time complex-query 3094.64 ms

Compared with the original No ARS run:

Metric No ARS ARS Change %
Median Throughput (ops/s) 95.7866 98.4153 2.7443296
50th percentile latency (ms) 1003.29 976.994 -2.6209770
90th percentile latency (ms) 1339.69 1333.18 -0.48593331
99th percentile latency (ms) 1648.34 1657.75 0.57087737

Just as before, a negligible difference when the cluster is unloaded.

1 replica, with load on es3

No ARS, 5 laps with the cluster started fresh before running the benchmarks

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   300.074 272.544 266.54 268.632 274.019 s 1381.81
Total Old Gen GC   0 0.293 0 0.348 0.368 s 1.009
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 31.9327 39.3799 41.9122 41.3685 40.0552 ops/s 31.9327
Median Throughput complex-query 38.2174 40.2514 42.4385 41.9149 41.3764 ops/s 41.1558
Max Throughput complex-query 39.5881 43.5512 53.5363 48.204 58.5938 ops/s 58.5938
50th percentile latency complex-query 410.565 394.156 413.918 399.447 437.455 ms 411.721
90th percentile latency complex-query 5391.53 5240.86 5026.99 5101.87 5297.32 ms 5215.34
99th percentile latency complex-query 6469.9 6138.16 5881.31 5974.67 6268.04 ms 6181.48
99.9th percentile latency complex-query 8497.22 6824.09 6520.29 6553.15 6979.88 ms 7022.92
99.99th percentile latency complex-query 13174.6 7573.65 7001.51 7197.16 7436.44 ms 10347.9
100th percentile latency complex-query 14173.2 8231.89 7544.85 7688.05 8060.05 ms 14173.2
50th percentile service time complex-query 410.565 394.156 413.918 399.447 437.455 ms 411.721
90th percentile service time complex-query 5391.53 5240.86 5026.99 5101.87 5297.32 ms 5215.34
99th percentile service time complex-query 6469.9 6138.16 5881.31 5974.67 6268.04 ms 6181.48
99.9th percentile service time complex-query 8497.22 6824.09 6520.29 6553.15 6979.88 ms 7022.92
99.99th percentile service time complex-query 13174.6 7573.65 7001.51 7197.16 7436.44 ms 10347.9
100th percentile service time complex-query 14173.2 8231.89 7544.85 7688.05 8060.05 ms 14173.2
error rate complex-query 1.03 1.03 0.93 1.01 0.92 % 0.98

With ARS

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   104.178 102.469 102.617 103.145 102.971 s 515.38
Total Old Gen GC   0 0 0 0 0 s 0
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 42.8211 85.2494 85.5463 85.7206 84.9813 ops/s 42.8211
Median Throughput complex-query 88.2411 88.3537 88.8263 90.2476 89.1758 ops/s 88.8048
Max Throughput complex-query 89.0012 89.2967 89.8035 90.5627 89.905 ops/s 90.5627
50th percentile latency complex-query 935.323 952.327 911.456 954.175 956.159 ms 943.221
90th percentile latency complex-query 1963.25 1992.08 2036.04 1890.75 1933.65 ms 1962.73
99th percentile latency complex-query 2638.29 2672.81 2774.82 2516.11 2552.83 ms 2648.86
99.9th percentile latency complex-query 3119 3136.93 3283.85 2913.59 3043.9 ms 3134.2
99.99th percentile latency complex-query 3694.82 3402.12 3777.1 3315.67 3598.85 ms 3518.94
100th percentile latency complex-query 4223.11 3499.83 4210.43 3423.48 4049.51 ms 4223.11
50th percentile service time complex-query 935.323 952.327 911.456 954.175 956.159 ms 943.221
90th percentile service time complex-query 1963.25 1992.08 2036.04 1890.75 1933.65 ms 1962.73
99th percentile service time complex-query 2638.29 2672.81 2774.82 2516.11 2552.83 ms 2648.86
99.9th percentile service time complex-query 3119 3136.93 3283.85 2913.59 3043.9 ms 3134.2
99.99th percentile service time complex-query 3694.82 3402.12 3777.1 3315.67 3598.85 ms 3518.94
100th percentile service time complex-query 4223.11 3499.83 4210.43 3423.48 4049.51 ms 4223.11
error rate complex-query 0.97 1.03 1.12 1.01 1.05 % 1.04

1 replica, with load summary

Metric No ARS ARS Change %
Median throughput (ops/s) 41.1558 88.8048 115.77712
50th percentile latency (ms) 411.721 943.221 129.09227
90th percentile latency (ms) 5215.34 1962.73 -62.366212
99th percentile latency (ms) 6181.48 2648.86 -57.148450

So ARS trades a higher 50th percentile latency (412 ms to 943 ms) for a large reduction in 90th/99th percentile latency, while roughly doubling the median throughput from 41 ops/s to 89 ops/s.
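A rough way to see why throughput can double while the median gets worse is Little's Law (requests in flight = throughput x mean latency). The sketch below assumes Rally's 100 clients behave as a closed loop with all 100 requests in flight at any moment, which the setup description suggests but does not state. `implied_mean_latency_s` is an illustrative helper, not part of Rally.

```python
CLIENTS = 100  # concurrent Rally clients, from the test setup


def implied_mean_latency_s(throughput_ops_s: float, in_flight: int = CLIENTS) -> float:
    """Little's Law rearranged: mean latency = requests in flight / throughput."""
    return in_flight / throughput_ops_s


# Median throughput values from the "1 replica, with load" summary table.
no_ars = implied_mean_latency_s(41.1558)
ars = implied_mean_latency_s(88.8048)

print(f"No ARS implied mean latency: {no_ars:.2f} s")
print(f"ARS implied mean latency:    {ars:.2f} s")
```

Under these assumptions the No ARS run implies a mean latency of roughly 2.4 s, far above its 412 ms median, which suggests that a slow tail dominates the average. With ARS the implied mean drops to roughly 1.1 s, even though the median is higher.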

The distribution of requests with ARS shows that requests were routed away from es3 (the stressed node) and handled by the non-loaded nodes:

node_name name                active queue rejected completed
es4       search                   0     0        0    241512
es1       search                   0     0        0    220730
es2       search                   0     0        0    226168
es3       search                   0     0        0    113115
esclient  search                   0     0        0    202875
es5       search                   0     0        0    212850

In the non-ARS scenario, the requests are evenly distributed.
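To put a number on how far requests shifted away from es3, the completed counts from the table above can be turned into per-node shares. This is a small sketch that parses the table text as shown; `completed_share` is a hypothetical helper name.

```python
CAT_OUTPUT = """\
node_name name                active queue rejected completed
es4       search                   0     0        0    241512
es1       search                   0     0        0    220730
es2       search                   0     0        0    226168
es3       search                   0     0        0    113115
esclient  search                   0     0        0    202875
es5       search                   0     0        0    212850
"""


def completed_share(cat_output: str) -> dict:
    """Return each node's fraction of all completed search tasks."""
    lines = cat_output.strip().splitlines()
    header = lines[0].split()
    node_idx = header.index("node_name")
    done_idx = header.index("completed")
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        counts[fields[node_idx]] = int(fields[done_idx])
    total = sum(counts.values())
    return {node: count / total for node, count in counts.items()}


shares = completed_share(CAT_OUTPUT)
for node, share in sorted(shares.items()):
    print(f"{node:<9} {share:6.1%}")
```

With an even split each of the six nodes would handle about 16.7% of the completed searches; here es3 ends up with about 9%.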

And running again after Adrien's feedback and subsequent changes:

Metric Operation Value Unit
Total Young Gen GC   489.716 s
Total Old Gen GC   0 s
Min Throughput complex-query 35.7991 ops/s
Median Throughput complex-query 90.8303 ops/s
Max Throughput complex-query 96.1928 ops/s
50th percentile latency complex-query 923.891 ms
90th percentile latency complex-query 1915.97 ms
99th percentile latency complex-query 2620.81 ms
99.9th percentile latency complex-query 3101.01 ms
99.99th percentile latency complex-query 4120.6 ms
100th percentile latency complex-query 4989.43 ms
50th percentile service time complex-query 923.891 ms
90th percentile service time complex-query 1915.97 ms
99th percentile service time complex-query 2620.81 ms
99.9th percentile service time complex-query 3101.01 ms
99.99th percentile service time complex-query 4120.6 ms
100th percentile service time complex-query 4989.43 ms

Compared with the original No ARS run:

Metric No ARS ARS Change %
Median throughput (ops/s) 41.1558 90.8303 120.69866
50th percentile latency (ms) 411.721 923.891 124.39735
90th percentile latency (ms) 5215.34 1915.97 -63.262798
99th percentile latency (ms) 6181.48 2620.81 -57.602225

So it still shows about the same performance improvements.

4 replicas, no load

With 4 replicas, every node could handle a request, so rather than the routing formula picking between two different copies of the data, it could potentially pick any node in the cluster.

With the user-based test case, one node sometimes ends up with a queue of search requests due to its inability to keep up and hot-spotting for the data. This was addressed with the Little's Law work and benchmarked in my other file at https://writequit.org/org/es/stress-run.html

Note that in these tests, I do not allow the queue to automatically resize, as I didn't want it affecting the performance results and giving false positives or negatives.

No ARS, 5 laps with the cluster started fresh before running the benchmarks

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   133.994 126.421 120.122 117.057 109.267 s 606.861
Total Old Gen GC   0 0.068 0 0.052 0.066 s 0.186
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 61.8983 83.9801 88.1633 83.4439 87.2767 ops/s 61.8983
Median Throughput complex-query 83.621 86.2051 89.7287 87.8284 87.9597 ops/s 87.689
Max Throughput complex-query 85.1025 88.4334 94.6429 89.4471 91.013 ops/s 94.6429
50th percentile latency complex-query 1454.42 1403.75 1305.12 1315.04 1308 ms 1348.53
90th percentile latency complex-query 2020.12 1972.7 1864.37 1909.29 1851.94 ms 1930.5
99th percentile latency complex-query 2448.26 2404.8 2294.22 2344.43 2257.22 ms 2363.1
99.9th percentile latency complex-query 2812.64 2734.92 2703.48 2664.67 2558.15 ms 2716.1
99.99th percentile latency complex-query 3080.82 2987.73 3039 3073.03 2841.12 ms 3055.47
100th percentile latency complex-query 3196.18 3398.03 3244.51 3260.1 3212.67 ms 3398.03
50th percentile service time complex-query 1454.42 1403.75 1305.12 1315.04 1308 ms 1348.53
90th percentile service time complex-query 2020.12 1972.7 1864.37 1909.29 1851.94 ms 1930.5
99th percentile service time complex-query 2448.26 2404.8 2294.22 2344.43 2257.22 ms 2363.1
99.9th percentile service time complex-query 2812.64 2734.92 2703.48 2664.67 2558.15 ms 2716.1
99.99th percentile service time complex-query 3080.82 2987.73 3039 3073.03 2841.12 ms 3055.47
100th percentile service time complex-query 3196.18 3398.03 3244.51 3260.1 3212.67 ms 3398.03
error rate complex-query 1.1 0.95 0.96 0.94 0.99 % 0.99

ARS, 5 laps with cluster started fresh before running the benchmarks

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   105.76 100.472 96.16 92.079 90.213 s 484.684
Total Old Gen GC   0 0 0.056 0 0.044 s 0.1
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 72.5623 95.187 91.4413 97.7224 97.4145 ops/s 72.5623
Median Throughput complex-query 96.3588 97.8425 96.7584 99.3148 99.7013 ops/s 97.8271
Max Throughput complex-query 97.4177 98.136 97.3876 107.248 100.778 ops/s 107.248
50th percentile latency complex-query 982.194 985.57 991.243 976.461 972.69 ms 981.778
90th percentile latency complex-query 1388.58 1383.4 1392.7 1366.4 1358.64 ms 1377.62
99th percentile latency complex-query 1768.21 1770.2 1768.3 1733.5 1705.18 ms 1751.05
99.9th percentile latency complex-query 2111.05 2114.6 2121.01 2059.34 2024.64 ms 2090.17
99.99th percentile latency complex-query 2316.13 2468.77 2450.8 2308.73 2278.21 ms 2394.18
100th percentile latency complex-query 2500.14 2726.6 2695.71 2608 2584.61 ms 2726.6
50th percentile service time complex-query 982.194 985.57 991.243 976.461 972.69 ms 981.778
90th percentile service time complex-query 1388.58 1383.4 1392.7 1366.4 1358.64 ms 1377.62
99th percentile service time complex-query 1768.21 1770.2 1768.3 1733.5 1705.18 ms 1751.05
99.9th percentile service time complex-query 2111.05 2114.6 2121.01 2059.34 2024.64 ms 2090.17
99.99th percentile service time complex-query 2316.13 2468.77 2450.8 2308.73 2278.21 ms 2394.18
100th percentile service time complex-query 2500.14 2726.6 2695.71 2608 2584.61 ms 2726.6
error rate complex-query 0.97 0.92 0.96 0.95 1 % 0.96

4 replicas, no load summary

Metric No ARS ARS Change %
Median throughput (ops/s) 87.689 97.8271 11.561427
50th percentile latency (ms) 1348.53 981.778 -27.196429
90th percentile latency (ms) 1930.5 1377.62 -28.639213
99th percentile latency (ms) 2363.1 1751.05 -25.900300

ARS improves on both throughput and latency, for all of the percentiles.

Additionally, the requests are routed roughly evenly, even though round robin selection is not used. Instead the formula can select the "least loaded" node to send the request to.

4 replicas, with load on es3

No ARS, 5 laps with the cluster started fresh before running the benchmarks

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   153.063 150.372 148.406 141.904 142.388 s 736.133
Total Old Gen GC   0 0 0 0 0.235 s 0.235
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 32.2423 51.8938 49.0778 51.7943 52.0408 ops/s 32.2423
Median Throughput complex-query 50.9153 52.6325 50.6031 52.6305 52.4415 ops/s 52.1863
Max Throughput complex-query 52.1392 57.8188 55.4156 62.4498 60.4905 ops/s 62.4498
50th percentile latency complex-query 2552.22 2591.87 2612.83 2601.61 2578.97 ms 2587.46
90th percentile latency complex-query 3449.25 3486.15 3530.09 3503.63 3473.51 ms 3489.49
99th percentile latency complex-query 4149.98 4186.45 4207.7 4148.67 4144.06 ms 4168.83
99.9th percentile latency complex-query 4672.63 4870.2 4720.98 4632.22 4681.02 ms 4719.7
99.99th percentile latency complex-query 4987.28 5296.97 5145.64 4986.35 5163.95 ms 5146.07
100th percentile latency complex-query 5526.07 5633.2 5606.27 5420.99 5462.41 ms 5633.2
50th percentile service time complex-query 2552.22 2591.87 2612.83 2601.61 2578.97 ms 2587.46
90th percentile service time complex-query 3449.25 3486.15 3530.09 3503.63 3473.51 ms 3489.49
99th percentile service time complex-query 4149.98 4186.45 4207.7 4148.67 4144.06 ms 4168.83
99.9th percentile service time complex-query 4672.63 4870.2 4720.98 4632.22 4681.02 ms 4719.7
99.99th percentile service time complex-query 4987.28 5296.97 5145.64 4986.35 5163.95 ms 5146.07
100th percentile service time complex-query 5526.07 5633.2 5606.27 5420.99 5462.41 ms 5633.2
error rate complex-query 0.89 1.01 0.97 0.94 0.99 % 0.96

ARS, 5 laps with cluster started fresh before running the benchmarks

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   102.981 102.723 103.387 101.932 102.316 s 513.339
Total Old Gen GC   0 0 0 0 0 s 0
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 29.2484 81.7369 82.0965 85.1784 86.0431 ops/s 29.2484
Median Throughput complex-query 84.7027 87.0279 86.5161 86.2562 87.8629 ops/s 86.5302
Max Throughput complex-query 85.0649 87.6369 87.2331 93.7006 89.5872 ops/s 93.7006
50th percentile latency complex-query 928.932 960.769 944.702 949.644 943.277 ms 945.357
90th percentile latency complex-query 2216.35 2045.28 2110.41 2093.92 2059.62 ms 2099.22
99th percentile latency complex-query 3842.12 3243.75 3374.46 3487.2 3323.93 ms 3463.71
99.9th percentile latency complex-query 5766.9 4004.19 4186.89 4354.08 4327.47 ms 4463.67
99.99th percentile latency complex-query 6690.88 4452.9 4685.68 4721.74 4818.03 ms 6149.39
100th percentile latency complex-query 7687.69 5242.12 5011.89 5263.98 5054.94 ms 7687.69
50th percentile service time complex-query 928.932 960.769 944.702 949.644 943.277 ms 945.357
90th percentile service time complex-query 2216.35 2045.28 2110.41 2093.92 2059.62 ms 2099.22
99th percentile service time complex-query 3842.12 3243.75 3374.46 3487.2 3323.93 ms 3463.71
99.9th percentile service time complex-query 5766.9 4004.19 4186.89 4354.08 4327.47 ms 4463.67
99.99th percentile service time complex-query 6690.88 4452.9 4685.68 4721.74 4818.03 ms 6149.39
100th percentile service time complex-query 7687.69 5242.12 5011.89 5263.98 5054.94 ms 7687.69
error rate complex-query 0.99 0.97 0.94 1.04 0.92 % 0.97

With this distribution of requests:

node_name name                active queue rejected completed
es3       search                   0     0        0    119263
es2       search                   0     0        0    224000
es4       search                   0     0        0    234915
es5       search                   0     0        0    207679
esclient  search                   0     0        0    203000
es1       search                   0     0        0    229143

4 replicas, with load summary

Metric No ARS ARS Change %
Median throughput (ops/s) 52.1863 86.5302 65.810184
50th percentile latency (ms) 2587.46 945.357 -63.463899
90th percentile latency (ms) 3489.49 2099.22 -39.841639
99th percentile latency (ms) 4168.83 3463.71 -16.914098

So an improvement in all latencies, while increasing the median throughput from 52 ops/s to 87 ops/s.

As the number of replicas goes up, I expect the adaptive replica selection to be better, since it has more choices of potentially unloaded machines to service the request.

Evenly distributing requests instead of using the client node

One more test: instead of hitting only the client node, this time I had Rally hit all nodes in the cluster (including the client node). For this test I dropped back to one replica since that is the most common use case.

No ARS 1 replica

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   166.788 157.16 145.218 125.436 118.413 s 713.015
Total Old Gen GC   0 0 0.096 0.118 0 s 0.214
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 64.3486 85.5629 86.9623 89.4434 89.6475 ops/s 64.3486
Median Throughput complex-query 86.1217 89.5194 89.4143 90.6719 91.0868 ops/s 89.6289
Max Throughput complex-query 87.6732 90.0012 90.3885 93.114 93.7311 ops/s 93.7311
50th percentile latency complex-query 1124.43 1097.21 1043.54 1038.33 1160.2 ms 1088.81
90th percentile latency complex-query 1736.69 1734.69 1633.39 1632.16 1821.2 ms 1706.07
99th percentile latency complex-query 2550.62 2399.38 2178.97 2189.69 2722.73 ms 2481.1
99.9th percentile latency complex-query 3016.49 2893.91 2951.67 2624.62 3143.67 ms 2991.56
99.99th percentile latency complex-query 3360.23 3266.18 3490.71 2893.75 3397.57 ms 3360.22
100th percentile latency complex-query 3507.84 3751.69 4471.25 3186.82 3732.73 ms 4471.25
50th percentile service time complex-query 1124.43 1097.21 1043.54 1038.33 1160.2 ms 1088.81
90th percentile service time complex-query 1736.69 1734.69 1633.39 1632.16 1821.2 ms 1706.07
99th percentile service time complex-query 2550.62 2399.38 2178.97 2189.69 2722.73 ms 2481.1
99.9th percentile service time complex-query 3016.49 2893.91 2951.67 2624.62 3143.67 ms 2991.56
99.99th percentile service time complex-query 3360.23 3266.18 3490.71 2893.75 3397.57 ms 3360.22
100th percentile service time complex-query 3507.84 3751.69 4471.25 3186.82 3732.73 ms 4471.25
error rate complex-query 1.08 0.98 0.93 0.99 1.07 % 1.01

ARS 1 replica

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   128.553 124.833 119.194 105.361 100.273 s 578.214
Total Old Gen GC   0 0 0 0.124 0 s 0.124
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 65.0194 91.3818 92.2215 96.2927 95.7731 ops/s 65.0194
Median Throughput complex-query 92.3494 94.4122 95.7469 96.909 98.121 ops/s 95.7207
Max Throughput complex-query 93.5308 94.8664 96.326 100.97 98.5577 ops/s 100.97
50th percentile latency complex-query 1031.64 1025.99 1010.09 999.791 987.14 ms 1010.36
90th percentile latency complex-query 1449.99 1439.34 1417.91 1413.7 1386.26 ms 1422.55
99th percentile latency complex-query 1815.91 1797.08 1783.24 1780.29 1738.78 ms 1784.96
99.9th percentile latency complex-query 2124.39 2111.02 2078.33 2122.11 2011.52 ms 2095.82
99.99th percentile latency complex-query 2495 2344.26 2466.2 2406.79 2316.72 ms 2428.25
100th percentile latency complex-query 2749.13 2569.92 2552.57 2882.8 2706.45 ms 2882.8
50th percentile service time complex-query 1031.64 1025.99 1010.09 999.791 987.14 ms 1010.36
90th percentile service time complex-query 1449.99 1439.34 1417.91 1413.7 1386.26 ms 1422.55
99th percentile service time complex-query 1815.91 1797.08 1783.24 1780.29 1738.78 ms 1784.96
99.9th percentile service time complex-query 2124.39 2111.02 2078.33 2122.11 2011.52 ms 2095.82
99.99th percentile service time complex-query 2495 2344.26 2466.2 2406.79 2316.72 ms 2428.25
100th percentile service time complex-query 2749.13 2569.92 2552.57 2882.8 2706.45 ms 2882.8
error rate complex-query 1.07 0.98 0.95 0.96 0.95 % 0.98

Round robin requests summary

Metric No ARS ARS Change %
Median throughput (ops/s) 89.6289 95.7207 6.7966917
50th percentile latency (ms) 1088.81 1010.36 -7.2051138
90th percentile latency (ms) 1706.07 1422.55 -16.618310
99th percentile latency (ms) 2481.1 1784.96 -28.057716

So still an improvement on throughput and latency for all categories.

Here's a run from after the changes Adrien recommended on the PR:

Metric Operation Value Unit
Total Young Gen GC   463.485 s
Total Old Gen GC   0.064 s
Min Throughput complex-query 65.9433 ops/s
Median Throughput complex-query 95.9248 ops/s
Max Throughput complex-query 99.6405 ops/s
50th percentile latency complex-query 1014.5 ms
90th percentile latency complex-query 1412.53 ms
99th percentile latency complex-query 1752.06 ms
99.9th percentile latency complex-query 2041.12 ms
99.99th percentile latency complex-query 2353.43 ms
100th percentile latency complex-query 3001.61 ms
50th percentile service time complex-query 1014.5 ms
90th percentile service time complex-query 1412.53 ms
99th percentile service time complex-query 1752.06 ms
99.9th percentile service time complex-query 2041.12 ms
99.99th percentile service time complex-query 2353.43 ms
100th percentile service time complex-query 3001.61 ms

And the results, pretty much the same as the non-change version:

Metric No ARS ARS Change %
Median throughput (ops/s) 89.6289 95.9248 7.0244084
50th percentile latency (ms) 1088.81 1014.5 -6.8248822
90th percentile latency (ms) 1706.07 1412.53 -17.205625
99th percentile latency (ms) 2481.1 1752.06 -29.383741

Conclusion

Final summary: to consider the feature successful, we want to see a positive throughput change and negative changes in the latency percentiles.

Test case                        Throughput improvement %  50th % change  90th % change  99th % change
1 replica, no load               1.9%                      -1.8%          0.5%           1.3%
1 replica, with load             115.8%                    129.1%         -62.4%         -57.1%
4 replicas, no load              11.6%                     -27.2%         -28.6%         -25.9%
4 replicas, with load            65.8%                     -63.5%         -39.8%         -16.9%
1 replica, round robin, no load  6.8%                      -7.2%          -16.6%         -28.1%

The adaptive replica selection shows an improvement for almost all tests in throughput and latency. While not perfect, it should help route around overloaded nodes.
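The "almost all" above can be checked mechanically against the success criterion (throughput up, every latency percentile down). The following sketch uses the unrounded Change % values from the individual summary tables; `regressions` is an illustrative name.

```python
# (test case, throughput %, 50th %, 90th %, 99th %) from the summary tables.
RESULTS = [
    ("1 replica, no load", 1.8995350, -1.7508397, 0.50310146, 1.3316427),
    ("1 replica, with load", 115.77712, 129.09227, -62.366212, -57.148450),
    ("4 replicas, no load", 11.561427, -27.196429, -28.639213, -25.900300),
    ("4 replicas, with load", 65.810184, -63.463899, -39.841639, -16.914098),
    ("1 replica, round robin, no load", 6.7966917, -7.2051138, -16.618310, -28.057716),
]


def regressions(throughput, *latencies):
    """List the metrics that moved the wrong way for one test case."""
    labels = ("50th", "90th", "99th")
    bad = [] if throughput > 0 else ["throughput"]
    bad += [label for label, change in zip(labels, latencies) if change >= 0]
    return bad


for name, throughput, p50, p90, p99 in RESULTS:
    bad = regressions(throughput, p50, p90, p99)
    verdict = "all improved" if not bad else "regressed on " + ", ".join(bad)
    print(f"{name}: {verdict}")
```

Only the two single-replica runs that go through the client node show any regression: the unloaded case on the 90th and 99th percentiles (by about 1%), and the loaded case on the 50th percentile.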

Additionally, since it is dynamically configurable, it can easily be toggled on or off as desired.

Updated tests after feedback

Simon left some feedback on the PR about outstanding search requests not being measured correctly (he is right!), so I fixed the outstanding request tracking and adjusted b=4 back to b=3 before re-running the benchmarks for the adaptive replica selection cases. I did not rerun the regular cases as they should not change.
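For context on what b and the outstanding request count influence: this write-up does not spell out the ranking formula, so the sketch below assumes the ranking from the C3 replica selection algorithm that adaptive replica selection is modeled on. In that formula b is the exponent on the estimated queue size. All function names, parameter names, and sample numbers are illustrative and are not Elasticsearch's actual code or measured values.

```python
def c3_rank(response_time_ms, service_time_ms, queue_size, outstanding, clients, b=3):
    """Assumed C3-style rank for one node; a lower rank is preferred.

    q_hat estimates the queue this request would face: the node's reported
    queue plus this client's outstanding requests scaled by the client count.
    """
    q_hat = 1 + outstanding * clients + queue_size
    return response_time_ms - service_time_ms + (q_hat ** b) * service_time_ms


def pick_replica(stats, b=3):
    """Choose the node with the lowest rank among those holding a shard copy."""
    return min(stats, key=lambda node: c3_rank(**stats[node], b=b))


# Hypothetical per-node statistics, for illustration only.
stats = {
    "es1": dict(response_time_ms=900.0, service_time_ms=800.0,
                queue_size=0, outstanding=1, clients=1),
    "es3": dict(response_time_ms=5000.0, service_time_ms=4000.0,
                queue_size=20, outstanding=1, clients=1),
}

print(pick_replica(stats, b=3))
```

Because the queue estimate is raised to the power b, a larger b penalizes queued or outstanding work more heavily, and an incorrect outstanding request count distorts the rank through the same term.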

Single replica, non-loaded case

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   75.073 75.041 74.993 74.623 73.222 s 372.952
Total Old Gen GC   0 0 0 0 0 s 0
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 63.5801 97.0808 98.5696 96.4159 98.4785 ops/s 63.5801
Median Throughput complex-query 96.8881 98.2529 99.9219 98.1566 99.3179 ops/s 98.537
Max Throughput complex-query 98.8394 99.3881 100.639 98.9853 104.276 ops/s 104.276
50th percentile latency complex-query 962.508 981.686 965.195 974.092 965.343 ms 970.15
90th percentile latency complex-query 1319.55 1339.31 1322.88 1332.44 1318.87 ms 1326.79
99th percentile latency complex-query 1642.93 1658.19 1650.91 1648.07 1645.92 ms 1648.8
99.9th percentile latency complex-query 1932.01 1954.33 1917.32 1916.36 1913.22 ms 1925.47
99.99th percentile latency complex-query 2351.23 2270.86 2234.84 2250.36 2268.94 ms 2270.86
100th percentile latency complex-query 2422.38 2440.09 2444.61 2631.93 2423.32 ms 2631.93
50th percentile service time complex-query 962.508 981.686 965.195 974.092 965.343 ms 970.15
90th percentile service time complex-query 1319.55 1339.31 1322.88 1332.44 1318.87 ms 1326.79
99th percentile service time complex-query 1642.93 1658.19 1650.91 1648.07 1645.92 ms 1648.8
99.9th percentile service time complex-query 1932.01 1954.33 1917.32 1916.36 1913.22 ms 1925.47
99.99th percentile service time complex-query 2351.23 2270.86 2234.84 2250.36 2268.94 ms 2270.86
100th percentile service time complex-query 2422.38 2440.09 2444.61 2631.93 2423.32 ms 2631.93
error rate complex-query 0.9 0.96 1.05 1.01 1.01 % 0.99

Single replica, under load case

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   138.294 135.652 129.945 121.155 122.426 s 647.472
Total Old Gen GC   0 0 0.067 0 0 s 0.067
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 53.6798 83.1811 87.0763 88.1993 80.6025 ops/s 53.6798
Median Throughput complex-query 85.5729 87.2077 89.1485 89.4732 87.7697 ops/s 87.8231
Max Throughput complex-query 86.8944 88.1925 89.9637 91.725 88.4414 ops/s 91.725
50th percentile latency complex-query 1016.17 1006.04 992.635 1005.78 1015.35 ms 1007.22
90th percentile latency complex-query 1851.86 1843.43 1853.17 1806.01 1845.79 ms 1839.46
99th percentile latency complex-query 2466.59 2443.15 2454.92 2368.89 2431.93 ms 2433.55
99.9th percentile latency complex-query 2918.42 2785.14 2843.5 2751.1 2863.54 ms 2835.38
99.99th percentile latency complex-query 3185.07 3094.71 3197.62 3057 3283.16 ms 3245.19
100th percentile latency complex-query 3728.13 3565.79 3472.04 3742.51 3680.95 ms 3742.51
50th percentile service time complex-query 1016.17 1006.04 992.635 1005.78 1015.35 ms 1007.22
90th percentile service time complex-query 1851.86 1843.43 1853.17 1806.01 1845.79 ms 1839.46
99th percentile service time complex-query 2466.59 2443.15 2454.92 2368.89 2431.93 ms 2433.55
99.9th percentile service time complex-query 2918.42 2785.14 2843.5 2751.1 2863.54 ms 2835.38
99.99th percentile service time complex-query 3185.07 3094.71 3197.62 3057 3283.16 ms 3245.19
100th percentile service time complex-query 3728.13 3565.79 3472.04 3742.51 3680.95 ms 3742.51
error rate complex-query 0.97 0.92 1.01 1.08 0.95 % 0.99

Single replica round robin, no load

Metric Operation Lap 1 Lap 2 Lap 3 Lap 4 Lap 5 Unit TOTAL
Total Young Gen GC   137.313 130.443 123.806 110.669 106.573 s 608.804
Total Old Gen GC   0 0 0 0.128 0 s 0.128
Heap used for segments   55.6507 55.6507 55.6507 55.6507 55.6507 MB 55.6507
Heap used for doc values   0.000324249 0.000324249 0.000324249 0.000324249 0.000324249 MB 0.000324249
Heap used for terms   44.3083 44.3083 44.3083 44.3083 44.3083 MB 44.3083
Heap used for points   6.06202 6.06202 6.06202 6.06202 6.06202 MB 6.06202
Heap used for stored fields   5.28004 5.28004 5.28004 5.28004 5.28004 MB 5.28004
Segment count   5 5 5 5 5   5
Min Throughput complex-query 68.0178 90.9538 92.648 94.4686 93.0936 ops/s 68.0178
Median Throughput complex-query 91.5291 95.6761 96.0065 96.3352 96.4799 ops/s 95.9452
Max Throughput complex-query 92.8571 96.1195 97.2325 97.5942 96.8988 ops/s 97.5942
50th percentile latency complex-query 1040.13 1010.26 1011.09 1004.91 1003.2 ms 1013.61
90th percentile latency complex-query 1456.38 1422.49 1414.84 1416.78 1406.63 ms 1423.83
99th percentile latency complex-query 1831.73 1787.15 1765.59 1784.83 1746.04 ms 1783.73
99.9th percentile latency complex-query 2140.53 2073.31 2075.77 2176.83 2056.19 ms 2107.3
99.99th percentile latency complex-query 2359.15 2428.91 2462.54 2481.45 2362.07 ms 2465.01
100th percentile latency complex-query 2628.35 2541.27 2785.96 2694.77 2589.07 ms 2785.96
50th percentile service time complex-query 1040.13 1010.26 1011.09 1004.91 1003.2 ms 1013.61
90th percentile service time complex-query 1456.38 1422.49 1414.84 1416.78 1406.63 ms 1423.83
99th percentile service time complex-query 1831.73 1787.15 1765.59 1784.83 1746.04 ms 1783.73
99.9th percentile service time complex-query 2140.53 2073.31 2075.77 2176.83 2056.19 ms 2107.3
99.99th percentile service time complex-query 2359.15 2428.91 2462.54 2481.45 2362.07 ms 2465.01
100th percentile service time complex-query 2628.35 2541.27 2785.96 2694.77 2589.07 ms 2785.96
error rate complex-query 0.94 0.95 1 0.93 0.98 % 0.96

Updated tests summary

Single replica, non-loaded case:

Metric No ARS ARS Change %
Median throughput (ops/s) 95.7866 98.537 2.8713828
50th percentile latency (ms) 1003.29 970.15 -3.3031327
90th percentile latency (ms) 1339.69 1326.79 -0.96290933
99th percentile latency (ms) 1648.34 1648.8 0.027906864

So again, not a huge latency difference, as expected for the unloaded cluster.
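
The Change % column can be reproduced from the other two columns. A small sketch, assuming the change is computed relative to the No ARS value (the helper name is illustrative):

```python
def pct_change(baseline, value):
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# Median throughput and 50th percentile latency from the table above.
print(round(pct_change(95.7866, 98.537), 4))
print(round(pct_change(1003.29, 970.15), 4))
```

A positive change is an improvement for throughput, while a negative change is an improvement for latency.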

Single replica, es3 under load:

Metric No ARS ARS Change %
Median throughput (ops/s) 41.1558 87.8231 113.39179
50th percentile latency (ms) 411.721 1007.22 144.63654
90th percentile latency (ms) 5215.34 1839.46 -64.729816
99th percentile latency (ms) 6181.48 2433.55 -60.631596

Again, throughput improves substantially in the loaded case (about 113%), and ARS trades a higher 50th percentile latency for much lower 90th and 99th percentile latency.

Single replica, round robin requests:

Metric No ARS ARS Change %
Median throughput (ops/s) 89.6289 95.9452 7.0471689
50th percentile latency (ms) 1088.81 1013.61 -6.9066228
90th percentile latency (ms) 1706.07 1423.83 -16.543284
99th percentile latency (ms) 2481.1 1783.73 -28.107291

Again, a nice improvement in both throughput and latency for the non-stressed round-robin test case, with the largest latency gain (about 28%) at the 99th percentile.

PMC tests

I didn't want to only test with my scenario, so I also ran the PMC tests with and without ARS enabled. Note that the throughput for this test is targeted at 20 ops/s, so I don't expect much difference there, only in the latency.

Metric Operation Value Value Unit Change %
Total Young Gen GC   71.609 80.071 s 11.816950
Min Throughput default 1734.13 1173.94 ops/s -32.303807
Median Throughput default 2493.65 2110.04 ops/s -15.383474
Max Throughput default 2608.02 2258.78 ops/s -13.391002
50th percentile latency default 20.2543 29.9434 ms 47.837249
90th percentile latency default 66.1935 66.8977 ms 1.0638507
99th percentile latency default 263.346 133.466 ms -49.319147
99.9th percentile latency default 496.514 234.434 ms -52.784010
99.99th percentile latency default 687.331 255.652 ms -62.805111
100th percentile latency default 753.063 259.545 ms -65.534756
50th percentile service time default 20.2543 29.9434 ms 47.837249
90th percentile service time default 66.1935 66.8977 ms 1.0638507
99th percentile service time default 263.346 133.466 ms -49.319147
99.9th percentile service time default 496.514 234.434 ms -52.784010
99.99th percentile service time default 687.331 255.652 ms -62.805111
100th percentile service time default 753.063 259.545 ms -65.534756
Min Throughput term 1289.22 1514.94 ops/s 17.508261
Median Throughput term 1568.08 1910.02 ops/s 21.806285
Max Throughput term 1641.91 1974.78 ops/s 20.273340
50th percentile latency term 37.8038 36.8043 ms -2.6439141
90th percentile latency term 106.132 88.4687 ms -16.642766
99th percentile latency term 308.381 167.496 ms -45.685370
99.9th percentile latency term 579.552 287.72 ms -50.354757
99.99th percentile latency term 772.559 342.839 ms -55.622936
100th percentile latency term 785.949 357.648 ms -54.494757
50th percentile service time term 37.8038 36.8043 ms -2.6439141
90th percentile service time term 106.132 88.4687 ms -16.642766
99th percentile service time term 308.381 167.496 ms -45.685370
99.9th percentile service time term 579.552 287.72 ms -50.354757
99.99th percentile service time term 772.559 342.839 ms -55.622936
100th percentile service time term 785.949 357.648 ms -54.494757
Min Throughput phrase 1212.21 1607 ops/s 32.567789
Median Throughput phrase 1474.6 1889.54 ops/s 28.139156
Max Throughput phrase 1598.05 2007.98 ops/s 25.651888
50th percentile latency phrase 31.4457 22.0566 ms -29.858136
90th percentile latency phrase 116.218 98.1068 ms -15.583817
99th percentile latency phrase 360.709 288.757 ms -19.947381
99.9th percentile latency phrase 594.483 384.245 ms -35.364846
99.99th percentile latency phrase 698.474 538.333 ms -22.927267
100th percentile latency phrase 706.989 541.952 ms -23.343645
50th percentile service time phrase 31.4457 22.0566 ms -29.858136
90th percentile service time phrase 116.218 98.1068 ms -15.583817
99th percentile service time phrase 360.709 288.757 ms -19.947381
99.9th percentile service time phrase 594.483 384.245 ms -35.364846
99.99th percentile service time phrase 698.474 538.333 ms -22.927267
100th percentile service time phrase 706.989 541.952 ms -23.343645
Min Throughput articles_monthly_agg_uncached 413.915 351.617 ops/s -15.050916
Median Throughput articles_monthly_agg_uncached 642.901 606.109 ops/s -5.7228096
Max Throughput articles_monthly_agg_uncached 673.724 627.425 ops/s -6.8721019
50th percentile latency articles_monthly_agg_uncached 14.1694 14.6811 ms 3.6113032
90th percentile latency articles_monthly_agg_uncached 15.8405 17.8053 ms 12.403649
99th percentile latency articles_monthly_agg_uncached 26.3351 30.0847 ms 14.238032
99.9th percentile latency articles_monthly_agg_uncached 67.0225 77.817 ms 16.105785
99.99th percentile latency articles_monthly_agg_uncached 77.734 90.7812 ms 16.784419
100th percentile latency articles_monthly_agg_uncached 78.4629 90.9573 ms 15.923959
50th percentile service time articles_monthly_agg_uncached 14.1694 14.6811 ms 3.6113032
90th percentile service time articles_monthly_agg_uncached 15.8405 17.8053 ms 12.403649
99th percentile service time articles_monthly_agg_uncached 26.3351 30.0847 ms 14.238032
99.9th percentile service time articles_monthly_agg_uncached 67.0225 77.817 ms 16.105785
99.99th percentile service time articles_monthly_agg_uncached 77.734 90.7812 ms 16.784419
100th percentile service time articles_monthly_agg_uncached 78.4629 90.9573 ms 15.923959
Min Throughput articles_monthly_agg_cached 3274.88 3362.04 ops/s 2.6614716
Median Throughput articles_monthly_agg_cached 3397.29 3383.61 ops/s -0.40267390
Max Throughput articles_monthly_agg_cached 3511.35 3388.89 ops/s -3.4875475
50th percentile latency articles_monthly_agg_cached 2.48462 2.49701 ms 0.49866780
90th percentile latency articles_monthly_agg_cached 3.0102 2.96122 ms -1.6271344
99th percentile latency articles_monthly_agg_cached 5.33889 5.96885 ms 11.799456
99.9th percentile latency articles_monthly_agg_cached 59.0387 67.0066 ms 13.496063
99.99th percentile latency articles_monthly_agg_cached 60.1156 71.0272 ms 18.151029
100th percentile latency articles_monthly_agg_cached 69.1591 71.3471 ms 3.1637196
50th percentile service time articles_monthly_agg_cached 2.48462 2.49701 ms 0.49866780
90th percentile service time articles_monthly_agg_cached 3.0102 2.96122 ms -1.6271344
99th percentile service time articles_monthly_agg_cached 5.33889 5.96885 ms 11.799456
99.9th percentile service time articles_monthly_agg_cached 59.0387 67.0066 ms 13.496063
99.99th percentile service time articles_monthly_agg_cached 60.1156 71.0272 ms 18.151029
100th percentile service time articles_monthly_agg_cached 69.1591 71.3471 ms 3.1637196
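
Because a positive change is good for throughput but bad for latency, the sign of Change % has to be read per metric. The sketch below makes that explicit for a few rows copied from the table. It assumes the first value column is the baseline, which is what the Change % column implies, and the helper name is illustrative.

```python
def is_improvement(metric, first, second):
    """Return True if the second value is better than the first for this metric."""
    if "Throughput" in metric:
        return second > first   # higher throughput is better
    return second < first       # lower latency / service time is better


# (operation, metric, first value, second value) copied from the table.
rows = [
    ("default", "50th percentile latency", 20.2543, 29.9434),
    ("default", "99th percentile latency", 263.346, 133.466),
    ("term", "Median Throughput", 1568.08, 1910.02),
    ("articles_monthly_agg_uncached", "Median Throughput", 642.901, 606.109),
]

for operation, metric, first, second in rows:
    verdict = "better" if is_improvement(metric, first, second) else "worse"
    print(f"{operation:32s} {metric:26s} {verdict}")
```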

Author: Lee Hinman

Created: 2017-08-30 Wed 14:58