...
This section documents the results of performance and load testing of the nginx ingress load balancer.
Baseline service: Nginx
This test uses the nginx-ingress-controller as the load balancer and a simple Nginx web server as the backend service. An ingress rule was created manually to map perf-nginx.cluster.ndslabs.org to the backend service, as sketched below.
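A minimal sketch of such an ingress rule, assuming the backend is exposed as a Service named perf-nginx on port 80 (the service name and API version are assumptions, not taken from the cluster):

```
# Hypothetical ingress rule mapping the test hostname to the backend service.
# The API version and service name depend on the cluster configuration.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: perf-nginx
spec:
  rules:
  - host: perf-nginx.cluster.ndslabs.org
    http:
      paths:
      - path: /
        backend:
          serviceName: perf-nginx   # assumed Service name
          servicePort: 80
```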
Load generation: boom
Use the boom load test generator to scale up concurrent requests on a Nebula m1.medium VM:
...
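For illustration, concurrency can be swept using the same boom flags shown in the measurement output below (the concurrency levels here are an example, not the exact values of every run):

```
# Hypothetical sweep over concurrency levels; flags match the boom invocation
# shown later on this page (-cpus worker CPUs, -n total requests, -c concurrency).
for c in 100 200 300 400 500 600; do
  bin/boom -cpus 4 -n 1000 -c $c http://perf-nginx.iassist.ndslabs.org/
done
```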
Measuring latency and resource usage
Measuring latency: boom
The boom utility produces response-time output including a summary of the average response time as well as the distribution of response times and latency, for example:
```
bin/boom -cpus 4 -n 1000 -c 500 http://perf-nginx.iassist.ndslabs.org/

Summary:
  Total:        0.1539 secs
  Slowest:      0.1335 secs
  Fastest:      0.0193 secs
  Average:      0.0685 secs
  Requests/sec: 4842.2840

Status code distribution:
  [200] 745 responses

Response time histogram:
  0.019 [1]   |
  0.031 [28]  |∎∎∎∎∎∎
  0.042 [110] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.054 [69]  |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.065 [161] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.076 [157] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.088 [60]  |∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.099 [38]  |∎∎∎∎∎∎∎∎∎
  0.111 [49]  |∎∎∎∎∎∎∎∎∎∎∎∎
  0.122 [37]  |∎∎∎∎∎∎∎∎∎
  0.134 [35]  |∎∎∎∎∎∎∎∎

Latency distribution:
  10% in 0.0394 secs
  25% in 0.0502 secs
  50% in 0.0652 secs
  75% in 0.0808 secs
  90% in 0.1103 secs
  95% in 0.1217 secs
  99% in 0.1293 secs
```
Measuring latency: netperf
netperf can be used to measure latency and throughput to services inside Kubernetes.
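One way to do this, sketched below under the assumption that a netserver image is available and netperf is installed on the client (the image name and label selector are placeholders):

```
# Hypothetical example: run a netserver pod inside the cluster, then measure
# request/response latency (TCP_RR) and bulk throughput (TCP_STREAM) against it.
kubectl run netserver --image=networkstatic/netserver --port=12865
NETSERVER_IP=$(kubectl get pod -l run=netserver -o jsonpath='{.items[0].status.podIP}')
netperf -H $NETSERVER_IP -t TCP_RR       # round-trip latency
netperf -H $NETSERVER_IP -t TCP_STREAM   # throughput
```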
Measuring CPU/Memory/IO utilization
Results
Below is a plot of average response time with increasing concurrent requests (-n 1000 requests) and replicas. Average response times increase as the number of concurrent requests increases, but remain below 1 second. Adding more replicas has no apparent effect, suggesting that the response time is governed by the ingress load balancer, not the backend service.
Below is a plot of the latency distribution at the 25th, 50th, 75th, and 90th percentiles with increasing concurrent requests. At 600 concurrent requests, the proportion of requests with longer latencies increases.
Measuring CPU/Memory utilization
Memory and CPU utilization were measured using pidstat. The nginx ingress controller has two worker processes in this test, labeled proc1 and proc2.
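A minimal sketch of the kind of pidstat invocation that produces these per-second samples (the worker PIDs shown are placeholders):

```
# Sample CPU (-u) and memory (-r) statistics for the two nginx worker
# processes once per second; replace the PIDs with the actual worker PIDs.
pidstat -u -r -p 1234,5678 1
```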
CPU utilization
Per-second pidstat samples during the boom test show CPU utilization for both worker processes remaining low; %CPU peaks at 12%.
Memory utilization
Per-second pidstat samples during the boom test show memory utilization remaining relatively stable for both worker processes: VSZ stays in the 325,000–332,000 KB range, RSS ranges from roughly 14,400 KB to 20,800 KB, and %MEM stays between 0.36% and 0.51%.
...
...
...
...
Scaling services
Large-file upload/download
...
Killing the loadbalancer
Running kubectl delete pod on the nginx-ilb pod leaves the running pod in a Terminating state for ~30 seconds. During this time the replication controller creates a new pod, but it remains in a Pending state for the same ~30-second period. Some responses are still handled, but there is a risk of roughly 30 seconds of downtime between pod restarts. This may be related to the shutdown of the default-http-backend, but that isn't clear.
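A sketch of the commands used to reproduce this observation (the pod name and label selector are placeholders for the actual nginx-ilb pod in the cluster):

```
# Find the load-balancer pod, delete it, and watch the replacement pod's state.
kubectl get pods -l app=nginx-ilb
kubectl delete pod nginx-ilb-xxxxx
kubectl get pods -l app=nginx-ilb -w   # observe Terminating/Pending transitions
```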