Red Hat blog

Benchmarking OpenShift Network Performance Part 2: Deep Dive

May 7, 2020 | Courtney Pacheco, Mohit Sheth

FURTHER INVESTIGATIONS

In part 1, you saw the basics of uperf and how to deploy the Ripsaw benchmark operator with default values. A variety of arguments can be tweaked to customize the user's workload. Below are some of these arguments.

MULTUS

The Container Network Interface (CNI) defines how pods in the cluster communicate with other pods. OpenShift supports various CNI plugins: OpenShiftSDN is the default, and third-party plugins like Flannel SDN and Nuage SDN are also supported. This is where Multus comes into the picture. Like the plugins above, Multus is a CNI plugin, but one that enables attaching multiple network interfaces to pods. This makes Multus a "meta-plugin," a CNI plugin that can call multiple other CNI plugins. To use it, a “NetworkAttachmentDefinition” is defined in a file named NAD.yaml.

This is applied to the my-ripsaw namespace and can be checked as follows:

$ oc apply -f NAD.yaml -n my-ripsaw
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-range-0 created
networkattachmentdefinition.k8s.cni.cncf.io/macvlan-range-1 created

$ oc get network-attachment-definitions -n my-ripsaw
NAME AGE
macvlan-range-0 39h
macvlan-range-1 39h

Now these “NetworkAttachmentDefinition” objects can be referenced in the uperf CR:

<snip>  
  workload:
    cleanup: false
    name: uperf
    args:
      serviceip: true
      hostnetwork: false
      pin: false
      multus:
        enabled: true
        client: "macvlan-range-0"
        server: "macvlan-range-1"
</snip>

When this CR is applied, you can check that the macvlan network is added to the client and server pods. The macvlan annotations applied to the server pod appear under “Annotations” in the output below. The IP address 11.10.1.68 comes from the range provided in the “NetworkAttachmentDefinition” for the server:

$ oc describe pod <uperf_server_pod_name>
Name:         <uperf_server_pod_name>
Namespace:    my-ripsaw
Priority:     0
Node:         ip-10-0-138-102.us-west-2.compute.internal/10.0.138.102
Start Time:   Sun, 01 Mar 2020 16:20:08 +0000
Labels:       app=uperf-bench-server-0-2d679e17
              type=uperf-bench-server-2d679e17
Annotations:  k8s.v1.cni.cncf.io/networks: macvlan-range-1
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.133.0.242"
                    ],
                    "dns": {},
                    "default-route": [
                        "10.133.0.1"
                    ]
                },{
                    "name": "macvlan-range-1",
                    "interface": "net1",
                    "ips": [
                        "11.10.1.68"
                    ],
                    "mac": "8a:9d:41:8d:2b:aa",
                    "dns": {}
                }]
              openshift.io/scc: restricted
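
If you want to check the attached networks programmatically rather than by eye, the networks-status annotation is plain JSON and can be parsed directly. The following is a minimal sketch, assuming the annotation value has already been copied out of the pod description; the helper name interface_ips is hypothetical and is not part of Ripsaw or Multus.

```python
import json

# networks-status value copied from the `oc describe pod` output above.
NETWORKS_STATUS = """
[{
    "name": "openshift-sdn",
    "interface": "eth0",
    "ips": ["10.133.0.242"],
    "dns": {},
    "default-route": ["10.133.0.1"]
},{
    "name": "macvlan-range-1",
    "interface": "net1",
    "ips": ["11.10.1.68"],
    "mac": "8a:9d:41:8d:2b:aa",
    "dns": {}
}]
"""


def interface_ips(status_json):
    """Map each interface name to the list of IPs reported for it."""
    return {net["interface"]: net.get("ips", []) for net in json.loads(status_json)}


ips_by_interface = interface_ips(NETWORKS_STATUS)
for interface, ips in ips_by_interface.items():
    print(interface, ips)
```

A pod with a working Multus attachment should report at least two interfaces here: eth0 from the default CNI and net1 from the macvlan definition.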

Now you can check the multiple network interfaces inside the server or client pod:

$ oc rsh <uperf_server_pod_name>
sh-4.2$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if245: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8951 qdisc noqueue state UP group default 
    link/ether 0a:58:0a:85:00:f0 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.133.0.242/22 brd 10.133.3.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::4413:2fff:fecc:d394/64 scope link 
       valid_lft forever preferred_lft forever
4: net1@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default 
    link/ether d6:ac:00:0d:01:52 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 11.10.1.68/16 brd 11.10.255.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::d4ac:ff:fe0d:152/64 scope link 
       valid_lft forever preferred_lft forever

MULTITHREADING

Here, the number of threads has been changed in the CR:

      nthrs:
        - 4

Make sure to delete and “reapply” the CR:

$ kubectl delete -f ${RIPSAW}/resources/crds/ripsaw_v1alpha1_uperf_cr.yaml
$ kubectl apply -f ${RIPSAW}/resources/crds/ripsaw_v1alpha1_uperf_cr.yaml

Once the benchmarking has completed, you can view the logs once again. The differences between the two runs are shown below. The first profile is from the original run, and the second is from the new run:

<?xml version=1.0?>
<profile name="stream-tcp-16384-1">
<group nthreads="1">
      <transaction iterations="1">
        <flowop type="connect" options="remotehost=$h protocol=tcp"/>
      </transaction>
      <transaction duration="30">
        <flowop type=write options="count=16 size=16384"/>
      </transaction>
      <transaction iterations="1">
        <flowop type=disconnect />
      </transaction>
          </group>
</profile>

<?xml version=1.0?>
<profile name="stream-tcp-16384-4">
<group nthreads="4">
      <transaction iterations="1">
        <flowop type="connect" options="remotehost=$h protocol=tcp"/>
      </transaction>
      <transaction duration="30">
        <flowop type=write options="count=16 size=16384"/>
      </transaction>
      <transaction iterations="1">
        <flowop type=disconnect />
      </transaction>
          </group>
</profile>

The main differences are: (1) different profile name, and (2) different value for “nthreads.” Perfect!

The handshake “phases” debug output should not look any different, so no differences are expected there, as it is mostly doing the same basic testing and error checking.

When you look at the execution of the flow ops, you see major differences. For example:

</snip>
Starting 1 threads running profile:stream-tcp-16384-1 ...   0.00 seconds
TX command [UPERF_CMD_NEXT_TXN, 0] to <server-pod-ip>
timestamp_ms:1579010499379.3477 name:Txn1 nr_bytes:0 nr_ops:0
timestamp_ms:1579010500380.2354 name:Txn1 nr_bytes:0 nr_ops:1
</snip>

</snip>
Starting 4 threads running profile:stream-tcp-16384-4 ...   0.00 seconds
TX command [UPERF_CMD_NEXT_TXN, 0] to <server-pod-ip>
timestamp_ms:1579050082616.3635 name:Txn1 nr_bytes:0 nr_ops:0
timestamp_ms:1579050083617.2061 name:Txn1 nr_bytes:0 nr_ops:4
</snip>

Although not immediately obvious, the results for transaction ID 1 differ between the two runs: with four threads, nr_ops reaches 4 where the single-thread run reported 1. (Remember, transaction IDs start with 0, not 1!)

You also see a difference in the “Strand Details” section. That is, you see four strands being used when “nthrs” is set to 4 rather than a single strand when “nthrs” is set to 1.

Next, you see major differences in the transaction and flowop tables:

Txn                Count         avg         cpu         max         min 
---------------------------------------------------------------------------
Txn0                   1    853.41us      0.00ns    853.41us    853.41us 
Txn1               52740    570.21us      0.00ns    148.09ms     30.09us 
Txn2                   1     13.15us      0.00ns     13.15us     13.15us 

Flowop             Count         avg         cpu         max         min 
---------------------------------------------------------------------------
connect                1    852.71us      0.00ns    852.71us    852.71us 
write             843824     35.61us      0.00ns    148.09ms     29.99us 
disconnect             1     12.66us      0.00ns     12.66us     12.66us

Txn                Count         avg         cpu         max         min 
---------------------------------------------------------------------------
Txn0                   4    462.02us      0.00ns    595.69us    361.99us 
Txn1               93164      1.30ms      0.00ns    350.60ms     29.93us 
Txn2                   4      9.41us      0.00ns     13.69us      5.50us 

Flowop             Count         avg         cpu         max         min 
---------------------------------------------------------------------------
connect                4    461.46us      0.00ns    595.01us    361.40us 
write            1490589     80.95us      0.00ns    350.60ms     29.84us 
disconnect             4      9.00us      0.00ns     13.26us      5.11us

Far more writes are being executed, and you see 4 separate connects/disconnects, one for each thread.
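
To put a number on "far more writes," the counts in the flowop tables can be compared directly and turned into a rough payload rate. The sketch below uses only figures from the tables and profiles above; it assumes the write phase lasted exactly the 30 seconds set in the profile, so treat the rates as estimates rather than as what uperf itself reports. The function name payload_bits_per_second is hypothetical.

```python
# Figures copied from the profiles and flowop tables above.
WRITE_SIZE_BYTES = 16384   # size=16384 in the write flowop
DURATION_S = 30            # transaction duration="30" (assumed to be exact)

writes_1_thread = 843824
writes_4_threads = 1490589


def payload_bits_per_second(write_count, size_bytes=WRITE_SIZE_BYTES, duration_s=DURATION_S):
    """Estimate payload throughput from the number of completed writes."""
    return write_count * size_bytes * 8 / duration_s


speedup = writes_4_threads / writes_1_thread
print(f"write count ratio: {speedup:.2f}x")

for label, count in (("1 thread", writes_1_thread), ("4 threads", writes_4_threads)):
    print(f"{label}: {payload_bits_per_second(count) / 1e9:.2f} Gbit/s (estimated)")
```

The estimate ignores protocol overhead and any unit conventions uperf applies, so it is not expected to match the netstat figures exactly.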

When you look at the netstat statistics, you also see some major differences. Namely, the rates for opkts and obits increase, while the rates for ipkts and ibits decrease. For example:

Netstat statistics for this run
-------------------------------------------------------------------------------
Nic       opkts/s     ipkts/s      obits/s      ibits/s
eth0         9503       11004     3.42Gb/s     5.81Mb/s 
-------------------------------------------------------------------------------

Netstat statistics for this run
-------------------------------------------------------------------------------
Nic       opkts/s     ipkts/s      obits/s      ibits/s
eth0        14745        7480     6.03Gb/s     3.98Mb/s 
-------------------------------------------------------------------------------
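
The direction and size of each change can be worked out by parsing the two data rows. This is a small sketch rather than a general uperf log parser: the function names are hypothetical, and it assumes the Kb/Mb/Gb suffixes are decimal multipliers, which does not affect ratios between values that share a unit.

```python
import re

# Assumed decimal multipliers for the unit suffixes in the netstat table.
UNIT_FACTORS = {"b/s": 1.0, "Kb/s": 1e3, "Mb/s": 1e6, "Gb/s": 1e9}


def to_bits_per_second(text):
    """Convert a value such as '3.42Gb/s' to a float in bits per second."""
    match = re.fullmatch(r"([0-9.]+)([KMG]?b/s)", text)
    if match is None:
        raise ValueError(f"unrecognized rate: {text!r}")
    number, unit = match.groups()
    return float(number) * UNIT_FACTORS[unit]


def parse_netstat_row(row):
    """Parse one 'Nic opkts/s ipkts/s obits/s ibits/s' data row into a dict."""
    nic, opkts, ipkts, obits, ibits = row.split()
    return {
        "nic": nic,
        "opkts": int(opkts),
        "ipkts": int(ipkts),
        "obits": to_bits_per_second(obits),
        "ibits": to_bits_per_second(ibits),
    }


# Data rows copied from the two netstat tables above.
one_thread = parse_netstat_row("eth0         9503       11004     3.42Gb/s     5.81Mb/s")
four_threads = parse_netstat_row("eth0        14745        7480     6.03Gb/s     3.98Mb/s")

# A ratio above 1 means the rate increased with four threads.
ratios = {key: four_threads[key] / one_thread[key] for key in ("opkts", "ipkts", "obits", "ibits")}
for key, ratio in ratios.items():
    print(f"{key}: {ratio:.2f}x")
```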

Finally, when you look at the run statistics, you will see that the bytes/sec and ops/sec nearly double, while latency nearly halves. So, you see a significant difference between using a single thread and multiple threads.

DIFFERENT PROTOCOLS

Protocols other than TCP can be used. For example, you could use UDP by removing the “tcp” value under “protos” in the uperf CR yaml and adding “udp” instead. However, using a different protocol does not alter the format of the output logs. That is, you still get “netstat statistics,” “run statistics,” etc., all in the same format. The most noticeable difference with UDP, though, is that the “ipkts/s” rate drops to zero (0) for the netstat results.

If desired, you can run two different protocols (“protos”) in the same test run. To do so, simply add a line for “udp” after the “tcp” line, as in the following snippet:

</snip>

      protos:
        - tcp
        - udp
      sizes:
        - 16384
      nthrs:
        - 4
      runtime: 30

Once the CR is reapplied and the benchmark has successfully completed, there will be a separator in the logs separating the TCP results from the UDP results.
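
To see which result sections to expect in the logs, you can enumerate the combinations of the list-valued arguments. The sketch below is illustrative only: the naming pattern is inferred from the profile names seen earlier in the logs (for example, stream-tcp-16384-4), the "stream" test type is an assumption, and the order in which the operator actually runs the combinations is not something this post confirms.

```python
from itertools import product

# Values taken from the CR snippet above.
protos = ["tcp", "udp"]
sizes = [16384]
nthrs = [4]


def profile_names(test_type, protos, sizes, nthrs):
    """Build one profile name per (protocol, size, thread count) combination."""
    return [
        f"{test_type}-{proto}-{size}-{n}"
        for proto, size, n in product(protos, sizes, nthrs)
    ]


# "stream" is assumed here because it is the test type seen in the logs above.
names = profile_names("stream", protos, sizes, nthrs)
for name in names:
    print(name)
```

Adding more entries to any of the lists multiplies the number of combinations, and with it the total runtime of the benchmark.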

OTHER PARAMETERS

Other parameters can be changed, too. For example, you can set the message size, the runtime, etc. More information on the effects of modifying these parameters can be found in reference 8 (the UPerf manual). The values used for these parameters are very specific to one’s project, so they will not be covered here.

KIBANA AND ELASTICSEARCH

As mentioned in the CR yaml comments, you can store uperf benchmark results in Elasticsearch and view them in Kibana. Info on how to store results into Elasticsearch can be found here, under the “Storing Results into Elasticsearch” section: https://github.com/cloud-bulldozer/ripsaw/blob/master/docs/uperf.md

RESULTS - OCP 4.3 v OCP 4.2 on AWS

For Performance and Scale release testing, you usually compare the current release to the previous one. The graphs below show TCP Throughput and TCP Latency performance numbers for pod to pod using OpenShiftSDN as the overlay. The Instance count is the number of client-server pairs used and the number of threads is 1.
