OpenShift is becoming the enterprise platform of choice for cloud-native software, implementing higher-level abstractions on top of low-level Kubernetes primitives. As extension mechanisms like aggregated API servers, admission webhooks and custom resource definitions are more widely adopted to run custom workloads, additional stress is imposed on the API server.

The API server is becoming a critical component at risk. Custom controllers with unregulated traffic can cause cluster instability when high-level object access slows down critical low-level communications, leading to request failures, timeouts and API retry storms.

Hence, it is very important that the API server knows how to prioritize traffic from all the different clients, without starving critical control plane traffic.

Kubernetes API Priority and Fairness (APF) is a flow control mechanism that allows platform owners to define API-level policies to regulate inbound requests to the API server. It protects the API server from being overwhelmed by unexpectedly high request volumes, while ensuring that throttling best-effort workloads does not spill over onto critical traffic.

APF has been enabled in OpenShift since version 4.5. In this post, we will examine how OpenShift utilizes APF to protect the control plane. We will also go over some configuration, metrics and debugging endpoints that will help you make APF work for your OpenShift cluster.

What Is APF

Prior to APF, the API server used the --max-requests-inflight and --max-mutating-requests-inflight command-line flags to regulate the volume of inbound requests. The only distinction this implementation can make is whether a request is mutating or not. These flags can’t, for example, ensure that lower-priority traffic doesn’t overwhelm critical traffic, as described in this issue.
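For reference, these limits are ordinary kube-apiserver command-line flags; the values shown here are the OpenShift 4.6 defaults discussed later in this post:

kube-apiserver --max-requests-inflight=3000 --max-mutating-requests-inflight=1000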

By classifying requests into flows and priorities, APF manages and throttles all inbound requests in a prioritized and fair manner.

With APF enabled, all incoming requests are evaluated against a set of flow schemas. Every request will be matched with exactly one flow schema, which assigns the request to a priority level. When requests of a priority level are being throttled, requests of other priority levels remain unaffected.

To further enforce fairness among requests of a priority level, the matching flow schema associates requests with flows, where requests originating from the same source are assigned the same flow distinguisher.

Among the flows in a priority level, new requests are either served immediately, enqueued or rejected, depending on the priority level’s queue capacity, concurrent request limit, and total in-flight requests.

Requests are rejected if one of the following conditions is true:

  • The priority level is configured to reject excessive requests
  • The queues that the new requests will be assigned to are full

Requests are enqueued using shuffle sharding, a technique commonly used to isolate workloads to improve fault tolerance. When sufficient capacity becomes available, the requests will be dequeued using a fair queueing algorithm across the flows. Enqueued requests can also be rejected if the queue’s time limit expires.
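To build some intuition about the enqueuing step, here is a minimal, self-contained Go sketch of the idea (not the API server’s actual implementation): the flow distinguisher is hashed to deal a fixed “hand” of candidate queues for that flow, and each request joins the shortest queue in its hand.

package main

import (
    "fmt"
    "hash/fnv"
    "math/rand"
)

// dealHand seeds a PRNG with a hash of the flow distinguisher and deals a
// deterministic "hand" of handSize distinct queue indices for that flow.
func dealHand(flowDistinguisher string, numQueues, handSize int) []int {
    h := fnv.New64a()
    h.Write([]byte(flowDistinguisher))
    rng := rand.New(rand.NewSource(int64(h.Sum64())))
    return rng.Perm(numQueues)[:handSize]
}

// enqueue adds the request to the least-loaded queue in the flow's hand.
func enqueue(queues [][]string, request, flowDistinguisher string, handSize int) {
    hand := dealHand(flowDistinguisher, len(queues), handSize)
    best := hand[0]
    for _, idx := range hand[1:] {
        if len(queues[idx]) < len(queues[best]) {
            best = idx
        }
    }
    queues[best] = append(queues[best], request)
}

func main() {
    queues := make([][]string, 10) // queues: 10, like the custom priority level created later in this post
    for i := 0; i < 30; i++ {
        flow := fmt.Sprintf("system:serviceaccount:demo:podlister-%d", i%3)
        enqueue(queues, fmt.Sprintf("req-%d", i), flow, 4) // handSize: 4
    }
    for i, q := range queues {
        fmt.Printf("queue %d holds %d requests\n", i, len(q))
    }
}

Because each flow is confined to its own hand, a single noisy flow can saturate at most a handful of queues, while most other flows still find an uncontested queue.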

OpenShift API Priority & Fairness (1)

In subsequent sections, we will go over how to adjust and validate these queueing properties using the FlowSchema and PriorityLevelConfiguration resources.

Introducing OpenShift Flow Schemas

Let’s spin up an OpenShift 4.6.15 cluster with monitoring enabled using CodeReady Containers 1.22.0:

crc config set enable-cluster-monitoring true
crc start --memory=16096

Once the cluster is ready, use the oc CLI to login:

oc login -u kubeadmin -p [password] https://api.crc.testing:6443

You can retrieve your login credentials with:
crc console --credentials

The following is the list of OpenShift FlowSchema resources:

oc get flowschema | grep openshift                                 
openshift-apiserver-sar             exempt                             2     ByUser  29d  False
openshift-oauth-apiserver-sar       exempt                             2     ByUser  29d  False
openshift-apiserver                 workload-high                      1000  ByUser  29d  False
openshift-controller-manager        workload-high                      1000  ByUser  29d  False
openshift-oauth-apiserver           workload-high                      1000  ByUser  29d  False
openshift-oauth-server              workload-high                      1000  ByUser  29d  False
openshift-apiserver-operator        openshift-control-plane-operators  2000  ByUser  29d  False
openshift-authentication-operator   openshift-control-plane-operators  2000  ByUser  29d  False
openshift-etcd-operator             openshift-control-plane-operators  2000  ByUser  29d  False
openshift-kube-apiserver-operator   openshift-control-plane-operators  2000  ByUser  29d  False
openshift-monitoring-metrics        workload-high                      2000  ByUser  29d  False

 

yq 4.3.1 is used to improve the readability of the YAML outputs of subsequent commands.

To better understand some of the important configuration options, let’s examine the spec of the openshift-apiserver-operator flow schema:

oc get flowschema openshift-apiserver-operator -oyaml | yq e .spec - 
distinguisherMethod:
  type: ByUser
matchingPrecedence: 2000
priorityLevelConfiguration:
  name: openshift-control-plane-operators
rules:
  - resourceRules:
      - apiGroups:
          - '*'
        clusterScope: true
        namespaces:
          - '*'
        resources:
          - '*'
        verbs:
          - '*'
    subjects:
      - kind: ServiceAccount
        serviceAccount:
          name: openshift-apiserver-operator
          namespace: openshift-apiserver-operator

The rules describe the list of criteria used to identify matching requests. The flow schema matches a request if and only if:

  • at least one of its subjects matches the subject making the request and
  • at least one of its resourceRules or nonResourceRules matches the verb and (non-)resource being requested

Essentially, this flow schema will match all requests issued by the openshift-apiserver-operator service account in the openshift-apiserver-operator namespace, for all namespace-scoped as well as cluster-scoped resources.

If we impersonate the openshift-apiserver-operator service account to issue a GET request to list all the pods, the X-Kubernetes-Pf-Prioritylevel-Uid and X-Kubernetes-Pf-Flowschema-Uid response headers show that our request is mapped to the openshift-apiserver-operator flow schema and its priority level configuration, as expected:

SERVICE_ACCOUNT="system:serviceaccount:openshift-apiserver-operator:openshift-apiserver-operator"

FLOW_SCHEMA_UID="$(oc get po -A --as "$SERVICE_ACCOUNT" -v8 2>&1 | grep -i X-Kubernetes-Pf-Flowschema-Uid | awk '{print $6}')"

PRIORITY_LEVEL_UID="$(oc get po -A --as "$SERVICE_ACCOUNT" -v8 2>&1 | grep -i X-Kubernetes-Pf-Prioritylevel-Uid | awk '{print $6}')"

CUSTOM_COLUMN="uid:.metadata.uid,name:.metadata.name"
oc get flowschema -o custom-columns="$CUSTOM_COLUMN" | grep $FLOW_SCHEMA_UID
9a3bf863-d69f-470a-b119-df9bd3a709bd   openshift-apiserver-operator

oc get prioritylevelconfiguration -o custom-columns="$CUSTOM_COLUMN" | grep $PRIORITY_LEVEL_UID             
2cf49074-5360-44da-a259-2b051972daf0   openshift-control-plane-operators

Without the service account impersonation, the request issued by the same command will be mapped with the global-default flow schema because the request is bound to the OpenShift kubeadmin user.

This request mapping mechanism provides a granular way to assign requests from different origins to different flows, based on their flow distinguishers, so that they can’t starve each other.

The distinguisherMethod defines how the flow distinguishers are computed:

  • ByUser, where requests originating from the same subject are grouped into the same flow so that different users can’t overwhelm each other
  • ByNamespace, where requests originating from the same namespace are grouped into the same flow so that workloads in one namespace can’t overwhelm those in other namespaces (see the fragment after this list)
  • An empty string, where all requests are grouped into a single flow
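For example, a flow schema that isolates namespaces from one another would declare the following fragment (illustrative only; the rest of the spec looks like the ByUser examples in this post):

distinguisherMethod:
  type: ByNamespace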

When matching requests, a flow schema with a lower matchingPrecedence has higher precedence than one with a higher matchingPrecedence.

The priorityLevelConfiguration refers to the priority level configuration resource that specifies the flow control attributes.

Understanding Priority Level Configuration

The openshift-control-plane-operators priority level is used to regulate OpenShift operator requests to the API server. Let’s take a look at its .spec:

oc get prioritylevelconfiguration openshift-control-plane-operators -oyaml | yq e .spec -
limited:
  assuredConcurrencyShares: 10
  limitResponse:
    queuing:
      handSize: 6
      queueLengthLimit: 50
      queues: 128
    type: Queue
type: Limited

The limited.assuredConcurrencyShares (ACS) defines the concurrency shares used to calculate the assured concurrency value (ACV). The ACV of a priority level defines the total number of concurrent requests that may be executing at a time. Its exact value is affected by the API server’s concurrency limit (SCL), which is divided among all priority levels in proportion to their ACS.

When APF is enabled, the SCL is set to the sum of the --max-requests-inflight and --max-mutating-requests-inflight options. In OpenShift 4.6, these options default to 3000 and 1000, respectively.

Using the formula presented in the Kubernetes documentation, we can calculate the ACV of the openshift-control-plane-operators priority level as follows:

ACV(l) = ceil(SCL * ACS(l) / (sum[priority levels k] ACS(k)))
       = ceil((3000 + 1000) * 10 / (1 + 100 + 10 + 10 + 30 + 40 + 20))
       = ceil(189.57)
       = 190

We can use the apiserver_flowcontrol_request_concurrency_limit metric to confirm this value:
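The graph in the screenshot below comes from a query along these lines (assuming the camelCase priorityLevel label, matching the flowSchema label that the other APF metrics use later in this post):

apiserver_flowcontrol_request_concurrency_limit{job="apiserver", priorityLevel="openshift-control-plane-operators"}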

OpenShift API Priority & Fairness (2)

The Prometheus console is accessible at localhost:9090 via port-forwarding:

oc -n openshift-monitoring port-forward svc/prometheus-operated 9090

Later when a new custom priority level is added, the ACVs of all the existing priority levels will decrease as the SCL is being shared across more priority levels.

The limited.limitResponse defines the strategy to handle requests that can’t be executed immediately. The type subproperty supports two values:

  • Queue where excessive requests are queued
  • Reject where excessive requests are dropped with an HTTP 429 error
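For example, a priority level that sheds excess load instead of queueing it could look like the following sketch (the name and share value are arbitrary, for illustration only):

apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
kind: PriorityLevelConfiguration
metadata:
  name: load-shedding-example
spec:
  type: Limited
  limited:
    assuredConcurrencyShares: 5
    limitResponse:
      type: Reject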

With the Queue configuration, the queueing behavior can be further adjusted using the following subproperties:

  • queueing.queues is the number of queues of a priority level
  • queueing.queueLengthLimit is the number of requests allowed to be waiting in a queue
  • queueing.handSize is the number of possible queues a request can be assigned to during enqueuing. The request will be added to the shortest queue in this list

The exact values to be used for these properties are dependent on your use case.

For example, while increasing the number of queues reduces the rate of collisions between different flows (because there are more queues available), it increases memory usage. Although increasing queueLengthLimit supports bursty traffic (as each queue can hold more requests), it costs more latency and memory. Since a request’s hand of candidate queues is derived from its flow distinguisher, a larger handSize makes it less likely for individual flows to collide (because each flow has more queues to choose from), but more likely for a small number of heavy flows to dominate the API server (as each flow can occupy a larger share of the queues).

In the next section, we will create some custom flow schema and priority level configuration to regulate the traffic from a custom controller.

Configuring Custom Flow Schema and Priority Level

Let’s start by creating a demo namespace with 3 service accounts, namely, podlister-0, podlister-1 and podlister-2, with permissions to LIST and GET pods from the demo namespace:

cat <<EOF | oc apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: demo
EOF

for i in {0..2}; do
cat <<EOF | oc auth reconcile -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: podlister
  namespace: demo
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["list", "get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: podlister
  namespace: demo
subjects:
- apiGroup: ""
  kind: ServiceAccount
  name: podlister-$i
  namespace: demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: podlister
EOF
done

for i in {0..2}; do
cat <<EOF | oc apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: podlister-$i
  namespace: demo
  labels:
    kubernetes.io/name: podlister-$i
EOF
done

Then we will create a custom flow schema and priority level configuration to regulate requests originating from these 3 service accounts:

cat <<EOF | oc apply -f -
apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
kind: FlowSchema
metadata:
  name: restrict-pod-lister
spec:
  priorityLevelConfiguration:
    name: restrict-pod-lister
  distinguisherMethod:
    type: ByUser
  rules:
  - resourceRules:
    - apiGroups: [""]
      namespaces: ["demo"]
      resources: ["pods"]
      verbs: ["list", "get"]
    subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: podlister-0
        namespace: demo
    - kind: ServiceAccount
      serviceAccount:
        name: podlister-1
        namespace: demo
    - kind: ServiceAccount
      serviceAccount:
        name: podlister-2
        namespace: demo
---
apiVersion: flowcontrol.apiserver.k8s.io/v1alpha1
kind: PriorityLevelConfiguration
metadata:
  name: restrict-pod-lister
spec:
  type: Limited
  limited:
    assuredConcurrencyShares: 5
    limitResponse:
      queuing:
        queues: 10
        queueLengthLimit: 20
        handSize: 4
      type: Queue
EOF

The restrict-pod-lister priority level has 10 queues. Each queue can hold a maximum of 20 requests. With its ACS set to 5, this priority level can handle about 93 concurrent requests:
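Applying the earlier formula (the existing priority levels contribute 211 shares in total, and our new level adds 5):

ACV(restrict-pod-lister) = ceil(4000 * 5 / (211 + 5))
                         = ceil(92.59)
                         = 93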

OpenShift API Priority & Fairness (3)

The values used in the above queue configuration are provided for demonstration purposes only. You should adjust them to suit your use case.

Examining APF Metrics And Debugging Endpoints

Now we are ready to deploy our custom controller into the demo namespace, as 3 separate Deployment resources. Each Deployment uses one of the service accounts we created earlier:

for i in {0..2}; do
cat <<EOF | oc apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podlister-$i
  namespace: demo
  labels:
    kubernetes.io/name: podlister-$i
spec:
  selector:
    matchLabels:
      kubernetes.io/name: podlister-$i
  template:
    metadata:
      labels:
        kubernetes.io/name: podlister-$i
    spec:
      serviceAccountName: podlister-$i
      containers:
      - name: podlister
        image: quay.io/isim/podlister
        imagePullPolicy: Always
        command:
        - /podlister
        env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: SHOW_ERRORS_ONLY
          value: "true"
        - name: TARGET_NAMESPACE
          value: demo
        - name: TICK_INTERVAL
          value: 100ms
        resources:
          requests:
            cpu: 30m
            memory: 50Mi
          limits:
            cpu: 100m
            memory: 128Mi
EOF
done

The controller uses Go’s time.Tick() function to send continuous traffic to the LIST pod endpoint of the API server, to retrieve all the pods in the demo namespace. The source code can be found here.
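Its core loop boils down to something like the following client-go sketch (a simplification of the linked source; the real controller also honors TICK_INTERVAL, TARGET_NAMESPACE and the other environment variables set in the Deployment above):

package main

import (
    "context"
    "log"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
)

func main() {
    // In-cluster config picks up the pod's service account token, so APF
    // attributes every request to that service account (our flow distinguisher).
    config, err := rest.InClusterConfig()
    if err != nil {
        log.Fatal(err)
    }
    client, err := kubernetes.NewForConfig(config)
    if err != nil {
        log.Fatal(err)
    }

    // Issue a LIST on every tick; 100ms matches the TICK_INTERVAL set above.
    for range time.Tick(100 * time.Millisecond) {
        _, err := client.CoreV1().Pods("demo").List(context.TODO(), metav1.ListOptions{})
        if err != nil {
            log.Printf("error while listing pods: %v", err)
        }
    }
}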

Switching over to the Prometheus console, let’s use the apiserver_flowcontrol_dispatched_requests_total metric to retrieve the total number of requests matched by our flow schema:

apiserver_flowcontrol_dispatched_requests_total{job="apiserver",flowSchema="restrict-pod-lister"}

OpenShift API Priority & Fairness (4)

As expected of a counter vector, we observe an upward trend in the summation of its rates:

sum(rate(apiserver_flowcontrol_dispatched_requests_total{job="apiserver",flowSchema="restrict-pod-lister"}[15m])) by (flowSchema)

OpenShift API Priority & Fairness (5)

The apiserver_flowcontrol_current_inqueue_requests metric shows the number of requests waiting in the queues. The 0 value indicates that our queues are currently empty:

apiserver_flowcontrol_current_inqueue_requests{job="apiserver",flowSchema="restrict-pod-lister"}

OpenShift API Priority & Fairness

More importantly, the number of rejected requests is also 0, as shown by the apiserver_flowcontrol_rejected_requests_total metric:

apiserver_flowcontrol_rejected_requests_total{job="apiserver",flowSchema="restrict-pod-lister"}

OpenShift API Priority & Fairness-1

The apiserver_flowcontrol_request_execution_seconds metric provides insights into how long it takes to execute requests in our priority level:

histogram_quantile(0.99, sum(rate(apiserver_flowcontrol_request_execution_seconds_bucket{job="apiserver",flowSchema="restrict-pod-lister"}[15m])) by (le,flowSchema))

OpenShift API Priority & Fairness (1)-1

In this particular test run, the p99 request execution time in our priority level is around 16 milliseconds.

Meanwhile, the apiserver_flowcontrol_request_wait_duration_seconds metric shows how long requests spend waiting inside the queues:

histogram_quantile(0.99, sum(rate(apiserver_flowcontrol_request_wait_duration_seconds_bucket{job="apiserver",flowSchema="restrict-pod-lister"}[15m])) by (le,flowSchema))

OpenShift API Priority & Fairness (2)-1

The p99 of the request wait duration of this test run is around 4.95 milliseconds.

We will revisit these two metrics later to see how they can affect our client-side context timeout.

Let’s add more replicas to increase the traffic volume to activate the queueing effect:

for i in {0..2}; do oc -n demo scale deploy/podlister-$i --replicas=10; done

When our queues are saturated, the number of rejected requests starts to increase. The reason label tells us why these requests are being rejected (i.e., queue-full, timeout or concurrency-limit):

sum(rate(apiserver_flowcontrol_rejected_requests_total{job="apiserver",flowSchema="restrict-pod-lister"}[15m])) by (flowSchema,reason)

OpenShift API Priority & Fairness (3)-1

As the API server responds with HTTP 504 (request timed out) errors, these error messages can be seen in the controller’s logs:

oc -n demo logs deploy/podlister-0 | grep -i error                                                    
2021/02/11 04:32:39 error while listing pods: the server was unable to return a response in the time allotted, but may still be processing the request (get pods)
2021/02/11 04:33:39 error while listing pods: the server was unable to return a response in the time allotted, but may still be processing the request (get pods)

In cases where the API server responds with an HTTP 429 (too many requests) error, the controller will see this error message:

the server has received too many requests and has asked us to try again later
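If you prefer to distinguish these two failure modes in code rather than by grepping logs, the apimachinery errors package can classify them. Here is a small sketch of a helper that could sit alongside the controller loop shown earlier:

package main

import (
    apierrors "k8s.io/apimachinery/pkg/api/errors"
)

// classify maps a client-go error to the APF failure mode it most likely
// represents; it could be called on the List error in the loop above.
func classify(err error) string {
    switch {
    case err == nil:
        return "ok"
    case apierrors.IsTooManyRequests(err): // HTTP 429: rejected outright, retry later
        return "throttled"
    case apierrors.IsTimeout(err) || apierrors.IsServerTimeout(err): // HTTP 504-style timeouts
        return "timed out"
    default:
        return "other error"
    }
}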

The p99 of the request in-queue wait duration ranges between 1.5 to 3.5  seconds:

histogram_quantile(0.99, sum(rate(apiserver_flowcontrol_request_wait_duration_seconds_bucket{job="apiserver",flowSchema="restrict-pod-lister"}[15m])) by (le,flowSchema))

OpenShift API Priority & Fairness (4)-1

The p99 request execution time lies between 1.5 to 6.0 seconds:

histogram_quantile(0.99, sum(rate(apiserver_flowcontrol_request_execution_seconds_bucket{job="apiserver",flowSchema="restrict-pod-lister"}[15m])) by (le,flowSchema))

OpenShift API Priority & Fairness (5)-1

In addition to metrics, APF also exposes debugging endpoints that can provide further insights into the conditions of the queues.

The /debug/api_priority_and_fairness/dump_priority_levels endpoint tells us the total number of executing and waiting requests in our priority level:

oc get --raw /debug/api_priority_and_fairness/dump_priority_levels   
PriorityLevelName,                 ActiveQueues, IsIdle, IsQuiescing, WaitingRequests, Exec..
workload-high,                     0,            true,   false,       0,               0
exempt,                            <none>,       <none>, <none>,      <none>,          <none>
openshift-control-plane-operators, 0,            false,  false,       0,               2
global-default,                    0,            false,  false,       0,               1
system,                            0,            true,   false,       0,               0
restrict-pod-lister,               8,            false,  false,       155,             93
leader-election,                   0,            true,   false,       0,               0
workload-low,                      0,            false,  false,       0,               2
catch-all,                         0,            true,   false,       0,               0

At the time of this particular test run, there were 155 waiting requests and 93 executing requests in the restrict-pod-lister priority level.

The /debug/api_priority_and_fairness/dump_queues endpoint can provide further visibility into the condition of every queue in our flow schema:

oc get --raw /debug/api_priority_and_fairness/dump_queues  | grep -i restrict-pod-lister 
PriorityLevelName,                 Index,  PendingRequests, ExecutingRequests, VirtualStart    
restrict-pod-lister,               0,      19,              12,                25217.6231
restrict-pod-lister,               1,      18,              10,                25251.8502
restrict-pod-lister,               2,      20,              11,                25213.0914
restrict-pod-lister,               3,      19,              11,                25229.0108
restrict-pod-lister,               4,      18,              12,                25207.1798
restrict-pod-lister,               5,      19,              12,                25213.2181
restrict-pod-lister,               6,      0,               0,                 0.0000
restrict-pod-lister,               7,      19,              11,                25205.3927
restrict-pod-lister,               8,      0,               0,                 0.0000
restrict-pod-lister,               9,      19,              14,                25232.5364
...

Finally, the /debug/api_priority_and_fairness/dump_requests endpoint allows us to identify which queue the request is assigned to:

oc get --raw /debug/api_priority_and_fairness/dump_requests 
PriorityLevelName,   FlowSchemaName,      QueueIndex, RequestIndexInQueue, FlowDistingsher,                        ArriveTime
restrict-pod-lister, restrict-pod-lister, 0,          0,                   system:serviceaccount:demo:podlister-1, 2021-02-11T05:13:59.874733557Z
restrict-pod-lister, restrict-pod-lister, 0,          1,                   system:serviceaccount:demo:podlister-1, 2021-02-11T05:13:59.880309335Z
restrict-pod-lister, restrict-pod-lister, 0,          2,                   system:serviceaccount:demo:podlister-1, 2021-02-11T05:13:59.881055726Z

restrict-pod-lister, restrict-pod-lister, 1,          13,                  system:serviceaccount:demo:podlister-0, 2021-02-11T05:14:01.645786117Z
restrict-pod-lister, restrict-pod-lister, 1,          14,                  system:serviceaccount:demo:podlister-0, 2021-02-11T05:14:01.825985532Z
restrict-pod-lister, restrict-pod-lister, 1,          15,                  system:serviceaccount:demo:podlister-0, 2021-02-11T05:14:01.899721291Z
restrict-pod-lister, restrict-pod-lister, 1,          16,                  system:serviceaccount:demo:podlister-1, 2021-02-11T05:14:02.167530293Z
restrict-pod-lister, restrict-pod-lister, 1,          17,                  system:serviceaccount:demo:podlister-1, 2021-02-11T05:14:02.183224599Z
...
restrict-pod-lister, restrict-pod-lister, 3,          0,                   system:serviceaccount:demo:podlister-2, 2021-02-11T05:14:01.051811112Z
restrict-pod-lister, restrict-pod-lister, 3,          1,                   system:serviceaccount:demo:podlister-2, 2021-02-11T05:14:01.053504144Z
restrict-pod-lister, restrict-pod-lister, 3,          2,                   system:serviceaccount:demo:podlister-2, 2021-02-11T05:14:01.0833556Z
...

While all of this is happening, the OpenShift cluster operators remain healthy and unaffected:

CUSTOM_COLUMNS="Name:.metadata.name,AVAILABLE:.status.conditions[?(@.type=='Available')].status,DEGRADED:.status.conditions[?(@.type=='Degraded')].status"

oc get clusteroperator -o custom-columns="$CUSTOM_COLUMNS" 
Name                                       AVAILABLE   DEGRADED
authentication                             True        False
cloud-credential                           True        False
cluster-autoscaler                         True        False
config-operator                            True        False
console                                    True        False
csi-snapshot-controller                    True        False
dns                                        True        False
etcd                                       True        False
image-registry                             True        False
ingress                                    True        False
insights                                   True        False
kube-apiserver                             True        False
kube-controller-manager                    True        False
kube-scheduler                             True        False
kube-storage-version-migrator              True        False
machine-api                                True        False
machine-approver                           True        False
machine-config                             True        False
marketplace                                True        False
monitoring                                 True        False
network                                    True        False
node-tuning                                True        False
openshift-apiserver                        True        False
openshift-controller-manager               True        False
openshift-samples                          True        False
operator-lifecycle-manager                 True        False
operator-lifecycle-manager-catalog         True        False
operator-lifecycle-manager-packageserver   True        False
service-ca                                 True        False
storage                                    True        False

If we update the controllers with a context timeout that is less than the in-queue wait duration, we will start seeing some client-side context deadline exceeded errors in the logs:

oc -n demo set env deploy CONTEXT_TIMEOUT=1s --all                        

oc -n demo logs deploy/podlister-0 | grep -i "context deadline"  
2021/02/11 05:22:35 error while listing pods: Get "https://172.25.0.1:443/api/v1/namespaces/demo/pods": context deadline exceeded
2021/02/11 05:22:36 error while listing pods: Get "https://172.25.0.1:443/api/v1/namespaces/demo/pods": context deadline exceeded
2021/02/11 05:22:36 error while listing pods: Get "https://172.25.0.1:443/api/v1/namespaces/demo/pods": context deadline exceeded
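Under the hood, the CONTEXT_TIMEOUT knob amounts to wrapping each request in a context deadline, roughly like this fragment of the earlier loop sketch (CONTEXT_TIMEOUT itself is an environment variable of this demo controller, not a client-go option):

// Give up client-side after 1s, even if the request is still waiting in an
// APF queue on the server; the error surfaces as "context deadline exceeded".
ctx, cancel := context.WithTimeout(context.Background(), time.Second)

_, err := client.CoreV1().Pods("demo").List(ctx, metav1.ListOptions{})
if err != nil {
    log.Printf("error while listing pods: %v", err)
}
cancel()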

So if you start seeing many context deadline exceeded errors in your controllers’ logs, you now know how to use the APF metrics, debugging endpoints and error logs to determine if APF is throttling your requests.

There are other APF metrics not covered in this post that you might find relevant to your use case. Check out the APF documentation for the full list.

Recovering From The Throttling Effect

If we scale the controllers down to 0 replicas, the number of rejected requests will gradually decrease, as the API server recovers from the throttling effect:

for i in {0..2}; do oc -n demo scale deploy/podlister-$i --replicas=0; done

OpenShift API Priority & Fairness (6)-1

Conclusion

In this post, we went over how to create and configure custom FlowSchema and PriorityLevelConfiguration resources to regulate inbound traffic to the API server.

We saw how APF queued and rejected the excessive requests generated by our custom controller. We used the different APF metrics and debugging endpoints to gain insights into the priority level queues. Throughout, the OpenShift cluster operators remained unaffected by the throttling.

We also looked at the scenario where our client-side context deadline timed out before the API server finished processing our requests, due to a long in-queue wait duration.

APF offers other configurations, such as rejecting excess traffic instead of queueing it, and regulating inbound traffic by namespace instead of by user, all of which are left as an exercise for the reader.