I have written about using Taints and Tolerations to prevent pods from running on certain (tainted) nodes, and there is some influence on scheduling that we can exert using Limits and Requests. But if we really want to control pod placement we need look no further than Node/Pod Affinity and Anti-Affinity. This allows you to specify the nodes that your pod can run on (Node Affinity), to place pods alongside other pods (Pod Affinity), and to spread your pods out over different nodes at runtime (Anti-Affinity). Why? Because it’s great to have a container cluster, but if all your pods are landing on a single node you’re not gonna have a great (up)time. Let’s get started!
As always, these concepts apply to both Kubernetes and Openshift. We will try to do everything from the
oc command line.
Affinity means to have a natural liking for something. In Openshift it means that there is a connection (a preferred grouping) between resources. Naturally, Anti-Affinity inverts this. With Affinity you can group workloads together so that pods land on the same host. This can be useful if your workload gains performance by being scheduled together. The inverse is also true: by using Anti-Affinity rules we can make sure that not all of our frontend pods run on the same node, so that when that node goes down or gets busy our application pods won’t all go down at once.
Affinity is specified in your pod spec,
pod.spec.affinity. Tip! Use
oc explain pod.spec.affinity for some helpful info:
$ oc explain pod.spec.affinity
KIND:     Pod
VERSION:  v1

RESOURCE: affinity <Object>

DESCRIPTION:
     If specified, the pod's scheduling constraints

     Affinity is a group of affinity scheduling rules.

FIELDS:
   nodeAffinity <Object>
     Describes node affinity scheduling rules for the pod.

   podAffinity <Object>
     Describes pod affinity scheduling rules (e.g. co-locate this pod in the
     same node, zone, etc. as some other pod(s)).

   podAntiAffinity <Object>
     Describes pod anti-affinity scheduling rules (e.g. avoid putting this pod
     in the same node, zone, etc. as some other pod(s)).
With nodeAffinity we can ask for the pod to be scheduled (or not to be scheduled) on a node with a certain label. This works a lot like a
podAffinity is used to tell the scheduler to place our pod with other pods based on affinity rules
podAntiAffinity enables us to separate pods based on affinity rules
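The rest of this post concentrates on the two pod-based fields, so here is a minimal sketch of what a nodeAffinity rule could look like. The pod name and the disktype=ssd node label are assumptions made up for illustration; substitute a label that actually exists on your nodes (oc get nodes --show-labels lists them).

```yaml
# Hypothetical example: the pod name and the disktype label are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: wants-an-ssd-node
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype        # node label key (assumed)
            operator: In
            values:
            - ssd
  containers:
  - name: wants-an-ssd-node
    image: docker.io/ocpqe/hello-pod
```

Note that node affinity matches labels on the node itself, so it does not take a topologyKey.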
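When using a Pod Affinity or Anti-Affinity rule you also need to specify a topologyKey, for example
topologyKey: kubernetes.io/hostname, in the
yaml. It names the node label that defines what counts as the same location; with kubernetes.io/hostname that is a single node. Also, when using a Preferred rule you need to set a
weight so that the scheduler knows (on a scale from 1-100) how strongly it should weigh the preference.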
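To make the roles of topologyKey and weight a bit more concrete, here is a sketch of a pod spec fragment with two preferred rules. The app=cache label is a made-up example, and the zone rule assumes your nodes carry the well-known topology.kubernetes.io/zone label, which a single-node test cluster may not.

```yaml
# Fragment of a pod spec; the app=cache label is a hypothetical example.
affinity:
  podAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 80                                  # strong preference: same node as the cache
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - cache
        topologyKey: kubernetes.io/hostname
    - weight: 20                                  # weaker preference: at least the same zone
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - cache
        topologyKey: topology.kubernetes.io/zone
```

Roughly speaking, a node that satisfies both terms scores higher than a node that only satisfies the zone term, but since these are preferences the pod can still land somewhere else if neither can be met.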
Why not taint?
At this point you might be asking, why not use a toleration or the nodeSelector found in the pod spec? This is a good question. These techniques give us control over where to place a pod, but they do so based on static information about the node. Using Affinity rules we can schedule dynamically based on where other pods are located.
So, how does this work? Affinity Rules use matchExpressions based on
key=value pairs to match. We will take the following
yaml as an example:
kind: Pod
metadata:
  name: looking-for-a-green-pod
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: color
            operator: In
            values:
            - green
            - darkgreen
            - lightgreen
        topologyKey: kubernetes.io/hostname
  containers:
  - name: looking-for-a-green-pod
    image: docker.io/ocpqe/hello-pod
This will create a pod called
looking-for-a-green-pod that looks for another pod that has the key
color with one of three values: green, darkgreen or lightgreen.
We could easily create a pod called
black-and-white that just won’t schedule on the same node as a pod with color defined, using the following anti-affinity rule:
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: color
            operator: Exists
        topologyKey: kubernetes.io/hostname
The operators we can use are:
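- In Meaning the label’s value is one of the listed values
- NotIn Meaning the label’s value is not in the listed values
- Exists Meaning the label key exists, whatever its value
- DoesNotExist The label key should not exist
- Lt Less than (node affinity only)
- Gt Greater than (node affinity only)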
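Because Gt and Lt compare numbers, they are only accepted in node affinity rules and not in the label selectors of pod (anti-)affinity. A minimal sketch, assuming a hypothetical cpu-count label that you have put on your nodes yourself:

```yaml
# Fragment of a pod spec; the cpu-count node label is a hypothetical example.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: cpu-count
          operator: Gt
          values:
          - "8"        # exactly one value, written as a string but compared as an integer
```

With this rule the pod would only be scheduled on nodes whose cpu-count label holds a number greater than 8.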
Required and Preferred
We can set up our Affinity rules in two modes, “Required” and “Preferred”. Let me explain:
- Required Affinity rules have to be met before a pod is scheduled on a node
- Preferred Affinity rules are, well, preferred. We would like these rules to be met but we can be a bit more flexible
Creating pods with Affinity rules
Let’s spin up two pods that should be scheduled together:
# green-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: green-pod
  labels:
    color: green
spec:
  containers:
  - name: green-pod
    image: docker.io/ocpqe/hello-pod
# looking-for-a-green-pod
apiVersion: v1
kind: Pod
metadata:
  name: looking-for-a-green-pod
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: color
            operator: In
            values:
            - green
            - darkgreen
            - lightgreen
        topologyKey: kubernetes.io/hostname
  containers:
  - name: looking-for-a-green-pod
    image: docker.io/ocpqe/hello-pod
You can save both these definitions to a
yaml file and use
oc apply -f FILE to create them. When this is done they should both be running on the same node:
$ oc get pods -o wide
NAME                      READY   STATUS    RESTARTS   AGE   IP             NODE                 NOMINATED NODE   READINESS GATES
green-pod                 1/1     Running   0          11m   10.217.0.98    crc-ktfxm-master-0   <none>           <none>
looking-for-a-green-pod   1/1     Running   0          64s   10.217.0.101   crc-ktfxm-master-0   <none>           <none>
Let’s create our pod that does not like any color:
# black-and-white.yaml
apiVersion: v1
kind: Pod
metadata:
  name: black-and-white
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: color
            operator: Exists
        topologyKey: kubernetes.io/hostname
  containers:
  - name: black-and-white
    image: docker.io/ocpqe/hello-pod
Now when we have a look at our pods we will see that our newest one does not like to run with the other pods:
$ oc get pods -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP             NODE                 NOMINATED NODE   READINESS GATES
black-and-white           0/1     Pending   0          14s     <none>         <none>               <none>           <none>
green-pod                 1/1     Running   0          16m     10.217.0.98    crc-ktfxm-master-0   <none>           <none>
looking-for-a-green-pod   1/1     Running   0          6m34s   10.217.0.101   crc-ktfxm-master-0   <none>           <none>
Note: Because this is running on CRC, which is a single-node cluster, the pod will not start because there are no other nodes available.
And we can see the effect with
$ oc describe pod black-and-white
....
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  36s (x2 over 101s)  default-scheduler  0/1 nodes are available: 1 node(s) didn't match pod anti-affinity rules.
Now let’s change the pod from requiring the affinity rule to be met to a preferred rule. This is not as simple as swapping out
requiredDuringSchedulingIgnoredDuringExecution because a preferred rule needs some extra information to work with, so we will update our
apiVersion: v1
kind: Pod
metadata:
  name: black-and-white
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: color
              operator: Exists
          topologyKey: kubernetes.io/hostname
  containers:
  - name: black-and-white
    image: docker.io/ocpqe/hello-pod
As we can see in the events now all pods are scheduled on the same node. Even the
black-and-white pod because despite its preference there is simply no other node to run on.
$ oc get events
LAST SEEN   TYPE     REASON      OBJECT                MESSAGE
1m          Normal   Scheduled   pod/black-and-white   Successfully assigned all-together-now/black-and-white to crc-ktfxm-master-0
Using Affinity Rules can help us dynamically select where our pods are scheduled based on node labels and other pods. This makes it easy to spread out a workload across a cluster or keep pods together for maximum performance.
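As a closing sketch of the spreading use case, this is roughly how the frontend scenario from the introduction could look in a Deployment. The name, the app=frontend label and the replica count are assumptions for illustration, and the image is simply the one used throughout this post.

```yaml
# Hypothetical example: name, labels and replica count are illustrative only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      affinity:
        podAntiAffinity:
          # Preferred, so replicas can still share a node when the cluster is small
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: frontend
              topologyKey: kubernetes.io/hostname
      containers:
      - name: frontend
        image: docker.io/ocpqe/hello-pod
```

A preferred rule is used here on purpose: with a required rule, any replica beyond the number of available nodes would stay Pending, just like the black-and-white pod did on the single-node cluster.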
I hope this post has helped you. Check out my other EX280 related content on my EX280 page