## Kubernetes Backend Deployments
### Kubernetes Race Condition problem
As described in #652, there is a race condition in Kubernetes when
terminating Pods. Pod termination can be graceful, but by the nature of
distributed environments failures will show up, because not all
components of the distributed system have observed the state change
yet. When a Pod terminates, the controller-manager has to update the
endpoints of the Kubernetes service. Additionally, Skipper has to fetch
this endpoints list. Skipper polls the kube-apiserver every
`-source-poll-timeout=<ms>`, which defaults to 3000.
Reducing this interval or implementing a watch will only shrink the
timeframe, but will not fix the underlying race condition.
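For illustration, the poll interval is set at Skipper startup. A minimal sketch of the `args` of a Skipper container, assuming a standard Deployment; the value of 1000 ms is only an example:

```yaml
# Hypothetical excerpt of the skipper container spec in its Deployment.
# -source-poll-timeout is in milliseconds (the default is 3000).
args:
  - skipper
  - -kubernetes
  - -source-poll-timeout=1000
```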
Mitigation strategies can differ, and the next section documents strategies for application developers to mitigate the problem.
### Teardown strategies
An application that is the target of an Ingress can avoid HTTP 504 Gateway Timeout responses during termination with these strategies:

- use Pod lifecycle hooks
- use a SIGTERM handler to switch the `readinessProbe` to unhealthy and exit later, or just wait for SIGKILL to terminate the process
#### Pod Lifecycle Hooks
Kubernetes Pod Lifecycle Hooks in the Pod spec can define a `preStop`
command that executes, for example, a binary. The following will
execute the binary `sleep` with argument `20` to wait 20 seconds before
terminating the containers within the Pod:
```yaml
lifecycle:
  preStop:
    exec:
      command: ["sleep", "20"]
```
Alternatively, use the `sleep` action handled by the kubelet, without
the need for the `sleep` binary in your container:
```yaml
lifecycle:
  preStop:
    sleep:
      seconds: 20
```
20 seconds should be enough to fade your Pod out of the endpoints list and Skipper’s routing table.
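To put the hook into context, here is a sketch of a Pod spec; the container name and image are hypothetical. Note that `terminationGracePeriodSeconds` (30 seconds by default) must exceed the `preStop` sleep, otherwise the container is killed before the hook finishes:

```yaml
spec:
  terminationGracePeriodSeconds: 30  # must exceed the preStop sleep
  containers:
    - name: app                      # hypothetical container
      image: example/app:v1          # hypothetical image
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "20"]
```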
#### SIGTERM handling in Containers
An application can implement a SIGTERM handler that switches the
`readinessProbe` target to unhealthy for the application instance. This
makes sure the instance is deleted from the endpoints list and from
Skipper's routing table. Similar to Pod Lifecycle Hooks, you could
sleep 20 seconds and terminate your application afterwards, or just
wait until SIGKILL cleans up the instance after 60s.
```go
go func() {
	// requires the "os", "os/signal", "syscall" and "time" imports
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM)
	<-sigs // block until SIGTERM arrives
	// Switch the readinessProbe target to unhealthy, so the instance
	// is removed from the endpoints list and Skipper's routing table.
	// healthCheck and unhealthy are application-specific.
	healthCheck = unhealthy
	// Give Skipper time to poll the updated endpoints list,
	// then exit cleanly before SIGKILL arrives.
	time.Sleep(20 * time.Second)
	os.Exit(0)
}()
```
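The matching `readinessProbe` would then point at the endpoint backed by the health state flipped above; the `/healthz` path, the port and the probe timings are assumptions, not part of the original example:

```yaml
readinessProbe:
  httpGet:
    path: /healthz   # hypothetical endpoint backed by healthCheck above
    port: 8080       # hypothetical application port
  periodSeconds: 3   # probe often, so the kubelet notices quickly
  failureThreshold: 1
```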