The Postgres operator manages PostgreSQL clusters on Kubernetes (K8s):
The operator watches additions, updates, and deletions of PostgreSQL cluster manifests and changes the running clusters accordingly. For example, when a user submits a new manifest, the operator fetches that manifest and spawns a new Postgres cluster along with all necessary entities such as K8s StatefulSets and Postgres roles. See this Postgres cluster manifest for settings that a manifest may contain.
The operator also watches updates to its own configuration and alters running Postgres clusters if necessary. For instance, if the Docker image of a pod is changed, the operator carries out a rolling update: it re-spawns the pods of each managed StatefulSet one by one with the new Docker image.
Finally, the operator periodically synchronizes the actual state of each Postgres cluster with the desired state defined in the cluster’s manifest.
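The rolling update and the periodic sync described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the operator's actual code: the `Pod` and `StatefulSet` classes, the `rolling_update` and `sync` functions, and the image tags `spilo:1` and `spilo:2` are all hypothetical names chosen for illustration, and the real operator talks to the K8s API instead of mutating in-memory objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pod:
    name: str
    image: str


@dataclass
class StatefulSet:
    name: str
    pods: List[Pod] = field(default_factory=list)


def rolling_update(sts: StatefulSet, desired_image: str) -> List[str]:
    """Replace outdated pods one at a time and return the names of re-spawned pods."""
    respawned = []
    for i, pod in enumerate(sts.pods):
        if pod.image != desired_image:
            # Re-spawn a single pod before looking at the next one.
            sts.pods[i] = Pod(name=pod.name, image=desired_image)
            respawned.append(pod.name)
    return respawned


def sync(statefulsets: List[StatefulSet], desired_image: str) -> Dict[str, List[str]]:
    """One sync pass: bring every StatefulSet to the desired image."""
    return {sts.name: rolling_update(sts, desired_image) for sts in statefulsets}


# Hypothetical cluster with one outdated pod and one up-to-date pod.
cluster = StatefulSet(
    "acid-demo",
    [Pod("acid-demo-0", "spilo:1"), Pod("acid-demo-1", "spilo:2")],
)
first_pass = sync([cluster], "spilo:2")
second_pass = sync([cluster], "spilo:2")
print(first_pass)   # {'acid-demo': ['acid-demo-0']}
print(second_pass)  # {'acid-demo': []}
```

The second pass changes nothing because the actual state already matches the desired state. That is the property a periodic sync relies on: it can run repeatedly without side effects.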
The operator aims to be hands-free: configuration works only via manifests. This makes it easy to integrate into automated deployment pipelines that have no direct access to K8s.
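As an illustration of manifest-only configuration, a pipeline step could generate a cluster manifest programmatically. The sketch below is hedged: `build_manifest` is a hypothetical helper, the values (`acid`, `demo-cluster`, `1Gi`, version `15`) are examples only, and the field names follow the project's minimal example manifest as we recall it, so please verify them against the cluster manifest reference before relying on them. How the resulting manifest is submitted depends on the pipeline and is not shown.

```python
import json


def build_manifest(team: str, name: str, instances: int,
                   volume_size: str, pg_version: str) -> dict:
    """Build a minimal Postgres cluster manifest as a plain dictionary."""
    # Assumption: cluster names are prefixed with the team ID, as in the
    # project's example manifests.
    cluster_name = f"{team}-{name}"
    return {
        "apiVersion": "acid.zalan.do/v1",
        "kind": "postgresql",
        "metadata": {"name": cluster_name},
        "spec": {
            "teamId": team,
            "numberOfInstances": instances,
            "volume": {"size": volume_size},
            "postgresql": {"version": pg_version},
        },
    }


manifest = build_manifest("acid", "demo-cluster", 2, "1Gi", "15")
# JSON is used here only to keep the sketch free of third-party dependencies.
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Keeping the manifest as data in version control lets the pipeline review and deploy database changes like any other change.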
The Postgres Operator focuses on provisioning, reconfiguring, and cleaning up Postgres clusters that use Patroni. Its basic goal is to make it easy and convenient to run Patroni-based clusters on K8s. Provisioning and reconfiguration cover K8s resources on the one hand, and on the other hand tasks such as database and role provisioning once the cluster is up and running. We try to leave as much work as possible to K8s and to Patroni where it fits, especially the cluster bootstrap and high availability. The operator is, however, involved in some overarching orchestration, such as rolling updates, to improve the user experience.
Monitoring and tuning Postgres are currently out of scope for the operator. However, globally configurable sidecars provide enough flexibility to complement it with other tools such as ZMON, Prometheus, or more Postgres-specific options.
Here is a diagram that summarizes what the operator creates when a new Postgres cluster CRD is submitted:
This picture is not complete without an overview of what is inside a single cluster pod, so let’s zoom in:
These two diagrams should help you understand the basic functionality the operator provides.
This project is currently in active development. It is, however, already used internally by Zalando to run Postgres clusters on K8s in large numbers for staging environments and a growing number of production clusters. In this environment the operator is deployed to multiple K8s clusters, where users deploy manifests via our CI/CD infrastructure or rely on a slim user interface to create manifests.
Please report any issues you discover to https://github.com/zalando/postgres-operator/issues.
“Watching after your PostGIS herd” talk by Felix Kunde, FOSS4G 2021: video | slides
“PostgreSQL on K8S at Zalando: Two years in production” talk by Alexander Kukushkin, FOSDEM 2020: video | slides
“Postgres as a Service at Zalando” talk by Jan Mußler, DevOpsDays Poznań 2019: video
“Building your own PostgreSQL-as-a-Service on Kubernetes” talk by Alexander Kukushkin, KubeCon NA 2018: video | slides
“PostgreSQL and Kubernetes: DBaaS without a vendor-lock” talk by Oleksii Kliukin, PostgreSQL Sessions 2018: video | slides
“PostgreSQL High Availability on Kubernetes with Patroni” talk by Oleksii Kliukin, Atmosphere 2018: video | slides
“Blue elephant on-demand: Postgres + Kubernetes” talk by Oleksii Kliukin and Jan Mussler, FOSDEM 2018: video | slides (pdf)
Series of blog posts on how to use the Zalando Operator, configure backups and use etcd as DCS by thedatabaseme, Mar. 2022-23.
“Zalando Postgres Operator in Production: the way of Helm” by Zangir Kapishov on medium, Jan. 2023.
“Chaos testing of a Postgres cluster managed by the Zalando Postgres Operator” by Nikolay Sivko on coroot, Aug. 2022.
“Getting started with the Zalando Operator for PostgreSQL” by Daniel Westermann on dbi services blog, Mar. 2021.
“Our experience with Postgres Operator for Kubernetes by Zalando” by Nikolay Bogdanov on Palark blog, Feb. 2021.
“How to set up continuous backups and monitoring” by Pål Kristensen on GitHub, Mar. 2020.
“Postgres on Kubernetes with the Zalando operator” by Vito Botta on has_many :code, Feb. 2020.
“Running PostgreSQL in Google Kubernetes Engine” by Kenneth Rørvik on Redpill Linpro blog, Sep. 2019.
“Zalando Postgres Operator: One Year Later” by Sergey Dudoladov on Open Source Zalando, Nov. 2018.