[pull] master from patroni:master #11

Open
wants to merge 73 commits into master from patroni:master

Conversation

pull[bot]

@pull pull bot commented Jul 2, 2024

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

hughcapet and others added 2 commits July 1, 2024 10:47
Since `synchronous_mode` was introduced to Patroni, plain Postgres synchronous replication has no longer been working.

The issue occurs because `process_sync_replication` always resets the value of `synchronous_standby_names` in Postgres when `synchronous_mode` is disabled in Patroni.

This commit fixes that issue by setting `synchronous_standby_names` to the value configured by the user, if there is one, when `synchronous_mode` is disabled.

Closes #3093
References: PAT-254.
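For illustration, a minimal configuration sketch (node names hypothetical) of the case this fix preserves: the user manages `synchronous_standby_names` directly while Patroni's `synchronous_mode` stays disabled.

```yaml
# Sketch: plain Postgres synchronous replication managed by the user,
# with Patroni's own synchronous mode turned off.
bootstrap:
  dcs:
    synchronous_mode: false          # Patroni-managed sync replication disabled
    postgresql:
      parameters:
        # user-provided value that Patroni must no longer reset
        synchronous_standby_names: 'ANY 1 (node2, node3)'
```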
@pull pull bot added the ⤵️ pull label Jul 2, 2024
RMTT and others added 27 commits July 10, 2024 08:16
Since PG16, logical replication slots on a standby can be invalidated due to the horizon. In this case, pg_replication_slot_advance() will fail with ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE. We should force a slot copy (i.e., recreation) of such slots.
The `SlotsAdvanceThread` asynchronously calls pg_replication_slot_advance() and provides feedback about logical replication slots that must be reinitialized by copying from the primary. That is, the parent thread learns about slots to be copied only when scheduling the next pg_replication_slot_advance() call. As a result, it was possible for a logical slot to be copied (with a PostgreSQL restart) more than once.

To improve this we implement the following measures:
1. do not schedule a slot sync if the slot is already in the list to be copied
2. remove to-be-copied slots from the `self._scheduled` structure
3. clean the state of `SlotsAdvanceThread` when slot files are copied.
* Update release notes, bump version
* Fix rn
* Bump pyright
The `Status` class was introduced in #2853, but we kept old properties in the `Cluster` object in order to have fewer changes in the rest of the code.

This PR finishes the refactoring.
The following adjustments were made:
- Introduced the `Status.is_empty()` method, which is used in `Cluster.is_empty()` instead of checking actual values, to simplify the introduction of further fields to the Status object.
- Removed `Cluster.last_lsn` property
- Changed the `Cluster.slots` property to always return a dict and perform sanity checks on values.

Besides that, this PR addresses a couple of problems:
- the `AbstractDCS.get_cluster()` method was accessing some properties without holding a lock on `_cluster_thread_lock`.
- the `Cluster.__permanent_slots` property was setting 'lsn' from all cluster members, while it should be doing that only for members with the `replicatefrom` tag.
This constant was imported in `postgresql/__init__.py` and used in the `can_advance_slots` property.
But after the refactoring in #2958 we pass around a reference to `Postgresql` instead of `major_version`, therefore we can just rely on the `can_advance_slots` property and don't need to reimplement its logic in other places.
Let's consider the following replication setup:
```
primary->standby1->standby2(replicatefrom: standby1)
```

In this case the `primary` will not create a physical replication slot for standby2, because it is streaming from `standby1`.

Things will look differently if we have the following dynamic configuration:
```yaml
slots:
    primary:
        type: physical
    standby1:
        type: physical
    standby2:
        type: physical
```

In this case the `primary` will also have a `standby2` physical replication slot, which must be periodically advanced. So far this worked by taking the value of `xlog_location` from the `/members/standby2` key in DCS.

But, when DCS is down and failsafe mode is activated, the `standby2` physical slot on the `primary` will not be moved, because there was no way to get the latest value of `xlog_location`.

This PR addresses the problem by making replica nodes return their `xlog_location` as the `lsn` header in the response to the `POST /failsafe` REST API request. The current primary will use these values to advance replication slots for nodes with the `replicatefrom` tag.
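For completeness, the `replicatefrom` tag from the example above would appear in standby2's local Patroni configuration roughly like this (names taken from the example):

```yaml
# Sketch of standby2's local configuration in the cascading setup above
name: standby2
tags:
  replicatefrom: standby1   # stream from standby1 instead of the primary
```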
Pass the `Cluster` object instead of `Leader`.
It will help to implement a new feature, "Configurable retention of replication slots for cluster members".

Besides that, fix a couple of issues with docstrings.
It could happen that there is "something" streaming from the current primary node with an `application_name` that matches the name of the current primary, for instance due to a faulty configuration. When processing `pg_stat_replication` we only checked that the `application_name` matches the name of one of the member nodes, but we forgot to exclude our own name.
As a result there were the following side-effects:
1. The current primary could be declared as a synchronous node.
2. As a result of [1] it wasn't possible to do a switchover.
3. During shutdown the current primary was waiting for itself to release it from synchronous nodes.

Close #3111
If only the leader can't access DCS, its member key will expire and `POST /failsafe` requests might be rejected because of that.

Close #3096
`standby_slot_names` was renamed to `synchronized_standby_slots`
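As a sketch, the renamed GUC is configured through `postgresql.parameters` like any other parameter (the slot name here is hypothetical):

```yaml
postgresql:
  parameters:
    # formerly standby_slot_names; available in PostgreSQL 17
    synchronized_standby_slots: 'my_logical_slot'
```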
…r_status() (#3119)

The last one is only available since psycopg 2.8, while the first one since 2.0.8.
For backward compatibility, monkeypatch the connection object returned by psycopg3.

Close #3116
There was one oversight in #2781: to influence external tools that Patroni could execute, we set the global `umask` value based on permissions of the $PGDATA directory. As a result, it also influenced permissions of log files created by Patroni.

To address the problem we implement two measures:
1. Make `log.mode` configurable.
2. If the value is not set, calculate permissions from the original value of the umask setting.
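A minimal sketch of the resulting configuration (the directory and octal value are just examples):

```yaml
log:
  dir: /var/log/patroni   # hypothetical log directory
  mode: 0600              # explicit permissions for Patroni log files
```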
To enable quorum commit:
```diff
$ patronictl.py edit-config
--- 
+++ 
@@ -5,3 +5,4 @@
   use_pg_rewind: true
 retry_timeout: 10
 ttl: 30
+synchronous_mode: quorum

Apply these changes? [y/N]: y
Configuration changed
```

By default Patroni will use `ANY 1(list,of,standbys)` in `synchronous_standby_names`. That is, only one node out of the listed replicas will be used for quorum.
If you want to increase the number of quorum nodes, it is possible to do so with:
```diff
$ patronictl edit-config
--- 
+++ 
@@ -6,3 +6,4 @@
 retry_timeout: 10
 synchronous_mode: quorum
 ttl: 30
+synchronous_node_count: 2

Apply these changes? [y/N]: y
Configuration changed
```

Good old `synchronous_mode: on` is still supported.

Close #664
Close #672
1. All nodes with role == 'replica' and state == 'running' are registered. In case the state isn't running the node is removed.
2. In case of failover/switchover we always first update the primary
3. When switching to a registered secondary we call citus_update_node() three times: rename the primary to primary-demoted, put the primary name into the promoted secondary's row, and put the promoted secondary's name into the primary's row

State transitions are produced by the transition() method. First of all the method makes sure that the actual primary is registered in the metadata. If the primary for a given group didn't change, the method registers new secondaries and removes secondaries that are gone. It prefers to use the citus_update_node() UDF to replace gone secondaries with newly added ones.

Communication protocol between primary nodes remains the same and all old features work without any changes.
It was announced as deprecated in v3.2.0
Besides that:
1. Introduce `setup.py isort` for a quick check
2. Introduce GH actions to check imports
Option has been deprecated and should be removed in the new major release
Due to postgres --describe-config not showing GUCs defined as GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE, Patroni was always ignoring some GUCs that a user might want to have configured with non-default values.

- remove postgres --describe-config validation.
- define minor versions for availability bounds of some back-patched GUCs
A long-standing problem of Patroni that strikes many people is that it removes the replication slot for a member whose key has expired from DCS. As a result, when the replica comes back from scheduled maintenance, the WAL segments it needs could already be absent, and it can't continue streaming without pulling files from the archive.
With PostgreSQL 16 and newer we get another problem: a logical slot on a standby node could be invalidated if the physical replication slot on the primary was removed (and `pg_catalog` vacuumed).
The most problematic environment is Kubernetes, where the slot is removed nearly instantly when a member Pod is deleted.

So far, one of the recommended solutions was to configure permanent physical slots with names that match member names to avoid removal of replication slots. It works, but depending on the environment it might be non-trivial to implement (for example, when members may change their names).

This PR implements support for the `member_slots_ttl` global configuration parameter, which controls how long member replication slots should be kept when the member key is absent. The default value is `30min`.
The feature is supported only on PostgreSQL 11 and newer, because we want to retain slots not only on the leader node, but on all nodes that could potentially become the new leader, and they should be moved forward using the `pg_replication_slot_advance()` function.

One can disable the feature and get back to the old behavior by setting `member_slots_ttl` to `0`.
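A sketch of the corresponding dynamic (DCS) configuration change, using the values from the description:

```yaml
# Keep member replication slots for the default 30 minutes after the member
# key disappears; 0 restores the old immediate-removal behavior.
member_slots_ttl: 30min
```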
- don't register secondaries with `noloadbalance` tag.
- mention in the documentation that secondaries are also registered in `pg_dist_node`.
- update docker/kubernetes README files to include examples with secondaries being registered in `pg_dist_node`.
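A configuration sketch for the `noloadbalance` tag mentioned in the first point, set in the local configuration of the member that should stay out of read load balancing:

```yaml
# Sketch: tag a Citus worker replica so it is not registered as a secondary
# in pg_dist_node for read load balancing
tags:
  noloadbalance: true
```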
There are two cases when libpq may search for "localhost":
1. When host in the connection string is not specified and it is using default socket directory path.
2. When specified host matches default socket directory path.

Since we don't know the value of the default socket directory path and effectively can't detect case 2, the best strategy to mitigate the problem is to add "localhost" if we detect the "host" to be a unix socket directory (it starts with the '/' character).

Close #3134
This commit is a breaking change:
1. `role` in DCS is written as "primary" instead of "master".
2. `role` in REST API responses is also written as "primary".
3. REST API no longer accepts role=master in requests (for example switchover/failover/restart endpoints).
4. `/metrics` REST API endpoint will no longer report `patroni_master`.
5. `patronictl` no longer accepts `--master` argument.
6. `no_master` option in declarative configuration of custom replica creation methods is no longer treated as a special option, please use `no_leader` instead.
7. `patroni_wale_restore` doesn't accept `--no_master` anymore.
8. `patroni_barman` doesn't accept `--role=master` anymore.
9. callback scripts will be executed with role=primary instead of role=master
10. On Kubernetes, Patroni will by default set the role label to primary. If you want to keep the old behavior and avoid downtime or lengthy complex migrations, you can configure `kubernetes.leader_label_value` and `kubernetes.standby_leader_label_value` to `master`.

However, a few exceptions regarding master are still in place:
1. `GET /master` REST API endpoint will continue to work.
2. `master_start_timeout` and `master_stop_timeout` in global configuration are still accepted.
3. `master` tag is still preserved in Consul services in addition to `primary`.

Rationale for these exceptions: a DBA doesn't always fully control the infrastructure and can't always adjust the configuration.
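For item 10 above, a configuration sketch for clusters that need to keep the old Kubernetes label value:

```yaml
kubernetes:
  leader_label_value: master           # keep the old role label on the leader Pod
  standby_leader_label_value: master   # and on the standby leader
```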
CyberDem0n and others added 30 commits September 17, 2024 11:21
The original fix didn't address the same problem with permanent slots, but only the part with member slots being retained due to `member_slots_ttl`.
- Increase version
- Use newer pyright (not latest)
- Add RNs
It is available in PostgreSQL 17

Besides that, enable PG17 in behave tests and include PG17 to supported versions in docs.
There are cases when we may send the same PATCH request more than once to the K8s API server, and it could happen that the first request actually updated the target successfully but we cancelled it while waiting for a response. The second PATCH request in this case will fail due to a resource_version mismatch.

So far our strategy for the update_leader() method was to re-read the object and repeat the request with the new resource_version. However, we can avoid the update by comparing annotations on the read object with the annotations that we wanted to set.
pgaudit could be added to shared_preload_libraries, but we don't check for it, because setting a custom GUC works in all cases.
Still check against `postgres --describe-config` if a GUC does not have
a validator but is a valid postgres GUC
Declaring variables with `Union` and using the `isinstance()` hack doesn't work anymore. Therefore the code is updated to use `Any` for the variable and the `cast` function after figuring out the correct type, in order to avoid getting errors about `Unknown` types.
…3192)

From the actual code, in patroni/postgresql/config.py::ConfigHandler.CMDLINE_OPTIONS,
the previous defaults were wrong.
python-consul has been unmaintained for a long time and py-consul is its official replacement.
However, we still keep backward compatibility with python-consul.

Close: #3189
Additionally, run the on_role_change callback in post_recover() for a primary that failed to start after a crash, to increase the chances that the callback is executed even if the subsequent start as a replica fails

---------

Co-authored-by: Alexander Kukushkin <cyberdemn@gmail.com>
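For context, the on_role_change callback mentioned above is configured in the local Patroni configuration roughly like this (the script path is hypothetical):

```yaml
postgresql:
  callbacks:
    # invoked with the action, the new role and the cluster name;
    # this change increases the chance it also runs after a failed primary start
    on_role_change: /usr/local/bin/on_role_change.sh
```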
Found via `codespell -H` and `typos --hidden --format brief`
They started showing a deprecation warning when importing the ALL and FRAME constants.
1. Implemented compatibility.
2. Constrained the upper version in requirements.txt to avoid future failures.
3. Set up an additional pipeline to check with the latest ydiff.

Close #3209
Close #3212
Close #3218
* Release v4.0.4

- Increase version
- Use latest pyright
- Add RNs
When in automatic mode we probably don't need to warn the user about a failure to set up the watchdog. This is the common case, and the warning makes many users think that this feature is somehow necessary to run Patroni safely. For most users it is completely fine to run without it, and it makes sense to reduce their log spam.
…#3229)

This allows setting whether a particular permanent replication slot should always be created ('cluster_type=any', the default), or only on a primary ('cluster_type=primary') or standby ('cluster_type=standby') cluster, respectively.
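A sketch of a permanent slot definition using the new attribute (slot name, database and plugin are hypothetical):

```yaml
slots:
  reporting_slot:
    type: logical
    database: mydb
    plugin: pgoutput
    cluster_type: standby   # create this slot only when running as a standby cluster
```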
When doing `patronictl restart <clustername> --pending`, the confirmation lists all members, regardless of whether their restart is really pending:

```
> patronictl restart pgcluster --pending
+ Cluster: pgcluster (7436691039717365672) ----+----+-----------+-----------------+---------------------------------+
| Member | Host     | Role         | State     | TL | Lag in MB | Pending restart | Pending restart reason          |
+--------+----------+--------------+-----------+----+-----------+-----------------+---------------------------------+
| win1   | 10.0.0.2 | Sync Standby | streaming |  8 |         0 | *               | hba_file: [hidden - too long]   |
|        |          |              |           |    |           |                 | ident_file: [hidden - too long] |
|        |          |              |           |    |           |                 | max_connections: 201->202       |
+--------+----------+--------------+-----------+----+-----------+-----------------+---------------------------------+
| win2   | 10.0.0.3 | Leader       | running   |  8 |           | *               | hba_file: [hidden - too long]   |
|        |          |              |           |    |           |                 | ident_file: [hidden - too long] |
|        |          |              |           |    |           |                 | max_connections: 201->202       |
+--------+----------+--------------+-----------+----+-----------+-----------------+---------------------------------+
| win3   | 10.0.0.4 | Replica      | streaming |  8 |         0 |                 |                                 |
+--------+----------+--------------+-----------+----+-----------+-----------------+---------------------------------+
When should the restart take place (e.g. 2024-11-27T08:27)  [now]:
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Are you sure you want to restart members win1, win2, win3? [y/N]:
```

When we proceed with the restart despite the scary message mentioning all members, not just the ones needing a restart, there will be an error message stating the node not to be restarted was indeed not restarted:

```
Are you sure you want to restart members win1, win2, win3? [y/N]: y
Restart if the PostgreSQL version is less than provided (e.g. 9.5.2)  []:
Success: restart on member win1
Success: restart on member win2
Failed: restart for member win3, status code=503, (restart conditions are not satisfied)
```

The misleading confirmation message can also be seen when using the `--any` flag.

The current PR fixes this.

However, we do not apply filtering in case of scheduled pending restart, because the condition must be evaluated at the scheduled time.
Patroni could be doing a replica bootstrap and we don't want pg_basebackup/wal-g/pgBackRest/barman or similar to keep running.

Besides that, remove the data directory on replica bootstrap failure if the configuration allows.

Close #3224
* Compatibility with python-json-logger>=3.1

After the refactoring the old API still works, but produces warnings, and pyright also fails.

Besides that improve coverage of watchdog/base.py and ctl.py

* Stick to ubuntu 22.04

* Please pyright