repmgr-3.0.3/.gitignore

*~
*.o
*.so
repmgr
repmgrd
README.htm*
README.pdf
sql/repmgr_funcs.so
sql/repmgr_funcs.sql

repmgr-3.0.3/CONTRIBUTING.md

License and Contributions
=========================

`repmgr` is licensed under the GPL v3. All of its code and documentation is
Copyright 2010-2015, 2ndQuadrant Limited. See the files COPYRIGHT and LICENSE for
details.

The development of repmgr has primarily been sponsored by 2ndQuadrant customers.

Additional work has been sponsored by the 4CaaST project for cloud computing,
which has received funding from the European Union's Seventh Framework Programme
(FP7/2007-2013) under grant agreement 258862.

Contributions to `repmgr` are welcome, and will be listed in the file `CREDITS`.
2ndQuadrant Limited requires that any contributions provide a copyright
assignment and a disclaimer of any work-for-hire ownership claims from the
employer of the developer. This lets us make sure that all of the repmgr
distribution remains free code. Please contact info@2ndQuadrant.com for a
copy of the relevant Copyright Assignment Form.

Code style
----------

Code in repmgr is formatted to a consistent style using the following command:

    astyle --style=ansi --indent=tab --suffix=none *.c *.h

Contributors should reformat their code similarly before submitting code to
the project, in order to minimize merge conflicts with other work.

repmgr-3.0.3/COPYRIGHT

Copyright (c) 2010-2015, 2ndQuadrant Limited
All rights reserved.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see http://www.gnu.org/licenses/
to obtain one.

repmgr-3.0.3/CREDITS

Code and documentation contributors to repmgr include:

Jaime Casanova
Simon Riggs
Greg Smith
Robert J. Noles
Gabriele Bartolini
Bas van Oostveen
Hannu Krosing
Cédric Villemain
Charles Duffy
Daniel Farina
Shawn Ellis
Jay Taylor
Christian Kruse
Krzysztof Gajdemski

repmgr-3.0.3/FAILOVER.rst

====================================================
 PostgreSQL Automatic Failover - User Documentation
====================================================

Automatic Failover
==================

repmgr allows for automatic failover when it detects the failure of the master node.
Following is a quick setup for this.

Installation
============

For convenience, we define:

**node1**
    is the fully qualified domain name of the Master server, IP 192.168.1.10
**node2**
    is the fully qualified domain name of the Standby server, IP 192.168.1.11
**witness**
    is the fully qualified domain name of the server used as a witness, IP 192.168.1.12

**Note:** We don't recommend using names with the status of a server like «masterserver»,
because it would be confusing once a failover takes place and the Master is
now on the «standbyserver».

Summary
-------

2 PostgreSQL servers are involved in the replication.
Automatic failover needs a vote to decide which server it should promote, so an
odd number of voters is required. A witness-repmgrd is installed on a third server,
where it uses a PostgreSQL cluster to communicate with the other repmgrd daemons.

1. Install PostgreSQL on all the servers involved (including the witness server)
2. Install repmgr on all the servers involved (including the witness server)
3. Configure the Master PostgreSQL
4. Clone the Master to the Standby using the "repmgr standby clone" command
5. Configure repmgr on all the servers involved (including the witness server)
6. Register the Master and Standby nodes
7. Initialize the witness server
8. Start the repmgrd daemons on all nodes

**Note:** A complete High-Availability design needs at least 3 servers to still have
a backup node after a first failure.

Install PostgreSQL
------------------

You can install PostgreSQL using any of the recommended methods. You should ensure
it's 9.3 or later, as required by repmgr 3.x.

Install repmgr
--------------

Install repmgr following the steps in the README file.

Configure PostgreSQL
--------------------

Log in to node1.

Edit the file postgresql.conf and modify the parameters::

  listen_addresses='*'
  wal_level = 'hot_standby'
  archive_mode = on
  archive_command = 'cd .'   # we can also use exit 0, anything that
                             # just does nothing
  max_wal_senders = 10
  wal_keep_segments = 5000   # 80 GB required on pg_xlog
  hot_standby = on
  shared_preload_libraries = 'repmgr_funcs'

Edit the file pg_hba.conf and add lines for the replication::

  host     repmgr           repmgr   127.0.0.1/32       trust
  host     repmgr           repmgr   192.168.1.10/30    trust
  host     replication      all      192.168.1.10/30    trust

**Note:** It is also possible to use password authentication (md5); in that case
the .pgpass file should be edited to allow connections between the nodes.

Create the user and database to manage replication::

  su - postgres
  createuser -s repmgr
  createdb -O repmgr repmgr

Restart the PostgreSQL server::

  pg_ctl -D $PGDATA restart

And check everything is fine in the server log.
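The comment on ``wal_keep_segments = 5000`` in the postgresql.conf example quotes about 80 GB of space in pg_xlog. A minimal sketch of that arithmetic, assuming the default WAL segment size of 16 MB (the function name ``wal_keep_mb`` is hypothetical and only used for illustration):

```shell
# Estimate the pg_xlog space (in MB) kept back by wal_keep_segments.
# Assumption: each WAL segment is 16 MB, the PostgreSQL default; pass a
# second argument if the server was built with a different segment size.
wal_keep_mb() {
    segments=$1
    segment_mb=${2:-16}
    echo $(( segments * segment_mb ))
}

wal_keep_mb 5000    # prints 80000 (MB), i.e. roughly 80 GB
```

A smaller ``wal_keep_segments`` value lowers the disk requirement in proportion, at the cost of a shorter window in which a disconnected standby can still catch up.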
Create the ssh-key for the postgres user and copy it to the other servers::

  su - postgres
  ssh-keygen          # /!\ do not use a passphrase /!\
  cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
  chmod 600 ~/.ssh/authorized_keys
  exit
  rsync -avz ~postgres/.ssh/authorized_keys node2:~postgres/.ssh/
  rsync -avz ~postgres/.ssh/authorized_keys witness:~postgres/.ssh/
  rsync -avz ~postgres/.ssh/id_rsa* node2:~postgres/.ssh/
  rsync -avz ~postgres/.ssh/id_rsa* witness:~postgres/.ssh/

Clone Master
------------

Log in to node2.

Clone node1 (the current Master)::

  su - postgres
  repmgr -d repmgr -U repmgr -h node1 standby clone

Start the PostgreSQL server::

  pg_ctl -D $PGDATA start

And check everything is fine in the server log.

Configure repmgr
----------------

Log in to each server and configure repmgr by editing the file
/etc/repmgr/repmgr.conf::

  cluster=my_cluster
  node=1
  node_name=earth
  conninfo='host=192.168.1.10 dbname=repmgr user=repmgr'
  master_response_timeout=60
  reconnect_attempts=6
  reconnect_interval=10
  failover=automatic
  promote_command='promote_command.sh'
  follow_command='repmgr standby follow -f /etc/repmgr/repmgr.conf'

**cluster**
    is the name of the current replication.
**node**
    is the number of the current node (1, 2 or 3 in the current example).
**node_name**
    is an identifier for every node.
**conninfo**
    is used to connect to the local PostgreSQL server (where the configuration file is) from any node.
    In the witness server configuration you need to add a 'port=5499' to the conninfo.
**master_response_timeout**
    is the maximum amount of time we are going to wait before deciding the master has died
    and starting the failover procedure.
**reconnect_attempts**
    is the number of times we will try to reconnect to the master after a failure has been
    detected and before starting the failover procedure.
**reconnect_interval**
    is the amount of time between retries to reconnect to the master after a failure has been
    detected and before starting the failover procedure.
**failover**
    configures the failover behavior: *manual* or *automatic*.
**promote_command**
    is the command executed to do the failover (including the PostgreSQL failover itself).
    The command must return 0 on success.
**follow_command**
    is the command executed to point the current standby at another Master.
    The command must return 0 on success.

Register Master and Standby
---------------------------

Log in to node1.

Register the node as master::

  su - postgres
  repmgr -f /etc/repmgr/repmgr.conf master register

This will also create the repmgr schema and functions.

Log in to node2. Register it as a standby::

  su - postgres
  repmgr -f /etc/repmgr/repmgr.conf standby register

Initialize witness server
-------------------------

Log in to witness.

Initialize the witness server::

  su - postgres
  repmgr -d repmgr -U repmgr -h 192.168.1.10 -D $WITNESS_PGDATA -f /etc/repmgr/repmgr.conf witness create

The witness server needs the following information from the command
line:

* Connection details for the current master, to copy the cluster
  configuration.
* A location for initializing its own $PGDATA.

repmgr will also ask for the superuser password on the witness database so
it can reconnect when needed (the command line option --initdb-no-pwprompt
will set up a password-less superuser).

By default the witness server will listen on port 5499; this value can be
overridden by explicitly providing the port number in the conninfo string
in repmgr.conf. (Note that it is also possible to specify the port number
with the -l/--local-port option; however, this option is now deprecated and will
be overridden by a port setting in the conninfo string.)

Start the repmgrd daemons
-------------------------

Log in to node2 and witness, and start the daemon on each::

  su - postgres
  repmgrd -f /etc/repmgr/repmgr.conf --daemonize > /var/log/postgresql/repmgr.log 2>&1

**Note:** The Master does not need a repmgrd daemon.
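The repmgr.conf example sets ``promote_command='promote_command.sh'`` but the script itself is not shown in this document. The following is only a sketch of what such a script might look like, assuming that a plain ``repmgr standby promote`` is all the failover needs; the function name ``promote_standby`` and the ``REPMGR_BIN`` and ``REPMGR_CONF`` variables are hypothetical names introduced for this example:

```shell
#!/bin/sh
# Hypothetical promote_command.sh: an illustrative sketch, not shipped with repmgr.
# REPMGR_BIN and REPMGR_CONF are assumed names, introduced only for this example.

promote_standby() {
    # Promote the local standby; repmgr's exit status is passed through,
    # so the script returns 0 only when the promotion succeeded.
    "${REPMGR_BIN:-repmgr}" -f "${REPMGR_CONF:-/etc/repmgr/repmgr.conf}" standby promote
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "promote_command.sh: promotion failed (exit status $status)" >&2
    fi
    return "$status"
}

# Run the promotion only when invoked under the script's own name, so the
# function can also be sourced and exercised without touching a server.
case "${0##*/}" in
    promote_command.sh)
        promote_standby
        exit $?
        ;;
esac
```

Passing the exit status through unchanged is what satisfies the requirement that the command return 0 on success; any site-specific steps (moving a virtual IP, notifying operators) would go before the final status is returned.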
Suspend Automatic behavior
==========================

Edit the repmgr.conf of the node you want to remove from automatic processing and change::

  failover=manual

Then signal the repmgrd daemon::

  su - postgres
  kill -HUP $(pidof repmgrd)

Usage
=====

The repmgr documentation is in the README file (how to build, options, etc.)

repmgr-3.0.3/FAQ.md

FAQ - Frequently Asked Questions about repmgr
=============================================

This FAQ applies to `repmgr` 3.0 and later.

General
-------

- What's the difference between the repmgr versions?

  repmgr 3.x builds on the improved replication facilities added
  in PostgreSQL 9.3, as well as improved automated failover support
  via `repmgrd`, and is not compatible with PostgreSQL 9.2 and earlier.

  repmgr 2.x supports PostgreSQL 9.0 onwards. While it is compatible
  with PostgreSQL 9.3 and later, we recommend repmgr v3.

- What's the advantage of using replication slots?

  Replication slots, introduced in PostgreSQL 9.4, ensure that the
  master server will retain WAL files until they have been consumed
  by all standby servers. This makes WAL file management much easier,
  and if they are used, `repmgr` will no longer insist on a fixed number
  (default: 5000) of WAL files being preserved.

  (However, this does mean that if a standby is no longer connected to the
  master, the master will retain WAL files indefinitely.)

- How many replication slots should I define in `max_replication_slots`?

  Normally at least the same number as the number of standbys which will
  connect to the node. Note that changes to `max_replication_slots` require
  a server restart to take effect, and as there is no particular penalty
  for unused replication slots, setting a higher figure will make adding
  new nodes easier.

- Does `repmgr` support hash indexes?

  No.
  Hash indexes and replication do not mix well and their use is
  explicitly discouraged; see:
  http://www.postgresql.org/docs/current/interactive/sql-createindex.html#AEN74175

`repmgr`
--------

- When should I use the --rsync-only option?

  By default, `repmgr` uses `pg_basebackup` to clone a standby from
  a master. However, `pg_basebackup` copies the entire data directory, which
  can take some time depending on installation size. If you have an
  existing but "stale" standby, `repmgr` can use `rsync` instead,
  which means only changed or added files need to be copied.

- Can I register an existing master/standby?

  Yes, this is no problem.

- How can a failed master be re-added as a standby?

  This is a two-stage process. First, the failed master's data directory
  must be re-synced with the current master; secondly the failed master
  needs to be re-registered as a standby. The section "Converting a failed
  master to a standby" in the `README.md` file contains more detailed
  information on this process.

- Is there an easy way to check my master server is correctly configured
  for use with `repmgr`?

  Yes - execute `repmgr` with the `--check-upstream-config` option, and it
  will let you know which items in `postgresql.conf` need to be modified.

- Even though I specified custom `rsync` options, `repmgr` appends
  the `--checksum` - why?

  When syncing a stale data directory from an active server, it's
  essential that `rsync` compares the content of files rather than
  just timestamp and size, to ensure that all changed files are
  copied and prevent corruption.

- When cloning a standby, how can I prevent `repmgr` from copying
  `postgresql.conf` and `pg_hba.conf` from the PostgreSQL configuration
  directory in `/etc`?

  Use the command line option `--ignore-external-config-files`

- How can I prevent `repmgr` from copying local configuration files
  in the data directory?

  If you're updating an existing but stale data directory which contains e.g.
  configuration files you don't want to be overwritten with the same file
  from the master, specify the files in the `rsync_options` configuration
  option, e.g.

      rsync_options=--exclude=postgresql.local.conf

  This option is only available when using the `--rsync-only` option.

- How can I make the witness server use a particular port?

  By default the witness server is configured to use port 5499; this
  is intended to support running the witness server as a separate
  instance on a normal node server, rather than on its own dedicated server.

  To specify a different port for the witness server, supply the port number
  in the `conninfo` string in `repmgr.conf`
  (repmgr 3.0.1 and earlier: use the `-l/--local-port` option).

- Do I need to include `shared_preload_libraries = 'repmgr_funcs'`
  in `postgresql.conf` if I'm not using `repmgrd`?

  No, the `repmgr_funcs` library is only needed when running `repmgrd`.
  If you later decide to run `repmgrd`, you just need to add
  `shared_preload_libraries = 'repmgr_funcs'` and restart PostgreSQL.

- I've provided replication permission for the `repmgr` user in `pg_hba.conf`
  but `repmgr`/`repmgrd` complains it can't connect to the server... Why?

  `repmgr`/`repmgrd` need to be able to connect to the repmgr database
  with a normal connection to query metadata. The `replication` connection
  permission is for PostgreSQL's streaming replication and doesn't
  necessarily need to be the `repmgr` user.

`repmgrd`
---------

- Do I need a witness server?

  Not necessarily. However if you have an uneven number of nodes spread
  over more than one network segment, a witness server will enable
  better handling of a 'split brain' situation by providing a "casting
  vote" on the preferred network segment.

- How can I prevent a node from ever being promoted to master?

  In `repmgr.conf`, set its priority to a value of 0 or less.

- Does `repmgrd` support delayed standbys?
  `repmgrd` can monitor delayed standbys - those set up with
  `recovery_min_apply_delay` set to a non-zero value in `recovery.conf` -
  but as it's not currently possible to directly examine the value
  applied to the standby, `repmgrd` may not be able to properly evaluate
  the node as a promotion candidate.

  We recommend that delayed standbys are explicitly excluded from promotion
  by setting `priority` to 0 in `repmgr.conf`.

  Note that after registering a delayed standby, `repmgrd` will only start
  once the metadata added in the master node has been replicated.

- How can I get `repmgrd` to rotate its logfile?

  Configure your system's `logrotate` service to do this; see the example
  in README.md

repmgr-3.0.3/HISTORY

3.0.3 2016-01-04
    Create replication slot if required before base backup is run (Abhijit)
    standby clone: when using rsync, clean up "pg_replslot" directory (Ian)
    Improve --help output (Ian)
    Improve config file parsing (Ian)
    Various logging output improvements, including explicit HINTS (Ian)
    Add --log-level to explicitly set log level on command line (Ian)
    Repurpose --verbose to display extra log output (Ian)
    Add --terse to hide hints and other non-critical output (Ian)
    Reference internal functions with explicit catalog path (Ian)
    When following a new primary, have repmgr (not repmgrd) create the new slot (Ian)
    Add /etc/repmgr.conf as a default configuration file location (Ian)
    Prevent repmgrd's -v/--verbose option expecting a parameter (Ian)
    Prevent invalid replication_lag values being written to the monitoring table (Ian)
    Improve repmgrd behaviour when monitored standby node is temporarily unavailable (Martín)

3.0.2 2015-10-02
    Improve handling of --help/--version options; and improve help output (Ian)
    Improve handling of situation where logfile can't be opened (Ian)
    Always pass -D/--pgdata option to pg_basebackup (Ian)
    Bugfix: standby clone --force does not empty pg_xlog (Gianni)
    Bugfix:
      autofailover with reconnect_attempts > 1 (Gianni)
    Bugfix: ignore comments after values (soxwellfb)
    Bugfix: handle string values in 'node' parameter correctly (Gregory Duchatelet)
    Allow repmgr to be compiled with a newer libpq (Marco)
    Bugfix: call update_node_record_set_upstream() for STANDBY FOLLOW (Tomas)
    Update `repmgr --help` output (per Github report from renard)
    Update tablespace remapping in --rsync-only mode for 9.5 and later (Ian)
    Deprecate `-l/--local-port` option - the port can be extracted
      from the conninfo string in repmgr.conf (Ian)
    Add STANDBY UNREGISTER (Vik Fearing)
    Don't fail with error when registering master if schema already defined (Ian)
    Fixes to whitespace handling when parsing config file (Ian)

3.0.1 2015-04-16
    Prevent repmgrd from looping infinitely if node was not registered (Ian)
    When promoting a standby, have repmgr (not repmgrd) handle metadata updates (Ian)
    Re-use replication slot if it already exists (Ian)
    Prevent a test SSH connection being made when not needed (Ian)
    Correct monitoring table column names (Ian)

3.0 2015-03-27
    Require PostgreSQL 9.3 or later (Ian)
    Use `pg_basebackup` by default (instead of `rsync`) to clone standby servers (Ian)
    Use `pg_ctl promote` to promote a standby to primary
    Enable tablespace remapping using `pg_basebackup` (in PostgreSQL 9.3 with `rsync`) (Ian)
    Support cascaded standbys (Ian)
    "pg_bindir" no longer required as a configuration parameter (Ian)
    Enable replication slots to be used (PostgreSQL 9.4 and later) (Ian)
    Command line option "--check-upstream-config" (Ian)
    Add event logging table and option to execute an external program when an event occurs (Ian)
    General usability and logging message improvements (Ian)
    Code consolidation and cleanup (Ian)

2.0.3 2015-04-16
    Add -S/--superuser option for witness database creation (Ian)
    Add -c/--fast-checkpoint option for cloning (Christoph)
    Add option "--initdb-no-pwprompt" (Ian)

2.0.2 2015-02-17
    Add "--checksum" in rsync when using "--force" (Jaime)
    Use
      createdb/createuser instead of psql (Jaime)
    Fixes to witness creation and monitoring (wamonite)
    Use default master port if none supplied (Martín)
    Documentation fixes and improvements (Ian)

2.0.1 2014-07-16
    Documentation fixes and new QUICKSTART file (Ian)
    Explicitly specify directories to ignore when cloning (Ian)
    Fix log level for some log messages (Ian)
    RHEL/CentOS specfile, init script and Makefile fixes (Nathan Van Overloop)
    Debian init script and config file documentation fixes (József Kószó)
    Typo fixes (Riegie Godwin Jeyaranchen, PriceChild)

2.0stable 2014-01-30
    Documentation fixes (Christian)
    General refactoring, code quality improvements and stabilization work (Christian)
    Added proper daemonizing (-d/--daemonize) (Christian)
    Added PID file handling (-p/--pid-file) (Christian)
    New config option: monitor_interval_secs (Christian)
    New config option: retry_promote_interval (Christian)
    New config option: logfile (Christian)
    New config option: pg_bindir (Christian)
    New config option: pgctl_options (Christian)

2.0beta2 2013-12-19
    Improve autofailover logic and algorithms (Jaime, Andres)
    Ignore pg_log when cloning (Jaime)
    Add timestamps to log line in stderr (Christian)
    Correctly check wal_keep_segments (Jay Taylor)
    Add a ssh_options parameter (Jay Taylor)

2.0beta1 2012-07-27
    Make CLONE command try to make an exact copy including $PGDATA location (Cedric)
    Add detection of master failure (Jaime)
    Add the notion of a witness server (Jaime)
    Add autofailover capabilities (Jaime)
    Add a configuration parameter to indicate the script to execute on failover or follow (Jaime)
    Make the monitoring optional and turned off by default, it can be turned on with --monitoring-history switch (Jaime)
    Add tunables to specify number of retries to reconnect to master and the time between them (Jaime)

1.2.0 2012-07-27
    Test ssh connection before trying to rsync (Cédric)
    Add CLUSTER SHOW command (Carlo)
    Add CLUSTER CLEANUP command (Jaime)
    Add function write_primary_conninfo (Marco)
    Teach
      repmgr how to get tablespace's location in different pg version (Jaime)
    Improve version message (Carlo)

1.1.1 2012-04-18
    Add --ignore-rsync-warning (Cédric)
    Add strnlen for compatibility with OS X (Greg)
    Improve performance of the repl_status view (Jaime)
    Remove last argument from log_err (Jaime, Reported by Jeroen Dekkers)
    Complete documentation about possible error conditions (Jaime)
    Document how to clean history (Jaime)

1.1.0 2011-03-09
    Make options -U, -R and -p not mandatory (Jaime)

1.1.0b1 2011-02-24
    Fix missing "--force" option in help (Greg Smith)
    Correct warning message for wal_keep_segments (Bas van Oostveen)
    Add Debian build/usage docs (Bas, Hannu Krosing, Cedric Villemain)
    Add Debian .deb packaging (Hannu)
    Move configuration data into a structure (Bas, Gabriele Bartolini)
    Make rsync options configurable (Bas)
    Add syslog as alternate logging destination (Gabriele)
    Change from using malloc to static memory allocations (Gabriele)
    Add debugging messages after every query (Gabriele)
    Parameterize schema name used for repmgr (Gabriele)
    Avoid buffer overruns by using snprintf etc.
      (Gabriele)
    Fix use of database query after close (Gabriele)
    Add information about progress during "standby clone" (Gabriele)
    Fix double free errors in repmgrd (Charles Duffy, Greg)
    Make repmgr exit with an error code when encountering an error (Charles)
    Standardize on error return codes, use in repmgrd too (Greg)
    Add [un]install actions/SQL like most contrib modules (Daniel Farina)
    Wrap all string construction and produce error on overflow (Daniel)
    Correct freeing of memory from first_wal_segment (Daniel)
    Allow creating recovery.conf file with a password (Daniel)
    Inform when STANDBY CLONE sees an unused config file (Daniel)
    Use 64-bit computation for WAL apply_lag (Greg)
    Add info messages for database and general work done (Greg)
    Map old verbose flag into a useful setting for the new logger (Greg)
    Document repmgrd startup restrictions and log info about them (Greg)

1.0.0 2010-12-05
    First public release

repmgr-3.0.3/LICENSE

GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc.
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The GNU General Public License is a free, copyleft license for
software and other kinds of works.

The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.

When we speak of free software, we are referring to freedom, not
price.
Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. 
If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. 
Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language. The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. 
However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. 
Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies. You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. 
b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. 
b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b. d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. 
A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. 
But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network. Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. 
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying. 
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program. 
Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. 
The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. 
"Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom. 
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation. 
If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16. 
If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . Also add information on how to contact you by electronic and paper mail. If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode: Copyright (C) This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. 
The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box".

You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see .

The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read .
repmgr-3.0.3/Makefile000066400000000000000000000045601264264412200144610ustar00rootroot00000000000000
#
# Makefile
# Copyright (c) 2ndQuadrant, 2010-2015

repmgrd_OBJS = dbutils.o config.o repmgrd.o log.o strutil.o
repmgr_OBJS = dbutils.o check_dir.o config.o repmgr.o log.o strutil.o

DATA = repmgr.sql uninstall_repmgr.sql

PG_CPPFLAGS = -I$(libpq_srcdir)
PG_LIBS = $(libpq_pgport)

all: repmgrd repmgr
	$(MAKE) -C sql

repmgrd: $(repmgrd_OBJS)
	$(CC) $(CFLAGS) $(repmgrd_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o repmgrd
	$(MAKE) -C sql

repmgr: $(repmgr_OBJS)
	$(CC) $(CFLAGS) $(repmgr_OBJS) $(PG_LIBS) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o repmgr

ifdef USE_PGXS
PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
else
subdir = contrib/repmgr
top_builddir = ../..
include $(top_builddir)/src/Makefile.global
include $(top_srcdir)/contrib/contrib-global.mk
endif

# XXX: Try to use PROGRAM construct (see pgxs.mk) someday. Right now
# is overriding pgxs install.
install: install_prog install_ext

install_prog:
	mkdir -p '$(DESTDIR)$(bindir)'
	$(INSTALL_PROGRAM) repmgrd$(X) '$(DESTDIR)$(bindir)/'
	$(INSTALL_PROGRAM) repmgr$(X) '$(DESTDIR)$(bindir)/'

install_ext:
	$(MAKE) -C sql install

install_rhel:
	mkdir -p '$(DESTDIR)/etc/init.d/'
	$(INSTALL_PROGRAM) RHEL/repmgrd.init '$(DESTDIR)/etc/init.d/repmgrd'
	mkdir -p '$(DESTDIR)/etc/sysconfig/'
	$(INSTALL_PROGRAM) RHEL/repmgrd.sysconfig '$(DESTDIR)/etc/sysconfig/repmgrd'
	mkdir -p '$(DESTDIR)/etc/repmgr/'
	$(INSTALL_PROGRAM) repmgr.conf.sample '$(DESTDIR)/etc/repmgr/'
	mkdir -p '$(DESTDIR)/usr/bin/'
	$(INSTALL_PROGRAM) repmgrd$(X) '$(DESTDIR)/usr/bin/'
	$(INSTALL_PROGRAM) repmgr$(X) '$(DESTDIR)/usr/bin/'

ifneq (,$(DATA)$(DATA_built))
	@for file in $(addprefix $(srcdir)/, $(DATA)) $(DATA_built); do \
	  echo "$(INSTALL_DATA) $$file '$(DESTDIR)$(datadir)/$(datamoduledir)'"; \
	  $(INSTALL_DATA) $$file '$(DESTDIR)$(datadir)/$(datamoduledir)'; \
	done
endif

clean:
	rm -f *.o
	rm -f repmgrd
	rm -f repmgr
	$(MAKE) -C sql clean

deb: repmgrd repmgr
	mkdir -p ./debian/usr/bin
	cp repmgrd repmgr ./debian/usr/bin/
	mkdir -p ./debian/usr/share/postgresql/9.0/contrib/
	cp sql/repmgr_funcs.sql ./debian/usr/share/postgresql/9.0/contrib/
	cp sql/uninstall_repmgr_funcs.sql ./debian/usr/share/postgresql/9.0/contrib/
	mkdir -p ./debian/usr/lib/postgresql/9.0/lib/
	cp sql/repmgr_funcs.so ./debian/usr/lib/postgresql/9.0/lib/
	dpkg-deb --build debian
	mv debian.deb ../postgresql-repmgr-9.0_1.0.0.deb
	rm -rf ./debian/usr
repmgr-3.0.3/PACKAGES.md000066400000000000000000000120271264264412200145160ustar00rootroot00000000000000
Packaging
=========

Notes on RedHat Linux, Fedora, and CentOS Builds
------------------------------------------------

The RPM packages of PostgreSQL put `pg_config` into the `postgresql-devel` package, not the main server one. And if you have an RPM install of PostgreSQL 9.0, the entire PostgreSQL binary directory will not be in your PATH by default either.
Individual utilities are made available via the `alternatives` mechanism, but not all commands will be wrapped that way. The files installed by repmgr will certainly not be in the default PATH for the postgres user on such a system. They will instead be in /usr/pgsql-9.0/bin/ on this type of system.

When building repmgr against an RPM packaged build, you may discover that some development packages are needed as well. The following build errors can occur:

    /usr/bin/ld: cannot find -lxslt
    /usr/bin/ld: cannot find -lpam

Install the following packages to correct those:

    yum install libxslt-devel
    yum install pam-devel

If you build repmgr as a regular user and then install it into the system directories using sudo, the command needs extra care, because `pg_config` won't be in root's path either. The following recipe should work:

    sudo PATH="/usr/pgsql-9.0/bin:$PATH" make USE_PGXS=1 install

Issues with 32 and 64 bit RPMs
------------------------------

If, when building, you receive a series of errors of this form:

    /usr/bin/ld: skipping incompatible /usr/pgsql-9.0/lib/libpq.so when searching for -lpq

this is likely because you have both the 32 and 64 bit versions of the `postgresql90-devel` package installed. You can check that like this:

    rpm -qa --queryformat '%{NAME}\t%{ARCH}\n' | grep postgresql90-devel

If two packages appear, one for i386 and one for x86_64, you have hit this problem; that combination is not supposed to be allowed.

This can happen when using the PGDG repo to install that package; here is an example session demonstrating the problem case appearing:

    # yum install postgresql-devel
    ..
    Setting up Install Process
    Resolving Dependencies
    --> Running transaction check
    ---> Package postgresql90-devel.i386 0:9.0.2-2PGDG.rhel5 set to be updated
    ---> Package postgresql90-devel.x86_64 0:9.0.2-2PGDG.rhel5 set to be updated
    --> Finished Dependency Resolution

    Dependencies Resolved

    =========================================================================
     Package               Arch      Version              Repository    Size
    =========================================================================
    Installing:
     postgresql90-devel    i386      9.0.2-2PGDG.rhel5    pgdg90       1.5 M
     postgresql90-devel    x86_64    9.0.2-2PGDG.rhel5    pgdg90       1.6 M

Note how both the i386 and x86_64 platform architectures are selected for installation. Your main PostgreSQL package will only be compatible with one of those, and if the repmgr build finds the wrong postgresql90-devel these "skipping incompatible" messages appear.

In this case, you can temporarily remove both packages, then just install the correct one for your architecture. Example:

    rpm -e postgresql90-devel --allmatches
    yum install postgresql90-devel-9.0.2-2PGDG.rhel5.x86_64

Deleting only the package for the wrong platform instead might not leave behind the correct files, because of the way the two packages happen to interact.

If you already tried to build repmgr before doing this, you'll need to do:

    make USE_PGXS=1 clean

to get rid of leftover files from the wrong architecture.

Notes on Ubuntu, Debian or other Debian-based Builds
----------------------------------------------------

The Debian packages of PostgreSQL put `pg_config` into the development package called `postgresql-server-dev-$version`.

When building repmgr against a Debian-packaged build, you may discover that some development packages are needed as well.
You will need the following development packages installed:

    sudo apt-get install libxslt-dev libxml2-dev libpam-dev libedit-dev

If you're using Debian packages for PostgreSQL and are building repmgr with the USE_PGXS option, you also need to install the corresponding development package:

    sudo apt-get install postgresql-server-dev-9.0

If you build and install repmgr manually, it will not be on the system path. The binaries will be installed in /usr/lib/postgresql/$version/bin/, which is not on the default path. The reason behind this is that Ubuntu/Debian systems manage multiple installed versions of PostgreSQL on the same system through a wrapper called pg_wrapper, and repmgr is not (yet) known to this wrapper.

You can solve this in several ways; the most Debian-like is to register alternatives for repmgr and repmgrd:

    sudo update-alternatives --install /usr/bin/repmgr repmgr /usr/lib/postgresql/9.0/bin/repmgr 10
    sudo update-alternatives --install /usr/bin/repmgrd repmgrd /usr/lib/postgresql/9.0/bin/repmgrd 10

You can also make a deb package of repmgr using:

    make USE_PGXS=1 deb

This will build a Debian package one level up from where you build, normally the same directory that you have your repmgr/ directory in.
repmgr-3.0.3/QUICKSTART.md000066400000000000000000000130361264264412200150330ustar00rootroot00000000000000
repmgr quickstart guide
=======================

This quickstart guide provides some annotated examples on basic `repmgr` setup. It assumes you are familiar with PostgreSQL replication concepts and setup, as well as Linux/UNIX system administration.

For the purposes of this guide, we'll assume the database user will be `repmgr_usr` and the database will be `repmgr_db`.

Master setup
------------

1. Configure PostgreSQL

  - create user and database:

    ```
    CREATE ROLE repmgr_usr LOGIN SUPERUSER;
    CREATE DATABASE repmgr_db OWNER repmgr_usr;
    ```

  - configure `postgresql.conf` for replication (see README.md for sample settings)

  - update `pg_hba.conf`, e.g.:

    ```
    host    repmgr_db      repmgr_usr    192.168.1.0/24    trust
    host    replication    repmgr_usr    192.168.1.0/24    trust
    ```

  Restart the PostgreSQL server after making these changes.

2. Create the `repmgr` configuration file:

        $ cat /path/to/repmgr/node1/repmgr.conf
        cluster=test
        node=1
        node_name=node1
        conninfo='host=repmgr_node1 user=repmgr_usr dbname=repmgr_db'
        pg_bindir=/path/to/postgres/bin

   (For an annotated `repmgr.conf` file, see `repmgr.conf.sample` in the repository's root directory.)

3. Register the master node with `repmgr`:

        $ repmgr -f /path/to/repmgr/node1/repmgr.conf --verbose master register
        [2015-03-03 17:45:53] [INFO] repmgr connecting to master database
        [2015-03-03 17:45:53] [INFO] repmgr connected to master, checking its state
        [2015-03-03 17:45:53] [INFO] master register: creating database objects inside the repmgr_test schema
        [2015-03-03 17:45:53] [NOTICE] Master node correctly registered for cluster test with id 1 (conninfo: host=localhost user=repmgr_usr dbname=repmgr_db)

Standby setup
-------------

1. Use `repmgr standby clone` to clone a standby from the master:

        repmgr -D /path/to/standby/data -d repmgr_db -U repmgr_usr --verbose standby clone 192.168.1.2
        [2015-03-03 18:18:21] [NOTICE] No configuration file provided and default file './repmgr.conf' not found - continuing with default values
        [2015-03-03 18:18:21] [NOTICE] repmgr Destination directory ' /path/to/standby/data' provided
        [2015-03-03 18:18:21] [INFO] repmgr connecting to upstream node
        [2015-03-03 18:18:21] [INFO] repmgr connected to upstream node, checking its state
        [2015-03-03 18:18:21] [INFO] Successfully connected to upstream node. Current installation size is 27 MB
        [2015-03-03 18:18:21] [NOTICE] Starting backup...
        [2015-03-03 18:18:21] [INFO] creating directory " /path/to/standby/data"...
        [2015-03-03 18:18:21] [INFO] Executing: 'pg_basebackup -l "repmgr base backup" -h localhost -p 9595 -U repmgr_usr -D /path/to/standby/data '
        NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
        [2015-03-03 18:18:23] [NOTICE] repmgr standby clone (using pg_basebackup) complete
        [2015-03-03 18:18:23] [NOTICE] HINT: You can now start your postgresql server
        [2015-03-03 18:18:23] [NOTICE] for example : pg_ctl -D /path/to/standby/data start

   Note that the `repmgr.conf` file is not required when cloning a standby. However, we recommend providing a valid `repmgr.conf` if you wish to use replication slots, or want `repmgr` to log the clone event to the `repl_events` table.

   This will clone the PostgreSQL database files from the master, including its `postgresql.conf` and `pg_hba.conf` files, and will also automatically create the `recovery.conf` file containing the correct parameters to start streaming from the primary node.

2. Start the PostgreSQL server

3. Create the `repmgr` configuration file:

        $ cat /path/node2/repmgr/repmgr.conf
        cluster=test
        node=2
        node_name=node2
        conninfo='host=repmgr_node2 user=repmgr_usr dbname=repmgr_db'
        pg_bindir=/path/to/postgres/bin

4.
Register the standby node with `repmgr`: $ repmgr -f /path/to/repmgr/node2/repmgr.conf --verbose standby register [2015-03-03 18:24:34] [NOTICE] Opening configuration file: /path/to/repmgr/node2/repmgr.conf [2015-03-03 18:24:34] [INFO] repmgr connecting to standby database [2015-03-03 18:24:34] [INFO] repmgr connecting to master database [2015-03-03 18:24:34] [INFO] finding node list for cluster 'test' [2015-03-03 18:24:34] [INFO] checking role of cluster node '1' [2015-03-03 18:24:34] [INFO] repmgr connected to master, checking its state [2015-03-03 18:24:34] [INFO] repmgr registering the standby [2015-03-03 18:24:34] [INFO] repmgr registering the standby complete [2015-03-03 18:24:34] [NOTICE] Standby node correctly registered for cluster test with id 2 (conninfo: host=localhost user=repmgr_usr dbname=repmgr_db) This concludes the basic `repmgr` setup of master and standby. The records created in the `repl_nodes` table should look something like this: repmgr_db=# SELECT * from repmgr_test.repl_nodes; id | type | upstream_node_id | cluster | name | conninfo | slot_name | priority | active ----+---------+------------------+---------+-------+----------------------------------------------------+-----------+----------+-------- 1 | primary | | test | node1 | host=repmgr_node1 user=repmgr_usr dbname=repmgr_db | | 0 | t 2 | standby | 1 | test | node2 | host=repmgr_node2 user=repmgr_usr dbname=repmgr_db | | 0 | t (2 rows) repmgr-3.0.3/README.md000066400000000000000000000627601264264412200143060ustar00rootroot00000000000000repmgr: Replication Manager for PostgreSQL ========================================== `repmgr` is an open-source tool to manage replication and failover between multiple PostgreSQL servers. It enhances PostgreSQL's built-in hot-standby capabilities with tools to set up standby servers, monitor replication, and perform administrative tasks such as failover or manual switchover operations. 
This document covers `repmgr 3`, which supports PostgreSQL 9.3 and later. This version can use `pg_basebackup` to clone standby servers, supports replication slots and cascading replication, doesn't require a restart after promotion, and has many usability improvements. Please continue to use `repmgr 2` with PostgreSQL 9.2 and earlier. For a list of changes since `repmgr 2` and instructions on upgrading to `repmgr 3`, see the "Upgrading from repmgr 2" section below. For a list of frequently asked questions about `repmgr`, please refer to the file `FAQ.md`. Overview -------- The `repmgr` command-line tool is used to perform administrative tasks, and the `repmgrd` daemon is used to optionally monitor replication and manage automatic failover. To get started, each PostgreSQL node in your cluster must have a `repmgr.conf` file. The current master node must be registered using `repmgr master register`. Existing standby servers can be registered using `repmgr standby register`. A new standby server can be created using `repmgr standby clone` followed by `repmgr standby register`. See the `QUICKSTART.md` file for examples of how to use these commands. Once the cluster is in operation, run `repmgr cluster show` to see the status of the registered primary and standby nodes. Any standby can be manually promoted using `repmgr standby promote`. Other standby nodes can be told to follow the new master using `repmgr standby follow`. We show examples of these commands below. Next, for detailed monitoring, you must run `repmgrd` (with the same configuration file) on all your nodes. Replication status information is stored in a custom schema along with information about registered nodes. You also need `repmgrd` to configure automatic failover in your cluster. See the `FAILOVER.rst` file for an explanation of how to set up automatic failover. 
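
The setup sequence described in this overview can be sketched as a dry-run script. This is only an illustration under stated assumptions: `plan_setup` is a hypothetical helper (not part of `repmgr`), the configuration path, host name and data directory are placeholders, and the script prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch of the basic setup sequence: it only prints commands.
# plan_setup is a hypothetical helper, not part of repmgr; the paths
# and host names passed to it are placeholders.
plan_setup() {
    conf="$1"          # path to repmgr.conf on the node
    master_host="$2"   # host name of the current master
    data_dir="$3"      # data directory for the new standby

    echo "# on the master"
    echo "repmgr -f $conf master register"
    echo "# on the new standby"
    echo "repmgr -D $data_dir -d repmgr -U repmgr standby clone $master_host"
    echo "pg_ctl -D $data_dir start"
    echo "repmgr -f $conf standby register"
    echo "# on any node"
    echo "repmgr -f $conf cluster show"
}

plan_setup /etc/repmgr/repmgr.conf node1 /path/to/standby/data
```

Printing the commands first makes it easy to review the order of operations (register the master, clone and start the standby, register the standby) before running anything against a real cluster.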
Requirements ------------ `repmgr` is developed and tested on Linux and OS X, but it should work on any UNIX-like system which PostgreSQL itself supports. All nodes must be running the same major version of PostgreSQL, and we recommend that they also run the same minor version. This version of `repmgr` (v3) supports PostgreSQL 9.3 and later. Earlier versions of `repmgr` needed password-less SSH access between nodes in order to clone standby servers using `rsync`. `repmgr 3` can use `pg_basebackup` instead in most circumstances; ssh is not required. You will need to use rsync only if your PostgreSQL configuration files are outside your data directory (as on Debian) and you wish these to be copied by `repmgr`. See the `SSH-RSYNC.md` file for details on configuring password-less SSH between your nodes. Installation ------------ `repmgr` must be installed on each PostgreSQL server node. * Packages - PGDG publishes RPM packages for RedHat-based distributions - Debian/Ubuntu provide .deb packages. - See `PACKAGES.md` for details on building .deb and .rpm packages from the `repmgr` source code. * Source installation - `git clone https://github.com/2ndQuadrant/repmgr` - Or download tar.gz files from https://github.com/2ndQuadrant/repmgr/releases - To install from source, run `sudo make USE_PGXS=1 install` After installation, you should be able to run `repmgr --version` and `repmgrd --version`. These binaries should be installed in the same directory as other PostgreSQL binaries, such as `psql`. Configuration ------------- ### Server configuration By default, `repmgr` uses PostgreSQL's built-in replication protocol to clone a primary and create a standby server. If your configuration files live outside your data directory, however, you will still need to set up password-less SSH so that rsync can be used. See the `SSH-RSYNC.md` file for details. 
### PostgreSQL configuration

The primary server needs to be configured for replication with settings like the following in `postgresql.conf`:

    # Allow read-only queries on standby servers. The number of WAL
    # senders should be larger than the number of standby servers.

    hot_standby = on
    wal_level = 'hot_standby'
    max_wal_senders = 10

    # How much WAL to retain on the primary to allow a temporarily
    # disconnected standby to catch up again. The larger this is, the
    # longer the standby can be disconnected. This is needed only in
    # 9.3; from 9.4, replication slots can be used instead (see below).

    wal_keep_segments = 5000

    # Enable archiving, but leave it unconfigured (so that it can be
    # configured without a restart later). Recommended, not required.

    archive_mode = on
    archive_command = 'cd .'

    # If you plan to use repmgrd, ensure that shared_preload_libraries
    # is configured to load 'repmgr_funcs'

    shared_preload_libraries = 'repmgr_funcs'

PostgreSQL 9.4 makes it possible to use replication slots, which means `wal_keep_segments` no longer needs to be set. See section "Replication slots" below for more details.

With PostgreSQL 9.3, `repmgr` expects `wal_keep_segments` to be set to at least 5000 (= 80GB of WAL) by default, though this can be overridden with the `-w N` argument.

A dedicated PostgreSQL superuser account and a database in which to store monitoring and replication data are required. Create them by running the following commands:

    createuser -s repmgr
    createdb repmgr -O repmgr

We recommend using the name `repmgr` for both user and database, but you can use whatever name you like (and you need to set the names you chose in the `conninfo` string in `repmgr.conf`; see below). We also recommend that you set the `repmgr` user's search path to include the `repmgr` schema for convenience when querying the metadata tables and views.

The `repmgr` application will create its metadata schema in the `repmgr` database when the master server is registered.
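
The "5000 segments = 80GB" figure follows from the WAL segment size. As a rough sketch, assuming PostgreSQL's default segment size of 16 MB (`wal_retention_mb` is a hypothetical helper, not part of `repmgr` or PostgreSQL):

```shell
#!/bin/sh
# Estimate the disk space the primary may need for wal_keep_segments.
# Assumes the default WAL segment size of 16 MB unless a second
# argument is given; wal_retention_mb is a hypothetical helper.
wal_retention_mb() {
    segments="$1"
    segment_mb="${2:-16}"
    echo $(( $segments * $segment_mb ))
}

wal_retention_mb 5000    # prints 80000 (MB), i.e. roughly 80 GB
```

If you lower the value with `-w N`, the same arithmetic shows how much WAL a disconnected standby can fall behind before it has to be re-cloned.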
### repmgr configuration Create a `repmgr.conf` file on each server. Here's a minimal sample: cluster=test node=1 node_name=node1 conninfo='host=repmgr_node1 user=repmgr dbname=repmgr' The `cluster` name must be the same on all nodes. The `node` (an integer) and `node_name` must be unique to each node. The `conninfo` string must point to repmgr's database *on this node*. The host must be an IP or a name that all the nodes in the cluster can resolve (not `localhost`!). All nodes must use the same username and database name, but other parameters, such as the port, can vary between nodes. Your `repmgr.conf` should not be stored inside the PostgreSQL data directory. We recommend `/etc/repmgr/repmgr.conf`, but you can place it anywhere and use the `-f /path/to/repmgr.conf` option to tell `repmgr` where it is. If not specified, `repmgr` will search for `repmgr.conf` in the current working directory. If your PostgreSQL binaries (`pg_ctl`, `pg_basebackup`) are not in your `PATH`, you can specify an alternate location in `repmgr.conf`: pg_bindir=/path/to/postgres/bin See `repmgr.conf.sample` for an example configuration file with all available configuration settings annotated. ### Starting up The master node must be registered first using `repmgr master register`, and standby servers must be registered using `repmgr standby register`; this inserts details about each node into the control database. Use `repmgr cluster show` to see the result. See the `QUICKSTART.md` file for examples of how to use these commands. Failover -------- To promote a standby to master, on the standby execute e.g.: repmgr -f /etc/repmgr/repmgr.conf --verbose standby promote `repmgr` will attempt to connect to the current master to verify that it is not available (if it is, `repmgr` will not promote the standby). 
Other standby servers need to be told to follow the new master with e.g.:

    repmgr -f /etc/repmgr/repmgr.conf --verbose standby follow

See file `FAILOVER.rst` for details on setting up automated failover.

Converting a failed master to a standby
---------------------------------------

Often it's desirable to bring a failed master back into replication as a standby. First, ensure that the master's PostgreSQL server is no longer running; then use `repmgr standby clone` to re-sync its data directory with the current master, e.g.:

    repmgr -f /etc/repmgr/repmgr.conf \
      --force --rsync-only \
      -h node2 -d repmgr -U repmgr --verbose \
      standby clone

Here it's essential to use the command line options `--force`, to ensure `repmgr` will re-use the existing data directory, and `--rsync-only`, which causes `repmgr` to use `rsync` rather than `pg_basebackup`, as the latter can only be used to clone a fresh standby.

The node can then be restarted.

The node will then need to be re-registered with `repmgr`; again the `--force` option is required to update the existing record:

    repmgr -f /etc/repmgr/repmgr.conf \
      --force \
      standby register

Replication management with repmgrd
-----------------------------------

`repmgrd` is a management and monitoring daemon which runs on standby nodes and which can automate actions such as failover and updating standbys to follow the new master. `repmgrd` can be started simply with e.g.:

    repmgrd -f /etc/repmgr/repmgr.conf --verbose > $HOME/repmgr/repmgr.log 2>&1

or alternatively:

    repmgrd -f /etc/repmgr/repmgr.conf --verbose --monitoring-history > $HOME/repmgr/repmgrd.log 2>&1

which will track replication advance or lag on all registered standbys.

For permanent operation, we recommend using the options `-d/--daemonize` to detach the `repmgrd` process, and `-p/--pid-file` to write the process PID to a file.
Example log output (at default log level): [2015-03-11 13:15:40] [INFO] checking cluster configuration with schema 'repmgr_test' [2015-03-11 13:15:40] [INFO] checking node 2 in cluster 'test' [2015-03-11 13:15:40] [INFO] reloading configuration file and updating repmgr tables [2015-03-11 13:15:40] [INFO] starting continuous standby node monitoring Note that currently `repmgrd` does not provide logfile rotation. To ensure the current logfile does not grow indefinitely, configure your system's `logrotate` to do this. Sample configuration to rotate logfiles weekly with retention for up to 52 weeks and rotation forced if a file grows beyond 100Mb: /var/log/postgresql/repmgr-9.4.log { missingok compress rotate 52 maxsize 100M weekly create 0600 postgres postgres } Witness server -------------- In a situation caused e.g. by a network interruption between two data centres, it's important to avoid a "split-brain" situation where both sides of the network assume they are the active segment and the side without an active master unilaterally promotes one of its standbys. To prevent this situation happening, it's essential to ensure that one network segment has a "voting majority", so other segments will know they're in the minority and not attempt to promote a new master. Where an odd number of servers exists, this is not an issue. However, if each network has an even number of nodes, it's necessary to provide some way of ensuring a majority, which is where the witness server becomes useful. This is not a fully-fledged standby node and is not integrated into replication, but it effectively represents the "casting vote" when deciding which network segment has a majority. A witness server can be set up using `repmgr witness create` (see below for details) and can run on a dedicated server or an existing node. Note that it only makes sense to create a witness server in conjunction with running `repmgrd`; the witness server will require its own `repmgrd` instance. 
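
The "voting majority" idea can be illustrated with a small sketch. This is not `repmgrd`'s actual implementation: `has_majority` is a hypothetical helper that only shows why an even split cannot elect a new master and why a witness breaks the tie.

```shell
#!/bin/sh
# Illustration of the "voting majority" rule; this is not repmgrd's
# actual implementation. has_majority is a hypothetical helper that
# succeeds (exit status 0) only if the visible nodes form a strict
# majority of all nodes in the cluster.
has_majority() {
    visible="$1"   # nodes reachable from this network segment
    total="$2"     # all nodes in the cluster, including any witness
    [ $(( $visible * 2 )) -gt "$total" ]
}

# Four nodes split 2/2 between two data centres: neither side wins.
has_majority 2 4 && echo "2 of 4: majority" || echo "2 of 4: no majority"
# Adding a witness to one side gives that side 3 of 5.
has_majority 3 5 && echo "3 of 5: majority" || echo "3 of 5: no majority"
```

With the witness in place, the segment that can see three of the five nodes may promote a standby, while the other segment (two of five) knows it is in the minority.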
Monitoring ---------- When `repmgrd` is running with the option `-m/--monitoring-history`, it will constantly write node status information to the `repl_monitor` table, which can be queried easily using the view `repl_status`: repmgr=# SELECT * FROM repmgr_test.repl_status; -[ RECORD 1 ]-------------+----------------------------- primary_node | 1 standby_node | 2 standby_name | node2 node_type | standby active | t last_monitor_time | 2015-03-11 14:02:34.51713+09 last_wal_primary_location | 0/3012AF0 last_wal_standby_location | 0/3012AF0 replication_lag | 0 bytes replication_time_lag | 00:00:03.463085 apply_lag | 0 bytes communication_time_lag | 00:00:00.955385 Event logging and notifications ------------------------------- To help understand what significant events (e.g. failure of a node) happened when and for what reason, `repmgr` logs such events into the `repl_events` table, e.g.: repmgr_db=# SELECT * from repmgr_test.repl_events ; node_id | event | successful | event_timestamp | details ---------+------------------+------------+-------------------------------+----------------------------------------------------------------------------------- 1 | master_register | t | 2015-03-16 17:36:21.711796+09 | 2 | standby_clone | t | 2015-03-16 17:36:31.286934+09 | Cloned from host 'localhost', port 5500; backup method: pg_basebackup; --force: N 2 | standby_register | t | 2015-03-16 17:36:32.391567+09 | (3 rows) Additionally `repmgr` can execute an external program each time an event is logged. 
This program is defined with the configuration variable `event_notification_command`; the command string can contain the following placeholders, which will be replaced with the same content which is written to the `repl_events` table:

    %n - node id
    %e - event type
    %s - success (1 or 0)
    %t - timestamp
    %d - description

Example:

    event_notification_command=/path/to/some-script %n %e %s "%t" "%d"

By default the program defined with `event_notification_command` will be executed for every event; to restrict execution to certain events, list these in the parameter `event_notifications`:

    event_notifications=master_register,standby_register

The following event types currently exist:

    master_register
    standby_register
    standby_unregister
    standby_clone
    standby_promote
    witness_create
    repmgrd_start
    repmgrd_monitor
    repmgrd_failover_promote
    repmgrd_failover_follow

Cascading replication
---------------------

Cascading replication - where a standby can connect to an upstream node and not the master server itself - was introduced in PostgreSQL 9.2. `repmgr` and `repmgrd` support cascading replication by keeping track of the relationship between standby servers - each node record is stored with the node id of its upstream ("parent") server (except of course the master server).

In a failover situation where the master node fails and a top-level standby is promoted, a standby connected to another standby will not be affected and will continue working as normal (even if the upstream standby it's connected to becomes the master node). If however the node's direct upstream fails, the "cascaded standby" will attempt to reconnect to that node's parent.
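
The "reconnect to the parent of a failed upstream" rule can be sketched as a lookup over the recorded upstream relationships. This is an illustration only, not `repmgrd`'s code: the `NODES` list and both helper functions are hypothetical, standing in for the upstream node ids that `repmgr` stores with each node record.

```shell
#!/bin/sh
# Sketch of the "reconnect to the parent of a failed upstream" rule.
# NODES is a hypothetical list of "node_id:upstream_node_id" pairs;
# the master (node 1) has an empty upstream. Not repmgrd's actual code.
NODES="1: 2:1 3:2"

# Print the upstream node id recorded for the given node id.
upstream_of() {
    for entry in $NODES; do
        if [ "${entry%%:*}" = "$1" ]; then
            echo "${entry#*:}"
            return 0
        fi
    done
    return 1
}

# Print the node a standby would fall back to if its direct upstream fails.
fallback_upstream() {
    upstream_of "$(upstream_of "$1")"
}

# Node 3 follows node 2; if node 2 fails, node 3 falls back to node 2's parent.
fallback_upstream 3
```

Here node 3 is a cascaded standby of node 2, which in turn follows the master (node 1), so the fallback for node 3 is node 1.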
To configure standby servers for cascading replication, add the parameter `upstream_node` to `repmgr.conf` and set it to the id of the node it should connect to, e.g.: cluster=test node=2 node_name=node2 upstream_node=1 Replication slots ----------------- Replication slots were introduced with PostgreSQL 9.4 and enable standbys to notify the master of their WAL consumption, ensuring that the master will not remove any WAL files until they have been received by all standbys. This mitigates the requirement to manage WAL file retention using `wal_keep_segments` etc., with the caveat that if a standby fails, no WAL files will be removed until the standby's replication slot is deleted. To enable replication slots, set the boolean parameter `use_replication_slots` in `repmgr.conf`: use_replication_slots=1 `repmgr` will automatically generate an appropriate slot name, which is stored in the `repl_nodes` table. Note that `repmgr` will fail with an error if this option is specified when working with PostgreSQL 9.3. Be aware that when initially cloning a standby, you will need to ensure that all required WAL files remain available while the cloning is taking place. If using the default `pg_basebackup` method, we recommend setting `pg_basebackup`'s `--xlog-method` parameter to `stream` like this: pg_basebackup_options='--xlog-method=stream' See the `pg_basebackup` documentation [*] for details. Otherwise you'll need to set `wal_keep_segments` to an appropriately high value. [*] http://www.postgresql.org/docs/current/static/app-pgbasebackup.html Further reading: * http://www.postgresql.org/docs/current/interactive/warm-standby.html#STREAMING-REPLICATION-SLOTS * http://blog.2ndquadrant.com/postgresql-9-4-slots/ Upgrading from repmgr 2 ----------------------- `repmgr 3` is largely compatible with `repmgr 2`; the only step required to upgrade is to update the `repl_nodes` table to the definition needed by `repmgr 3`. 
See the file `sql/repmgr2_repmgr3.sql` for details on how to do this.

`repmgrd` must *not* be running while `repl_nodes` is being updated.

Existing `repmgr.conf` files can be retained as-is.

---------------------------------------

Reference
---------

### repmgr command reference

Not all of these commands need the `repmgr.conf` file, but they need to be able to connect to the remote and local databases.

You can specify the remote database with the `-h` parameter, or as the last parameter in `standby clone` and `standby follow`. If you need to specify a port other than the default 5432, use the `-p` parameter. The standby is always considered to be localhost, and a second `-p` parameter will indicate its port if it is different from the default one.

* `master register`

    Registers a master in a cluster. This command needs to be executed before any standby nodes are registered.

    `primary register` can be used as an alias for `master register`.

* `standby register`

    Registers a standby with `repmgr`. This command needs to be executed to enable promote/follow operations and to allow `repmgrd` to work with the node. An existing standby can be registered using this command.

* `standby unregister`

    Unregisters a standby with `repmgr`. This command does not affect the actual replication.

* `standby clone [node to be cloned]`

    Clones a new standby node from the data directory of the master (or an upstream cascading standby) using `pg_basebackup` or `rsync`. Additionally it will create the `recovery.conf` file required to start the server as a standby. This command does not require `repmgr.conf` to be provided, but does require connection details of the master or upstream server as command line parameters.

    Provide the `-D/--data-dir` option to specify the destination data directory; if not, the same directory path as on the source server will be used.
    By default, `pg_basebackup` will be used to copy data from the master or upstream node but this can only be used for bootstrapping new installations. To update an existing but 'stale' data directory (for example belonging to a failed master), `rsync` must be used by specifying `--rsync-only`. In this case, password-less SSH connections between servers are required.

* `standby promote`

    Promotes a standby to a master if the current master has failed. This command requires a valid `repmgr.conf` file for the standby, either specified explicitly with `-f/--config-file` or located in the current working directory; no additional arguments are required.

    If the standby promotion succeeds, the server will not need to be restarted. However any other standbys will need to follow the new server, by using `standby follow` (see below); if `repmgrd` is active, it will handle this.

    This command will not function if the current master is still running.

* `witness create`

    Creates a witness server as a separate PostgreSQL instance. This instance can be on a separate server or a server running an existing node. The witness server contains a copy of the repmgr metadata tables but will not be set up as a standby; instead it will update its metadata copy each time a failover occurs.

    Note that it only makes sense to create a witness server if `repmgrd` is in use; see section "witness server" above.

    By default the witness server will use port 5499 to facilitate easier setup on a server running an existing node.

* `standby follow`

    Attaches the standby to a new master. This command requires a valid `repmgr.conf` file for the standby, either specified explicitly with `-f/--config-file` or located in the current working directory; no additional arguments are required.

    This command will force a restart of the standby server. It can only be used to attach a standby to a new master node.

* `cluster show`

    Displays information about each node in the replication cluster.
This command polls each registered server and shows its role (master / standby / witness) or "FAILED" if the node doesn't respond. It polls each server directly and can be run on any node in the cluster; this is also useful when analyzing connectivity from a particular node. This command requires a valid `repmgr.conf` file for the node on which it is executed, either specified explicitly with `-f/--config-file` or located in the current working directory; no additional arguments are required. Example: repmgr -f /path/to/repmgr.conf cluster show Role | Connection String * master | host=node1 dbname=repmgr user=repmgr standby | host=node2 dbname=repmgr user=repmgr standby | host=node3 dbname=repmgr user=repmgr * `cluster cleanup` Purges monitoring history from the `repl_monitor` table to prevent excessive table growth. Use the `-k/--keep-history` to specify the number of days of monitoring history to retain. This command can be used manually or as a cronjob. This command requires a valid `repmgr.conf` file for the node on which it is executed, either specified explicitly with `-f/--config-file` or located in the current working directory; no additional arguments are required. ### repmgr configuration file See `repmgr.conf.sample` for an example configuration file with available configuration settings annotated. ### repmgr database schema `repmgr` creates a small schema for its own use in the database specified in each node's `conninfo` configuration parameter. This database can in principle be any database. The schema name is the global `cluster` name prefixed with `repmgr_`, so for the example setup above the schema name is `repmgr_test`. 
The schema contains two tables: * `repl_nodes` stores information about all registered servers in the cluster * `repl_monitor` stores monitoring information about each node (generated by `repmgrd` with `-m/--monitoring-history` option enabled) and one view: * `repl_status` summarizes the latest monitoring information for each node (generated by `repmgrd` with `-m/--monitoring-history` option enabled) ### Error codes `repmgr` or `repmgrd` will return one of the following error codes on program exit: * SUCCESS (0) Program ran successfully. * ERR_BAD_CONFIG (1) Configuration file could not be parsed or was invalid * ERR_BAD_RSYNC (2) An rsync call made by the program returned an error * ERR_NO_RESTART (4) An attempt to restart a PostgreSQL instance failed * ERR_DB_CON (6) Error when trying to connect to a database * ERR_DB_QUERY (7) Error while executing a database query * ERR_PROMOTED (8) Exiting program because the node has been promoted to master * ERR_BAD_PASSWORD (9) Password used to connect to a database was rejected * ERR_STR_OVERFLOW (10) String overflow error * ERR_FAILOVER_FAIL (11) Error encountered during failover (repmgrd only) * ERR_BAD_SSH (12) Error when connecting to remote host via SSH * ERR_SYS_FAILURE (13) Error when forking (repmgrd only) * ERR_BAD_BASEBACKUP (14) Error when executing pg_basebackup * ERR_MONITORING_FAIL (16) Unrecoverable error encountered during monitoring (repmgrd only) Support and Assistance ---------------------- 2ndQuadrant provides 24x7 production support for repmgr, including configuration assistance, installation verification and training for running a robust replication cluster. For further details see: * http://2ndquadrant.com/en/support/ There is a mailing list/forum to discuss contributions or issues http://groups.google.com/group/repmgr The IRC channel #repmgr is registered with freenode. Further information is available at http://www.repmgr.org/ We'd love to hear from you about how you use repmgr. 
Case studies and news are always welcome. Send us an email at info@2ndQuadrant.com, or send a postcard to repmgr c/o 2ndQuadrant 7200 The Quorum Oxford Business Park North Oxford OX4 2JZ United Kingdom Thanks from the repmgr core team. * Ian Barwick * Jaime Casanova * Abhijit Menon-Sen * Simon Riggs * Cedric Villemain Further reading --------------- * http://blog.2ndquadrant.com/announcing-repmgr-2-0/ * http://blog.2ndquadrant.com/managing-useful-clusters-repmgr/ * http://blog.2ndquadrant.com/easier_postgresql_90_clusters/ repmgr-3.0.3/RHEL/000077500000000000000000000000001264264412200135465ustar00rootroot00000000000000repmgr-3.0.3/RHEL/repmgr3-93.spec000066400000000000000000000033551264264412200162400ustar00rootroot00000000000000Summary: repmgr Name: repmgr Version: 3.0 Release: 1 License: GPLv3 Group: System Environment/Daemons URL: http://repmgr.org Packager: Ian Barwick Vendor: 2ndQuadrant Limited Distribution: centos Source0: %{name}-%{version}.tar.gz BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root %description repmgr is a utility suite which greatly simplifies the process of setting up and managing replication using streaming replication within a cluster of PostgreSQL servers. 
%prep %setup %build export PATH=$PATH:/usr/pgsql-9.3/bin/ %{__make} USE_PGXS=1 %install [ "%{buildroot}" != "/" ] && %{__rm} -rf %{buildroot} export PATH=$PATH:/usr/pgsql-9.3/bin/ %{__make} USE_PGXS=1 install DESTDIR=%{buildroot} INSTALL="install -p" %{__make} USE_PGXS=1 install_prog DESTDIR=%{buildroot} INSTALL="install -p" %{__make} USE_PGXS=1 install_rhel DESTDIR=%{buildroot} INSTALL="install -p" %clean [ "%{buildroot}" != "/" ] && %{__rm} -rf %{buildroot} %files %defattr(-,root,root) /usr/bin/repmgr /usr/bin/repmgrd /usr/pgsql-9.3/bin/repmgr /usr/pgsql-9.3/bin/repmgrd /usr/pgsql-9.3/lib/repmgr_funcs.so /usr/pgsql-9.3/share/contrib/repmgr.sql /usr/pgsql-9.3/share/contrib/repmgr_funcs.sql /usr/pgsql-9.3/share/contrib/uninstall_repmgr.sql /usr/pgsql-9.3/share/contrib/uninstall_repmgr_funcs.sql %attr(0755,root,root)/etc/init.d/repmgrd %attr(0644,root,root)/etc/sysconfig/repmgrd %attr(0644,root,root)/etc/repmgr/repmgr.conf.sample %changelog * Tue Mar 10 2015 Ian Barwick ian@2ndquadrant.com> - build for repmgr 3.0 * Thu Jun 05 2014 Nathan Van Overloop 2.0.2 - fix witness creation to create db and user if needed * Fri Apr 04 2014 Nathan Van Overloop 2.0.1 - initial build for RHEL6 repmgr-3.0.3/RHEL/repmgrd.init000077500000000000000000000053411264264412200161010ustar00rootroot00000000000000#!/bin/sh # # chkconfig: - 75 16 # description: Enable repmgrd replication management and monitoring daemon for PostgreSQL # processname: repmgrd # pidfile="/var/run/${NAME}.pid" # Source function library. INITD=/etc/rc.d/init.d . $INITD/functions # Get function listing for cross-distribution logic. TYPESET=`typeset -f|grep "declare"` # Get network config. . 
/etc/sysconfig/network DESC="PostgreSQL replication management and monitoring daemon" NAME=repmgrd REPMGRD_ENABLED=no REPMGRD_OPTS= REPMGRD_USER=postgres REPMGRD_BIN=/usr/pgsql-9.3/bin/repmgrd REPMGRD_PIDFILE=/var/run/repmgrd.pid REPMGRD_LOCK=/var/lock/subsys/${NAME} REPMGRD_LOG=/var/lib/pgsql/9.3/data/pg_log/repmgrd.log # Read configuration variable file if it is present [ -r /etc/sysconfig/$NAME ] && . /etc/sysconfig/$NAME # For SELinux we need to use 'runuser' not 'su' if [ -x /sbin/runuser ] then SU=runuser else SU=su fi test -x $REPMGRD_BIN || exit 0 case "$REPMGRD_ENABLED" in [Yy]*) break ;; *) exit 0 ;; esac if [ -z "${REPMGRD_OPTS}" ] then echo "Not starting ${NAME}, REPMGRD_OPTS not set in /etc/sysconfig/${NAME}" exit 0 fi start() { REPMGRD_START=$"Starting ${NAME} service: " # Make sure startup-time log file is valid if [ ! -e "${REPMGRD_LOG}" -a ! -h "${REPMGRD_LOG}" ] then touch "${REPMGRD_LOG}" || exit 1 chown ${REPMGRD_USER}:postgres "${REPMGRD_LOG}" chmod go-rwx "${REPMGRD_LOG}" [ -x /sbin/restorecon ] && /sbin/restorecon "${REPMGRD_LOG}" fi echo -n "${REPMGRD_START}" $SU -l $REPMGRD_USER -c "${REPMGRD_BIN} ${REPMGRD_OPTS} -p ${REPMGRD_PIDFILE} &" >> "${REPMGRD_LOG}" 2>&1 < /dev/null sleep 2 pid=`head -n 1 "${REPMGRD_PIDFILE}" 2>/dev/null` if [ "x${pid}" != "x" ] then success "${REPMGRD_START}" touch "${REPMGRD_LOCK}" echo $pid > "${REPMGRD_PIDFILE}" echo else failure "${REPMGRD_START}" echo script_result=1 fi } stop() { echo -n $"Stopping ${NAME} service: " if [ -e "${REPMGRD_LOCK}" ] then killproc ${NAME} ret=$? if [ $ret -eq 0 ] then echo_success rm -f "${REPMGRD_PIDFILE}" rm -f "${REPMGRD_LOCK}" else echo_failure script_result=1 fi else # not running; per LSB standards this is "ok" echo_success fi echo } # See how we were called. case "$1" in start) start ;; stop) stop ;; status) status -p $REPMGRD_PIDFILE $NAME script_result=$? 
        ;;
    restart)
        stop
        start
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart}"
        exit 2
esac

exit $script_result
repmgr-3.0.3/RHEL/repmgrd.sysconfig000066400000000000000000000010131264264412200171270ustar00rootroot00000000000000
# default settings for repmgrd. This file is sourced by /bin/sh from
# /etc/init.d/repmgrd

# disable repmgrd by default so it won't get started upon installation
# valid values: yes/no
REPMGRD_ENABLED=no

# Options for repmgrd (required)
#REPMGRD_OPTS="--verbose -d -f /var/lib/pgsql/repmgr/repmgr.conf"

# User to run repmgrd as
#REPMGRD_USER=postgres

# repmgrd binary
#REPMGRD_BIN=/usr/bin/repmgrd

# pid file
#REPMGRD_PIDFILE=/var/lib/pgsql/repmgr/repmgrd.pid

# log file
#REPMGRD_LOG=/var/lib/pgsql/repmgr/repmgrd.log
repmgr-3.0.3/SSH-RSYNC.md000066400000000000000000000032061264264412200146700ustar00rootroot00000000000000
Set up trusted copy between postgres accounts
---------------------------------------------

If you need to use `rsync` to clone standby servers, the `postgres` account
on your primary and standby servers must each be able to access the other
using SSH without a password.

First generate an ssh key, using an empty passphrase, and copy the resulting
keys and a matching authorization file to a privileged user account on the other
system:

    [postgres@node1]$ ssh-keygen -t rsa
    Generating public/private rsa key pair.
    Enter file in which to save the key (/var/lib/pgsql/.ssh/id_rsa):
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /var/lib/pgsql/.ssh/id_rsa.
    Your public key has been saved in /var/lib/pgsql/.ssh/id_rsa.pub.
The key fingerprint is: aa:bb:cc:dd:ee:ff:aa:11:22:33:44:55:66:77:88:99 postgres@db1.domain.com [postgres@node1]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys [postgres@node1]$ chmod go-rwx ~/.ssh/* [postgres@node1]$ cd ~/.ssh [postgres@node1]$ scp id_rsa.pub id_rsa authorized_keys user@node2: Login as a user on the other system, and install the files into the `postgres` user's account: [user@node2 ~]$ sudo chown postgres.postgres authorized_keys id_rsa.pub id_rsa [user@node2 ~]$ sudo mkdir -p ~postgres/.ssh [user@node2 ~]$ sudo chown postgres.postgres ~postgres/.ssh [user@node2 ~]$ sudo mv authorized_keys id_rsa.pub id_rsa ~postgres/.ssh [user@node2 ~]$ sudo chmod -R go-rwx ~postgres/.ssh Now test that ssh in both directions works. You may have to accept some new known hosts in the process. repmgr-3.0.3/TODO000066400000000000000000000055561264264412200135170ustar00rootroot00000000000000Known issues in repmgr ====================== * When running repmgr against a remote machine, operations that start the database server using the ``pg_ctl`` command may accidentally terminate after their associated ssh session ends. * PGPASSFILE may not be passed to pg_basebackup Planned feature improvements ============================ * Use 'primary' instead of 'master' in documentation and log output for consistency with PostgreSQL documentation. See also commit 870b0a53b627eeb9aca1fc14cbafe25b5beafe12. * A better check which standby did receive most of the data * Make the fact that a standby may be delayed a factor in the voting algorithm * include support for delayed standbys * Create the repmgr user/database on "master register". * Use pg_basebackup for the data directory, and ALSO rsync for the configuration files. * If no configuration file supplied, search in sensible default locations (currently: current directory and `pg_config --sysconfdir`); if possible this should include the location provided by the package, if installed. 
* repmgrd: if connection to the upstream node fails on startup, optionally
  retry for a certain period before giving up; this will cover cases when
  e.g. primary and standby are both starting up, and the standby comes up
  before the primary. See github issue #80.

* make old master node ID available for event notification commands
  (See github issue #80).

* Have pg_basebackup use replication slots, if and when support for
  this is added; see:
  http://www.postgresql.org/message-id/555DD2B2.7020000@gmx.net

* use "primary/standby" terminology in place of "master/slave" for consistency
  with main PostgreSQL usage

* repmgr standby clone: possibility to use barman instead of performing a new
  base backup

* possibility to transform a failed master into a new standby with pg_rewind

* "repmgr standby switchover" to promote a standby in a controlled manner
  and convert the existing primary into a standby

* make repmgrd more robust

* repmgr: when cloning a standby using pg_basebackup and replication slots are
  requested, activate the replication slot using pg_receivexlog to negate the
  need to set `wal_keep_segments` just for the initial clone (9.4 and 9.5).

Usability improvements
======================

* repmgr: add interrupt handler, so that if the program is interrupted
  while running a backup, an attempt can be made to execute pg_stop_backup()
  on the primary, to prevent an orphaned backup state existing.

* repmgr: when unregistering a node, delete any entries in the repl_monitoring
  table.

* repmgr: for "standby unregister", accept connection parameters for the
  primary and perform metadata updates (and slot removal) directly on
  the primary, to allow a shutdown standby to be unregistered
  (currently the standby must still be running, which means the replication
  slot can't be dropped).
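
The interrupt-handler item above could be approached roughly as follows. This
is only a sketch under stated assumptions, not code from repmgr: the names
`interrupt_requested`, `backup_in_progress`, `stop_backup_on_primary()`,
`install_interrupt_handler()` and `check_for_interrupt()` are all hypothetical,
and the stand-in cleanup function merely counts calls where a real
implementation would run pg_stop_backup() on the primary.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Set from the signal handler; nothing else is done in signal context. */
static volatile sig_atomic_t interrupt_requested = 0;

/* Hypothetical flag: true while a backup started by this process is running. */
static bool backup_in_progress = false;

/* Counts cleanup attempts so the sketch can be exercised without a server. */
static int cleanup_calls = 0;

static void
handle_interrupt(int signo)
{
	(void) signo;
	interrupt_requested = 1;
}

/* Hypothetical stand-in for executing pg_stop_backup() on the primary. */
static void
stop_backup_on_primary(void)
{
	cleanup_calls++;
	backup_in_progress = false;
}

/* Install the handler for SIGINT and SIGTERM; returns 0 on success. */
static int
install_interrupt_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = handle_interrupt;
	sigemptyset(&sa.sa_mask);

	if (sigaction(SIGINT, &sa, NULL) != 0)
		return -1;

	return sigaction(SIGTERM, &sa, NULL);
}

/*
 * Call between backup steps. Returns true if an interrupt was received,
 * after attempting cleanup when a backup is still in progress.
 */
static bool
check_for_interrupt(void)
{
	if (!interrupt_requested)
		return false;

	if (backup_in_progress)
		stop_backup_on_primary();

	return true;
}
```

The handler only sets a flag, so the database call happens in normal program
flow rather than inside the signal handler, where very few functions are safe
to call.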
repmgr-3.0.3/check_dir.c000066400000000000000000000164271264264412200151050ustar00rootroot00000000000000/* * check_dir.c - Directories management functions * Copyright (C) 2ndQuadrant, 2010-2015 * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program. If not, see . * */ #include #include #include #include #include #include #include /* NB: postgres_fe must be included BEFORE check_dir */ #include #include #include "check_dir.h" #include "strutil.h" #include "log.h" static bool _create_pg_dir(char *dir, bool force, bool for_witness); static int unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf); /* * make sure the directory either doesn't exist or is empty * we use this function to check the new data directory and * the directories for tablespaces * * This is the same check initdb does on the new PGDATA dir * * Returns 0 if nonexistent, 1 if exists and empty, 2 if not empty, * or -1 if trouble accessing directory */ int check_dir(char *dir) { DIR *chkdir; struct dirent *file; int result = 1; errno = 0; chkdir = opendir(dir); if (!chkdir) return (errno == ENOENT) ? 
0 : -1; while ((file = readdir(chkdir)) != NULL) { if (strcmp(".", file->d_name) == 0 || strcmp("..", file->d_name) == 0) { /* skip this and parent directory */ continue; } else { result = 2; /* not empty */ break; } } #ifdef WIN32 /* * This fix is in mingw cvs (runtime/mingwex/dirent.c rev 1.4), but not in * released version */ if (GetLastError() == ERROR_NO_MORE_FILES) errno = 0; #endif closedir(chkdir); if (errno != 0) return -1; /* some kind of I/O error? */ return result; } /* * Create directory with error log message when failing */ bool create_dir(char *dir) { if (mkdir_p(dir, 0700) == 0) return true; log_err(_("unable to create directory \"%s\": %s\n"), dir, strerror(errno)); return false; } bool set_dir_permissions(char *dir) { return (chmod(dir, 0700) != 0) ? false : true; } /* function from initdb.c */ /* source adapted from FreeBSD /src/bin/mkdir/mkdir.c */ /* * this tries to build all the elements of a path to a directory a la mkdir -p * we assume the path is in canonical form, i.e. uses / as the separator * we also assume it isn't null. * * note that on failure, the path arg has been modified to show the particular * directory level we had problems with. */ int mkdir_p(char *path, mode_t omode) { struct stat sb; mode_t numask, oumask; int first, last, retval; char *p; p = path; oumask = 0; retval = 0; #ifdef WIN32 /* skip network and drive specifiers for win32 */ if (strlen(p) >= 2) { if (p[0] == '/' && p[1] == '/') { /* network drive */ p = strstr(p + 2, "/"); if (p == NULL) return 1; } else if (p[1] == ':' && ((p[0] >= 'a' && p[0] <= 'z') || (p[0] >= 'A' && p[0] <= 'Z'))) { /* local drive */ p += 2; } } #endif if (p[0] == '/') /* Skip leading '/'. 
*/ ++p; for (first = 1, last = 0; !last; ++p) { if (p[0] == '\0') last = 1; else if (p[0] != '/') continue; *p = '\0'; if (!last && p[1] == '\0') last = 1; if (first) { /* * POSIX 1003.2: For each dir operand that does not name an * existing directory, effects equivalent to those caused by the * following command shall occcur: * * mkdir -p -m $(umask -S),u+wx $(dirname dir) && mkdir [-m mode] * dir * * We change the user's umask and then restore it, instead of * doing chmod's. */ oumask = umask(0); numask = oumask & ~(S_IWUSR | S_IXUSR); (void) umask(numask); first = 0; } if (last) (void) umask(oumask); /* check for pre-existing directory; ok if it's a parent */ if (stat(path, &sb) == 0) { if (!S_ISDIR(sb.st_mode)) { if (last) errno = EEXIST; else errno = ENOTDIR; retval = 1; break; } } else if (mkdir(path, last ? omode : S_IRWXU | S_IRWXG | S_IRWXO) < 0) { retval = 1; break; } if (!last) *p = '/'; } if (!first && !last) (void) umask(oumask); return retval; } bool is_pg_dir(char *dir) { const size_t buf_sz = 8192; char path[buf_sz]; struct stat sb; int r; /* test pgdata */ xsnprintf(path, buf_sz, "%s/PG_VERSION", dir); if (stat(path, &sb) == 0) return true; /* test tablespace dir */ sprintf(path, "ls %s/PG_*/ -I*", dir); r = system(path); if (r == 0) return true; return false; } bool create_pg_dir(char *dir, bool force) { return _create_pg_dir(dir, force, false); } bool create_witness_pg_dir(char *dir, bool force) { return _create_pg_dir(dir, force, true); } static bool _create_pg_dir(char *dir, bool force, bool for_witness) { bool pg_dir = false; /* Check this directory could be used as a PGDATA dir */ switch (check_dir(dir)) { case 0: /* dir not there, must create it */ log_info(_("creating directory \"%s\"...\n"), dir); if (!create_dir(dir)) { log_err(_("unable to create directory \"%s\"...\n"), dir); return false; } break; case 1: /* Present but empty, fix permissions and use it */ log_info(_("checking and correcting permissions on existing directory %s 
...\n"), dir); if (!set_dir_permissions(dir)) { log_err(_("unable to change permissions of directory \"%s\": %s\n"), dir, strerror(errno)); return false; } break; case 2: /* Present and not empty */ log_warning(_("directory \"%s\" exists but is not empty\n"), dir); pg_dir = is_pg_dir(dir); if (pg_dir && force) { /* * The witness server does not store any data other than a copy of the * repmgr metadata, so in --force mode we can simply overwrite the * directory. * * For non-witness servers, we'll leave the data in place, both to reduce * the risk of unintentional data loss and to make it possible for the * data directory to be brought up-to-date with rsync. */ if (for_witness) { log_notice(_("deleting existing data directory \"%s\"\n"), dir); nftw(dir, unlink_dir_callback, 64, FTW_DEPTH | FTW_PHYS); } /* Let it continue */ break; } else if (pg_dir && !force) { log_hint(_("This looks like a PostgreSQL directory.\n" "If you are sure you want to clone here, " "please check there is no PostgreSQL server " "running and use the -F/--force option\n")); return false; } return false; default: /* Trouble accessing directory */ log_err(_("could not access directory \"%s\": %s\n"), dir, strerror(errno)); return false; } return true; } static int unlink_dir_callback(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf) { int rv = remove(fpath); if (rv) perror(fpath); return rv; } repmgr-3.0.3/check_dir.h000066400000000000000000000020221264264412200150740ustar00rootroot00000000000000/* * check_dir.h * Copyright (c) 2ndQuadrant, 2010-2015 * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. 
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 *
 */

#ifndef _REPMGR_CHECK_DIR_H_
#define _REPMGR_CHECK_DIR_H_

int			mkdir_p(char *path, mode_t omode);
int			check_dir(char *dir);
bool		create_dir(char *dir);
bool		set_dir_permissions(char *dir);
bool		is_pg_dir(char *dir);
bool		create_pg_dir(char *dir, bool force);
bool		create_witness_pg_dir(char *dir, bool force);

#endif
repmgr-3.0.3/config.c000066400000000000000000000620771264264412200144410ustar00rootroot00000000000000
/*
 * config.c - Functions to parse the config file
 * Copyright (C) 2ndQuadrant, 2010-2015
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 *
 */

#include <sys/stat.h>			/* for stat() */

#include "config.h"
#include "log.h"
#include "strutil.h"
#include "repmgr.h"

static void parse_event_notifications_list(t_configuration_options *options, const char *arg);
static void tablespace_list_append(t_configuration_options *options, const char *arg);
static void exit_with_errors(ErrorList *config_errors);

/* Program name; NULL until set_progname() has been called */
static const char *_progname = NULL;
static char config_file_path[MAXPGPATH];
static bool config_file_provided = false;
static bool config_file_found = false;


void
set_progname(const char *argv0)
{
	_progname = get_progname(argv0);
}

const char *
progname(void)
{
	return _progname;
}

/*
 * load_config()
 *
 * Set default options and overwrite with values from provided configuration
 * file.
 *
 * Returns true if a configuration file could be parsed, otherwise false.
 *
 * Any configuration options changed in this function must also be changed in
 * reload_config()
 *
 * NOTE: this function is called before the logger is set up, so we need
 * to handle the verbose option ourselves; also the default log level is NOTICE,
 * so we can't use DEBUG.
 */
bool
load_config(const char *config_file, bool verbose, t_configuration_options *options, char *argv0)
{
	struct stat stat_config;

	/*
	 * If a configuration file was provided, check it exists, otherwise
	 * emit an error and terminate. We assume that if a user explicitly
	 * provides a configuration file, they'll want to make sure it's
	 * used and not fall back to any of the defaults.
*/ if (config_file[0]) { strncpy(config_file_path, config_file, MAXPGPATH); canonicalize_path(config_file_path); if (stat(config_file_path, &stat_config) != 0) { log_err(_("provided configuration file \"%s\" not found: %s\n"), config_file, strerror(errno) ); exit(ERR_BAD_CONFIG); } if (verbose == true) { log_notice(_("using configuration file \"%s\"\n"), config_file); } config_file_provided = true; config_file_found = true; } /* * If no configuration file was provided, attempt to find a default file * in this order: * - current directory * - /etc/repmgr.conf * - default sysconfdir * * here we just check for the existence of the file; parse_config() * will handle read errors etc. */ if (config_file_provided == false) { char my_exec_path[MAXPGPATH]; char sysconf_etc_path[MAXPGPATH]; /* 1. "./repmgr.conf" */ if (verbose == true) { log_notice(_("looking for configuration file in current directory\n")); } snprintf(config_file_path, MAXPGPATH, "./%s", CONFIG_FILE_NAME); canonicalize_path(config_file_path); if (stat(config_file_path, &stat_config) == 0) { config_file_found = true; goto end_search; } /* 2. "/etc/repmgr.conf" */ if (verbose == true) { log_notice(_("looking for configuration file in /etc\n")); } snprintf(config_file_path, MAXPGPATH, "/etc/%s", CONFIG_FILE_NAME); if (stat(config_file_path, &stat_config) == 0) { config_file_found = true; goto end_search; } /* 3. 
default sysconfdir */ if (find_my_exec(argv0, my_exec_path) < 0) { fprintf(stderr, _("%s: could not find own program executable\n"), argv0); exit(EXIT_FAILURE); } get_etc_path(my_exec_path, sysconf_etc_path); if (verbose == true) { log_notice(_("looking for configuration file in %s"), sysconf_etc_path); } snprintf(config_file_path, MAXPGPATH, "%s/%s", sysconf_etc_path, CONFIG_FILE_NAME); if (stat(config_file_path, &stat_config) == 0) { config_file_found = true; goto end_search; } end_search: if (config_file_found == true) { if (verbose == true) { log_notice(_("configuration file found at: %s\n"), config_file_path); } } else { if (verbose == true) { log_notice(_("no configuration file provided or found\n")); } } } return parse_config(options); } /* * Parse configuration file; if any errors are encountered, * list them and exit. * * Ensure any default values set here are synced with repmgr.conf.sample * and any other documentation. */ bool parse_config(t_configuration_options *options) { FILE *fp; char *s, buf[MAXLINELENGTH]; char name[MAXLEN]; char value[MAXLEN]; /* For sanity-checking provided conninfo string */ PQconninfoOption *conninfo_options; char *conninfo_errmsg = NULL; /* Collate configuration file errors here for friendlier reporting */ static ErrorList config_errors = { NULL, NULL }; /* Initialize configuration options with sensible defaults * note: the default log level is set in log.c and does not need * to be initialised here */ memset(options->cluster_name, 0, sizeof(options->cluster_name)); options->node = -1; options->upstream_node = NO_UPSTREAM_NODE; options->use_replication_slots = 0; memset(options->conninfo, 0, sizeof(options->conninfo)); options->failover = MANUAL_FAILOVER; options->priority = DEFAULT_PRIORITY; memset(options->node_name, 0, sizeof(options->node_name)); memset(options->promote_command, 0, sizeof(options->promote_command)); memset(options->follow_command, 0, sizeof(options->follow_command)); memset(options->rsync_options, 0, 
sizeof(options->rsync_options)); memset(options->ssh_options, 0, sizeof(options->ssh_options)); memset(options->pg_bindir, 0, sizeof(options->pg_bindir)); memset(options->pg_ctl_options, 0, sizeof(options->pg_ctl_options)); memset(options->pg_basebackup_options, 0, sizeof(options->pg_basebackup_options)); /* default master_response_timeout is 60 seconds */ options->master_response_timeout = 60; /* default to 6 reconnection attempts at intervals of 10 seconds */ options->reconnect_attempts = 6; options->reconnect_interval = 10; options->monitor_interval_secs = 2; options->retry_promote_interval_secs = 300; memset(options->event_notification_command, 0, sizeof(options->event_notification_command)); options->tablespace_mapping.head = NULL; options->tablespace_mapping.tail = NULL; /* * If no configuration file available (user didn't specify and none found * in the default locations), return with default values */ if (config_file_found == false) { log_notice(_("no configuration file provided and no default file found - " "continuing with default values\n")); return true; } fp = fopen(config_file_path, "r"); /* * A configuration file has been found, either provided by the user * or found in one of the default locations. If we can't open it, * fail with an error. 
*/ if (fp == NULL) { if (config_file_provided) { log_err(_("unable to open provided configuration file \"%s\"; terminating\n"), config_file_path); } else { log_err(_("unable to open default configuration file \"%s\"; terminating\n"), config_file_path); } exit(ERR_BAD_CONFIG); } /* Read file */ while ((s = fgets(buf, sizeof buf, fp)) != NULL) { bool known_parameter = true; /* Parse name/value pair from line */ parse_line(buf, name, value); /* Skip blank lines */ if (!strlen(name)) continue; /* Skip comments */ if (name[0] == '#') continue; /* Copy into correct entry in parameters struct */ if (strcmp(name, "cluster") == 0) strncpy(options->cluster_name, value, MAXLEN); else if (strcmp(name, "node") == 0) options->node = repmgr_atoi(value, "node", &config_errors); else if (strcmp(name, "upstream_node") == 0) options->upstream_node = repmgr_atoi(value, "upstream_node", &config_errors); else if (strcmp(name, "conninfo") == 0) strncpy(options->conninfo, value, MAXLEN); else if (strcmp(name, "rsync_options") == 0) strncpy(options->rsync_options, value, QUERY_STR_LEN); else if (strcmp(name, "ssh_options") == 0) strncpy(options->ssh_options, value, QUERY_STR_LEN); else if (strcmp(name, "loglevel") == 0) strncpy(options->loglevel, value, MAXLEN); else if (strcmp(name, "logfacility") == 0) strncpy(options->logfacility, value, MAXLEN); else if (strcmp(name, "failover") == 0) { char failoverstr[MAXLEN]; strncpy(failoverstr, value, MAXLEN); if (strcmp(failoverstr, "manual") == 0) { options->failover = MANUAL_FAILOVER; } else if (strcmp(failoverstr, "automatic") == 0) { options->failover = AUTOMATIC_FAILOVER; } else { error_list_append(&config_errors,_("value for 'failover' must be 'automatic' or 'manual'\n")); } } else if (strcmp(name, "priority") == 0) options->priority = repmgr_atoi(value, "priority", &config_errors); else if (strcmp(name, "node_name") == 0) strncpy(options->node_name, value, MAXLEN); else if (strcmp(name, "promote_command") == 0) 
strncpy(options->promote_command, value, MAXLEN); else if (strcmp(name, "follow_command") == 0) strncpy(options->follow_command, value, MAXLEN); else if (strcmp(name, "master_response_timeout") == 0) options->master_response_timeout = repmgr_atoi(value, "master_response_timeout", &config_errors); /* 'primary_response_timeout' as synonym for 'master_response_timeout' - * we'll switch terminology in a future release (3.1?) */ else if (strcmp(name, "primary_response_timeout") == 0) options->master_response_timeout = repmgr_atoi(value, "primary_response_timeout", &config_errors); else if (strcmp(name, "reconnect_attempts") == 0) options->reconnect_attempts = repmgr_atoi(value, "reconnect_attempts", &config_errors); else if (strcmp(name, "reconnect_interval") == 0) options->reconnect_interval = repmgr_atoi(value, "reconnect_interval", &config_errors); else if (strcmp(name, "pg_bindir") == 0) strncpy(options->pg_bindir, value, MAXLEN); else if (strcmp(name, "pg_ctl_options") == 0) strncpy(options->pg_ctl_options, value, MAXLEN); else if (strcmp(name, "pg_basebackup_options") == 0) strncpy(options->pg_basebackup_options, value, MAXLEN); else if (strcmp(name, "logfile") == 0) strncpy(options->logfile, value, MAXLEN); else if (strcmp(name, "monitor_interval_secs") == 0) options->monitor_interval_secs = repmgr_atoi(value, "monitor_interval_secs", &config_errors); else if (strcmp(name, "retry_promote_interval_secs") == 0) options->retry_promote_interval_secs = repmgr_atoi(value, "retry_promote_interval_secs", &config_errors); else if (strcmp(name, "use_replication_slots") == 0) /* XXX we should have a dedicated boolean argument format */ options->use_replication_slots = repmgr_atoi(value, "use_replication_slots", &config_errors); else if (strcmp(name, "event_notification_command") == 0) strncpy(options->event_notification_command, value, MAXLEN); else if (strcmp(name, "event_notifications") == 0) parse_event_notifications_list(options, value); else if (strcmp(name, 
"tablespace_mapping") == 0) tablespace_list_append(options, value); else { known_parameter = false; log_warning(_("%s/%s: unknown name/value pair provided; ignoring\n"), name, value); } /* * Raise an error if a known parameter is provided with an empty value. * Currently there's no reason why empty parameters are needed; if * we want to accept those, we'd need to add stricter default checking, * as currently e.g. an empty `node` value will be converted to '0'. */ if (known_parameter == true && !strlen(value)) { char error_message_buf[MAXLEN] = ""; snprintf(error_message_buf, MAXLEN, _("no value provided for parameter \"%s\""), name); error_list_append(&config_errors, error_message_buf); } } fclose(fp); /* Check config settings */ /* The following checks are for the presence of the parameter */ if (*options->cluster_name == '\0') { error_list_append(&config_errors, _("\"cluster\": parameter was not found\n")); } if (options->node == -1) { error_list_append(&config_errors, _("\"node\": parameter was not found\n")); } if (*options->node_name == '\0') { error_list_append(&config_errors, _("\"node_name\": parameter was not found\n")); } if (*options->conninfo == '\0') { error_list_append(&config_errors, _("\"conninfo\": parameter was not found\n")); } else { /* Sanity check the provided conninfo string * * NOTE: PQconninfoParse() verifies the string format and checks for valid options * but does not sanity check values */ conninfo_options = PQconninfoParse(options->conninfo, &conninfo_errmsg); if (conninfo_options == NULL) { char error_message_buf[MAXLEN] = ""; snprintf(error_message_buf, MAXLEN, _("\"conninfo\": %s"), conninfo_errmsg); error_list_append(&config_errors, error_message_buf); } PQconninfoFree(conninfo_options); } if (config_errors.head != NULL) { exit_with_errors(&config_errors); } return true; } char * trim(char *s) { /* Initialize start, end pointers */ char *s1 = s, *s2 = &s[strlen(s) - 1]; /* If string is empty, no action needed */ if (s2 < s1) return 
s;

	/* Trim and delimit right side (check bounds before dereferencing) */
	while ((s2 >= s1) && isspace((unsigned char) *s2))
		--s2;
	*(s2 + 1) = '\0';

	/* Trim left side */
	while ((s1 < s2) && isspace((unsigned char) *s1))
		++s1;

	/* Copy finished string, including its last character */
	memmove(s, s1, s2 - s1 + 1);
	s[s2 - s1 + 1] = '\0';

	return s;
}

void
parse_line(char *buf, char *name, char *value)
{
	int			i = 0;
	int			j = 0;

	/*
	 * Extract parameter name, if present
	 */
	for (; i < MAXLEN; ++i)
	{
		if (buf[i] == '=')
			break;

		switch(buf[i])
		{
			/* Ignore whitespace */
			case ' ':
			case '\n':
			case '\r':
			case '\t':
				continue;
			default:
				name[j++] = buf[i];
		}
	}
	name[j] = '\0';

	/*
	 * Ignore any whitespace following the '=' sign
	 */
	for (; i < MAXLEN; ++i)
	{
		if (buf[i+1] == ' ')
			continue;
		if (buf[i+1] == '\t')
			continue;

		break;
	}

	/*
	 * Extract parameter value
	 */
	j = 0;
	for (++i; i < MAXLEN; ++i)
		if (buf[i] == '\'')
			continue;
		else if (buf[i] == '#')
			break;
		else if (buf[i] != '\n')
			value[j++] = buf[i];
		else
			break;
	value[j] = '\0';
	trim(value);
}

bool
reload_config(t_configuration_options *orig_options)
{
	PGconn	   *conn;
	t_configuration_options new_options;
	bool		config_changed = false;

	/*
	 * Re-read the configuration file: repmgr.conf
	 */
	log_info(_("reloading configuration file and updating repmgr tables\n"));

	parse_config(&new_options);
	if (new_options.node == -1)
	{
		log_warning(_("unable to parse new configuration, retaining current configuration\n"));
		return false;
	}

	if (strcmp(new_options.cluster_name, orig_options->cluster_name) != 0)
	{
		log_warning(_("unable to change cluster name, retaining current configuration\n"));
		return false;
	}

	if (new_options.node != orig_options->node)
	{
		log_warning(_("unable to change node ID, retaining current configuration\n"));
		return false;
	}

	if (strcmp(new_options.node_name, orig_options->node_name) != 0)
	{
		log_warning(_("unable to change standby name, keeping current configuration\n"));
		return false;
	}

	if (new_options.failover != MANUAL_FAILOVER && new_options.failover != AUTOMATIC_FAILOVER)
	{
		log_warning(_("new value for 'failover' must be 'automatic' or
'manual'\n")); return false; } if (new_options.master_response_timeout <= 0) { log_warning(_("new value for 'master_response_timeout' must be greater than zero\n")); return false; } if (new_options.reconnect_attempts < 0) { log_warning(_("new value for 'reconnect_attempts' must be zero or greater\n")); return false; } if (new_options.reconnect_interval < 0) { log_warning(_("new value for 'reconnect_interval' must be zero or greater\n")); return false; } if (strcmp(orig_options->conninfo, new_options.conninfo) != 0) { /* Test conninfo string */ conn = establish_db_connection(new_options.conninfo, false); if (!conn || (PQstatus(conn) != CONNECTION_OK)) { log_warning(_("'conninfo' string is not valid, retaining current configuration\n")); return false; } PQfinish(conn); } /* * No configuration problems detected - copy any changed values * * NB: keep these in the same order as in config.h to make it easier * to manage them */ /* cluster_name */ if (strcmp(orig_options->cluster_name, new_options.cluster_name) != 0) { strcpy(orig_options->cluster_name, new_options.cluster_name); config_changed = true; } /* conninfo */ if (strcmp(orig_options->conninfo, new_options.conninfo) != 0) { strcpy(orig_options->conninfo, new_options.conninfo); config_changed = true; } /* node */ if (orig_options->node != new_options.node) { orig_options->node = new_options.node; config_changed = true; } /* failover */ if (orig_options->failover != new_options.failover) { orig_options->failover = new_options.failover; config_changed = true; } /* priority */ if (orig_options->priority != new_options.priority) { orig_options->priority = new_options.priority; config_changed = true; } /* node_name */ if (strcmp(orig_options->node_name, new_options.node_name) != 0) { strcpy(orig_options->node_name, new_options.node_name); config_changed = true; } /* promote_command */ if (strcmp(orig_options->promote_command, new_options.promote_command) != 0) { strcpy(orig_options->promote_command, 
new_options.promote_command); config_changed = true; } /* follow_command */ if (strcmp(orig_options->follow_command, new_options.follow_command) != 0) { strcpy(orig_options->follow_command, new_options.follow_command); config_changed = true; } /* * XXX These ones can change with a simple SIGHUP? * * strcpy (orig_options->loglevel, new_options.loglevel); strcpy * (orig_options->logfacility, new_options.logfacility); * * logger_shutdown(); XXX do we have progname here ? logger_init(progname, * orig_options.loglevel, orig_options.logfacility); */ /* rsync_options */ if (strcmp(orig_options->rsync_options, new_options.rsync_options) != 0) { strcpy(orig_options->rsync_options, new_options.rsync_options); config_changed = true; } /* ssh_options */ if (strcmp(orig_options->ssh_options, new_options.ssh_options) != 0) { strcpy(orig_options->ssh_options, new_options.ssh_options); config_changed = true; } /* master_response_timeout */ if (orig_options->master_response_timeout != new_options.master_response_timeout) { orig_options->master_response_timeout = new_options.master_response_timeout; config_changed = true; } /* reconnect_attempts */ if (orig_options->reconnect_attempts != new_options.reconnect_attempts) { orig_options->reconnect_attempts = new_options.reconnect_attempts; config_changed = true; } /* reconnect_interval */ if (orig_options->reconnect_interval != new_options.reconnect_interval) { orig_options->reconnect_interval = new_options.reconnect_interval; config_changed = true; } /* pg_ctl_options */ if (strcmp(orig_options->pg_ctl_options, new_options.pg_ctl_options) != 0) { strcpy(orig_options->pg_ctl_options, new_options.pg_ctl_options); config_changed = true; } /* pg_basebackup_options */ if (strcmp(orig_options->pg_basebackup_options, new_options.pg_basebackup_options) != 0) { strcpy(orig_options->pg_basebackup_options, new_options.pg_basebackup_options); config_changed = true; } /* monitor_interval_secs */ if (orig_options->monitor_interval_secs != 
new_options.monitor_interval_secs) { orig_options->monitor_interval_secs = new_options.monitor_interval_secs; config_changed = true; } /* retry_promote_interval_secs */ if (orig_options->retry_promote_interval_secs != new_options.retry_promote_interval_secs) { orig_options->retry_promote_interval_secs = new_options.retry_promote_interval_secs; config_changed = true; } /* use_replication_slots */ if (orig_options->use_replication_slots != new_options.use_replication_slots) { orig_options->use_replication_slots = new_options.use_replication_slots; config_changed = true; } if (config_changed == true) { log_debug(_("reload_config(): configuration has changed\n")); } else { log_debug(_("reload_config(): configuration has not changed\n")); } return config_changed; } void error_list_append(ErrorList *error_list, char *error_message) { ErrorListCell *cell; cell = (ErrorListCell *) pg_malloc0(sizeof(ErrorListCell)); if (cell == NULL) { log_err(_("unable to allocate memory; terminating.\n")); exit(ERR_BAD_CONFIG); } cell->error_message = pg_malloc0(MAXLEN); strncpy(cell->error_message, error_message, MAXLEN); if (error_list->tail) { error_list->tail->next = cell; } else { error_list->head = cell; } error_list->tail = cell; } /* * Convert provided string to an integer using strtol; * on error, if a callback is provided, pass the error message to that, * otherwise exit */ int repmgr_atoi(const char *value, const char *config_item, ErrorList *error_list) { char *endptr; long longval = 0; char error_message_buf[MAXLEN] = ""; /* It's possible that some versions of strtol() don't treat an empty * string as an error. 
*/ if (*value == '\0') { snprintf(error_message_buf, MAXLEN, _("no value provided for \"%s\""), config_item); } else { errno = 0; longval = strtol(value, &endptr, 10); if (value == endptr || errno) { snprintf(error_message_buf, MAXLEN, _("\"%s\": invalid value (provided: \"%s\")"), config_item, value); } } /* Currently there are no values which could be negative */ if (longval < 0) { snprintf(error_message_buf, MAXLEN, _("\"%s\" must be zero or greater (provided: %s)"), config_item, value); } /* Error message buffer is set */ if (error_message_buf[0] != '\0') { if (error_list == NULL) { log_err("%s\n", error_message_buf); exit(ERR_BAD_CONFIG); } error_list_append(error_list, error_message_buf); } return (int32) longval; } /* * Split argument into old_dir and new_dir and append to tablespace mapping * list. * * Adapted from pg_basebackup.c */ static void tablespace_list_append(t_configuration_options *options, const char *arg) { TablespaceListCell *cell; char *dst; char *dst_ptr; const char *arg_ptr; cell = (TablespaceListCell *) pg_malloc0(sizeof(TablespaceListCell)); if (cell == NULL) { log_err(_("unable to allocate memory; terminating\n")); exit(ERR_BAD_CONFIG); } dst_ptr = dst = cell->old_dir; for (arg_ptr = arg; *arg_ptr; arg_ptr++) { if (dst_ptr - dst >= MAXPGPATH) { log_err(_("directory name too long\n")); exit(ERR_BAD_CONFIG); } if (*arg_ptr == '\\' && *(arg_ptr + 1) == '=') ; /* skip backslash escaping = */ else if (*arg_ptr == '=' && (arg_ptr == arg || *(arg_ptr - 1) != '\\')) { if (*cell->new_dir) { log_err(_("multiple \"=\" signs in tablespace mapping\n")); exit(ERR_BAD_CONFIG); } else { dst = dst_ptr = cell->new_dir; } } else *dst_ptr++ = *arg_ptr; } if (!*cell->old_dir || !*cell->new_dir) { log_err(_("invalid tablespace mapping format \"%s\", must be \"OLDDIR=NEWDIR\"\n"), arg); exit(ERR_BAD_CONFIG); } canonicalize_path(cell->old_dir); canonicalize_path(cell->new_dir); if (options->tablespace_mapping.tail) options->tablespace_mapping.tail->next = cell; 
else options->tablespace_mapping.head = cell; options->tablespace_mapping.tail = cell; } /* * parse_event_notifications_list() * * */ static void parse_event_notifications_list(t_configuration_options *options, const char *arg) { const char *arg_ptr; char event_type_buf[MAXLEN] = ""; char *dst_ptr = event_type_buf; for (arg_ptr = arg; arg_ptr <= (arg + strlen(arg)); arg_ptr++) { /* ignore whitespace */ if (*arg_ptr == ' ' || *arg_ptr == '\t') { continue; } /* * comma (or end-of-string) should mark the end of an event type - * just as long as there was something preceding it */ if ((*arg_ptr == ',' || *arg_ptr == '\0') && event_type_buf[0] != '\0') { EventNotificationListCell *cell; cell = (EventNotificationListCell *) pg_malloc0(sizeof(EventNotificationListCell)); if (cell == NULL) { log_err(_("unable to allocate memory; terminating\n")); exit(ERR_BAD_CONFIG); } strncpy(cell->event_type, event_type_buf, MAXLEN); if (options->event_notifications.tail) { options->event_notifications.tail->next = cell; } else { options->event_notifications.head = cell; } options->event_notifications.tail = cell; memset(event_type_buf, 0, MAXLEN); dst_ptr = event_type_buf; } /* ignore duplicated commas */ else if (*arg_ptr == ',') { continue; } else { *dst_ptr++ = *arg_ptr; } } } static void exit_with_errors(ErrorList *config_errors) { ErrorListCell *cell; log_err(_("%s: following errors were found in the configuration file.\n"), progname()); for (cell = config_errors->head; cell; cell = cell->next) { log_err("%s\n", cell->error_message); } exit(ERR_BAD_CONFIG); } repmgr-3.0.3/config.h000066400000000000000000000060551264264412200144400ustar00rootroot00000000000000/* * config.h * Copyright (c) 2ndQuadrant, 2010-2015 * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. 
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program. If not, see . * */ #ifndef _REPMGR_CONFIG_H_ #define _REPMGR_CONFIG_H_ #include "postgres_fe.h" #include "strutil.h" #define CONFIG_FILE_NAME "repmgr.conf" typedef struct EventNotificationListCell { struct EventNotificationListCell *next; char event_type[MAXLEN]; } EventNotificationListCell; typedef struct EventNotificationList { EventNotificationListCell *head; EventNotificationListCell *tail; } EventNotificationList; typedef struct TablespaceListCell { struct TablespaceListCell *next; char old_dir[MAXPGPATH]; char new_dir[MAXPGPATH]; } TablespaceListCell; typedef struct TablespaceList { TablespaceListCell *head; TablespaceListCell *tail; } TablespaceList; typedef struct { char cluster_name[MAXLEN]; int node; int upstream_node; char conninfo[MAXLEN]; int failover; int priority; char node_name[MAXLEN]; char promote_command[MAXLEN]; char follow_command[MAXLEN]; char loglevel[MAXLEN]; char logfacility[MAXLEN]; char rsync_options[QUERY_STR_LEN]; char ssh_options[QUERY_STR_LEN]; int master_response_timeout; int reconnect_attempts; int reconnect_interval; char pg_bindir[MAXLEN]; char pg_ctl_options[MAXLEN]; char pg_basebackup_options[MAXLEN]; char logfile[MAXLEN]; int monitor_interval_secs; int retry_promote_interval_secs; int use_replication_slots; char event_notification_command[MAXLEN]; EventNotificationList event_notifications; TablespaceList tablespace_mapping; } t_configuration_options; #define T_CONFIGURATION_OPTIONS_INITIALIZER { "", -1, NO_UPSTREAM_NODE, "", MANUAL_FAILOVER, -1, "", "", "", "", "", "", "", -1, -1, -1, "", "", "", "", 0, 0, 0, "", { NULL, NULL }, {NULL, NULL} } typedef struct ErrorListCell { struct ErrorListCell 
*next; char *error_message; } ErrorListCell; typedef struct ErrorList { ErrorListCell *head; ErrorListCell *tail; } ErrorList; void set_progname(const char *argv0); const char * progname(void); bool load_config(const char *config_file, bool verbose, t_configuration_options *options, char *argv0); bool reload_config(t_configuration_options *orig_options); bool parse_config(t_configuration_options *options); void parse_line(char *buff, char *name, char *value); char *trim(char *s); void error_list_append(ErrorList *error_list, char *error_message); int repmgr_atoi(const char *s, const char *config_item, ErrorList *error_list); #endif repmgr-3.0.3/dbutils.c000066400000000000000000001050471264264412200146350ustar00rootroot00000000000000/* * dbutils.c - Database connection/management functions * Copyright (C) 2ndQuadrant, 2010-2015 * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program. If not, see . 
* */ #include #include #include #include "repmgr.h" #include "config.h" #include "strutil.h" #include "log.h" char repmgr_schema[MAXLEN] = ""; char repmgr_schema_quoted[MAXLEN] = ""; PGconn * establish_db_connection(const char *conninfo, const bool exit_on_error) { /* Make a connection to the database */ PGconn *conn = NULL; char connection_string[MAXLEN]; strcpy(connection_string, conninfo); strcat(connection_string, " fallback_application_name='repmgr'"); log_debug(_("connecting to: '%s'\n"), connection_string); conn = PQconnectdb(connection_string); /* Check to see that the backend connection was successfully made */ if ((PQstatus(conn) != CONNECTION_OK)) { log_err(_("connection to database failed: %s\n"), PQerrorMessage(conn)); if (exit_on_error) { PQfinish(conn); exit(ERR_DB_CON); } } return conn; } PGconn * establish_db_connection_by_params(const char *keywords[], const char *values[], const bool exit_on_error) { /* Make a connection to the database */ PGconn *conn = PQconnectdbParams(keywords, values, true); /* Check to see that the backend connection was successfully made */ if ((PQstatus(conn) != CONNECTION_OK)) { log_err(_("connection to database failed: %s\n"), PQerrorMessage(conn)); if (exit_on_error) { PQfinish(conn); exit(ERR_DB_CON); } } return conn; } bool begin_transaction(PGconn *conn) { PGresult *res; log_verbose(LOG_DEBUG, "begin_transaction()\n"); res = PQexec(conn, "BEGIN"); if (PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("Unable to begin transaction: %s\n"), PQerrorMessage(conn)); PQclear(res); return false; } PQclear(res); return true; } bool commit_transaction(PGconn *conn) { PGresult *res; log_verbose(LOG_DEBUG, "commit_transaction()\n"); res = PQexec(conn, "COMMIT"); if (PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("Unable to commit transaction: %s\n"), PQerrorMessage(conn)); PQclear(res); return false; } PQclear(res); return true; } bool rollback_transaction(PGconn *conn) { PGresult *res; log_verbose(LOG_DEBUG, 
"rollback_transaction()\n"); res = PQexec(conn, "ROLLBACK"); if (PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("Unable to rollback transaction: %s\n"), PQerrorMessage(conn)); PQclear(res); return false; } PQclear(res); return true; } bool check_cluster_schema(PGconn *conn) { PGresult *res; char sqlquery[QUERY_STR_LEN]; sqlquery_snprintf(sqlquery, "SELECT 1 FROM pg_namespace WHERE nspname = '%s'", get_repmgr_schema()); log_verbose(LOG_DEBUG, "check_cluster_schema(): %s\n", sqlquery); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("check_cluster_schema(): unable to check cluster schema: %s\n"), PQerrorMessage(conn)); PQclear(res); return false; } if (PQntuples(res) == 0) { /* schema doesn't exist */ log_debug(_("check_cluster_schema(): schema '%s' doesn't exist\n"), get_repmgr_schema()); PQclear(res); return false; } PQclear(res); return true; } int is_standby(PGconn *conn) { PGresult *res; int result = 0; char *sqlquery = "SELECT pg_catalog.pg_is_in_recovery()"; log_verbose(LOG_DEBUG, "is_standby(): %s\n", sqlquery); res = PQexec(conn, sqlquery); if (res == NULL || PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("Unable to query server mode: %s\n"), PQerrorMessage(conn)); result = -1; } else if (PQntuples(res) == 1 && strcmp(PQgetvalue(res, 0, 0), "t") == 0) { result = 1; } PQclear(res); return result; } /* check the PQStatus and try to 'select 1' to confirm good connection */ bool is_pgup(PGconn *conn, int timeout) { char sqlquery[QUERY_STR_LEN]; /* Check the connection status twice in case it changes after reset */ bool twice = false; /* Check the connection status twice in case it changes after reset */ for (;;) { if (PQstatus(conn) != CONNECTION_OK) { if (twice) return false; PQreset(conn); /* reconnect */ twice = true; } else { /* * Send a SELECT 1 just to check if the connection is OK */ if (!cancel_query(conn, timeout)) goto failed; if (wait_connection_availability(conn, timeout) != 1) goto failed; 
sqlquery_snprintf(sqlquery, "SELECT 1"); if (PQsendQuery(conn, sqlquery) == 0) { log_warning(_("PQsendQuery: Query could not be sent to primary. %s\n"), PQerrorMessage(conn)); goto failed; } if (wait_connection_availability(conn, timeout) != 1) goto failed; break; failed: /* * we need to retry, because we might just have lost the * connection once */ if (twice) return false; PQreset(conn); /* reconnect */ twice = true; } } return true; } /* * Return the id of the active master node, or NODE_NOT_FOUND if no * record available. * * This reports the value stored in the database only and * does not verify whether the node is actually available */ int get_master_node_id(PGconn *conn, char *cluster) { char sqlquery[QUERY_STR_LEN]; PGresult *res; int retval; sqlquery_snprintf(sqlquery, "SELECT id " " FROM %s.repl_nodes " " WHERE cluster = '%s' " " AND type = 'master' " " AND active IS TRUE ", get_repmgr_schema_quoted(conn), cluster); log_verbose(LOG_DEBUG, "get_master_node_id():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("get_master_node_id(): query failed\n%s\n"), PQerrorMessage(conn)); retval = NODE_NOT_FOUND; } else if (PQntuples(res) == 0) { log_warning(_("get_master_node_id(): no active primary found\n")); retval = NODE_NOT_FOUND; } else { retval = atoi(PQgetvalue(res, 0, 0)); } PQclear(res); return retval; } /* * Return the server version number for the connection provided * * If server_version is not NULL, the human-readable version string is * copied there. */ int get_server_version(PGconn *conn, char *server_version) { PGresult *res; int server_version_num; res = PQexec(conn, "SELECT current_setting('server_version_num'), " " current_setting('server_version')"); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("unable to determine server version number:\n%s"), PQerrorMessage(conn)); PQclear(res); return -1; } /* column 0 is server_version_num, column 1 is server_version */ if (server_version != NULL) strcpy(server_version, PQgetvalue(res, 0, 1)); server_version_num = atoi(PQgetvalue(res, 0, 0)); PQclear(res); return server_version_num; } int guc_set(PGconn *conn, const char *parameter, const char *op, const char *value) { 
PGresult *res; char sqlquery[QUERY_STR_LEN]; int retval = 1; sqlquery_snprintf(sqlquery, "SELECT true FROM pg_settings " " WHERE name = '%s' AND setting %s '%s'", parameter, op, value); log_verbose(LOG_DEBUG, "guc_set():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("guc_set(): unable to execute query\n%s\n"), PQerrorMessage(conn)); retval = -1; } else if (PQntuples(res) == 0) { retval = 0; } PQclear(res); return retval; } /** * Just like guc_set except with an extra parameter containing the name of * the pg datatype so that the comparison can be done properly. */ int guc_set_typed(PGconn *conn, const char *parameter, const char *op, const char *value, const char *datatype) { PGresult *res; char sqlquery[QUERY_STR_LEN]; int retval = 1; sqlquery_snprintf(sqlquery, "SELECT true FROM pg_settings " " WHERE name = '%s' AND setting::%s %s '%s'::%s", parameter, datatype, op, value, datatype); log_verbose(LOG_DEBUG, "guc_set_typed():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("guc_set_typed(): unable to execute query\n%s\n"), PQerrorMessage(conn)); retval = -1; } else if (PQntuples(res) == 0) { retval = 0; } PQclear(res); return retval; } bool get_cluster_size(PGconn *conn, char *size) { PGresult *res; char sqlquery[QUERY_STR_LEN]; sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_size_pretty(SUM(pg_catalog.pg_database_size(oid))::bigint) " " FROM pg_database "); log_verbose(LOG_DEBUG, "get_cluster_size():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (res == NULL || PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("get_cluster_size(): unable to execute query\n%s\n"), PQerrorMessage(conn)); PQclear(res); return false; } strncpy(size, PQgetvalue(res, 0, 0), MAXLEN); PQclear(res); return true; } bool get_pg_setting(PGconn *conn, const char *setting, char *output) { char sqlquery[QUERY_STR_LEN]; PGresult *res; int i; /* only report success once a matching row has been found */ bool success = false; 
sqlquery_snprintf(sqlquery, "SELECT name, setting " " FROM pg_settings WHERE name = '%s'", setting); log_verbose(LOG_DEBUG, "get_pg_setting(): %s\n", sqlquery); res = PQexec(conn, sqlquery); if (res == NULL || PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("get_pg_setting() - PQexec failed: %s"), PQerrorMessage(conn)); PQclear(res); return false; } for (i = 0; i < PQntuples(res); i++) { if (strcmp(PQgetvalue(res, i, 0), setting) == 0) { strncpy(output, PQgetvalue(res, i, 1), MAXLEN); success = true; break; } else { /* XXX highly unlikely this would ever happen */ log_err(_("get_pg_setting(): unknown parameter \"%s\""), PQgetvalue(res, i, 0)); } } if (success == true) { log_debug(_("get_pg_setting(): returned value is \"%s\"\n"), output); } PQclear(res); return success; } /* * get_upstream_connection() * * Returns connection to node's upstream node * * NOTE: will attempt to connect even if node is marked as inactive */ PGconn * get_upstream_connection(PGconn *standby_conn, char *cluster, int node_id, int *upstream_node_id_ptr, char *upstream_conninfo_out) { PGconn *upstream_conn = NULL; PGresult *res; char sqlquery[QUERY_STR_LEN]; char upstream_conninfo_stack[MAXCONNINFO]; char *upstream_conninfo = &*upstream_conninfo_stack; /* * If the caller wanted to get a copy of the connection info string, sub * out the local stack pointer for the pointer passed by the caller. 
*/ if (upstream_conninfo_out != NULL) upstream_conninfo = upstream_conninfo_out; sqlquery_snprintf(sqlquery, " SELECT un.conninfo, un.name, un.id " " FROM %s.repl_nodes un " "INNER JOIN %s.repl_nodes n " " ON (un.id = n.upstream_node_id AND un.cluster = n.cluster)" " WHERE n.cluster = '%s' " " AND n.id = %i ", get_repmgr_schema_quoted(standby_conn), get_repmgr_schema_quoted(standby_conn), cluster, node_id); log_verbose(LOG_DEBUG, "get_upstream_connection():\n%s\n", sqlquery); res = PQexec(standby_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("unable to get conninfo for upstream server\n%s\n"), PQerrorMessage(standby_conn)); PQclear(res); return NULL; } if (!PQntuples(res)) { log_notice(_("no record found for upstream server\n")); PQclear(res); return NULL; } strncpy(upstream_conninfo, PQgetvalue(res, 0, 0), MAXCONNINFO); /* columns are: 0 = conninfo, 1 = name, 2 = id */ if (upstream_node_id_ptr != NULL) *upstream_node_id_ptr = atoi(PQgetvalue(res, 0, 2)); PQclear(res); log_verbose(LOG_DEBUG, "get_upstream_connection(): conninfo is \"%s\"\n", upstream_conninfo); upstream_conn = establish_db_connection(upstream_conninfo, false); if (PQstatus(upstream_conn) != CONNECTION_OK) { log_err(_("unable to connect to upstream node: %s\n"), PQerrorMessage(upstream_conn)); /* release the failed connection object before returning */ PQfinish(upstream_conn); return NULL; } return upstream_conn; } /* * Read the node list from the local node and attempt to connect to each node * in turn to definitely establish if it's the cluster primary. * * The node list is returned in the order which makes it likely that the * current primary will be returned first, reducing the number of speculative * connections which need to be made to other nodes. * * If master_conninfo_out points to allocated memory of MAXCONNINFO in length, * the primary server's conninfo string will be copied there. 
*/ PGconn * get_master_connection(PGconn *standby_conn, char *cluster, int *master_id, char *master_conninfo_out) { PGconn *remote_conn = NULL; PGresult *res; char sqlquery[QUERY_STR_LEN]; char remote_conninfo_stack[MAXCONNINFO]; char *remote_conninfo = &*remote_conninfo_stack; int i, node_id; if (master_id != NULL) { *master_id = NODE_NOT_FOUND; } /* * If the caller wanted to get a copy of the connection info string, sub * out the local stack pointer for the pointer passed by the caller. */ if (master_conninfo_out != NULL) remote_conninfo = master_conninfo_out; /* find all nodes belonging to this cluster */ log_info(_("retrieving node list for cluster '%s'\n"), cluster); sqlquery_snprintf(sqlquery, " SELECT id, conninfo, " " CASE WHEN type = 'master' THEN 1 ELSE 2 END AS type_priority" " FROM %s.repl_nodes " " WHERE cluster = '%s' " " AND type != 'witness' " "ORDER BY active DESC, type_priority, priority, id", get_repmgr_schema_quoted(standby_conn), cluster); log_verbose(LOG_DEBUG, "get_master_connection():\n%s\n", sqlquery); res = PQexec(standby_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("unable to retrieve node records: %s\n"), PQerrorMessage(standby_conn)); PQclear(res); return NULL; } for (i = 0; i < PQntuples(res); i++) { int is_node_standby; /* initialize with the values of the current node being processed */ node_id = atoi(PQgetvalue(res, i, 0)); strncpy(remote_conninfo, PQgetvalue(res, i, 1), MAXCONNINFO); log_verbose(LOG_INFO, _("checking role of cluster node '%i'\n"), node_id); remote_conn = establish_db_connection(remote_conninfo, false); if (PQstatus(remote_conn) != CONNECTION_OK) { /* release the failed connection object before trying the next node */ PQfinish(remote_conn); continue; } is_node_standby = is_standby(remote_conn); if (is_node_standby == -1) { log_err(_("unable to retrieve recovery state from node %i:\n%s\n"), node_id, PQerrorMessage(remote_conn)); PQfinish(remote_conn); continue; } /* if is_standby() returns 0, queried node is the master */ if (is_node_standby == 0) { PQclear(res); log_debug(_("get_master_connection(): current master node is %i\n"), node_id); if (master_id != NULL) { *master_id = node_id; } return remote_conn; } /* if it is a standby, close the connection and continue */ PQfinish(remote_conn); } /* * 
If we finish this loop without finding a master then either we don't have * the info or the master has failed (or we reached max_connections or * superuser_reserved_connections, anything else I'm missing?). * * Probably we will need to check the error to know if we need to start * failover procedure or just fix some situation on the standby. */ PQclear(res); return NULL; } /* * wait until current query finishes ignoring any results, this could be an * async command or a cancellation of a query * return 1 if OK; 0 if an error occurred while reading from the connection; * -1 if the timeout was reached or select() failed */ int wait_connection_availability(PGconn *conn, long long timeout) { PGresult *res; fd_set read_set; int sock = PQsocket(conn); struct timeval tmout, before, after; struct timezone tz; /* recalc to microseconds */ timeout *= 1000000; while (timeout > 0) { if (PQconsumeInput(conn) == 0) { log_warning(_("wait_connection_availability(): could not receive data from connection. %s\n"), PQerrorMessage(conn)); return 0; } if (PQisBusy(conn) == 0) { do { res = PQgetResult(conn); PQclear(res); } while (res != NULL); break; } tmout.tv_sec = 0; tmout.tv_usec = 250000; FD_ZERO(&read_set); FD_SET(sock, &read_set); gettimeofday(&before, &tz); /* first argument to select() is the highest descriptor number plus one */ if (select(sock + 1, &read_set, NULL, NULL, &tmout) == -1) { log_warning( _("wait_connection_availability(): select() returned with error\n%s\n"), strerror(errno)); return -1; } gettimeofday(&after, &tz); timeout -= (after.tv_sec * 1000000 + after.tv_usec) - (before.tv_sec * 1000000 + before.tv_usec); } if (timeout >= 0) { return 1; } log_warning(_("wait_connection_availability(): timeout reached\n")); return -1; } bool cancel_query(PGconn *conn, int timeout) { char errbuf[ERRBUFF_SIZE]; PGcancel *pgcancel; if (wait_connection_availability(conn, timeout) != 1) return false; pgcancel = PQgetCancel(conn); if (pgcancel == NULL) return false; /* * PQcancel can only return 0 if socket()/connect()/send() fails, in any * of those cases we can assume something bad happened to the connection */ if 
(PQcancel(pgcancel, errbuf, ERRBUFF_SIZE) == 0) { log_warning(_("Can't stop current query: %s\n"), errbuf); PQfreeCancel(pgcancel); return false; } PQfreeCancel(pgcancel); return true; } /* Return the repmgr schema as an unmodified string * This is useful for displaying the schema name in log messages; * however, for inclusion in SQL statements, get_repmgr_schema_quoted() * should always be used. */ char * get_repmgr_schema(void) { return repmgr_schema; } char * get_repmgr_schema_quoted(PGconn *conn) { if (strcmp(repmgr_schema_quoted, "") == 0) { char *identifier = PQescapeIdentifier(conn, repmgr_schema, strlen(repmgr_schema)); maxlen_snprintf(repmgr_schema_quoted, "%s", identifier); PQfreemem(identifier); } return repmgr_schema_quoted; } bool create_replication_slot(PGconn *conn, char *slot_name) { char sqlquery[QUERY_STR_LEN]; PGresult *res; /* * Check whether slot exists already; if it exists and is active, that * means another active standby is using it, which creates an error situation; * if not we can reuse it as-is */ sqlquery_snprintf(sqlquery, "SELECT active, slot_type " " FROM pg_replication_slots " " WHERE slot_name = '%s' ", slot_name); log_verbose(LOG_DEBUG, "create_replication_slot():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("unable to query pg_replication_slots: %s\n"), PQerrorMessage(conn)); PQclear(res); return false; } if (PQntuples(res)) { if (strcmp(PQgetvalue(res, 0, 1), "physical") != 0) { log_err(_("Slot '%s' exists and is not a physical slot\n"), slot_name); PQclear(res); /* res must not be used after PQclear() */ return false; } if (strcmp(PQgetvalue(res, 0, 0), "f") == 0) { PQclear(res); log_debug("Replication slot '%s' exists but is inactive; reusing\n", slot_name); return true; } PQclear(res); log_err(_("Slot '%s' already exists as an active slot\n"), slot_name); return false; } /* no existing slot found: release the result before reusing res */ PQclear(res); sqlquery_snprintf(sqlquery, "SELECT * FROM pg_create_physical_replication_slot('%s')", slot_name); log_debug(_("create_replication_slot(): Creating 
slot '%s' on primary\n"), slot_name); log_verbose(LOG_DEBUG, "create_replication_slot():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("unable to create slot '%s' on the primary node: %s\n"), slot_name, PQerrorMessage(conn)); PQclear(res); return false; } PQclear(res); return true; } bool drop_replication_slot(PGconn *conn, char *slot_name) { char sqlquery[QUERY_STR_LEN]; PGresult *res; sqlquery_snprintf(sqlquery, "SELECT pg_drop_replication_slot('%s')", slot_name); log_verbose(LOG_DEBUG, "drop_replication_slot():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("unable to drop replication slot \"%s\":\n %s\n"), slot_name, PQerrorMessage(conn)); PQclear(res); return false; } /* release the result on the success path too */ PQclear(res); log_verbose(LOG_DEBUG, "replication slot \"%s\" successfully dropped\n", slot_name); return true; } bool start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint) { char sqlquery[QUERY_STR_LEN]; PGresult *res; sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_xlogfile_name(pg_catalog.pg_start_backup('repmgr_standby_clone_%ld', %s))", time(NULL), fast_checkpoint ? 
"TRUE" : "FALSE"); log_verbose(LOG_DEBUG, "start_backup():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("unable to start backup: %s\n"), PQerrorMessage(conn)); PQclear(res); return false; } if (first_wal_segment != NULL) { char *first_wal_seg_pq = PQgetvalue(res, 0, 0); size_t buf_sz = strlen(first_wal_seg_pq); first_wal_segment = pg_malloc0(buf_sz + 1); xsnprintf(first_wal_segment, buf_sz + 1, "%s", first_wal_seg_pq); } PQclear(res); return true; } bool stop_backup(PGconn *conn, char *last_wal_segment) { char sqlquery[QUERY_STR_LEN]; PGresult *res; sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_xlogfile_name(pg_catalog.pg_stop_backup())"); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("unable to stop backup: %s\n"), PQerrorMessage(conn)); PQclear(res); return false; } if (last_wal_segment != NULL) { char *last_wal_seg_pq = PQgetvalue(res, 0, 0); size_t buf_sz = strlen(last_wal_seg_pq); last_wal_segment = pg_malloc0(buf_sz + 1); xsnprintf(last_wal_segment, buf_sz + 1, "%s", last_wal_seg_pq); } PQclear(res); return true; } bool set_config_bool(PGconn *conn, const char *config_param, bool state) { char sqlquery[QUERY_STR_LEN]; PGresult *res; sqlquery_snprintf(sqlquery, "SET %s TO %s", config_param, state ? 
"TRUE" : "FALSE"); log_verbose(LOG_DEBUG, "set_config_bool():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_COMMAND_OK) { log_err("unable to set '%s': %s\n", config_param, PQerrorMessage(conn)); PQclear(res); return false; } PQclear(res); return true; } /* * copy_configuration() * * Copy records in master's `repl_nodes` table to witness database * * This is used by `repmgr` when setting up the witness database, and * `repmgrd` after a failover event occurs */ bool copy_configuration(PGconn *masterconn, PGconn *witnessconn, char *cluster_name) { char sqlquery[MAXLEN]; PGresult *res; int i; sqlquery_snprintf(sqlquery, "TRUNCATE TABLE %s.repl_nodes", get_repmgr_schema_quoted(witnessconn)); log_verbose(LOG_DEBUG, "copy_configuration():\n%s\n", sqlquery); res = PQexec(witnessconn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("Unable to truncate witness servers's repl_nodes table:\n%s\n"), PQerrorMessage(witnessconn)); return false; } sqlquery_snprintf(sqlquery, "SELECT id, type, upstream_node_id, name, conninfo, priority, slot_name FROM %s.repl_nodes", get_repmgr_schema_quoted(masterconn)); log_verbose(LOG_DEBUG, "copy_configuration():\n%s\n", sqlquery); res = PQexec(masterconn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err("Unable to retrieve node records from master:\n%s\n", PQerrorMessage(masterconn)); PQclear(res); return false; } for (i = 0; i < PQntuples(res); i++) { bool node_record_created; log_verbose(LOG_DEBUG, "copy_configuration(): writing node record for node %s (id: %s)\n", PQgetvalue(res, i, 4), PQgetvalue(res, i, 0)); node_record_created = create_node_record(witnessconn, "copy_configuration", atoi(PQgetvalue(res, i, 0)), PQgetvalue(res, i, 1), strlen(PQgetvalue(res, i, 2)) ? atoi(PQgetvalue(res, i, 2)) : NO_UPSTREAM_NODE, cluster_name, PQgetvalue(res, i, 3), PQgetvalue(res, i, 4), atoi(PQgetvalue(res, i, 5)), strlen(PQgetvalue(res, i, 6)) ? 
PQgetvalue(res, i, 6) : NULL ); if (node_record_created == false) { PQclear(res); log_err("Unable to copy node record to witness database\n%s\n", PQerrorMessage(witnessconn)); return false; } } PQclear(res); return true; } /* * create_node_record() * * Create an entry in the `repl_nodes` table. * * XXX we should pass the record parameters as a struct. */ bool create_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name) { char sqlquery[QUERY_STR_LEN]; char upstream_node_id[MAXLEN]; char slot_name_buf[MAXLEN]; PGresult *res; if (upstream_node == NO_UPSTREAM_NODE) { /* * No explicit upstream node id provided for standby - attempt to * get primary node id */ if (strcmp(type, "standby") == 0) { int primary_node_id = get_master_node_id(conn, cluster_name); maxlen_snprintf(upstream_node_id, "%i", primary_node_id); } else { maxlen_snprintf(upstream_node_id, "%s", "NULL"); } } else { maxlen_snprintf(upstream_node_id, "%i", upstream_node); } if (slot_name != NULL && slot_name[0]) { maxlen_snprintf(slot_name_buf, "'%s'", slot_name); } else { maxlen_snprintf(slot_name_buf, "%s", "NULL"); } /* XXX convert to placeholder query */ sqlquery_snprintf(sqlquery, "INSERT INTO %s.repl_nodes " " (id, type, upstream_node_id, cluster, " " name, conninfo, slot_name, priority) " "VALUES (%i, '%s', %s, '%s', '%s', '%s', %s, %i) ", get_repmgr_schema_quoted(conn), node, type, upstream_node_id, cluster_name, node_name, conninfo, slot_name_buf, priority); log_verbose(LOG_DEBUG, "create_node_record(): %s\n", sqlquery); if (action != NULL) { log_verbose(LOG_DEBUG, "create_node_record(): action is \"%s\"\n", action); } res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("Unable to create node record\n%s\n"), PQerrorMessage(conn)); PQclear(res); return false; } PQclear(res); return true; } bool delete_node_record(PGconn *conn, int node, char 
*action) { char sqlquery[QUERY_STR_LEN]; PGresult *res; sqlquery_snprintf(sqlquery, "DELETE FROM %s.repl_nodes " " WHERE id = %d", get_repmgr_schema_quoted(conn), node); log_verbose(LOG_DEBUG, "delete_node_record(): %s\n", sqlquery); if (action != NULL) { log_verbose(LOG_DEBUG, "delete_node_record(): action is \"%s\"\n", action); } res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("Unable to delete node record: %s\n"), PQerrorMessage(conn)); PQclear(res); return false; } PQclear(res); return true; } /* * create_event_record() * * If `conn` is not NULL, insert a record into the events table. * * If configuration parameter `event_notification_command` is set, also * attempt to execute that command. * * Returns true if all operations succeeded, false if one or more failed. * * Note this function may be called with `conn` set to NULL in cases where * the master node is not available and it's therefore not possible to write * an event record. In this case, if `event_notification_command` is set, a * user-defined notification will be generated; if not, this function will have * no effect. */ bool create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details) { char sqlquery[QUERY_STR_LEN]; PGresult *res; char event_timestamp[MAXLEN] = ""; bool success = true; struct tm ts; /* Only attempt to write a record if a connection handle was provided. Also check that the repmgr schema has been properly initialised - if not it means no configuration file was provided, which can happen with e.g. `repmgr standby clone`, and we won't know which schema to write to. */ if (conn != NULL && strcmp(repmgr_schema, DEFAULT_REPMGR_SCHEMA_PREFIX) != 0) { int n_node_id = htonl(node_id); char *t_successful = successful ? 
"TRUE" : "FALSE"; const char *values[4] = { (char *)&n_node_id, event, t_successful, details }; int lengths[4] = { sizeof(n_node_id), 0, 0, 0 }; int binary[4] = {1, 0, 0, 0}; sqlquery_snprintf(sqlquery, " INSERT INTO %s.repl_events ( " " node_id, " " event, " " successful, " " details " " ) " " VALUES ($1, $2, $3, $4) " " RETURNING event_timestamp ", get_repmgr_schema_quoted(conn)); log_verbose(LOG_DEBUG, "create_event_record():\n%s\n", sqlquery); res = PQexecParams(conn, sqlquery, 4, NULL, values, lengths, binary, 0); if (!res || PQresultStatus(res) != PGRES_TUPLES_OK) { log_warning(_("Unable to create event record: %s\n"), PQerrorMessage(conn)); success = false; } else { /* Store timestamp to send to the notification command */ strncpy(event_timestamp, PQgetvalue(res, 0, 0), MAXLEN); log_verbose(LOG_DEBUG, "create_event_record(): Event timestamp is \"%s\"\n", event_timestamp); } PQclear(res); } /* * If no database connection provided, or the query failed, generate a * current timestamp ourselves. This isn't quite the same * format as PostgreSQL, but is close enough for diagnostic use. */ if (!strlen(event_timestamp)) { time_t now; time(&now); ts = *localtime(&now); strftime(event_timestamp, MAXLEN, "%Y-%m-%d %H:%M:%S%z", &ts); } /* an event notification command was provided - parse and execute it */ if (strlen(options->event_notification_command)) { char parsed_command[MAXPGPATH]; const char *src_ptr; char *dst_ptr; char *end_ptr; int r; /* * If configuration option 'event_notifications' was provided, * check if this event is one of the ones listed; if not listed, * don't execute the notification script. * * (If 'event_notifications' was not provided, we assume the script * should be executed for all events). 
*/ if (options->event_notifications.head != NULL) { EventNotificationListCell *cell; bool notify_ok = false; for (cell = options->event_notifications.head; cell; cell = cell->next) { if (strcmp(event, cell->event_type) == 0) { notify_ok = true; break; } } /* * Event type not found in the 'event_notifications' list - return early */ if (notify_ok == false) { log_debug(_("Not executing notification script for event type '%s'\n"), event); return success; } } dst_ptr = parsed_command; end_ptr = parsed_command + MAXPGPATH - 1; *end_ptr = '\0'; for(src_ptr = options->event_notification_command; *src_ptr; src_ptr++) { if (*src_ptr == '%') { switch (src_ptr[1]) { case 'n': /* %n: node id */ src_ptr++; snprintf(dst_ptr, end_ptr - dst_ptr, "%i", node_id); dst_ptr += strlen(dst_ptr); break; case 'e': /* %e: event type */ src_ptr++; strlcpy(dst_ptr, event, end_ptr - dst_ptr); dst_ptr += strlen(dst_ptr); break; case 'd': /* %d: details */ src_ptr++; if (details != NULL) { strlcpy(dst_ptr, details, end_ptr - dst_ptr); dst_ptr += strlen(dst_ptr); } break; case 's': /* %s: successful */ src_ptr++; strlcpy(dst_ptr, successful ? "1" : "0", end_ptr - dst_ptr); dst_ptr += strlen(dst_ptr); break; case 't': /* %t: timestamp */ src_ptr++; strlcpy(dst_ptr, event_timestamp, end_ptr - dst_ptr); dst_ptr += strlen(dst_ptr); break; default: /* otherwise treat the % as not special */ if (dst_ptr < end_ptr) *dst_ptr++ = *src_ptr; break; } } else { if (dst_ptr < end_ptr) *dst_ptr++ = *src_ptr; } } *dst_ptr = '\0'; log_debug("create_event_record(): executing\n%s\n", parsed_command); r = system(parsed_command); if (r != 0) { log_warning(_("Unable to execute event notification command\n")); log_info(_("Parsed event notification command was:\n%s\n"), parsed_command); success = false; } } return success; } /* * Update node record following change of status * (e.g. 
inactive primary converted to standby) */ bool update_node_record_status(PGconn *conn, char *cluster_name, int this_node_id, char *type, int upstream_node_id, bool active) { PGresult *res; char sqlquery[QUERY_STR_LEN]; sqlquery_snprintf(sqlquery, " UPDATE %s.repl_nodes " " SET type = '%s', " " upstream_node_id = %i, " " active = %s " " WHERE cluster = '%s' " " AND id = %i ", get_repmgr_schema_quoted(conn), type, upstream_node_id, active ? "TRUE" : "FALSE", cluster_name, this_node_id); log_verbose(LOG_DEBUG, "update_node_record_status():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("Unable to update node record: %s\n"), PQerrorMessage(conn)); PQclear(res); return false; } PQclear(res); return true; } bool update_node_record_set_upstream(PGconn *conn, char *cluster_name, int this_node_id, int new_upstream_node_id) { PGresult *res; char sqlquery[QUERY_STR_LEN]; log_debug(_("update_node_record_set_upstream(): Updating node %i's upstream node to %i\n"), this_node_id, new_upstream_node_id); sqlquery_snprintf(sqlquery, " UPDATE %s.repl_nodes " " SET upstream_node_id = %i " " WHERE cluster = '%s' " " AND id = %i ", get_repmgr_schema_quoted(conn), new_upstream_node_id, cluster_name, this_node_id); log_verbose(LOG_DEBUG, "update_node_record_set_upstream():\n%s\n", sqlquery); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("Unable to set new upstream node id: %s\n"), PQerrorMessage(conn)); PQclear(res); return false; } PQclear(res); return true; } PGresult * get_node_record(PGconn *conn, char *cluster, int node_id) { char sqlquery[QUERY_STR_LEN]; /* Use the bounded formatter, as elsewhere in this file, to avoid overflowing sqlquery */ sqlquery_snprintf(sqlquery, "SELECT id, upstream_node_id, conninfo, type, slot_name, active " " FROM %s.repl_nodes " " WHERE cluster = '%s' " " AND id = %i", get_repmgr_schema_quoted(conn), cluster, node_id); log_verbose(LOG_DEBUG, "get_node_record():\n%s\n", sqlquery); return PQexec(conn, sqlquery); } 
repmgr-3.0.3/dbutils.h000066400000000000000000000076421264264412200146440ustar00rootroot00000000000000/* * dbutils.h * Copyright (c) 2ndQuadrant, 2010-2015 * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program. If not, see <http://www.gnu.org/licenses/>. * */ #ifndef _REPMGR_DBUTILS_H_ #define _REPMGR_DBUTILS_H_ #include "access/xlogdefs.h" #include "config.h" #include "strutil.h" typedef enum { UNKNOWN = 0, MASTER, STANDBY, WITNESS } t_server_type; /* * Struct to store node information */ typedef struct s_node_info { int node_id; int upstream_node_id; t_server_type type; char name[MAXLEN]; char conninfo_str[MAXLEN]; char slot_name[MAXLEN]; int priority; bool active; bool is_ready; bool is_visible; XLogRecPtr xlog_location; } t_node_info; #define T_NODE_INFO_INITIALIZER { \ NODE_NOT_FOUND, \ NO_UPSTREAM_NODE, \ UNKNOWN, \ "", \ "", \ "", \ DEFAULT_PRIORITY, \ true, \ false, \ false, \ InvalidXLogRecPtr \ } PGconn *establish_db_connection(const char *conninfo, const bool exit_on_error); PGconn *establish_db_connection_by_params(const char *keywords[], const char *values[], const bool exit_on_error); bool begin_transaction(PGconn *conn); bool commit_transaction(PGconn *conn); bool rollback_transaction(PGconn *conn); bool check_cluster_schema(PGconn *conn); int is_standby(PGconn *conn); bool is_pgup(PGconn *conn, int timeout); int get_master_node_id(PGconn *conn, char *cluster); int get_server_version(PGconn *conn, char *server_version); bool get_cluster_size(PGconn *conn, 
char *size); bool get_pg_setting(PGconn *conn, const char *setting, char *output); int guc_set(PGconn *conn, const char *parameter, const char *op, const char *value); int guc_set_typed(PGconn *conn, const char *parameter, const char *op, const char *value, const char *datatype); PGconn *get_upstream_connection(PGconn *standby_conn, char *cluster, int node_id, int *upstream_node_id_ptr, char *upstream_conninfo_out); PGconn *get_master_connection(PGconn *standby_conn, char *cluster, int *master_id, char *master_conninfo_out); int wait_connection_availability(PGconn *conn, long long timeout); bool cancel_query(PGconn *conn, int timeout); char *get_repmgr_schema(void); char *get_repmgr_schema_quoted(PGconn *conn); bool create_replication_slot(PGconn *conn, char *slot_name); bool drop_replication_slot(PGconn *conn, char *slot_name); bool start_backup(PGconn *conn, char *first_wal_segment, bool fast_checkpoint); bool stop_backup(PGconn *conn, char *last_wal_segment); bool set_config_bool(PGconn *conn, const char *config_param, bool state); bool copy_configuration(PGconn *masterconn, PGconn *witnessconn, char *cluster_name); bool create_node_record(PGconn *conn, char *action, int node, char *type, int upstream_node, char *cluster_name, char *node_name, char *conninfo, int priority, char *slot_name); bool delete_node_record(PGconn *conn, int node, char *action); bool create_event_record(PGconn *conn, t_configuration_options *options, int node_id, char *event, bool successful, char *details); bool update_node_record_status(PGconn *conn, char *cluster_name, int this_node_id, char *type, int upstream_node_id, bool active); bool update_node_record_set_upstream(PGconn *conn, char *cluster_name, int this_node_id, int new_upstream_node_id); PGresult * get_node_record(PGconn *conn, char *cluster, int node_id); #endif 
repmgr-3.0.3/debian/000077500000000000000000000000001264264412200142365ustar00rootroot00000000000000repmgr-3.0.3/debian/DEBIAN/000077500000000000000000000000001264264412200151605ustar00rootroot00000000000000repmgr-3.0.3/debian/DEBIAN/control000066400000000000000000000005211264264412200165610ustar00rootroot00000000000000Package: repmgr-auto Version: 2.0beta2 Section: database Priority: optional Architecture: all Depends: rsync, postgresql-9.0 | postgresql-9.1 | postgresql-9.2 | postgresql-9.3 | postgresql-9.4 Maintainer: Jaime Casanova Description: PostgreSQL replication setup, management and monitoring has two main executables repmgr-3.0.3/debian/repmgr.repmgrd.default000066400000000000000000000006641264264412200205450ustar00rootroot00000000000000# default settings for repmgrd. This file is sourced by /bin/sh from # /etc/init.d/repmgrd # disable repmgrd by default so it won't get started upon installation # valid values: yes/no REPMGRD_ENABLED=no # Options for repmgrd (required) #REPMGRD_OPTS="--config-file /path/to/repmgr.conf" # User to run repmgrd as #REPMGRD_USER=postgres # repmgrd binary #REPMGRD_BIN=/usr/bin/repmgrd # pid file #REPMGRD_PIDFILE=/var/run/repmgrd.pid repmgr-3.0.3/debian/repmgr.repmgrd.init000066400000000000000000000043351264264412200200630ustar00rootroot00000000000000#!/bin/sh ### BEGIN INIT INFO # Provides: repmgrd # Required-Start: $local_fs $remote_fs $network $syslog postgresql # Required-Stop: $local_fs $remote_fs $network $syslog postgresql # Should-Start: $syslog postgresql # Default-Start: 2 3 4 5 # Default-Stop: 0 1 6 # Short-Description: Start/stop repmgrd # Description: Enable repmgrd replication management and monitoring daemon for PostgreSQL ### END INIT INFO set -e DESC="PostgreSQL replication management and monitoring daemon" NAME=repmgrd REPMGRD_ENABLED=no REPMGRD_OPTS= REPMGRD_USER=postgres REPMGRD_BIN=/usr/bin/repmgrd REPMGRD_PIDFILE=/var/run/repmgrd.pid # Read configuration variable file if it is present [ -r /etc/default/$NAME ] 
&& . /etc/default/$NAME test -x $REPMGRD_BIN || exit 0 case "$REPMGRD_ENABLED" in [Yy]*) break ;; *) exit 0 ;; esac # Define LSB log_* functions. . /lib/lsb/init-functions if [ -z "$REPMGRD_OPTS" ] then log_warning_msg "Not starting $NAME, REPMGRD_OPTS not set in /etc/default/$NAME" exit 0 fi do_start() { # Return # 0 if daemon has been started # 1 if daemon was already running # other if daemon could not be started or a failure occurred start-stop-daemon --start --quiet --background --chuid $REPMGRD_USER --make-pidfile --pidfile $REPMGRD_PIDFILE --exec $REPMGRD_BIN -- $REPMGRD_OPTS } do_stop() { # Return # 0 if daemon has been stopped # 1 if daemon was already stopped # other if daemon could not be stopped or a failure occurred start-stop-daemon --stop --quiet --retry=TERM/30/KILL/5 --pidfile $REPMGRD_PIDFILE --name "$(basename $REPMGRD_BIN)" } case "$1" in start) log_daemon_msg "Starting $DESC" "$NAME" do_start case "$?" in 0) log_end_msg 0 ;; 1) log_progress_msg "already started" log_end_msg 0 ;; *) log_end_msg 1 ;; esac ;; stop) log_daemon_msg "Stopping $DESC" "$NAME" do_stop case "$?" in 0) log_end_msg 0 ;; 1) log_progress_msg "already stopped" log_end_msg 0 ;; *) log_end_msg 1 ;; esac ;; restart|force-reload) $0 stop $0 start ;; status) status_of_proc -p $REPMGRD_PIDFILE $REPMGRD_BIN $NAME && exit 0 || exit $? ;; *) echo "Usage: $0 {start|stop|restart|force-reload|status}" >&2 exit 3 ;; esac exit 0 repmgr-3.0.3/errcode.h000066400000000000000000000022401264264412200146060ustar00rootroot00000000000000/* * errcode.h * Copyright (C) 2ndQuadrant, 2010-2015 * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. 
* * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program. If not, see <http://www.gnu.org/licenses/>. * */ #ifndef _ERRCODE_H_ #define _ERRCODE_H_ /* Exit return code */ #define SUCCESS 0 #define ERR_BAD_CONFIG 1 #define ERR_BAD_RSYNC 2 #define ERR_NO_RESTART 4 #define ERR_DB_CON 6 #define ERR_DB_QUERY 7 #define ERR_PROMOTED 8 #define ERR_BAD_PASSWORD 9 #define ERR_STR_OVERFLOW 10 #define ERR_FAILOVER_FAIL 11 #define ERR_BAD_SSH 12 #define ERR_SYS_FAILURE 13 #define ERR_BAD_BASEBACKUP 14 #define ERR_INTERNAL 15 #define ERR_MONITORING_FAIL 16 #endif /* _ERRCODE_H_ */ repmgr-3.0.3/log.c000066400000000000000000000164451264264412200137530ustar00rootroot00000000000000/* * log.c - Logging methods * Copyright (C) 2ndQuadrant, 2010-2015 * * This module is a set of methods for logging (currently only syslog) * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program. If not, see <http://www.gnu.org/licenses/>. 
* */ #include "repmgr.h" #include <stdlib.h> #ifdef HAVE_SYSLOG #include <syslog.h> #endif #include <stdarg.h> #include <time.h> #include "log.h" #define DEFAULT_IDENT "repmgr" #ifdef HAVE_SYSLOG #define DEFAULT_SYSLOG_FACILITY LOG_LOCAL0 #endif /* #define REPMGR_DEBUG */ static int detect_log_facility(const char *facility); static void _stderr_log_with_level(const char *level_name, int level, const char *fmt, va_list ap); int log_type = REPMGR_STDERR; int log_level = LOG_NOTICE; int last_log_level = LOG_NOTICE; int verbose_logging = false; int terse_logging = false; void stderr_log_with_level(const char *level_name, int level, const char *fmt, ...) { va_list arglist; va_start(arglist, fmt); _stderr_log_with_level(level_name, level, fmt, arglist); va_end(arglist); } static void _stderr_log_with_level(const char *level_name, int level, const char *fmt, va_list ap) { time_t t; struct tm *tm; char buff[100]; /* * Store the requested level so that if there's a subsequent * log_hint(), we can suppress that if appropriate. */ last_log_level = level; if (log_level >= level) { time(&t); tm = localtime(&t); strftime(buff, 100, "[%Y-%m-%d %H:%M:%S]", tm); fprintf(stderr, "%s [%s] ", buff, level_name); vfprintf(stderr, fmt, ap); fflush(stderr); } } void log_hint(const char *fmt, ...) { va_list ap; if (terse_logging == false) { va_start(ap, fmt); _stderr_log_with_level("HINT", last_log_level, fmt, ap); va_end(ap); } } void log_verbose(int level, const char *fmt, ...) 
{ va_list ap; va_start(ap, fmt); if (verbose_logging == true) { switch(level) { case LOG_EMERG: _stderr_log_with_level("EMERG", level, fmt, ap); break; case LOG_ALERT: _stderr_log_with_level("ALERT", level, fmt, ap); break; case LOG_CRIT: _stderr_log_with_level("CRIT", level, fmt, ap); break; case LOG_ERR: _stderr_log_with_level("ERR", level, fmt, ap); break; case LOG_WARNING: _stderr_log_with_level("WARNING", level, fmt, ap); break; case LOG_NOTICE: _stderr_log_with_level("NOTICE", level, fmt, ap); break; case LOG_INFO: _stderr_log_with_level("INFO", level, fmt, ap); break; case LOG_DEBUG: _stderr_log_with_level("DEBUG", level, fmt, ap); break; } } va_end(ap); } bool logger_init(t_configuration_options * opts, const char *ident) { char *level = opts->loglevel; char *facility = opts->logfacility; int l; int f; #ifdef HAVE_SYSLOG int syslog_facility = DEFAULT_SYSLOG_FACILITY; #endif #ifdef REPMGR_DEBUG printf("Logger initialisation (Level: %s, Facility: %s)\n", level, facility); #endif if (!ident) { ident = DEFAULT_IDENT; } if (level && *level) { l = detect_log_level(level); #ifdef REPMGR_DEBUG printf("Assigned level for logger: %d\n", l); #endif if (l >= 0) log_level = l; else stderr_log_warning(_("Invalid log level \"%s\" (available values: DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG)\n"), level); } if (facility && *facility) { f = detect_log_facility(facility); #ifdef REPMGR_DEBUG printf("Assigned facility for logger: %d\n", f); #endif if (f == 0) { /* No syslog requested, just stderr */ #ifdef REPMGR_DEBUG printf(_("Use stderr for logging\n")); #endif } else if (f == -1) { stderr_log_warning(_("Cannot detect log facility %s (use any of LOCAL0, LOCAL1, ..., LOCAL7, USER or STDERR)\n"), facility); } #ifdef HAVE_SYSLOG else { syslog_facility = f; log_type = REPMGR_SYSLOG; } #endif } #ifdef HAVE_SYSLOG if (log_type == REPMGR_SYSLOG) { setlogmask(LOG_UPTO(log_level)); openlog(ident, LOG_CONS | LOG_PID | LOG_NDELAY, syslog_facility); 
stderr_log_notice(_("Setup syslog (level: %s, facility: %s)\n"), level, facility); } #endif if (*opts->logfile) { FILE *fd; /* Check if we can write to the specified file before redirecting * stderr - if freopen() fails, stderr output will vanish into * the ether and the user won't know what's going on. */ fd = fopen(opts->logfile, "a"); if (fd == NULL) { stderr_log_err(_("Unable to open specified logfile '%s' for writing: %s\n"), opts->logfile, strerror(errno)); stderr_log_err(_("Terminating\n")); exit(ERR_BAD_CONFIG); } fclose(fd); stderr_log_notice(_("Redirecting logging output to '%s'\n"), opts->logfile); fd = freopen(opts->logfile, "a", stderr); /* It's possible freopen() may still fail due to e.g. a race condition; as it's not feasible to restore stderr after a failed freopen(), we'll write to stdout as a last resort. */ if (fd == NULL) { printf(_("Unable to open specified logfile %s for writing: %s\n"), opts->logfile, strerror(errno)); printf(_("Terminating\n")); exit(ERR_BAD_CONFIG); } } return true; } bool logger_shutdown(void) { #ifdef HAVE_SYSLOG if (log_type == REPMGR_SYSLOG) closelog(); #endif return true; } /* * Indicate whether extra-verbose logging is required. This will * generate a lot of output, particularly debug logging, and should * not be permanently enabled in production. * * NOTE: in previous repmgr versions, this option forced the log * level to INFO. */ void logger_set_verbose(void) { verbose_logging = true; } /* * Indicate whether some non-critical log messages can be omitted. * Currently this includes warnings about irrelevant command line * options and hints. 
*/ void logger_set_terse(void) { terse_logging = true; } int detect_log_level(const char *level) { if (!strcmp(level, "DEBUG")) return LOG_DEBUG; if (!strcmp(level, "INFO")) return LOG_INFO; if (!strcmp(level, "NOTICE")) return LOG_NOTICE; if (!strcmp(level, "WARNING")) return LOG_WARNING; if (!strcmp(level, "ERR")) return LOG_ERR; if (!strcmp(level, "ALERT")) return LOG_ALERT; if (!strcmp(level, "CRIT")) return LOG_CRIT; if (!strcmp(level, "EMERG")) return LOG_EMERG; return -1; } static int detect_log_facility(const char *facility) { int local = 0; if (!strncmp(facility, "LOCAL", 5) && strlen(facility) == 6) { local = atoi(&facility[5]); switch (local) { case 0: return LOG_LOCAL0; break; case 1: return LOG_LOCAL1; break; case 2: return LOG_LOCAL2; break; case 3: return LOG_LOCAL3; break; case 4: return LOG_LOCAL4; break; case 5: return LOG_LOCAL5; break; case 6: return LOG_LOCAL6; break; case 7: return LOG_LOCAL7; break; } } else if (!strcmp(facility, "USER")) { return LOG_USER; } else if (!strcmp(facility, "STDERR")) { return 0; } return -1; } repmgr-3.0.3/log.h000066400000000000000000000077501264264412200137570ustar00rootroot00000000000000/* * log.h * Copyright (c) 2ndQuadrant, 2010-2015 * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program. If not, see . 
* */ #ifndef _REPMGR_LOG_H_ #define _REPMGR_LOG_H_ #include "repmgr.h" #define REPMGR_SYSLOG 1 #define REPMGR_STDERR 2 void stderr_log_with_level(const char *level_name, int level, const char *fmt,...) __attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4))); /* Standard error logging */ #define stderr_log_debug(...) stderr_log_with_level("DEBUG", LOG_DEBUG, __VA_ARGS__) #define stderr_log_info(...) stderr_log_with_level("INFO", LOG_INFO, __VA_ARGS__) #define stderr_log_notice(...) stderr_log_with_level("NOTICE", LOG_NOTICE, __VA_ARGS__) #define stderr_log_warning(...) stderr_log_with_level("WARNING", LOG_WARNING, __VA_ARGS__) #define stderr_log_err(...) stderr_log_with_level("ERROR", LOG_ERR, __VA_ARGS__) #define stderr_log_crit(...) stderr_log_with_level("CRITICAL", LOG_CRIT, __VA_ARGS__) #define stderr_log_alert(...) stderr_log_with_level("ALERT", LOG_ALERT, __VA_ARGS__) #define stderr_log_emerg(...) stderr_log_with_level("EMERGENCY", LOG_EMERG, __VA_ARGS__) #ifdef HAVE_SYSLOG #include <syslog.h> #define log_debug(...) \ { \ if (log_type == REPMGR_SYSLOG) syslog(LOG_DEBUG, __VA_ARGS__); \ else stderr_log_debug(__VA_ARGS__); \ } #define log_info(...) \ { \ if (log_type == REPMGR_SYSLOG) syslog(LOG_INFO, __VA_ARGS__); \ else stderr_log_info(__VA_ARGS__); \ } #define log_notice(...) \ { \ if (log_type == REPMGR_SYSLOG) syslog(LOG_NOTICE, __VA_ARGS__); \ else stderr_log_notice(__VA_ARGS__); \ } #define log_warning(...) \ { \ if (log_type == REPMGR_SYSLOG) syslog(LOG_WARNING, __VA_ARGS__); \ else stderr_log_warning(__VA_ARGS__); \ } #define log_err(...) \ { \ if (log_type == REPMGR_SYSLOG) syslog(LOG_ERR, __VA_ARGS__); \ else stderr_log_err(__VA_ARGS__); \ } #define log_crit(...) \ { \ if (log_type == REPMGR_SYSLOG) syslog(LOG_CRIT, __VA_ARGS__); \ else stderr_log_crit(__VA_ARGS__); \ } #define log_alert(...) \ { \ if (log_type == REPMGR_SYSLOG) syslog(LOG_ALERT, __VA_ARGS__); \ else stderr_log_alert(__VA_ARGS__); \ } #define log_emerg(...) 
\ { \ if (log_type == REPMGR_SYSLOG) syslog(LOG_EMERG, __VA_ARGS__); \ else stderr_log_emerg(__VA_ARGS__); \ } #else #define LOG_EMERG 0 /* system is unusable */ #define LOG_ALERT 1 /* action must be taken immediately */ #define LOG_CRIT 2 /* critical conditions */ #define LOG_ERR 3 /* error conditions */ #define LOG_WARNING 4 /* warning conditions */ #define LOG_NOTICE 5 /* normal but significant condition */ #define LOG_INFO 6 /* informational */ #define LOG_DEBUG 7 /* debug-level messages */ #define log_debug(...) stderr_log_debug(__VA_ARGS__) #define log_info(...) stderr_log_info(__VA_ARGS__) #define log_notice(...) stderr_log_notice(__VA_ARGS__) #define log_warning(...) stderr_log_warning(__VA_ARGS__) #define log_err(...) stderr_log_err(__VA_ARGS__) #define log_crit(...) stderr_log_crit(__VA_ARGS__) #define log_alert(...) stderr_log_alert(__VA_ARGS__) #define log_emerg(...) stderr_log_emerg(__VA_ARGS__) #endif int detect_log_level(const char *level); /* Logger initialisation and shutdown */ bool logger_init(t_configuration_options * opts, const char *ident); bool logger_shutdown(void); void logger_set_verbose(void); void logger_set_terse(void); void log_hint(const char *fmt, ...); void log_verbose(int level, const char *fmt, ...); extern int log_type; extern int log_level; #endif repmgr-3.0.3/repmgr.c000066400000000000000000003165271264264412200144720ustar00rootroot00000000000000/* * repmgr.c - Command interpreter for the repmgr package * Copyright (C) 2ndQuadrant, 2010-2015 * * This module is a command-line utility to easily setup a cluster of * hot standby servers for an HA environment * * Commands implemented are: * * [ MASTER | PRIMARY ] REGISTER * * STANDBY REGISTER * STANDBY UNREGISTER * STANDBY CLONE * STANDBY FOLLOW * STANDBY PROMOTE * * WITNESS CREATE * * CLUSTER SHOW * CLUSTER CLEANUP * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software 
Foundation, either version 3 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program. If not, see . * */ #include "repmgr.h" #include #include #include #include #include #include "storage/fd.h" /* for PG_TEMP_FILE_PREFIX */ #include "pqexpbuffer.h" #include "log.h" #include "config.h" #include "check_dir.h" #include "strutil.h" #include "version.h" #define RECOVERY_FILE "recovery.conf" #ifndef TABLESPACE_MAP #define TABLESPACE_MAP "tablespace_map" #endif #define WITNESS_DEFAULT_PORT "5499" /* If this value is ever changed, remember * to update comments and documentation */ #define NO_ACTION 0 /* Dummy default action */ #define MASTER_REGISTER 1 #define STANDBY_REGISTER 2 #define STANDBY_UNREGISTER 3 #define STANDBY_CLONE 4 #define STANDBY_PROMOTE 5 #define STANDBY_FOLLOW 6 #define WITNESS_CREATE 7 #define CLUSTER_SHOW 8 #define CLUSTER_CLEANUP 9 static bool create_recovery_file(const char *data_dir); static int test_ssh_connection(char *host, char *remote_user); static int copy_remote_files(char *host, char *remote_user, char *remote_path, char *local_path, bool is_directory, int server_version_num); static int run_basebackup(const char *data_dir); static void check_parameters_for_action(const int action); static bool create_schema(PGconn *conn); static void write_primary_conninfo(char *line); static bool write_recovery_file_line(FILE *recovery_file, char *recovery_file_path, char *line); static void check_master_standby_version_match(PGconn *conn, PGconn *master_conn); static int check_server_version(PGconn *conn, char *server_type, bool exit_on_error, char *server_version_string); static bool check_upstream_config(PGconn *conn, 
int server_version_num, bool exit_on_error); static bool update_node_record_set_master(PGconn *conn, int this_node_id); static char *make_pg_path(char *file); static void do_master_register(void); static void do_standby_register(void); static void do_standby_unregister(void); static void do_standby_clone(void); static void do_standby_promote(void); static void do_standby_follow(void); static void do_witness_create(void); static void do_cluster_show(void); static void do_cluster_cleanup(void); static void do_check_upstream_config(void); static void exit_with_errors(void); static void print_error_list(ErrorList *error_list, int log_level); static void help(void); /* Global variables */ static const char *keywords[6]; static const char *values[6]; static bool config_file_required = true; /* Initialization of runtime options */ t_runtime_options runtime_options = T_RUNTIME_OPTIONS_INITIALIZER; t_configuration_options options = T_CONFIGURATION_OPTIONS_INITIALIZER; static bool wal_keep_segments_used = false; static char *server_mode = NULL; static char *server_cmd = NULL; static char pg_bindir[MAXLEN] = ""; static char repmgr_slot_name[MAXLEN] = ""; static char *repmgr_slot_name_ptr = NULL; static char path_buf[MAXLEN] = ""; /* Collate command line errors and warnings here for friendlier reporting */ ErrorList cli_errors = { NULL, NULL }; ErrorList cli_warnings = { NULL, NULL }; int main(int argc, char **argv) { static struct option long_options[] = { {"dbname", required_argument, NULL, 'd'}, {"host", required_argument, NULL, 'h'}, {"port", required_argument, NULL, 'p'}, {"username", required_argument, NULL, 'U'}, {"superuser", required_argument, NULL, 'S'}, {"data-dir", required_argument, NULL, 'D'}, {"local-port", required_argument, NULL, 'l'}, {"config-file", required_argument, NULL, 'f'}, {"remote-user", required_argument, NULL, 'R'}, {"wal-keep-segments", required_argument, NULL, 'w'}, {"keep-history", required_argument, NULL, 'k'}, {"force", no_argument, NULL, 
'F'}, {"wait", no_argument, NULL, 'W'}, {"verbose", no_argument, NULL, 'v'}, {"pg_bindir", required_argument, NULL, 'b'}, {"rsync-only", no_argument, NULL, 'r'}, {"fast-checkpoint", no_argument, NULL, 'c'}, {"log-level", required_argument, NULL, 'L'}, {"terse", no_argument, NULL, 't'}, {"initdb-no-pwprompt", no_argument, NULL, 1}, {"check-upstream-config", no_argument, NULL, 2}, {"recovery-min-apply-delay", required_argument, NULL, 3}, {"ignore-external-config-files", no_argument, NULL, 4}, {"help", no_argument, NULL, '?'}, {"version", no_argument, NULL, 'V'}, {NULL, 0, NULL, 0} }; int optindex; int c, targ; int action = NO_ACTION; bool check_upstream_config = false; bool config_file_parsed = false; char *ptr = NULL; set_progname(argv[0]); /* Prevent getopt_long() from printing an error message */ opterr = 0; while ((c = getopt_long(argc, argv, "?Vd:h:p:U:S:D:l:f:R:w:k:FWIvb:rcL:t", long_options, &optindex)) != -1) { /* * NOTE: some integer parameters (e.g. -p/--port) are stored internally * as strings. We use repmgr_atoi() to check these but discard the * returned integer; repmgr_atoi() will append the error message to the * provided list. 
*/ switch (c) { case '?': help(); exit(SUCCESS); case 'V': printf("%s %s (PostgreSQL %s)\n", progname(), REPMGR_VERSION, PG_VERSION); exit(SUCCESS); case 'd': strncpy(runtime_options.dbname, optarg, MAXLEN); break; case 'h': strncpy(runtime_options.host, optarg, MAXLEN); break; case 'p': repmgr_atoi(optarg, "-p/--port", &cli_errors); strncpy(runtime_options.masterport, optarg, MAXLEN); break; case 'U': strncpy(runtime_options.username, optarg, MAXLEN); break; case 'S': strncpy(runtime_options.superuser, optarg, MAXLEN); break; case 'D': strncpy(runtime_options.dest_dir, optarg, MAXFILENAME); break; case 'l': /* -l/--local-port is deprecated */ repmgr_atoi(optarg, "-l/--local-port", &cli_errors); strncpy(runtime_options.localport, optarg, MAXLEN); break; case 'f': strncpy(runtime_options.config_file, optarg, MAXLEN); break; case 'R': strncpy(runtime_options.remote_user, optarg, MAXLEN); break; case 'w': repmgr_atoi(optarg, "-w/--wal-keep-segments", &cli_errors); strncpy(runtime_options.wal_keep_segments, optarg, MAXLEN); wal_keep_segments_used = true; break; case 'k': runtime_options.keep_history = repmgr_atoi(optarg, "-k/--keep-history", &cli_errors); break; case 'F': runtime_options.force = true; break; case 'W': runtime_options.wait_for_master = true; break; case 'I': runtime_options.ignore_rsync_warn = true; break; case 'v': runtime_options.verbose = true; break; case 'b': strncpy(runtime_options.pg_bindir, optarg, MAXLEN); break; case 'r': runtime_options.rsync_only = true; break; case 'c': runtime_options.fast_checkpoint = true; break; case 'L': { int detected_log_level = detect_log_level(optarg); if (detected_log_level != -1) { strncpy(runtime_options.loglevel, optarg, MAXLEN); } else { PQExpBufferData invalid_log_level; initPQExpBuffer(&invalid_log_level); appendPQExpBuffer(&invalid_log_level, _("Invalid log level \"%s\" provided"), optarg); error_list_append(&cli_errors, invalid_log_level.data); } break; } case 't': runtime_options.terse = true; break; case 
1: runtime_options.initdb_no_pwprompt = true; break; case 2: check_upstream_config = true; break; case 3: targ = strtol(optarg, &ptr, 10); if (targ < 1) { error_list_append(&cli_errors, _("Invalid value provided for '--recovery-min-apply-delay'")); break; } if (ptr && *ptr) { if (strcmp(ptr, "ms") != 0 && strcmp(ptr, "s") != 0 && strcmp(ptr, "min") != 0 && strcmp(ptr, "h") != 0 && strcmp(ptr, "d") != 0) { error_list_append(&cli_errors, _("Value provided for '--recovery-min-apply-delay' must be one of ms/s/min/h/d")); break; } } strncpy(runtime_options.recovery_min_apply_delay, optarg, MAXLEN); break; case 4: runtime_options.ignore_external_config_files = true; break; default: { PQExpBufferData unknown_option; initPQExpBuffer(&unknown_option); appendPQExpBuffer(&unknown_option, _("Unknown option '%s'"), argv[optind - 1]); error_list_append(&cli_errors, unknown_option.data); } } } /* Exit here already if errors in command line options found */ if (cli_errors.head != NULL) { exit_with_errors(); } if (check_upstream_config == true) { do_check_upstream_config(); exit(SUCCESS); } /* * Now we need to obtain the action, this comes in one of these forms: * MASTER REGISTER | * STANDBY {REGISTER | UNREGISTER | CLONE [node] | PROMOTE | FOLLOW [node]} | * WITNESS CREATE | * CLUSTER {SHOW | CLEANUP} * * the node part is optional, if we receive it then we shouldn't have * received a -h option */ if (optind < argc) { server_mode = argv[optind++]; if (strcasecmp(server_mode, "STANDBY") != 0 && strcasecmp(server_mode, "MASTER") != 0 && /* allow PRIMARY as synonym for MASTER */ strcasecmp(server_mode, "PRIMARY") != 0 && strcasecmp(server_mode, "WITNESS") != 0 && strcasecmp(server_mode, "CLUSTER") != 0) { PQExpBufferData unknown_mode; initPQExpBuffer(&unknown_mode); appendPQExpBuffer(&unknown_mode, _("Unknown server mode '%s'"), server_mode); error_list_append(&cli_errors, unknown_mode.data); } } if (optind < argc) { server_cmd = argv[optind++]; /* check possibilities for all
server modes */ if (strcasecmp(server_mode, "MASTER") == 0 || strcasecmp(server_mode, "PRIMARY") == 0) { if (strcasecmp(server_cmd, "REGISTER") == 0) action = MASTER_REGISTER; } else if (strcasecmp(server_mode, "STANDBY") == 0) { if (strcasecmp(server_cmd, "REGISTER") == 0) action = STANDBY_REGISTER; else if (strcasecmp(server_cmd, "UNREGISTER") == 0) action = STANDBY_UNREGISTER; else if (strcasecmp(server_cmd, "CLONE") == 0) action = STANDBY_CLONE; else if (strcasecmp(server_cmd, "PROMOTE") == 0) action = STANDBY_PROMOTE; else if (strcasecmp(server_cmd, "FOLLOW") == 0) action = STANDBY_FOLLOW; } else if (strcasecmp(server_mode, "CLUSTER") == 0) { if (strcasecmp(server_cmd, "SHOW") == 0) action = CLUSTER_SHOW; else if (strcasecmp(server_cmd, "CLEANUP") == 0) action = CLUSTER_CLEANUP; } else if (strcasecmp(server_mode, "WITNESS") == 0) { if (strcasecmp(server_cmd, "CREATE") == 0) action = WITNESS_CREATE; } } if (action == NO_ACTION) { if (server_cmd == NULL) { error_list_append(&cli_errors, "No server command provided"); } else { PQExpBufferData unknown_action; initPQExpBuffer(&unknown_action); appendPQExpBuffer(&unknown_action, _("Unknown server command '%s'"), server_cmd); error_list_append(&cli_errors, unknown_action.data); } } /* For some actions we still can receive a last argument */ if (action == STANDBY_CLONE) { if (optind < argc) { if (runtime_options.host[0]) { error_list_append(&cli_errors, _("Conflicting parameters: you can't use -h while providing a node separately.")); } else { strncpy(runtime_options.host, argv[optind++], MAXLEN); } } } if (optind < argc) { PQExpBufferData too_many_args; initPQExpBuffer(&too_many_args); appendPQExpBuffer(&too_many_args, _("too many command-line arguments (first extra is \"%s\")"), argv[optind]); error_list_append(&cli_errors, too_many_args.data); } check_parameters_for_action(action); /* * Sanity checks for command line parameters completed by now; * any further errors will be runtime ones */ if (cli_errors.head != NULL)
{ exit_with_errors(); } if (cli_warnings.head != NULL && runtime_options.terse == false) { print_error_list(&cli_warnings, LOG_WARNING); } if (!runtime_options.dbname[0]) { if (getenv("PGDATABASE")) strncpy(runtime_options.dbname, getenv("PGDATABASE"), MAXLEN); else if (getenv("PGUSER")) strncpy(runtime_options.dbname, getenv("PGUSER"), MAXLEN); else strncpy(runtime_options.dbname, DEFAULT_DBNAME, MAXLEN); } /* * If no primary port (-p/--port) provided, explicitly set the * default PostgreSQL port. */ if (!runtime_options.masterport[0]) { strncpy(runtime_options.masterport, DEFAULT_MASTER_PORT, MAXLEN); } /* * The configuration file is not required for some actions (e.g. 'standby clone'), * however if available we'll parse it anyway for options like 'log_level', * 'use_replication_slots' etc. */ config_file_parsed = load_config(runtime_options.config_file, runtime_options.verbose, &options, argv[0]); /* * Initialise pg_bindir - command line parameter will override * any setting in the configuration file */ if (!strlen(runtime_options.pg_bindir)) { strncpy(runtime_options.pg_bindir, options.pg_bindir, MAXLEN); } /* Add trailing slash */ if (strlen(runtime_options.pg_bindir)) { int len = strlen(runtime_options.pg_bindir); if (runtime_options.pg_bindir[len - 1] != '/') { maxlen_snprintf(pg_bindir, "%s/", runtime_options.pg_bindir); } else { strncpy(pg_bindir, runtime_options.pg_bindir, MAXLEN); } } keywords[2] = "user"; values[2] = (runtime_options.username[0]) ? runtime_options.username : NULL; keywords[3] = "dbname"; values[3] = runtime_options.dbname; keywords[4] = "application_name"; values[4] = (char *) progname(); keywords[5] = NULL; values[5] = NULL; /* * Initialize the logger. If verbose command line parameter was input, * make sure that the log level is at least INFO. This is mainly useful * for STANDBY CLONE. 
That doesn't require a configuration file (where a * logging level might otherwise be specified), but it often requires detailed * logging to troubleshoot problems. */ /* Command-line parameter -L/--log-level overrides any setting in config file */ if (*runtime_options.loglevel != '\0') { strncpy(options.loglevel, runtime_options.loglevel, MAXLEN); } logger_init(&options, progname()); if (runtime_options.verbose) logger_set_verbose(); if (runtime_options.terse) logger_set_terse(); /* * Node configuration information is not needed for all actions, with * STANDBY CLONE being the main exception. */ if (config_file_required) { if (options.node == NODE_NOT_FOUND) { if (config_file_parsed == true) { log_err(_("No node information was found. " "Check the configuration file.\n")); } else { log_err(_("No node information was found. " "Please supply a configuration file.\n")); } exit(ERR_BAD_CONFIG); } } /* * If `use_replication_slots` set in the configuration file * and command line parameter `--wal-keep-segments` was used, * emit a warning as to the latter's redundancy. Note that * the version check for 9.4 or later is done in check_upstream_config() */ if (options.use_replication_slots && wal_keep_segments_used) { log_warning(_("-w/--wal-keep-segments has no effect when replication slots are in use\n")); } /* Initialise the repmgr schema name */ maxlen_snprintf(repmgr_schema, "%s%s", DEFAULT_REPMGR_SCHEMA_PREFIX, options.cluster_name); /* * Initialise slot name, if required (9.4 and later) * * NOTE: the slot name will be defined for each record, including * the master; the `slot_name` column in `repl_nodes` defines * the name of the slot, but does not imply a slot has been created.
* The version check for 9.4 or later is done in check_upstream_config() */ if (options.use_replication_slots) { maxlen_snprintf(repmgr_slot_name, "repmgr_slot_%i", options.node); repmgr_slot_name_ptr = repmgr_slot_name; log_verbose(LOG_DEBUG, "slot name initialised as: %s\n", repmgr_slot_name); } switch (action) { case MASTER_REGISTER: do_master_register(); break; case STANDBY_REGISTER: do_standby_register(); break; case STANDBY_UNREGISTER: do_standby_unregister(); break; case STANDBY_CLONE: do_standby_clone(); break; case STANDBY_PROMOTE: do_standby_promote(); break; case STANDBY_FOLLOW: do_standby_follow(); break; case WITNESS_CREATE: do_witness_create(); break; case CLUSTER_SHOW: do_cluster_show(); break; case CLUSTER_CLEANUP: do_cluster_cleanup(); break; default: /* An action will have been determined by this point */ break; } logger_shutdown(); return 0; } static void do_cluster_show(void) { PGconn *conn; PGresult *res; char sqlquery[QUERY_STR_LEN]; char node_role[MAXLEN]; int i; /* We need to connect to check configuration */ log_info(_("connecting to database\n")); conn = establish_db_connection(options.conninfo, true); sqlquery_snprintf(sqlquery, "SELECT conninfo, type " " FROM %s.repl_nodes ", get_repmgr_schema_quoted(conn)); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("Unable to retrieve node information from the database\n%s\n"), PQerrorMessage(conn)); log_hint(_("Please check that all nodes have been registered\n")); PQclear(res); PQfinish(conn); exit(ERR_BAD_CONFIG); } PQfinish(conn); printf("Role | Connection String\n"); for (i = 0; i < PQntuples(res); i++) { conn = establish_db_connection(PQgetvalue(res, i, 0), false); if (PQstatus(conn) != CONNECTION_OK) strcpy(node_role, " FAILED"); else if (strcmp(PQgetvalue(res, i, 1), "witness") == 0) strcpy(node_role, " witness"); else if (is_standby(conn)) strcpy(node_role, " standby"); else strcpy(node_role, "* master"); printf("%-10s", node_role); printf("| %s\n", 
PQgetvalue(res, i, 0)); PQfinish(conn); } PQclear(res); } static void do_cluster_cleanup(void) { PGconn *conn = NULL; PGconn *master_conn = NULL; PGresult *res; char sqlquery[QUERY_STR_LEN]; int entries_to_delete = 0; /* We need to connect to check configuration */ log_info(_("connecting to database\n")); conn = establish_db_connection(options.conninfo, true); /* check if there is a master in this cluster */ log_info(_("connecting to master database\n")); master_conn = get_master_connection(conn, options.cluster_name, NULL, NULL); if (!master_conn) { log_err(_("cluster cleanup: cannot connect to master\n")); PQfinish(conn); exit(ERR_DB_CON); } PQfinish(conn); log_debug(_("Number of days of monitoring history to retain: %i\n"), runtime_options.keep_history); sqlquery_snprintf(sqlquery, "SELECT COUNT(*) " " FROM %s.repl_monitor " " WHERE age(now(), last_monitor_time) >= '%d days'::interval ", get_repmgr_schema_quoted(master_conn), runtime_options.keep_history); res = PQexec(master_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("cluster cleanup: unable to query number of monitoring records to clean up:\n%s\n"), PQerrorMessage(master_conn)); PQclear(res); PQfinish(master_conn); exit(ERR_DB_QUERY); } entries_to_delete = atoi(PQgetvalue(res, 0, 0)); PQclear(res); if (entries_to_delete == 0) { log_info(_("cluster cleanup: no monitoring records to delete\n")); PQfinish(master_conn); return; } log_debug(_("cluster cleanup: at least %i monitoring records to delete\n"), entries_to_delete); if (runtime_options.keep_history > 0) { sqlquery_snprintf(sqlquery, "DELETE FROM %s.repl_monitor " " WHERE age(now(), last_monitor_time) >= '%d days'::interval ", get_repmgr_schema_quoted(master_conn), runtime_options.keep_history); } else { sqlquery_snprintf(sqlquery, "TRUNCATE TABLE %s.repl_monitor", get_repmgr_schema_quoted(master_conn)); } res = PQexec(master_conn, sqlquery); if (PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("cluster cleanup: unable to 
delete monitoring records\n%s\n"), PQerrorMessage(master_conn)); PQclear(res); PQfinish(master_conn); exit(ERR_DB_QUERY); } PQclear(res); /* * VACUUM the table now so that autovacuum is not triggered at an * unexpected time */ sqlquery_snprintf(sqlquery, "VACUUM %s.repl_monitor", get_repmgr_schema_quoted(master_conn)); res = PQexec(master_conn, sqlquery); /* XXX Is there any need to check that this VACUUM completes without problems? */ PQclear(res); PQfinish(master_conn); if (runtime_options.keep_history > 0) { log_info(_("cluster cleanup: monitoring records older than %i day(s) deleted\n"), runtime_options.keep_history); } else { log_info(_("cluster cleanup: all monitoring records deleted\n")); } } static void do_master_register(void) { PGconn *conn; PGconn *master_conn; bool schema_exists = false; int ret; int primary_node_id = UNKNOWN_NODE_ID; bool record_created; conn = establish_db_connection(options.conninfo, true); /* Verify that master is a supported server version */ log_info(_("connecting to master database\n")); check_server_version(conn, "master", true, NULL); /* Check we are a master */ log_verbose(LOG_INFO, _("connected to master, checking its state\n")); ret = is_standby(conn); if (ret) { log_err(_(ret == 1 ?
"server is in standby mode and cannot be registered as a master\n" : "connection to node lost!\n")); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* Create schema and associated database objects, if it does not exist */ schema_exists = check_cluster_schema(conn); if (!schema_exists) { log_info(_("master register: creating database objects inside the %s schema\n"), get_repmgr_schema()); begin_transaction(conn); if (!create_schema(conn)) { log_err(_("Unable to create repmgr schema - see preceding error message(s); aborting\n")); rollback_transaction(conn); PQfinish(conn); exit(ERR_BAD_CONFIG); } commit_transaction(conn); } /* Ensure there isn't any other master already registered */ master_conn = get_master_connection(conn, options.cluster_name, NULL, NULL); if (master_conn != NULL && !runtime_options.force) { PQfinish(master_conn); log_err(_("there is a master already in cluster %s\n"), options.cluster_name); exit(ERR_BAD_CONFIG); } PQfinish(master_conn); begin_transaction(conn); /* * Check if a node with a different ID is registered as primary. This shouldn't * happen but could do if an existing master was shut down without being * unregistered. 
*/ primary_node_id = get_master_node_id(conn, options.cluster_name); if (primary_node_id != NODE_NOT_FOUND && primary_node_id != options.node) { log_err(_("another node with id %i is already registered as master\n"), primary_node_id); rollback_transaction(conn); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* Delete any existing record for this node if --force set */ if (runtime_options.force) { PGresult *res; bool node_record_deleted; res = get_node_record(conn, options.cluster_name, options.node); if (PQntuples(res)) { log_notice(_("deleting existing master record with id %i\n"), options.node); node_record_deleted = delete_node_record(conn, options.node, "master register"); if (node_record_deleted == false) { rollback_transaction(conn); PQfinish(conn); exit(ERR_BAD_CONFIG); } } } /* Now register the master */ record_created = create_node_record(conn, "master register", options.node, "master", NO_UPSTREAM_NODE, options.cluster_name, options.node_name, options.conninfo, options.priority, repmgr_slot_name_ptr); if (record_created == false) { rollback_transaction(conn); PQfinish(conn); exit(ERR_DB_QUERY); } commit_transaction(conn); /* Log the event */ create_event_record(conn, &options, options.node, "master_register", true, NULL); PQfinish(conn); log_notice(_("master node correctly registered for cluster %s with id %d (conninfo: %s)\n"), options.cluster_name, options.node, options.conninfo); return; } static void do_standby_register(void) { PGconn *conn; PGconn *master_conn; int ret; bool record_created; log_info(_("connecting to standby database\n")); conn = establish_db_connection(options.conninfo, true); /* Check we are a standby */ ret = is_standby(conn); if (ret == 0 || ret == -1) { log_err(_(ret == 0 ? 
"this node should be a standby (%s)\n" : "connection to node (%s) lost\n"), options.conninfo); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* Check if there is a schema for this cluster */ if (check_cluster_schema(conn) == false) { /* schema doesn't exist */ log_err(_("schema '%s' doesn't exist.\n"), get_repmgr_schema()); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* check if there is a master in this cluster */ log_info(_("connecting to master database\n")); master_conn = get_master_connection(conn, options.cluster_name, NULL, NULL); if (!master_conn) { log_err(_("a master must be defined before configuring a standby\n")); exit(ERR_BAD_CONFIG); } /* * Verify that standby and master are supported and compatible server * versions */ check_master_standby_version_match(conn, master_conn); /* Now register the standby */ log_info(_("registering the standby\n")); if (runtime_options.force) { bool node_record_deleted = delete_node_record(master_conn, options.node, "standby register"); if (node_record_deleted == false) { PQfinish(master_conn); PQfinish(conn); exit(ERR_BAD_CONFIG); } } record_created = create_node_record(master_conn, "standby register", options.node, "standby", options.upstream_node, options.cluster_name, options.node_name, options.conninfo, options.priority, repmgr_slot_name_ptr); if (record_created == false) { if (!runtime_options.force) { log_hint(_("use option -F/--force to overwrite an existing node record\n")); } // XXX log registration failure? 
PQfinish(master_conn); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* Log the event */ create_event_record(master_conn, &options, options.node, "standby_register", true, NULL); PQfinish(master_conn); PQfinish(conn); log_info(_("standby registration complete\n")); log_notice(_("standby node correctly registered for cluster %s with id %d (conninfo: %s)\n"), options.cluster_name, options.node, options.conninfo); return; } static void do_standby_unregister(void) { PGconn *conn; PGconn *master_conn; int ret; bool node_record_deleted; log_info(_("connecting to standby database\n")); conn = establish_db_connection(options.conninfo, true); /* Check we are a standby */ ret = is_standby(conn); if (ret == 0 || ret == -1) { log_err(_(ret == 0 ? "this node should be a standby (%s)\n" : "connection to node (%s) lost\n"), options.conninfo); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* Check if there is a schema for this cluster */ if (check_cluster_schema(conn) == false) { /* schema doesn't exist */ log_err(_("schema '%s' doesn't exist.\n"), get_repmgr_schema()); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* check if there is a master in this cluster */ log_info(_("connecting to master database\n")); master_conn = get_master_connection(conn, options.cluster_name, NULL, NULL); if (!master_conn) { log_err(_("a master must be defined before unregistering a standby\n")); exit(ERR_BAD_CONFIG); } /* * Verify that standby and master are supported and compatible server * versions */ check_master_standby_version_match(conn, master_conn); /* Now unregister the standby */ log_info(_("unregistering the standby\n")); node_record_deleted = delete_node_record(master_conn, options.node, "standby unregister"); if (node_record_deleted == false) { PQfinish(master_conn); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* Log the event */ create_event_record(master_conn, &options, options.node, "standby_unregister", true, NULL); PQfinish(master_conn); PQfinish(conn); log_info(_("standby unregistration complete\n")); 
log_notice(_("standby node correctly unregistered for cluster %s with id %d (conninfo: %s)\n"), options.cluster_name, options.node, options.conninfo); return; } static void do_standby_clone(void) { PGconn *upstream_conn; PGresult *res; char sqlquery[QUERY_STR_LEN]; int server_version_num; char cluster_size[MAXLEN]; int r = 0, retval = SUCCESS; int i; bool pg_start_backup_executed = false; bool target_directory_provided = false; bool external_config_file_copy_required = false; char master_data_directory[MAXFILENAME]; char local_data_directory[MAXFILENAME]; char master_config_file[MAXFILENAME] = ""; char local_config_file[MAXFILENAME] = ""; bool config_file_outside_pgdata = false; char master_hba_file[MAXFILENAME] = ""; char local_hba_file[MAXFILENAME] = ""; bool hba_file_outside_pgdata = false; char master_ident_file[MAXFILENAME] = ""; char local_ident_file[MAXFILENAME] = ""; bool ident_file_outside_pgdata = false; char master_control_file[MAXFILENAME] = ""; char local_control_file[MAXFILENAME] = ""; char *first_wal_segment = NULL; char *last_wal_segment = NULL; PQExpBufferData event_details; /* * If dest_dir (-D/--pgdata) was provided, this will become the new data * directory (otherwise repmgr will default to the same directory as on the * source host) */ if (runtime_options.dest_dir[0]) { target_directory_provided = true; log_notice(_("destination directory '%s' provided\n"), runtime_options.dest_dir); } /* Connection parameters for master only */ keywords[0] = "host"; values[0] = runtime_options.host; keywords[1] = "port"; values[1] = runtime_options.masterport; /* Connect to check configuration */ log_info(_("connecting to upstream node\n")); upstream_conn = establish_db_connection_by_params(keywords, values, true); /* Verify that upstream node is a supported server version */ log_verbose(LOG_INFO, _("connected to upstream node, checking its state\n")); server_version_num = check_server_version(upstream_conn, "master", true, NULL); 
check_upstream_config(upstream_conn, server_version_num, true); if (get_cluster_size(upstream_conn, cluster_size) == false) exit(ERR_DB_QUERY); log_info(_("Successfully connected to upstream node. Current installation size is %s\n"), cluster_size); /* * If --recovery-min-apply-delay was passed, check that * we're connected to PostgreSQL 9.4 or later */ if (*runtime_options.recovery_min_apply_delay) { if (get_server_version(upstream_conn, NULL) < 90400) { log_err(_("PostgreSQL 9.4 or greater required for --recovery-min-apply-delay\n")); PQfinish(upstream_conn); exit(ERR_BAD_CONFIG); } } /* * Check that tablespaces named in any `tablespace_mapping` configuration * file parameters exist. * * pg_basebackup doesn't verify mappings, so any errors will not be caught. * We'll do that here as a value-added service. * * -T/--tablespace-mapping is not available as a pg_basebackup option for * PostgreSQL 9.3 - we can only handle that with rsync, so if `--rsync-only` * not set, fail with an error */ if (options.tablespace_mapping.head != NULL) { TablespaceListCell *cell; if (get_server_version(upstream_conn, NULL) < 90400 && !runtime_options.rsync_only) { log_err(_("in PostgreSQL 9.3, tablespace mapping can only be used in conjunction with --rsync-only\n")); PQfinish(upstream_conn); exit(ERR_BAD_CONFIG); } for (cell = options.tablespace_mapping.head; cell; cell = cell->next) { sqlquery_snprintf(sqlquery, "SELECT spcname " " FROM pg_tablespace " " WHERE pg_tablespace_location(oid) = '%s'", cell->old_dir); res = PQexec(upstream_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("unable to execute tablespace query: %s\n"), PQerrorMessage(upstream_conn)); PQclear(res); PQfinish(upstream_conn); exit(ERR_BAD_CONFIG); } if (PQntuples(res) == 0) { log_err(_("no tablespace matching path '%s' found\n"), cell->old_dir); PQclear(res); PQfinish(upstream_conn); exit(ERR_BAD_CONFIG); } } } /* * Obtain data directory and configuration file locations * We'll check to see 
whether the configuration files are in the data * directory - if not we'll have to copy them via SSH * * XXX: if configuration files are symlinks to targets outside the data * directory, they won't be copied by pg_basebackup, but we can't tell * this from the below query; we'll probably need to add a check for their * presence and if missing force copy by SSH */ sqlquery_snprintf(sqlquery, " WITH dd AS ( " " SELECT setting " " FROM pg_settings " " WHERE name = 'data_directory' " " ) " " SELECT ps.name, ps.setting, " " ps.setting ~ ('^' || dd.setting) AS in_data_dir " " FROM dd, pg_settings ps " " WHERE ps.name IN ('data_directory', 'config_file', 'hba_file', 'ident_file') " " ORDER BY 1 "); log_debug(_("standby clone: %s\n"), sqlquery); res = PQexec(upstream_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("can't get info about data directory and configuration files: %s\n"), PQerrorMessage(upstream_conn)); PQclear(res); PQfinish(upstream_conn); exit(ERR_BAD_CONFIG); } /* We need all 4 parameters, and they can be retrieved only by superusers */ if (PQntuples(res) != 4) { log_err("STANDBY CLONE should be run by a SUPERUSER\n"); PQclear(res); PQfinish(upstream_conn); exit(ERR_BAD_CONFIG); } for (i = 0; i < PQntuples(res); i++) { if (strcmp(PQgetvalue(res, i, 0), "data_directory") == 0) { strncpy(master_data_directory, PQgetvalue(res, i, 1), MAXFILENAME); } else if (strcmp(PQgetvalue(res, i, 0), "config_file") == 0) { if (strcmp(PQgetvalue(res, i, 2), "f") == 0) { config_file_outside_pgdata = true; external_config_file_copy_required = true; strncpy(master_config_file, PQgetvalue(res, i, 1), MAXFILENAME); } } else if (strcmp(PQgetvalue(res, i, 0), "hba_file") == 0) { if (strcmp(PQgetvalue(res, i, 2), "f") == 0) { hba_file_outside_pgdata = true; external_config_file_copy_required = true; strncpy(master_hba_file, PQgetvalue(res, i, 1), MAXFILENAME); } } else if (strcmp(PQgetvalue(res, i, 0), "ident_file") == 0) { if (strcmp(PQgetvalue(res, i, 2), 
"f") == 0) { ident_file_outside_pgdata = true; external_config_file_copy_required = true; strncpy(master_ident_file, PQgetvalue(res, i, 1), MAXFILENAME); } } else log_warning(_("unknown parameter: %s\n"), PQgetvalue(res, i, 0)); } PQclear(res); /* * target directory (-D/--pgdata) provided - use that as new data directory * (useful when executing backup on local machine only or creating the backup * in a different local directory when backup source is a remote host) */ if (target_directory_provided) { strncpy(local_data_directory, runtime_options.dest_dir, MAXFILENAME); strncpy(local_config_file, runtime_options.dest_dir, MAXFILENAME); strncpy(local_hba_file, runtime_options.dest_dir, MAXFILENAME); strncpy(local_ident_file, runtime_options.dest_dir, MAXFILENAME); } /* * Otherwise use the same data directory as on the remote host */ else { strncpy(local_data_directory, master_data_directory, MAXFILENAME); strncpy(local_config_file, master_config_file, MAXFILENAME); strncpy(local_hba_file, master_hba_file, MAXFILENAME); strncpy(local_ident_file, master_ident_file, MAXFILENAME); log_notice(_("setting data directory to: %s\n"), local_data_directory); log_hint(_("use -D/--data-dir to explicitly specify a data directory\n")); } /* * When using rsync only, we need to check the SSH connection early */ if (runtime_options.rsync_only) { r = test_ssh_connection(runtime_options.host, runtime_options.remote_user); if (r != 0) { log_err(_("aborting, remote host %s is not reachable.\n"), runtime_options.host); retval = ERR_BAD_SSH; goto stop_backup; } } /* Check the local data directory can be used */ if (!create_pg_dir(local_data_directory, runtime_options.force)) { log_err(_("unable to use directory %s ...\n"), local_data_directory); log_hint(_("use -F/--force option to force this directory to be overwritten\n")); r = ERR_BAD_CONFIG; retval = ERR_BAD_CONFIG; goto stop_backup; } /* * If replication slots requested, create appropriate slot on * the primary; this must be done 
before pg_start_backup() is * issued, either by us or by pg_basebackup. */ if (options.use_replication_slots) { if (create_replication_slot(upstream_conn, repmgr_slot_name) == false) { PQfinish(upstream_conn); exit(ERR_DB_QUERY); } } log_notice(_("starting backup...\n")); if (runtime_options.fast_checkpoint == false) { log_hint(_("this may take some time; consider using the -c/--fast-checkpoint option\n")); } if (runtime_options.rsync_only) { PQExpBufferData tablespace_map; bool tablespace_map_rewrite = false; /* For 9.5 and greater, create our own tablespace_map file */ if (server_version_num >= 90500) { initPQExpBuffer(&tablespace_map); } /* * From 9.1 default is to wait for a sync standby to ack, avoid that by * turning off sync rep for this session */ if (set_config_bool(upstream_conn, "synchronous_commit", false) == false) { r = ERR_BAD_CONFIG; retval = ERR_BAD_CONFIG; goto stop_backup; } if (start_backup(upstream_conn, first_wal_segment, runtime_options.fast_checkpoint) == false) { r = ERR_BAD_BASEBACKUP; retval = ERR_BAD_BASEBACKUP; goto stop_backup; } /* * Note that we've successfully executed pg_start_backup(), * so we know whether or not to execute pg_stop_backup() after * the 'stop_backup' label */ pg_start_backup_executed = true; /* * 1. copy data directory, omitting directories which should not be * copied, or for which copying would serve no purpose. * * 2. 
copy pg_control file */ /* Copy the data directory */ log_info(_("standby clone: master data directory '%s'\n"), master_data_directory); r = copy_remote_files(runtime_options.host, runtime_options.remote_user, master_data_directory, local_data_directory, true, server_version_num); if (r != 0) { log_warning(_("standby clone: failed copying master data directory '%s'\n"), master_data_directory); goto stop_backup; } /* Handle tablespaces */ sqlquery_snprintf(sqlquery, " SELECT oid, pg_tablespace_location(oid) AS spclocation " " FROM pg_tablespace " " WHERE spcname NOT IN ('pg_default', 'pg_global')"); res = PQexec(upstream_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("unable to execute tablespace query: %s\n"), PQerrorMessage(upstream_conn)); PQclear(res); r = retval = ERR_DB_QUERY; goto stop_backup; } for (i = 0; i < PQntuples(res); i++) { bool mapping_found = false; PQExpBufferData tblspc_dir_src; PQExpBufferData tblspc_dir_dst; PQExpBufferData tblspc_oid; TablespaceListCell *cell; initPQExpBuffer(&tblspc_dir_src); initPQExpBuffer(&tblspc_dir_dst); initPQExpBuffer(&tblspc_oid); appendPQExpBuffer(&tblspc_oid, "%s", PQgetvalue(res, i, 0)); appendPQExpBuffer(&tblspc_dir_src, "%s", PQgetvalue(res, i, 1)); /* Check if tablespace path matches one of the provided tablespace mappings */ if (options.tablespace_mapping.head != NULL) { for (cell = options.tablespace_mapping.head; cell; cell = cell->next) { if (strcmp(tblspc_dir_src.data, cell->old_dir) == 0) { mapping_found = true; break; } } } if (mapping_found == true) { appendPQExpBuffer(&tblspc_dir_dst, "%s", cell->new_dir); log_debug(_("mapping source tablespace '%s' (OID %s) to '%s'\n"), tblspc_dir_src.data, tblspc_oid.data, tblspc_dir_dst.data); } else { appendPQExpBuffer(&tblspc_dir_dst, "%s", tblspc_dir_src.data); } /* Copy tablespace directory */ r = copy_remote_files(runtime_options.host, runtime_options.remote_user, tblspc_dir_src.data, tblspc_dir_dst.data, true, server_version_num); /* 
Update symlinks in pg_tblspc */ if (mapping_found == true) { /* 9.5 and later - create a tablespace_map file */ if (server_version_num >= 90500) { tablespace_map_rewrite = true; appendPQExpBuffer(&tablespace_map, "%s %s\n", tblspc_oid.data, tblspc_dir_dst.data); } /* Pre-9.5, we have to manipulate the symlinks in pg_tblspc/ ourselves */ else { PQExpBufferData tblspc_symlink; initPQExpBuffer(&tblspc_symlink); appendPQExpBuffer(&tblspc_symlink, "%s/pg_tblspc/%s", local_data_directory, tblspc_oid.data); if (unlink(tblspc_symlink.data) < 0 && errno != ENOENT) { log_err(_("unable to remove tablespace symlink %s\n"), tblspc_symlink.data); PQclear(res); r = retval = ERR_BAD_BASEBACKUP; goto stop_backup; } if (symlink(tblspc_dir_dst.data, tblspc_symlink.data) < 0) { log_err(_("unable to create tablespace symlink from %s to %s\n"), tblspc_symlink.data, tblspc_dir_dst.data); PQclear(res); r = retval = ERR_BAD_BASEBACKUP; goto stop_backup; } } } } PQclear(res); if (server_version_num >= 90500 && tablespace_map_rewrite == true) { PQExpBufferData tablespace_map_filename; FILE *tablespace_map_file; initPQExpBuffer(&tablespace_map_filename); appendPQExpBuffer(&tablespace_map_filename, "%s/%s", local_data_directory, TABLESPACE_MAP); /* Unlink any existing file (it should be there, but we don't care if it isn't) */ if (unlink(tablespace_map_filename.data) < 0 && errno != ENOENT) { log_err(_("unable to remove tablespace_map file %s\n"), tablespace_map_filename.data); r = retval = ERR_BAD_BASEBACKUP; goto stop_backup; } tablespace_map_file = fopen(tablespace_map_filename.data, "w"); if (tablespace_map_file == NULL) { log_err(_("unable to create tablespace_map file '%s'\n"), tablespace_map_filename.data); r = retval = ERR_BAD_BASEBACKUP; goto stop_backup; } if (fputs(tablespace_map.data, tablespace_map_file) == EOF) { log_err(_("unable to write to tablespace_map file '%s'\n"), tablespace_map_filename.data); r = retval = ERR_BAD_BASEBACKUP; goto stop_backup; } 
fclose(tablespace_map_file); } } else { r = run_basebackup(local_data_directory); if (r != 0) { log_warning(_("standby clone: base backup failed\n")); retval = ERR_BAD_BASEBACKUP; goto stop_backup; } } /* * If configuration files were not inside the data directory, we'll need to * copy them via SSH (unless `--ignore-external-config-files` was provided) * * TODO: add option to place these files in the same location on the * standby server as on the primary? */ if (external_config_file_copy_required && !runtime_options.ignore_external_config_files) { log_notice(_("copying configuration files from master\n")); r = test_ssh_connection(runtime_options.host, runtime_options.remote_user); if (r != 0) { log_err(_("aborting, remote host %s is not reachable.\n"), runtime_options.host); retval = ERR_BAD_SSH; goto stop_backup; } if (config_file_outside_pgdata) { log_info(_("standby clone: master config file '%s'\n"), master_config_file); r = copy_remote_files(runtime_options.host, runtime_options.remote_user, master_config_file, local_config_file, false, server_version_num); if (r != 0) { log_err(_("standby clone: failed copying master config file '%s'\n"), master_config_file); retval = ERR_BAD_SSH; goto stop_backup; } } if (hba_file_outside_pgdata) { log_info(_("standby clone: master hba file '%s'\n"), master_hba_file); r = copy_remote_files(runtime_options.host, runtime_options.remote_user, master_hba_file, local_hba_file, false, server_version_num); if (r != 0) { log_err(_("standby clone: failed copying master hba file '%s'\n"), master_hba_file); retval = ERR_BAD_SSH; goto stop_backup; } } if (ident_file_outside_pgdata) { log_info(_("standby clone: master ident file '%s'\n"), master_ident_file); r = copy_remote_files(runtime_options.host, runtime_options.remote_user, master_ident_file, local_ident_file, false, server_version_num); if (r != 0) { log_err(_("standby clone: failed copying master ident file '%s'\n"), master_ident_file); retval = ERR_BAD_SSH; goto stop_backup; } 
} } /* * When using rsync, copy pg_control file last, emulating the base backup * protocol. */ if (runtime_options.rsync_only) { maxlen_snprintf(local_control_file, "%s/global", local_data_directory); log_info(_("standby clone: local control file '%s'\n"), local_control_file); if (!create_dir(local_control_file)) { log_err(_("couldn't create directory %s ...\n"), local_control_file); r = retval = ERR_BAD_CONFIG; goto stop_backup; } maxlen_snprintf(master_control_file, "%s/global/pg_control", master_data_directory); log_info(_("standby clone: master control file '%s'\n"), master_control_file); r = copy_remote_files(runtime_options.host, runtime_options.remote_user, master_control_file, local_control_file, false, server_version_num); if (r != 0) { log_warning(_("standby clone: failed copying master control file '%s'\n"), master_control_file); retval = ERR_BAD_SSH; goto stop_backup; } } stop_backup: if (runtime_options.rsync_only && pg_start_backup_executed) { log_notice(_("notifying master about backup completion...\n")); if (stop_backup(upstream_conn, last_wal_segment) == false) { r = ERR_BAD_BASEBACKUP; retval = ERR_BAD_BASEBACKUP; } } /* If the backup failed then exit */ if (r != 0) { /* If a replication slot was previously created, drop it */ if (options.use_replication_slots) { drop_replication_slot(upstream_conn, repmgr_slot_name); } log_err(_("unable to take a base backup of the master server\n")); log_warning(_("destination directory (%s) may need to be cleaned up manually\n"), local_data_directory); PQfinish(upstream_conn); exit(retval); } /* * Clean up any $PGDATA subdirectories which may contain * files which won't be removed by rsync and which could * be stale or are otherwise not required */ if (runtime_options.rsync_only && runtime_options.force) { char script[MAXLEN]; /* * Remove any existing WAL from the target directory, since * rsync's --exclude option doesn't do it. 
*/ maxlen_snprintf(script, "rm -rf %s/pg_xlog/*", local_data_directory); r = system(script); if (r != 0) { log_err(_("unable to empty local WAL directory %s/pg_xlog/\n"), local_data_directory); exit(ERR_BAD_RSYNC); } /* * Remove any replication slot directories; this matches the * behaviour of a base backup, which would result in an empty * pg_replslot directory. * * NOTE: watch out for any changes in the replication * slot directory name (as of 9.4: "pg_replslot") and * functionality of replication slots */ if (server_version_num >= 90400) { maxlen_snprintf(script, "rm -rf %s/pg_replslot/*", local_data_directory); r = system(script); if (r != 0) { log_err(_("unable to empty replication slot directory %s/pg_replslot/\n"), local_data_directory); exit(ERR_BAD_RSYNC); } } } /* Finally, write the recovery.conf file */ create_recovery_file(local_data_directory); if (runtime_options.rsync_only) { log_notice(_("standby clone (using rsync) complete\n")); } else { log_notice(_("standby clone (using pg_basebackup) complete\n")); } /* * XXX It might be nice to provide the following options: * - have repmgr start the daemon automatically * - provide a custom pg_ctl command */ log_notice(_("you can now start your PostgreSQL server\n")); if (target_directory_provided) { log_hint(_("for example : pg_ctl -D %s start\n"), local_data_directory); } else { log_hint(_("for example : /etc/init.d/postgresql start\n")); } /* Log the event */ initPQExpBuffer(&event_details); /* Add details about relevant runtime options used */ appendPQExpBuffer(&event_details, _("Cloned from host '%s', port %s"), runtime_options.host, runtime_options.masterport); appendPQExpBuffer(&event_details, _("; backup method: %s"), runtime_options.rsync_only ? "rsync" : "pg_basebackup"); appendPQExpBuffer(&event_details, _("; --force: %s"), runtime_options.force ? 
"Y" : "N"); create_event_record(upstream_conn, &options, options.node, "standby_clone", true, event_details.data); PQfinish(upstream_conn); exit(retval); } static void do_standby_promote(void) { PGconn *conn; char script[MAXLEN]; PGconn *old_master_conn; int r, retval; char data_dir[MAXLEN]; int i, promote_check_timeout = 60, promote_check_interval = 2; bool promote_success = false; bool success; PQExpBufferData details; /* We need to connect to check configuration */ log_info(_("connecting to standby database\n")); conn = establish_db_connection(options.conninfo, true); /* Verify that standby is a supported server version */ log_verbose(LOG_INFO, _("connected to standby, checking its state\n")); check_server_version(conn, "standby", true, NULL); /* Check we are in a standby node */ retval = is_standby(conn); if (retval == 0 || retval == -1) { log_err(_(retval == 0 ? "this command should be executed on a standby node\n" : "connection to node lost!\n")); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* we also need to check if there isn't any master already */ old_master_conn = get_master_connection(conn, options.cluster_name, NULL, NULL); if (old_master_conn != NULL) { log_err(_("this cluster already has an active master server\n")); PQfinish(old_master_conn); PQfinish(conn); exit(ERR_BAD_CONFIG); } log_notice(_("promoting standby\n")); /* Get the data directory */ success = get_pg_setting(conn, "data_directory", data_dir); PQfinish(conn); if (success == false) { log_err(_("unable to determine data directory\n")); exit(ERR_BAD_CONFIG); } /* * Promote standby to master. * * `pg_ctl promote` returns immediately and has no -w option, so we * can't be sure when or if the promotion completes. 
* For now we'll poll the server until the default timeout (60 seconds) */ maxlen_snprintf(script, "%s -D %s promote", make_pg_path("pg_ctl"), data_dir); log_notice(_("promoting server using '%s'\n"), script); r = system(script); if (r != 0) { log_err(_("unable to promote server from standby to master\n")); exit(ERR_NO_RESTART); } /* reconnect to check we got promoted */ log_info(_("reconnecting to promoted server\n")); conn = establish_db_connection(options.conninfo, true); for(i = 0; i < promote_check_timeout; i += promote_check_interval) { retval = is_standby(conn); if (!retval) { promote_success = true; break; } sleep(promote_check_interval); } if (promote_success == false) { log_err(_(retval == 1 ? "STANDBY PROMOTE failed, this is still a standby node.\n" : "connection to node lost!\n")); exit(ERR_FAILOVER_FAIL); } /* update node information to reflect new status */ if (update_node_record_set_master(conn, options.node) == false) { initPQExpBuffer(&details); appendPQExpBuffer(&details, _("unable to update node record for node %i"), options.node); log_err("%s\n", details.data); create_event_record(NULL, &options, options.node, "standby_promote", false, details.data); exit(ERR_DB_QUERY); } initPQExpBuffer(&details); appendPQExpBuffer(&details, "Node %i was successfully promoted to master", options.node); log_notice(_("STANDBY PROMOTE successful\n")); /* Log the event */ create_event_record(conn, &options, options.node, "standby_promote", true, details.data); PQfinish(conn); return; } static void do_standby_follow(void) { PGconn *conn; char script[MAXLEN]; char master_conninfo[MAXLEN]; PGconn *master_conn; int master_id; int r, retval; char data_dir[MAXLEN]; bool success; /* We need to connect to check configuration */ log_info(_("connecting to standby database\n")); conn = establish_db_connection(options.conninfo, true); log_verbose(LOG_INFO, _("connected to standby, checking its state\n")); /* Check we are in a standby node */ retval = is_standby(conn); if 
(retval == 0 || retval == -1) { log_err(_(retval == 0 ? "this command should be executed on a standby node\n" : "connection to node lost!\n")); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* * we also need to check if there is any master in the cluster or wait for * one to appear if we have set the wait option */ log_info(_("discovering new master...\n")); do { if (!is_pgup(conn, options.master_response_timeout)) { conn = establish_db_connection(options.conninfo, true); } master_conn = get_master_connection(conn, options.cluster_name, &master_id, (char *) &master_conninfo); } while (master_conn == NULL && runtime_options.wait_for_master); if (master_conn == NULL) { log_err(_("unable to determine new master node\n")); PQfinish(conn); exit(ERR_BAD_CONFIG); } /* Check we are going to point to a master */ retval = is_standby(master_conn); if (retval) { log_err(_(retval == 1 ? "the node to follow should be a master\n" : "connection to node lost!\n")); PQfinish(conn); PQfinish(master_conn); exit(ERR_BAD_CONFIG); } /* * Verify that standby and master are supported and compatible server * versions */ check_master_standby_version_match(conn, master_conn); /* * set the host and masterport variables with the master ones before * closing the connection because we will need them to recreate the * recovery.conf file */ strncpy(runtime_options.host, PQhost(master_conn), MAXLEN); strncpy(runtime_options.masterport, PQport(master_conn), MAXLEN); strncpy(runtime_options.username, PQuser(master_conn), MAXLEN); /* * If 9.4 or later, and replication slots in use, we'll need to create a * slot on the new master */ if (options.use_replication_slots) { if (create_replication_slot(master_conn, repmgr_slot_name) == false) { PQExpBufferData event_details; initPQExpBuffer(&event_details); appendPQExpBuffer(&event_details, _("Unable to create slot '%s' on the master node: %s"), repmgr_slot_name, PQerrorMessage(master_conn)); log_err("%s\n", event_details.data); create_event_record(master_conn, 
&options, options.node, "repmgr_follow", false, event_details.data); PQfinish(conn); PQfinish(master_conn); exit(ERR_DB_QUERY); } } log_info(_("changing standby's master\n")); /* Get the data directory full path */ success = get_pg_setting(conn, "data_directory", data_dir); PQfinish(conn); if (success == false) { log_err(_("unable to determine data directory\n")); exit(ERR_BAD_CONFIG); } /* write the recovery.conf file */ if (!create_recovery_file(data_dir)) exit(ERR_BAD_CONFIG); /* Finally, restart the service */ maxlen_snprintf(script, "%s %s -w -D %s -m fast restart", make_pg_path("pg_ctl"), options.pg_ctl_options, data_dir); log_notice(_("restarting server using '%s'\n"), script); r = system(script); if (r != 0) { log_err(_("unable to restart server\n")); exit(ERR_NO_RESTART); } if (update_node_record_set_upstream(master_conn, options.cluster_name, options.node, master_id) == false) { log_err(_("unable to update upstream node")); PQfinish(master_conn); exit(ERR_BAD_CONFIG); } PQfinish(master_conn); return; } static void do_witness_create(void) { PGconn *masterconn; PGconn *witnessconn; PGresult *res; char sqlquery[QUERY_STR_LEN]; char script[MAXLEN]; char buf[MAXLEN]; FILE *pg_conf = NULL; int r = 0, retval; char master_hba_file[MAXLEN]; bool success; bool record_created; PQconninfoOption *conninfo_options; PQconninfoOption *conninfo_option; /* Connection parameters for master only */ keywords[0] = "host"; values[0] = runtime_options.host; keywords[1] = "port"; values[1] = runtime_options.masterport; /* We need to connect to check configuration and copy it */ masterconn = establish_db_connection_by_params(keywords, values, true); if (!masterconn) { /* No event logging possible here as we can't connect to the master */ log_err(_("unable to connect to master\n")); exit(ERR_DB_CON); } /* Verify that master is a supported server version */ check_server_version(masterconn, "master", true, NULL); /* Check we are connecting to a primary node */ retval = 
is_standby(masterconn); if (retval) { PQExpBufferData errmsg; initPQExpBuffer(&errmsg); appendPQExpBuffer(&errmsg, "%s", _(retval == 1 ? "provided upstream node is not a master" : "connection to upstream node lost")); log_err("%s\n", errmsg.data); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg.data); PQfinish(masterconn); exit(ERR_BAD_CONFIG); } log_info(_("successfully connected to master.\n")); r = test_ssh_connection(runtime_options.host, runtime_options.remote_user); if (r != 0) { PQExpBufferData errmsg; initPQExpBuffer(&errmsg); appendPQExpBuffer(&errmsg, _("unable to connect to remote host '%s' via SSH"), runtime_options.host); log_err("%s\n", errmsg.data); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg.data); PQfinish(masterconn); exit(ERR_BAD_SSH); } /* Check this directory could be used as a PGDATA dir */ if (!create_witness_pg_dir(runtime_options.dest_dir, runtime_options.force)) { PQExpBufferData errmsg; initPQExpBuffer(&errmsg); appendPQExpBuffer(&errmsg, _("unable to create witness server data directory (\"%s\")"), runtime_options.dest_dir); log_err("%s\n", errmsg.data); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg.data); PQfinish(masterconn); exit(ERR_BAD_CONFIG); } /* * To create a witness server we need to: 1) initialize the cluster 2) * register the witness in repl_nodes 3) copy configuration from master */ /* Create the cluster for witness */ if (!runtime_options.superuser[0]) strncpy(runtime_options.superuser, "postgres", MAXLEN); sprintf(script, "%s %s -D %s init -o \"%s-U %s\"", make_pg_path("pg_ctl"), options.pg_ctl_options, runtime_options.dest_dir, runtime_options.initdb_no_pwprompt ? 
"" : "-W ", runtime_options.superuser); log_info(_("initializing cluster for witness: %s.\n"), script); r = system(script); if (r != 0) { char *errmsg = _("unable to initialize cluster for witness server"); log_err("%s\n", errmsg); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg); PQfinish(masterconn); exit(ERR_BAD_CONFIG); } xsnprintf(buf, sizeof(buf), "%s/postgresql.conf", runtime_options.dest_dir); pg_conf = fopen(buf, "a"); if (pg_conf == NULL) { PQExpBufferData errmsg; initPQExpBuffer(&errmsg); appendPQExpBuffer(&errmsg, _("unable to open \"%s\" to add additional configuration items: %s\n"), buf, strerror(errno)); log_err("%s\n", errmsg.data); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg.data); PQfinish(masterconn); exit(ERR_BAD_CONFIG); } xsnprintf(buf, sizeof(buf), "\n#Configuration added by %s\n", progname()); fputs(buf, pg_conf); /* Attempt to extract a port number from the provided conninfo string * This will override any value provided with '-l/--local-port', as it's * what we'll later try and connect to anyway. '-l/--local-port' should * be deprecated. */ conninfo_options = PQconninfoParse(options.conninfo, NULL); for (conninfo_option = conninfo_options; conninfo_option->keyword != NULL; conninfo_option++) { if (strcmp(conninfo_option->keyword, "port") == 0) { if (conninfo_option->val != NULL && conninfo_option->val[0] != '\0') { strncpy(runtime_options.localport, conninfo_option->val, MAXLEN); break; } } } PQconninfoFree(conninfo_options); /* * If not specified by the user, the default port for the witness server * is 5499; this is intended to support running the witness server as * a separate instance on a normal node server, rather than on its own * dedicated server. 
*/ if (!runtime_options.localport[0]) strncpy(runtime_options.localport, WITNESS_DEFAULT_PORT, MAXLEN); xsnprintf(buf, sizeof(buf), "port = %s\n", runtime_options.localport); fputs(buf, pg_conf); xsnprintf(buf, sizeof(buf), "shared_preload_libraries = 'repmgr_funcs'\n"); fputs(buf, pg_conf); xsnprintf(buf, sizeof(buf), "listen_addresses = '*'\n"); fputs(buf, pg_conf); fclose(pg_conf); /* start new instance */ sprintf(script, "%s %s -w -D %s start", make_pg_path("pg_ctl"), options.pg_ctl_options, runtime_options.dest_dir); log_info(_("starting witness server: %s\n"), script); r = system(script); if (r != 0) { char *errmsg = _("unable to start witness server"); log_err("%s\n", errmsg); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg); PQfinish(masterconn); exit(ERR_BAD_CONFIG); } /* check if we need to create a user */ if (runtime_options.username[0] && runtime_options.localport[0] && strcmp(runtime_options.username,"postgres") != 0) { /* create required user; needs to be superuser to create untrusted language function in c */ sprintf(script, "%s -p %s --superuser --login -U %s %s", make_pg_path("createuser"), runtime_options.localport, runtime_options.superuser, runtime_options.username); log_info(_("creating user for witness db: %s.\n"), script); r = system(script); if (r != 0) { char *errmsg = _("unable to create user for witness server"); log_err("%s\n", errmsg); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg); PQfinish(masterconn); exit(ERR_BAD_CONFIG); } } /* check if we need to create a database */ if (runtime_options.dbname[0] && strcmp(runtime_options.dbname,"postgres") != 0 && runtime_options.localport[0]) { /* create required db */ sprintf(script, "%s -p %s -U %s --owner=%s %s", make_pg_path("createdb"), runtime_options.localport, runtime_options.superuser, runtime_options.username, runtime_options.dbname); log_info("creating database for witness db: %s.\n", script); r = 
system(script); if (r != 0) { char *errmsg = _("Unable to create database for witness server"); log_err("%s\n", errmsg); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg); PQfinish(masterconn); exit(ERR_BAD_CONFIG); } } /* Get the pg_hba.conf full path */ success = get_pg_setting(masterconn, "hba_file", master_hba_file); if (success == false) { char *errmsg = _("unable to retrieve location of pg_hba.conf"); log_err("%s\n", errmsg); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg); exit(ERR_DB_QUERY); } r = copy_remote_files(runtime_options.host, runtime_options.remote_user, master_hba_file, runtime_options.dest_dir, false, -1); if (r != 0) { char *errmsg = _("unable to copy pg_hba.conf from master"); log_err("%s\n", errmsg); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg); PQfinish(masterconn); exit(ERR_BAD_CONFIG); } /* reload to adapt for changed pg_hba.conf */ sprintf(script, "%s %s -w -D %s reload", make_pg_path("pg_ctl"), options.pg_ctl_options, runtime_options.dest_dir); log_info(_("reloading witness server configuration: %s"), script); r = system(script); if (r != 0) { char *errmsg = _("unable to reload witness server"); log_err("%s\n", errmsg); create_event_record(masterconn, &options, options.node, "witness_create", false, errmsg); PQfinish(masterconn); exit(ERR_BAD_CONFIG); } /* register ourselves in the master */ if (runtime_options.force) { bool node_record_deleted = delete_node_record(masterconn, options.node, "witness create"); if (node_record_deleted == false) { PQfinish(masterconn); exit(ERR_BAD_CONFIG); } } record_created = create_node_record(masterconn, "witness create", options.node, "witness", NO_UPSTREAM_NODE, options.cluster_name, options.node_name, options.conninfo, options.priority, NULL); if (record_created == false) { create_event_record(masterconn, &options, options.node, "witness_create", false, "Unable to create 
witness node record on master"); PQfinish(masterconn); exit(ERR_DB_QUERY); } /* establish a connection to the witness, and create the schema */ witnessconn = establish_db_connection(options.conninfo, true); log_info(_("starting copy of configuration from master...\n")); begin_transaction(witnessconn); if (!create_schema(witnessconn)) { rollback_transaction(witnessconn); create_event_record(masterconn, &options, options.node, "witness_create", false, _("unable to create schema on witness")); PQfinish(masterconn); PQfinish(witnessconn); exit(ERR_BAD_CONFIG); } commit_transaction(witnessconn); /* copy configuration from master, only repl_nodes is needed */ if (!copy_configuration(masterconn, witnessconn, options.cluster_name)) { create_event_record(masterconn, &options, options.node, "witness_create", false, _("Unable to copy configuration from master")); PQfinish(masterconn); PQfinish(witnessconn); exit(ERR_BAD_CONFIG); } /* drop superuser powers if needed */ if (runtime_options.username[0] && runtime_options.localport[0] && strcmp(runtime_options.username,"postgres") != 0) { sqlquery_snprintf(sqlquery, "ALTER ROLE %s NOSUPERUSER", runtime_options.username); log_info(_("revoking superuser status on user %s: %s.\n"), runtime_options.username, sqlquery); log_debug(_("witness create: %s\n"), sqlquery); res = PQexec(witnessconn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("unable to alter user privileges for user %s: %s\n"), runtime_options.username, PQerrorMessage(witnessconn)); PQfinish(masterconn); PQfinish(witnessconn); exit(ERR_DB_QUERY); } } /* Log the event */ create_event_record(masterconn, &options, options.node, "witness_create", true, NULL); PQfinish(masterconn); PQfinish(witnessconn); log_notice(_("configuration has been successfully copied to the witness\n")); } static void help(void) { printf(_("%s: replication management tool for PostgreSQL\n"), progname()); printf(_("\n")); printf(_("Usage:\n")); printf(_(" %s [OPTIONS] 
master register\n"), progname()); printf(_(" %s [OPTIONS] standby {register|unregister|clone|promote|follow}\n"), progname()); printf(_(" %s [OPTIONS] cluster {show|cleanup}\n"), progname()); printf(_("\n")); printf(_("General options:\n")); printf(_(" -?, --help show this help, then exit\n")); printf(_(" -V, --version output version information, then exit\n")); printf(_("\n")); printf(_("Logging options:\n")); printf(_(" -L, --log-level set log level (overrides configuration file)\n")); printf(_(" -v, --verbose display additional log output (useful for debugging)\n")); printf(_(" -t, --terse don't display hints and other non-critical output\n")); printf(_("\n")); printf(_("Connection options:\n")); printf(_(" -d, --dbname=DBNAME database to connect to\n")); printf(_(" -h, --host=HOSTNAME database server host or socket directory\n")); printf(_(" -p, --port=PORT database server port\n")); printf(_(" -U, --username=USERNAME database user name to connect as\n")); printf(_("\n")); printf(_("General configuration options:\n")); printf(_(" -b, --pg_bindir=PATH path to PostgreSQL binaries (optional)\n")); printf(_(" -D, --data-dir=DIR local directory where the files will be\n" \ " copied to\n")); printf(_(" -f, --config-file=PATH path to the configuration file\n")); printf(_(" -R, --remote-user=USERNAME database server username for rsync\n")); printf(_(" -F, --force force potentially dangerous operations to happen\n")); printf(_(" --check-upstream-config verify upstream server configuration\n")); printf(_("\n")); printf(_("Command-specific configuration options:\n")); printf(_(" -c, --fast-checkpoint (standby clone) force fast checkpoint\n")); printf(_(" -r, --rsync-only (standby clone) use only rsync, not pg_basebackup\n")); printf(_(" --recovery-min-apply-delay=VALUE (standby clone, follow) set recovery_min_apply_delay\n" \ " in recovery.conf (PostgreSQL 9.4 and later)\n")); printf(_(" --ignore-external-config-files (standby clone) don't copy configuration files 
located\n" \ " outside the data directory when cloning a standby\n")); printf(_(" -w, --wal-keep-segments=VALUE (standby clone) minimum value for the GUC\n" \ " wal_keep_segments (default: %s)\n"), DEFAULT_WAL_KEEP_SEGMENTS); printf(_(" -W, --wait (standby follow) wait for a master to appear\n")); printf(_(" -k, --keep-history=VALUE (cluster cleanup) retain indicated number of days of history\n")); printf(_(" --initdb-no-pwprompt (witness server) no superuser password prompt during initdb\n")); /* remove this line in the next significant release */ printf(_(" -l, --local-port=PORT (witness server) witness server local port, default: %s \n" \ " (DEPRECATED, put port in conninfo)\n"), WITNESS_DEFAULT_PORT); printf(_(" -S, --superuser=USERNAME (witness server) superuser username for witness database\n" \ " (default: postgres)\n")); printf(_("\n")); printf(_("%s performs the following node management tasks:\n"), progname()); printf(_("\n")); printf(_("COMMANDS:\n")); printf(_(" master register - registers the master in a cluster\n")); printf(_(" standby clone [node] - creates a new standby\n")); printf(_(" standby register - registers a standby in a cluster\n")); printf(_(" standby unregister - unregisters a standby in a cluster\n")); printf(_(" standby promote - promotes a specific standby to master\n")); printf(_(" standby follow - makes standby follow a new master\n")); printf(_(" witness create - creates a new witness server\n")); printf(_(" cluster show - displays information about cluster nodes\n")); printf(_(" cluster cleanup - prunes or truncates monitoring history\n" \ " (monitoring history creation requires repmgrd\n" \ " with --monitoring-history option)\n")); } /* * Creates a recovery file for a standby. 
*/ static bool create_recovery_file(const char *data_dir) { FILE *recovery_file; char recovery_file_path[MAXLEN]; char line[MAXLEN]; maxlen_snprintf(recovery_file_path, "%s/%s", data_dir, RECOVERY_FILE); recovery_file = fopen(recovery_file_path, "w"); if (recovery_file == NULL) { log_err(_("unable to create recovery.conf file at '%s'\n"), recovery_file_path); return false; } log_debug(_("create_recovery_file(): creating '%s'...\n"), recovery_file_path); /* standby_mode = 'on' */ maxlen_snprintf(line, "standby_mode = 'on'\n"); if (write_recovery_file_line(recovery_file, recovery_file_path, line) == false) return false; log_debug(_("recovery.conf: %s"), line); /* primary_conninfo = '...' */ write_primary_conninfo(line); if (write_recovery_file_line(recovery_file, recovery_file_path, line) == false) return false; log_debug(_("recovery.conf: %s"), line); /* recovery_target_timeline = 'latest' */ maxlen_snprintf(line, "recovery_target_timeline = 'latest'\n"); if (write_recovery_file_line(recovery_file, recovery_file_path, line) == false) return false; log_debug(_("recovery.conf: %s"), line); /* recovery_min_apply_delay = ... (optional) */ if (*runtime_options.recovery_min_apply_delay) { maxlen_snprintf(line, "recovery_min_apply_delay = %s\n", runtime_options.recovery_min_apply_delay); if (write_recovery_file_line(recovery_file, recovery_file_path, line) == false) return false; log_debug(_("recovery.conf: %s"), line); } /* primary_slot_name = '...' 
(optional, for 9.4 and later) */ if (options.use_replication_slots) { maxlen_snprintf(line, "primary_slot_name = %s\n", repmgr_slot_name); if (write_recovery_file_line(recovery_file, recovery_file_path, line) == false) return false; log_debug(_("recovery.conf: %s"), line); } fclose(recovery_file); return true; } static bool write_recovery_file_line(FILE *recovery_file, char *recovery_file_path, char *line) { if (fputs(line, recovery_file) == EOF) { log_err(_("unable to write to recovery file at '%s'\n"), recovery_file_path); fclose(recovery_file); return false; } return true; } static int test_ssh_connection(char *host, char *remote_user) { char script[MAXLEN]; int r = 1, i; /* On some OS, true is located in a different place than in Linux * we have to try them all until all alternatives are gone or we * found `true' because the target OS may differ from the source * OS */ const char *truebin_paths[] = { "/bin/true", "/usr/bin/true", NULL }; /* Check if we have ssh connectivity to host before trying to rsync */ for(i = 0; truebin_paths[i] && r != 0; ++i) { if (!remote_user[0]) maxlen_snprintf(script, "ssh -o Batchmode=yes %s %s %s", options.ssh_options, host, truebin_paths[i]); else maxlen_snprintf(script, "ssh -o Batchmode=yes %s %s -l %s %s", options.ssh_options, host, remote_user, truebin_paths[i]); log_debug(_("command is: %s\n"), script); r = system(script); } if (r != 0) log_info(_("unable to connect to remote host (%s)\n"), host); return r; } static int copy_remote_files(char *host, char *remote_user, char *remote_path, char *local_path, bool is_directory, int server_version_num) { PQExpBufferData rsync_flags; char script[MAXLEN]; char host_string[MAXLEN]; int r; initPQExpBuffer(&rsync_flags); if (*options.rsync_options == '\0') { appendPQExpBuffer(&rsync_flags, "%s", "--archive --checksum --compress --progress --rsh=ssh"); } else { appendPQExpBuffer(&rsync_flags, "%s", options.rsync_options); } if (runtime_options.force) { appendPQExpBuffer(&rsync_flags, 
"%s", " --delete --checksum"); } if (!remote_user[0]) { maxlen_snprintf(host_string, "%s", host); } else { maxlen_snprintf(host_string, "%s@%s", remote_user, host); } /* * When copying the main PGDATA directory, certain files and contents * of certain directories need to be excluded. * * See function 'sendDir()' in 'src/backend/replication/basebackup.c' - * we're basically simulating what pg_basebackup does, but with rsync rather * than the BASEBACKUP replication protocol command. */ if (is_directory) { /* Files which we don't want */ appendPQExpBuffer(&rsync_flags, "%s", " --exclude=postmaster.pid --exclude=postmaster.opts --exclude=global/pg_control"); if (server_version_num >= 90400) { /* * Ideally we'd use PG_AUTOCONF_FILENAME from utils/guc.h, but * that has too many dependencies for a mere client program. */ appendPQExpBuffer(&rsync_flags, "%s", " --exclude=postgresql.auto.conf.tmp"); } /* Temporary files which we don't want, if they exist */ appendPQExpBuffer(&rsync_flags, " --exclude=%s*", PG_TEMP_FILE_PREFIX); /* Directories which we don't want */ appendPQExpBuffer(&rsync_flags, "%s", " --exclude=pg_xlog/* --exclude=pg_log/* --exclude=pg_stat_tmp/*"); if (server_version_num >= 90400) { appendPQExpBuffer(&rsync_flags, "%s", " --exclude=pg_replslot/*"); } maxlen_snprintf(script, "rsync %s %s:%s/* %s", rsync_flags.data, host_string, remote_path, local_path); } else { maxlen_snprintf(script, "rsync %s %s:%s %s", rsync_flags.data, host_string, remote_path, local_path); } log_info(_("rsync command line: '%s'\n"), script); r = system(script); if (r != 0) log_err(_("unable to rsync from remote host (%s:%s)\n"), host_string, remote_path); return r; } static int run_basebackup(const char *data_dir) { char script[MAXLEN]; int r = 0; PQExpBufferData params; TablespaceListCell *cell; /* Create pg_basebackup command line options */ initPQExpBuffer(&params); appendPQExpBuffer(&params, " -D %s", data_dir); if (strlen(runtime_options.host)) { appendPQExpBuffer(&params, " -h %s", 
runtime_options.host); } if (strlen(runtime_options.masterport)) { appendPQExpBuffer(&params, " -p %s", runtime_options.masterport); } if (strlen(runtime_options.username)) { appendPQExpBuffer(&params, " -U %s", runtime_options.username); } if (runtime_options.fast_checkpoint) { appendPQExpBuffer(&params, " -c fast"); } if (options.tablespace_mapping.head != NULL) { for (cell = options.tablespace_mapping.head; cell; cell = cell->next) { appendPQExpBuffer(&params, " -T %s=%s", cell->old_dir, cell->new_dir); } } maxlen_snprintf(script, "%s -l \"repmgr base backup\" %s %s", make_pg_path("pg_basebackup"), params.data, options.pg_basebackup_options); termPQExpBuffer(&params); log_info(_("executing: '%s'\n"), script); /* * As of 9.4, pg_basebackup only ever returns 0 or 1 */ r = system(script); return r; } /* * Check for useless or conflicting parameters, and also whether a * configuration file is required. */ static void check_parameters_for_action(const int action) { switch (action) { case MASTER_REGISTER: /* * To register a master we only need the repmgr.conf all other * parameters are at least useless and could be confusing so * reject them */ if (runtime_options.host[0] || runtime_options.masterport[0] || runtime_options.username[0] || runtime_options.dbname[0]) { error_list_append(&cli_warnings, _("master connection parameters not required when executing MASTER REGISTER")); } if (runtime_options.dest_dir[0]) { error_list_append(&cli_warnings, _("destination directory not required when executing MASTER REGISTER")); } break; case STANDBY_REGISTER: /* * To register a standby we only need the repmgr.conf we don't * need connection parameters to the master because we can detect * the master in repl_nodes */ if (runtime_options.host[0] || runtime_options.masterport[0] || runtime_options.username[0] || runtime_options.dbname[0]) { error_list_append(&cli_warnings, _("master connection parameters not required when executing STANDBY REGISTER")); } if (runtime_options.dest_dir[0]) { 
error_list_append(&cli_warnings, _("destination directory not required when executing STANDBY REGISTER")); } break; case STANDBY_UNREGISTER: /* * To unregister a standby we only need the repmgr.conf we don't * need connection parameters to the master because we can detect * the master in repl_nodes */ if (runtime_options.host[0] || runtime_options.masterport[0] || runtime_options.username[0] || runtime_options.dbname[0]) { error_list_append(&cli_warnings, _("master connection parameters not required when executing STANDBY UNREGISTER")); } if (runtime_options.dest_dir[0]) { error_list_append(&cli_warnings, _("destination directory not required when executing STANDBY UNREGISTER")); } break; case STANDBY_PROMOTE: /* * To promote a standby we only need the repmgr.conf we don't want * connection parameters to the master because we will try to * detect the master in repl_nodes if we can't find it then the * promote action will be cancelled */ if (runtime_options.host[0] || runtime_options.masterport[0] || runtime_options.username[0] || runtime_options.dbname[0]) { error_list_append(&cli_warnings, _("master connection parameters not required when executing STANDBY PROMOTE")); } if (runtime_options.dest_dir[0]) { error_list_append(&cli_warnings, _("destination directory not required when executing STANDBY PROMOTE")); } break; case STANDBY_FOLLOW: /* * To make a standby follow a master we only need the repmgr.conf * we don't want connection parameters to the new master because * we will try to detect the master in repl_nodes if we can't find * it then the follow action will be cancelled */ if (runtime_options.host[0] || runtime_options.masterport[0] || runtime_options.username[0] || runtime_options.dbname[0]) { error_list_append(&cli_warnings, _("master connection parameters not required when executing STANDBY FOLLOW")); } if (runtime_options.dest_dir[0]) { error_list_append(&cli_warnings, _("destination directory not required when executing STANDBY FOLLOW")); } break; case 
STANDBY_CLONE: /* * Explicitly require connection information for standby clone - * this will be written into `recovery.conf` so it's important to * specify it explicitly */ if (strcmp(runtime_options.host, "") == 0) { error_list_append(&cli_errors, _("master hostname (-h/--host) required when executing STANDBY CLONE")); } if (strcmp(runtime_options.dbname, "") == 0) { error_list_append(&cli_errors, _("master database name (-d/--dbname) required when executing STANDBY CLONE")); } if (strcmp(runtime_options.username, "") == 0) { error_list_append(&cli_errors, _("master database username (-U/--username) required when executing STANDBY CLONE")); } config_file_required = false; break; case WITNESS_CREATE: /* allow all parameters to be supplied */ break; case CLUSTER_SHOW: /* allow all parameters to be supplied */ break; case CLUSTER_CLEANUP: /* allow all parameters to be supplied */ break; } /* Warn about parameters which apply to STANDBY CLONE only */ if (action != STANDBY_CLONE) { if (runtime_options.fast_checkpoint) { error_list_append(&cli_warnings, _("-c/--fast-checkpoint can only be used when executing STANDBY CLONE")); } if (runtime_options.ignore_external_config_files) { error_list_append(&cli_warnings, _("--ignore-external-config-files can only be used when executing STANDBY CLONE")); } if (*runtime_options.recovery_min_apply_delay) { error_list_append(&cli_warnings, _("--recovery-min-apply-delay can only be used when executing STANDBY CLONE")); } if (runtime_options.rsync_only) { error_list_append(&cli_warnings, _("-r/--rsync-only can only be used when executing STANDBY CLONE")); } if (wal_keep_segments_used) { error_list_append(&cli_warnings, _("-w/--wal-keep-segments can only be used when executing STANDBY CLONE")); } } return; } /* The caller should wrap this function in a transaction */ static bool create_schema(PGconn *conn) { char sqlquery[QUERY_STR_LEN]; PGresult *res; /* create schema */ sqlquery_snprintf(sqlquery, "CREATE SCHEMA %s", 
get_repmgr_schema_quoted(conn)); log_debug(_("master register: %s\n"), sqlquery); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("unable to create the schema %s: %s\n"), get_repmgr_schema(), PQerrorMessage(conn)); if (res != NULL) PQclear(res); return false; } PQclear(res); /* create functions */ /* * to avoid confusion of the time_lag field and provide a consistent UI we * use these functions for providing the latest update timestamp */ sqlquery_snprintf(sqlquery, "CREATE FUNCTION %s.repmgr_update_last_updated() " " RETURNS TIMESTAMP WITH TIME ZONE " " AS '$libdir/repmgr_funcs', 'repmgr_update_last_updated' " " LANGUAGE C STRICT ", get_repmgr_schema_quoted(conn)); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("unable to create the function repmgr_update_last_updated: %s\n"), PQerrorMessage(conn)); if (res != NULL) PQclear(res); return false; } PQclear(res); sqlquery_snprintf(sqlquery, "CREATE FUNCTION %s.repmgr_get_last_updated() " " RETURNS TIMESTAMP WITH TIME ZONE " " AS '$libdir/repmgr_funcs', 'repmgr_get_last_updated' " " LANGUAGE C STRICT ", get_repmgr_schema_quoted(conn)); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("unable to create the function repmgr_get_last_updated: %s\n"), PQerrorMessage(conn)); if (res != NULL) PQclear(res); return false; } PQclear(res); /* Create tables */ /* CREATE TABLE repl_nodes */ sqlquery_snprintf(sqlquery, "CREATE TABLE %s.repl_nodes ( " " id INTEGER PRIMARY KEY, " " type TEXT NOT NULL CHECK (type IN('master','standby','witness')), " " upstream_node_id INTEGER NULL REFERENCES %s.repl_nodes (id), " " cluster TEXT NOT NULL, " " name TEXT NOT NULL, " " conninfo TEXT NOT NULL, " " slot_name TEXT NULL, " " priority INTEGER NOT NULL, " " active BOOLEAN NOT NULL DEFAULT TRUE )", get_repmgr_schema_quoted(conn), get_repmgr_schema_quoted(conn)); log_debug(_("master register: %s\n"), 
sqlquery); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("unable to create table '%s.repl_nodes': %s\n"), get_repmgr_schema_quoted(conn), PQerrorMessage(conn)); if (res != NULL) PQclear(res); return false; } PQclear(res); /* CREATE TABLE repl_monitor */ sqlquery_snprintf(sqlquery, "CREATE TABLE %s.repl_monitor ( " " primary_node INTEGER NOT NULL, " " standby_node INTEGER NOT NULL, " " last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL, " " last_apply_time TIMESTAMP WITH TIME ZONE, " " last_wal_primary_location TEXT NOT NULL, " " last_wal_standby_location TEXT, " " replication_lag BIGINT NOT NULL, " " apply_lag BIGINT NOT NULL) ", get_repmgr_schema_quoted(conn)); log_debug(_("master register: %s\n"), sqlquery); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("unable to create table '%s.repl_monitor': %s\n"), get_repmgr_schema_quoted(conn), PQerrorMessage(conn)); if (res != NULL) PQclear(res); return false; } PQclear(res); /* CREATE TABLE repl_events */ sqlquery_snprintf(sqlquery, "CREATE TABLE %s.repl_events ( " " node_id INTEGER NOT NULL, " " event TEXT NOT NULL, " " successful BOOLEAN NOT NULL DEFAULT TRUE, " " event_timestamp TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP, " " details TEXT NULL " " ) ", get_repmgr_schema_quoted(conn)); log_debug(_("master register: %s\n"), sqlquery); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("unable to create table '%s.repl_events': %s\n"), get_repmgr_schema_quoted(conn), PQerrorMessage(conn)); if (res != NULL) PQclear(res); return false; } PQclear(res); /* CREATE VIEW repl_status */ sqlquery_snprintf(sqlquery, "CREATE VIEW %s.repl_status AS " " SELECT m.primary_node, m.standby_node, n.name AS standby_name, " " n.type AS node_type, n.active, last_monitor_time, " " CASE WHEN n.type='standby' THEN m.last_wal_primary_location ELSE NULL END AS last_wal_primary_location, " " 
m.last_wal_standby_location, " " CASE WHEN n.type='standby' THEN pg_size_pretty(m.replication_lag) ELSE NULL END AS replication_lag, " " CASE WHEN n.type='standby' THEN age(now(), m.last_apply_time) ELSE NULL END AS replication_time_lag, " " CASE WHEN n.type='standby' THEN pg_size_pretty(m.apply_lag) ELSE NULL END AS apply_lag, " " age(now(), CASE WHEN pg_is_in_recovery() THEN %s.repmgr_get_last_updated() ELSE m.last_monitor_time END) AS communication_time_lag " " FROM %s.repl_monitor m " " JOIN %s.repl_nodes n ON m.standby_node = n.id " " WHERE (m.standby_node, m.last_monitor_time) IN ( " " SELECT m1.standby_node, MAX(m1.last_monitor_time) " " FROM %s.repl_monitor m1 GROUP BY 1 " " )", get_repmgr_schema_quoted(conn), get_repmgr_schema_quoted(conn), get_repmgr_schema_quoted(conn), get_repmgr_schema_quoted(conn), get_repmgr_schema_quoted(conn)); log_debug(_("master register: %s\n"), sqlquery); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("unable to create view %s.repl_status: %s\n"), get_repmgr_schema_quoted(conn), PQerrorMessage(conn)); if (res != NULL) PQclear(res); return false; } PQclear(res); /* an index to improve performance of the view */ sqlquery_snprintf(sqlquery, "CREATE INDEX idx_repl_status_sort " " ON %s.repl_monitor (last_monitor_time, standby_node) ", get_repmgr_schema_quoted(conn)); log_debug(_("master register: %s\n"), sqlquery); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("unable to create index 'idx_repl_status_sort' on '%s.repl_monitor': %s\n"), get_repmgr_schema_quoted(conn), PQerrorMessage(conn)); if (res != NULL) PQclear(res); return false; } PQclear(res); /* * XXX Here we MUST try to load the repmgr_function.sql not hardcode it * here */ sqlquery_snprintf(sqlquery, "CREATE OR REPLACE FUNCTION %s.repmgr_update_standby_location(text) " " RETURNS boolean " " AS '$libdir/repmgr_funcs', 'repmgr_update_standby_location' " " LANGUAGE C STRICT ", 
get_repmgr_schema_quoted(conn)); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { fprintf(stderr, "Cannot create the function repmgr_update_standby_location: %s\n", PQerrorMessage(conn)); if (res != NULL) PQclear(res); return false; } PQclear(res); sqlquery_snprintf(sqlquery, "CREATE OR REPLACE FUNCTION %s.repmgr_get_last_standby_location() " " RETURNS text " " AS '$libdir/repmgr_funcs', 'repmgr_get_last_standby_location' " " LANGUAGE C STRICT ", get_repmgr_schema_quoted(conn)); res = PQexec(conn, sqlquery); if (!res || PQresultStatus(res) != PGRES_COMMAND_OK) { fprintf(stderr, "Cannot create the function repmgr_get_last_standby_location: %s\n", PQerrorMessage(conn)); if (res != NULL) PQclear(res); return false; } PQclear(res); return true; } /* This function uses global variables to determine connection settings. Special * usage of the PGPASSWORD variable is handled, but strongly discouraged */ static void write_primary_conninfo(char *line) { char host_buf[MAXLEN] = ""; char conn_buf[MAXLEN] = ""; char user_buf[MAXLEN] = ""; char appname_buf[MAXLEN] = ""; char password_buf[MAXLEN] = ""; /* Environment variable for password (UGLY, please use .pgpass!) */ const char *password = getenv("PGPASSWORD"); if (password != NULL) { maxlen_snprintf(password_buf, " password=%s", password); } if (runtime_options.host[0]) { maxlen_snprintf(host_buf, " host=%s", runtime_options.host); } if (runtime_options.username[0]) { maxlen_snprintf(user_buf, " user=%s", runtime_options.username); } if (options.node_name[0]) { maxlen_snprintf(appname_buf, " application_name=%s", options.node_name); } maxlen_snprintf(conn_buf, "port=%s%s%s%s%s", (runtime_options.masterport[0]) ? 
runtime_options.masterport : DEF_PGPORT_STR, host_buf, user_buf, password_buf, appname_buf); maxlen_snprintf(line, "primary_conninfo = '%s'\n", conn_buf); } /** * check_server_version() * * Verify that the server is MIN_SUPPORTED_VERSION_NUM or later * * PGconn *conn: * the connection to check * * char *server_type: * either "master" or "standby"; used to format error message * * bool exit_on_error: * exit if reported server version is too low; optional to enable some callers * to perform additional cleanup * * char *server_version_string: * passed to get_server_version(), which will place the human-readable * server version string there (e.g. "9.4.0") */ static int check_server_version(PGconn *conn, char *server_type, bool exit_on_error, char *server_version_string) { int server_version_num = 0; server_version_num = get_server_version(conn, server_version_string); if (server_version_num < MIN_SUPPORTED_VERSION_NUM) { if (server_version_num > 0) log_err(_("%s requires %s to be PostgreSQL %s or later\n"), progname(), server_type, MIN_SUPPORTED_VERSION ); if (exit_on_error == true) { PQfinish(conn); exit(ERR_BAD_CONFIG); } return -1; } return server_version_num; } /* * check_master_standby_version_match() * * Check server versions of supplied connections are compatible for * replication purposes. * * Exits on error. 
*/ static void check_master_standby_version_match(PGconn *conn, PGconn *master_conn) { char standby_version[MAXVERSIONSTR]; int standby_version_num = 0; char master_version[MAXVERSIONSTR]; int master_version_num = 0; standby_version_num = check_server_version(conn, "standby", true, standby_version); /* Verify that master is a supported server version */ master_version_num = check_server_version(master_conn, "master", false, master_version); if (master_version_num < 0) { PQfinish(conn); PQfinish(master_conn); exit(ERR_BAD_CONFIG); } /* master and standby version should match */ if ((master_version_num / 100) != (standby_version_num / 100)) { PQfinish(conn); PQfinish(master_conn); log_err(_("PostgreSQL versions on master (%s) and standby (%s) must match.\n"), master_version, standby_version); exit(ERR_BAD_CONFIG); } } /* * check_upstream_config() * * Perform sanity check on upstream server configuration * * TODO: * - check replication connection is possible * - check user is qualified to perform base backup */ static bool check_upstream_config(PGconn *conn, int server_version_num, bool exit_on_error) { int i; bool config_ok = true; char *wal_error_message = NULL; /* Check that WAL level is set correctly */ if (server_version_num < 90300) { i = guc_set(conn, "wal_level", "=", "hot_standby"); wal_error_message = _("parameter 'wal_level' must be set to 'hot_standby'"); } else { char *levels[] = { "hot_standby", "logical", }; int j = 0; wal_error_message = _("parameter 'wal_level' must be set to 'hot_standby' or 'logical'"); for(; j < 2; j++) { i = guc_set(conn, "wal_level", "=", levels[j]); if (i) { break; } } } if (i == 0 || i == -1) { if (i == 0) log_err("%s\n", wal_error_message); if (exit_on_error == true) { PQfinish(conn); exit(ERR_BAD_CONFIG); } config_ok = false; } if (options.use_replication_slots) { /* Does the server support physical replication slots? 
*/ if (server_version_num < 90400) { log_err(_("server version must be 9.4 or later to enable replication slots\n")); if (exit_on_error == true) { PQfinish(conn); exit(ERR_BAD_CONFIG); } config_ok = false; } /* Server is 9.4 or greater - non-zero `max_replication_slots` required */ else { i = guc_set_typed(conn, "max_replication_slots", ">", "0", "integer"); if (i == 0 || i == -1) { if (i == 0) { log_err(_("parameter 'max_replication_slots' must be set to at least 1 to enable replication slots\n")); log_hint(_("'max_replication_slots' should be set to at least the number of expected standbys\n")); if (exit_on_error == true) { PQfinish(conn); exit(ERR_BAD_CONFIG); } config_ok = false; } } } } /* * physical replication slots not available or not requested - * ensure some reasonably high value set for `wal_keep_segments` */ else { i = guc_set_typed(conn, "wal_keep_segments", ">=", runtime_options.wal_keep_segments, "integer"); if (i == 0 || i == -1) { if (i == 0) { log_err(_("parameter 'wal_keep_segments' must be set to %s or greater (see the '-w' option or edit the postgresql.conf of the upstream server.)\n"), runtime_options.wal_keep_segments); if (server_version_num >= 90400) { log_hint(_("in PostgreSQL 9.4 and later, replication slots can be used, which " "do not require 'wal_keep_segments' to be set to a high value " "(set parameter 'use_replication_slots' in the configuration file to enable)\n" )); } } if (exit_on_error == true) { PQfinish(conn); exit(ERR_BAD_CONFIG); } config_ok = false; } } i = guc_set(conn, "archive_mode", "=", "on"); if (i == 0 || i == -1) { if (i == 0) log_err(_("parameter 'archive_mode' must be set to 'on'\n")); if (exit_on_error == true) { PQfinish(conn); exit(ERR_BAD_CONFIG); } config_ok = false; } /* * check that 'archive_command' is non-empty (however it's not practical to * check that it's actually valid) * * if 'archive_mode' is not on, pg_settings returns '(disabled)' regardless * of what's in 'archive_command', so until 
'archive_mode' is on we can't * properly check it. */ if (guc_set(conn, "archive_mode", "=", "on")) { i = guc_set(conn, "archive_command", "!=", ""); if (i == 0 || i == -1) { if (i == 0) log_err(_("parameter 'archive_command' must be set to a valid command\n")); if (exit_on_error == true) { PQfinish(conn); exit(ERR_BAD_CONFIG); } config_ok = false; } } /* * Check that 'hot_standby' is on. This isn't strictly necessary * for the primary server, however the assumption is that configuration * should be consistent for all servers in a cluster. */ i = guc_set(conn, "hot_standby", "=", "on"); if (i == 0 || i == -1) { if (i == 0) log_err(_("parameter 'hot_standby' must be set to 'on'\n")); if (exit_on_error == true) { PQfinish(conn); exit(ERR_BAD_CONFIG); } config_ok = false; } i = guc_set_typed(conn, "max_wal_senders", ">", "0", "integer"); if (i == 0 || i == -1) { if (i == 0) { log_err(_("parameter 'max_wal_senders' must be set to be at least 1\n")); log_hint(_("'max_wal_senders' should be set to at least the number of expected standbys\n")); } if (exit_on_error == true) { PQfinish(conn); exit(ERR_BAD_CONFIG); } config_ok = false; } return config_ok; } static bool update_node_record_set_master(PGconn *conn, int this_node_id) { PGresult *res; char sqlquery[QUERY_STR_LEN]; log_debug(_("setting node %i as master and marking existing master as failed\n"), this_node_id); begin_transaction(conn); sqlquery_snprintf(sqlquery, " UPDATE %s.repl_nodes " " SET active = FALSE " " WHERE cluster = '%s' " " AND type = 'master' " " AND active IS TRUE ", get_repmgr_schema_quoted(conn), options.cluster_name); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("Unable to set old master node as inactive: %s\n"), PQerrorMessage(conn)); PQclear(res); rollback_transaction(conn); return false; } PQclear(res); sqlquery_snprintf(sqlquery, " UPDATE %s.repl_nodes " " SET type = 'master', " " upstream_node_id = NULL " " WHERE cluster = '%s' " " AND id = %i ", 
get_repmgr_schema_quoted(conn), options.cluster_name, this_node_id); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_COMMAND_OK) { log_err(_("Unable to set current node %i as active master: %s\n"), this_node_id, PQerrorMessage(conn)); PQclear(res); rollback_transaction(conn); return false; } PQclear(res); return commit_transaction(conn); } static void do_check_upstream_config(void) { PGconn *conn; bool config_ok; int server_version_num; parse_config(&options); /* Connection parameters for upstream server only */ keywords[0] = "host"; values[0] = runtime_options.host; keywords[1] = "port"; values[1] = runtime_options.masterport; keywords[2] = "dbname"; values[2] = runtime_options.dbname; /* We need to connect to check configuration and start a backup */ log_info(_("connecting to upstream server\n")); conn = establish_db_connection_by_params(keywords, values, true); /* Verify that upstream server is a supported server version */ log_verbose(LOG_INFO, _("connected to upstream server, checking its state\n")); server_version_num = check_server_version(conn, "upstream server", false, NULL); config_ok = check_upstream_config(conn, server_version_num, false); if (config_ok == true) { puts(_("No configuration problems found with the upstream server")); } PQfinish(conn); } static char * make_pg_path(char *file) { maxlen_snprintf(path_buf, "%s%s", pg_bindir, file); return path_buf; } static void exit_with_errors(void) { fprintf(stderr, _("%s: following command line errors were encountered.\n"), progname()); print_error_list(&cli_errors, LOG_ERR); fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname()); exit(ERR_BAD_CONFIG); } static void print_error_list(ErrorList *error_list, int log_level) { ErrorListCell *cell; for (cell = error_list->head; cell; cell = cell->next) { switch(log_level) { /* Currently we only need errors and warnings */ case LOG_ERR: log_err("%s\n", cell->error_message); break; case LOG_WARNING: log_warning("%s\n", 
cell->error_message); break; } } } repmgr-3.0.3/repmgr.conf.sample000066400000000000000000000113231264264412200164370ustar00rootroot00000000000000################################################### # Replication Manager sample configuration file ################################################### # Required configuration items # ============================ # # repmgr and repmgrd require these items to be configured: # Cluster name - this will be used by repmgr to generate its internal # schema (pattern: "repmgr_{cluster}"); while this name will be quoted # to preserve case, we recommend using lower case and avoiding whitespace # to facilitate easier querying of the repmgr views and tables. cluster=example_cluster # Node ID and name # (Note: we recommend to avoid naming nodes after their initial # replication function, as this will cause confusion when e.g. # "standby2" is promoted to primary) node=2 # a unique integer node_name=node2 # an arbitrary (but unique) string; we recommend using # the server's hostname or another identifier unambiguously # associated with the server to avoid confusion # Database connection information as a conninfo string # This must be accessible to all servers in the cluster; for details see: # http://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING conninfo='host=192.168.204.104 dbname=repmgr_db user=repmgr_usr' # Optional configuration items # ============================ # Replication settings # --------------------- # when using cascading replication and a standby is to be connected to an # upstream standby, specify that node's ID with 'upstream_node'. The node # must exist before the new standby can be registered. If a standby is # to connect directly to a primary node, this parameter is not required. 
# # upstream_node=1 # physical replication slots - PostgreSQL 9.4 and later only # (default: 0) # # use_replication_slots=0 # # NOTE: 'max_replication_slots' should be configured for at least the # number of standbys which will connect to the primary. # Logging and monitoring settings # ------------------------------- # Log level: possible values are DEBUG, INFO, NOTICE, WARNING, ERR, ALERT, CRIT or EMERG # (default: NOTICE) loglevel=NOTICE # Logging facility: possible values are STDERR or - for Syslog integration - one of LOCAL0, LOCAL1, ..., LOCAL7, USER # (default: STDERR) logfacility=STDERR # stderr can be redirected to an arbitrary file: # # logfile='/var/log/repmgr.log' # event notifications can be passed to an arbitrary external program # together with the following parameters: # # %n - node ID # %e - event type # %s - success (1 or 0) # %t - timestamp # %d - details # # the values provided for "%t" and "%d" will probably contain spaces, # so should be quoted in the provided command configuration, e.g.: # # event_notification_command='/path/to/some/script %n %e %s "%t" "%d"' # By default, all notifications will be passed; the notification types # can be filtered to explicitly named ones: # # event_notifications=master_register,standby_register,witness_create # Environment/command settings # ---------------------------- # path to PostgreSQL binary directory (location of pg_ctl, pg_basebackup etc.) # (if not provided, defaults to system $PATH) # pg_bindir=/usr/bin/ # external command options # rsync_options=--archive --checksum --compress --progress --rsh="ssh -o \"StrictHostKeyChecking no\"" # ssh_options=-o "StrictHostKeyChecking no" # external command arguments # pg_ctl_options='-s' # pg_basebackup_options='--xlog-method=s' # Standby clone settings # ---------------------- # # These settings apply when cloning a standby (`repmgr standby clone`). 
# Tablespaces can be remapped from one file system location to another: # # tablespace_mapping=/path/to/original/tablespace=/path/to/new/tablespace # Failover settings (repmgrd) # --------------------------- # # These settings are only applied when repmgrd is running. # Number of seconds to wait for a response from the primary server before # deciding it has failed master_response_timeout=60 # Number of times to try and reconnect to the primary before starting # the failover procedure reconnect_attempts=6 reconnect_interval=10 # Autofailover options failover=automatic # one of 'automatic', 'manual' priority=100 # a value of zero or less prevents the node being promoted to primary promote_command='repmgr standby promote -f /path/to/repmgr.conf' follow_command='repmgr standby follow -f /path/to/repmgr.conf -W' # monitoring interval in seconds; default is 2 # # monitor_interval_secs=2 # change wait time for primary; before we bail out and exit when the primary # disappears, we wait 'reconnect_attempts' * 'retry_promote_interval_secs' # seconds (by default this would be half an hour, as the default value of # 'retry_promote_interval_secs' is 300) # # retry_promote_interval_secs=300 repmgr-3.0.3/repmgr.h000066400000000000000000000045271264264412200144710ustar00rootroot00000000000000/* * repmgr.h * Copyright (c) 2ndQuadrant, 2010-2015 * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation, either version 3 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program. If not, see <http://www.gnu.org/licenses/>. 
* */ #ifndef _REPMGR_H_ #define _REPMGR_H_ #include #include #include #include "strutil.h" #include "dbutils.h" #include "errcode.h" #include "config.h" #define MIN_SUPPORTED_VERSION "9.3" #define MIN_SUPPORTED_VERSION_NUM 90300 #include "config.h" #define MAXFILENAME 1024 #define ERRBUFF_SIZE 512 #define DEFAULT_WAL_KEEP_SEGMENTS "5000" #define DEFAULT_DEST_DIR "." #define DEFAULT_MASTER_PORT "5432" #define DEFAULT_DBNAME "postgres" #define DEFAULT_REPMGR_SCHEMA_PREFIX "repmgr_" #define DEFAULT_PRIORITY 100 #define FAILOVER_NODES_MAX_CHECK 50 #define MANUAL_FAILOVER 0 #define AUTOMATIC_FAILOVER 1 #define NODE_NOT_FOUND -1 #define NO_UPSTREAM_NODE -1 #define UNKNOWN_NODE_ID -1 /* Run time options type */ typedef struct { char dbname[MAXLEN]; char host[MAXLEN]; char username[MAXLEN]; char dest_dir[MAXFILENAME]; char config_file[MAXFILENAME]; char remote_user[MAXLEN]; char superuser[MAXLEN]; char wal_keep_segments[MAXLEN]; bool verbose; bool terse; bool force; bool wait_for_master; bool ignore_rsync_warn; bool initdb_no_pwprompt; bool rsync_only; bool fast_checkpoint; bool ignore_external_config_files; char masterport[MAXLEN]; char localport[MAXLEN]; char loglevel[MAXLEN]; /* parameter used by CLUSTER CLEANUP */ int keep_history; char pg_bindir[MAXLEN]; char recovery_min_apply_delay[MAXLEN]; } t_runtime_options; #define T_RUNTIME_OPTIONS_INITIALIZER { "", "", "", "", "", "", "", DEFAULT_WAL_KEEP_SEGMENTS, false, false, false, false, false, false, false, false, false, "", "", "", 0, "", "" } extern char repmgr_schema[MAXLEN]; #endif repmgr-3.0.3/repmgr.sql000066400000000000000000000042361264264412200150360ustar00rootroot00000000000000/* * repmgr.sql * * Copyright (C) 2ndQuadrant, 2010-2015 * */ CREATE USER repmgr; CREATE SCHEMA repmgr; /* * The table repl_nodes keeps information about all machines in * a cluster */ CREATE TABLE repl_nodes ( id integer primary key, cluster text not null, -- Name to identify the cluster name text not null, conninfo text not null, 
priority integer not null, witness boolean not null default false ); ALTER TABLE repl_nodes OWNER TO repmgr; /* * Keeps monitor info about every node and their relative "position" * to primary */ CREATE TABLE repl_monitor ( primary_node INTEGER NOT NULL, standby_node INTEGER NOT NULL, last_monitor_time TIMESTAMP WITH TIME ZONE NOT NULL, last_wal_primary_location TEXT NOT NULL, last_wal_standby_location TEXT, -- In case of a witness server this will be NULL replication_lag BIGINT NOT NULL, apply_lag BIGINT NOT NULL ); ALTER TABLE repl_monitor OWNER TO repmgr; /* * This view shows the latest monitor info about every node. * Interesting thing to see: * replication_lag: in bytes (this is how far the latest xlog record * we have received is from master) * apply_lag: in bytes (this is how far the latest xlog record * we have applied is from the latest record we * have received) * time_lag: how many seconds are we from being up-to-date with master */ CREATE VIEW repl_status AS SELECT primary_node, standby_node, name AS standby_name, last_monitor_time, last_wal_primary_location, last_wal_standby_location, pg_size_pretty(replication_lag) replication_lag, pg_size_pretty(apply_lag) apply_lag, age(now(), last_monitor_time) AS time_lag FROM repl_monitor JOIN repl_nodes ON standby_node = id WHERE (standby_node, last_monitor_time) IN (SELECT standby_node, MAX(last_monitor_time) FROM repl_monitor GROUP BY 1); ALTER VIEW repl_status OWNER TO repmgr; CREATE INDEX idx_repl_status_sort ON repl_monitor(last_monitor_time, standby_node); repmgr-3.0.3/repmgrd.c000066400000000000000000001646451264264412200146400ustar00rootroot00000000000000/* * repmgrd.c - Replication manager daemon * Copyright (C) 2ndQuadrant, 2010-2015 * * This module connects to the nodes of a replication cluster and monitors * how far are they from master * * This program is free software: you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software 
Foundation, either version 3 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program. If not, see . * */ #include #include #include #include #include #include #include "repmgr.h" #include "config.h" #include "log.h" #include "strutil.h" #include "version.h" /* Required PostgreSQL headers */ #include "access/xlogdefs.h" #include "pqexpbuffer.h" /* Local info */ t_configuration_options local_options; PGconn *my_local_conn = NULL; /* Master info */ t_configuration_options master_options; PGconn *master_conn = NULL; char *config_file = ""; bool verbose = false; bool monitoring_history = false; t_node_info node_info; bool failover_done = false; char *pid_file = NULL; t_configuration_options config = T_CONFIGURATION_OPTIONS_INITIALIZER; static void help(void); static void usage(void); static void check_cluster_configuration(PGconn *conn); static void check_node_configuration(void); static void standby_monitor(void); static void witness_monitor(void); static bool check_connection(PGconn **conn, const char *type, const char *conninfo); static bool set_local_node_status(void); static void update_shared_memory(char *last_wal_standby_applied); static void update_registration(void); static void do_master_failover(void); static bool do_upstream_standby_failover(t_node_info upstream_node); static t_node_info get_node_info(PGconn *conn, char *cluster, int node_id); static t_server_type parse_node_type(const char *type); static XLogRecPtr lsn_to_xlogrecptr(char *lsn, bool *format_ok); /* * Flag to mark SIGHUP. Whenever the main loop comes around it * will reread the configuration file. 
*/ static volatile sig_atomic_t got_SIGHUP = false; static void handle_sighup(SIGNAL_ARGS); static void handle_sigint(SIGNAL_ARGS); static void terminate(int retval); #ifndef WIN32 static void setup_event_handlers(void); #endif static void do_daemonize(void); static void check_and_create_pid_file(const char *pid_file); static void close_connections() { if (master_conn != NULL && PQisBusy(master_conn) == 1) cancel_query(master_conn, local_options.master_response_timeout); if (my_local_conn != NULL) PQfinish(my_local_conn); if (master_conn != NULL && master_conn != my_local_conn) PQfinish(master_conn); master_conn = NULL; my_local_conn = NULL; } int main(int argc, char **argv) { static struct option long_options[] = { {"config-file", required_argument, NULL, 'f'}, {"verbose", no_argument, NULL, 'v'}, {"monitoring-history", no_argument, NULL, 'm'}, {"daemonize", no_argument, NULL, 'd'}, {"pid-file", required_argument, NULL, 'p'}, {"help", no_argument, NULL, '?'}, {"version", no_argument, NULL, 'V'}, {NULL, 0, NULL, 0} }; int optindex; int c; bool daemonize = false; bool startup_event_logged = false; FILE *fd; int server_version_num = 0; set_progname(argv[0]); while ((c = getopt_long(argc, argv, "?Vf:vmdp:", long_options, &optindex)) != -1) { switch (c) { case 'f': config_file = optarg; break; case 'v': verbose = true; break; case 'm': monitoring_history = true; break; case 'd': daemonize = true; break; case 'p': pid_file = optarg; break; case '?': help(); exit(SUCCESS); case 'V': printf("%s %s (PostgreSQL %s)\n", progname(), REPMGR_VERSION, PG_VERSION); exit(SUCCESS); default: usage(); exit(ERR_BAD_CONFIG); } } /* * Parse the configuration file, if provided. If no configuration file * was provided, or one was but was incomplete, parse_config() will * abort anyway, with an appropriate message. * * XXX it might be desirable to create an event record for this, in * which case we'll need to refactor parse_config() not to abort, * and return the error message. 
*/ load_config(config_file, verbose, &local_options, argv[0]); if (daemonize) { do_daemonize(); } if (pid_file) { check_and_create_pid_file(pid_file); } #ifndef WIN32 setup_event_handlers(); #endif fd = freopen("/dev/null", "r", stdin); if (fd == NULL) { fprintf(stderr, "error reopening stdin to '/dev/null': %s", strerror(errno)); } fd = freopen("/dev/null", "w", stdout); if (fd == NULL) { fprintf(stderr, "error reopening stdout to '/dev/null': %s", strerror(errno)); } logger_init(&local_options, progname()); if (verbose) logger_set_verbose(); if (log_type == REPMGR_SYSLOG) { fd = freopen("/dev/null", "w", stderr); if (fd == NULL) { fprintf(stderr, "error reopening stderr to '/dev/null': %s", strerror(errno)); } } /* Initialise the repmgr schema name */ /* XXX check this handles quoting properly */ maxlen_snprintf(repmgr_schema, "%s%s", DEFAULT_REPMGR_SCHEMA_PREFIX, local_options.cluster_name); log_info(_("connecting to database '%s'\n"), local_options.conninfo); my_local_conn = establish_db_connection(local_options.conninfo, true); /* Verify that server is a supported version */ log_info(_("connected to database, checking its state\n")); server_version_num = get_server_version(my_local_conn, NULL); if (server_version_num < MIN_SUPPORTED_VERSION_NUM) { if (server_version_num > 0) { log_err(_("%s requires PostgreSQL %s or later\n"), progname(), MIN_SUPPORTED_VERSION) ; } else { log_err(_("unable to determine PostgreSQL server version\n")); } terminate(ERR_BAD_CONFIG); } /* Retrieve record for this node from the local database */ node_info = get_node_info(my_local_conn, local_options.cluster_name, local_options.node); /* No node record found - exit gracefully */ if (node_info.node_id == NODE_NOT_FOUND) { log_err(_("No metadata record found for this node - terminating\n")); log_hint(_("Check that 'repmgr (master|standby) register' was executed for this node\n")); terminate(ERR_BAD_CONFIG); } log_debug("node id is %i, upstream is %i\n", node_info.node_id, 
node_info.upstream_node_id); /* * MAIN LOOP: this loop cycles at startup and once per failover. * Requisites: - my_local_conn must already be set with an active * connection - no master connection */ do { /* * Set my server mode, establish a connection to master and start * monitor */ switch (node_info.type) { case MASTER: master_options.node = local_options.node; strncpy(master_options.conninfo, local_options.conninfo, MAXLEN); master_conn = my_local_conn; check_cluster_configuration(my_local_conn); check_node_configuration(); if (reload_config(&local_options)) { PQfinish(my_local_conn); my_local_conn = establish_db_connection(local_options.conninfo, true); master_conn = my_local_conn; update_registration(); } /* Log startup event */ if (startup_event_logged == false) { create_event_record(master_conn, &local_options, local_options.node, "repmgrd_start", true, NULL); startup_event_logged = true; } log_info(_("starting continuous master connection check\n")); /* * Check that master is still alive. * XXX We should also check that the * standby servers are sending info */ /* * Every local_options.monitor_interval_secs seconds, do * master checks */ do { if (check_connection(&master_conn, "master", NULL)) { sleep(local_options.monitor_interval_secs); } else { /* * XXX May we do something more verbose ?
*/ terminate(1); } if (got_SIGHUP) { /* * if we can reload the configuration file, then could need to change * my_local_conn */ if (reload_config(&local_options)) { PQfinish(my_local_conn); my_local_conn = establish_db_connection(local_options.conninfo, true); master_conn = my_local_conn; if (*local_options.logfile) { FILE *fd; fd = freopen(local_options.logfile, "a", stderr); if (fd == NULL) { fprintf(stderr, "error reopening stderr to '%s': %s", local_options.logfile, strerror(errno)); } } update_registration(); } got_SIGHUP = false; } } while (!failover_done); break; case WITNESS: case STANDBY: /* We need the node id of the master server as well as a connection to it */ log_info(_("connecting to master node '%s'\n"), local_options.cluster_name); master_conn = get_master_connection(my_local_conn, local_options.cluster_name, &master_options.node, NULL); if (master_conn == NULL) { PQExpBufferData errmsg; initPQExpBuffer(&errmsg); appendPQExpBuffer(&errmsg, _("unable to connect to master node '%s'"), master_options.node_name); log_err("%s\n", errmsg.data); create_event_record(NULL, &local_options, local_options.node, "repmgrd_shutdown", false, errmsg.data); terminate(ERR_BAD_CONFIG); } check_cluster_configuration(my_local_conn); check_node_configuration(); if (reload_config(&local_options)) { PQfinish(my_local_conn); my_local_conn = establish_db_connection(local_options.conninfo, true); update_registration(); } /* Log startup event */ if (startup_event_logged == false) { create_event_record(master_conn, &local_options, local_options.node, "repmgrd_start", true, NULL); startup_event_logged = true; } /* * Every local_options.monitor_interval_secs seconds, do * checks */ if (node_info.type == WITNESS) { log_info(_("starting continuous witness node monitoring\n")); } else if (node_info.type == STANDBY) { log_info(_("starting continuous standby node monitoring\n")); } do { log_verbose(LOG_DEBUG, "standby check loop...\n"); if (node_info.type == WITNESS) { 
witness_monitor(); } else if (node_info.type == STANDBY) { standby_monitor(); } sleep(local_options.monitor_interval_secs); if (got_SIGHUP) { /* * if we can reload, then could need to change * my_local_conn */ if (reload_config(&local_options)) { PQfinish(my_local_conn); my_local_conn = establish_db_connection(local_options.conninfo, true); update_registration(); } got_SIGHUP = false; } if (failover_done) { log_debug(_("standby check loop will terminate\n")); } } while (!failover_done); break; default: log_err(_("unrecognized mode for node %d\n"), local_options.node); } failover_done = false; } while (true); /* close the connection to the database and cleanup */ close_connections(); /* Shuts down logging system */ logger_shutdown(); return 0; } /* * witness_monitor() * * Monitors witness server; attempt to find and connect to new master * if existing master connection is lost */ static void witness_monitor(void) { char monitor_witness_timestamp[MAXLEN]; PGresult *res; char sqlquery[QUERY_STR_LEN]; bool connection_ok; /* * Check if master is available; if not, assume failover situation * and try to determine new master. There may be a delay between detection * of a missing master and promotion of a standby by that standby's * repmgrd, so we'll loop for a while before giving up. 
*/ connection_ok = check_connection(&master_conn, "master", NULL); if (connection_ok == false) { int connection_retries; log_debug(_("old master node ID: %i\n"), master_options.node); /* We need to wait a while for the new master to be promoted */ log_info( _("waiting %i seconds for a new master to be promoted...\n"), local_options.master_response_timeout ); sleep(local_options.master_response_timeout); /* Attempt to find the new master */ for (connection_retries = 0; connection_retries < local_options.reconnect_attempts; connection_retries++) { log_info( _("attempt %i of %i to determine new master...\n"), connection_retries + 1, local_options.reconnect_attempts ); master_conn = get_master_connection(my_local_conn, local_options.cluster_name, &master_options.node, NULL); if (PQstatus(master_conn) != CONNECTION_OK) { log_warning( _("unable to determine a valid master server; waiting %i seconds to retry...\n"), local_options.reconnect_interval ); PQfinish(master_conn); sleep(local_options.reconnect_interval); } else { log_debug(_("new master found with node ID: %i\n"), master_options.node); connection_ok = true; /* * Update the repl_nodes table from the new master to reflect the changed * node configuration * * XXX it would be neat to be able to handle this with e.g. 
table-based * logical replication */ copy_configuration(master_conn, my_local_conn, local_options.cluster_name); break; } } if (connection_ok == false) { PQExpBufferData errmsg; initPQExpBuffer(&errmsg); appendPQExpBuffer(&errmsg, _("unable to determine a valid master node, terminating...")); log_err("%s\n", errmsg.data); create_event_record(NULL, &local_options, local_options.node, "repmgrd_shutdown", false, errmsg.data); terminate(ERR_DB_CON); } } /* Fast path for the case where no history is requested */ if (!monitoring_history) return; /* * Cancel any query that is still being executed, so i can insert the * current record */ if (!cancel_query(master_conn, local_options.master_response_timeout)) return; if (wait_connection_availability(master_conn, local_options.master_response_timeout) != 1) return; /* Get local xlog info */ sqlquery_snprintf(sqlquery, "SELECT CURRENT_TIMESTAMP"); res = PQexec(my_local_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("PQexec failed: %s\n"), PQerrorMessage(my_local_conn)); PQclear(res); /* if there is any error just let it be and retry in next loop */ return; } strcpy(monitor_witness_timestamp, PQgetvalue(res, 0, 0)); PQclear(res); /* * Build the SQL to execute on master */ sqlquery_snprintf(sqlquery, "INSERT INTO %s.repl_monitor " " (primary_node, standby_node, " " last_monitor_time, last_apply_time, " " last_wal_primary_location, last_wal_standby_location, " " replication_lag, apply_lag )" " VALUES(%d, %d, " " '%s'::TIMESTAMP WITH TIME ZONE, NULL, " " pg_current_xlog_location(), NULL, " " 0, 0) ", get_repmgr_schema_quoted(my_local_conn), master_options.node, local_options.node, monitor_witness_timestamp); /* * Execute the query asynchronously, but don't check for a result. We will * check the result next time we pause for a monitor step. 
*/ if (PQsendQuery(master_conn, sqlquery) == 0) log_warning(_("query could not be sent to master: %s\n"), PQerrorMessage(master_conn)); } /* * standby_monitor() * * Monitor standby server and handle failover situation. Also insert * monitoring information if configured. */ static void standby_monitor(void) { PGresult *res; char monitor_standby_timestamp[MAXLEN]; char last_wal_master_location[MAXLEN]; char last_wal_standby_received[MAXLEN]; char last_wal_standby_applied[MAXLEN]; char last_wal_standby_applied_timestamp[MAXLEN]; bool last_wal_standby_received_gte_replayed; char sqlquery[QUERY_STR_LEN]; XLogRecPtr lsn_master; XLogRecPtr lsn_standby_received; XLogRecPtr lsn_standby_applied; int connection_retries, ret; bool did_retry = false; PGconn *upstream_conn; char upstream_conninfo[MAXCONNINFO]; int upstream_node_id; t_node_info upstream_node; int active_master_id; const char *type = NULL; /* * Verify that the local node is still available - if not there's * no point in doing much else anyway */ if (!check_connection(&my_local_conn, "standby", NULL)) { PQExpBufferData errmsg; set_local_node_status(); initPQExpBuffer(&errmsg); appendPQExpBuffer(&errmsg, _("failed to connect to local node, node marked as failed!")); log_err("%s\n", errmsg.data); goto continue_monitoring_standby; } upstream_conn = get_upstream_connection(my_local_conn, local_options.cluster_name, local_options.node, &upstream_node_id, upstream_conninfo); type = upstream_node_id == master_options.node ? "master" : "upstream"; // ZZZ "5 minutes"? /* * Check if the upstream node is still available, if after 5 minutes of retries * we cannot reconnect, try to get a new upstream node. 
*/ check_connection(&upstream_conn, type, upstream_conninfo); /* * This takes up to local_options.reconnect_attempts * * local_options.reconnect_interval seconds */ if (PQstatus(upstream_conn) != CONNECTION_OK) { PQfinish(upstream_conn); upstream_conn = NULL; if (local_options.failover == MANUAL_FAILOVER) { log_err(_("Unable to reconnect to %s. Now checking if another node has been promoted.\n"), type); for (connection_retries = 0; connection_retries < local_options.reconnect_attempts; connection_retries++) { master_conn = get_master_connection(my_local_conn, local_options.cluster_name, &master_options.node, NULL); if (PQstatus(master_conn) == CONNECTION_OK) { /* * Connected, we can continue the process so break the * loop */ log_err(_("connected to node %d, continuing monitoring.\n"), master_options.node); break; } else { log_err( _("no new master found, waiting %i seconds before retry...\n"), local_options.retry_promote_interval_secs ); sleep(local_options.retry_promote_interval_secs); } } if (PQstatus(master_conn) != CONNECTION_OK) { PQExpBufferData errmsg; initPQExpBuffer(&errmsg); appendPQExpBuffer(&errmsg, _("Unable to reconnect to master after %i attempts, terminating..."), local_options.reconnect_attempts); log_err("%s\n", errmsg.data); create_event_record(NULL, &local_options, local_options.node, "repmgrd_shutdown", false, errmsg.data); terminate(ERR_DB_CON); } } else if (local_options.failover == AUTOMATIC_FAILOVER) { /* * When we returns from this function we will have a new master * and a new master_conn */ /* * Failover handling is handled differently depending on whether * the failed node is the master or a cascading standby */ upstream_node = get_node_info(my_local_conn, local_options.cluster_name, node_info.upstream_node_id); if (upstream_node.type == MASTER) { log_debug(_("failure detected on master node (%i); attempting to promote a standby\n"), node_info.upstream_node_id); do_master_failover(); } else { log_debug(_("failure detected on upstream 
node %i; attempting to reconnect to new upstream node\n"), node_info.upstream_node_id); if (!do_upstream_standby_failover(upstream_node)) { PQExpBufferData errmsg; initPQExpBuffer(&errmsg); appendPQExpBuffer(&errmsg, _("unable to reconnect to new upstream node, terminating...")); log_err("%s\n", errmsg.data); create_event_record(master_conn, &local_options, local_options.node, "repmgrd_shutdown", false, errmsg.data); terminate(ERR_DB_CON); } } return; } } PQfinish(upstream_conn); continue_monitoring_standby: /* Check if we are still a standby, we could have been promoted */ do { ret = is_standby(my_local_conn); switch (ret) { case 0: /* * This situation can occur if `pg_ctl promote` was manually executed * on the node. If the original master is still running after this * node has been promoted, we're in a "two brain" situation which * will require manual resolution as there's no way of determining * which master is the correct one. * * We should log a message so the user knows of the situation at hand. * * XXX check if the original master is still active and display a * warning */ log_err(_("It seems this server was promoted manually (not by repmgr) so you might be in the presence of a split-brain.\n")); log_err(_("Check your cluster and manually fix any anomaly.\n")); terminate(1); break; case -1: log_err(_("standby node has disappeared, trying to reconnect...\n")); did_retry = true; if (!check_connection(&my_local_conn, "standby", NULL)) { set_local_node_status(); /* * Let's continue checking, and if the postgres server on the * standby comes back up, we will activate it again */ } break; } } while (ret == -1); if (did_retry) { /* * There's a possible situation where the standby went down for some reason * (maintenance for example) and is now up and maybe connected once again to * the stream. If we set the local standby node as failed and it's now running * and receiving replication data, we should activate it again.
*/ set_local_node_status(); log_info(_("standby connection recovered!\n")); } /* Fast path for the case where no history is requested */ if (!monitoring_history) return; /* * If original master has gone away we'll need to get the new one * from the upstream node to write monitoring information */ upstream_node = get_node_info(my_local_conn, local_options.cluster_name, node_info.upstream_node_id); sprintf(sqlquery, "SELECT id " " FROM %s.repl_nodes " " WHERE type = 'master' " " AND active IS TRUE ", get_repmgr_schema_quoted(my_local_conn)); res = PQexec(my_local_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("standby_monitor() - query error:%s\n"), PQerrorMessage(my_local_conn)); PQclear(res); /* Not a fatal error, just means no monitoring records will be written */ return; } if (PQntuples(res) == 0) { log_err(_("standby_monitor(): no active master found\n")); PQclear(res); return; } active_master_id = atoi(PQgetvalue(res, 0, 0)); PQclear(res); if (active_master_id != master_options.node) { log_notice(_("connecting to active master (node %i)...\n"), active_master_id); \ if (master_conn != NULL) { PQfinish(master_conn); } master_conn = get_master_connection(my_local_conn, local_options.cluster_name, &master_options.node, NULL); } if (PQstatus(master_conn) != CONNECTION_OK) PQreset(master_conn); /* * Cancel any query that is still being executed, so i can insert the * current record */ if (!cancel_query(master_conn, local_options.master_response_timeout)) return; if (wait_connection_availability(master_conn, local_options.master_response_timeout) != 1) return; /* Get local xlog info */ sqlquery_snprintf(sqlquery, "SELECT CURRENT_TIMESTAMP, pg_last_xlog_receive_location(), " "pg_last_xlog_replay_location(), pg_last_xact_replay_timestamp(), " "pg_last_xlog_receive_location() >= pg_last_xlog_replay_location()"); res = PQexec(my_local_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("PQexec failed: %s\n"), 
PQerrorMessage(my_local_conn)); PQclear(res); /* if there is any error just let it be and retry in next loop */ return; } strncpy(monitor_standby_timestamp, PQgetvalue(res, 0, 0), MAXLEN); strncpy(last_wal_standby_received, PQgetvalue(res, 0, 1), MAXLEN); strncpy(last_wal_standby_applied, PQgetvalue(res, 0, 2), MAXLEN); strncpy(last_wal_standby_applied_timestamp, PQgetvalue(res, 0, 3), MAXLEN); last_wal_standby_received_gte_replayed = (strcmp(PQgetvalue(res, 0, 4), "t") == 0) ? true : false; PQclear(res); /* * In the unusual event of a standby becoming disconnected from the primary, * while this repmgrd remains connected to the primary, subtracting * "lsn_standby_applied" from "lsn_standby_received" and coercing to * (long long unsigned int) will result in a meaningless, very large * value which will overflow a BIGINT column and spew error messages into the * PostgreSQL log. In the absence of a better strategy, skip attempting * to insert a monitoring record. */ if (last_wal_standby_received_gte_replayed == false) { log_verbose(LOG_WARNING, "Invalid replication_lag value calculated - is this standby connected to its upstream?\n"); return; } /* Get master xlog info */ sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_current_xlog_location()"); res = PQexec(master_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("PQexec failed: %s\n"), PQerrorMessage(master_conn)); PQclear(res); return; } strncpy(last_wal_master_location, PQgetvalue(res, 0, 0), MAXLEN); PQclear(res); /* Calculate the lag */ lsn_master = lsn_to_xlogrecptr(last_wal_master_location, NULL); lsn_standby_received = lsn_to_xlogrecptr(last_wal_standby_received, NULL); lsn_standby_applied = lsn_to_xlogrecptr(last_wal_standby_applied, NULL); /* * Build the SQL to execute on master */ sqlquery_snprintf(sqlquery, "INSERT INTO %s.repl_monitor " " (primary_node, standby_node, " " last_monitor_time, last_apply_time, " " last_wal_primary_location, last_wal_standby_location, " " 
replication_lag, apply_lag ) " " VALUES(%d, %d, " " '%s'::TIMESTAMP WITH TIME ZONE, '%s'::TIMESTAMP WITH TIME ZONE, " " '%s', '%s', " " %llu, %llu) ", get_repmgr_schema_quoted(master_conn), master_options.node, local_options.node, monitor_standby_timestamp, last_wal_standby_applied_timestamp, last_wal_master_location, last_wal_standby_received, (long long unsigned int)(lsn_master - lsn_standby_received), (long long unsigned int)(lsn_standby_received - lsn_standby_applied)); /* * Execute the query asynchronously, but don't check for a result. We will * check the result next time we pause for a monitor step. */ log_verbose(LOG_DEBUG, "standby_monitor:() %s\n", sqlquery); if (PQsendQuery(master_conn, sqlquery) == 0) log_warning(_("query could not be sent to master. %s\n"), PQerrorMessage(master_conn)); } /* * do_master_failover() * * Handles failover to new cluster master */ static void do_master_failover(void) { PGresult *res; char sqlquery[QUERY_STR_LEN]; int total_nodes = 0; int visible_nodes = 0; int ready_nodes = 0; bool candidate_found = false; int i; int r; XLogRecPtr xlog_recptr; bool lsn_format_ok; char last_wal_standby_applied[MAXLEN]; PGconn *node_conn = NULL; /* * will get info about up to FAILOVER_NODES_MAX_CHECK (50) nodes, which * seems to be large enough for most scenarios */ t_node_info nodes[FAILOVER_NODES_MAX_CHECK]; /* Store details of the failed node here */ t_node_info failed_master = T_NODE_INFO_INITIALIZER; /* Store details of the best candidate for promotion to master here */ t_node_info best_candidate = T_NODE_INFO_INITIALIZER; /* get a list of standby nodes, including myself */ sprintf(sqlquery, "SELECT id, conninfo, type, upstream_node_id " " FROM %s.repl_nodes " " WHERE cluster = '%s' " " AND active IS TRUE " " AND priority > 0 " " ORDER BY priority DESC, id " " LIMIT %i ", get_repmgr_schema_quoted(my_local_conn), local_options.cluster_name, FAILOVER_NODES_MAX_CHECK); res = PQexec(my_local_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) {
log_err(_("unable to retrieve node records: %s\n"), PQerrorMessage(my_local_conn)); PQclear(res); PQfinish(my_local_conn); terminate(ERR_DB_QUERY); } /* * total nodes that are registered */ total_nodes = PQntuples(res); log_debug(_("%d active nodes registered\n"), total_nodes); /* * Build an array with the nodes and indicate which ones are visible and * ready */ for (i = 0; i < total_nodes; i++) { nodes[i].node_id = atoi(PQgetvalue(res, i, 0)); strncpy(nodes[i].conninfo_str, PQgetvalue(res, i, 1), MAXCONNINFO); nodes[i].type = parse_node_type(PQgetvalue(res, i, 2)); /* Copy details of the failed node */ /* XXX only node_id is actually used later */ if (nodes[i].type == MASTER) { failed_master.node_id = nodes[i].node_id; failed_master.xlog_location = nodes[i].xlog_location; failed_master.is_ready = nodes[i].is_ready; } nodes[i].upstream_node_id = atoi(PQgetvalue(res, i, 3)); /* * Initialize on false so if we can't reach this node we know that * later */ nodes[i].is_visible = false; nodes[i].is_ready = false; nodes[i].xlog_location = InvalidXLogRecPtr; log_debug(_("node=%d conninfo=\"%s\" type=%s\n"), nodes[i].node_id, nodes[i].conninfo_str, PQgetvalue(res, i, 2)); node_conn = establish_db_connection(nodes[i].conninfo_str, false); /* if we can't see the node just skip it */ if (PQstatus(node_conn) != CONNECTION_OK) { if (node_conn != NULL) PQfinish(node_conn); continue; } visible_nodes++; nodes[i].is_visible = true; PQfinish(node_conn); } PQclear(res); log_debug(_("total nodes counted: registered=%d, visible=%d\n"), total_nodes, visible_nodes); /* * Am I on the group that should keep alive? 
If I see less than half of * total_nodes then I should do nothing */ if (visible_nodes < (total_nodes / 2.0)) { log_err(_("Unable to reach most of the nodes.\n" "Let the other standby servers decide which one will be the master.\n" "Manual action will be needed to re-add this node to the cluster.\n")); terminate(ERR_FAILOVER_FAIL); } /* Query all available nodes to determine readiness and LSN */ for (i = 0; i < total_nodes; i++) { log_debug("checking node %i...\n", nodes[i].node_id); /* if the node is not visible, skip it */ if (!nodes[i].is_visible) continue; /* if the node is a witness node, skip it */ if (nodes[i].type == WITNESS) continue; /* if node does not have same upstream node, skip it */ if (nodes[i].upstream_node_id != node_info.upstream_node_id) continue; node_conn = establish_db_connection(nodes[i].conninfo_str, false); /* * XXX This shouldn't happen, if this happens it means this is a major * problem maybe network outages? anyway, is better for a human to * react */ if (PQstatus(node_conn) != CONNECTION_OK) { log_err(_("It seems new problems are arising, manual intervention is needed\n")); terminate(ERR_FAILOVER_FAIL); } sqlquery_snprintf(sqlquery, "SELECT pg_catalog.pg_last_xlog_receive_location()"); res = PQexec(node_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_info(_("unable to retrieve node's last standby location: %s\n"), PQerrorMessage(node_conn)); log_debug(_("connection details: %s\n"), nodes[i].conninfo_str); PQclear(res); PQfinish(node_conn); terminate(ERR_FAILOVER_FAIL); } xlog_recptr = lsn_to_xlogrecptr(PQgetvalue(res, 0, 0), &lsn_format_ok); log_debug(_("LSN of node %i is: %s\n"), nodes[i].node_id, PQgetvalue(res, 0, 0)); PQclear(res); PQfinish(node_conn); /* If position is 0/0, error */ /* XXX do we need to terminate ourselves if the queried node has a problem? 
*/ if (xlog_recptr == InvalidXLogRecPtr) { log_err(_("InvalidXLogRecPtr detected on standby node %i\n"), nodes[i].node_id); terminate(ERR_FAILOVER_FAIL); } nodes[i].xlog_location = xlog_recptr; } /* last we get info about this node, and update shared memory */ sprintf(sqlquery, "SELECT pg_catalog.pg_last_xlog_receive_location()"); res = PQexec(my_local_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("PQexec failed: %s.\nReport an invalid value to not be " " considered as new master and exit.\n"), PQerrorMessage(my_local_conn)); PQclear(res); sprintf(last_wal_standby_applied, "'%X/%X'", 0, 0); update_shared_memory(last_wal_standby_applied); terminate(ERR_DB_QUERY); } /* write last location in shared memory */ update_shared_memory(PQgetvalue(res, 0, 0)); PQclear(res); /* Wait for each node to come up and report a valid LSN */ for (i = 0; i < total_nodes; i++) { /* * ensure witness server is marked as ready, and skip * LSN check */ if (nodes[i].type == WITNESS) { if (!nodes[i].is_ready) { nodes[i].is_ready = true; ready_nodes++; } continue; } /* if the node is not visible, skip it */ if (!nodes[i].is_visible) continue; /* if node does not have same upstream node, skip it */ if (nodes[i].upstream_node_id != node_info.upstream_node_id) continue; node_conn = establish_db_connection(nodes[i].conninfo_str, false); /* * XXX This shouldn't happen, if this happens it means this is a * major problem maybe network outages? 
anyway, is better for a * human to react */ if (PQstatus(node_conn) != CONNECTION_OK) { /* XXX */ log_info(_("At this point, it could be some race conditions " "that are acceptable, assume the node is restarting " "and starting failover procedure\n")); continue; } while (!nodes[i].is_ready) { sqlquery_snprintf(sqlquery, "SELECT %s.repmgr_get_last_standby_location()", get_repmgr_schema_quoted(node_conn)); res = PQexec(node_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("PQexec failed: %s.\nReport an invalid value to not " "be considered as new master and exit.\n"), PQerrorMessage(node_conn)); PQclear(res); PQfinish(node_conn); terminate(ERR_DB_QUERY); } xlog_recptr = lsn_to_xlogrecptr(PQgetvalue(res, 0, 0), &lsn_format_ok); /* If position reported as "invalid", check for format error or * empty string; otherwise position is 0/0 and we need to continue * looping until a valid LSN is reported */ if (xlog_recptr == InvalidXLogRecPtr) { if (lsn_format_ok == false) { /* Unable to parse value returned by `repmgr_get_last_standby_location()` */ if (*PQgetvalue(res, 0, 0) == '\0') { log_crit( _("unable to obtain LSN from node %i"), nodes[i].node_id ); log_info( _("please check that 'shared_preload_libraries=repmgr_funcs' is set in postgresql.conf\n") ); PQclear(res); PQfinish(node_conn); exit(ERR_BAD_CONFIG); } /* * Very unlikely to happen; in the absence of any better * strategy keep checking */ log_warning(_("unable to parse LSN \"%s\"\n"), PQgetvalue(res, 0, 0)); } else { log_debug( _("invalid LSN returned from node %i: '%s'\n"), nodes[i].node_id, PQgetvalue(res, 0, 0) ); } PQclear(res); /* If position is 0/0, keep checking */ /* XXX we should add a timeout here to prevent infinite looping * if the other node's repmgrd is not up */ continue; } if (nodes[i].xlog_location < xlog_recptr) { nodes[i].xlog_location = xlog_recptr; } log_debug(_("LSN of node %i is: %s\n"), nodes[i].node_id, PQgetvalue(res, 0, 0)); PQclear(res); ready_nodes++; 
nodes[i].is_ready = true; } PQfinish(node_conn); } /* Close the connection to this server */ PQfinish(my_local_conn); my_local_conn = NULL; /* * determine which one is the best candidate to promote to master */ for (i = 0; i < total_nodes; i++) { /* witness server can never be a candidate */ if (nodes[i].type == WITNESS) continue; if (!nodes[i].is_ready || !nodes[i].is_visible) continue; if (!candidate_found) { /* * If no candidate has been found so far, the first visible and ready * node becomes the best candidate by default */ best_candidate.node_id = nodes[i].node_id; best_candidate.xlog_location = nodes[i].xlog_location; best_candidate.is_ready = nodes[i].is_ready; strncpy(best_candidate.conninfo_str, nodes[i].conninfo_str, MAXCONNINFO); candidate_found = true; } /* * Nodes are retrieved ordered by priority, so if the current best * candidate is lower than the next node's wal location then assign * next node as the new best candidate. */ if (best_candidate.xlog_location < nodes[i].xlog_location) { best_candidate.node_id = nodes[i].node_id; best_candidate.xlog_location = nodes[i].xlog_location; best_candidate.is_ready = nodes[i].is_ready; strncpy(best_candidate.conninfo_str, nodes[i].conninfo_str, MAXCONNINFO); } } /* Terminate if no candidate found */ if (!candidate_found) { log_err(_("no suitable candidate for promotion found; terminating.\n")); terminate(ERR_FAILOVER_FAIL); } /* if local node is the best candidate, promote it */ if (best_candidate.node_id == local_options.node) { PQExpBufferData event_details; initPQExpBuffer(&event_details); /* wait */ sleep(5); log_notice(_("this node is the best candidate to be the new master, promoting...\n")); log_debug(_("promote command is: \"%s\"\n"), local_options.promote_command); if (log_type == REPMGR_STDERR && *local_options.logfile) { fflush(stderr); } r = system(local_options.promote_command); if (r != 0) { log_err(_("promote command failed. 
You could check and try it manually.\n"));
			terminate(ERR_DB_QUERY);
		}

		/* and reconnect to the local database */
		my_local_conn = establish_db_connection(local_options.conninfo, true);

		/* update internal record for this node */
		node_info = get_node_info(my_local_conn, local_options.cluster_name, local_options.node);

		appendPQExpBuffer(&event_details,
						  _("node %i promoted to master; old master %i marked as failed"),
						  node_info.node_id,
						  failed_master.node_id);

		/* my_local_conn is now the master */
		create_event_record(my_local_conn,
							&local_options,
							node_info.node_id,
							"repmgrd_failover_promote",
							true,
							event_details.data);

		/* release the event buffer, as the follow branch below does */
		termPQExpBuffer(&event_details);
	}
	/* local node not promotion candidate - find the new master */
	else
	{
		PGconn	   *new_master_conn;
		PQExpBufferData event_details;

		initPQExpBuffer(&event_details);

		/* wait */
		sleep(10);

		log_info(_("node %d is the best candidate for new master, attempting to follow...\n"),
				 best_candidate.node_id);

		/*
		 * The new master may take some time to be promoted. The follow command
		 * should take care of that.
		 */
		if (log_type == REPMGR_STDERR && *local_options.logfile)
		{
			fflush(stderr);
		}

		log_debug(_("executing follow command: \"%s\"\n"), local_options.follow_command);

		r = system(local_options.follow_command);
		if (r != 0)
		{
			appendPQExpBuffer(&event_details,
							  _("Unable to execute follow command:\n %s"),
							  local_options.follow_command);

			log_err("%s\n", event_details.data);

			/* It won't be possible to write to the event notification
			 * table but we should be able to generate an external notification
			 * if required.
*/ create_event_record(NULL, &local_options, node_info.node_id, "repmgrd_failover_follow", false, event_details.data); terminate(ERR_BAD_CONFIG); } /* and reconnect to the local database */ my_local_conn = establish_db_connection(local_options.conninfo, true); /* update internal record for this node*/ new_master_conn = establish_db_connection(best_candidate.conninfo_str, true); node_info = get_node_info(new_master_conn, local_options.cluster_name, local_options.node); appendPQExpBuffer(&event_details, _("Node %i now following new upstream node %i"), node_info.node_id, best_candidate.node_id); log_info("%s\n", event_details.data); create_event_record(new_master_conn, &local_options, node_info.node_id, "repmgrd_failover_follow", true, event_details.data); PQfinish(new_master_conn); termPQExpBuffer(&event_details); } /* to force it to re-calculate mode and master node */ // ^ ZZZ check that behaviour ^ failover_done = true; } /* * do_upstream_standby_failover() * * Attach cascaded standby to new upstream server * * Currently we will try to attach to the failed upstream's upstream. * It might be worth providing a selection of reconnection strategies * as different behaviour might be desirable in different situations; * or maybe the option not to reconnect might be required? 
 *
 * XXX check this handles replication slots gracefully
 */
static bool
do_upstream_standby_failover(t_node_info upstream_node)
{
	PGresult   *res;
	char		sqlquery[QUERY_STR_LEN];
	int			upstream_node_id = node_info.upstream_node_id;
	int			r;
	PQExpBufferData event_details;

	log_debug(_("do_upstream_standby_failover(): performing failover for node %i\n"),
			  node_info.node_id);

	/*
	 * Verify that we can still talk to the cluster master even though
	 * this node's upstream is not available
	 */
	if (!check_connection(&master_conn, "master", NULL))
	{
		log_err(_("do_upstream_standby_failover(): Unable to connect to last known master node\n"));
		return false;
	}

	while(1)
	{
		sqlquery_snprintf(sqlquery,
						  "SELECT id, active, upstream_node_id, type, conninfo "
						  "  FROM %s.repl_nodes "
						  " WHERE id = %i ",
						  get_repmgr_schema_quoted(master_conn),
						  upstream_node_id);

		res = PQexec(master_conn, sqlquery);

		if (PQresultStatus(res) != PGRES_TUPLES_OK)
		{
			log_err(_("unable to query cluster master: %s\n"), PQerrorMessage(master_conn));
			PQclear(res);
			return false;
		}

		if (PQntuples(res) == 0)
		{
			log_err(_("no node with id %i found\n"), upstream_node_id);
			PQclear(res);
			return false;
		}

		/* upstream node is inactive */
		if (strcmp(PQgetvalue(res, 0, 1), "f") == 0)
		{
			/*
			 * Upstream node is an inactive master, meaning there are no direct
			 * upstream nodes available to reattach to.
			 *
			 * XXX For now we'll simply terminate, however it would make sense to
			 * provide an option to either try and find the current master and/or
			 * a strategy to connect to a different upstream node
			 */
			/* column 3 is "type"; column 4 is "conninfo" and can never equal "master" */
			if (strcmp(PQgetvalue(res, 0, 3), "master") == 0)
			{
				log_err(_("unable to find active master node\n"));
				PQclear(res);
				return false;
			}

			upstream_node_id = atoi(PQgetvalue(res, 0, 2));
		}
		else
		{
			upstream_node_id = atoi(PQgetvalue(res, 0, 0));

			log_notice(_("found active upstream node with id %i\n"), upstream_node_id);
			PQclear(res);
			break;
		}

		PQclear(res);
		sleep(local_options.reconnect_interval);
	}

	/* Close the connection to this server */
	PQfinish(my_local_conn);
	my_local_conn = NULL;

	initPQExpBuffer(&event_details);

	/* Follow new upstream */
	r = system(local_options.follow_command);
	if (r != 0)
	{
		appendPQExpBuffer(&event_details,
						  _("Unable to execute follow command:\n %s"),
						  local_options.follow_command);

		log_err("%s\n", event_details.data);

		/* It won't be possible to write to the event notification
		 * table but we should be able to generate an external notification
		 * if required.
*/ create_event_record(NULL, &local_options, node_info.node_id, "repmgrd_failover_follow", false, event_details.data); terminate(ERR_BAD_CONFIG); } if (update_node_record_set_upstream(master_conn, local_options.cluster_name, node_info.node_id, upstream_node_id) == false) { appendPQExpBuffer(&event_details, _("Unable to set node %i's new upstream ID to %i"), node_info.node_id, upstream_node_id); create_event_record(NULL, &local_options, node_info.node_id, "repmgrd_failover_follow", false, event_details.data); terminate(ERR_BAD_CONFIG); } appendPQExpBuffer(&event_details, _("Node %i is now following upstream node %i"), node_info.node_id, upstream_node_id); create_event_record(NULL, &local_options, node_info.node_id, "repmgrd_failover_follow", true, event_details.data); my_local_conn = establish_db_connection(local_options.conninfo, true); return true; } static bool check_connection(PGconn **conn, const char *type, const char *conninfo) { int connection_retries; /* * Check if the node is still available if after * local_options.reconnect_attempts * local_options.reconnect_interval * seconds of retries we cannot reconnect return false */ for (connection_retries = 0; connection_retries < local_options.reconnect_attempts; connection_retries++) { if (*conn == NULL) { if (conninfo == NULL) { log_err("INTERNAL ERROR: *conn == NULL && conninfo == NULL"); terminate(ERR_INTERNAL); } *conn = establish_db_connection(conninfo, false); } if (!is_pgup(*conn, local_options.master_response_timeout)) { log_warning(_("connection to %s has been lost, trying to recover... 
%i seconds before failover decision\n"),
						type,
						(local_options.reconnect_interval * (local_options.reconnect_attempts - connection_retries)));
			/* wait local_options.reconnect_interval seconds between retries */
			sleep(local_options.reconnect_interval);
		}
		else
		{
			if (connection_retries > 0)
			{
				log_info(_("connection to %s has been restored.\n"), type);
			}
			return true;
		}
	}

	if (!is_pgup(*conn, local_options.master_response_timeout))
	{
		log_err(_("unable to reconnect to %s (timeout %i seconds)...\n"),
				type,
				local_options.master_response_timeout
			);

		return false;
	}

	return true;
}


/*
 * set_local_node_status()
 *
 * If failure of the local node is detected, attempt to connect
 * to the current master server (as stored in the global variable
 * `master_conn`) and mark the local node's record there as failed.
 */
static bool
set_local_node_status(void)
{
	PGresult   *res;
	char		sqlquery[QUERY_STR_LEN];
	int			active_master_node_id = NODE_NOT_FOUND;
	char		master_conninfo[MAXLEN];

	if (!check_connection(&master_conn, "master", NULL))
	{
		log_err(_("set_local_node_status(): Unable to connect to last known master node\n"));
		return false;
	}

	/*
	 * Check that the node `master_conn` is connected to is still
	 * master - it's just about conceivable that it might have become a
	 * standby of a new master in the intervening period
	 */
	sqlquery_snprintf(sqlquery,
					  "SELECT id, conninfo "
					  "  FROM %s.repl_nodes "
					  " WHERE type = 'master' "
					  "   AND active IS TRUE ",
					  get_repmgr_schema_quoted(master_conn));

	res = PQexec(master_conn, sqlquery);

	if (PQresultStatus(res) != PGRES_TUPLES_OK)
	{
		log_err(_("unable to obtain record for active master: %s\n"),
				PQerrorMessage(master_conn));

		/* release the result on the error path too */
		PQclear(res);
		return false;
	}

	if (!PQntuples(res))
	{
		log_err(_("no active master record found\n"));
		PQclear(res);
		return false;
	}

	active_master_node_id = atoi(PQgetvalue(res, 0, 0));
	strncpy(master_conninfo, PQgetvalue(res, 0, 1), MAXLEN);
	PQclear(res);

	if (active_master_node_id != master_options.node)
	{
		log_notice(_("current active master is %i; attempting to connect\n"),
active_master_node_id); PQfinish(master_conn); master_conn = establish_db_connection(master_conninfo, false); if (PQstatus(master_conn) != CONNECTION_OK) { log_err(_("unable to connect to active master\n")); return false; } log_notice(_("Connection to new master was successful\n")); } /* * Attempt to set the active record to the correct value. * First */ if (!update_node_record_status(master_conn, local_options.cluster_name, node_info.node_id, "standby", node_info.upstream_node_id, is_standby(my_local_conn)==1)) { log_err(_("unable to set local node %i as inactive on master: %s\n"), node_info.node_id, PQerrorMessage(master_conn)); return false; } log_notice(_("marking this node (%i) as inactive on master\n"), node_info.node_id); return true; } static void check_cluster_configuration(PGconn *conn) { PGresult *res; char sqlquery[QUERY_STR_LEN]; log_info(_("checking cluster configuration with schema '%s'\n"), get_repmgr_schema()); sqlquery_snprintf(sqlquery, "SELECT oid FROM pg_class " " WHERE oid = '%s.repl_nodes'::regclass ", get_repmgr_schema_quoted(master_conn)); res = PQexec(conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("PQexec failed: %s\n"), PQerrorMessage(conn)); PQclear(res); terminate(ERR_DB_QUERY); } /* * If there isn't any results then we have not configured a master node * yet in repmgr or the connection string is pointing to the wrong * database. * * XXX if we are the master, should we try to create the tables needed? 
*/ if (PQntuples(res) == 0) { log_err(_("the replication cluster is not configured\n")); PQclear(res); terminate(ERR_BAD_CONFIG); } PQclear(res); } static void check_node_configuration(void) { PGresult *res; char sqlquery[QUERY_STR_LEN]; /* * Check if this node has an entry in `repl_nodes` */ log_info(_("checking node %d in cluster '%s'\n"), local_options.node, local_options.cluster_name); sqlquery_snprintf(sqlquery, "SELECT COUNT(*) " " FROM %s.repl_nodes " " WHERE id = %d " " AND cluster = '%s' ", get_repmgr_schema_quoted(my_local_conn), local_options.node, local_options.cluster_name); res = PQexec(my_local_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_err(_("PQexec failed: %s\n"), PQerrorMessage(my_local_conn)); PQclear(res); terminate(ERR_BAD_CONFIG); } /* * If there isn't any results then we have not configured this node yet in * repmgr, if that is the case we will insert the node to the cluster, * except if it is a witness */ if (PQntuples(res) == 0) { PQclear(res); if (node_info.type == WITNESS) { log_err(_("The witness is not configured\n")); terminate(ERR_BAD_CONFIG); } /* Adding the node */ log_info(_("adding node %d to cluster '%s'\n"), local_options.node, local_options.cluster_name); sqlquery_snprintf(sqlquery, "INSERT INTO %s.repl_nodes" " (id, cluster, name, conninfo, priority, witness) " " VALUES (%d, '%s', '%s', '%s', 0, FALSE) ", get_repmgr_schema_quoted(master_conn), local_options.node, local_options.cluster_name, local_options.node_name, local_options.conninfo); if (!PQexec(master_conn, sqlquery)) { log_err(_("unable to insert node details, %s\n"), PQerrorMessage(master_conn)); terminate(ERR_BAD_CONFIG); } } else { PQclear(res); } } /* * lsn_to_xlogrecptr() * * Convert an LSN represented as a string to an XLogRecPtr; * optionally set a flag to indicated the provided string * could not be parsed */ static XLogRecPtr lsn_to_xlogrecptr(char *lsn, bool *format_ok) { uint32 xlogid; uint32 xrecoff; if (sscanf(lsn, "%X/%X", &xlogid, 
&xrecoff) != 2) { if (format_ok != NULL) *format_ok = false; log_err(_("incorrect log location format: %s\n"), lsn); return 0; } if (format_ok != NULL) *format_ok = true; return (((XLogRecPtr) xlogid * 16 * 1024 * 1024 * 255) + xrecoff); } void usage(void) { log_err(_("%s: Replicator manager daemon \n"), progname()); log_err(_("Try \"%s --help\" for more information.\n"), progname()); } void help(void) { printf(_("%s: replication management daemon for PostgreSQL\n"), progname()); printf(_("\n")); printf(_("Usage:\n")); printf(_(" %s [OPTIONS]\n"), progname()); printf(_("\n")); printf(_("Options:\n")); printf(_(" -?, --help show this help, then exit\n")); printf(_(" -V, --version output version information, then exit\n")); printf(_(" -v, --verbose output verbose activity information\n")); printf(_(" -m, --monitoring-history track advance or lag of the replication in every standby in repl_monitor\n")); printf(_(" -f, --config-file=PATH path to the configuration file\n")); printf(_(" -d, --daemonize detach process from foreground\n")); printf(_(" -p, --pid-file=PATH write a PID file\n")); printf(_("\n")); printf(_("%s monitors a cluster of servers and optionally performs failover.\n"), progname()); } #ifndef WIN32 static void handle_sigint(SIGNAL_ARGS) { terminate(0); } /* SIGHUP: set flag to re-read config file at next convenient time */ static void handle_sighup(SIGNAL_ARGS) { got_SIGHUP = true; } static void setup_event_handlers(void) { pqsignal(SIGHUP, handle_sighup); pqsignal(SIGINT, handle_sigint); pqsignal(SIGTERM, handle_sigint); } #endif static void terminate(int retval) { close_connections(); logger_shutdown(); if (pid_file) { unlink(pid_file); } log_info(_("%s terminating...\n"), progname()); exit(retval); } static void update_shared_memory(char *last_wal_standby_applied) { PGresult *res; char sqlquery[QUERY_STR_LEN]; sprintf(sqlquery, "SELECT %s.repmgr_update_standby_location('%s')", get_repmgr_schema_quoted(my_local_conn), last_wal_standby_applied); /* If 
an error happens, just inform about that and continue */ res = PQexec(my_local_conn, sqlquery); if (PQresultStatus(res) != PGRES_TUPLES_OK) { log_warning(_("Cannot update this standby's shared memory: %s\n"), PQerrorMessage(my_local_conn)); /* XXX is this enough reason to terminate this repmgrd? */ } else if (strcmp(PQgetvalue(res, 0, 0), "f") == 0) { /* this surely is more than enough reason to exit */ log_crit(_("Cannot update this standby's shared memory, maybe shared_preload_libraries=repmgr_funcs is not set?\n")); exit(ERR_BAD_CONFIG); } PQclear(res); } static void update_registration(void) { PGresult *res; char sqlquery[QUERY_STR_LEN]; sqlquery_snprintf(sqlquery, "UPDATE %s.repl_nodes " " SET conninfo = '%s', " " priority = %d " " WHERE id = %d ", get_repmgr_schema_quoted(master_conn), local_options.conninfo, local_options.priority, local_options.node); res = PQexec(master_conn, sqlquery); if (PQresultStatus(res) != PGRES_COMMAND_OK) { PQExpBufferData errmsg; initPQExpBuffer(&errmsg); appendPQExpBuffer(&errmsg, _("unable to update registration: %s"), PQerrorMessage(master_conn)); log_err("%s\n", errmsg.data); create_event_record(master_conn, &local_options, local_options.node, "repmgrd_shutdown", false, errmsg.data); terminate(ERR_DB_CON); } PQclear(res); } static void do_daemonize() { char *ptr, path[MAXLEN]; pid_t pid = fork(); int ret; switch (pid) { case -1: log_err("Error in fork(): %s\n", strerror(errno)); exit(ERR_SYS_FAILURE); break; case 0: /* child process */ pid = setsid(); if (pid == (pid_t) -1) { log_err("Error in setsid(): %s\n", strerror(errno)); exit(ERR_SYS_FAILURE); } /* ensure that we are no longer able to open a terminal */ pid = fork(); if (pid == -1) /* error case */ { log_err("Error in fork(): %s\n", strerror(errno)); exit(ERR_SYS_FAILURE); break; } if (pid != 0) /* parent process */ { exit(0); } /* a child just flows along */ memset(path, 0, MAXLEN); for (ptr = config_file + strlen(config_file); ptr > config_file; --ptr) { if (*ptr == 
'/')
				{
					strncpy(path, config_file, ptr - config_file);
				}
			}

			if (*path == '\0')
			{
				*path = '/';
			}

			ret = chdir(path);
			if (ret != 0)
			{
				log_err("Error changing directory to '%s': %s\n", path, strerror(errno));
			}

			break;

		default: /* parent process */
			exit(0);
	}
}

static void
check_and_create_pid_file(const char *pid_file)
{
	struct stat st;
	FILE	   *fd;
	char		buff[MAXLEN];
	pid_t		pid;
	size_t		nread;

	if (stat(pid_file, &st) != -1)
	{
		memset(buff, 0, MAXLEN);

		fd = fopen(pid_file, "r");

		if (fd == NULL)
		{
			log_err("PID file %s exists but could not be opened for reading. "
					"If repmgrd is no longer alive remove the file and restart repmgrd.\n",
					pid_file);
			exit(ERR_BAD_CONFIG);
		}

		nread = fread(buff, MAXLEN - 1, 1, fd);

		if (nread == 0 && ferror(fd))
		{
			log_err("Error reading PID file '%s', giving up...\n", pid_file);
			exit(ERR_BAD_CONFIG);
		}

		fclose(fd);

		pid = atoi(buff);

		if (pid != 0)
		{
			if (kill(pid, 0) != -1)
			{
				log_err("PID file %s exists and seems to contain a valid PID. "
						"If repmgrd is no longer alive remove the file and restart repmgrd.\n",
						pid_file);
				exit(ERR_BAD_CONFIG);
			}
		}
	}

	fd = fopen(pid_file, "w");
	if (fd == NULL)
	{
		log_err("Could not open PID file %s!\n", pid_file);
		exit(ERR_BAD_CONFIG);
	}

	fprintf(fd, "%d", getpid());
	fclose(fd);
}


t_node_info
get_node_info(PGconn *conn, char *cluster, int node_id)
{
	PGresult   *res;

	t_node_info node_info = T_NODE_INFO_INITIALIZER;

	res = get_node_record(conn, cluster, node_id);

	if (PQresultStatus(res) != PGRES_TUPLES_OK)
	{
		PQExpBufferData errmsg;
		initPQExpBuffer(&errmsg);
		appendPQExpBuffer(&errmsg,
						  _("unable to retrieve record for node %i: %s"),
						  node_id,
						  PQerrorMessage(conn));

		log_err("%s\n", errmsg.data);

		create_event_record(NULL,
							&local_options,
							local_options.node,
							"repmgrd_shutdown",
							false,
							errmsg.data);

		PQclear(res);
		terminate(ERR_DB_QUERY);
	}

	if (!PQntuples(res))
	{
		log_warning(_("No record found for node %i\n"), node_id);
		PQclear(res);
		node_info.node_id = NODE_NOT_FOUND;
		return node_info;
	}

	node_info.node_id = atoi(PQgetvalue(res, 0, 0));
node_info.upstream_node_id = atoi(PQgetvalue(res, 0, 1)); strncpy(node_info.conninfo_str, PQgetvalue(res, 0, 2), MAXLEN); node_info.type = parse_node_type(PQgetvalue(res, 0, 3)); strncpy(node_info.slot_name, PQgetvalue(res, 0, 4), MAXLEN); node_info.active = (strcmp(PQgetvalue(res, 0, 5), "t") == 0) ? true : false; PQclear(res); return node_info; } static t_server_type parse_node_type(const char *type) { if (strcmp(type, "master") == 0) { return MASTER; } else if (strcmp(type, "standby") == 0) { return STANDBY; } else if (strcmp(type, "witness") == 0) { return WITNESS; } return UNKNOWN; } repmgr-3.0.3/sql/000077500000000000000000000000001264264412200136135ustar00rootroot00000000000000repmgr-3.0.3/sql/Makefile000066400000000000000000000006311264264412200152530ustar00rootroot00000000000000# # Makefile # # Copyright (c) 2ndQuadrant, 2010-2015 # MODULE_big = repmgr_funcs DATA_built=repmgr_funcs.sql DATA=uninstall_repmgr_funcs.sql OBJS=repmgr_funcs.o ifdef USE_PGXS PG_CONFIG = pg_config PGXS := $(shell $(PG_CONFIG) --pgxs) include $(PGXS) else subdir = contrib/repmgr/sql top_builddir = ../../.. include $(top_builddir)/src/Makefile.global include $(top_srcdir)/contrib/contrib-global.mk endif repmgr-3.0.3/sql/repmgr2_repmgr3.sql000066400000000000000000000041511264264412200173520ustar00rootroot00000000000000/* * Update a repmgr 2.x installation to repmgr 3.0 * ---------------------------------------------- * * 1. Stop any running repmgrd instances * 2. On the master node, execute the SQL statements listed below, * taking care to identify the master node and any inactive * nodes * 3. 
Restart repmgrd (being sure to use repmgr 3.0)
 */

/*
 * Set the search path to the name of the schema used by
 * your repmgr installation
 * (this should be "repmgr_" + the cluster name defined in
 * 'repmgr.conf')
 */

-- SET search_path TO 'name_of_repmgr_schema';

BEGIN;

ALTER TABLE repl_nodes RENAME TO repl_nodes2_0;

CREATE TABLE repl_nodes (
  id               INTEGER PRIMARY KEY,
  type             TEXT    NOT NULL CHECK (type IN('master','standby','witness')),
  upstream_node_id INTEGER NULL REFERENCES repl_nodes (id),
  cluster          TEXT    NOT NULL,
  name             TEXT    NOT NULL,
  conninfo         TEXT    NOT NULL,
  slot_name        TEXT    NULL,
  priority         INTEGER NOT NULL,
  active           BOOLEAN NOT NULL DEFAULT TRUE
);

INSERT INTO repl_nodes
           (id, type, cluster, name, conninfo, priority)
     SELECT id,
            CASE
              WHEN witness IS TRUE THEN 'witness'
              ELSE 'standby'
            END AS type,
            cluster,
            name,
            conninfo,
            priority + 100
       FROM repl_nodes2_0;

/*
 * You'll need to set the master explicitly; the following query
 * should identify the master node ID but will only work if all
 * standby servers are connected:
 *
 *   SELECT id FROM repmgr_test.repl_nodes WHERE name NOT IN (SELECT application_name FROM pg_stat_replication)
 *
 * If in doubt, executing 'repmgr cluster show' will definitively identify
 * the master.
*/ UPDATE repl_nodes SET type = 'master' WHERE id = $master_id; /* If any nodes are known to be inactive, update them here */ -- UPDATE repl_nodes SET active = FALSE WHERE id IN (...); /* When you're sure of your changes, commit them */ -- COMMIT; /* * execute the following command when you are sure you no longer * require the old table: */ -- DROP TABLE repl_nodes2_0; repmgr-3.0.3/sql/repmgr_funcs.c000066400000000000000000000122221264264412200164500ustar00rootroot00000000000000/* * repmgr_funcs.c * Copyright (c) 2ndQuadrant, 2010 * * Shared memory state management and some backend functions in SQL */ #include "postgres.h" #include "fmgr.h" #include "access/xlog.h" #include "miscadmin.h" #include "replication/walreceiver.h" #include "storage/ipc.h" #include "storage/lwlock.h" #include "storage/procarray.h" #include "storage/shmem.h" #include "storage/spin.h" #include "utils/builtins.h" #include "utils/timestamp.h" /* same definition as the one in xlog_internal.h */ #define MAXFNAMELEN 64 PG_MODULE_MAGIC; /* * Global shared state */ typedef struct repmgrSharedState { LWLockId lock; /* protects search/modification */ char location[MAXFNAMELEN]; /* last known xlog location */ TimestampTz last_updated; } repmgrSharedState; /* Links to shared memory state */ static repmgrSharedState *shared_state = NULL; static shmem_startup_hook_type prev_shmem_startup_hook = NULL; void _PG_init(void); void _PG_fini(void); static void repmgr_shmem_startup(void); static Size repmgr_memsize(void); static bool repmgr_set_standby_location(char *locationstr); Datum repmgr_update_standby_location(PG_FUNCTION_ARGS); Datum repmgr_get_last_standby_location(PG_FUNCTION_ARGS); PG_FUNCTION_INFO_V1(repmgr_update_standby_location); PG_FUNCTION_INFO_V1(repmgr_get_last_standby_location); Datum repmgr_update_last_updated(PG_FUNCTION_ARGS); Datum repmgr_get_last_updated(PG_FUNCTION_ARGS); PG_FUNCTION_INFO_V1(repmgr_update_last_updated); PG_FUNCTION_INFO_V1(repmgr_get_last_updated); /* * Module load 
callback */ void _PG_init(void) { /* * In order to create our shared memory area, we have to be loaded via * shared_preload_libraries. If not, fall out without hooking into any of * the main system. (We don't throw error here because it seems useful to * allow the repmgr functions to be created even when the module isn't * active. The functions must protect themselves against being called * then, however.) */ if (!process_shared_preload_libraries_in_progress) return; /* * Request additional shared resources. (These are no-ops if we're not in * the postmaster process.) We'll allocate or attach to the shared * resources in repmgr_shmem_startup(). */ RequestAddinShmemSpace(repmgr_memsize()); RequestAddinLWLocks(1); /* * Install hooks. */ prev_shmem_startup_hook = shmem_startup_hook; shmem_startup_hook = repmgr_shmem_startup; } /* * Module unload callback */ void _PG_fini(void) { /* Uninstall hooks. */ shmem_startup_hook = prev_shmem_startup_hook; } /* * shmem_startup hook: allocate or attach to shared memory, */ static void repmgr_shmem_startup(void) { bool found; if (prev_shmem_startup_hook) prev_shmem_startup_hook(); /* reset in case this is a restart within the postmaster */ shared_state = NULL; /* * Create or attach to the shared memory state, including hash table */ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE); shared_state = ShmemInitStruct("repmgr shared state", sizeof(repmgrSharedState), &found); if (!found) { /* First time through ... */ shared_state->lock = LWLockAssign(); snprintf(shared_state->location, sizeof(shared_state->location), "%X/%X", 0, 0); } LWLockRelease(AddinShmemInitLock); } /* * Estimate shared memory space needed. */ static Size repmgr_memsize(void) { return MAXALIGN(sizeof(repmgrSharedState)); } static bool repmgr_set_standby_location(char *locationstr) { /* Safety check... 
*/
	if (!shared_state)
		return false;

	LWLockAcquire(shared_state->lock, LW_EXCLUSIVE);
	strncpy(shared_state->location, locationstr, MAXFNAMELEN);
	LWLockRelease(shared_state->lock);

	return true;
}


/* SQL Functions */

/* Read last xlog location reported by this standby from shared memory */
Datum
repmgr_get_last_standby_location(PG_FUNCTION_ARGS)
{
	char		location[MAXFNAMELEN];

	/* Safety check... */
	if (!shared_state)
		PG_RETURN_NULL();

	LWLockAcquire(shared_state->lock, LW_SHARED);
	strncpy(location, shared_state->location, MAXFNAMELEN);
	LWLockRelease(shared_state->lock);

	PG_RETURN_TEXT_P(cstring_to_text(location));
}


/* Store the last xlog location reported by this standby in shared memory */
Datum
repmgr_update_standby_location(PG_FUNCTION_ARGS)
{
	text	   *location = PG_GETARG_TEXT_P(0);
	char	   *locationstr;

	/* Safety check... */
	if (!shared_state)
		PG_RETURN_BOOL(false);

	locationstr = text_to_cstring(location);

	PG_RETURN_BOOL(repmgr_set_standby_location(locationstr));
}


/* update and return last updated with current timestamp */
Datum
repmgr_update_last_updated(PG_FUNCTION_ARGS)
{
	TimestampTz last_updated = GetCurrentTimestamp();

	/* Safety check... */
	if (!shared_state)
		PG_RETURN_NULL();

	/* this writes to shared state, so the lock must be held exclusively */
	LWLockAcquire(shared_state->lock, LW_EXCLUSIVE);
	shared_state->last_updated = last_updated;
	LWLockRelease(shared_state->lock);

	PG_RETURN_TIMESTAMPTZ(last_updated);
}


/* get last updated timestamp */
Datum
repmgr_get_last_updated(PG_FUNCTION_ARGS)
{
	TimestampTz last_updated;

	/* Safety check...
*/
	if (!shared_state)
		PG_RETURN_NULL();

	/* read-only access, so a shared lock is sufficient */
	LWLockAcquire(shared_state->lock, LW_SHARED);
	last_updated = shared_state->last_updated;
	LWLockRelease(shared_state->lock);

	PG_RETURN_TIMESTAMPTZ(last_updated);
}

repmgr-3.0.3/sql/repmgr_funcs.sql.in
/*
 * repmgr_function.sql
 * Copyright (c) 2ndQuadrant, 2010-2015
 *
 */

-- SET SEARCH_PATH TO 'repmgr';

CREATE FUNCTION repmgr_update_standby_location(text) RETURNS boolean
AS 'MODULE_PATHNAME', 'repmgr_update_standby_location'
LANGUAGE C STRICT;

CREATE FUNCTION repmgr_get_last_standby_location() RETURNS text
AS 'MODULE_PATHNAME', 'repmgr_get_last_standby_location'
LANGUAGE C STRICT;

CREATE FUNCTION repmgr_update_last_updated() RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'repmgr_update_last_updated'
LANGUAGE C STRICT;

CREATE FUNCTION repmgr_get_last_updated() RETURNS TIMESTAMP WITH TIME ZONE
AS 'MODULE_PATHNAME', 'repmgr_get_last_updated'
LANGUAGE C STRICT;

repmgr-3.0.3/sql/uninstall_repmgr_funcs.sql
/*
 * uninstall_repmgr_funcs.sql
 * Copyright (c) 2ndQuadrant, 2010-2015
 *
 */

DROP FUNCTION repmgr_update_standby_location(text);
DROP FUNCTION repmgr_get_last_standby_location();
DROP FUNCTION repmgr_update_last_updated();
DROP FUNCTION repmgr_get_last_updated();

repmgr-3.0.3/strutil.c
/*
 * strutil.c
 *
 * Copyright (C) 2ndQuadrant, 2010-2015
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 * See the GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 *
 */

#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#include "log.h"
#include "strutil.h"

static int
xvsnprintf(char *str, size_t size, const char *format, va_list ap)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 0)));


/*
 * vsnprintf() wrapper which exits with ERR_STR_OVERFLOW if the formatted
 * output does not fit into the supplied buffer.
 */
static int
xvsnprintf(char *str, size_t size, const char *format, va_list ap)
{
	int			retval;

	retval = vsnprintf(str, size, format, ap);

	/* vsnprintf() returns the length the full string would have needed */
	if (retval >= (int) size)
	{
		log_err(_("Buffer not large enough to format entire string '%s'\n"),
				str);
		exit(ERR_STR_OVERFLOW);
	}

	return retval;
}


/* Format into a buffer of caller-specified size */
int
xsnprintf(char *str, size_t size, const char *format,...)
{
	va_list		arglist;
	int			retval;

	va_start(arglist, format);
	retval = xvsnprintf(str, size, format, arglist);
	va_end(arglist);

	return retval;
}


/* Format into a buffer which must be at least QUERY_STR_LEN bytes */
int
sqlquery_snprintf(char *str, const char *format,...)
{
	va_list		arglist;
	int			retval;

	va_start(arglist, format);
	retval = xvsnprintf(str, QUERY_STR_LEN, format, arglist);
	va_end(arglist);

	return retval;
}


/* Format into a buffer which must be at least MAXLEN bytes */
int
maxlen_snprintf(char *str, const char *format,...)
{
	va_list		arglist;
	int			retval;

	va_start(arglist, format);
	retval = xvsnprintf(str, MAXLEN, format, arglist);
	va_end(arglist);

	return retval;
}
repmgr-3.0.3/strutil.h000066400000000000000000000024351264264412200146770ustar00rootroot00000000000000
/*
 * strutil.h
 * Copyright (C) 2ndQuadrant, 2010-2015
 *
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 *
 */

#ifndef _STRUTIL_H_
#define _STRUTIL_H_

#include <stdlib.h>
#include "errcode.h"


#define QUERY_STR_LEN	8192
#define MAXLEN			1024
#define MAXLINELENGTH	4096
#define MAXVERSIONSTR	16
#define MAXCONNINFO		1024


extern int
xsnprintf(char *str, size_t size, const char *format,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 3, 4)));

extern int
sqlquery_snprintf(char *str, const char *format,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3)));

extern int
maxlen_snprintf(char *str, const char *format,...)
__attribute__((format(PG_PRINTF_ATTRIBUTE, 2, 3)));

#endif   /* _STRUTIL_H_ */
repmgr-3.0.3/uninstall_repmgr.sql000066400000000000000000000003321264264412200171200ustar00rootroot00000000000000
/*
 * uninstall_repmgr.sql
 *
 * Copyright (C) 2ndQuadrant, 2010-2015
 *
 */

-- Drop the view first, as it may depend on the tables below
DROP VIEW IF EXISTS repl_status;
DROP TABLE IF EXISTS repl_monitor;
DROP TABLE IF EXISTS repl_nodes;

DROP SCHEMA repmgr;
DROP USER repmgr;
repmgr-3.0.3/version.h000066400000000000000000000001201264264412200146430ustar00rootroot00000000000000
#ifndef _VERSION_H_
#define _VERSION_H_

#define REPMGR_VERSION "3.0.3"

#endif