![]() |
VOOZH | about |
A PostgreSQL database cluster refers to a collection of databases managed by a single instance of the PostgreSQL server. Setting up a PostgreSQL cluster is an essential task for organizing multiple databases and achieving high availability, scalability, and load balancing.
Whether we are working with multiple nodes or just a single server with many databases, understanding how to set up a PostgreSQL cluster is key to effective data management. This article will discuss setting up a PostgreSQL cluster, including installation, configuration, and basic cluster management.
These steps install the PostgreSQL server and the additional tools necessary for managing and enhancing PostgreSQL Database Cluster. By enabling the service, it ensures that PostgreSQL is ready to accept connections upon system startup.
Before setting up a PostgreSQL cluster, we need to install PostgreSQL on our system. Here’s how to do it on different operating systems. If PostgreSQL is not yet installed on our system, we can install it with the following commands.
sudo apt update
sudo apt install postgresql postgresql-contrib
sudo yum install postgresql-server postgresql-contribInitializing the Database Cluster
Once the installation is complete, initialize the database cluster by running. This ensures that the PostgreSQL service starts and runs automatically on boot
sudo systemctl start postgresql
sudo systemctl enable postgresql
Understanding the components of a PostgreSQL cluster helps us to manage databases effectively. The server is responsible for processing requests, while the data directory is where all database files are stored. When you set up PostgreSQL, we are essentially creating a cluster.
A PostgreSQL cluster consists of:
This data directory is created automatically during installation, but we can create additional clusters if necessary
If we want to create a new PostgreSQL cluster (besides the default one), you need to initialize it manually. The initdb command is used to initialize a new cluster.
Query:
sudo mkdir /newdata
sudo chown postgres:postgres /newdata
sudo -u postgres initdb -D /newdata
Explanation:
In this example, the new cluster will store its data in the /newdata directory. Make sure PostgreSQL has permission to access this directory by setting the owner to the postgres user.
The pg_ctl command starts the PostgreSQL server with the specified data directory. Using psql, we can connect to the PostgreSQL prompt and manage databases within the new cluster.
sudo -u postgres pg_ctl -D /newdata startWe can connect to the new cluster using the psql tool. This opens the PostgreSQL prompt where we can start managing the databases inside the new cluster.
psql -d postgresPostgreSQL allows us to manage multiple clusters on a single machine, each with its own data directory and configurations. If we want to manage multiple clusters, we can use the pg_lsclusters and pg_ctlcluster commands (specific to Ubuntu/Debian systems).
The pg_lsclusters command lists all the PostgreSQL clusters on the machine, showing their status.
sudo pg_lsclustersThis is useful if you have more than one PostgreSQL version or multiple clusters with different configurations.
sudo pg_ctlcluster <version> <cluster-name> startFor high availability and scalability, we can set up streaming replication within the PostgreSQL cluster. This involves setting up a primary server (master) and one or more standby servers (replicas). Here’s a basic overview of setting up streaming replication.
Edit the postgresql.conf file on the primary server:
sudo nano /var/lib/pgsql/data/postgresql.confUncomment and set the following parameters:
wal_level = replica
max_wal_senders = 3
Next, edit the pg_hba.conf file to allow replication connections from the standby server:
host replication all <standby-ip-address>/32 md5Use pg_basebackup to create a base backup of the primary server data, which the standby will use to replicate:
sudo -u postgres pg_basebackup -h <primary-server-ip> -D /var/lib/pgsql/data -P -U replication -Fp -Xs -RStart the standby server, and it will begin streaming changes from the primary server:
sudo systemctl start postgresqlThese commands are essential for managing the PostgreSQL service. Restarting and stopping the service allows for maintenance and configuration changes, while checking the status ensures the service is running correctly.
Restart PostgreSQL Cluster:
sudo systemctl restart postgresqlStop PostgreSQL Cluster:
sudo systemctl stop postgresqlChecking the Cluster Status:
sudo systemctl status postgresqlSetting up a PostgreSQL database cluster is important for managing multiple databases and enhancing availability and performance. Whether we are managing a single server or multiple nodes, PostgreSQL provides flexible clustering options, including multi-database management and replication.