SME File Fabric HA Setup "2 x 2"

Disclaimer

The information in this document is provided on an as-is basis. You use it at your own risk. We accept no responsibility for errors or omissions, nor do we have any obligation to provide support for implementing or maintaining the configuration described here. Furthermore, we do not warrant that the design presented here is appropriate for your requirements.

SME designs, implements and supports HA File Fabric solutions for customers on a paid professional services basis. For more information please contact sales@storagemadeeasy.com

Introduction

The SME Cloud Control appliance as shipped is configured for deployment on a single virtual machine. However, a common production deployment consists of redundant web frontends in front of a master-slave MySQL deployment.

Deploying multiple web frontends and a Master-Slave database increases the availability of your SME Cloud and the number of concurrent users it can serve. It does so by reducing single points of failure, and it allows native MySQL backups without downtime.

Scope

This guide details how to deploy SME with multiple web frontends and a Master-Slave database. While multi-master synchronous MySQL clusters are also possible, they are beyond the scope of this document. Please contact your Account Manager to discuss MySQL clustering options.

The SME Deep Search Appliance is also not covered in this guide.

Part I

Assumptions

This guide assumes you have working knowledge and an understanding of Linux operating systems, databases, etc. If any questions come up, please contact your account manager or SME support.

For this guide we are using the following hostnames: smeweb01, smeweb02, smesql01, smesql02 and smesql-vip. You are of course free to select your own names to match your naming scheme.

In addition, you should have DNS configured and verified for the above 5 DNS records and IP addresses, and you should have opened any internal firewalls that could restrict necessary traffic between the systems.
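
The DNS check described above could be scripted roughly as follows. This is a minimal sketch, assuming `getent` is available on the machine you run it from; the function name `check_dns` is hypothetical and not part of the SME appliance.

```shell
#!/bin/sh
# check_dns (hypothetical helper): report whether each given hostname
# resolves on this machine. Returns non-zero if any name fails to resolve.
check_dns() {
    rc=0
    for name in "$@"; do
        if addr=$(getent hosts "$name"); then
            # getent prints "ADDRESS   NAME..."; keep only the address.
            echo "OK $name ${addr%% *}"
        else
            echo "MISSING $name"
            rc=1
        fi
    done
    return $rc
}

# Example with the hostnames used in this guide:
# check_dns smeweb01 smeweb02 smesql01 smesql02 smesql-vip
```

Running it on each of the four appliances confirms that every machine sees the same five records.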

Initial State

This guide assumes you set up the four appliances following the instructions in the Appliance Installation guide. So that the application can easily follow a database failover, you must assign a virtual IP address (VIP) to the database servers. You can then change the primary DB server without changing the configuration on the application servers. There are many ways in Linux to configure automatic failover of the VIP, but this guide configures the DB servers for manual failover, which is often preferred with master-slave replication.

Preparation

Before you start, please be sure to collect / prepare the necessary information.

  • 4 SME Appliances deployed
  • SME Linux root password
  • SME Linux smeconfiguser password
  • 5 IP addresses on your LAN, 2 for application servers, 2 for database servers and a virtual IP (VIP) for the master DB server.
  • 5 DNS names configured for the IPs.
  • New DB user for app server and password, in this guide we use smestoreremote.
  • New DB user for DB server replication, in this guide we use repl_user.

Linux Login

For Linux command line operations, you must run the commands shown in this document as the root user unless otherwise specified. However, for security reasons you cannot ssh to the machine as root directly. Instead, ssh to the box as smeconfiguser and then su to root:

# ssh smeconfiguser@smeweb01

Enter the smeconfiguser password at the prompt. Once logged in, elevate your privileges to root.

# su -

Enter the root password at the prompt.

The '-' (minus sign) in the su command is important: it starts a login shell with root's environment variables, which gives access to restricted “root level” commands.

Database Login

For database operations, to login to the smestorage database, you do not need a password. As root from the command line, type:

# mysql smestorage

Note: This command only works if you are root. The smeconfiguser account does not have the privileges to run it.

Stop services

Before we start the configuration, ensure that all services are stopped, and nothing is connecting to the databases. The below commands should be executed on all four servers as the root user:

# systemctl stop httpd
# systemctl stop cloudftp
# systemctl stop vsftpd

Part II

Configuring the Database Servers

You must perform these steps to create a specialized database server from the standard SME appliance distribution. In this guide we also run memcached as a part of the DB server.

Restrict external access

The database server does not serve web pages and does not need to be accessible from the WAN. The only traffic you need to allow is TCP ports 3306 and 11211 between the four SME appliances.

Disable unnecessary services

The DB server will only be used for memcached and database services, so the following services are unnecessary and can be disabled. The below commands should be executed on both smesql01 and smesql02 as the root user.

# systemctl disable httpd
# systemctl disable cloudftp
# systemctl disable vsftpd

crontab

You must also disable the jobs in crontab; they should only run on one application server. Again, as root on smesql01 and smesql02:

# crontab -e -u smestorage

Place a “#” in front of each of the seven jobs listed, so that every job line is commented out.
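
If you prefer not to edit the crontab by hand, the same change could be made with a small filter. This is only a sketch: the function name `comment_out_jobs` is hypothetical, and it comments out every active line, including any environment assignments such as MAILTO, not just the seven jobs.

```shell
#!/bin/sh
# comment_out_jobs (hypothetical helper): read a crontab on stdin and write
# it to stdout with every active line prefixed by "#". Blank lines and
# lines that are already comments are left untouched.
comment_out_jobs() {
    sed -E '/^[[:space:]]*(#|$)/!s/^/#/'
}

# Possible usage as root; review the result with "crontab -l -u smestorage":
# crontab -l -u smestorage | comment_out_jobs | crontab -u smestorage -
```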

iptables for dbservers

On both smesql01 and smesql02, you must update iptables to allow incoming connections to mariadb and memcached.

As root:

# systemctl stop iptables
# vi /etc/sysconfig/iptables

Comment out lines 16 through 21; these are the rules for ports 21, 80, 8080, 443, 990 and 2200.

Then add the following two lines after line 21:

-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 11211 -j ACCEPT
-A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 3306 -j ACCEPT

Note: Port 3306 is the database port, 11211 is the port for memcached. If the traffic from the SME webservers will go through a firewall on your LAN, you have to ensure these two ports are also allowed there.

Restart the firewall:

# systemctl start iptables
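
As a quick sanity check of the edited rules file, something like the following could confirm that an uncommented ACCEPT rule exists for each required port. It is a sketch under the assumption that the rules follow the "-A ... --dport PORT -j ACCEPT" form shown above; `ports_allowed` is a hypothetical name.

```shell
#!/bin/sh
# ports_allowed (hypothetical helper): succeed only if the given rules file
# contains an active (not commented out) ACCEPT rule for every listed port.
ports_allowed() {
    rules_file=$1
    shift
    for port in "$@"; do
        if ! grep -Eq -- "^-A .*--dport $port .*-j ACCEPT" "$rules_file"; then
            echo "no active ACCEPT rule for port $port"
            return 1
        fi
    done
    return 0
}

# Example on smesql01 and smesql02:
# ports_allowed /etc/sysconfig/iptables 11211 3306 && echo "rules present"
```

This only inspects the file; it does not prove that traffic actually passes any other firewall on your LAN.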

Memcached configuration

By default, the appliance memcached service listens only for connections from localhost. To share a memcache between the two application servers, we must change that. For redundancy we will do this on both smesql01 and smesql02.

To allow other machines to connect, edit /etc/sysconfig/memcached.

# vi /etc/sysconfig/memcached 

Change the line

OPTIONS="-l 127.0.0.1"

to

OPTIONS="-l 0.0.0.0"

Then restart the memcached service:

# systemctl restart memcached 
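
The edit to /etc/sysconfig/memcached could also be applied non-interactively. The sketch below assumes the OPTIONS line reads exactly as shown above; if your file differs, the command changes nothing. The function name `set_memcached_listen` is hypothetical.

```shell
#!/bin/sh
# set_memcached_listen (hypothetical helper): switch memcached from
# listening on localhost only to listening on all interfaces.
# $1 is the path to the sysconfig file; a copy is kept as "$1.bak".
set_memcached_listen() {
    sed -i.bak 's/^OPTIONS="-l 127\.0\.0\.1"$/OPTIONS="-l 0.0.0.0"/' "$1"
}

# Example as root on smesql01 and smesql02:
# set_memcached_listen /etc/sysconfig/memcached && systemctl restart memcached
```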

MySQL configuration for HA

Master MySQL Database

The settings for MySQL are stored in /etc/my.cnf.d/localhost.cnf. By default there are minimal settings in this file, so we will replace its entire content. As root on the primary SQL server, smesql01 only:

# vi /etc/my.cnf.d/localhost.cnf 

Paste the following:

[mysqld]
# Allow remote addresses to connect to this server.
bind-address = 0.0.0.0
# Avoid sysdate() ambiguity.
sysdate-is-now=1

#+REPLICATION MASTER
#============
# server-id is 1 for the master. (The slave has server-id=2.)
server-id=1
# Transaction logs on the master are written to files named
# /var/lib/mysql/master-tx-log-bin.NNNNNN
log-bin = /var/lib/mysql/master-tx-log-bin
log-bin-index = /var/lib/mysql/master-tx-log-bin-index
expire_logs_days = 10
max_binlog_size = 100M
binlog_format=row

#============
#-REPLICATION

Then restart the database:

# systemctl restart mariadb

Active VIP

To configure the active VIP address on the master server, create a virtual interface configuration file:

# vi /etc/sysconfig/network-scripts/ifcfg-eth0:0 

Paste the following content in the newly created file, adjusting for your IP and netmask, etc.

# VIP ip configuration
DEVICE="eth0:0"
BOOTPROTO=none
IPADDR=172.17.2.110
NETMASK=255.255.255.0
USERCTL=YES
ONPARENT=YES
ONBOOT=NO
#EOF

Once saved, execute:

# /sbin/ifup eth0:0

From another host, now try to ping smesql-vip or the IP address.

To disable the VIP, or before running ifup on the slave, execute:

# /sbin/ifdown eth0:0

Slave MySQL Database

The settings for MySQL are stored in /etc/my.cnf.d/localhost.cnf. By default there are minimal settings in this file, so we will replace its entire content.

As root on the secondary SQL server, smesql02 only:

# vi /etc/my.cnf.d/localhost.cnf

Paste the following:

[mysqld]
# Allow remote addresses to connect to this server.
bind-address = 0.0.0.0
# Avoid sysdate() ambiguity.
sysdate-is-now=1

#+REPLICATION SLAVE
#============
#Server-id is 2 for the slave. (The master has server-id=1.)
server-id=2
#Transaction logs from the master are replicated/relayed to the
#slave in files named /var/lib/mysql/slave-relay-log-bin.NNNNNN

relay-log = /var/lib/mysql/slave-relay-log-bin
relay-log-index = /var/lib/mysql/slave-relay-log-bin-index
#============
#-REPLICATION

Then restart the database:

# systemctl restart mariadb

Passive VIP

To configure the passive VIP address on the slave server, create a virtual interface configuration file:

# vi /etc/sysconfig/network-scripts/ifcfg-eth0:0

Paste the following content in the newly created file, adjusting for your IP and netmask, etc.

# VIP ip configuration
DEVICE="eth0:0"
BOOTPROTO=none
IPADDR=172.17.2.110
NETMASK=255.255.255.0
USERCTL=YES
ONPARENT=NO
ONBOOT=NO
#EOF

Once saved, if the VIP is disabled on the master, execute:

# /sbin/ifup eth0:0

From another host, now try to ping smesql-vip or the IP address.

To disable the VIP, or before running ifup on the master, execute:

# /sbin/ifdown eth0:0

Configuring database replication

Using database replication allows you to quickly continue service if a database goes down.

In MySQL replication, there is one master and one or more slave servers.
This means that all application server calls to the database are directed to the master server. The master server handles all queries and DB changes.
Any changes are then replicated to the slave database server(s).

Starting replication

You will need to synchronize the replication between the master and slave servers when:

  • You start the database servers using replication for the first time.
  • You are starting the servers with replication enabled after a period of running without replication.
  • The master did not shut down cleanly (and its saved replication information may be incorrect).

Master actions

Create replication user account

A replication account must be added to the Master MySQL server to allow the slave to pull the bin logs. Once the user is created the database is locked to prevent new writes and the current database position is listed.

On the Master DB server smesql01:

Log into mysql and execute the following commands to create a user called repl_user. Replace <repl_user_passwd> with a password for your environment.

CREATE USER repl_user IDENTIFIED BY '<repl_user_passwd>';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;

NOTE: Record the <repl_user_passwd> password; it will be needed again. Do not exit this mysql session until the next section is complete or the READ LOCK will be released.

Each command should succeed without error. The last command prints a table with File and Position columns. Record both values for a future step.

Replicate DB and Start Slave

The database on the Master Server is now locked and will not accept any writes. It is now necessary to copy the database to the Slave server.

This section is executed from the smesql02 server only as the root user. From the command prompt replicate the database from smesql01 to smesql02. Substitute the ip address or FQDN of smesql01 below. The passwords and database listed are the defaults that are provided with the appliance.

# ssh smeconfiguser@<smesql01> mysqldump -u smestore -pbesp5fyx smestorage | mysql -u smestore -pbesp5fyx smestorage

The password for smeconfiguser will need to be supplied. Once the copy is complete, log into mysql by typing:

# mysql

NOTE: The section below needs modification before it can be executed in the mysql cli. It can be easier and less error prone if you copy/paste it in to a notepad application, modify the values, and then finally copy/paste it again into mysql.

RESET SLAVE;

CHANGE MASTER TO MASTER_HOST='<smesql01>',
MASTER_USER='repl_user',
MASTER_PASSWORD='<repl_user_passwd>',
MASTER_LOG_FILE='master-tx-log-bin.<File#>',
MASTER_LOG_POS=<Position>;

START SLAVE;

Note: The values in <> above are the ones you have to customize for your environment.
This command gives the slave the information it needs to start replication from the master. repl_user is the username created on the master server. <Position> and <File#> are the values from SHOW MASTER STATUS; in the previous step.

Note that the START SLAVE command is sticky: the DB server will start up in slave mode on subsequent restarts.
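
As an alternative to editing the statement in a notepad application, the values could be substituted by a small shell function that prints the finished SQL. This is a sketch only: `change_master_sql` and its argument order are hypothetical, and a password containing a single quote would need escaping by hand.

```shell
#!/bin/sh
# change_master_sql (hypothetical helper): print the replication setup
# statements with the placeholders filled in.
# $1 master host, $2 replication user, $3 replication password,
# $4 binlog File from SHOW MASTER STATUS, $5 Position from SHOW MASTER STATUS
change_master_sql() {
    cat <<EOF
RESET SLAVE;
CHANGE MASTER TO MASTER_HOST='$1',
MASTER_USER='$2',
MASTER_PASSWORD='$3',
MASTER_LOG_FILE='$4',
MASTER_LOG_POS=$5;
START SLAVE;
EOF
}

# Example (file name and position are illustrative values only):
# change_master_sql smesql01 repl_user 'secret' master-tx-log-bin.000001 245
```

Review the printed statements, then paste them into the mysql prompt on smesql02.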

Verify the setup

To verify that replication is correctly configured, run this command on the slave server.

From the MySQL cli:

SHOW SLAVE STATUS\G

In your output look for the following values: Slave_IO_State, Slave_IO_Running and Slave_SQL_Running. On a healthy slave, Slave_IO_State reads “Waiting for master to send event” and both Slave_IO_Running and Slave_SQL_Running read “Yes”.

Slave doesn't start

If there is a problem with starting the slave, the Slave_IO_State column will remain blank - you will never see it go to “Waiting…”. In this case, look for clues in the Last_IO_Error field.
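
The status check could also be automated, for example for monitoring. The sketch below reads the text produced by SHOW SLAVE STATUS\G and succeeds only when both replication threads report Yes; the function name `slave_healthy` is hypothetical.

```shell
#!/bin/sh
# slave_healthy (hypothetical helper): read "SHOW SLAVE STATUS\G" output on
# stdin; exit 0 only if Slave_IO_Running and Slave_SQL_Running are both Yes.
slave_healthy() {
    awk -F': ' '
        { gsub(/^[ \t]+/, "", $1) }            # strip the leading padding
        $1 == "Slave_IO_Running"  { io  = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example as root on smesql02:
# mysql -e 'SHOW SLAVE STATUS\G' | slave_healthy && echo "replication running"
```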

Unlock Master and create User

On the Master server the database can now be unlocked allowing writes. In addition a user must now be created for use with the webservers. The default user for the smestorage database is limited to only accessing the data from localhost.

UNLOCK TABLES;
CREATE USER 'smestoreremote'@'%' IDENTIFIED BY '<smestoreremote_password>';
GRANT ALL PRIVILEGES ON *.* TO 'smestoreremote'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;

Note: Replace the <smestoreremote_password> with a password and keep this for use in the web servers.

Verifying Replication

The user created in the previous step should now be visible in the Slave database. To verify this, run the following from the mysql interface on the Slave server.

SELECT User FROM mysql.user;
 
+----------------+
| User           |
+----------------+
| smestoreremote |
| root           |
| root           |
|                |
| root           |
| smestore       |
|                |
+----------------+
7 rows in set (0.00 sec)

Notice that the smestoreremote user has been replicated to the Slave server.

MySQL can now be exited on both servers with the command:

exit 

Part III

Configure the application servers

In this section, you point the application servers at the database server by editing the PHP configuration directly from the command line. By default, the database host is localhost. Repeat these steps on both smeweb01 and smeweb02.

Make a backup of the configuration file:

# cd /var/www/smestorage/public_html/
# cp config.inc.php /home/smeconfiguser/config.inc.php.

Open the configuration file to edit the settings.

# cd /var/www/smestorage/public_html/
# nano config.inc.php

We have to update 5 variables in the file. They are all close to the top under “DB settings”. First, change the address of the database server by changing the line

var $dbhost='localhost';

to

var $dbhost='smesql-vip';

In addition, add a new line to define a remote memcached instance:

var $memcachehost='smesql-vip:11211';

Further we need to update the $dbuser and $password to match the <smestoreremote> user and password used when setting up the database in the previous section.

Do not change $dbname.

Disable unused services

As the mariadb and memcached services are not used on the application server, you should disable them:

These steps are repeated on both smeweb01 and smeweb02.

mariadb

# systemctl disable mariadb
# systemctl stop mariadb

memcached

# systemctl disable memcached
# systemctl stop memcached

crontab

You must also disable crontab on the smeweb02 server; crontab should only run from one application server. As root on smeweb02 only:

# crontab -e -u smestorage

Place a # in front of the seven jobs listed in the crontab schedule. The procedure is the same as in the DB section above.

Finally, reboot both smeweb01 and smeweb02.

Appendix

The following section covers additional topics not necessary to establish HA on a clean pair of new servers.

Custom Branding

When setting up multiple application servers, the branding information is saved at /var/www/smestorage/public_html/templates/company

If branding is changed you should copy this directory manually to other application servers.

On smeweb01, as root:

# cd /
# tar czf branding.tgz /var/www/smestorage/public_html/templates/company/*
# scp branding.tgz smeconfiguser@smeweb02:

On smeweb02, as root:

# cd /
# mv /home/smeconfiguser/branding.tgz /
# tar xzf branding.tgz

Applying Updates

Each Application server will need to be updated separately when a new version is available.

Startup / Shutdown of HA Systems

If the environment needs to be brought down for a maintenance cycle, shut down the servers in the following order:

  1. WebServers
  2. Master MySQL
  3. Slave MySQL

For startup, reverse the order above. Give time between servers starting to ensure all services are running properly.

Database Backup

One advantage of using replication is that you can perform a backup of the database from the slave at a point in time without stopping the master - so the application is still providing service to users.

You do the backup from the slave as follows:

  • stop the slave (systemctl stop mariadb)
  • backup the database from the slave - see “Using Replication for Backups” in the MySQL documentation for procedures.
  • restart the slave (systemctl start mariadb).

Once you restart the slave, it will automatically catch up with the master database status from the point where it left off.

Failing over to the MySQL Slave

If the Master MySQL server fails or becomes corrupted, promoting the Slave to a standalone MySQL server and moving the VIP is the fastest way to restore functionality to SME.

If the old Master server is still connected to the network, it is important to disable the VIP address on it so that it doesn't conflict with the slave server.

On the Master server:

Bring down the VIP

# /sbin/ifdown eth0:0

Edit ifcfg-eth0:0 to stop the VIP from starting on boot. This ensures that the Master does not start up with the same IP as the Slave while the Slave is serving data.

# vi /etc/sysconfig/network-scripts/ifcfg-eth0:0

Switch the line ONPARENT from YES to NO

# VIP ip configuration
DEVICE="eth0:0"
BOOTPROTO=none
IPADDR=172.17.2.110
NETMASK=255.255.255.0
USERCTL=YES
ONPARENT=NO
ONBOOT=NO
#EOF

On the Slave Server:

Stop the MySQL replication

# mysql
STOP SLAVE;

exit

Bring up the VIP

# /sbin/ifup eth0:0

Edit ifcfg-eth0:0 so that the VIP starts on boot. This keeps the slave serving data across reboots.

# vi /etc/sysconfig/network-scripts/ifcfg-eth0:0

Switch the line ONPARENT from NO to YES

# VIP ip configuration
DEVICE="eth0:0"
BOOTPROTO=none
IPADDR=172.17.2.110
NETMASK=255.255.255.0
USERCTL=YES
ONPARENT=YES
ONBOOT=NO
#EOF

Verify that the IP address has moved by running the command “ip addr” on both the Master and Slave servers.
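
The check above could be scripted so that it gives a yes/no answer on each server. This is a sketch that assumes the `ip` command from iproute2 is installed; `holds_vip` is a hypothetical name and 172.17.2.110 is the example VIP used in this guide.

```shell
#!/bin/sh
# holds_vip (hypothetical helper): succeed if this host currently has the
# given IPv4 address configured on any interface.
holds_vip() {
    # -o prints one address per line, e.g. "2: eth0    inet 172.17.2.110/24 ..."
    ip -o -4 addr show | grep -qF " inet $1/"
}

# Example, run on both DB servers after the failover:
# holds_vip 172.17.2.110 && echo "VIP is active here" || echo "VIP is not here"
```

After a successful failover the address should be reported on the Slave only.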

Failback to Master

Once the Master MySQL server is repaired, the Master-Slave relationship can be re-established. This procedure can also be used if the Master server is recovered. If the previous Master server is not recoverable from backup, a new appliance can be deployed and configured as a Slave by following the initial procedure for creating a Master-Slave system.

On the WebServers:

The webservers must either be shut down during this process, or the following commands must be run on each of them.

# systemctl stop httpd
# systemctl stop crond

On the Slave Server:

Login and become root with “su -”. Disable the VIP and enter MySQL

# su -
# /sbin/ifdown eth0:0
# mysql

Lock the tables and ensure there are no other processes still committing data. If any process other than the one listed below is shown, make sure the webservers are shut down or their services stopped as described in the previous step.

FLUSH TABLES WITH READ LOCK;
SHOW FULL PROCESSLIST;
 
+-----+------+-----------+------+---------+------+-------+-----------------------+----------+
| Id  | User | Host      | db   | Command | Time | State | Info                  | Progress |
+-----+------+-----------+------+---------+------+-------+-----------------------+----------+
| 215 | root | localhost | NULL | Query   |    0 | NULL  | SHOW FULL PROCESSLIST |    0.000 |
+-----+------+-----------+------+---------+------+-------+-----------------------+----------+
1 row in set (0.00 sec)

On the Master Server:

Login to smesql01 and become root. Then copy the current database from the Slave Server while the Slave tables are locked for READ. Substitute the IP address or FQDN of <smesql02> below. The password for smeconfiguser will need to be entered. After the copy enter mysql.

# su -
# ssh smeconfiguser@<smesql02> mysqldump -u smestore -pbesp5fyx smestorage | mysql -u smestore -pbesp5fyx smestorage

# mysql

In mysql enter the following commands to reset the Master server.

RESET MASTER;
SHOW MASTER STATUS;
 
+--------------------------+----------+--------------+------------------+
| File                     | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+--------------------------+----------+--------------+------------------+
| master-tx-log-bin.000001 |   1      |              |                  |
+--------------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

On the Slave Server:

UNLOCK TABLES;
RESET SLAVE;

CHANGE MASTER TO MASTER_HOST='<smesql01>',
MASTER_USER='repl_user',
MASTER_PASSWORD='<repl_user_passwd>',
MASTER_LOG_FILE='master-tx-log-bin.<File#>',
MASTER_LOG_POS=<Position>;

START SLAVE;

Note: The values in <> above are the ones you have to set for your environment.

This command gives the slave the information it needs to restart replication from the master. repl_user is the username created on the master server. <Position> and <File#> are the values from SHOW MASTER STATUS; in the previous step.

Note that START SLAVE command is sticky: the DB server will start up in slave mode on subsequent restarts.

To verify that replication is running correctly again, run this command on the slave server:

SHOW SLAVE STATUS\G

In the output look for the following values: Slave_IO_State, Slave_IO_Running and Slave_SQL_Running. On a healthy slave, Slave_IO_State reads “Waiting for master to send event” and both Slave_IO_Running and Slave_SQL_Running read “Yes”.

Now that replication is running again, exit mysql on both servers and set the VIP to run on the Master server again.