===== Solr Replication for Highly Available EFF Content Search =====
== last updated April 10, 2022 ==
  
==== Disclaimer ====
The information in this document is provided on an as-is basis. You use it at your own risk. We accept no responsibility for errors or omissions, nor do we have any obligation to provide support for implementing or maintaining the configuration described here. Furthermore, we do not warrant that the design presented here is appropriate for your requirements.
  
SME designs, implements and supports HA File Fabric solutions for customers on a paid professional services basis. For more information please contact [[mailto:sales@storagemadeeasy.com?subject=SOLR consultancy enquiry|sales@storagemadeeasy.com]].
  
<WRAP center round important 100%>
The Enterprise File Fabric as shipped is configured for deployment on a single virtual machine. However, a common deployment scenario for production deployments is redundant web frontends in front of a Highly Available Stateful Metadata server pair.
  
This guide will step through the setup of a Leader-Follower Solr database pair, which allows for automatic failover without any loss of data. When the leader returns online, there is additional work required to migrate any new index data back to the former leader, so automatic failback is not supported.
  
==== Part 1 ====
=== Configuring the Solr Server ===
  
You must perform these steps to create a specialized Solr server from the standard SME appliance distribution.
  
=== Install Solr Replica Containers ===
The standard Solr containers deployed by default in the appliance do not support replication. Instead we will install containers designed for Leader/Follower replication.
  
<code>
yum install sme-containers-solr-replicas
</code>
  
We can then stop the existing Solr container:
  
<code>
cd /var/www/smestorage/containers/solr && docker-compose down
</code>
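
To confirm the old container is actually gone, an optional check with standard Docker commands is enough; the name filter below assumes the container name contains "solr", so adjust it if yours is named differently.
<code>
# Optional: list containers whose name contains "solr" (should return nothing once stopped)
docker ps --filter name=solr
</code>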
  
After finishing the configuration below we will start up the new replicas version.
  
=== Solr configuration for HA ===
== Solr Database Configuration ==
  
=== Configure Database Replication ===
== Update Configuration to enable ReplicationHandler ==
  
We will edit the following file in order to turn on the Replication Handler within Solr, which will handle all Solr index replication.
  
Add the following into this file: /var/solr/data/sme/conf/solrconfig.xml
  
Add this after the "<!-- Request Handlers" section like so:
       Incoming queries will be dispatched to a specific handler by name
       based on the path specified in the request.
  
       If a Request Handler is declared with startup="lazy", then it will
  
<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="leader">
        <str name="enable">${enable.leader:false}</str>
        <!--Replicate on 'startup' and 'commit'. 'optimize' is also a valid value for replicateAfter. -->
        <str name="replicateAfter">startup</str>
        <str name="replicateAfter">commit</str>
    </lst>
  
    <lst name="follower">

        <str name="enable">${enable.follower:false}</str>

        <!--fully qualified url to the leader core. It is possible to pass this as a request param for the fetchindex command-->
        <str name="leaderUrl">http://smesearch:8983/solr/sme</str>

        <!--Interval at which the follower should poll the leader. Format is HH:mm:ss. If this is absent the follower does not poll automatically,
            but a fetchindex can be triggered from the admin or the http API -->
        <str name="pollInterval">00:00:20</str>
        <!--The following values are used when the follower connects to the leader to download the index files.
            Default values implicitly set as 5000ms and 10000ms respectively. The user DOES NOT need to specify
            these unless the bandwidth is extremely low or if there is an extremely high latency-->
        <str name="httpConnTimeout">5000</str>
        <str name="httpReadTimeout">10000</str>

        <!-- If HTTP Basic authentication is enabled on the leader, then the follower can be configured with the following -->
        <str name="httpBasicAuthUser">solr</str>
        <str name="httpBasicAuthPassword">drom6etsh9Onk</str>

    </lst>
</requestHandler>
</code>
Please note the use of the smesearch DNS name for leaderUrl. If you have a different DNS name please update the above configuration accordingly.
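
The steps in this guide assume that smesearch resolves from the web frontends and both Solr hosts. If you are not managing that name in DNS, one option is a hosts-file entry pointing at the Solr keepalived VIP; the address below is only a placeholder.
<code>
# Example /etc/hosts entry - 10.0.0.50 is a placeholder, use your actual Solr VIP address
10.0.0.50   smesearch
</code>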
  
== Define Leader and Follower ==
Each Solr instance is configured to be able to act as either a leader or a follower. In order to define the state of each we will use core properties.
  
On smesql01, to make it the leader, add the following two lines at the bottom of /var/solr/data/sme/core.properties:
  
<code>
enable.leader=true
enable.follower=false
</code>
  
On smesql02, to make it a follower, add the following two lines at the bottom of /var/solr/data/sme/core.properties:
  
<code>
enable.leader=false
enable.follower=true
</code>
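
As an optional quick check, you can confirm the role flags you have just set on each host:
<code>
# Show the replication role flags in core.properties on this host
grep -E '^enable\.(leader|follower)=' /var/solr/data/sme/core.properties
</code>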
  
== Allow replication whitelist ==
Next we will configure the whitelist to allow the Solr containers to replicate.

We will edit /var/solr/data/solr.xml to add the following configuration at the bottom:
<code>
  <shardHandlerFactory name="shardHandlerFactory"
    class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:600000}</int>
    <int name="connTimeout">${connTimeout:60000}</int>
    <str name="shardsWhitelist">${solr.shardsWhitelist:smesql01:8983/solr/sme,smesql02:8983/solr/sme}</str>
  </shardHandlerFactory>
</code>
Replace smesql01/02 with their respective IP addresses.
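
For example, if smesql01 and smesql02 had the addresses 10.0.0.11 and 10.0.0.12 (placeholder addresses only), the whitelist line would become:
<code>
<!-- 10.0.0.11 / 10.0.0.12 are placeholders - substitute your Solr hosts' real addresses -->
<str name="shardsWhitelist">${solr.shardsWhitelist:10.0.0.11:8983/solr/sme,10.0.0.12:8983/solr/sme}</str>
</code>
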
== Start Solr containers ==
Finally, we will start the Solr replica containers on both hosts in order to have those changes take effect:

<code>
cd /var/www/smestorage/containers/solr-replicas/ && docker-compose up -d
</code>
  
== Validate replication is running ==

You can run the following command to view the replication status of the host:

<code>
curl -u solr:drom6etsh9Onk "http://127.0.0.1:8983/solr/sme/replication?command=details&wt=json" | jq
</code>
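
To compare the two hosts side by side you can run the same query against each of them. The hostnames and password below are the examples used in this guide, and the exact JSON layout can differ slightly between Solr versions, but recent releases report indexVersion and generation under details.
<code>
# Compare index version/generation on both Solr hosts (substitute your own hostnames/password)
for host in smesql01 smesql02; do
  echo "== $host =="
  curl -s -u solr:drom6etsh9Onk \
    "http://$host:8983/solr/sme/replication?command=details&wt=json" \
    | jq '.details | {indexVersion, generation}'
done
</code>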
  
==== Part III ====
  
vrrp_instance DB {
  state BACKUP
  interface eth0
  virtual_router_id 51
  
vrrp_instance MEMCACHE {
  state BACKUP
  interface eth0
  virtual_router_id 61
#### update to add solr vip configuration here ####
vrrp_instance SOLR {
  state BACKUP
  interface eth0
  virtual_router_id 71
== Restart Keepalived ==
We will now restart keepalived to apply the new configuration.
If this is a running production environment, please take care to shut down keepalived on the follower, restart the leader, and then start the follower; otherwise there will be a re-election and failover of mysql and memcache during this restart.
  
<code>
systemctl restart keepalived
</code>
Settings > Search Integrations
  
Replace the Solr URI as follows:
<code>
http://smesearch:8983/solr/sme
</code>
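
Before relying on the new setting, it is worth confirming that the smesearch name answers from a web frontend. One optional way to do that is a ping request against the core; the ping handler is available in default Solr configsets, but treat this purely as a sanity check.
<code>
# Run from a web frontend; a healthy core returns an "OK" status in the JSON response
curl -s -u solr:drom6etsh9Onk "http://smesearch:8983/solr/sme/admin/ping?wt=json"
</code>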
  
=== Failover and Recovery ===
  
In the case of an outage of the Solr service or of the smesql01 server, keepalived will fail traffic over to the Solr instance running on smesql02.
  
This process is all automatic. All new Solr reads and writes will now occur on the smesql02 server without any intervention.
  
However, when the smesql01 server/Solr service is available again, we will not fail traffic back over in the other direction, as the smesql01 database will NOT contain any of the new indexes created during the outage. Solr replication is set up to run in only one direction at a time, so unlike the mysql setup, the smesql01 Solr service will not automatically copy back any changes.
  
In order to get the leader up to date you will need to change the leader/follower status of each host.
  
On smesql01 we will update the /var/solr/data/sme/core.properties file and change the leader/follower status:
  
<code>
enable.leader=false
enable.follower=true
</code>
  
On smesql02 we will do the opposite in /var/solr/data/sme/core.properties:
  
<code>
enable.leader=true
enable.follower=false
</code>
  
Finally, we will restart Solr on both hosts in order to have those changes take effect:
  
<code>
cd /var/www/smestorage/containers/solr-replicas/ && docker-compose down && docker-compose up -d
</code>
  
This will switch the status and start replicating data from smesql02 (the new leader) over to smesql01 (the new follower).

We can check replication status on the hosts via this webpage:

<code>
http://<smesql01_or_smesql02>:8983/solr/#/sme/replication
</code>
  
Once both have the same Version/Gen number they are in sync.
  
From there you can then leave the hosts in their current leader/follower state, or revert back by adjusting /var/solr/data/sme/core.properties and restarting.
  
Do not fail the keepalived VIP back over until replication is back in sync and you have made this update to make smesql01 the leader again.