Appliance Troubleshooting

Last updated on April 17, 2019.

This document covers troubleshooting of the Enterprise File Fabric Server. The Server runs as a virtual machine, an instance of the Enterprise File Fabric virtual appliance, and may be scaled out horizontally (run as multiple instances).

The virtual appliance is a hardened CentOS image with:

  • Enterprise File Fabric Engine components
  • MariaDB database server
  • Apache Solr search engine

It is most commonly used in these deployment scenarios:

  • Single server with database - The File Fabric Engine and database server run on a single instance.
  • Single server separate database - The File Fabric Engine runs on one instance. The database runs separately, either as an additional instance or through a database-as-a-service.
  • High availability - The File Fabric Engine runs on multiple instances behind a load balancer. A high availability database cluster or database-as-a-service is used.
  • The Apache Solr application is required when content search is required. It should run on a separate machine instance.

Before you begin troubleshooting and checking the Enterprise File Fabric Server you should have the following information:

  • Server domain name
  • Are multiple servers being used?
    • Load balancer domain name
    • All server instance hostnames or IP addresses
  • Is Cloud FTP being used?
  • Is ClamAV being used?
  • Passwords:
    • smeconfigure password(s)
    • root password(s)
    • database credentials (optional)

The web server should start up automatically and be accessible when the appliance server starts. From a remote machine use a browser or the following command to test connectivity to the web server. Curl is available on Linux and Mac.

curl -k https://hostname | head -n 20

If port 443 is not responding, and to validate network performance, run ping from a remote machine. Ping is available on Linux, Windows and Mac.

ping hostname
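The curl check above can be scripted so that the result points at the next thing to look at. This is only a minimal sketch and not part of the appliance: the function names probe_https and explain_http_code are made up for illustration, and the 10 second timeout is an arbitrary choice. It relies on curl printing 000 as the HTTP code when no connection could be made.

```shell
#!/bin/sh
# Hypothetical helpers; names and timeout are illustrative only.

# Print the HTTP status code returned by https://<host>/ ("000" if no connection).
probe_https() {
    curl -k -s -o /dev/null --max-time 10 -w '%{http_code}' "https://$1/"
}

# Turn a status code into a hint about where to look next.
explain_http_code() {
    case "$1" in
        000) echo "no response on port 443: try ping, then check httpd" ;;
        503) echo "web server is up but the backend is unavailable: check php-fpm" ;;
        5??) echo "web server is up but returned a server error: check the logs" ;;
        *)   echo "web server is responding (HTTP $1)" ;;
    esac
}
```

For example: explain_http_code "$(probe_https hostname)"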

If the server is reachable, open a remote shell on the system as smeconfiguser.

Run this command from the command line of a machine that has the ssh utility installed (Linux or Mac), or run the equivalent using a Windows tool like putty. You may be prompted about the authenticity of the host. If so, answer 'yes'. You will be prompted for the password.

ssh smeconfiguser@hostname

On success you will see a Linux command prompt. Unless otherwise noted commands in this document can be run as smeconfiguser.

To open a shell as root or change to the user smestorage use the command su. You cannot log into the machine remotely directly as root:

su root
su smestorage

Connection Refused

If the password is entered incorrectly several times, your IP address will be locked out and further connection attempts will be refused:

ssh: connect to host 10.0.10.194 port 22: Connection refused

To verify, log in via the console as root and execute:

iptables -L f2b-SSH -n

If your IP address is locked you can unlock via fail2ban (as root):

fail2ban-client set ssh-iptables unbanip 192.168.1.1

Log

This is a general log file for the appliance.

tail -f /var/www/smestorage/sitelogs/logits.txt

You should see the last few lines of the log file, and new lines should appear from time to time as the appliance is used. Lines containing the word “Error” indicate a possible problem with the way the appliance has been set up or is being used.

The tail -f command will run until you terminate it (Ctrl-c).
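When the log is long, it can be quicker to count or list the lines containing “Error” than to watch tail. The following is a rough sketch; the function names count_error_lines and recent_error_lines are hypothetical and not shipped with the appliance.

```shell
#!/bin/sh
# Hypothetical helpers for scanning a log file for the word "Error".

# Print the number of lines that contain "Error".
count_error_lines() {
    # grep -c prints the count but exits non-zero when it is 0, hence "|| true".
    grep -c 'Error' "$1" || true
}

# Print the last N lines that contain "Error" (N defaults to 10).
recent_error_lines() {
    grep 'Error' "$1" | tail -n "${2:-10}"
}
```

For example: count_error_lines /var/www/smestorage/sitelogs/logits.txt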

Error Log

These files are created the first time an error is received:

tail -f /var/www/smestorage/sitelogs/errorlogs.txt
tail -f /var/www/smestorage/sitelogs/errorlogs_trace.txt

You may see the last few lines of the log file, and new lines may appear from time to time as the appliance is used. Lines containing the word “Error” indicate a possible problem with the appliance. The file errorlogs_trace.txt contains a full trace of the errors in errorlogs.txt.

Email Log

tail -f /var/www/smestorage/sitelogs/allemails.txt

The “To” address and “Subject” of each sent email are logged here.

Log Archive

An archive of logs can be found at:

/var/www/smestorage/tmp/logsarchive

To resolve a port conflict or to determine what ports are in use by what service use:

netstat -plnt
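To answer "what is listening on port N" directly, the netstat output can be filtered. This is a sketch under the assumption that the output has the usual net-tools column layout (local address in column 4, state in column 6, PID/Program name in column 7); the function name listeners_on_port is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -plnt` output on stdin and print the
# "PID/Program name" column for every listener on the given port.
listeners_on_port() {
    awk -v suffix=":$1" '
        $6 == "LISTEN" {
            addr = $4
            # Compare the end of the local address, colon included,
            # so that port 80 does not match 8080 or 180.
            if (substr(addr, length(addr) - length(suffix) + 1) == suffix)
                print $7
        }'
}
```

For example, run as root so that process names are shown for all sockets: netstat -plnt | listeners_on_port 443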

Check CPU usage or check for a runaway process using top. Investigate processes maxing out CPU over three refreshes.

top

In the third line, which is labelled “%Cpu(s):”, the fourth number (labelled “id”) shows the percentage of CPU cycles that are idle. If this number is less than 10% then your CPU is very busy. If it remains at less than 10% for more than a few seconds then your CPU may be overloaded. Sometimes this indicates that a program is in an error state.
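The idle figure can also be read non-interactively from top in batch mode (top -b -n 1). The sketch below assumes the summary line uses a period as the decimal separator and commas between fields; the function names cpu_idle_pct and cpu_is_busy are hypothetical. A single sample can be misleading, so repeat the check a few times before drawing conclusions.

```shell
#!/bin/sh
# Hypothetical helper: read top batch output on stdin and print the idle
# CPU percentage taken from the "Cpu(s)" summary line.
cpu_idle_pct() {
    awk -F',' '/Cpu\(s\)/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /id/) { gsub(/[^0-9.]/, "", $i); print $i; exit }
    }'
}

# Succeeds (exit status 0) when the idle figure is below 10%.
cpu_is_busy() {
    awk -v idle="$1" 'BEGIN { exit !(idle + 0 < 10) }'
}
```

For example: cpu_is_busy "$(top -b -n 1 | cpu_idle_pct)" && echo "CPU is very busy"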

Look for memory issues with top.

top

In the fourth line, which is labelled “KiB Mem :” or “Mem:” depending on the version of top, the fourth number (labelled “free”) shows the amount of free memory in kilobytes. If this number is less than 150,000 then your server is probably low on memory.
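The same threshold can be checked from a script. This sketch reads /proc/meminfo instead of parsing top, on the assumption that its MemFree value is the figure top reports as “free”; the function names mem_free_kib and mem_is_low are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/meminfo-style text on stdin and print
# the MemFree value in KiB.
mem_free_kib() {
    awk '$1 == "MemFree:" { print $2; exit }'
}

# Succeeds (exit status 0) when the given KiB value is below 150,000.
mem_is_low() {
    [ "$1" -lt 150000 ]
}
```

For example: mem_is_low "$(mem_free_kib < /proc/meminfo)" && echo "server is probably low on memory"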

Check for disk space issues with df.

df -h

In the “Use%” column, a value close to or at 100% indicates a severe lack of space. Generally a value above 89% indicates that space should be cleared.
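To list only the filesystems that are over the threshold, the df output can be filtered. This sketch uses df -P (the POSIX output format, one line per filesystem) rather than df -h, and assumes mount points contain no spaces; the function name full_filesystems is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and print
# "<mount point> <use%>" for every filesystem above the given
# threshold (default 89).
full_filesystems() {
    awk -v limit="${1:-89}" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)
        if (use + 0 > limit + 0) print $6, $5
    }'
}
```

For example: df -P | full_filesystems 89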

You can check the disk space by running the command

df -kh

If the appliance has run out of disk space, you will see the following error for any mysql table in the SME error log /var/www/smestorage/sitelogs/errorlogs.txt. (If you have configured a notification email address, you will also receive a notification email with the errors.)

DB! Table './smestorage/TABLE' is marked as crashed and should be repaired

Symptom: You open the configured SME appliance URL in a browser and see an empty page.

If you ran out of disk space, follow the instructions below.

Increase Disk Size

To increase the disk space on the SME appliance, see the recipe at: https://storagemadeeasy.com/wiki/cloudappliance/appladmin

Repair the Database

ssh into the appliance as smeconfiguser

Back up the database. This is the easiest way to find the crashed tables.

mysqldump -u smestore -p --opt smestorage > smestorage.sql

IF YOU GET AN ERROR MESSAGE INDICATING A CRASHED TABLE:

ssh to appliance and run the following command

mysql -u smestore -p smestorage

Enter the password

Make sure the database is smestorage

use smestorage

And then repair the table that has crashed

repair table <TABLE_NAME>;

Go back to the database backup step until the backup completes without errors.
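If several tables have crashed, their names can be pulled out of the error text instead of being read off one at a time. This is a sketch only: it assumes the messages have the form shown earlier (Table './smestorage/TABLE' is marked as crashed), and the function names crashed_tables and repair_statements are hypothetical. It only prints SQL for review and does not run anything against the database.

```shell
#!/bin/sh
# Hypothetical helper: read error text on stdin and print the name of each
# table reported as crashed, one per line, without duplicates.
crashed_tables() {
    sed -n "s|.*Table '\./[^/]*/\([^']*\)' is marked as crashed.*|\1|p" | sort -u
}

# Print one REPAIR TABLE statement per crashed table found on stdin.
repair_statements() {
    crashed_tables | while IFS= read -r table; do
        printf 'REPAIR TABLE `%s`;\n' "$table"
    done
}
```

For example: repair_statements < /var/www/smestorage/sitelogs/errorlogs.txt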

Delete compiled templates

SME uses compiled templates. If disk space is low, the compiled templates can be deleted or corrupted. To fix this:

ssh in to appliance as smeconfiguser

Then switch to root and then to the smestorage Linux user.

su - root
su - smestorage

Go to the templates directory

cd /var/www/smestorage/public_html/smarty/site/templates_c

Delete all the compiled templates by executing the following command

rm *.tpl.php

This should help you get your Appliance back online.

This command finds the top 50 files above 10M:

find / -xdev -type f -size +10M -exec du -sh {} ';' | sort -rh | head -n50

Check that the following services are healthy (more detail in sections below)

systemctl status php-fpm
systemctl status httpd
systemctl status jetty
systemctl status crond
systemctl status mariadb   # if running locally
systemctl status memcached
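The same list can be checked in one pass with systemctl is-active, which exits with status 0 when a unit is active. The following is a minimal sketch; the function name report_services and the STATUS_CMD override (useful for a dry run) are hypothetical and not part of the appliance.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named service is active.
# STATUS_CMD may be overridden, e.g. for a dry run on another machine.
: "${STATUS_CMD:=systemctl is-active --quiet}"

report_services() {
    failed=0
    for svc in "$@"; do
        if $STATUS_CMD "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            failed=1
        fi
    done
    # Exit status is 1 if any service was not active.
    return "$failed"
}
```

For example: report_services php-fpm httpd jetty crond mariadb memcached (leave out mariadb if the database is not running locally).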

If you see this error, check the status of PHP-FPM:

The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later.

You can do this by issuing this command:

systemctl status php-fpm

This shows the real status of the service. It may have hung even though the status is shown as active; either way it's a good idea to restart the PHP-FPM service. You will need root access (or use sudo).

sudo systemctl restart php-fpm

Logs

/var/log/php-fpm/error.log

Check that the Apache HTTP Server is running. su to root.

systemctl status httpd

To start the HTTP Server

systemctl start httpd

To stop the HTTP Server

systemctl stop httpd

Configuration

HTTPD server configuration files are located in the following two directories:

/etc/httpd/conf

/etc/httpd/conf.d

Logs

Apache Httpd server logs are located at:

/etc/httpd/logs

tail /etc/httpd/logs/access_log
tail /etc/httpd/logs/filefabric-error_log
tail /etc/httpd/logs/filefabric-access_log
tail /etc/httpd/logs/ssl_access_log
tail /etc/httpd/logs/ssl_error_log
tail /etc/httpd/logs/webdav.filefabric-error_log
tail /etc/httpd/logs/webdav.filefabric-access_log
tail /etc/httpd/logs/ssl_webdavfilefabric_log

If memcached stops working or hangs, users will be unable to upload files.

When you try to upload something you will get the message:

Can not find uploading process meta data.

This means a record could not be added to memcached and because of that the upload failed.

To solve this, issue the following command as root:

systemctl restart memcached

After that you can also check the service status:

systemctl status memcached

You should see something similar to the below:

Active: active (running) since Thu 2016-08-25 13:30:00 BST; 1s ago

The Jetty service is used for Apache Solr and PDF Annotation. It runs as a Java process, by default listening on localhost port 7070.

Check the health of jetty using the command:

systemctl status jetty

You should see a few lines of output including one that says, “Active: active (exited) ”. If you do not then the service is not running. This will prevent content search from working.

To check that Apache Solr is running and responsive on the appliance run:

curl -u solr:drom6etsh9Onk "http://127.0.0.1:7070/sme/select?q=the&start=0&rows=100&wt=json&indent=true"

Configuration Files

/home/sme/sme_jetty/start.ini

/smedata/sme_solr/solr.xml

See https://docs.storagemadeeasy.com/cloudappliance/solr for more information.

Logs

/home/sme/sme_jetty/logs/solr.log

Production

Note: For production Apache Solr should be running on a separate instance to the Enterprise File Fabric Server (Web Tier).

Access Solr Admin GUI Remotely

To access the Solr admin from another machine:

  1. Add this line to /etc/sysconfig/iptables:

    -A RH-Firewall-1-INPUT -p tcp -m state --state NEW -m tcp --dport 7070 -j ACCEPT

  2. Restart iptables

    systemctl reload iptables

  3. Comment out the line in /home/sme/sme_jetty/start.ini:

    #jetty.host=127.0.0.1

  4. From a browser:

    http://hostname:7070/ with user solr password drom6etsh9Onk

By default the File Fabric Cloud FTP service is configured to run an FTP service on port 21 and FTPS (FTP over SSL) service on port 990.

Check the health of the Cloud FTP service using the service command:

systemctl status cloudftp

You should see a few lines of output including one that says, “Active: active (exited)”.

To start the service:

systemctl start cloudftp

To stop the service:

systemctl stop cloudftp

Client Testing

sftp -v  user@hostname

Configuration

/var/www/smestorage/ftpserver/ftpserver.conf
/var/www/smestorage/ftpserver/sftpserver/log.txt

Log

/var/www/smestorage/ftpserver/sftpserver/log.txt

For production use this service should be removed along with the demo accounts.

An appliance FTP server listens by default on IP Address 127.0.0.1 and port 2001. It is used for default storage for the clouduser.

Status of local ftp service

systemctl status vsftpd

Start FTP Server

systemctl start vsftpd

To stop

systemctl stop vsftpd

FTP Server Configuration

/etc/vsftpd/vsftpd.conf

The cron service executes cron jobs that roll logs and kick off periodic tasks for the system such as daily maintenance tasks. These scripts should be run only once in a multi-server environment.

For version 1705.00 and above cron runs on all instances using cronmutex.php to make sure only one is executed:

php /usr/bin/cronmutex.php default 716f3 900 && /var/www/smestorage/cron/scheduler_daily.pl

Check the health of cron using the service command:

systemctl status crond

You should see a few lines of output including one that says, “Active: active (running) ”. If you do not then the service is not running. This will prevent some functions from working.

If it is not running, start it:

systemctl start crond

Logs

/var/log/cron
/var/www/smestorage/cron/log.txt

Configuration

To see cron jobs for a server:

crontab -u smestorage -l      

Note

These cron jobs use scripts in:

/var/www/smestorage/cron
/var/www/smestorage/config/cron/config.conf

To see crontab jobs run as root (currently only freshclam):

cat /etc/crontab            

CloudDAV is our implementation of WebDAV on top of the File Fabric. It runs as a CGI script from /var/www/smestorage/webdav_html/cgi-bin.

Log

/var/www/smestorage/webdav_html/cgi-bin/log.txt

Configuration

/var/www/smestorage/config/webdav_html/configuration

Cloud S3 is our implementation of an Amazon S3 compatible API on top of the File Fabric.

Log

/var/www/smestorage/ftpserver/sftpserver/log.txt

Configuration

# See <VirtualHost *:80>
#  DocumentRoot /var/www/smestorage/s3_html
/etc/httpd/conf/httpd.conf 

The Antivirus scanner ClamAV is included with the appliance and can be used to check all uploaded files.

In order to be used virus scanning must be enabled on a per-organization (tenant) basis through Organization Policies. Scanning of individual uploaded files can be verified through the audit log if logging of File add/updates is turned on.

The ClamAV process is called

/usr/sbin/clamd

and runs as a system daemon. When the daemon is running it creates a filesystem socket that is used to communicate to and from the file upload process.

Check the health of the ClamAV scanner using the service command:

systemctl status clamd@scan

In High Availability configurations each appliance leverages its local copy of ClamAV.

Error Messages

If a file is uploaded, antivirus scanning is enabled, and the daemon is not running, the user will see the following message:

Uploading of 1 files failed
[Restart] [Cancel]
Seems file is not uploaded. Uploading in progress?
[Close]

Verify that scans are succeeding through the audit trail. The policy “Audit File add/update” must be enabled:

File sme-solution-brief.pdf uploaded to My Cloud files/mybucket. Scanned with antivirus ClamAV 0.99.2/24143/

Log

/var/log/sme-clamd.log

Configuration

/etc/clamd.d/scan.conf

https://linux.die.net/man/5/clamd.conf 

ClamAV Antivirus Database Updater

Virus definitions are updated once an hour (see /etc/crontab) with freshclam. To check connection to the online database (and update definitions) run as root:

freshclam

Log

/var/log/freshclam.log

Configuration

/var/www/smestorage/config/clamd/freshclam.conf

The appliance includes an email server. Its log is at:

/var/log/maillog

This error on attempted login indicates a problem with the license key, such as a missing or expired key.

Sorry, Organization accounts are not supported. No valid key. Contact with your administrator.

The license is configured for each appliance through the Appliance Administration interface under Settings > License Key. This is reached by logging in as the appladmin user at https://hostname which you can do without a valid license.

The appliance license can also be viewed and changed from within the appliance at:

/var/www/smestorage/config/public_html/license.txt

High Availability: The license must be configured on every instance.

There are several ways to check the version of the appliance.

Appliance Administration

Log in as the appliance admin. From the hamburger menu under the menu “Admin” see the appliance version and build number.

System version: 1803.02

Version build: 2018022700008

(The hotfix number (.xx) is only shown for versions 1803.00 and above)

Command line

1) From the shell use the System Package Manager (For versions 1705.00 and above)

 yum info sme-release

2) From the database

 mysql> SELECT * FROM smestorage.se_version;

3) From the shell as smeconfiguser run the alias:

 smeversion

/var/www/smestorage/patches

Keeps a copy of public_html after upgrades.

If the database is running locally check the service is running:

systemctl status mariadb

You should see a few lines of output including one that says, “Active: active (running) ”. If you do not then the service is not running. This will prevent the Cloud File Manager from working.

You can log into the local database through:

mysql

You should be successful or see an error message like “ERROR 1045 (28000): Access denied for user 'smeconfiguser'@'localhost' (using password: NO)”.

If you do not then the database is not accessible from this machine. For remote databases this may be a network issue.

Local Database Service

If there is something wrong with the MariaDB Server you will most likely see this page when attempting to access the cloud file manager:

It seems we encountered a problem. Please contact support and provide as much details as possible as to how this occurred.
Thanks, and apologies for any inconvenience.

Please first check the mysql status:

systemctl status mariadb

If the service is up and running, it is likely that some tables were corrupted, for example by a power outage. See Repair the Database for how to fix them.

Or you can try restarting the service. After a restart, mysql checks the state of the tables and tries to repair them.

systemctl restart mariadb

Backup

You can back up the database using the following command:

mysqldump smestorage >smestorage.sql

Configuration

/etc/my.cnf

The SME Appliance ships with a customized version of Fail2Ban (http://www.fail2ban.org/).

Fail2Ban scans log files for malicious patterns, e.g. DoS attacks, too many password failures, SSH logins, seeking exploits, trying to scan for download links, etc.

If a malicious pattern is detected it automatically updates the firewall rules to reject IP addresses for a specified amount of time (10 minutes). Fail2Ban is constantly working and scanning providing extra protection for the appliance.

Log

/var/log/fail2ban.log

Unbanning an IP Address

Once you are the root user, find the IP address that was banned and then unban it. To do this run:

iptables -L f2b-SSH -n

In that list you may see your IP address. With that IP address, run the following (replace the IP address with your own):

fail2ban-client set ssh-iptables unbanip 192.168.1.1

Your IP address should now be unbanned.
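The lookup can be scripted by filtering the iptables listing. This is a sketch that assumes the standard iptables -L -n column layout (target, prot, opt, source, destination) and that bans appear as REJECT or DROP rules; the function names banned_ips and is_banned are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `iptables -L f2b-SSH -n` output on stdin and
# print the source address of every REJECT or DROP rule.
banned_ips() {
    awk '$1 == "REJECT" || $1 == "DROP" { print $4 }'
}

# Succeeds (exit status 0) when the given address is in the ban list on stdin.
is_banned() {
    banned_ips | grep -qxF "$1"
}
```

For example, as root: iptables -L f2b-SSH -n | is_banned 192.168.1.1 && fail2ban-client set ssh-iptables unbanip 192.168.1.1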

https://www.storagemadeeasy.com/wiki/cloudappliance/bestpractices/

We recommend customers use tools from the hypervisor vendor or third parties to back up the appliance and database.