Enabling Deep Content Search and PDF Burn Service
last updated on April 18, 2023
Introduction
Nasuni Access Anywhere can index the content of federated storage endpoints so that their contents can be searched. Apache Solr is integrated into the Access Anywhere stack and used to index the content from the various on-cloud and on-premises storage solutions.
The Solr (and underlying Lucene) index is a specially designed data structure, stored on the file system as a set of index files. It is built on efficient data structures to maximize performance and minimize resource usage.
Contents of the following file types can be indexed for searching:
7z | docx | jar | odt | pub | vsd |
afm | dotm | jpg | oga | qpw | war |
aif | dwg | js | ogg | rdf | wav |
apk | ear | key | opus | rss | wb3 |
ar | emf | kml | p7s | rtf | webarchive |
asf | eml | kmz | pages | sda | wma |
au | emlx | m4a | pbm | sdc | wmf |
bmp | epub | mbox | pct | sdd | wmv |
box | epub | mdb | | sdw | wps |
c | exe | mhtml | pgm | shw | xhtml |
chm | fb2 | mid | png | svg | xlr |
class | fits | mp3 | potm | svgz | xls |
cpio | flac | mp4 | ppm | tar | xlsb |
css | flv | mpp | ppsm | tbz2 | xlsm |
csv | gif | msg | ppsx | tgz | xlsx |
dat | hdf | nc | ppt | thmx | xml |
dita | he5 | numbers | pptm | tif | xmp |
ditamap | htm | odf | pptx | ttf | xps |
doc | html | odp | prt | txt | zip |
docm | ibooks | ods | psd | vor |
*Note that Solr can occasionally throw indexing errors when indexing content it supports, for example because of incompatible versions or corrupted files. When this occurs, the Access Anywhere Server still indexes the base metadata of the file or object (filename, type, etc.), so the item can still be found when the search terms match that metadata.
For evaluation, the standard appliance is configured for deep content search, but the service is disabled out of the box. This guide walks you through the steps to enable deep content search.
A dedicated Access Anywhere appliance should be used for Solr in production.
The Access Anywhere Server also provides a PDF Annotation feature that allows PDFs to be annotated and burnt. The burn service must be enabled for this feature to work, and this guide also steps through how to enable it.
Enabling the Search Service and/or the Burn Service in v2006 and Above
Content indexing and searching is delivered for recent Access Anywhere versions as a Docker Compose service, “solr”. If high availability is required then a second service, “solr-replicas”, must also be used.
PDF burning is delivered for recent Access Anywhere versions as a Docker Compose service, “pdfburner”.
Information on starting Access Anywhere's Docker Compose services can be found here. Information on high availability for content indexing and searching can be found here.
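As a rough illustration of how the three service names above fit together, the sketch below builds a Docker Compose command line for a chosen combination of features. It only prints the command and does not run it. The function name compose_command is hypothetical, and the sketch assumes Docker Compose v2 syntax ("docker compose"; older installations use "docker-compose") run from the directory that holds the appliance's compose file. The linked documentation remains the authority on how the services are actually started.

```shell
#!/bin/sh
# Hypothetical helper (not part of Access Anywhere): prints a Docker Compose
# command for the services named in this guide. It never executes anything.
# Assumption: Docker Compose v2 syntax and the appliance's compose file in
# the current directory.
compose_command() {
    want_search=$1   # "yes" to include content search (service "solr")
    want_ha=$2       # "yes" to add "solr-replicas" for high availability
    want_burn=$3     # "yes" to include PDF burning (service "pdfburner")
    services=""
    if [ "$want_search" = "yes" ]; then
        services="solr"
        if [ "$want_ha" = "yes" ]; then
            services="$services solr-replicas"
        fi
    fi
    if [ "$want_burn" = "yes" ]; then
        # Append with a separating space only if something is already listed.
        services="${services:+$services }pdfburner"
    fi
    if [ -z "$services" ]; then
        echo "no services selected" >&2
        return 1
    fi
    echo "docker compose up -d $services"
}
```

For example, compose_command yes no yes prints a command that starts “solr” and “pdfburner” without the replica service.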
Enabling Content Search and/or PDF Burn in Older Versions
Connect over ssh and switch to root
These commands must be run as root. First, open an ssh session as smeconfiguser:
$ ssh smeconfiguser@<appliance IP address>
After establishing the ssh session, su to root:
-bash-3.2$ su - root
Password:
Start the Search and PDF Burn Service
Execute the following 2 commands:
service jetty start
chkconfig jetty on
The first command starts the service; the second configures it to start automatically after a reboot.
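To confirm that both commands took effect, a check along the following lines may help. This is a minimal sketch: the helper name on_at_runlevel3 is hypothetical, and it assumes the appliance boots into runlevel 3 and that the jetty init script implements the status action.

```shell
#!/bin/sh
# Reads one line of `chkconfig --list <service>` output on stdin and
# succeeds if the service is switched on for runlevel 3.
# Assumption: the appliance boots into runlevel 3 (multi-user, no GUI).
on_at_runlevel3() {
    grep -q '3:on'
}

# Usage on the appliance, as root (shown as comments, not executed here):
#   service jetty status
#   chkconfig --list jetty | on_at_runlevel3 && echo "jetty starts at boot"
```

If the check fails, re-running chkconfig jetty on as root is the first thing to try.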
Enabling PDF Annotations for a user package
Log in as appladmin, enable the PDF Annotations tool in the extra options section for the package, and press save.
Enabling Search for a user package
Configure Search Values
Log in as appladmin and select search integration from the right-hand menu. For the internal service, you can use the following default values:
Solr URI http://127.0.0.1:7070/sme/
Solr login solr
Solr password drom6etsh9Onk
Max file size to index 10485760
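The default maximum of 10485760 equals 10 × 1024 × 1024. Assuming the value is expressed in bytes (the settings page does not state the unit here), that is 10 MiB. The sketch below shows one way to check whether a given file falls within that limit; the variable and function names are hypothetical and are not part of Access Anywhere.

```shell
#!/bin/sh
# Assumption: "Max file size to index" is expressed in bytes.
MAX_INDEX_BYTES=10485760   # equals 10 * 1024 * 1024

# Succeeds if the given file is no larger than the limit.
# Returns 2 if the file cannot be read.
within_index_limit() {
    size=$(wc -c < "$1") || return 2
    # $((size)) normalizes any padding some wc implementations emit.
    [ "$((size))" -le "$MAX_INDEX_BYTES" ]
}
```

For example, within_index_limit report.docx && echo "within the limit" prints the message only for files of 10485760 bytes or fewer.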
Assign Search to User Package
Log in as appladmin, enable Content Search Enabled in the extra options section for the package, and press save.
Activating Content Search for Providers
After Content Search has been enabled in the package, each Storage Provider Settings page will present the option to enable content search for the provider.
Use this option to control whether Content Search will be available for each provider.
Enabling Content Search After You Have Started Using Access Anywhere
If you wish to enable Content Search after you have already started using one or more storage providers with your Access Anywhere server, how you should proceed depends on whether, for the entire time that Access Anywhere has been in use, Content Search has been integrated with your appliance and enabled in your organization's package:
- If Content Search was both integrated and enabled then you need only tick the “Index content for search” box on the Provider Settings page for each provider for which you want Content Search enabled. Don't forget to save the change to the settings.
- If, however, Content Search was not integrated or was not enabled then please contact SME Support.