Introduction to the Cloud Drive Concept
Most users are familiar with cloud file synchronization. It satisfies some data replication and access use cases, but it has the disadvantage of requiring all of the synchronized data to be stored on the user’s workstation or laptop. The user is therefore limited to the storage capacity of the physical machine. Synchronization is a valid strategy for offline access, but it is not suited to browsing data repositories on demand.
An alternative way to access remote data is to let the user browse a listing of the available data and download a file only when it is actually accessed. This is how network file systems work (via CIFS or NFS): users see all of the available data through a mount or mapped drive letter, and only pull the data on access. However, network-attached storage was not developed with the Internet or the Cloud in mind. As such, these shares are not secure over the Internet without complicated VPNs, and they tend to perform poorly as latency increases.
Enter “FUSE”, or Filesystem in Userspace. FUSE enables the creation of virtual file systems by having a kernel module forward file system requests to an ordinary user-space program, so the file system code itself does not need privileged access. This makes FUSE a good choice for creating virtual file systems that, for example, do not store data themselves. A FUSE file system can be backed by modern, secure RESTful APIs, which were built with the latency of the Internet in mind.
FUSE-based file systems appear as a regular mount; data is accessed through a local loopback adaptor. The remote file hierarchy is stored locally, enabling quick browsing and file listings, just like a local file system. The files shown are actually stubs; when one is accessed, the loopback adaptor downloads, uploads or modifies the underlying file.
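The stub behaviour described above can be sketched in a few lines of Python. This is a toy model only, not the File Fabric implementation and not a real FUSE binding: the `StubDrive` class, its `fetch` callback and the `downloads` counter are hypothetical names introduced for illustration. It shows listings being answered from locally held metadata, while content is downloaded only on first access.

```python
from typing import Callable, Dict, List


class StubDrive:
    """Toy model of a stub-based drive (hypothetical, for illustration only)."""

    def __init__(self, listing: Dict[str, int], fetch: Callable[[str], bytes]) -> None:
        self._listing = dict(listing)          # path -> size, held locally
        self._fetch = fetch                    # assumed remote download call
        self._content: Dict[str, bytes] = {}   # content of files accessed so far
        self.downloads = 0                     # number of remote downloads made

    def listdir(self) -> List[str]:
        # Answered entirely from local metadata: no network round trip.
        return sorted(self._listing)

    def read(self, path: str) -> bytes:
        if path not in self._listing:
            raise FileNotFoundError(path)
        if path not in self._content:
            # First access to a stub triggers the actual download.
            self._content[path] = self._fetch(path)
            self.downloads += 1
        return self._content[path]


if __name__ == "__main__":
    # A dictionary stands in for the remote storage in this sketch.
    remote = {"/docs/a.txt": b"alpha", "/docs/b.txt": b"bravo"}
    drive = StubDrive({p: len(d) for p, d in remote.items()}, remote.__getitem__)
    print(drive.listdir())            # browsing does not download anything
    print(drive.read("/docs/a.txt"))  # first read downloads the file
    print(drive.downloads)
```

In a real drive the same decision would be made inside the file system's open and read handlers; the sketch only captures the idea that browsing is cheap and access is what costs a transfer.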
A FUSE-based drive mount typically accommodates changes to the network connection (disconnects, roaming, switching networks) by serving data from a local cache for a specified period of time.
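A minimal sketch of such a time-limited cache follows, assuming a simple per-file time-to-live. The `TtlCache` class, the `fetch` callback and the `ttl_seconds` parameter are hypothetical and do not describe any particular product's cache policy. While an entry is younger than the time-to-live, reads succeed without touching the network; once it expires, the remote has to be reachable again.

```python
import time
from typing import Callable, Dict, Tuple


class TtlCache:
    """Serve reads from a local copy for a limited period (illustrative only)."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch        # assumed remote call; may raise ConnectionError
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the behaviour can be tested
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def read(self, path: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[0] < self._ttl:
            # Still within the cache period: no network access is needed,
            # so a dropped connection goes unnoticed.
            return entry[1]
        # Missing or expired: the remote must be reachable for this to succeed.
        data = self._fetch(path)
        self._entries[path] = (now, data)
        return data
```

With a 60 second time-to-live, for example, a file read just before a disconnect would stay readable for up to a minute, after which `read` would surface the connection error raised by `fetch`.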
FUSE-based drive mounts are now commonly used to provide cloud-drive-like technology, and this is what the File Fabric uses to provide its Cloud Drive feature for Mac, Windows and Linux.
The advantages of a FUSE-based file system come with some caveats that must be kept in mind:
- A network drive based on FUSE has higher latency than a local file system (it is going off to the Cloud, after all!). Throughput is reduced, in particular when reading or writing files serially. Using larger files (for example, zipped archives) and/or transferring multiple files in parallel can help.
- Most FUSE-based file systems cache files that are being accessed. There needs to be enough local storage capacity for the cached copies, along with an intelligent cache invalidation mechanism (the File Fabric software provides one out of the box, with options to adjust its settings).
- If the remote storage is object-based, renaming a directory can become a long-running process in which every file within the folder hierarchy is renamed individually, and this can exceed the request/response time limits of the kernel file system layer.
- Applications may expect file access to behave in a certain way, so specific integrations with an application may be needed for the FUSE drive to work as expected. The File Fabric drive has some of these integrations built in for applications such as Microsoft Office and LibreOffice, but there may well be applications for which it does not, and which may not work well with the drive.
- Some settings need to be turned off for the drive to work effectively. Thumbnails are one such feature: generating a thumbnail requires the file to be downloaded, which negates the premise of not storing the files locally, so thumbnail generation is turned off by default (although this too can be changed).
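The directory rename caveat can be illustrated with a minimal sketch. It assumes a flat key/value object store with no native rename operation, which is the case for many (though not all) object storage APIs; a Python dictionary stands in for the store, and `rename_prefix` is a hypothetical helper, not part of any real SDK. The point is that the work grows with the number of objects under the directory rather than being a single metadata update.

```python
from typing import Dict


def rename_prefix(store: Dict[str, bytes], old: str, new: str) -> int:
    """Rename a 'directory' in a flat object store; return the objects touched.

    Illustrative assumption: the store has no rename, so every object under
    the old prefix is written under the new key and removed from the old one.
    """
    old_prefix = old.rstrip("/") + "/"
    new_prefix = new.rstrip("/") + "/"
    # Snapshot the affected keys first so the store can be modified safely.
    keys = [k for k in store if k.startswith(old_prefix)]
    for key in keys:
        # One copy plus one delete per object; against a real remote store
        # each of these would be at least one network request.
        store[new_prefix + key[len(old_prefix):]] = store.pop(key)
    return len(keys)
```

Renaming a folder that holds many thousands of objects therefore means many thousands of remote requests, which is how a single rename call from the kernel can run past its expected response time.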
The benefit of FUSE-based drives lies not only in ease of use but also in increased compatibility with third-party software, which sees the drive as if it were a local drive. However, there is a need to understand and work within the kinds of limitations outlined above.
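As one example of working within the latency limitation, the sketch below contrasts serial and parallel transfers using Python's standard `concurrent.futures` module. The `fetch` function and the `LATENCY_SECONDS` value are stand-ins chosen for illustration: a real transfer would call the drive or a storage API, and the actual round-trip time depends on the network.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

LATENCY_SECONDS = 0.05  # assumed per-request round trip, for illustration only


def fetch(path: str) -> bytes:
    """Stand-in for a remote download: waits one round trip, returns dummy data."""
    time.sleep(LATENCY_SECONDS)
    return path.encode()


def fetch_serial(paths: List[str]) -> Dict[str, bytes]:
    # Each file waits for the previous one, so the round trips add up.
    return {p: fetch(p) for p in paths}


def fetch_parallel(paths: List[str], workers: int = 8) -> Dict[str, bytes]:
    # Requests overlap, so much of the per-file latency is hidden.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(paths, pool.map(fetch, paths)))


if __name__ == "__main__":
    paths = [f"/data/file{i}.bin" for i in range(8)]
    for fn in (fetch_serial, fetch_parallel):
        start = time.perf_counter()
        fn(paths)
        print(fn.__name__, round(time.perf_counter() - start, 2), "seconds")
```

Both functions return the same data; only the elapsed time differs. Threads suit this case because the work is dominated by waiting on the network rather than by computation.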