Jibe and AWS CloudTrail

Jibe is able to parse AWS CloudTrail log files to identify Amazon S3 objects that have been modified outside of Nasuni Access Anywhere and sync them with the Access Anywhere metadata.

For general information on Jibe see Jibe Documentation.

CloudTrail Architecture

Jibe may run as a Python application processing S3 log files. The steps are:

S3 API - External applications call Amazon S3 APIs to create and delete objects.
CloudTrail - Amazon S3 buckets log S3 API Calls to AWS CloudTrail log files in a nominated S3 bucket.
Scan - Jibe scans the bucket to pick up and process the AWS CloudTrail log files.
Sync Request - Jibe asks the Access Anywhere Server to refresh object metadata.
S3 Sync Object - The Access Anywhere Server verifies the object status on S3 and updates metadata.

Jibe Sync with AWS CloudTrail Log Files

S3 events can only be sent to queues in the same region. For buckets in another region, send events to an Amazon Simple Notification Service topic in that region, which in turn forwards them to the queue in the target region.

Features

Near Real-Time - Metadata in Access Anywhere is updated in most cases within 2-6 minutes after objects are updated in Amazon S3.
No Direct Access to S3 - Jibe loads files and refreshes metadata only through API calls to the Access Anywhere Server. It looks for AWS CloudTrail log files within S3 buckets that are being managed by Access Anywhere.
Provider Report - Jibe generates a report every 15 minutes that summarizes the sync status of Amazon S3 providers and buckets. It's also useful for troubleshooting. You can configure an upload folder so that the report is automatically uploaded to the Access Anywhere Server on completion.
Slack Integration - Jibe can be configured to send errors and warnings to a Slack channel.
Multiple Instances - Jibe can be run across multiple appliance nodes with each instance focusing on a subset of providers. Status information is shared so reports are consistent.

Getting Started

AWS Setup

All Amazon S3 buckets must be configured for object-level logging. You can do this in one step for each account you are using from the AWS CloudTrail console. Create a new trail, choose “Select all S3 buckets in your account” and “Write”. If you are logging “Read” events for another purpose, Jibe will ignore them.

If you choose “New S3 bucket” for your storage location be sure to register the new bucket with Access Anywhere since that's how Jibe has access to the log files. From the Provider Detail page choose “Manage Buckets” to locate and add new buckets.

For more information see Logging Amazon S3 API Calls Using AWS CloudTrail.

Access Anywhere Setup

Jibe reads CloudTrail log files from the buckets you have designated through Access Anywhere. The buckets you are using must be added to the Access Anywhere provider if they are not available already.

Jibe needs an Access Anywhere account with an Administrator role in order to read logs and synchronize objects.

Jibe Installation

See Installation.

Configuration File

Change the endpoint and credentials in jibe-config.json to those of your Nasuni Access Anywhere server.

{
    "endpoint":"https://files.example.com",
    "login":"adminuser@example.com",
    "password":"*****"
}

For more information see Jibe Configuration and Jibe Logging.
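As a quick sanity check before starting Jibe, the configuration file can be validated with a few lines of Python. This is only a sketch: the helper name load_config is hypothetical and not part of Jibe, and it checks nothing beyond the three keys shown above.

```python
import json

# The three keys shown in the jibe-config.json example above.
REQUIRED_KEYS = ("endpoint", "login", "password")

def load_config(text):
    """Parse jibe-config.json content and check that the documented keys are set.

    Hypothetical helper for illustration; Jibe performs its own validation.
    """
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))
    return config

# Example content; in practice read the text from jibe-config.json.
sample = (
    '{"endpoint": "https://files.example.com",'
    ' "login": "adminuser@example.com",'
    ' "password": "secret"}'
)
config = load_config(sample)
print("Config OK for", config["endpoint"])
```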

Tips & Tricks

To minimize the size of the log files:

Only log S3 write events
Don't enable S3 logging for the CloudTrail buckets
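If you configure the trail programmatically rather than through the console, both tips can be expressed as a single data event selector. The sketch below only builds the selector as a Python dictionary in the shape accepted by the CloudTrail PutEventSelectors API; the bucket names and the helper name build_event_selector are made up for illustration, and the management-events setting is an assumption you should adjust to your own needs.

```python
import json

def build_event_selector(data_buckets, log_buckets):
    """Return a write-only S3 data event selector that skips the log buckets."""
    skipped = set(log_buckets)
    # A trailing slash after the bucket name selects all objects in that bucket.
    arns = ["arn:aws:s3:::" + name + "/" for name in data_buckets if name not in skipped]
    return {
        "ReadWriteType": "WriteOnly",        # tip 1: only log write events
        "IncludeManagementEvents": False,    # assumption: data events only
        "DataResources": [{"Type": "AWS::S3::Object", "Values": arns}],
    }

# Hypothetical bucket names; the last one holds the CloudTrail logs themselves.
selector = build_event_selector(
    ["example-data-1", "example-data-2", "example-cloudtrail-logs"],
    ["example-cloudtrail-logs"],
)
print(json.dumps([selector], indent=2))
```

Listing buckets individually, instead of selecting every bucket in the account, is what makes it possible to leave the CloudTrail bucket out.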

Implementation Notes

AWS CloudTrail Logs

AWS CloudTrail log files are gzip-compressed JSON files containing logs of S3 and non-S3 API calls. New log files are uploaded into the destination bucket every five minutes. If there are many events, new log files may be uploaded more frequently.
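To make the file format concrete, here is a minimal sketch of reading one log file in Python. It relies on the standard CloudTrail layout, a top-level "Records" array whose entries carry an "eventSource" field; the function names are illustrative and are not Jibe's own code.

```python
import gzip
import json

def read_cloudtrail_records(raw_bytes):
    """Decompress one CloudTrail log file and return its list of event records."""
    document = json.loads(gzip.decompress(raw_bytes).decode("utf-8"))
    return document.get("Records", [])

def s3_records(records):
    """Keep only the records that describe Amazon S3 API calls."""
    return [r for r in records if r.get("eventSource") == "s3.amazonaws.com"]

# Build a tiny in-memory log file so the example runs without AWS access.
sample = {
    "Records": [
        {
            "eventSource": "s3.amazonaws.com",
            "eventName": "PutObject",
            "requestParameters": {"bucketName": "example-bucket", "key": "docs/a.txt"},
        },
        {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances"},
    ]
}
raw = gzip.compress(json.dumps(sample).encode("utf-8"))

for record in s3_records(read_cloudtrail_records(raw)):
    print(record["eventName"], record["requestParameters"]["key"])
```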

Log files generally include events for the prior 5-minute period, but not always. Occasionally the delivery of log files is delayed for several hours or longer. At times batches are very small for a period, which can result in hundreds of thousands of log files being created.

AWS CloudTrail files are delivered in folders by date and region.
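The folder layout can be illustrated by splitting an object key into its parts. This sketch assumes the default CloudTrail key layout, AWSLogs/account/CloudTrail/region/year/month/day/file, with no custom prefix and no organization ID segment; the sample key and the helper name parse_log_key are made up for illustration.

```python
from datetime import date

def parse_log_key(key):
    """Split a CloudTrail log key into account, region, date and file name.

    Returns None when the key does not match the assumed default layout.
    """
    parts = key.strip("/").split("/")
    if len(parts) != 8 or parts[0] != "AWSLogs" or parts[2] != "CloudTrail":
        return None
    try:
        year, month, day = (int(p) for p in parts[4:7])
        log_date = date(year, month, day)
    except ValueError:
        return None  # non-numeric or impossible date folder
    return {
        "account_id": parts[1],
        "region": parts[3],
        "date": log_date,
        "file_name": parts[7],
    }

sample_key = (
    "AWSLogs/123456789012/CloudTrail/us-east-1/2024/03/15/"
    "123456789012_CloudTrail_us-east-1_20240315T1205Z_abcDEF123456.json.gz"
)
info = parse_log_key(sample_key)
print(info["region"], info["date"].isoformat())
```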

Access to Logs

CloudTrail log files are accessed by Jibe through the Access Anywhere Server. The bucket containing the logs must therefore be accessible through Access Anywhere. (See Provider Settings > Manage Buckets.)

Jibe looks for the CloudTrail root folder in every bucket by looking for /AWSLogs. If a different location is being used it must be configured.

Jibe calls cloud refresh on folders to discover new log files and subfolders. By default it checks for new log files every three minutes.

Processing Logic

Jibe processes only the S3 calls that create or delete objects, including multi-part uploads. The exception is the DeleteObjects S3 API call, which deletes multiple objects in one request: the log entry for this call does not list the objects that were deleted, so Jibe cannot act on it. This call is used, for example, by the AWS Console to delete multiple objects.

Jibe ignores failed API calls and API calls that originated from Access Anywhere itself since they reflect existing changes to the metadata.
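The filtering rules described so far can be sketched as a small predicate over CloudTrail records. This is an illustration, not Jibe's implementation: the exact set of event names is an assumption, and recognizing Access Anywhere's own calls by the access key ID in "userIdentity" is one plausible approach chosen for the example. The fields "eventName", "errorCode", "userIdentity" and "requestParameters" are standard CloudTrail record fields.

```python
# Assumed set of object-level write events; DeleteObjects is deliberately absent.
WRITE_EVENTS = {"PutObject", "CopyObject", "CompleteMultipartUpload", "DeleteObject"}

def should_sync(record, own_access_keys):
    """Return True if the record describes an external, successful object change."""
    if record.get("eventSource") != "s3.amazonaws.com":
        return False
    if record.get("eventName") not in WRITE_EVENTS:
        return False
    if "errorCode" in record:  # failed calls carry an errorCode field
        return False
    identity = record.get("userIdentity") or {}
    if identity.get("accessKeyId") in own_access_keys:
        return False  # assumption: the call came from Access Anywhere itself
    return True

def object_ref(record):
    """Return the (bucket, key) pair that the record refers to."""
    params = record.get("requestParameters") or {}
    return params.get("bucketName"), params.get("key")

# Hypothetical records and access key, for illustration only.
own_keys = {"AKIAEXAMPLEOWNKEY"}
records = [
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLEOTHER"},
     "requestParameters": {"bucketName": "example-bucket", "key": "a.txt"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLEOWNKEY"},
     "requestParameters": {"bucketName": "example-bucket", "key": "b.txt"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "DeleteObject",
     "errorCode": "AccessDenied", "userIdentity": {"accessKeyId": "AKIAEXAMPLEOTHER"},
     "requestParameters": {"bucketName": "example-bucket", "key": "c.txt"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "DeleteObjects",
     "userIdentity": {"accessKeyId": "AKIAEXAMPLEOTHER"},
     "requestParameters": {"bucketName": "example-bucket"}},
]
to_sync = [object_ref(r) for r in records if should_sync(r, own_keys)]
print(to_sync)
```

Only the first record survives the filter: the second came from the known access key, the third failed, and the fourth is a DeleteObjects call.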

Jibe ignores folders that were created before the last provider sync started. (Jibe determines the start of the last sync from the background task or, if that's not available, by estimating it from the sync completion time and the number of files indexed.)

Events may be received through one provider (configured as a “source”) but may be applied to buckets in other providers.

Troubleshooting

Main places to check:

Jibe log files
Jibe report
Buckets with CloudTrail logs
