Differences

This shows you the differences between two versions of the page.

piidiscovery [2018_04_05 15:55] – [Configuration] stevenpiidiscovery [2018_04_06 20:49] – refactor of configuring rules section steven
Line 1: Line 1:
-# PII Discovery+# PII Scanning and Detection
 (available in v1803) (available in v1803)
  
 This page covers the identification and classification of PII (Personally Identifiable Information). This page covers the identification and classification of PII (Personally Identifiable Information).
  
-The Enterprise File Fabric's PII feature helps enterprise customers manage personal information by automatically detecting PII in documents and alerting the organisation’s information security specialists or other designated users to its presence.+The Enterprise File Fabric's PII Scanning and Detection feature helps enterprise customers manage personal information by automatically detecting PII in documents and alerting the organisation’s information security specialists or other designated users to its presence.
  
 [[https://www.youtube.com/watch?time_continue=7&v=zxpyY3Rw34c|{{ ::piidiscovery:piivideo.png?nolink&300 |}}]] [[https://www.youtube.com/watch?time_continue=7&v=zxpyY3Rw34c|{{ ::piidiscovery:piivideo.png?nolink&300 |}}]]
Line 14: Line 14:
 ### Scanning ### Scanning
  
-The PII Discovery feature works by scanning documents when they are added or updated. Documents are searched using a configurable set of rules, looking for personal information such as telephone numbers, email addresses and national identity numbers.+The PII Scanning and Detection feature works by scanning documents when they are added or updated. Documents are searched using a configurable set of rules, looking for personal information such as telephone numbers, email addresses and national identity numbers.
  
 {{ :piidiscovery:file-and-folders-pii.png?nolink&800 |}} {{ :piidiscovery:file-and-folders-pii.png?nolink&800 |}}
Line 20: Line 20:
 ### Tagging ### Tagging
  
-Files in which personal information is found are classified as PII with the types of PII data that they contain:+Files in which personal information is found are classified as PII with the types of PII data that they contain. Users with appropriate permissions can see the PII that has been found in a document on the “info” tab for that document:
  
 {{ :piidiscovery:tagging.png?nolink |}} {{ :piidiscovery:tagging.png?nolink |}}
Line 32: Line 32:
 {{ :piidiscovery:notify_admin_email.png?500&nolink |}} {{ :piidiscovery:notify_admin_email.png?500&nolink |}}
  
-The file owner, the user who uploaded the file, receives an email and a message:+The file owner (the user who uploaded the file) receives an email and a message:
  
 {{ :piidiscovery:notify_owner_alert.png?nolink |}} {{ :piidiscovery:notify_owner_alert.png?nolink |}}
Line 44: Line 44:
 {{ :piidiscovery:searchpiicheckboxes.png?nolink |}} {{ :piidiscovery:searchpiicheckboxes.png?nolink |}}
  
-Look under the “info” tab for specific PII information a document contains: 
- 
-{{ :piidiscovery:infotab_piifound.png?nolink | 
-}} 
 ## Workflow ## Workflow
  
 ### Uploading ### Uploading
  
-When a file or object is uploaded, updated or synchronized the File Fabric recognizes it as containing new content; it is a candidate for being indexed and scanned.+When a file is uploaded, updated or synchronized the File Fabric recognizes it as containing new content; it is a candidate for being scanned for PII.
  
 To be scanned the file must be located on a storage provider that has content search enabled (this is set when the provider is created). To be scanned the file must be located on a storage provider that has content search enabled (this is set when the provider is created).
Line 72: Line 68:
 ### Tagging of PII Files ### Tagging of PII Files
  
-When PII is detected in a file, a tag is added to the file indicating the type of PII that was detected.  For example, if the File Fabric is configured to scan for US Social Security Numbers (SSNs) and one or more data values that match the US SSN detection rule are found when the file is scanned, then a tag with the value “US Social Security Number” will be added to the file metadata under the PII classification.+When PII is detected in a file, a tag is added to the file indicating the type of PII that was detected.  For example, if the File Fabric is configured to scan for US Social Security Numbers (SSNs) and one or more data values that match the US SSN detection rule are found when the file is scanned, then a tag with the value “US Social Security Number” will be added to the file's metadata under the PII classification.
  
 {{ :piidiscovery:tagging.png?nolink |}} {{ :piidiscovery:tagging.png?nolink |}}
- 
  
  
Line 81: Line 76:
  
 Administrators and users with PII permission are notified when a file that matches the PII rules has been detected. Administrators and users with PII permission are notified when a file that matches the PII rules has been detected.
- 
  
 Users with PII permission, including administrators, receive a notification by email: Users with PII permission, including administrators, receive a notification by email:
Line 88: Line 82:
 {{ :piidiscovery:notify_admin_email.png?500&nolink |}} {{ :piidiscovery:notify_admin_email.png?500&nolink |}}
  
-The file ownerthe user who uploaded the file, receives both an email and a message.+The file owner (the user who uploaded the file) receives both an email and a message.
  
 {{ :piidiscovery:notify_owner_email.png?500&nolink |}} {{ :piidiscovery:notify_owner_email.png?500&nolink |}}
Line 129: Line 123:
 ### File Information ### File Information
  
-Available to uIf a file contains PII, a “Show PII matches” button is displayed on the File Manager Info tab for the file. This is available to users with PII or administration permissions.+If a file contains PII, a “Show PII matches” button is displayed on the File Manager Info tab for the file. This is available to users with PII or administration permissions.
  
 {{ :piidiscovery:show_pii_matches_button.jpg?nolink |}} {{ :piidiscovery:show_pii_matches_button.jpg?nolink |}}
Line 159: Line 153:
  * Add Storage Providers with Content Search  * Add Storage Providers with Content Search
  * Give Users PII Authorization  * Give Users PII Authorization
-  * Edit the PII Detection Rules (optional)+  * Configure PII Detection Rules (optional)
  * Change the Name of the PII Classification (optional)  * Change the Name of the PII Classification (optional)
  
 ### 1. Enabling the Content Search Engine ### 1. Enabling the Content Search Engine
  
-Content search must be enabled for PII scanning and detection to work. The content search engine scans documents for PII as they are uploaded or synchronized. The search engine is available only with the appliance and must be explicitly enabled.+Content search must be enabled for PII scanning and detection to work. The content search engine scans documents for PII as they are uploaded or synchronized. The search engine is available only with the Enterprise File Fabric appliance and must be explicitly enabled.
  
 Here is a link to instructions for configuring the content search engine:  [[cloudappliance/solr]] Here is a link to instructions for configuring the content search engine:  [[cloudappliance/solr]]
Line 170: Line 164:
 ### 2. Enabling PII Scanning and Detection in User Packages ### 2. Enabling PII Scanning and Detection in User Packages
  
-PII scanning and detection is only available to Teams (orgs.) that have been assigned a User Package in which the feature is enabled.  The appliance administrator (appladmin) can set this option for a Package by:+PII Scanning and Detection is only available to Organizations that have been assigned a User Package in which the feature is enabled.  The appliance administrator (appladmin) can set this option for a Package by:
  
   * choosing “User Packages” from the hamburger menu;   * choosing “User Packages” from the hamburger menu;
Line 182: Line 176:
 ### 3. Enable the Policy “PII Scanning & Detection” ### 3. Enable the Policy “PII Scanning & Detection”
  
-An administrator can enable this features under Policies > PII Scanning & Detection.+An administrator can enable this feature under Policies > PII Scanning & Detection.
  
 {{ :piidiscovery:org_policies_pii.png?nolink |}} {{ :piidiscovery:org_policies_pii.png?nolink |}}
Line 192: Line 186:
 {{ :piidiscovery:cos_info.png?600&nolink |}} {{ :piidiscovery:cos_info.png?600&nolink |}}
  
-Files that existed before are indexes during the initial provider synchronization. Subsequently files are indexed when created or updated, or if a provider cloud sync is executed.+Files that existed before the provider was added are indexed during the initial provider synchronization. Subsequently files are indexed when created or updated, or if a provider cloud sync is executed and new or updated files are discovered.
  
 Search cannot be enabled for an existing provider data source. To verify that content search is enabled for a provider, as an organizational administrator go to the Dashboard. Select the Setting gear icon to see the data source provider detail. The //Content index// for search setting must be set to //Yes//. Search cannot be enabled for an existing provider data source. To verify that content search is enabled for a provider, as an organizational administrator go to the Dashboard. Select the Setting gear icon to see the data source provider detail. The //Content index// for search setting must be set to //Yes//.
Line 214: Line 208:
 Another way to give a user PII authorization is to assign the Admin role.  Assigning the Admin role to a user gives several other administrative privileges and should not be done without a complete understanding of the implications. Another way to give a user PII authorization is to assign the Admin role.  Assigning the Admin role to a user gives several other administrative privileges and should not be done without a complete understanding of the implications.
  
-### 6. Editing the PII Detection Rules+### 6. Configuring PII Detection Rules
  
-A set of rules for detecting different kinds of PII is provided with the Enterprise File Fabric. These rules can be used as provided, or the administrator can remove or change rules to meet the organization’s specific requirements.+A set of rules for detecting different kinds of PII is provided with the Enterprise File Fabric. These rules can be used as provided, or the administrator can add, remove or change them.
  
 +The PII Detection Rules are defined in a JSON document that is accessible from the PII Scanning & Detection tab of the organization’s Policies page. Prior to editing the PII Detection Rules, make a safe copy of the JSON document by copying the contents to a text file. That way you can easily revert the changes if needed.
  
-PII detection rules are defined in a JSON document that is presented on the PII administration tab of the organization’s Policies page:+The PII Detection Rules JSON document is an array of objects with each object describing one rule. A rule has the following properties:
  
-    {   +  * ''id'' - A unique identifier. 
-        "id":"creditcard", +  * ''title'' - The name of the rule shown in the user interface.
-        "tag":"credit card", +  * ''tag'' - Files found with this rule are tagged with this value.
-        "title":"Credit card numbers", +  * ''filters'' - An array of PII filter objects with matching criteria.
-        "filters":  +
-              +
-                "name":"The main credit card filter", +
-                "code":"creditcard" +
-            } +
-        ] +
-    }+
  
 +The document is validated against a JSON schema on update. If there is an error the document will not be saved:
  
-This contents of this document must conform to a JSON schema specification that is included with the File Fabric appliance and can be downloaded from the same page:+{{ :piidiscovery:pii_filter_error.png?nolink |}} 
 + 
 +The JSON schema can be downloaded from the same page:
  
 {{ :piidiscovery:enable_pii_scanning.jpg?nolink |}} {{ :piidiscovery:enable_pii_scanning.jpg?nolink |}}
  
-Prior to editing the JSON document that contains the PII detection rules, make a safe copy of the current version by copying the contents to a text file.  That way you can easily revert the changes if needed.+#### Rule Id
  
-The JSON document consists of an array of structures, each of which describes a rule. Each rule is identified by an id.  The id must be unique within the JSON document.+To add a scanning rule, create a new unique ''id''. An id must only contain the characters A-Z, a-z, 0-9 and _ (underscore). It is only used internally and should not be changed.
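
The id constraints described here can be checked locally before an edited document is pasted back into the Policies page. The following is a minimal sketch in Python and is not part of the File Fabric; the function name ''check_rule_ids'' is hypothetical, and the script does not replace the JSON schema validation that the product performs on save.

```python
import json
import re

# Allowed id characters as described above: A-Z, a-z, 0-9 and underscore.
ID_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def check_rule_ids(document):
    """Return a list of problems found in the rule ids of a rules document.

    `document` is the JSON text of the PII Detection Rules (an array of
    rule objects). This helper is a hypothetical local pre-check only.
    """
    rules = json.loads(document)
    problems = []
    seen = set()
    for index, rule in enumerate(rules):
        rule_id = rule.get("id")
        if not isinstance(rule_id, str) or not ID_PATTERN.fullmatch(rule_id):
            problems.append(f"rule {index}: invalid id {rule_id!r}")
        elif rule_id in seen:
            problems.append(f"rule {index}: duplicate id {rule_id!r}")
        seen.add(rule_id)
    return problems


if __name__ == "__main__":
    sample = '[{"id": "us_ssn"}, {"id": "us_ssn"}, {"id": "bad id"}]'
    for problem in check_rule_ids(sample):
        print(problem)
```

An empty result only means the ids look well formed; the document may still fail schema validation for other reasons.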
  
-Each rule contains a list of filters.+#### Rule Title
  
-The JSON schema describes two styles of filters that the JSON document can contain.  Only the code filter is currently supported Here is an example:+The ''title'' will be the name of the data type in the “Contains PII” checklist on the File Manager’s search screen and in the PII list for a file in the File Manager’s Info panel.
  
-      +{{ :piidiscovery:contain_pii.jpg?nolink |}}
-        "id":"us_ssn", +
-        "tag":"US Social Security Number", +
-        "title":"Social Security Numbers (US)", +
-        "filters":  +
-              +
-                "name":"The main SSN filter", +
-                "code":"usSsn" +
-            } +
-        ] +
-    }+
  
 +#### Rule Tag
  
-The tag” value will be the name of the tag in the File Fabric’s tagging system.  Tag values must be unique within the JSON document.+The ''tag'' value is the name of one tag. It does not have to be predefined.  Tag values should be unique within the JSON document. 
  
 {{ :piidiscovery:edit_tags.jpg?nolink |}} {{ :piidiscovery:edit_tags.jpg?nolink |}}
  
-The “title” will be the name of the data type in the “Contains PII” tick list on the File Manager’s search screen and in the PII list for a file in the File Manager’s Info panel.+#### Rule Filters
  
-{{ :piidiscovery:contain_pii.jpg?nolink |}}+Two types of matching filters are supported. Regular expression filters support the detection of PII content through search patterns. Code filters are predefined filters in the product that match common types of PII.  
 + 
 +##### Regular Expression Filters 
 + 
 +Rules created by users (admins) can each contain one user-supplied regular expression filter. 
 + 
 +The regex property is the regular expression that will be used to detect data of the type described by the rule when a file is scanned. The regular expression must be delimited by slashes (‘/’). For more information on syntax see [[http://us1.php.net/manual/en/regexp.reference.meta.php|Regexp Reference]].  
 + 
 +This is an example of a rule using a regular expression filter: 
 + 
 +   {
 +      "id":"USVIN",
 +      "tag":"US VIN",
 +      "title":"US Vehicle Identification Number",
 +      "filters": [
 +         {
 +            "name":"VIN filter",
 +            "regex":"/([A-HJ-NPR-Z0-9]{17})/"
 +         }
 +      ]
 +   }
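
Before adding a regular expression rule, it can help to try the pattern against sample text. The sketch below does this in Python and makes several assumptions: the File Fabric documentation links to the PHP regular expression reference, so Python's ''re'' module only approximates the product's matching (simple character classes and counts, as in the VIN example, behave the same way), and the helper names ''strip_delimiters'' and ''find_matches'' are hypothetical.

```python
import re

# The rule from the example above, as a Python dictionary.
rule = {
    "id": "USVIN",
    "tag": "US VIN",
    "title": "US Vehicle Identification Number",
    "filters": [
        {"name": "VIN filter", "regex": "/([A-HJ-NPR-Z0-9]{17})/"}
    ],
}


def strip_delimiters(pattern):
    """Remove the leading and trailing '/' that the rule format requires."""
    if len(pattern) < 2 or not (pattern.startswith("/") and pattern.endswith("/")):
        raise ValueError("regex must be delimited by slashes")
    return pattern[1:-1]


def find_matches(rule, text):
    """Return every substring of `text` matched by the rule's regex filters."""
    matches = []
    for rule_filter in rule["filters"]:
        if "regex" in rule_filter:
            compiled = re.compile(strip_delimiters(rule_filter["regex"]))
            matches.extend(m.group(0) for m in compiled.finditer(text))
    return matches


if __name__ == "__main__":
    sample = "Vehicle 1HGCM82633A004352 was registered on 2018-04-06."
    print(find_matches(rule, sample))
```

Note that this pattern matches any run of 17 permitted characters, including one embedded in a longer string, so a production rule may need word boundaries to reduce false positives.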
 + 
 +##### Code Filters 
 + 
 +This is an example of a rule using a code filter: 
 + 
 +   {
 +      "id":"us_ssn",
 +      "tag":"US Social Security Number",
 +      "title":"Social Security Numbers (US)",
 +      "filters": [
 +         {
 +            "name":"The main SSN filter",
 +            "code":"usSsn"
 +         }
 +      ]
 +   }
 + 
 +Adding new code filters to this version of the File Fabric requires paid professional services support from Storage Made Easy. Users wishing to add their own code filters should contact their SME sales representatives. 
 + 
 +The following predefined code filters are included with the File Fabric:
  
-When you try to save your changes to the JSON document on the “PII Detection & Scanning” tab of the “Policies” page, the edited JSON is validated.  If your edits have introduced an error then the document will not be saved.+ * General 
 +    * bankIban - Bank account numbers (IBAN)
 +    * bankSwift - SWIFT
 +    * creditcard - Credit cards
 +    * email - Email
 +    * Icd10cm - ICD 10-CM Code rule
 +    * Icd9cm - ICD 9-CM Code rule
 +    * Ip - IPv4 and IPv6 addresses
 + * Australia
 +    * auMedicare - Australian Medicare account number
 +    * auTaxFileNumber - Australian Tax File number
 + * Brazil
 +    * brCpfNumber - Brazilian CPF Number rule
 + * Canada
 +    * caBritishColumbiaInsuranceNumber - British Columbian Personal Health Number (PHN)
 +    * caOntarioInsuranceNumber - Ontario Health Insurance Plan number
 +    * caPassport - Canadian Passport
 +    * caQuebecInsuranceNumber - Quebec Health Insurance Number
 +    * caSin - Canadian Social Insurance Number (SIN)
 + * China
 +    * cnPassport - Chinese passport
 + * Germany
 +    * dePassport - German passport
 + * Spain
 +    * esNie - Spanish NIE Number rule
 +    * esNif - Spanish NIF Number rule
 +    * esPassport - Spanish passport
 + * France
 +    * frIDCard - French National ID Card
 +    * frPassport - French passport
 +    * frSsn - French social security number (NIR)
 + * India
 +    * inPersonalNumber - Indian Personal Permanent Account Number
 + * Japan
 +    * jpPassport - Japanese passport
 + * South Korea
 +    * krPassport - South Korean passport
 + * Mexico
 +    * mxNationalNumber - Mexican National Identification Number
 +    * mxPassport - Mexican passport
 + * Netherlands
 +    * nlIdNumber - Dutch national identification number (BSN)
 + * United Kingdom
 +    * ukDrivingLicense - UK Driving License rule
 +    * ukNationalInsuranceNumber - UK National Insurance Number rule
 +    * ukNhsNumber - UK NHS Number rule
 +    * ukNumberPlate - UK Number Plate
 +    * ukPassport - UK passport
 +    * ukTaxpayerNumber - UK Taxpayer Identification Number
 +    * ukTelephone - UK telephone number
 + * United States
  
-{{ :piidiscovery:error_pii_rules.jpg?nolink |}}+#### Removing Rules
  
 You may also want to remove from the JSON document rules that scan for data items that are not of interest to your organization.  In that case, remove the entire section starting with the curly brace before the id, and ending with the comma preceding the next rule (unless you are removing the final rule in the document, in which case there is no comma).  For example, if you don’t want the File Fabric to scan for Australian tax file numbers, you would remove this text (including the trailing comma) from the JSON document: You may also want to remove from the JSON document rules that scan for data items that are not of interest to your organization.  In that case, remove the entire section starting with the curly brace before the id, and ending with the comma preceding the next rule (unless you are removing the final rule in the document, in which case there is no comma).  For example, if you don’t want the File Fabric to scan for Australian tax file numbers, you would remove this text (including the trailing comma) from the JSON document: