Use AI to search for sensitive data
To use AI to search for sensitive data, take the following steps:
Create a custom masking rule
Do this step only once for each masking rule!
Go to the Masking Rules screen and click Add Masking Rule.
Enter a meaningful name. The wording is very important, because the AI uses the name to recognize the data.
English names give the best results. Even if you need to search for data in another language, write the name in English, for example “Medical procedure name in Spanish”.
Choose Search Type of Data - Lookup.
On the next screen, for Search Lookup File, provide a dummy file with a dummy value.
Example:
On the next screen, choose any Masking Definition.
Once the new masking rule has been created, add it to every privacy policy that you use (on the Privacy Policies screen).
Perform Search Sensitive Data
This step should be done every time you want to search for sensitive data!
Go to the Sensitive Search screen and click Scan.
Select a Privacy Policy that includes the custom masking rule created in the previous step.
Pay attention to the Number of unique values to analyze parameter, as explained below.
Choose AI scan
Restrictions and limits of the AI scan
General restrictions
AI is allowed by the application’s license
AI was chosen in the Sensitive Search options
The AI scan for a specific column is executed when all of the following are true:
The column contains string values (not numbers, dates, times, etc.)
The standard scan reports a probability below 50% (the threshold is configured by the application parameter aiScanTriggerThreshold)
The average value length is between 5 and 50 characters
On average, a value contains fewer than 4 non-alpha characters (for example, the value 12-34+abc contains 6 non-alpha characters and 3 alpha characters)
There are more unique values in the sample dataset than defined in the scan parameter Number of unique values to analyze
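The column criteria above can be sketched in Python to help predict whether a column is likely to be picked up by the AI scan. This is only an illustration, not the application's actual implementation: the function names and arguments are hypothetical, the length bounds are assumed to be inclusive, whitespace is assumed to count as non-alpha, and the threshold default of 0.5 stands in for aiScanTriggerThreshold.

```python
from typing import Sequence


def count_non_alpha(value: str) -> int:
    """Count characters that are not letters (digits, punctuation, whitespace)."""
    return sum(1 for ch in value if not ch.isalpha())


def is_ai_scan_candidate(
    sample: Sequence[object],
    standard_scan_probability: float,
    unique_values_to_analyze: int,
    ai_scan_trigger_threshold: float = 0.5,  # assumed stand-in for aiScanTriggerThreshold
) -> bool:
    """Return True if the sampled column meets all the documented criteria."""
    # The column must contain string values only (and at least one value).
    if not sample or not all(isinstance(v, str) for v in sample):
        return False
    # The standard scan must report a probability below the threshold.
    if standard_scan_probability >= ai_scan_trigger_threshold:
        return False
    # Average value length between 5 and 50 (bounds assumed inclusive).
    avg_length = sum(len(v) for v in sample) / len(sample)
    if not 5 <= avg_length <= 50:
        return False
    # On average, fewer than 4 non-alpha characters per value.
    avg_non_alpha = sum(count_non_alpha(v) for v in sample) / len(sample)
    if avg_non_alpha >= 4:
        return False
    # More unique values in the sample than the scan parameter defines.
    return len(set(sample)) > unique_values_to_analyze


drug_names = ["aspirin", "ibuprofen", "naproxen", "codeine"]
print(count_non_alpha("12-34+abc"))              # 6
print(is_ai_scan_candidate(drug_names, 0.2, 3))  # True
```

A column of values such as 12-34+abc would be rejected by this sketch, because each value averages 6 non-alpha characters.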