37. Use AI to search for sensitive data

To use AI to search for sensitive data, take the following steps:

Create a custom masking rule

Do this step only once for each masking rule.

Go to the Masking Rules screen and click Add Masking Rule.

Enter a meaningful name. The wording matters, because the AI uses the name to recognize the data.

image-20240908-080427.png

English names give the best results. Even if you need to search for data in another language, write the name in English and mention the language, for example “Medical procedure name in Spanish”.

For Search Type of Data, choose Lookup.

On the next screen, for Search Lookup File provide a dummy file with a dummy value.

Example:

On the next screen, choose any Masking Definition.

Once the new masking rule has been created, add it to every privacy policy that you use (on the Privacy Policies screen).

Perform Search Sensitive Data

Do this step every time you want to search for sensitive data.

Go to the Sensitive Search screen and click Scan.

Select a Privacy Policy that includes the custom masking rule created in the previous step.

Pay attention to the parameter Number of unique values to analyze; it affects whether the AI scan runs for a column, as explained in the restrictions further down.

Choose AI scan.

image-20240908-080234.png

Restrictions and limits of the AI scan

General restrictions

  1. AI is allowed by the application’s license

  2. AI scan is selected in the Sensitive Search options

The AI scan is executed for a specific column when:

  1. The column contains string values (not numbers, dates, times, and so on)

  2. The standard scan returns a probability below 50% (the threshold is configured by the application parameter aiScanTriggerThreshold)

  3. The average value length is between 5 and 50 characters

  4. On average, a value contains fewer than 4 non-alphabetic characters (for example, the value 12-34+abc contains 6 non-alphabetic characters and 3 alphabetic characters)

  5. The sample dataset contains more unique values than the number defined in the scan parameter Number of unique values to analyze
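
The column conditions above can be sketched as a single check. This is only an illustration, not the application's actual implementation: the function names and parameters are hypothetical, and it assumes that all five conditions must hold together, that the length bounds are inclusive, and that spaces and digits count as non-alphabetic characters.

```python
from statistics import mean


def count_non_alpha(value: str) -> int:
    """Count characters that are not letters (assumption: digits, spaces and punctuation all count)."""
    return sum(1 for ch in value if not ch.isalpha())


def is_ai_scan_candidate(values, standard_probability, unique_values_to_analyze,
                         ai_scan_trigger_threshold=0.5):
    """Hypothetical sketch of the documented conditions for running the AI scan on a column."""
    sample = [v for v in values if v is not None]
    # Condition 1: the column contains string values only.
    if not sample or not all(isinstance(v, str) for v in sample):
        return False
    # Condition 2: the standard scan probability is below the threshold (50% by default).
    if standard_probability >= ai_scan_trigger_threshold:
        return False
    # Condition 3: the average value length is between 5 and 50 (assumed inclusive).
    if not 5 <= mean(len(v) for v in sample) <= 50:
        return False
    # Condition 4: on average, fewer than 4 non-alphabetic characters per value.
    if mean(count_non_alpha(v) for v in sample) >= 4:
        return False
    # Condition 5: more unique values than "Number of unique values to analyze".
    return len(set(sample)) > unique_values_to_analyze


procedures = ["appendectomy", "knee replacement", "cataract surgery"]
print(count_non_alpha("12-34+abc"))               # the example value from condition 4
print(is_ai_scan_candidate(procedures, 0.2, 2))   # eligible under the assumptions above
print(is_ai_scan_candidate(procedures, 0.7, 2))   # standard scan is already confident
```

Under these assumptions, a column of free-text procedure names with a low standard-scan probability qualifies, while a column that the standard scan already recognizes does not.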