Data Subsetting

There are use cases that require only a subset of data for migration.

 

There are two types of data subsetting available:

Row Subset - in this case user will define/limit rows for the migration as the SQL’s “WHERE“ condition.

Workflow → Tables → Fill the “Subsetting“ field:

Important!: The value of the Subsetting field must be the valid “WHERE“ condition for the Source Database Type (See the Known Issues and Examples). Otherwise it will cause the migration to fail.

 

Column Subset - this option allows user to be able to check/un-check the specific columns for migration. If a column was not checked then the data will not be read and included from the source database.

Workflow → Tables → Click on the icon in the “Columns“ field:

As result the selected table will appear in the destination database without un-checked columns.

 

Important!: The Column Subset with Overwriting methods “Tables“ and “No Overwriting” leads to the migration fail. Also the migration will fail if the un-checked column has indexes or foreign key.

 

Known Issues:

Different databases have different “where” clause (sometimes incompatible) syntax.

 

Examples:

1. Oracle

The original SQL query:

SELECT

user_id, address

FROM

customers

WHERE

last_name = 'Anderson'

The string for the Subsetting: last_name = 'Anderson'

 

2. MySQL

Query:

SELECT

name,

price

FROM

products

WHERE

price < 1.0 OR

productCode = 'PEN'

The Subsetting-string: price < 1.0 OR productCode = 'PEN'

 

3. For PostgreSQL

Query:

SELECT
first_name,
last_name
FROM
customer
WHERE
first_name LIKE 'Bra%' AND
last_name <> 'Motley'

 

The Subsetting-string: first_name LIKE 'Bra%' AND last_name <> 'Motley'