Extraction from Database -
Filtration
1) Try to have the constraints in the 'Selection' criteria of the jobs itself. This will eliminate the unnecessary records even getting in before joins are made.
2) Using a constraint to filter a record set is much slower than performing a SELECT … WHERE….er joins in SQL itself if possible.
Sorting - Try not to use a sort stage when you can use an ORDER BY clause in the database.
Join - Always prefer joins in SQL itself if possible.
Parallel Read and Partitioning
For Netezza -
Set Enable partitioned reads to Yes.
WHERE mod(datasliceid,[[node-count]])=[[node-number]]
For Oracle -
Partitioned reads method property is set to Modulus.
WHERE MOD(TABLE.COL_NM, 4) = 0
** COL_NM must be a number datatype
For ODBC Connector -
Set Enable partitioning to Yes.
Select partitioning method to divide your data into subsets.
Set Column name - key field column in which the data is partitioned.
Extraction from Flat Files-
Filtration
1) Remove the data not used from the source as early as possible in the job.
2) Use Filter option to specify command to filter records from a text file.
3) Also read only columns which are required in the job instead of reading all.
Filtration
1) Try to have the constraints in the 'Selection' criteria of the jobs itself. This will eliminate the unnecessary records even getting in before joins are made.
2) Using a constraint to filter a record set is much slower than performing a SELECT … WHERE….er joins in SQL itself if possible.
Sorting - Try not to use a sort stage when you can use an ORDER BY clause in the database.
Parallel Read and Partitioning
For Netezza -
Set Enable partitioned reads to Yes.
WHERE mod(datasliceid,[[node-count]])=[[node-number]]
For Oracle -
Partitioned reads method property is set to Modulus.
WHERE MOD(TABLE.COL_NM, 4) = 0
** COL_NM must be a number datatype
For ODBC Connector -
Set Enable partitioning to Yes.
Select partitioning method to divide your data into subsets.
Set Column name - key field column in which the data is partitioned.
Filtration
2) Use Filter option to specify command to filter records from a text file.
3) Also read only columns which are required in the job instead of reading all.
Parallel Read
Sequential Files can be read in parallel by using -
1) Read From Multiple Nodes
2) Number Of Readers Per Node
Sequential Files can be read in parallel by using -
1) Read From Multiple Nodes
2) Number Of Readers Per Node