The ETL Help Guide: Datastage for Beginners

This page lists the Datastage topics to cover in order to gain a basic understanding of ETL with Datastage (Parallel Jobs). The topics are listed in a logical order, along with the questions to ask and the practicals to perform.

1) What is Datastage

Pipelining Concept
Partitioning Concept
Partitioning in Datastage
Collecting in Datastage

Questions –

Why is ETL needed?
What are the benefits of partitioning?
Why is collecting required?
Why are there different types of partitioning and collecting methods? Which is the best, the fastest, and the slowest, and why?
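
As a starting point for the partitioning and collecting questions above, here is a minimal Python sketch of the ideas. It is not Datastage code. The function names, the `cust_id` field, and the use of `crc32` as the hash are assumptions made only for illustration. It compares a round-robin split, a key-based hash split, and an ordered collect that reads the partitions one after another.

```python
from zlib import crc32


def round_robin_partition(records, num_partitions):
    """Deal records to partitions in turn, giving near-equal partition sizes."""
    partitions = [[] for _ in range(num_partitions)]
    for i, record in enumerate(records):
        partitions[i % num_partitions].append(record)
    return partitions


def hash_partition(records, key, num_partitions):
    """Send every record with the same key value to the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        # crc32 is a stand-in hash (an assumption); it is stable across runs.
        index = crc32(str(record[key]).encode("utf-8")) % num_partitions
        partitions[index].append(record)
    return partitions


def ordered_collect(partitions):
    """Read all of partition 0, then all of partition 1, and so on."""
    return [record for partition in partitions for record in partition]


# Hypothetical sample data: nine records spread over three customer ids.
records = [{"cust_id": i % 3, "amount": 10 * i} for i in range(9)]

rr = round_robin_partition(records, 2)
hp = hash_partition(records, "cust_id", 2)
collected = ordered_collect(hp)
```

Round robin balances the load but scatters a key across partitions, while hashing keeps each key together at the cost of possible skew. That trade-off is why no single method is best for every job.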

2) Configuration File in Datastage

What is a node
What happens in the background when a Datastage job runs

3) Datasets

Descriptor file
Data (binary file)
Datastage management utility
orchadmin utility

4) Sequential File

Read a Delimited File
Read a Fixed Width File
Read by File Pattern
Read by Filter option
Read by Schema file
Read from multiple nodes
Number of readers per node
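
Before trying the Sequential File practicals above in Datastage, it can help to see the first three read modes in plain Python. This is only a sketch of the concepts, not of the stage itself. The helper names, sample rows, and file names are made up for illustration.

```python
import csv
import fnmatch
import io


def read_delimited(stream, delimiter=","):
    """Parse delimited text into rows, one list of fields per record."""
    return [row for row in csv.reader(stream, delimiter=delimiter)]


def read_fixed_width(stream, widths):
    """Slice each line into fields using a list of column widths."""
    rows = []
    for line in stream:
        line = line.rstrip("\n")
        fields, start = [], 0
        for width in widths:
            fields.append(line[start:start + width].strip())
            start += width
        rows.append(fields)
    return rows


def match_file_pattern(names, pattern):
    """Select the file names that match a shell-style wildcard pattern."""
    return sorted(fnmatch.filter(names, pattern))


# Hypothetical inputs held in memory so the sketch needs no files on disk.
delimited = read_delimited(io.StringIO("1|Asha|500\n2|Ravi|750\n"), delimiter="|")
fixed = read_fixed_width(io.StringIO("001Asha 0500\n002Ravi 0750\n"), widths=[3, 5, 4])
matched = match_file_pattern(["sales_01.txt", "sales_02.txt", "cust.txt"], "sales_*.txt")
```

A delimited file marks field boundaries in the data itself, a fixed-width file needs the column widths supplied separately, and a file pattern selects several files to be read as one source.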

5) Funnel Stage

Continuous Funnel
Sort Funnel
Sequence Funnel
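
The three funnel types listed above can be compared with a small Python sketch. It is a simplification under stated assumptions, not the stage's actual behaviour. In particular, the continuous funnel is modelled here as taking one record from each input in turn, whereas a real continuous funnel gives no guaranteed output order. The sort funnel assumes each input is already sorted on the key. All names are hypothetical.

```python
import heapq
from itertools import chain, zip_longest

_MISSING = object()  # marks an input link that has run out of records


def sequence_funnel(*inputs):
    """Emit all records of the first input, then the second, and so on."""
    return list(chain(*inputs))


def sort_funnel(*inputs, key=None):
    """Merge inputs that are already sorted on key into one sorted output."""
    return list(heapq.merge(*inputs, key=key))


def continuous_funnel(*inputs):
    """Take one record from each input in turn, skipping exhausted inputs."""
    output = []
    for group in zip_longest(*inputs, fillvalue=_MISSING):
        output.extend(record for record in group if record is not _MISSING)
    return output


# Hypothetical input links, each already sorted.
link_a = [1, 4, 7]
link_b = [2, 3, 9, 10]

seq_out = sequence_funnel(link_a, link_b)
sort_out = sort_funnel(link_a, link_b)
cont_out = continuous_funnel(link_a, link_b)
```

Only the sort funnel preserves a global order, and only the sequence funnel preserves the order of whole inputs. The continuous funnel trades both for not having to wait on any one input.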

6) Copy
7) Filter

8) Transformer
9) Modify
10) Sort
11) Remove Duplicates
12) Aggregator
13) Change Capture
14) Join
15) Lookup
16) Merge
17) Pivot Enterprise
18) Sequence Job
19) Director
20) Administrator

Reference - ibm-datastage-reference-links

Sunday, 24 December 2017
