This post is for first-timers working on an SFDC CDC implementation in Informatica.
Informatica provides detailed implementation steps in the Help file, and I assume
that you have gone through it at least once.
Here I try to simplify the process and identify the focus areas.
There are two ways to implement SFDC Change Data Capture in Informatica:
1. Capture changed data continuously
2. Capture changed data for a specific period
Enabling Capture of Changed Data Continuously:
To set up a continuous CDC session, follow the steps below:
1. Set the CDC Time Limit property on the mapping tab. This is the time period
(in seconds) for which you want to capture changes from Salesforce.
2. Set the Flush Interval property on the mapping tab. This is the interval
(in seconds) at which the PowerCenter Integration Service captures changed Salesforce data.
Setting these two properties enables PowerCenter to capture data changes continuously
for a time period. To capture changed data for an infinite period of time, set the CDC Time Limit property to -1.
When the PowerCenter Integration Service runs a continuous CDC session, it first
reads all records and processes them with the row type insert. After the
PowerCenter Integration Service has read all source data, the CDC time limit
and flush interval begin.
The PowerCenter Integration Service completes the following tasks to capture changed
data for a continuous CDC session:
1. Reads all records created since the initial read and passes them to the next
transformation as rows flagged for insert.
2. Reads all records updated since the initial read and passes them to the next
transformation as rows flagged for update.
3. Reads all records deleted since the initial read and passes them to the next
transformation as rows flagged for delete.
After the PowerCenter Integration Service finishes reading all changed data, the
flush interval is reset. The PowerCenter Integration Service stops reading from
Salesforce when the CDC time limit ends.
The PowerCenter Help has a detailed description of how the Integration Service
processes a continuous CDC session.
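The timing described above can be pictured with a small Python sketch. This is only an illustration and not Informatica code: the function name `flush_offsets` is hypothetical, each capture is treated as instantaneous (in a real session the flush interval restarts only after the read finishes), and a capture that falls exactly on the time limit is assumed to still run.

```python
from itertools import count, islice
from typing import Iterator


def flush_offsets(cdc_time_limit: int, flush_interval: int) -> Iterator[int]:
    """Yield offsets, in seconds after the initial read, at which changes are captured.

    Hypothetical helper for illustration only; it models the two session
    properties, not the Integration Service itself.
    """
    if flush_interval <= 0:
        # The sketch needs a positive interval to make progress.
        raise ValueError("flush_interval must be positive")
    if cdc_time_limit == -1:
        # -1 means no time limit: keep capturing indefinitely.
        yield from count(flush_interval, flush_interval)
        return
    # Assumption: a capture landing exactly on the limit is still performed.
    yield from range(flush_interval, cdc_time_limit + 1, flush_interval)


# A 10-second time limit with a 3-second flush interval.
print(list(flush_offsets(10, 3)))
# No time limit: take only the first three captures from the endless stream.
print(list(islice(flush_offsets(-1, 5), 3)))
```

With a time limit of 10 and a flush interval of 3, the sketch captures changes at 3, 6 and 9 seconds and then stops; with -1 it never stops on its own.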
Enabling Capture of Changed Data for a Specific Period:
To enable change data capture for a specific time period, define the start and end
time of the period in the session properties. Start Time and End Time must be in the format YYYY-MM-DDTHH:MI:SS.SSSZ.
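If you generate these values from a script, a minimal Python sketch like the one below produces a string in that shape. The helper name `to_cdc_timestamp` is hypothetical, and the sketch assumes that SSS means milliseconds and that the trailing Z means UTC.

```python
from datetime import datetime, timezone


def to_cdc_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MI:SS.SSSZ.

    Hypothetical helper; assumes SSS is milliseconds and Z denotes UTC.
    """
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = moment.astimezone(timezone.utc)
    # strftime has no millisecond directive, so truncate microseconds by hand.
    return f"{utc:%Y-%m-%dT%H:%M:%S}.{utc.microsecond // 1000:03d}Z"


start_time = to_cdc_timestamp(datetime(2012, 3, 5, 14, 30, 0, 123456, tzinfo=timezone.utc))
print(start_time)  # 2012-03-05T14:30:00.123Z
```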
The PowerCenter Integration Service completes the following steps to capture changed data for a time-period based CDC session:
1. Reads all records created between the CDC start time and end time, and passes them to
the next transformation as rows flagged for insert.
2. Reads all records updated between the CDC start time and end time, and passes them to
the next transformation as rows flagged for update.
3. Reads all records deleted between the CDC start time and end time, and passes them to
the next transformation as rows flagged for delete.
Please go through the Rules and Guidelines for Processing a Time-Period Based
CDC Session before enabling a time-period based CDC session.
That said, I ran into some strange problems while trying to implement an SFDC CDC
session.
Issues and Workaround for SFDC CDC Sessions
While running a time-period based CDC session, the source records were getting
doubled on read. The Integration Service read exactly twice as many records as
I had inserted; i.e., when I added 10 records to the SFDC org, the Informatica
Integration Service read 20 records.
This problem occurred because of the three-step process:
1. Read all created records.
2. Read all updated records.
3. Read all deleted records.
For reading created records, the Integration Service uses CreatedDate. For example, [Select Id, LastName from Contact Where
(CreatedDate >= $StartDate AND CreatedDate < $EndDate)].
For reading updated records, the Integration Service uses LastModifiedDate.
For example, [Select Id, LastName from Contact Where
(LastModifiedDate >= $StartDate AND LastModifiedDate < $EndDate)].
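A newly created record also gets its LastModifiedDate set at creation time, so it matches both queries. As a result, all the newly created records were
fetched in both the “Created Records” group and the “Updated Records” group.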
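The doubling can be reproduced outside Informatica with a few lines of Python. This is only a sketch of the two date filters applied to made-up in-memory records; the function name `read_time_period_cdc` and the sample data are hypothetical, and the deleted-records step is left out because it does not contribute to the problem.

```python
from datetime import datetime


def read_time_period_cdc(records, start, end):
    """Mimic the created and updated reads as two independent date filters."""
    created = [("insert", r["Id"]) for r in records
               if start <= r["CreatedDate"] < end]
    updated = [("update", r["Id"]) for r in records
               if start <= r["LastModifiedDate"] < end]
    return created + updated


# Ten contacts created inside the window and never modified afterwards,
# so LastModifiedDate equals CreatedDate for each of them.
contacts = [
    {
        "Id": f"C{i:02d}",
        "CreatedDate": datetime(2012, 1, 1, 10, i),
        "LastModifiedDate": datetime(2012, 1, 1, 10, i),
    }
    for i in range(10)
]

rows = read_time_period_cdc(
    contacts, datetime(2012, 1, 1, 10, 0), datetime(2012, 1, 1, 11, 0)
)
print(len(rows))  # 20
```

Each of the 10 contacts appears once flagged for insert and once flagged for update, which gives the 20 rows.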
The simplest way to avoid this is to remove the start and end time from the
session properties (i.e., disable time-period CDC) and add the time period to
the filter criteria instead. For example: (LastModifiedDate >= $StartDate AND LastModifiedDate < $EndDate).
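As a rough illustration of why a single LastModifiedDate filter reads each record only once, here is a minimal Python sketch over made-up in-memory records. The function name `read_with_filter` and the sample data are hypothetical; it is not how the Integration Service evaluates the filter.

```python
from datetime import datetime


def read_with_filter(records, start, end):
    """Return the Id of each record whose LastModifiedDate is in [start, end)."""
    return [r["Id"] for r in records if start <= r["LastModifiedDate"] < end]


start = datetime(2012, 1, 1, 10, 0)
end = datetime(2012, 1, 1, 11, 0)

# Ten contacts created inside the window.
new_contacts = [
    {
        "Id": f"C{i:02d}",
        "CreatedDate": datetime(2012, 1, 1, 10, i),
        "LastModifiedDate": datetime(2012, 1, 1, 10, i),
    }
    for i in range(10)
]
# One older contact that was only updated inside the window.
old_contact = {
    "Id": "C99",
    "CreatedDate": datetime(2011, 12, 1, 9, 0),
    "LastModifiedDate": datetime(2012, 1, 1, 10, 30),
}

ids = read_with_filter(new_contacts + [old_contact], start, end)
print(len(ids))  # 11
```

Both new and updated records come back exactly once. Note that the sketch no longer tells you which rows are inserts and which are updates.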
That said, there are always different ways to approach a problem, and the right
solution depends on the requirement.