Data taps are one of the great significant features in SQL Server in the Data analysis during runtime. In a nutshell, they enable us, at runtime, to choose a given path from a data flow and capture a copy of the data at that specific point of the data flow in a .csv file on a given execution instance of the package.
Let’s break it down further:
Problem
When we develop any SSIS package, to view the data passing through the pipeline we add Dataviewer, but once we deploy a package there was no built-in support for data tapping to analyze the data for troubleshooting. In SQL2012 and later versions this can be done using data taps.
What is a Data Tap?
A Data Tap is an output of the data from one of the data flow paths in your package. Data Taps are like Data Viewers. The difference is that a Data Tap exports the data to a file from an already deployed package.
How to add a Data Tap?
To get started, make sure you are using SQL Server 2012 or Later version and have deployed your package using the new Project Deployment Model.
After you deploy the package to the server, you need to execute T-SQL scripts against the SSISDB database to add data taps before executing the package.
1. Create an execution instance of a package by using the catalog.create_execution (SSISDB Database) stored procedure.
2. Add a data tap by using either catalog.add_data_tap or catalog.add_data_tap_by_guid stored procedure.
3. Start the execution instance of the package by using the catalog.start_execution (SSISDB Database).
Instead of typing the SQL statements, you can generate the execute package script by performing the following steps:
1. Right-click Package.dtsx and click Execute.
2. Click Script toolbar button to generate the script.
3. Now, add the add_data_tap statement before the start_execution call.
In our example following is the script generated
Declare @execution_id bigint
EXEC [SSISDB].[catalog].[create_execution] @package_name=N’Package.dtsx’,
@execution_id=@execution_id OUTPUT, @folder_name=N’DataTap Example’,
@project_name=N’DataTap’, @use32bitruntime=False, @reference_id=Null
Select @execution_id
DECLARE @var0 smallint = 1
EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id, @object_type=50, @parameter_name=N’LOGGING_LEVEL’, @parameter_value=@var0
EXEC [SSISDB].[catalog].[start_execution] @execution_id
GO
There are 2 options for adding the datatap.
1. add_data_tap
To supply the values for all of the parameters, you’ll need the package in design mode available to you. Here is the add_data_tap command we’ll insert before the start_execution line.
EXEC [SSISDB].[catalog].add_data_tap @execution_id = @execution_id,
@task_package_path = ‘\Package\DFT_ExportCurrency’,
@dataflow_path_id_string = ‘Paths[OLE_SRC_Currency.OLE DB Source Output]’,
@data_filename = ‘output.txt’
@task_Pacakge_path is obtained from the DFT property name PackagePath.
@Dataflow_path_id_string value is obtained from the propertyname IdentificationString.
The datatap output file will get saved in the folder path \Microsoft SQL Server\110\DTS\DataDumps.
2. add_data_tap_by_guid
This function takes the task ID instead of the task package path. The command in my example is
EXEC [SSISDB].[catalog].add_data_tap_by_guid @execution_id = @execution_id,
@dataflow_task_guid = ‘{0210C9CC-635B-4038-AFC4-54220909F4F3}’,
@dataflow_path_id_string = ‘Paths[OLE_SRC_Currency.OLE DB Source Output]’,
@data_filename = ‘output.txt’
Listing all the Datataps
You can also list all the data taps by using the catalog.execution_data_taps view.
Select * from [SSISDB].[catalog].execution_data_taps where execution_id=@execid
Performance Consideration
Enabling verbose logging level and adding data taps increase the I/O operations performed by your data integration solution. Hence, add data taps should be used only for troubleshooting purposes and when the data volume is less.
Conclusion:
Troubleshooting data issues can get hairy at times and data taps are an extra tool in our belt that can help in cases where logging (which has also improved a lot in SSIS 2012) and dumps may not give us the required level of information.