Import variable record length CSV or any delimited file to a database using SSIS – SQL Server


Delimited text files are the method to store user data  in a file system in the past and even now. These files are easy to create and simple to transfer from one system to another system without trouble. In modern data technology we use various data base management system that keeps the data in tabular format and the latest Big data evolution keeps data in key and value pair or in No-SQL database. Now, we have got delimited files and database system to store. But, We also need to migrate data from the text file to database using data export utility. These utilities are very well designed to handle the delimited file with single data structure (ie fixed record length).

How do we import  variable record length delimited file to a SQL database?

In this demo, I will be using VB.net code to read variable length CSV file data and store it in SQL tables.

Scenario:

I have a CSV  file with ATM transactions of bank customers. This file contains two types of record with variable length. Customer information and ATM transactions are two types of record in this file. Now, I need to upload this file to tables tblCustomer and tblTransactions in SQL Server database.

We already know that we have to use script task to write .net code to read the file and split the line with delimited character and upload in to the tables respectively. I do construct the SQL insert statement in side the script task for data upload.

Sample CSV File

pic0Download CSV file (Bank_AX.TXT)

Table structure

tblCustomer

CREATE TABLE [dbo].[tblCustomer](
[Rec_ID] [varchar](50) NULL,
[AC_No] [varchar](50) NULL,
[Ac_Name] [varchar](50) NULL,
[TransactionDate] [varchar](50) NULL
) ON [PRIMARY]

tblTransaction

CREATE TABLE [dbo].[tblTransaction](
[Rec_ID] [varchar](50) NULL,
[Ac_No] [varchar](50) NULL,
[Transaction_Type] [varchar](50) NULL,
[Amount] [varchar](50) NULL,
[Transaction_Time] [varchar](50) NULL
) ON [PRIMARY]

Script

This image shows partial script. You can download the full code below this image.

pic1Full source code for reading variable length records is available for download here.

Output

Data has been uploaded to tblCustomer and tblTransaction tables after executing this script.

pic3

Removing Process from the memory – SSIS


We often use Task Manager to kill the unresponsive or unwanted process in the memory. I had a situation where I was using excel interop to automate some task in SSIS script task. But at the end of execution excel process was staying in the memory even I close the object. But, later I found the proper solution to remove excel object from memory without killing the process. But, I will talk about that method in my next blog post. In this blog post, I am going to give you the code snippet for killing any process sitting in the memory using VB.NET in SSIS.

Note: This blog post is for just information purpose. I am not responsible for any damage and use at you own risk.

code snippet

pic1

Download Code

Package Configuration Editor – SSIS


We already know that we can store configuration details for SSIS packages in different location and map the package to read the value using BIDS. Please refer this link to learn about Integration Services Configuration. It helps us to deploy a package in different environments by just copy the source package and the configuration to the destination environment without trouble. In some case, we may need to store the configuration and package in different locations and want to map the location of the configuration details without editing the package using BIDS. In this can, we have a tool to accomplish this task.

There is a tool that make our life easier in updating the configuration location details without editing the package using typical BIDS. This is an open-source tool available in codeplex. This works well for SSIS 2008 and 2008 R2.

Download Package Configuration File Editor

Usage:

Step 1: Install this utility in your machine

Step 2: Open the utility and provide the necessary value as shown in this image.

pic1Just 4 steps. Now, we are done with mapping the configuration location for the package  without pain.

SSIS Data Types Cheat Sheet – Integration Services


Data types are used to store the value of an expression in a programming language. There are different data types available in SSIS to handle different varieties of data that are coming from outside world or inside the package.

Different Data Types available in SSIS:

pic1

I have made this SSIS data type details in a PDF document for my personal use. But, I would like to share this with my blog readers. So, please feel free to download and share your feed back.

Download SSIS Data type PDF Document

Balanced Data Distributor – SSIS 2012


Balanced Data Distributor Task is used to distribute the incoming data to the destination uniformly. This task can send the data to one or more output pipelines to destination using threading. So, we can get better performance in the data loading from slow data source like flat file and etc.

This task is available for SQL Server 2008/R2 and SQL Server 2012.

Download Balanced Data Distributor (BDD) Task for SQL Server 2012.

pic1Implementation

Prerequisite: Install the BDD component in your system before implementing this demo.

In this demo, I will be using AdventureWorks2012 database table “SalesOrderDetail” to a dummy destination to show how this task is distributing the incoming data to destination.

Step 1: Create a package

Step 2: Add a Data flow task. Add a ADO.net source task and connect to Adventureworks2012 database.

Step 3: Add BDD task and connect ADO.Net source with it. There is no special settings for BDD task for data distribution. It takes care of it by itself.

pic2Step 4: Add 3 Union All tasks to the Data flow and connect the BDD output to each task.

pic3Step 5: Execute this package.

pic5As you can see the execution, the total incoming rows are 121317 and the BDD distributes the rows to three output equally as (39888, 39888 and 41541) rows. Usually, we should connect BDS output to same destination finally.