Showing posts with label SSIS. Show all posts
Showing posts with label SSIS. Show all posts

20/01/2011

SSIS and distinct union

 

How to join data from several sources knowing that there are or might be duplicates in both sources?

In a SQL query one can use UNION (instead of UNION ALL) to merge several sources and to remove duplicates.

e.g. (knowing that both sources have same columns)

SELECT *
FROM SourceA
UNION
SELECT *
FROM SourceB

In SSIS there’s no such component to accomplish this task immediately.

To accomplish the same behavior in SSIS as in a SQL query, one should combine a “UNION ALL”-component with a “SORT”-component.

The “SORT”-component provides an option to remove the duplicate rows.

image

In our example above, edit the “SORT”-component to specify the sorting order based on the column or columns that uniquely identifies a record (for example the record-ID column).

In the SORT-editor (the normal editor or the advanced editor) one can select the option to remove the rows with duplicate sort values as shown below.

image image
Normal Editor Advanced Editor

Once this property is set to true, the combination of the “UNION ALL”-component and the “SORT”-component achieves the same thing as our “UNION” query, so your output from the “SORT”-component will no longer contain duplicate rows.

12/01/2011

Exceeded storage allocation. The server response was …

 

I was trying to send a mail via SSIS using the smtp server of windows server 2008, when I got the following error:

“Exceeded storage allocation. The server response was: 4.3.1 Message size exceeds fixed maximum message size.”

To fix this problem, you have to do the following:

Open IIS 6.0 Manager, expand your computer’s name, scroll down to the SMTP server, right-click it and select “Properties” from the context menu.

image

Click the “Messages” tab and set a maximum message size. Alternatively you could allow any size messages by deselecting the “Limit message size to (KB):” checkbox.

image

22/12/2010

SSIS Data Load and geography data-type


When loading and saving data to a SQL Server table containing a geography field, execution results in an error (0xC0209029; DTS_E_PROCESSINPUTFAILED). It turns out that there's no built-in support in SSIS for spatial data.
After some googling I ended up with this post suggesting a workaround for the problem. In short the workaround is to first remove the spatial data column, then import your data, and finally recreate your spatial data column.
In SSIS this can be done as following:
1. In the control flow pane, add a 'Execute SQL Task' before and after the 'Data Flow Task'
ssis-geography
2. In the first task ('Remove spatial column') add a SQL statement to remove your column:
IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'yourTableName' AND TABLE_SCHEMA = 'yourSchemaName' AND COLUMN_NAME = 'yourSpatialColumnName')
BEGIN
ALTER TABLE yourSchemaName. yourTableName DROP COLUMN yourSpatialColumnName
END
3. In the Data Load task, do what you need to do to load and save your data (without the spatial data column !)
4. In the last task add a SQL statement to add your spatial column again.
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'yourTableName' AND TABLE_SCHEMA = 'yourSchemaName' AND COLUMN_NAME = 'yourSpatialColumnName')
BEGIN
ALTER TABLE yourSchemaName. yourTableName ADD yourSpatialColumnName geography NULL
END