Last week, I was working on a new SSIS package. I had to perform an extract of an Excel file into a staging table.
After playing around, I suddenly noticed that the value of some of my columns were NULL, although some fields of that column contained data.
When looking into this problem, I found the following:
The Excel driver reads a certain number of rows (by default, 8 rows) in the specified source to guess at the data type of each column. When a column appears to contain mixed data types, especially numeric data mixed with text data, the driver decides in favor of the majority data type, and returns null values for cells that contain data of the other type. (In a tie, the numeric type wins.) Most cell formatting options in the Excel worksheet do not seem to affect this data type determination.
More info on: http://msdn.microsoft.com/en-us/library/ms141683.aspx
When I checked my Excel file, it turned out that my first 8 rows were empty
How can you fix this problem?
You can modify this behavior of the Excel driver by specifying Import Mode. To specify Import Mode, add IMEX=1 to the value of Extended Properties in the connection string of the Excel connection manager in the Properties window.
In my case, I had the following connection string:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:UserTempScrambling Requests_Requests_Inventory.xls;Extended Properties=”EXCEL 8.0;HDR=NO;IMEX=1“;
In this state, the registry setting ImportMixedTypes=Text will be noticed. This forces mixed data to be converted to text.
After this modification, my values were shown correctly.