Removing leading zeros in SSIS


I process data from a legacy system with SSIS before importing it into a SQL Server 2008 database.
A currency field is formatted like this:
000000xxx.xx
I need to remove the leading zeros. Note that the actual currency value does not have a fixed number of digits. For instance, it could be xxxxx.xx, x.xx, or xxxxxxxxxxx.xx.
I found this answer, but the derived column tool offers only a limited set of string functions, like the Access formula wizard.
So my questions are:
How can I use PATINDEX in the DTS derived column tool?
Otherwise, how can I remove leading zeros from an input column?

How are you querying the data? I suppose you are using direct table access in your OLE DB Source component, right? I don't advise that, because a small change to the table may break your package.
I would write a view containing the SELECT from your source table, use the PATINDEX function in that SELECT, and have the package select from the view instead of the table.
This is a good approach because if you need to make a minor change, you can change the view instead of the package.
If you don't want to write a view, fine: instead of setting "data access mode" to "table or view", set it to "sql command" and write your SQL directly in the package.
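The PATINDEX idea above could be sketched roughly as follows, assuming the legacy data can be queried with T-SQL (PATINDEX is a SQL Server function). The inline sample rows and the column name stringCurrency are placeholders for this sketch; in practice the innermost SELECT would read from your own table, and the statement could sit in a view or in the "sql command" box. It also assumes every value contains a decimal point, as in the 000000xxx.xx format.

```sql
-- Sketch only: the sample rows stand in for the real source table.
SELECT
    stringCurrency,
    -- Put the units zero back when trimming leaves a bare ".xx".
    CASE WHEN trimmed LIKE '.%' THEN '0' + trimmed ELSE trimmed END AS leadingZerosDropped
FROM (
    SELECT
        stringCurrency,
        -- PATINDEX('%[^0]%', s) is the position of the first character that is not '0'.
        SUBSTRING(stringCurrency,
                  PATINDEX('%[^0]%', stringCurrency),
                  LEN(stringCurrency)) AS trimmed
    FROM (
        SELECT '000000111.11' AS stringCurrency
        UNION ALL SELECT '0.22'
        UNION ALL SELECT '00000000000000000006.66'
    ) AS src
) AS t;
-- 000000111.11 -> 111.11, 0.22 -> 0.22, 00000000000000000006.66 -> 6.66
```

The result is still a string; casting it to a numeric type afterwards is a separate step.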

Are you not able to cast the data to an equivalent numeric type in the source system? That should be a quick way to drop the leading zeros:
SELECT CAST(myCurrency AS decimal(18,2)) AS leadingZerosDropped FROM myTable
Otherwise, the dead simple SSIS way to do it is to use a Data Conversion transformation and cast the column to a numeric type (DT_CY/Currency and DT_NUMERIC/Numeric both worked just fine).
Source query
SELECT '000000111.11' AS stringCurrency
UNION ALL SELECT '0.22'
UNION ALL SELECT '03.33'
UNION ALL SELECT '004.44'
UNION ALL SELECT '0005.55'
UNION ALL SELECT '00000000000000000006.66'
Data Conversion transformation
I created a new column, currencyCurrency, which is stringCurrency with the data type currency [DT_CY] applied.
Results
stringCurrency            currencyCurrency
000000111.11              111.11
0.22                      0.22
03.33                     3.33
004.44                    4.44
0005.55                   5.55
00000000000000000006.66   6.66
I fully support not using table access mode, and in general I'm in favor of pushing work onto the source system, but this seems like a low-effort option if the source system doesn't allow for the conversion or if you would otherwise have to resort to mucking about with strings.

Related

Reading negative numbers in a column

I'm using SSIS to separate good data from unusable data. To do that, I used derived columns, a script task, and a conditional split with certain conditions assigned. One of the conditions I need to apply is that none of the numbers in one column can be negative. I'm guessing that the best way to solve this is a conditional split, but I cannot get it to work. I'm new to SSIS, so any help would be appreciated.

You'd have an Expression like
[MyCaseSensitiveColumnName] < 0
and then name the output path something like BadData_NegativeValue
From the comments:
that is what I did before, but I'm getting an error saying that The data types "DT_WSTR" and "DT_I4" are incompatible for binary operator ">"
That error message indicates that you are attempting to compare a Unicode string (DT_WSTR) with an integer (DT_I4), which the expression language does not allow.
To resolve this type incompatibility, you first need to convert the value of MyCaseSensitiveColumnName from DT_WSTR to an integer.
I'd likely add a Derived Column Component to my data flow and create a new column called MyCaseSensitiveColumnNameAsInteger with an expression like
(DT_I4) [MyCaseSensitiveColumnName]
Now, that may be perilous depending on the quality of your source data. I don't know why you are pulling numeric data in as a string. If there could be non-whole numbers in the data set, we will need to check for them before making the cast. If there are NULLs in the data set, those too may cause issues.
That would result in our conditional split check becoming
[MyCaseSensitiveColumnNameAsInteger] < 0

Syntax error in date in query expression for non-date fields

I'm having trouble building a query in Access 2013. The database isn't mine and the only thing I really have control over is this query. There is a table, I'm pulling 7 fields from it and eventually adding an 8th field to the query to do some string manipulation.
However, I keep getting a "Syntax error in date in query expression 'fieldname'." error whenever I click the arrow to sort the fields. The odd thing is that these errors pop up when sorting non-date fields. When sorting the date field, I get "Syntax error (missing operator) in query expression 'Release Date'."
This happens after a fresh build. I have no WHERE conditions, just SELECT and FROM. Ideas?
Here's the sql query, though I'm mainly working in the query design view:
SELECT Transmissions.[Job#], Transmissions.[Part#], Transmissions.TransmissionSN, Transmissions.Status, Transmissions.[Release Date], Transmissions.[Build Book Printed], Transmissions.[ID Tags Required]
FROM Transmissions;

Well... it seems you are the lucky inheritor of a poorly designed database.
Using special characters in a field name is just asking for trouble. And you've found what that trouble is.
Access uses the # sign to delimit date literals in queries, such as:
dtSomeDate = #2/20/2017#
You surround the date with # signs.
In your case, the query thinks [Job#] and [Part#] are trying to wrap dates. Of course, that's not the case, and thus it fails.
You can try a couple of workarounds. (I leave it to you to experiment.)
1) You can try to rename the problem fields within your query, so that:
Transmissions.[Job#] becomes Transmissions.[Job#] AS JobNum
and
Transmissions.[Part#] becomes Transmissions.[Part#] AS PartNum
2) You can try to copy the [Transmissions] table to a new table that you create that does not have the naming problems.
3) Export the [Transmissions] table to a CSV file and re-import it into a new table (or possibly a new database) without the naming problems.
Here is a link to a Microsoft article that tells you why to avoid special characters in Access:
Big Bad Special Chars.
Hope that puts you on the right track. :)
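As a rough illustration of workaround 1, the query from the question could be written with aliases for the awkwardly named fields. The alias names below (JobNum, PartNum, ReleaseDate, and so on) are made up for this sketch, and whether aliasing alone clears the sort error is something to verify in your own database. Access SQL has no comment syntax, so the statement is shown without inline comments.

```sql
SELECT Transmissions.[Job#] AS JobNum,
       Transmissions.[Part#] AS PartNum,
       Transmissions.TransmissionSN,
       Transmissions.Status,
       Transmissions.[Release Date] AS ReleaseDate,
       Transmissions.[Build Book Printed] AS BuildBookPrinted,
       Transmissions.[ID Tags Required] AS IDTagsRequired
FROM Transmissions;
```

The fields containing spaces are aliased too, on the assumption that the "missing operator" message for 'Release Date' comes from the space in that name.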

Typically, this means that the field names are missing or misspelled.
Try running this to see:
SELECT * FROM Transmissions;

Left Trim 'abc' and right trim 'xyz'

A vendor writes user-defined data into a single column as XML. I need to write a SQL query (SQL Server 2008 / 2012 / 2014) that pulls data from that column for a third-party application. Here's an example of what is in the column:
<udf><udf_date_ppe>15/12/2019</udf_date_ppe><udf_text_ppn>300965994</udf_text_ppn><udf_date_ved>8/12/2016</udf_date_ved><udf_text_vtno>417 - Working holiday</udf_text_vtno><udf_text_ppi>Taiwan</udf_text_ppi></udf>
The problem is that I need to grab the actual data, not the XML, and the elements aren't always stored in the same order, so I have to work out dynamically how much to trim on the left and right. For example, I want only the date in between these tags:
<udf_date_ppe>15/12/2019</udf_date_ppe>
but I don't know how many characters come before it. Once I figure out how to do one, I can replicate it for the other fields. This is only one user-defined field, but at least the XML isn't going to change. I only have view access to the server.
Bit of a pain I know but any help is appreciated.

If you just want to grab the 10 characters appearing inside the <udf_date_ppe> tags, you can use SQL Server's string functions and the following query:
SELECT SUBSTRING(col, CHARINDEX('<udf_date_ppe>', col) + 14, 10)
FROM yourTable
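This assumes that there is only a single <udf_date_ppe> tag in the column and that the date is always exactly 10 characters long. A shorter date, such as 8/12/2016 in the sample, would come back with a trailing < attached.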
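When the value length varies (the sample contains both 15/12/2019 and 8/12/2016), one option is to locate the closing tag as well and read whatever sits between the two. The sketch below shows that, plus an alternative that casts the text to xml and addresses the element by name. It assumes the column is a varchar/nvarchar holding well-formed XML; the literal stands in for yourTable.col, and the output column names are made up for illustration.

```sql
-- Sketch only: the literal below stands in for yourTable.col.
SELECT
    -- String approach: read everything between the opening and closing tag.
    -- The CASE returns NULL instead of raising an error when a tag is missing.
    CASE WHEN openPos > 0 AND closePos > openPos
         THEN SUBSTRING(col,
                        openPos + LEN('<udf_date_ved>'),
                        closePos - openPos - LEN('<udf_date_ved>'))
    END AS date_ved_by_string,
    -- XML approach: let SQL Server parse the document and pick the element by name.
    CAST(col AS xml).value('(/udf/udf_date_ved)[1]', 'varchar(10)') AS date_ved_by_xml
FROM (
    SELECT col,
           CHARINDEX('<udf_date_ved>', col)  AS openPos,
           CHARINDEX('</udf_date_ved>', col) AS closePos
    FROM (
        SELECT '<udf><udf_date_ppe>15/12/2019</udf_date_ppe><udf_date_ved>8/12/2016</udf_date_ved></udf>' AS col
    ) AS yourTable
) AS t;
-- Both columns return 8/12/2016 for the sample value.
```

Neither variant depends on the order of the elements, and both only read data, so view access should be enough.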

How to store a String (length > 255) from a query?

I'm using Access 2000 and I have a query like this:
SELECT function(field1) AS Results FROM mytable;
I need to export the results as a text file.
The problem is:
function(field1) returns a fairly long string (more than 255 characters) that cannot be stored in full in the Results field created by this query.
When I export this query as a text file, I can't see the entire string (it is truncated).
Is it possible to cast function(field1) so that it returns a Memo-type field containing the string?
Something like this:
SELECT (MEMO)function(field1) AS Results FROM mytable;
Do you know of any other solutions?

There is an official Microsoft support page on this problem:
ACC2000: Exported Query Expression Truncated at 255 Characters
They recommend that you append the expression data to a table that has a memo field, and export it from there. It's kinda an ugly solution, but you cannot cast parameters to types in MS Access, so it might be the best option available.
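The append approach described above could be sketched like this. ExportResults is a hypothetical table name, and function(field1) is the placeholder from the question. Access runs one statement per query, so these would be two separate queries (the table could equally be created in the table designer), and Access SQL has no comment syntax, so the statements carry no inline comments.

```sql
CREATE TABLE ExportResults (Results MEMO);

INSERT INTO ExportResults (Results)
SELECT function(field1) FROM mytable;
```

The text export would then be run against ExportResults instead of the original query.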

I don't know how to do quite what you're hoping for (which makes sense), but a possible alternative is to create 2 or 3 fields (or separate queries), extract a different portion of the text into each, and then concatenate them after retrieval.
pseudo: concat((chars 1-255) & (chars 256-510) & (chars 511-etc...))
Edit: it's odd that a string longer than 255 characters is stored in a field that isn't a Memo. What's up there? Another alternative, if you have access to the database, is to change the field type. (Back up the database first!)

How do I get SSIS Data Flow to put '0.00' in a flat file?

I have an SSIS package with a Data Flow that takes an ADO.NET data source (just a small table), executes a select * query, and outputs the query results to a flat file (I've also tried just pulling the whole table and not using a SQL select).
The problem is that the data source pulls a column that is a Money datatype, and if the value is not zero, it comes into the text flat file just fine (like '123.45'), but when the value is zero, it shows up in the destination flat file as '.00'. I need to know how to get the leading zero back into the flat file.
I've tried various datatypes for the output (in the Flat File Connection Manager), including currency and string, but this seems to have no effect.
I've tried a case statement in my select, like this:
CASE WHEN columnValue = 0 THEN
'0.00'
ELSE
columnValue
END
(still results in '.00')
I've tried variations on that like this:
CASE WHEN columnValue = 0 THEN
convert(decimal(12,2), '0.00')
ELSE
convert(decimal(12,2), columnValue)
END
(Still results in '.00')
and:
CASE WHEN columnValue = 0 THEN
convert(money, '0.00')
ELSE
convert(money, columnValue)
END
(results in '.0000000000000000000')
This silly little issue is killin' me. Can anybody tell me how to get a zero Money datatype database value into a flat file as '0.00'?

I was having the exact same issue, and soo's answer worked for me. I sent my data into a derived column transform (in the Data Flow Transform toolbox). I added the derived column as a new column of data type Unicode String ([DT_WSTR]), and used the following expression:
Price < 1 ? "0" + (DT_WSTR,6)Price : (DT_WSTR,6)Price
I hope that helps!

Could you use a Derived Column to change the format of the value? Did you try that?

I used the advanced editor to change the column from double-precision float to decimal and then set the Scale to 2.

Since you are exporting to text file, just export data preformatted.
You can do it in the query or create a derived column, whatever you are more comfortable with.
I chose to make the column 15 characters wide. If you import into a system that expects numbers those zeros should be ignored...so why not just standardize the field length?
A simple solution in SQL is as follows:
select
cast(0.00 as money) as col1
,cast(0.00 as numeric(18,2)) as col2
,right('000000000000000' + cast( 0.00 as varchar(10)), 15) as col3
go
col1 col2 col3
--------------------- -------------------- ---------------
.0000 .00 000000000000.00
Simply replace 0.00 with your column name, and don't forget to add the FROM table_name clause, etc.
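If the zero padding is not wanted, a variation on the same idea is to convert the value to text in the query, so that the flat file receives a string and not a number. This is only a sketch: the inline sample values stand in for the real table, columnValue is the name used in the question, and columnValueText is a made-up output name. One likely reason the CASE attempts in the question had no effect is that a CASE expression returns the highest-precedence type among its branches, so the '0.00' literal is converted back to a numeric type.

```sql
-- Sketch only: the sample values stand in for the real money column.
SELECT
    columnValue,
    -- decimal(12,2) fixes two decimal places; converting to varchar keeps the leading zero.
    CONVERT(varchar(20), CAST(columnValue AS decimal(12,2))) AS columnValueText
FROM (
    SELECT CAST(0 AS money) AS columnValue
    UNION ALL SELECT CAST(123.45 AS money)
    UNION ALL SELECT CAST(0.5 AS money)
) AS src;
-- columnValueText: 0.00, 123.45, 0.50
```

In the data flow, the new column would then arrive as a string type, not as currency.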

It is best to use a derived column and check the condition there as well:
pricecheck <=0 ? "0" + (DT_WSTR,10)pricecheck : (DT_WSTR,10)pricecheck
An alternative is to use a VB script.

Ultimately what I ended up doing was using the FORMAT() function.
CAST(FORMAT(balance, '0000000000.0000') AS varchar(30)) AS "balance"
This does have some significant CPU performance impact (often at least an order of magnitude) due to the way SQL Server implements that function, but nothing worked easier, more correctly, or more consistently for me. I was working with less than 100,000 rows and the package executes no more than once an hour. Going from 100ms to 1000ms just wasn't a big deal in my situation.
The FORMAT() function returns an nvarchar(4000) by default, so I also cast it back to a varchar of appropriate size since my output file needed to be in Windows-1252 encoding. Transcoding text is much more obnoxious in SSIS than it has any right to be.
