BCP CHAR value to Snowflake

I am trying to create a BCP file with a | delimiter and then load it into a Snowflake table.
Issue:
In SQL Server there are columns defined as CHAR(4) that hold values like "sss".
When I BCP the data out, the value is padded to a length of 4 ("sss ") and loaded into Snowflake that way.
Because of this our reports are failing: they filter with something like WHERE column = "SSS", but due to the trailing space in Snowflake the correct rows are not showing up.
We do not want to change our reports. So, is there a way that BCP can handle the padding or trimming of these columns?
Note that there are 24 tables, each with around 130+ columns, so I can't go and put TRIM functions on every CHAR column.

If your BCP file is maintaining the trailing space, then Snowflake will retain it too, as long as the field is enclosed by a " or ' and FIELD_OPTIONALLY_ENCLOSED_BY is set to match. You may also want to make sure your TRIM_SPACE option is correctly set in the file format definition for your COPY INTO command.
If your BCP file isn't maintaining the space and you can't figure out how to get that to work, you could force the space back in during the COPY INTO command with some string functions in your SELECT, or you could create a view for your report that applies the same string functions so the report still has the space it expects.
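For the trimming route, a minimal sketch of the file format options in the COPY INTO (the table, stage, and file names here are placeholders):

COPY INTO my_table
FROM @my_stage/my_file.dat
FILE_FORMAT = (
    TYPE = CSV
    FIELD_DELIMITER = '|'
    TRIM_SPACE = TRUE  -- strips leading/trailing spaces from each field on load
);

With TRIM_SPACE = TRUE the padded "sss " arrives in Snowflake as "sss", so the reports' equality filters work unchanged.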

So, is there a way that BCP can handle the padding or trimming of these columns?
Yes, but not by some switch or option. The correct way to handle this is to set your datatypes up front. As someone mentioned in the comments on your question, the query that creates your BCP output should use VARCHAR(4) instead of CHAR(4). BCP is giving you exactly what you asked of it; the way to avoid the whitespace is to use VARCHAR.
Seems like a fairly quick find-and-replace against the scripted-out query objects would work fine, but you know your situation best.
Additionally, TRIM won't work, FYI. Even if the value of the field is only "SSS" (as in your example), if the result/column is defined as CHAR(4) you will get 4 bytes of data, with a blank in the 4th position since you only had 3 bytes of data. TRIM does work during the query, but the padded " " you are getting is put there by the copy out. The way to correct this is to set your data types as you need up front.
Unless someone knows of a better way in Snowflake (I'm not familiar with it), the only other option is to manipulate the file between SQL Server and Snowflake, replacing " |" with "|"... but... blech.
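One way to attach such a query to BCP is its queryout mode. A sketch of what that might look like (server, database, table, and column names are placeholders):

bcp "SELECT CAST(col1 AS VARCHAR(4)) AS col1, col2 FROM mydb.dbo.mytable" queryout mytable.dat -c -t "|" -S myserver -T

The CAST to VARCHAR drops the CHAR padding before the data ever reaches the file, so nothing needs trimming on the Snowflake side.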

This is a known "issue" with BCP. The "solution" is to use the queryout option, which means you must include a query with every export. But the data are the way they are.
E.g.: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/88c258fe-d1a6-4f3a-9dac-40388d04e9c7/remove-space-in-columns-on-bcp-out?forum=transactsql
But this is really a Snowflake problem, because Snowflake has its own default CHAR comparison semantics.
You get a warning in the documentation under String & Binary Data Types, but that doesn't tell the whole truth.
The following, executed on Oracle (and apparently MSSQL? MySQL?), will select the 'aaa' row:
CREATE TABLE C AS SELECT CAST('aaa ' AS CHAR(4)) t FROM DUAL;
SELECT * FROM C WHERE t = 'aaa';
but won't on Snowflake, unless you create the column with a collation:
CREATE OR REPLACE TABLE C (t CHAR(4) COLLATE 'en_US-rtrim');
INSERT INTO C VALUES('aaa ');
SELECT * FROM C WHERE t = 'aaa';
Unfortunately, you can't ALTER the collation after creation, which would have been convenient after a COPY INTO <table>.
PS: Mike Walton's answer is better, TRIM_SPACE is much cleaner than COLLATE.

Related

MS SQL Read value with apostrophe from table and save it in another table

I am reading a value containing an apostrophe from a table, building a dynamic query with it, and then running a stored procedure to save it in another table. This works fine without an apostrophe but throws an error when the value contains one.
Select @arguments = argument from Mytable
e.g.
set @sql = 'exec nameOfSP ' + @arguments
The @arguments value comes from the database.
Sample @arguments value: '612f0', 'This is an example second string'
Yes, I know and agree that this is a very bad code smell, so the question is not about the design (which unfortunately can't be changed) but about the best possible solution in the current scenario.
Is there possibly a solution involving encoding?
If there is a possibility of a quote coming through in your arguments, do something like this:
set @sql = 'exec nameOfSP ' + REPLACE(@arguments, '''', '''''');
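To see what the doubling does, here is a standalone illustration (not the full dynamic call):

DECLARE @s varchar(50) = 'It''s an example';
SELECT REPLACE(@s, '''', '''''');
-- returns: It''s an example
-- every single quote is doubled, which is how T-SQL escapes a quote inside a string literal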
"the question is not about design (which unfortunately couldn't be changed)" seems like someone is going for the risk of saving in design that will cost a lot after... if you really must use dynamic sql like this you can use replace on ' to '' (that's right, just double it).
However, I must say that this is not a solution to your problem in any way, it's only a workaround.
You should do whatever you can to change the desing.
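If the shape of the arguments is known, a safer pattern is sp_executesql with real parameters instead of string concatenation (the parameter names and types below are hypothetical, chosen to match the sample values):

DECLARE @sql nvarchar(200) = N'exec nameOfSP @p1, @p2';
EXEC sp_executesql @sql,
    N'@p1 varchar(10), @p2 varchar(100)',
    @p1 = '612f0',
    @p2 = 'This is an example second string';

Because the values travel as parameters rather than being spliced into the SQL text, apostrophes need no escaping at all.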

How to store a String (length > 255) from a query?

I'm using Access 2000 and I have a query like this:
SELECT function(field1) AS Results FROM mytable;
I need to export the results as a text file.
The problem is:
function(field1) returns a fairly long string (more than 255 characters) that cannot be entirely stored in the Results field created by this query.
When I export this query as a text file, I can't see the string in its entirety (it's truncated).
Is it possible to cast function(field1) so it returns a Memo-type field containing the string?
Something like this:
SELECT (MEMO)function(field1) AS Results FROM mytable;
Do you know of any other solutions?
There is an official Microsoft support page on this problem:
ACC2000: Exported Query Expression Truncated at 255 Characters
They recommend that you append the expression data to a table that has a Memo field and export it from there. It's kinda ugly, but you cannot cast parameters to types in MS Access, so it might be the best option available.
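A minimal sketch of that approach (tmpExport is a hypothetical table whose Results column is of the Memo type):

INSERT INTO tmpExport (Results)
SELECT function(field1) AS Results FROM mytable;

Then export tmpExport instead of the original query; the Memo field survives the text export without truncation.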
I don't know how to do quite what you're hoping (which makes sense), but a possible alternative could be to create 2 or 3 fields (or separate queries), extract different portions of the text into each, and then concatenate them after retrieval.
pseudo: concat((chars 1-255) & (chars 256-510) & (chars 511-etc...))
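In Access SQL that split could look like this (a sketch; Mid() is 1-based, and the field/table names are placeholders):

SELECT Mid(function(field1), 1, 255) AS Part1,
       Mid(function(field1), 256, 255) AS Part2,
       Mid(function(field1), 511, 255) AS Part3
FROM mytable;

After exporting, rejoin Part1 & Part2 & Part3 on the receiving side.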
edit: it's odd that a string longer than 255 characters is stored but isn't a Memo. What's up there? Another alternative, if you have access to the db, is to change the field type. (Back up the db first!)

Detecting the value 0x in a varchar column sourced from Excel

I have a SQL table that gets populated via SqlBulkCopy from Excel. The copy down is done using the Microsoft ACE drivers.
I had a problem with one particular file: when it was loaded into SQL, some of the columns (which appear empty in Excel) contained an odd value.
For example, running this sql:
SELECT
CONVERT(VARBINARY(10),MyCol),
LEN(MyCol)
FROM MyTab
would return
0x, 0
i.e., converting the value in the column to varbinary shows something, but taking the length of the varchar shows no length. I realise that the value shown is the stem of a hex value, but it's weird that it gets there, and it's hard to detect.
Obviously I can just clear out the cells in Excel, but I really need to detect this automatically, as end users will have the same issue. It is causing issues further down the line when the data gets processed, and it's quite hard to trace the problem back from its eventual symptoms to this issue in the source.
Other than the conversion to varbinary above for output in SSMS, I've not come up with a way of detecting these values, either in Excel or via a SQL script, in order to remove them.
Any ideas?
This may help you:
-- Conversion from hex string to varbinary:
DECLARE @hexstring varchar(max);
SET @hexstring = 'abcedf012439';
SELECT CAST('' AS XML).value('xs:hexBinary( substring(sql:variable("@hexstring"), sql:column("t.pos")) )', 'varbinary(max)')
FROM (SELECT CASE SUBSTRING(@hexstring, 1, 2) WHEN '0x' THEN 3 ELSE 0 END) AS t(pos);
GO
-- Conversion from varbinary to hex string:
DECLARE @hexbin varbinary(max);
SET @hexbin = 0xabcedf012439;
SELECT '0x' + CAST('' AS XML).value('xs:hexBinary(sql:variable("@hexbin") )', 'varchar(max)');
GO
One method is to add a new column, convert the data, drop the old column, and rename the new column to the old name.
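As a sketch, that add/copy/drop/rename dance looks like this (table, column, and type are placeholders):

ALTER TABLE MyTab ADD MyCol_new varchar(100) NULL;
GO
UPDATE MyTab SET MyCol_new = NULLIF(MyCol, '');  -- empty strings become NULL
GO
ALTER TABLE MyTab DROP COLUMN MyCol;
EXEC sp_rename 'MyTab.MyCol_new', 'MyCol', 'COLUMN';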
As Martin points out above, 0x is what you get when you convert an empty string, e.g.:
SELECT CONVERT(VARBINARY(10),'')
So the problem of detecting it obviously goes away: the column simply contains an empty string.
I have to assume that there was some rubbish in the Excel cell that got filtered out during the write-down by either the ACE driver or the SqlBulkCopy. Because there was something in the field originally, the value written is an empty string instead of NULL.
To keep the data consistent, we'll need a post-process that switches all empty values to NULLs so that the downstream scripts work.
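That post-process can be as simple as (table and column names are placeholders):

UPDATE MyTab
SET MyCol = NULL
WHERE MyCol = '';  -- or: WHERE DATALENGTH(MyCol) = 0 to match only truly empty strings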

Creating variables and reusing them within a MySQL UPDATE query - possible?

I am struggling with this query and want to know if I am wasting my time and should write a PHP script instead, or whether something like the following is actually possible:
UPDATE my_table
SET @userid = user_id
AND SET filename('http://pathto/newfilename_' @userid '.jpg')
FROM my_table
WHERE filename LIKE '%_%'
  AND filename LIKE '%jpg'
  AND filename NOT LIKE 'http%';
Basically I have 700-odd files that need renaming in the database, because I am changing systems and the names they are called in the database no longer match the actual filenames.
The format is 2_gfhgfhf.jpg, which translates to userid_randomjumble.jpg.
But not all files in the database are in this format, only about 700 out of thousands. So I want to identify names that contain _ but don't contain http (that's the correct format that I don't want to touch).
I can do that fine, but now comes the tricky bit!!
I want to replace that filename userid_randomjumble.jpg with http://pathto/filename_userid.jpg. So I want to set the user_id column in that row to a variable and insert it into my new filename.
The above doesn't work for obvious reasons, but I am not sure if there is a way around what I'm trying to do. I have no idea if it's possible. Am I wasting my time with this, and should I turn to PHP with MySQL and stop being lazy? Or is there a way to get this to work?
Yes, it is possible without PHP. Here is a simple example:
SET @a := 0;
SELECT * FROM my_table WHERE field_name = @a;
Yes, you can do it using straightforward SQL:
UPDATE my_table
SET filename = CONCAT('http://pathto/newfilename_', user_id, '.jpg')
WHERE filename LIKE '%\_%jpg'
AND filename NOT LIKE 'http%';
Notes:
No need for variables; any column of the rows being updated may be referenced.
In MySQL, use CONCAT() to join text values together.
With LIKE, an underscore (_) has a special meaning: it matches any single character. If you want to match a literal underscore, you must escape it with a backslash (\).
Your two LIKE predicates may be safely merged into one for a simpler query.
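To check the result before committing to the UPDATE, you can preview the generated names with a SELECT using the same predicates:

SELECT filename,
       CONCAT('http://pathto/newfilename_', user_id, '.jpg') AS new_filename
FROM my_table
WHERE filename LIKE '%\_%jpg'
  AND filename NOT LIKE 'http%';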

How do I get SSIS Data Flow to put '0.00' in a flat file?

I have an SSIS package with a Data Flow that takes an ADO.NET data source (just a small table), executes a SELECT * query, and outputs the query results to a flat file (I've also tried just pulling the whole table and not using a SQL select).
The problem is that the data source pulls a column with the money datatype, and if the value is not zero it comes into the text flat file just fine (like '123.45'), but when the value is zero it shows up in the destination flat file as '.00'. I need to know how to get the leading zero back into the flat file.
I've tried various datatypes for the output (in the Flat File Connection Manager), including currency and string, but this seems to have no effect.
I've tried a case statement in my select, like this:
CASE WHEN columnValue = 0 THEN
'0.00'
ELSE
columnValue
END
(still results in '.00')
I've tried variations on that like this:
CASE WHEN columnValue = 0 THEN
convert(decimal(12,2), '0.00')
ELSE
convert(decimal(12,2), columnValue)
END
(Still results in '.00')
and:
CASE WHEN columnValue = 0 THEN
convert(money, '0.00')
ELSE
convert(money, columnValue)
END
(results in '.0000000000000000000')
This silly little issue is killin' me. Can anybody tell me how to get a zero Money datatype database value into a flat file as '0.00'?
I was having the exact same issue, and soo's answer worked for me. I sent my data into a Derived Column transform (in the Data Flow toolbox). I added the derived column as a new column with data type Unicode string [DT_WSTR], and used the following expression:
Price < 1 ? "0" + (DT_WSTR,6)Price : (DT_WSTR,6)Price
I hope that helps!
Could you use a Derived Column to change the format of the value? Did you try that?
I used the Advanced Editor to change the column from double-precision float to decimal and then set the Scale to 2.
Since you are exporting to a text file, just export the data preformatted.
You can do it in the query or create a derived column, whichever you are more comfortable with.
I chose to make the column 15 characters wide. If you import into a system that expects numbers, those leading zeros should be ignored... so why not just standardize the field length?
A simple solution in SQL is as follows:
select
cast(0.00 as money) as col1
,cast(0.00 as numeric(18,2)) as col2
,right('000000000000000' + cast( 0.00 as varchar(10)), 15) as col3
go
col1 col2 col3
--------------------- -------------------- ---------------
.0000 .00 000000000000.00
Simply replace '0.00' with your column name and don't forget to add the FROM table_name, etc..
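Applied to the money column from the question, the col3 padding approach would be (names are placeholders, and this sketch assumes non-negative values):

select right(replicate('0', 15) + cast(columnValue as varchar(15)), 15) as columnValue
from MyTable
go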
A Derived Column works well here too; you just need to check for the zero condition as well:
pricecheck <= 0 ? "0" + (DT_WSTR,10)pricecheck : (DT_WSTR,10)pricecheck
Alternatively, you can use a Script Component (VB) instead.
Ultimately what I ended up doing was using the FORMAT() function.
CAST(FORMAT(balance, '0000000000.0000') AS varchar(30)) AS "balance"
This does have a significant CPU performance impact (often at least an order of magnitude) due to the way SQL Server implements that function, but nothing worked easier, more correctly, or more consistently for me. I was working with fewer than 100,000 rows and the package executes no more than once an hour, so going from 100 ms to 1000 ms just wasn't a big deal in my situation.
The FORMAT() function returns an nvarchar(4000) by default, so I also cast it back to a varchar of appropriate size since my output file needed to be in Windows-1252 encoding. Transcoding text is much more obnoxious in SSIS than it has any right to be.