I am sorry if this is a duplicate question; please point me in the right direction. I have a table with a column named MailBody whose datatype is varchar(max), like below.
CREATE TABLE Mail
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[MailBody] [varchar](max) NULL
)
When I try to insert a very long string (length > 10,000 characters) into MailBody, it does not store the full string; it truncates it and stores the truncated string in the MailBody column. Can anybody tell me how to store the full, untruncated string in the MailBody column?
UPDATE
As marc_s states below in one of his comments, the full string is in fact being stored in the MailBody column. I created a small C# unit test method to read the MailBody column and saw that I am getting my full string without truncation. I did not change any settings in my SSMS. Thanks, marc_s.
Using your table structure, I was easily able to insert a 120 KB+ .txt file into the table with this code:
INSERT INTO Mail(MailBody)
SELECT BulkColumn
FROM OPENROWSET (BULK 'c:\tmp\large.txt', SINGLE_CLOB) MyFile
This can be seen by checking the length of the MailBody column:
SELECT ID, LEN(MailBody) FROM Mail
Output: the ID of the inserted row and a length of well over 120,000 characters.
VARCHAR(MAX) is easily able to handle large text - up to 2 billion characters, actually - enough to hold well over 100 copies of the entire "War And Peace" by Leo Tolstoj.
DO NOT change your datatype - it's the right datatype to use! There must be something else going on that truncates your data ...
Update: you can set the amount of data that SSMS will show you under Tools > Options:
You can crank up the number of characters displayed - but be aware: the higher you go, the more data might need to be transferred to your computer to be displayed! Don't start complaining about lack of performance if you ask for 2 GB of data for each column! :-)
I am trying to create a BCP file with a | delimiter and then load it into a Snowflake table.
Issue:
In SQL Server there are columns defined as CHAR(4) that hold values like "sss".
So when I do the BCP export, the value is padded to a length of 4 ("sss ") and loaded into Snowflake that way.
Because of this our reports are failing: they do something like WHERE column = 'SSS', but due to the trailing space in Snowflake the matching rows are not showing up.
We do not want to change our reports. So, is there a way that BCP can handle the padding or trimming of these columns?
Note that there are 24 tables, each with around 130+ columns, so I can't go and put TRIM functions on every CHAR column.
If your BCP file is maintaining the trailing space, then Snowflake will retain it, too, as long as the field is being FIELD_OPTIONALLY_ENCLOSED_BY a " or '. You may also want to make sure your TRIM_SPACE option is correctly set on your format definition for your COPY INTO command.
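As a hedged sketch (the stage, file, and table names here are made up; TRIM_SPACE and FIELD_OPTIONALLY_ENCLOSED_BY are the options mentioned above, and the fields are assumed to be unquoted in the BCP file), a COPY INTO that strips the trailing blanks on load might look like this:
COPY INTO my_table
FROM @my_stage/bcp_output.dat
FILE_FORMAT = (
    TYPE = CSV
    FIELD_DELIMITER = '|'
    FIELD_OPTIONALLY_ENCLOSED_BY = '"'
    TRIM_SPACE = TRUE   -- removes leading/trailing blanks from each unenclosed field
);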
If your BCP file isn't maintaining the space and you can't figure out how to get that to work, you could force the space back in during the COPY INTO command with some string functions in your SELECT, or you could create a view for your report that does the same set of string functions to force the space for your report to work from.
So, is there a way that BCP can handle the padding or trimming of these columns?
Yes, but not by some switch or option. The correct way to handle this is to set your datatypes up front. As someone mentioned in the comments to your question, the query that is creating the BCP output should use VARCHAR(4) instead of CHAR(4). BCP is giving you what you asked of it. The way to avoid the whitespace is to use varchar.
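For instance, a hedged sketch of an export query (the table and column names are placeholders) that could be fed to bcp with queryout so the file never contains the padding:
SELECT RTRIM([code_col]) AS [code_col],   -- CHAR(4) column; RTRIM makes BCP write 'sss' instead of 'sss '
       [other_col]
FROM   [dbo].[SomeTable];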
Seems like a fairly quick "find and replace" against scripted out query objects would work fine but you know your situation best.
Additionally, "trim" wont work - FYI. Even if the value of the field was only "SSS" (as in your example); if the result/column is defined as CHAR(4) you will get 4 bytes of data and a blank in the 4th place since you only had 3 bytes of data. Trim will work during the query... the padded " " you are getting is placed there by the copy out. The way to correct this is to set your data types as you need up front.
Unless someone knows of a better way in Snowflake (I'm not familiar with it), the only other option is to manipulate the file in between SQL Server and Snowflake - replace " |" with "|"... but... blech.
This is a known "issue" with BCP. The "solution" is to use the queryout option, which means you must include a query with every export. But the data are the way they are.
Eg: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/88c258fe-d1a6-4f3a-9dac-40388d04e9c7/remove-space-in-columns-on-bcp-out?forum=transactsql
But this is really a Snowflake problem, because Snowflake has its own default CHAR semantics.
You get a warning in the documentation String & Binary Data Types but that doesn't tell the whole truth.
The following executed on Oracle (and apparently MSSQL? MySQL?) will select the aaa line:
CREATE TABLE C AS SELECT CAST('aaa ' AS CHAR(4)) t FROM DUAL;
SELECT * FROM C WHERE t = 'aaa';
but won't on Snowflake, unless you create the column with COLLATION:
CREATE OR REPLACE TABLE C (t CHAR(4) COLLATE 'en_US-rtrim');
INSERT INTO C VALUES('aaa ');
SELECT * FROM C WHERE t = 'aaa';
Unfortunately, you can't ALTER the collation after creation, which would have been convenient after a COPY INTO <table>.
PS: Mike Walton's answer is better, TRIM_SPACE is much cleaner than COLLATE.
I need help. I keep getting a "Conversion failed when converting date and/or time from character string" error right before the INSERT statement. I can see that the problem stems from either [nowTime] or [a_Timestamp] because of the datatypes, and I have to convert them to the same type. However, somehow I can't get it to work no matter what I try. In the background, [a_Timestamp] is CHAR(13) and @nowTime is CONVERT(time, CONVERT(char(8), GETDATE(), 108)).
The purpose of the full query is to monitor a DB's past 1, 5 and 10 seconds of data, so the time operations are vital and must be quick. If you need more of the code (or all of it), I'm happy to provide it!
CREATE TABLE #base
(
[date] CHAR(8),
[a_MemberId] CHAR(5),
[a_Timestamp] CHAR(8),
[nowTime] TIME
)
INSERT INTO #base
([date],
[a_MemberId],
[a_Timestamp],
[nowTime])
SELECT [date] AS [date],
[a_MemberId] AS [a_MemberId],
SUBSTRING([a_Timestamp],0,7) AS [timeStamp],
@nowTime AS [nowTime]
FROM [ObserverDB].[dbo].[onti_ord] WITH (NOLOCK)
WHERE [date] = @TodaysDateTEST AND [a_Timestamp] < @nowTime
ORDER BY [date]
This is probably culture related. Your string is converted implicitly, and your culture does not match the stored format. It is always a bad idea to store date and/or time values as text.
There is only one unambiguous format: ISO 8601, which means (one of many examples)
yyyy-mm-ddThh:mm:ss.ttt (e.g. 2016-08-17T13:11:23)
In this line you are comparing your values. Obviously one of them is treated as a string and the other as a date/time value:
WHERE [date] = @TodaysDateTEST AND [a_Timestamp] < @nowTime
You must make sure, by using CONVERT with the proper format (style) number, that the date and/or time value is converted to a string in exactly the same format as your stored strings (sargable, therefore faster), or else convert your stored values to real date and/or time values and compare them type-safely (much cleaner).
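A hedged sketch of both options, reusing the names from the question (it assumes [a_Timestamp] really starts with an hh:mm:ss value):
DECLARE @nowTime    time    = CONVERT(time, CONVERT(char(8), GETDATE(), 108));
DECLARE @nowTimeTxt char(8) = CONVERT(char(8), GETDATE(), 108);   -- the same moment as text, e.g. '13:11:23'

-- Option 1 (string vs. string, sargable): compare the stored text against text in the same format
SELECT [date], [a_MemberId], [a_Timestamp]
FROM   [ObserverDB].[dbo].[onti_ord]
WHERE  [a_Timestamp] < @nowTimeTxt;

-- Option 2 (time vs. time, type-safe): convert the stored text to a real time value first
SELECT [date], [a_MemberId], [a_Timestamp]
FROM   [ObserverDB].[dbo].[onti_ord]
WHERE  CONVERT(time, LEFT([a_Timestamp], 8)) < @nowTime;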
Of course the advice should be: change the database to store this properly, but - as you stated in a comment - you have to deal with this...
Good luck :-)
I have a SQL table that gets populated via SQLBulkCopy from Excel. The copy down is done using the Microsoft ACE drivers.
I had a problem with one particular file - when it was loaded down to SQL, some of the columns (which appear empty in Excel) contained an odd value.
For example, running this sql:
SELECT
CONVERT(VARBINARY(10),MyCol),
LEN(MyCol)
FROM MyTab
would return
0x, 0
i.e. converting the value in the column to varbinary shows something, but taking the length of the varchar shows no length. I realise that the value shown is the stem of a hex value, but it's weird that it gets there, and how hard it is to detect.
Obviously I can just clear out the cells in Excel, but I really need to detect this automatically, as end users will have the same issue. It is causing issues further down the line when the data gets processed. It's quite hard to trace the problem back from its eventual symptoms to this issue in the source.
Other than the above conversion to varbinary for output in SSMS, I've not come up with a way of detecting these values, either in Excel or via a SQL script, in order to remove them.
Any ideas?
This may help you:
-- Conversion from hex string to varbinary:
DECLARE @hexstring VarChar(MAX);
SET @hexstring = 'abcedf012439';
SELECT CAST('' AS XML).value('xs:hexBinary( substring(sql:variable("@hexstring"), sql:column("t.pos")) )', 'varbinary(max)')
FROM (SELECT CASE SubString(@hexstring, 1, 2) WHEN '0x' THEN 3 ELSE 0 END) AS t(pos)
GO
-- Conversion from varbinary to hex string:
DECLARE @hexbin VarBinary(MAX);
SET @hexbin = 0xabcedf012439;
SELECT '0x' + CAST('' AS XML).value('xs:hexBinary(sql:variable("@hexbin") )', 'varchar(max)');
GO
One method is to add a new column, convert the data, drop the old column and rename the new column to the old name.
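A hedged sketch of that method, borrowing the MyTab/MyCol names from the question (the replacement column's type and the empty-string-to-NULL conversion are assumptions):
ALTER TABLE MyTab ADD MyCol_new varchar(255) NULL;   -- hypothetical replacement column
GO
UPDATE MyTab SET MyCol_new = NULLIF(MyCol, '');      -- copy the data, turning empty strings into NULL
GO
ALTER TABLE MyTab DROP COLUMN MyCol;
GO
EXEC sp_rename 'MyTab.MyCol_new', 'MyCol', 'COLUMN'; -- put the original name back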
As Martin points out above, 0x is what you get when you convert an empty string, e.g.:
SELECT CONVERT(VARBINARY(10),'')
So the problem of detecting it obviously goes away.
I have to assume that there is some rubbish in the Excel cell that is being filtered out during the write-down by either the ACE driver or the SQLBulkCopy. Because there was something in the field originally, the value written is empty instead of NULL.
In order to make sure that everything is consistent in the data, we'll need to do a post-process to switch all empty values to NULLs so that the next lot of scripts works.
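A minimal sketch of such a post-process, again borrowing the MyTab/MyCol names from the question (repeat per affected column):
UPDATE MyTab
SET    MyCol = NULL
WHERE  DATALENGTH(MyCol) = 0;   -- zero-length strings only; MyCol = '' would also catch all-blank values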
Using SQL Server 2008 and concatenating string literals to more than 8000 characters by an obvious modification of the following script, I always get the result 8000. Is there a way to tag string literals as varchar(max)?
DECLARE @t TABLE (test varchar(max));
INSERT INTO @t VALUES ( '0123456789012345678901234567890123456789'
+ '0123456789012345678901234567890123456789'
+ '... and 200 times the previous line'
);
select datalength(test) from @t
I used the following code on SQL Server 2008
CREATE TABLE [dbo].[Table_1](
[first] [int] IDENTITY(1,1) NOT NULL,
[third] [varchar](max) NOT NULL
) ON [PRIMARY]
GO
declare @maxVarchar varchar(max)
set @maxVarchar = (REPLICATE('x', 7199))
set @maxVarchar = @maxVarchar+(REPLICATE('x', 7199))
select LEN(@maxVarchar)
insert table_1( third)
values (@maxVarchar)
select LEN(third), SUBSTRING (REVERSE(third),1,1) from table_1
The value you are inserting in your example is being treated, temporarily, as a varchar(8000), because a concatenation of ordinary (non-max) string literals is never promoted to varchar(max); the intermediate result is capped at 8000 characters. To make the insert work, one will need to use a variable which is varchar(max) and append to it to overcome the internal 8000 limit.
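A quick demonstration of that cap (a hedged sketch; the lengths are arbitrary):
DECLARE @s varchar(8000) = REPLICATE('x', 6000);

SELECT DATALENGTH(@s + @s);                        -- 8000: both operands are non-max, so the result is capped
SELECT DATALENGTH(CAST(@s AS varchar(max)) + @s);  -- 12000: one varchar(max) operand promotes the whole result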
Try casting your value being inserted as a varchar(max):
INSERT INTO @t VALUES (CAST('0123456789012345678901234567890123456789'
+ '0123456789012345678901234567890123456789'
+ '... and 200 times the previous line' AS varchar(max))
);
Also, you may have to concatenate several strings of length < 8000 (each cast as varchar(max)).
See this MSDN Forum Post.
When I posted the question, I was convinced that there were some limitations on the length or maximum line width of a single string literal to be used in an INSERT or UPDATE statement.
This assumption is wrong.
I was led to this impression by the fact that SSMS limits output width for a single column in text mode to 8192 characters and output of PRINT statements to 8000 characters.
The fact is, as far as I know, you need only enclose the string in apostrophes and double all embedded apostrophes. I found no restrictions concerning the width or total length of a string.
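For example (a trivial illustration of the apostrophe-doubling rule):
SELECT 'It''s a literal with embedded apostrophes, e.g. O''Brien';   -- returns: It's a literal with embedded apostrophes, e.g. O'Brien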
For the opposite task, converting such strings from the database back into a script, the best tool I found is the SSMS Tools Pack, which works for SQL Server 2005+.
I have an SSIS package with a Data Flow that takes an ADO.NET data source (just a small table), executes a select * query, and outputs the query results to a flat file (I've also tried just pulling the whole table and not using a SQL select).
The problem is that the data source pulls a column that is a Money datatype, and if the value is not zero, it comes into the text flat file just fine (like '123.45'), but when the value is zero, it shows up in the destination flat file as '.00'. I need to know how to get the leading zero back into the flat file.
I've tried various datatypes for the output (in the Flat File Connection Manager), including currency and string, but this seems to have no effect.
I've tried a case statement in my select, like this:
CASE WHEN columnValue = 0 THEN
'0.00'
ELSE
columnValue
END
(still results in '.00')
I've tried variations on that like this:
CASE WHEN columnValue = 0 THEN
convert(decimal(12,2), '0.00')
ELSE
convert(decimal(12,2), columnValue)
END
(Still results in '.00')
and:
CASE WHEN columnValue = 0 THEN
convert(money, '0.00')
ELSE
convert(money, columnValue)
END
(results in '.0000000000000000000')
This silly little issue is killin' me. Can anybody tell me how to get a zero Money datatype database value into a flat file as '0.00'?
I was having the exact same issue, and soo's answer worked for me. I sent my data into a derived column transform (in the Data Flow Transform toolbox). I added the derived column as a new column of data type Unicode String ([DT_WSTR]), and used the following expression:
Price < 1 ? "0" + (DT_WSTR,6)Price : (DT_WSTR,6)Price
I hope that helps!
Could you use a Derived Column to change the format of the value? Did you try that?
I used the advanced editor to change the column from double-precision float to decimal and then set the Scale to 2:
Since you are exporting to a text file, just export the data preformatted.
You can do it in the query or create a derived column, whatever you are more comfortable with.
I chose to make the column 15 characters wide. If you import into a system that expects numbers, those zeros should be ignored... so why not just standardize the field length?
A simple solution in SQL is as follows:
select
cast(0.00 as money) as col1
,cast(0.00 as numeric(18,2)) as col2
,right('000000000000000' + cast( 0.00 as varchar(10)), 15) as col3
go
col1 col2 col3
--------------------- -------------------- ---------------
.0000 .00 000000000000.00
Simply replace '0.00' with your column name and don't forget to add the FROM table_name, etc.
It is good to use a derived column, and you need to check the condition as well:
pricecheck <=0 ? "0" + (DT_WSTR,10)pricecheck : (DT_WSTR,10)pricecheck
An alternative way is to use a VB script.
Ultimately what I ended up doing was using the FORMAT() function.
CAST(FORMAT(balance, '0000000000.0000') AS varchar(30)) AS "balance"
This does have some significant CPU performance impact (often at least an order of magnitude) due to the way SQL Server implements that function, but nothing worked easier, more correctly, or more consistently for me. I was working with less than 100,000 rows and the package executes no more than once an hour. Going from 100ms to 1000ms just wasn't a big deal in my situation.
The FORMAT() function returns an nvarchar(4000) by default, so I also cast it back to a varchar of appropriate size since my output file needed to be in Windows-1252 encoding. Transcoding text is much more obnoxious in SSIS than it has any right to be.
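As a hedged one-off check against the zero case from the original question (FORMAT requires SQL Server 2012 or later):
SELECT CAST(FORMAT(CAST(0 AS money), '0.00') AS varchar(30));   -- returns '0.00', leading zero intact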