MYSQL not accepting whitespaces - mysql

I have created a table in mysql as:
CREATE TABLE `test1` (
`age` int(12) NOT NULL DEFAULT '0',
`name` varchar(20) NOT NULL DEFAULT '',
`gender` varchar(10) DEFAULT NULL,
PRIMARY KEY (`age`,`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I am inserting 2 rows in this table as:
insert into test1 values(1,'user1','m');
insert into test1 values(1,'user1 ','m');
In the second insert, I want the 'name' field to keep its trailing white space.
But when I run the second query, it fails with a primary key violation.
Is there a way to insert values with trailing white space into a table that has a primary key on that column?

Values in VARCHAR columns are variable-length strings. You can declare
a VARCHAR column to be any length between 1 and 255, just as for CHAR
columns. However, in contrast to CHAR, VARCHAR values are stored using
only as many characters as are needed, plus one byte to record the
length. Values are not padded; instead, trailing spaces are removed
when values are stored. (This space removal differs from the SQL-99
specification.)
You probably want lpad, rpad, or space
If you are developing for HTML, you can replace the white space with a different character on insert and convert that character back to white space when you query the data. You can even use "&nbsp;", which renders as a non-breaking space in the browser.

If you need to insert the values with white spaces you can use name nvarchar(20) instead of varchar(20)
Note :
The exact problem is that, under the SQL standard, when two strings of different lengths are compared, the shorter one is first padded with trailing spaces to the same length.
So, if your query compares string1 'a' and string2 'a ', string1 is first converted to 'a ' and then compared to string2, and now the two strings are the same.
Finally, and fortunately, if the field is a UNIQUE INDEX or primary key, it is not possible to have 'a' and 'a ' in two different rows. If it is not a UNIQUE INDEX or primary key field, then you will have to use the RTRIM, LTRIM and LEN functions with an extra character, like LEN('a'+'#')=2 and LEN('a '+'#')=3.
LEN('a') and LEN('a ') both give 1.
What you must keep in mind is:
Remove trailing spaces before insertion! That is the better option.
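The sentinel trick described above can be sketched in Python. This is only an emulation under an assumption: `sql_len` is a hypothetical helper that mimics a LEN() function ignoring trailing spaces, and it is not a database call.

```python
def sql_len(value: str) -> int:
    """Emulate a LEN() that ignores trailing spaces (assumed semantics)."""
    return len(value.rstrip(" "))


def sentinel_len(value: str) -> int:
    """Append a non-space sentinel so that trailing spaces are counted."""
    return sql_len(value + "#")


print(sql_len("a"), sql_len("a "))            # 1 1
print(sentinel_len("a"), sentinel_len("a "))  # 2 3
```

Because the sentinel is not a space, the spaces in front of it are no longer trailing and therefore count towards the length.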
I have seen character or string primary key systems speeded up radically when converted to Integer.

Related

In SQL, what does a leading "X" mean when defining a string?

I'm working on a project where I was given a SQL file to generate a database and some sample values. One of the fields (HTMLContent) is type blob and the values being inserted into it are in the form X'<long-string-of-numbers-and-letters>'.
What does the leading 'X' signify?
CREATE TABLE `advertDOM` (
`id` int(11) NOT NULL,
`HTMLContent` blob COMMENT 'DOM data to be displayed on screen',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `advertDOM` (`id`, `HTMLContent`)
VALUES
(1,X'3C646976206964203D2022636F6E74656E742220636C617373203D202266756C6C73637265656E2D6C616E647363617065223E0A20203C646976206964203D202277312D636F6E7461696E6572223E0A202020207B77317D0A20203C2F6469763E0A3C2F6469763E'),
(2,X'3C646976206964203D2022636F6E74656E742220636C617373203D202274776F2D77696E646F772D6C616E647363617065223E0A20203C646976206964203D202277312D636F6E7461696E6572223E0A202020207B77317D0A20203C2F6469763E0A20203C646976206964203D202277322D636F6E7461696E6572223E0A202020207B77327D0A20203C2F6469763E0A3C2F6469763E'),
(3,X'3C646976206964203D2022636F6E74656E742220636C617373203D202266756C6C73637265656E2D706F727472616974223E0A20203C646976206964203D202277312D636F6E7461696E6572223E0A202020207B77317D0A20203C2F6469763E0A3C2F6469763E'),
(4,X'3C646976206964203D2022636F6E74656E742220636C617373203D202274776F2D77696E646F772D706F727472616974223E0A20203C646976206964203D202277312D636F6E7461696E6572223E0A202020207B77317D0A20203C2F6469763E0A20203C646976206964203D202277322D636F6E7461696E6572223E0A202020207B77327D0A20203C2F6469763E0A3C2F6469763E');
From the MySQL documentation:
Hexadecimal literal values are written using X'val' or 0xval notation, where val contains hexadecimal digits (0..9, A..F). Lettercase of the digits and of any leading X does not matter. A leading 0x is case-sensitive and cannot be written as 0X.
So, you are just looking at hexadecimal string literals. You can see based on your table definition that these strings are being stored as binary, in a BLOB column.
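To see what such a literal contains, the hex digits can be decoded outside the database. The following is a minimal Python sketch; `decode_hex_literal` is a hypothetical helper, and it assumes the bytes are UTF-8 text, which appears to hold for the rows in the question.

```python
def decode_hex_literal(literal: str) -> str:
    """Decode an X'...' hex literal into text (assumes UTF-8 content)."""
    if not (literal[:2].upper() == "X'" and literal.endswith("'")):
        raise ValueError("expected a literal of the form X'...'")
    return bytes.fromhex(literal[2:-1]).decode("utf-8")


# First 19 bytes of row 1 from the INSERT statement in the question.
prefix = "X'3C646976206964203D2022636F6E74656E7422'"
print(decode_hex_literal(prefix))  # <div id = "content"
```

Decoding the full values the same way should show that each row holds an HTML fragment.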

How do I add leading 0's to a string in mysql?

I have a column of zip codes in my table. Some of the zip codes were truncated because they started with 0. For instance, 00123 appears as 123 and 04567 appears as 4567.
Is there a function that I can use to update all entries of the column so that if the string is shorter than 5 characters, 0's are placed in front of the number to make it length 5? (i.e. 123 --> 00123 & 4567 --> 04567)
If your column is already a string type, you can use LPAD to add leading characters:
update table set zipcode = LPAD(zipCode, 5, '0');
If it's a numeric datatype, change the column to use ZEROFILL, then do the same as above. Please note that this will automatically make your column unsigned.
See the manual
Make your zipcode field one of the text type fields. Problem solved. This makes sense when you think about it as it is unlikely that you are going to do any mathematical computations on this data. Also, this is more flexible if and when you need to accommodate countries with non-numeric postal code values.
Create the field as ZEROFILL, or ALTER it to be ZEROFILL, and set the display width to the length you want:
CREATE TABLE `abc` (
`zip` int(5) unsigned zerofill DEFAULT NULL,
`b` int(11)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Try UPDATE Table SET zipCode = LPAD(zipCode, 5, '0');
This will fill your data with leading zeros. Hope that helps !
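What the LPAD calls above do to each value can be sketched in Python. The `lpad` function here is a hypothetical emulation written for illustration, assuming a non-empty pad string; it is not part of any MySQL client library.

```python
def lpad(value: str, length: int, pad: str) -> str:
    """Emulate LPAD: pad on the left, or truncate if the value is too long.

    Assumes `pad` is a non-empty string.
    """
    if len(value) >= length:
        return value[:length]
    needed = length - len(value)
    return (pad * needed)[:needed] + value


for zipcode in ["123", "4567", "04567"]:
    print(zipcode, "->", lpad(zipcode, 5, "0"))
```

Note that values longer than the target length are cut down to it, so running the update on a column with longer codes would shorten them.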

What is difference between char and varchar

CREATE TABLE IF NOT EXISTS `test` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`country` varchar(5) NOT NULL,
`state` char(5) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
I tried following query to insert data
INSERT INTO `test`.`test` (`id` ,`country` ,`state`)
VALUES (NULL , 'south-india', 'Gujarat');
When I execute the above query, it shows the following warnings:
Warning: #1265 Data truncated for column 'country' at row 1
Warning: #1265 Data truncated for column 'state' at row 1
I found a reference saying that VARCHAR is variable-length and CHAR is fixed-length.
So what is meant by
VARCHAR is variable-length.
CHAR is fixed length.
VARCHAR(5) will use at most 5 characters of storage, while CHAR(5) will always use exactly 5. In both cases 5 is the maximum length, which is why both of your values were truncated.
For a field holding a person's name, for example, you'd want to use a VARCHAR, because while on average someone's name is usually short, you still want to cope with the few people with very long names, without having to have that space wasted for the majority of your database rows.
As you said, varchar is variable-length and char is fixed-length. The main difference is the number of bytes each one uses.
Example:
column: username
type: char(10)
If the username column holds 'test', it will use 10 bytes, and the value is padded with spaces:
'test______'
A varchar column, by contrast, only uses the bytes you need. For 'test' it will use only 4 bytes (plus the length byte), and your data will be
'test'
Thanks.
As you mentioned, VARCHAR is variable-length and CHAR is fixed-length.
With Varchar(5), if the data you store is of length 1, the
remaining 4 bytes are not reserved and can be used by other data. Example: "t"
With Char(5), on the other hand, if the data you store is of length 1, the remaining
4 bytes cannot be used by any other data. Example: "t____", where ____ is the unused space.
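The storage behaviour described in these answers can be sketched in Python. This is an illustration only, under two assumptions: a single-byte character set such as latin1, and a non-strict SQL mode in which over-long values are truncated with a warning (as in the question) rather than rejected. The function names are hypothetical.

```python
def store_char(value: str, n: int) -> str:
    """CHAR(n): truncate to n characters, then right-pad with spaces to n."""
    return value[:n].ljust(n)


def store_varchar(value: str, n: int) -> str:
    """VARCHAR(n): truncate to n characters, with no padding."""
    return value[:n]


def varchar_bytes(value: str, n: int) -> int:
    """Approximate bytes used: the data plus a 1-byte length prefix
    (assumes a single-byte charset and n <= 255)."""
    return len(store_varchar(value, n)) + 1


print(repr(store_varchar("south-india", 5)))  # 'south'
print(repr(store_char("Gujarat", 5)))         # 'Gujar'
print(repr(store_char("t", 5)))               # 't    '
print(varchar_bytes("t", 5))                  # 2
```

Both types cap the value at the declared length; they differ only in whether shorter values are padded.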

MySQL database with unique fields ignored ending spaces

My project requires storing user input exactly as typed, including any spacing on the left or right of a word, for example 'apple'. If the user types ' apple' or 'apple ', with one space or several on either side of the word, I need to store it that way.
This field has the Unique attribute. When I insert the word with spacing on the left, it works fine. But when I insert the word with spacing on the right, all the spacing is trimmed off the right of the word.
So I am thinking of adding a special character to the right of the word after the spacing, but I am hoping there is a better solution for this issue.
CREATE TABLE strings
( id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
string varchar(255) COLLATE utf8_bin NOT NULL,
created_ts timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (id), UNIQUE KEY string (string) )
ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
The problem is that MySQL ignores trailing whitespace when doing string comparison. See
http://dev.mysql.com/doc/refman/5.7/en/char.html
All MySQL collations are of type PADSPACE. This means that all CHAR, VARCHAR, and TEXT values in MySQL are compared without regard to any trailing spaces.
...
For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters will result in a duplicate-key error. For example, if a table contains 'a', an attempt to store 'a ' causes a duplicate-key error.
(This information is for 5.7; for 8.0 this changed, see below)
The section on the LIKE operator gives an example of this behavior (and shows that LIKE does respect trailing whitespace):
mysql> SELECT 'a' = 'a ', 'a' LIKE 'a ';
+------------+---------------+
| 'a' = 'a ' | 'a' LIKE 'a ' |
+------------+---------------+
|          1 |             0 |
+------------+---------------+
1 row in set (0.00 sec)
Unfortunately the UNIQUE index seems to use the standard string comparison to check if there is already such a value, and thus ignores trailing whitespace.
This is independent from using VARCHAR or CHAR, in both cases the insert is rejected, because the unique check fails. If there is a way to use like semantics for the UNIQUE check then I do not know it.
What you could do is store the value as VARBINARY:
mysql> create table test_ws ( `value` varbinary(255) UNIQUE );
Query OK, 0 rows affected (0.13 sec)
mysql> insert into test_ws (`value`) VALUES ('a');
Query OK, 1 row affected (0.08 sec)
mysql> insert into test_ws (`value`) VALUES ('a ');
Query OK, 1 row affected (0.06 sec)
mysql> SELECT CONCAT( '(', value, ')' ) FROM test_ws;
+---------------------------+
| CONCAT( '(', value, ')' ) |
+---------------------------+
| (a)                       |
| (a )                      |
+---------------------------+
2 rows in set (0.00 sec)
You probably do not want to do anything like sorting alphabetically on this column, because sorting will happen on the byte values instead, and that will not be what users expect (most users, anyway).
The alternative is to patch MySQL and write your own collation which is of type NO PAD. Not sure if someone wants to do that, but if you do, let me know ;)
Edit: meanwhile MySQL has collations which are of type NO PAD, according to https://dev.mysql.com/doc/refman/8.0/en/char.html :
Most MySQL collations have a pad attribute of PAD SPACE. The exceptions are Unicode collations based on UCA 9.0.0 and higher, which have a pad attribute of NO PAD.
and https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html
Unicode collations based on UCA versions later than 4.0.0 include the version in the collation name. Thus, utf8mb4_unicode_520_ci is based on UCA 5.2.0 weight keys, whereas utf8mb4_0900_ai_ci is based on UCA 9.0.0 weight keys.
So if you try:
create table test_ws ( `value` varchar(255) UNIQUE )
character set utf8mb4 collate utf8mb4_0900_ai_ci;
you can insert values with and without trailing whitespace.
You can find all available NO PAD collations with:
show collation where Pad_attribute='NO PAD';
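The difference between the two pad attributes can be modelled in a few lines of Python. This is a toy sketch, not how MySQL implements its indexes: `UniqueIndex` is a hypothetical class, and the key functions are assumptions that stand in for PAD SPACE comparison (trailing spaces ignored) and for NO PAD or VARBINARY comparison (every character counts). Case and accent handling are left out.

```python
from typing import Callable


class UniqueIndex:
    """Toy unique index that rejects values whose comparison key already exists."""

    def __init__(self, key: Callable[[str], str]) -> None:
        self._key = key
        self._seen = set()

    def insert(self, value: str) -> bool:
        """Return True if stored, False on a duplicate-key conflict."""
        k = self._key(value)
        if k in self._seen:
            return False
        self._seen.add(k)
        return True


# PAD SPACE: trailing spaces are ignored when comparing.
pad_space = UniqueIndex(lambda s: s.rstrip(" "))
# NO PAD / VARBINARY: trailing spaces are significant.
no_pad = UniqueIndex(lambda s: s)

print(pad_space.insert("a"), pad_space.insert("a "))  # True False
print(no_pad.insert("a"), no_pad.insert("a "))        # True True
```

Leading spaces survive in both models, which matches the observation in the question that ' apple' inserts without trouble.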
This is not about CHAR vs VARCHAR. SQL Server does not consider trailing spaces when it comes to string comparison, which is applied also when checking a unique key constraint. So it is not that you cannot insert value with trailing spaces, but once you insert, you cannot insert another value with more or fewer spaces.
As a solution to your problem, you can add a column that keeps the length of the string, and make the length AND the string value as a composite unique key constraint.
In SQL Server 2012, you can even make the length column as a computed column so that you don't have to worry about the value at all. See http://sqlfiddle.com/#!6/32e94 for an example with SQL Server 2012. (I bet something similar is possible in MySQL.)
You probably need to read about the differences between VARCHAR and CHAR types.
The CHAR and VARCHAR Types
When CHAR values are stored, they are right-padded with spaces to the specified length. When CHAR values are retrieved, trailing spaces are removed unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled.
For VARCHAR columns, trailing spaces in excess of the column length are truncated prior to insertion and a warning is generated, regardless of the SQL mode in use. For CHAR columns, truncation of excess trailing spaces from inserted values is performed silently regardless of the SQL mode.
VARCHAR values are not padded when they are stored. Trailing spaces are retained when values are stored and retrieved, in conformance with standard SQL.
Conclusion: if you want to retain whitespace on the right side of a text string, use the VARCHAR type (and not CHAR), since CHAR values lose their trailing spaces on retrieval.
Thanks to @kennethc. His answer works for me.
Add a string length field to the table and to the unique key.
CREATE TABLE strings
( id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
string varchar(255) COLLATE utf8_bin NOT NULL,
created_ts timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
string_length int(3),
PRIMARY KEY (id), UNIQUE KEY string (string,string_length) )
ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
In MySQL, it's possible to maintain the string length field with a couple of triggers like this:
CREATE TRIGGER `string_length_insert` BEFORE INSERT ON `strings` FOR EACH ROW SET NEW.string_length = char_length(NEW.string);
CREATE TRIGGER `string_length_update` BEFORE UPDATE ON `strings` FOR EACH ROW SET NEW.string_length = char_length(NEW.string);
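Why the extra length column helps can be sketched in Python. The model below is an assumption for illustration: the string part of the key is compared with trailing spaces ignored (PAD SPACE), while the length part is what the triggers store via char_length. `composite_key` and `insert` are hypothetical names.

```python
from typing import Set, Tuple


def composite_key(value: str) -> Tuple[str, int]:
    """Key as a PAD SPACE unique index on (string, string_length) would see it."""
    return (value.rstrip(" "), len(value))


seen: Set[Tuple[str, int]] = set()


def insert(value: str) -> bool:
    """Return True if the value is accepted, False if the key already exists."""
    key = composite_key(value)
    if key in seen:
        return False
    seen.add(key)
    return True


print(insert("apple"))    # True
print(insert("apple "))   # True: same padded string, different length
print(insert("apple  "))  # True
print(insert("apple "))   # False: exact duplicate
```

Two values that differ only in trailing spaces always differ in length, so the composite key keeps them apart while still rejecting exact duplicates.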

MySQL index for long strings

I have MySQL InnoDb table where I want to store long (limit is 20k symbols) strings. Is there any way to create index for this field?
You can put an MD5 of the field into another field and index that. Then, when you do a search, you match against both the full field, which is not indexed, and the MD5 field, which is indexed.
SELECT *
FROM your_table
WHERE large_field = "hello world hello world ..."
AND large_field_md5 = md5("hello world hello world ...")
large_field_md5 is indexed, so we go directly to the record that matches (your_table stands in for the actual table name). Once in a blue moon it might need to test 2 records if there is a duplicate md5.
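The same lookup pattern can be sketched in Python: a dictionary keyed by the MD5 digest plays the role of the indexed column, and the full-string comparison filters out the rare collision. `HashedLookup` and `md5_hex` are hypothetical names used only for this illustration.

```python
import hashlib
from collections import defaultdict


def md5_hex(text: str) -> str:
    """Return the 32-character hex MD5 digest of the text (UTF-8 encoded)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


class HashedLookup:
    """Find long strings via a short hash, then confirm with the full value."""

    def __init__(self) -> None:
        self._by_hash = defaultdict(list)

    def add(self, row_id: int, text: str) -> None:
        self._by_hash[md5_hex(text)].append((row_id, text))

    def find(self, text: str) -> list:
        # Narrow down by hash first, then compare the full string.
        candidates = self._by_hash.get(md5_hex(text), [])
        return [row_id for row_id, stored in candidates if stored == text]


table = HashedLookup()
table.add(1, "hello world " * 1000)
table.add(2, "another long string " * 1000)
print(table.find("hello world " * 1000))  # [1]
```

The hash column stays short and fixed in size no matter how long the stored strings are, which is what makes it indexable.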
You will need to limit the length of the index, otherwise you are likely to get error 1071 ("Specified key was too long"). The MySQL manual entry on CREATE INDEX describes this:
Indexes can be created that use only the leading part of column values, using col_name(length) syntax to specify an index prefix length:
Prefixes can be specified for CHAR, VARCHAR, BINARY, and VARBINARY columns.
BLOB and TEXT columns also can be indexed, but a prefix length must be given.
Prefix lengths are given in characters for nonbinary string types and in bytes for binary string types. That is, index entries consist of the first length characters of each column value for CHAR, VARCHAR, and TEXT columns, and the first length bytes of each column value for BINARY, VARBINARY, and BLOB columns.
It also adds this:
Prefix support and lengths of prefixes (where supported) are storage engine dependent. For example, a prefix can be up to 1000 bytes long for MyISAM tables, and 767 bytes for InnoDB tables.
Here is an example of how you could do that. As @Gidon Wise mentioned in his answer, you can index the additional field. In this case it will be query_md5.
CREATE TABLE `searches` (
`id` int(10) UNSIGNED NOT NULL,
`query` varchar(10000) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`query_md5` varchar(32) COLLATE utf8mb4_unicode_ci DEFAULT NULL
) ENGINE=InnoDB;
ALTER TABLE `searches`
ADD PRIMARY KEY (`id`),
ADD KEY `searches_query_md5_index` (`query_md5`);
To guard against two different strings sharing the same md5 hash, double check by also comparing the full string, i.e. by adding and `query` ='' (with the full string between the quotes).
The query will look like this:
select * from `searches` where `query_md5` = "b6d31dc40a78c646af40b82af6166676" and `query` = 'long string ...'
b6d31dc40a78c646af40b82af6166676 is the md5 hash of the 'long string ...' value. This, I think, can improve query performance, and you can be sure that you will get the right results.
Use the sha2 function with a specific length. Add this to your table:
`hash` varbinary(32) GENERATED ALWAYS AS (unhex(sha2(`your_text`,256)))
ADD UNIQUE KEY `ix_hash` (`hash`);
Read about the SHA2 function
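The reason varbinary(32) is enough for this column is that a SHA-256 digest is 32 raw bytes, or 64 hex characters before unhex. A short Python check, with `sha256_key` as a hypothetical helper that mirrors unhex(sha2(text, 256)) in spirit:

```python
import hashlib


def sha256_key(text: str) -> bytes:
    """Return the raw 32-byte SHA-256 digest of the text (UTF-8 encoded)."""
    return hashlib.sha256(text.encode("utf-8")).digest()


long_text = "x" * 20000
key = sha256_key(long_text)
hex_form = hashlib.sha256(long_text.encode("utf-8")).hexdigest()

print(len(key))       # 32
print(len(hex_form))  # 64
print(bytes.fromhex(hex_form) == key)  # True
```

Storing the raw bytes rather than the hex text halves the size of each index entry.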