How to make MySql table data Case Sensitive? - mysql

I want the primary key column of a MySQL table to treat values case-sensitively.
By default, however, MySQL compares the table data without regard to case.
Here are my queries.
mysql> select id from product where id = 'a1';
+----+
| id |
+----+
| A1 |
+----+
1 row in set (0.00 sec)
mysql> insert into product values('a1', 'SomeName', 'SomeName', 200, 10, 10);
ERROR 1062 (23000): Duplicate entry 'a1' for key 'product.PRIMARY'
I have also tried setting a collation when creating the table, but did not get the required result.
Can anyone suggest which collation to use, or any other technique to make the column's values case-sensitive?

ALTER TABLE product
MODIFY COLUMN id VARCHAR(...) COLLATE ..._bin NOT NULL;
Where the ... are the current column size and character set.
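As a minimal sketch of the same idea with concrete values filled in: the table name product_demo, the size VARCHAR(20) and the utf8mb4 character set are assumptions for illustration only, not the asker's actual schema.

```sql
-- Hypothetical table; size and character set are assumed, not taken from the question.
CREATE TABLE product_demo (
  id   VARCHAR(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  name VARCHAR(50),
  PRIMARY KEY (id)
);

INSERT INTO product_demo VALUES ('A1', 'SomeName');
-- With a _bin collation 'a1' and 'A1' are different keys, so this is not a duplicate.
INSERT INTO product_demo VALUES ('a1', 'SomeName');

-- Comparisons on the column are now case-sensitive too: only the 'a1' row matches.
SELECT id FROM product_demo WHERE id = 'a1';
```

Because the collation is attached to the column, it governs both the uniqueness check of the primary key and ordinary comparisons in WHERE clauses.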
The only case-sensitive things I can think of are
base-64
Unix file names
But those do not seem likely as PKs. What is your use case? Most things are better off being case-insensitive.
(A Comment links to a SQL Server suggestion using ALTER DATABASE; that will not work for MySQL since that only changes the default for subsequently created tables.)

Related

What is the default select order in PostgreSQL or MySQL?

I have read in the PostgreSQL docs that without an ORDER BY clause, SELECT will return records in an unspecified order.
Recently in an interview, I was asked how to SELECT records in the order in which they were inserted, without a PK, created_at, or other field that can be used for ordering. The senior dev who interviewed me insisted that without an ORDER BY clause the records will be returned in the order in which they were inserted.
Is this true for PostgreSQL? Is it true for MySQL? Or any other RDBMS?
I can answer for MySQL. I don't know for PostgreSQL.
The default order is not the order of insertion, generally.
In the case of InnoDB, the default order depends on the order of the index read for the query. You can get this information from the EXPLAIN plan.
For MyISAM, it returns rows in the order they are read from the table. This might be the order of insertion, but MyISAM will reuse gaps after you delete records, so newer rows may be stored earlier.
None of this is guaranteed; it's just a side effect of the current implementation. MySQL could change the implementation in the next version, making the default order of result sets different, without violating any documented behavior.
So if you need the results in a specific order, you should use ORDER BY on your queries.
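A small sketch of the point about InnoDB and EXPLAIN. The table event_log and its index idx_label are hypothetical names, and which index the optimizer actually picks is not guaranteed.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE event_log (
  id    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  label VARCHAR(20) NOT NULL,
  KEY idx_label (label)
) ENGINE=InnoDB;

INSERT INTO event_log (label) VALUES ('zeta'), ('alpha'), ('mu');

-- The "key" column of the plan shows which index the query reads.
EXPLAIN SELECT label FROM event_log;

-- No ORDER BY: rows tend to come back in the order of whatever index was read,
-- which need not be the insertion order.
SELECT label FROM event_log;

-- With ORDER BY the order is defined: zeta, alpha, mu (ascending id).
SELECT label FROM event_log ORDER BY id;
```

Only the last query has a specified result order; the unordered one may change with the plan, the engine, or the server version.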
Following BK's answer, and by way of example...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table(id INT NOT NULL) ENGINE = MYISAM;
INSERT INTO my_table VALUES (1),(9),(5),(8),(7),(3),(2),(6);
DELETE FROM my_table WHERE id = 8;
INSERT INTO my_table VALUES (4),(8);
SELECT * FROM my_table;
+----+
| id |
+----+
| 1 |
| 9 |
| 5 |
| 4 | -- is this what
| 7 |
| 3 |
| 2 |
| 6 |
| 8 | -- we expect?
+----+
In the case of PostgreSQL, that is quite wrong.
If there are no deletes or updates, rows will be stored in the table in the order you insert them. And even though a sequential scan will usually return the rows in that order, that is not guaranteed: the synchronized sequential scan feature of PostgreSQL can have a sequential scan "piggy back" on an already executing one, so that rows are read starting somewhere in the middle of the table.
However, this ordering of the rows breaks down completely if you update or delete even a single row: the old version of the row will become obsolete, and (in the case of an UPDATE) the new version can end up somewhere entirely different in the table. The space for the old row version is eventually reclaimed by autovacuum and can be reused for a newly inserted row.
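One way to watch this happen in PostgreSQL is the ctid system column, which reports the physical location of each row version. This is only a sketch; the table visit is a hypothetical name.

```sql
-- PostgreSQL; hypothetical table for illustration only.
CREATE TABLE visit (id int, note text);
INSERT INTO visit VALUES (1, 'a'), (2, 'b'), (3, 'c');

-- The UPDATE writes a new row version; the old one becomes obsolete.
UPDATE visit SET note = 'changed' WHERE id = 1;

-- ctid is (block, offset). The updated row usually shows a new location,
-- and a sequential scan will typically list it after the untouched rows.
SELECT ctid, id, note FROM visit;

-- The only way to get a defined order is to ask for one.
SELECT id, note FROM visit ORDER BY id;
```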
Without an ORDER BY clause, the database is free to return rows in any order. There is no guarantee that rows will be returned in the order they were inserted.
With MySQL (InnoDB), we observe that rows are typically returned in the order of an index used in the execution plan, or in the order of the table's cluster key.
It is not difficult to craft an example...
CREATE TABLE foo
( id INT NOT NULL
, val VARCHAR(10) NOT NULL DEFAULT ''
, UNIQUE KEY (id,val)
) ENGINE=InnoDB;
INSERT INTO foo (id, val) VALUES (7,'seven') ;
INSERT INTO foo (id, val) VALUES (4,'four') ;
SELECT id, val FROM foo ;
MySQL is free to return rows in any order, but in this case, we would typically observe that MySQL will access rows through the InnoDB cluster key.
id val
---- -----
4 four
7 seven
It is not at all clear what point the interviewer was trying to make. If the interviewer is trying to sell the idea that, given a requirement to return rows from a table in the order the rows were inserted, a query without an ORDER BY clause is ever the right solution, I'm not buying it.
We can craft examples where rows are returned in the order they were inserted, but that is a byproduct of the implementation, ... not guaranteed behavior, and we should never rely on that behavior to satisfy a specification.

what are the changes in mysql 8 result rowset case?

when running
SELECT maxlen FROM `information_schema`.`CHARACTER_SETS`;
mysql 5.7 and mysql 8 produce different results:
on mysql 5.7 the result column names are lowercase,
on mysql 8 the result column names are uppercase.
NB: in the CHARACTER_SETS table, the column name is MAXLEN (uppercase).
Since I can't find a resource documenting it, my question is :
what are the changes in mysql 8 result rowset case ?
MySQL 8.0 did change the implementation of some views in the INFORMATION_SCHEMA:
https://mysqlserverteam.com/mysql-8-0-improvements-to-information_schema/ says:
Now that the metadata of all database tables is stored in transactional data dictionary tables, it enables us to design an INFORMATION_SCHEMA table as a database VIEW over the data dictionary tables. This eliminates costs such as the creation of temporary tables for each INFORMATION_SCHEMA query during execution on-the-fly, and also scanning file-system directories to find FRM files. It is also now possible to utilize the full power of the MySQL optimizer to prepare better query execution plans using indexes on data dictionary tables.
So it's being done for good reasons, but I understand that it has upset some of your queries when you fetch results in associative arrays based on column name.
You can see the definition of the view declares the column name explicitly in uppercase:
mysql 8.0.14> SHOW CREATE VIEW CHARACTER_SETS\G
*************************** 1. row ***************************
View: CHARACTER_SETS
Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`mysql.infoschema`@`localhost` SQL SECURITY DEFINER VIEW `CHARACTER_SETS` AS
select
`cs`.`name` AS `CHARACTER_SET_NAME`,
`col`.`name` AS `DEFAULT_COLLATE_NAME`,
`cs`.`comment` AS `DESCRIPTION`,
`cs`.`mb_max_length` AS `MAXLEN` -- delimited column explicitly uppercase
from (`mysql`.`character_sets` `cs`
join `mysql`.`collations` `col` on((`cs`.`default_collation_id` = `col`.`id`)))
character_set_client: utf8
collation_connection: utf8_general_ci
You can work around the change in a couple of ways:
You can declare your own column aliases in the case you want when you query a view:
mysql 8.0.14> SELECT MAXLEN AS `maxlen`
FROM `information_schema`.`CHARACTER_SETS` LIMIT 2;
+--------+
| maxlen |
+--------+
| 2 |
| 1 |
+--------+
You could start a habit of querying columns in uppercase prior to 8.0. Here's a test showing results in my 5.7 sandbox:
mysql 5.7.24> SELECT MAXLEN
FROM `information_schema`.`CHARACTER_SETS` LIMIT 2;
+--------+
| MAXLEN |
+--------+
| 2 |
| 1 |
+--------+
Or you could fetch results into a non-associative array, and reference columns by column number, instead of by name.
There is no change in case sensitivity. If you check mysql documentation on identifier case sensitivity, both v5.7 and v8.0 say that field names are case insensitive:
Column, index, stored routine, event, and resource group names are not case-sensitive on any platform, nor are column aliases.
To me this seems more like a display difference.
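A quick way to check this yourself is to mix the letter case of the column references in one query. This is a sketch only; the choice of columns and the LIMIT are arbitrary.

```sql
-- The same columns written in three different cases; identifier lookup
-- does not depend on case, so this resolves on both 5.7 and 8.0.
SELECT character_set_name, MaxLen
FROM information_schema.CHARACTER_SETS
WHERE maxlen > 1
ORDER BY MAXLEN DESC
LIMIT 3;
```

What differs between versions is only the label the server puts on the result column, which is why code that reads rows into an associative array keyed by column name notices the change.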

What's the purpose of varchar(0)

I recently encountered a problem caused by a typo in the database creation script, whereby a column in the database was created as varchar(0) instead of varchar(20).
I expected that I would have gotten an error for 0-length string field, but I didn't. What is the purpose of varchar(0) or char(0) as I wouldn't be able to store any data in this column anyway.
It's not allowed per the SQL-92 standard, but permitted in MySQL. From the MySQL manual:
MySQL permits you to create a column of type CHAR(0). This is useful primarily when you have to be compliant with old applications that depend on the existence of a column but that do not actually use its value. CHAR(0) is also quite nice when you need a column that can take only two values: A column that is defined as CHAR(0) NULL occupies only one bit and can take only the values NULL and '' (the empty string).
Just checked MySQL, it's true that it allows zero-length CHAR and VARCHAR.
It is not extremely useful, but I can think of one situation: you shrink a column to zero length when you no longer need it but don't want to break existing code that still writes something there. Anything you assign to a 0-length column is truncated and a warning is issued, but warnings are not errors, so nothing breaks. (This holds in non-strict SQL mode; under strict mode, which is the default since MySQL 5.7, such an insert fails with an error instead.)
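A sketch of that scenario follows. The table legacy_notes is a hypothetical name, and the session's SQL mode is cleared explicitly because under strict mode the insert is rejected rather than truncated.

```sql
-- Assumption: non-strict mode, set for this session only.
SET SESSION sql_mode = '';

-- Hypothetical table whose column is kept only so old code can still write to it.
CREATE TABLE legacy_notes (note VARCHAR(0));

-- The value is truncated to the empty string and a warning is raised.
INSERT INTO legacy_notes VALUES ('hello');
SHOW WARNINGS;

-- The stored value has length 0.
SELECT CHAR_LENGTH(note) FROM legacy_notes;
```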
As they're similar types, char and varchar, I'm going to venture to guess that the use-case of varchar(0) is the same as char(0).
From the documentation of String Types:
MySQL permits you to create a column of type CHAR(0). This is useful
primarily when you have to be compliant with old applications that
depend on the existence of a column but that do not actually use its
value. CHAR(0) is also quite nice when you need a column that can take
only two values: A column that is defined as CHAR(0) NULL occupies
only one bit and can take only the values NULL and '' (the empty
string).
It's useful in combination with a unique index if you want to mark one specific row in your table (for instance, because it serves as a default). The unique index ensures that only one row can hold the empty string, so all other rows have to be NULL and you can always retrieve the marked row by that column.
You can use it to store boolean values.
Look at this code:
mysql> create table chartest(a char(0));
Query OK, 0 rows affected (0.26 sec)
mysql> insert into chartest value(NULL);
Query OK, 1 row affected (0.01 sec)
mysql> insert into chartest value('');
Query OK, 1 row affected (0.00 sec)
mysql> select 'true' from chartest where a is null;
+------+
| true |
+------+
| true |
+------+
1 row in set (0.00 sec)
mysql> select 'false' from chartest where a is not null;
+-------+
| false |
+-------+
| false |
+-------+
1 row in set (0.00 sec)
We can use NULL to represent true and '' (empty string) to represent false!
According to MySQL reference manual, only NULL occupies one bit.

Mysql delete duplicates

I'm able to display duplicates in my table
table name reportingdetail and column name ReportingDetailID
SELECT DISTINCT ReportingDetailID from reportingdetail group by ReportingDetailID HAVING count(ReportingDetailID) > 1;
+-------------------+
| ReportingDetailID |
+-------------------+
| 664602311 |
+-------------------+
1 row in set (2.81 sec)
Does anyone know how I can go about deleting the duplicates and keeping only one record?
I tried the following
DELETE FROM reportingdetail USING reportingdetail, reportingdetail AS vtable WHERE (reportingdetailID > vtable.id) AND (reportingdetail.reportingdetailID=reportingdetailID);
But it just deleted everything and kept single duplicates records!
The quickest way (that I know of) to remove duplicates in MySQL is by adding an index.
E.g., assuming reportingdetailID is going to be the PK for that table:
mysql> ALTER IGNORE TABLE reportingdetail
-> ADD PRIMARY KEY (reportingdetailID);
From the documentation:
IGNORE is a MySQL extension to standard SQL. It controls how ALTER
TABLE works if there are duplicates on unique keys in the new table or
if warnings occur when strict mode is enabled. If IGNORE is not
specified, the copy is aborted and rolled back if duplicate-key errors
occur. If IGNORE is specified, only the first row is used of rows with
duplicates on a unique key. The other conflicting rows are deleted.
Incorrect values are truncated to the closest matching acceptable
value.
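Adding this index will both remove duplicates and prevent any future duplicates from being inserted. If you do not want the latter behavior, just drop the index after creating it. Note that the IGNORE clause of ALTER TABLE was deprecated in MySQL 5.6.17 and removed in MySQL 5.7.4, so this approach only works on older servers.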
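On servers where ALTER IGNORE TABLE is not available, a similar effect can be sketched with INSERT IGNORE into a copy of the table. The names reportingdetail_dedup and reportingdetail_old are hypothetical, and the sketch assumes the original table has no primary key yet.

```sql
-- Empty copy with the same column definitions.
CREATE TABLE reportingdetail_dedup LIKE reportingdetail;

-- Assumption: no primary key exists yet, so one can be added on the copy.
ALTER TABLE reportingdetail_dedup ADD PRIMARY KEY (ReportingDetailID);

-- Rows whose key already exists in the copy are skipped instead of raising an error.
INSERT IGNORE INTO reportingdetail_dedup
SELECT * FROM reportingdetail;

-- Swap the tables; the original is kept under a new name until you have checked the result.
RENAME TABLE reportingdetail TO reportingdetail_old,
             reportingdetail_dedup TO reportingdetail;
```

Which of several duplicate rows survives depends on the order in which the SELECT delivers them, so add an ORDER BY to the SELECT if that matters.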
The following MySQL commands create a temporary table and populate it with all columns, grouped by one column (the column that has duplicates) and ordered by the primary key ascending. The second command creates a real table from the temporary table. The third command drops the original table, and the last command renames the new table to the original table's name. Note that SELECT * combined with GROUP BY on a single column is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled, which is the default since MySQL 5.7.5.
That's a really fast solution. Here are the four commands:
CREATE TEMPORARY TABLE videos_temp AS SELECT * FROM videos GROUP by
title ORDER BY videoid ASC;
CREATE TABLE videos_temp2 AS SELECT * FROM videos_temp;
DROP TABLE videos;
ALTER TABLE videos_temp2 RENAME videos;
This should give you duplicate entries.
SELECT `ReportingDetailID`, COUNT(`ReportingDetailID`) AS Nummber_of_Occurrences FROM reportingdetail GROUP BY `ReportingDetailID` HAVING ( COUNT(`ReportingDetailID`) > 1 )

Are ids in mysql guaranteed not to be repeated, even if rows are deleted?

Of course if I put enough rows eventually there will be a repeat. But let's assume I choose a big enough id field.
I want to know if I can assume that the id uniquely identifies the row over time. And if the client sends me an id I want to be able to determine what row it refers or if the row was deleted (or if it is a fake id, but in that case I will not care telling, wrongly, that the row was deleted).
Please also consider the following: if I restart the database, or back up and restore it, will it continue creating ids where it left off, or might it decide to "fill in the holes"?
If you have a "int not null auto increment primary key" that you never reset, then yes IDs will not be reused.
However, this raises an interesting question - what happens if you happen to reuse an old ID (even though it won't happen by default in MySQL, but the human factor always counts) ?
If your database is properly normalized, cascaded and constrained your application should be able to handle the reuse of an ID.
Edit (since you edited your post, I'll flesh out my answer), about this quote: "And if the client sends me an id I want to be able to determine what row it refers or if the row was deleted". It's always possible to determine what row an ID belongs to if it's not deleted (kind of vital to be able to extract information out of your database).
However, if the row the id refers to is deleted, then it's not possible to determine what row it belonged to, since it's not there. If you need this, I would advise you to implement some type of auditing functionality, preferably with triggers.
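A minimal sketch of such trigger-based auditing is shown here. The tables item and item_deleted and the trigger name are hypothetical; a real audit table would probably record more than the id.

```sql
-- Hypothetical main table.
CREATE TABLE item (
  id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50)
);

-- Hypothetical audit table remembering which ids once existed.
CREATE TABLE item_deleted (
  id         INT NOT NULL PRIMARY KEY,
  deleted_at DATETIME NOT NULL
);

-- Single-statement trigger body, so no DELIMITER change is needed.
CREATE TRIGGER item_after_delete
AFTER DELETE ON item
FOR EACH ROW
  INSERT INTO item_deleted (id, deleted_at) VALUES (OLD.id, NOW());

-- An id sent by a client can then be classified:
-- found in item = live row, found in item_deleted = deleted, in neither = never existed.
SELECT EXISTS (SELECT 1 FROM item WHERE id = 42)         AS is_live,
       EXISTS (SELECT 1 FROM item_deleted WHERE id = 42) AS was_deleted;
```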
The current autoincrement value is preserved across backup/restores via an extra attribute attached to the table. You can see it in a dump just after the ENGINE=:
mysql> create table foo ( bar int(11) primary key auto_increment );
Query OK, 0 rows affected (0.06 sec)
mysql> insert into foo () values (), (), (), ();
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> show create table foo \G
*************************** 1. row ***************************
Table: foo
Create Table: CREATE TABLE `foo` (
`bar` int(11) NOT NULL auto_increment,
PRIMARY KEY (`bar`)
) ENGINE=MyISAM AUTO_INCREMENT=5 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)
You can reset it using "ALTER TABLE tablename AUTO_INCREMENT=", but it looks to me like the attribute value is ignored if it isn't more than the maximum existing id when you do an insert.
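Continuing with the foo table above (four rows, ids 1 to 4), the behaviour can be sketched like this. The values 100 and 1 are arbitrary choices for illustration.

```sql
-- Raising the counter takes effect: the next row gets id 100.
ALTER TABLE foo AUTO_INCREMENT = 100;
INSERT INTO foo () VALUES ();

-- Lowering it below the current maximum does not make ids repeat:
-- the next row gets 101, not 1.
ALTER TABLE foo AUTO_INCREMENT = 1;
INSERT INTO foo () VALUES ();

SELECT bar FROM foo ORDER BY bar;
```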
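Also note that the described behaviour of not reusing a previously used ID is valid for the MyISAM engine, but is not valid for all available engines. For example, InnoDB before MySQL 8.0 keeps the counter only in memory and recomputes it from the maximum existing value after a server restart, so the id of a deleted last row can be handed out again.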