What are specific features from Postgres that are not available in MySQL?
Are there some queries that you wouldn't be able to do as easily? Or are the differences mostly in how you store your data?
I would say that two of the largest differences are WITH queries and window functions: standard SQL features (from the SQL-99 standard) that are also available in other major SQL implementations (such as Oracle, DB2, SQL Server, ...), but not in MySQL before version 8.0, which added both.
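For readers who have not met these two features, here is a minimal sketch of what they look like. It runs the SQL through Python's built-in sqlite3 module purely as a convenient stand-in (window functions need SQLite 3.25 or newer); the `sales` table, its columns and the `rank_regions` helper are made up for illustration. The same statement shape should work in PostgreSQL and MySQL 8.0, though that is an assumption worth checking against your server.

```python
import sqlite3


def rank_regions(rows):
    """Total sales per region with a WITH query, then rank the totals with a window function."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table, used only for this illustration.
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales (region, amount) VALUES (?, ?)", rows)
    query = """
        WITH regional AS (              -- WITH query (common table expression)
            SELECT region, SUM(amount) AS total
            FROM sales
            GROUP BY region
        )
        SELECT region,
               total,
               RANK() OVER (ORDER BY total DESC) AS sales_rank   -- window function
        FROM regional
        ORDER BY sales_rank, region
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    demo_rows = [("north", 10), ("south", 30), ("north", 5), ("east", 30)]
    for row in rank_regions(demo_rows):
        print(row)
```

Without these features, the same result typically needs a derived table plus a self-join or user variables, which is the "not as easily" part of the question.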
There are many minor differences too, e.g.:
MySQL has some non-standard conveniences, such as INSERT IGNORE and REPLACE, which PostgreSQL lacks. PostgreSQL's stored procedures and triggers can be written in any of several languages (such as Python, Java, Perl, ...), whereas MySQL's (like DB2's) use the SQL:2003 standard syntax.
Also outside the standard, PostgreSQL has many unusual data types (including user-defined types and multi-dimensional arrays), while MySQL has unsigned integers.
What are specific features from Postgres that are not available in MySQL?
There are many, but most importantly: errors are raised when erroneous data is inserted, and clients can't disable the sanity checks, as opposed to MySQL:
mysql> create table foo (id tinyint not null check (id > 100));
Query OK, 0 rows affected (0.01 sec)
mysql> insert into foo values (null), ('abc'), (128), ('1');
Query OK, 4 rows affected, 3 warnings (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 3
mysql> select * from foo;
+-----+
| id |
+-----+
| 0 |
| 0 |
| 127 |
| 1 |
+-----+
4 rows in set (0.00 sec)
Are there some queries that you wouldn't be able to do as easily?
Complex queries with lots of joins: PostgreSQL's query optimizer is vastly better, and nested loops aren't the only available join algorithm. It can also flatten sub-queries in the FROM clause of the query, whereas the current stable releases of MySQL materialize them.
When running
SELECT maxlen FROM `information_schema`.`CHARACTER_SETS`;
MySQL 5.7 and MySQL 8 produce different results:
on MySQL 5.7 the column names in the result rows are lowercase,
on MySQL 8 the column names in the result rows are uppercase.
NB: in the CHARACTER_SETS table, the column name is MAXLEN (uppercase).
Since I can't find a resource documenting it, my question is:
what changed in MySQL 8 regarding the case of column names in result sets?
MySQL 8.0 did change the implementation of some views in the INFORMATION_SCHEMA:
https://mysqlserverteam.com/mysql-8-0-improvements-to-information_schema/ says:
Now that the metadata of all database tables is stored in transactional data dictionary tables, it enables us to design an INFORMATION_SCHEMA table as a database VIEW over the data dictionary tables. This eliminates costs such as the creation of temporary tables for each INFORMATION_SCHEMA query during execution on-the-fly, and also scanning file-system directories to find FRM files. It is also now possible to utilize the full power of the MySQL optimizer to prepare better query execution plans using indexes on data dictionary tables.
So it's being done for good reasons, but I understand that it has upset some of your queries when you fetch results in associative arrays based on column name.
You can see the definition of the view declares the column name explicitly in uppercase:
mysql 8.0.14> SHOW CREATE VIEW CHARACTER_SETS\G
*************************** 1. row ***************************
View: CHARACTER_SETS
Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`mysql.infoschema`@`localhost` SQL SECURITY DEFINER VIEW `CHARACTER_SETS` AS
select
`cs`.`name` AS `CHARACTER_SET_NAME`,
`col`.`name` AS `DEFAULT_COLLATE_NAME`,
`cs`.`comment` AS `DESCRIPTION`,
`cs`.`mb_max_length` AS `MAXLEN` -- delimited column explicitly uppercase
from (`mysql`.`character_sets` `cs`
join `mysql`.`collations` `col` on((`cs`.`default_collation_id` = `col`.`id`)))
character_set_client: utf8
collation_connection: utf8_general_ci
You can work around the change in a couple of ways:
You can declare your own column aliases in the case you want when you query a view:
mysql 8.0.14> SELECT MAXLEN AS `maxlen`
FROM `information_schema`.`CHARACTER_SETS` LIMIT 2;
+--------+
| maxlen |
+--------+
| 2 |
| 1 |
+--------+
You could start a habit of querying columns in uppercase prior to 8.0. Here's a test showing results in my 5.7 sandbox:
mysql 5.7.24> SELECT MAXLEN
FROM `information_schema`.`CHARACTER_SETS` LIMIT 2;
+--------+
| MAXLEN |
+--------+
| 2 |
| 1 |
+--------+
Or you could fetch results into a non-associative array, and reference columns by column number, instead of by name.
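If changing every query is impractical, a client-side variant of the same idea is to normalize the column names once, where rows are turned into associative arrays. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for a MySQL driver, the `character_sets` table here is a hypothetical copy rather than the real INFORMATION_SCHEMA view, and `fetch_lowercase_dicts` is a made-up helper name. It relies on `cursor.description`, which the Python DB-API (PEP 249) defines, so the approach should carry over to MySQL drivers that follow that API.

```python
import sqlite3


def fetch_lowercase_dicts(cursor):
    """Return all rows as dicts keyed by lowercased column names."""
    # cursor.description comes from the DB-API (PEP 249); item 0 of each entry is the column name.
    names = [column[0].lower() for column in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for information_schema.CHARACTER_SETS.
    conn.execute("CREATE TABLE character_sets (CHARACTER_SET_NAME TEXT, MAXLEN INTEGER)")
    conn.executemany(
        "INSERT INTO character_sets VALUES (?, ?)",
        [("big5", 2), ("dec8", 1)],
    )
    # Explicit uppercase aliases mimic what the 8.0 view returns.
    cursor = conn.execute(
        "SELECT CHARACTER_SET_NAME AS NAME, MAXLEN AS MAXLEN "
        "FROM character_sets ORDER BY NAME"
    )
    rows = fetch_lowercase_dicts(cursor)
    conn.close()
    return rows


if __name__ == "__main__":
    for row in demo():
        print(row["name"], row["maxlen"])
```

The application code can then keep using lowercase keys regardless of which server version produced the result set.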
There is no change in case sensitivity. If you check the MySQL documentation on identifier case sensitivity, both v5.7 and v8.0 say that column names are case-insensitive:
Column, index, stored routine, event, and resource group names are not case-sensitive on any platform, nor are column aliases.
To me this seems more like a display difference.
On my development server I have a column indexed with a cardinality of 200.
The table has about 6 million rows give or take and I have confirmed it is an identical row count on the production server.
However, the production server's index has a cardinality of 31938.
Both servers run MySQL 5.5; however, my dev server is Ubuntu Server 13.10 and the production server is Windows Server 2012.
Any ideas on what would cause such a difference in what should be the exact same data?
The data was loaded into the production server from a MySQL dump of the dev server.
EDIT: It's worth noting that I have queries that take about 15 minutes to run on my dev server but seem to run forever on the production server, due to what I believe to be these indexing issues. Different numbers of rows are being pulled within sub-queries.
MySQL checksums might help you verify that the tables are the same:
-- a table
create table test.t ( id int unsigned not null auto_increment primary key, r float );
-- some data ( 18000 rows or so )
insert into test.t (r) select rand() from mysql.user join mysql.user u2;
-- a duplicate
create table test.t2 select * from test.t;
-- introduce a difference somewhere in there
update test.t2 set r = 0 order by rand() limit 1;
-- and prove the tables are different easily:
mysql> checksum table test.t;
+--------+------------+
| Table | Checksum |
+--------+------------+
| test.t | 2272709826 |
+--------+------------+
1 row in set (0.00 sec)
mysql> checksum table test.t2
-> ;
+---------+-----------+
| Table | Checksum |
+---------+-----------+
| test.t2 | 312923301 |
+---------+-----------+
1 row in set (0.01 sec)
Beware that the checksum locks tables.
For more advanced functionality, the Percona Toolkit can both checksum and sync tables (though it's based on master/slave replication scenarios, so it might not be perfect for you).
Beyond checksumming, you might consider looking at REPAIR or OPTIMIZE.
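When the two tables live on different servers, the checksum comparison above can also be approximated on the client. The sketch below is not MySQL's CHECKSUM TABLE algorithm; it is a swapped-in technique that hashes every row in primary-key order with SHA-256. It uses Python's built-in sqlite3 module as a stand-in connection, and `table_digest` and the demo tables are hypothetical names. With a real MySQL driver, the digests are only comparable if both sides return the same Python types for each column, which is an assumption you would need to verify.

```python
import hashlib
import sqlite3


def table_digest(conn, table, key_column):
    """Hash every row of `table`, read in `key_column` order, into one hex digest."""
    digest = hashlib.sha256()
    # Identifiers cannot be bound as parameters, so only pass trusted names here.
    query = f'SELECT * FROM "{table}" ORDER BY "{key_column}"'
    for row in conn.execute(query):
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, r REAL)")
    conn.executemany("INSERT INTO t (r) VALUES (?)", [(i / 7.0,) for i in range(1000)])
    # A duplicate of the table, as in the transcript above.
    conn.execute("CREATE TABLE t2 AS SELECT * FROM t")
    same_before = table_digest(conn, "t", "id") == table_digest(conn, "t2", "id")
    # Introduce a single difference and compare again.
    conn.execute("UPDATE t2 SET r = -1.0 WHERE id = 500")
    differ_after = table_digest(conn, "t", "id") != table_digest(conn, "t2", "id")
    conn.close()
    return same_before, differ_after


if __name__ == "__main__":
    print(demo())
```

Reading six million rows this way is slow, so it is best suited to a one-off verification rather than routine monitoring.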
I recently encountered a problem caused by a typo in the database creation script, whereby a column in the database was created as varchar(0) instead of varchar(20).
I expected to get an error for a 0-length string field, but I didn't. What is the purpose of varchar(0) or char(0), given that I wouldn't be able to store any data in this column anyway?
It's not allowed per the SQL-92 standard, but permitted in MySQL. From the MySQL manual:
MySQL permits you to create a column of type CHAR(0). This is useful primarily when you have to be compliant with old applications that depend on the existence of a column but that do not actually use its value. CHAR(0) is also quite nice when you need a column that can take only two values: A column that is defined as CHAR(0) NULL occupies only one bit and can take only the values NULL and '' (the empty string).
Just checked MySQL, it's true that it allows zero-length CHAR and VARCHAR.
It is not extremely useful, but I can think of one situation: you shrink a column to length 0 when you no longer need it but don't want to break existing code that writes something there. Anything you assign to a 0-length column will be truncated and a warning issued, but warnings are not errors; they don't break anything.
As they're similar types, char and varchar, I'm going to venture to guess that the use-case of varchar(0) is the same as char(0).
From the documentation of String Types:
MySQL permits you to create a column of type CHAR(0). This is useful
primarily when you have to be compliant with old applications that
depend on the existence of a column but that do not actually use its
value. CHAR(0) is also quite nice when you need a column that can take
only two values: A column that is defined as CHAR(0) NULL occupies
only one bit and can take only the values NULL and '' (the empty
string).
It's useful in combination with a unique index if you want to mark one specific row in your table (for instance, because it serves as a default). The unique index ensures that all other rows have to be NULL, so you can always retrieve the marked row by that column.
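The marker-row idea can be sketched as follows. This is an illustration under stated assumptions, not MySQL code: it runs on Python's built-in sqlite3 module, which has no CHAR(0), so a CHECK constraint emulates "only NULL or the empty string". The `templates` table and its columns are hypothetical. What carries over is the behaviour of the unique index: any number of NULLs are allowed, but only one row can hold ''.

```python
import sqlite3


def demo():
    conn = sqlite3.connect(":memory:")
    # SQLite has no CHAR(0); the CHECK constraint emulates "only NULL or ''".
    conn.execute(
        """
        CREATE TABLE templates (
            name       TEXT NOT NULL,
            is_default TEXT CHECK (is_default IS NULL OR is_default = ''),
            UNIQUE (is_default)
        )
        """
    )
    # Many NULLs are fine under a unique index; exactly one row carries the marker ''.
    conn.executemany(
        "INSERT INTO templates (name, is_default) VALUES (?, ?)",
        [("plain", None), ("fancy", ""), ("compact", None)],
    )
    try:
        conn.execute("INSERT INTO templates (name, is_default) VALUES ('second', '')")
        second_marker_rejected = False
    except sqlite3.IntegrityError:
        # The unique index refuses a second marked row.
        second_marker_rejected = True
    default_name = conn.execute(
        "SELECT name FROM templates WHERE is_default IS NOT NULL"
    ).fetchone()[0]
    conn.close()
    return default_name, second_marker_rejected


if __name__ == "__main__":
    print(demo())
```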
You can use it to store boolean values.
Look at this code:
mysql> create table chartest(a char(0));
Query OK, 0 rows affected (0.26 sec)
mysql> insert into chartest value(NULL);
Query OK, 1 row affected (0.01 sec)
mysql> insert into chartest value('');
Query OK, 1 row affected (0.00 sec)
mysql> select 'true' from chartest where a is null;
+------+
| true |
+------+
| true |
+------+
1 row in set (0.00 sec)
mysql> select 'false' from chartest where a is not null;
+-------+
| false |
+-------+
| false |
+-------+
1 row in set (0.00 sec)
We can use NULL to represent true and '' (empty string) to represent false!
According to the MySQL reference manual, only NULL occupies one bit.
Earlier today, I asked for an easy way to store a version number for the SQL table layout you are using in SQLite, and got the suggestion to use PRAGMA user_version. As there is no such thing as a PRAGMA in MySQL, I was wondering how you would go about this in MySQL (other than creating a table named "META" with a column "DB-Scheme-Version").
Just to repeat what I said in the linked question: I'm not looking for a way to find out which version of MySQL is installed, but to save a version number that tells me which version of my MySQL schema I am using, without checking every table via a script.
I also saw this question, but it only allows me to version single tables. Is there something similar or, preferably, easier, for whole databases (since it would be no fun to query every single table separately)? Thanks in advance.
MySQL's SET GLOBAL would probably work, but I prefer a solution that does not reset itself every time the server reboots and does not require SUPER Privilege and / or access to the configuration file to use. To put it short: It should work with a standard MySQL-Database that you get when you rent a small webhosting package, not the ones you get if you rent a full server, as you tend to have more access to those.
There are a couple of choices, depending on the privileges that you have. The higher your privileges, the more “elegant” the solution.
The most direct route is to create a stored function, which requires the CREATE ROUTINE privilege. e.g.
mysql> CREATE FUNCTION `mydb`.DB_VERSION() RETURNS VARCHAR(15)
RETURN '1.2.7.2861';
Query OK, 0 rows affected (0.03 sec)
mysql> SELECT `mydb`.DB_VERSION();
+--------------+
| DB_VERSION() |
+--------------+
| 1.2.7.2861 |
+--------------+
1 row in set (0.06 sec)
If your privileges limit you to only creating tables, you can create a simple table and put the version in a default value:
mysql> CREATE TABLE `mydb`.`db_version` (
`version` varchar(15) not null default '1.2.7.2861');
Query OK, 0 rows affected (0.00 sec)
mysql> SHOW COLUMNS FROM `mydb`.`db_version`;
+---------+-------------+------+-----+------------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+-------------+------+-----+------------+-------+
| version | varchar(15) | NO | | 1.2.7.2861 | |
+---------+-------------+------+-----+------------+-------+
1 row in set (0.00 sec)
Setup:
mysql> create table test(id integer unsigned,s varchar(30));
Query OK, 0 rows affected (0.05 sec)
mysql> insert into test(id,s) value(1,'s');
Query OK, 1 row affected (0.00 sec)
mysql> insert into test(id,s) value(1,'tsr');
Query OK, 1 row affected (0.00 sec)
mysql> insert into test(id,s) value(1,'ts3r');
Query OK, 1 row affected (0.00 sec)
mysql> create index i_test_id on test(id);
Query OK, 3 rows affected (0.08 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> create index i_test_s on test(s);
Query OK, 3 rows affected (0.05 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> insert into test(id,s) value(21,'ts3r');
Query OK, 1 row affected (0.00 sec)
And then run this:
mysql> explain select * from test where id in (1) order by s desc;
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-----------------------------+
| 1 | SIMPLE | test | ref | i_test_id | i_test_id | 5 | const | 2 | Using where; Using filesort |
+----+-------------+-------+------+---------------+-----------+---------+-------+------+-----------------------------+
1 row in set (0.02 sec)
We can see that it uses filesort instead of the index on s, which will be slow when the selected result set is big. How can I optimize it?
Sometimes MySQL does not use an index, even if one is available. One circumstance under which this occurs is when the optimizer estimates that using the index would require MySQL to access a very large percentage of the rows in the table.
From: MySQL 5.1 Reference Manual: How MySQL Uses Indexes
The index on id is being used to identify the rows to return. Depending on the version of MySQL you are using, it may only allow the use of one index per table, and the optimizer has determined it is more efficient to use the index for filtering the rows rather than for ordering.
Create a clustered index on the column 'id'.
A clustered index means a physical sort. That way, I am guessing there won't be a filesort when this query is invoked.
But a table can have only one clustered index. Hence, if another column is the primary key of the table, you may not be able to create a clustered index on column 'id', because primary keys are clustered by default.
What version of MySQL are you on? Not until version 5 could MySQL use more than one index per table.
The choice of the indexes to use also depends on the size of the result set. With only two records returned in the result, it may not use the index anyway. For such small result sets, MySQL doesn't seem to mind sorting things manually.
However, what you could do to really help MySQL out, if this is a common query for you, is to add a compound index on (id, s). Basically, it's almost like you're creating another little table that is always sorted by id and then s, so no filesort would be required, and it would only need the one index, not two.
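The effect of a compound index on the sort step can be observed in a small experiment. The sketch below does not run on MySQL: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in, so it shows the general principle rather than what MySQL's optimizer will decide for your data. The index name `i_test_id_s` and the `query_plan` helper are made up for illustration, and `id IN (1)` is written as `id = ?` for simplicity.

```python
import sqlite3

QUERY = "SELECT * FROM test WHERE id = ? ORDER BY s DESC"


def query_plan(index_statements):
    """Create the test table with the given indexes and return the plan text for QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id INTEGER, s VARCHAR(30))")
    conn.executemany(
        "INSERT INTO test (id, s) VALUES (?, ?)",
        [(1, "s"), (1, "tsr"), (1, "ts3r"), (21, "ts3r")],
    )
    for statement in index_statements:
        conn.execute(statement)
    # The human-readable detail is the last column of each EXPLAIN QUERY PLAN row.
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,)).fetchall()
    conn.close()
    return " | ".join(str(row[-1]) for row in rows)


if __name__ == "__main__":
    # Index on id only: rows are found through the index, then sorted separately.
    print(query_plan(["CREATE INDEX i_test_id ON test (id)"]))
    # Compound index: rows for one id are already stored in s order, so no extra sort.
    print(query_plan(["CREATE INDEX i_test_id_s ON test (id, s)"]))
```

In SQLite the first plan includes a "USE TEMP B-TREE FOR ORDER BY" step, the counterpart of MySQL's "Using filesort", and the second does not. On MySQL, re-run EXPLAIN after adding the compound index to confirm that "Using filesort" has disappeared from the Extra column.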
The problem you are experiencing comes from the ORDER BY clause in your SQL statement. It causes MySQL to skip using an index for the ordering and do a full sort on s. The EXPLAIN output shows that MySQL has i_test_id as a possible index to choose from, and the key field shows that it has been chosen, but it must perform a sort on s as well. The optimizer has chosen not to use i_test_s as a possible index because it would be more costly in terms of performance. You can get around this issue by building composite indexes at the expense of disk space, or you can structure your query differently using UNIONs instead. I haven't tried it on your example, though.