MySQL INSERT Statement Inserting Too Many Rows

I'm trying to insert a row into a MySQL table if a row with the same nick already exists in that table. Here is the query I'm attempting:
INSERT IGNORE INTO gold_log (nick, amount, stream_online, modification_type, dt)
SELECT DISTINCT nick, 0, 0, 253, DATE_FORMAT(NOW(), '%Y-01-01 00:00:00')
FROM gold_log WHERE nick='PrestonConnors';
I get result:
Query OK, 2243 rows affected, 2 warnings (0.24 sec)
Records: 2243 Duplicates: 0 Warnings: 2
When I run the SELECT statement independently it only returns one result (which is what I expect):
mysql> SELECT DISTINCT nick, 0, 0, 253, DATE_FORMAT(NOW(), '%Y-01-01 00:00:00') FROM gold_log WHERE nick='PrestonConnors';
+----------------+---+---+-----+-----------------------------------------+
| nick | 0 | 0 | 253 | DATE_FORMAT(NOW(), '%Y-01-01 00:00:00') |
+----------------+---+---+-----+-----------------------------------------+
| PrestonConnors | 0 | 0 | 253 | 2015-01-01 00:00:00 |
+----------------+---+---+-----+-----------------------------------------+
1 row in set (0.01 sec)
Can someone help me rework my INSERT IGNORE INTO statement so that it inserts only one row into the table, and also explain what was wrong with my query?
Here is the layout of the table:
mysql> describe gold_log;
+-------------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+---------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| nick | char(25) | NO | PRI | NULL | |
| amount | decimal(10,4) | YES | MUL | NULL | |
| stream_online | tinyint(1) | NO | MUL | NULL | |
| modification_type | tinyint(3) unsigned | NO | MUL | NULL | |
| dt | datetime | NO | PRI | NULL | |
+-------------------+---------------------+------+-----+---------+----------------+
6 rows in set (0.00 sec)
And here are the warnings:
mysql> SHOW warnings;
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1592 | Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. INSERT IGNORE... SELECT is unsafe because the order in which rows are retrieved by the SELECT determines which (if any) rows are ignored. This order cannot be predicted and may differ on master and the slave. |
| Note | 1592 | Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. Statements writing to a table with an auto-increment column after selecting from another table are unsafe because the order in which rows are retrieved determines what (if any) rows will be written. This order cannot be predicted and may differ on master and the slave. |
+-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

Simply put, it is a bug in MySQL (I reproduced it in MySQL 5.5.16). After I upgraded to 5.6.24:
mysql> status
--------------
D:\xampp\mysql\bin\mysql.exe Ver 14.14 Distrib 5.6.24, for Win32 (x86)
It works properly.
See:
http://bugs.mysql.com/bug.php?id=58637
http://bugs.mysql.com/bug.php?id=72921

How to build index and query with time range and id sort?

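Here is the table data:
+----+----------+--------+
| id | time     | amount |
+----+----------+--------+
|  1 | 20221104 |     15 |
|  2 | 20221104 |     10 |
|  3 | 20221105 |      7 |
|  4 | 20221105 |     19 |
|  5 | 20221106 |     10 |
+----+----------+--------+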
Both id and time are ascending, but several rows can share the same time.
The table is very large, so we don't want to paginate with LIMIT ... OFFSET; we want to use the id as a cursor instead.
first query:
select * from t where time > xxx and time < yyy order by id asc limit 10;
get the biggest id zzz, then
next query:
select * from t where time > xxx and time < yyy and id > zzz order by id asc limit 10;
How should I build the index?
If I index id, the time range will cause a huge scan when the requested time is far from the start of the table.
And if I index time, seeking by id will not be effective.
The following index should be enough for both queries:
alter table t add index `time_id` (`time`,`id`);
Note: use proper DATE/DATETIME data types; it will save a lot of pain in the future.
The key here is the leftmost-prefix rule for composite indexes. Both queries start with a range condition,
so I suspect that an index on (a, b), here (time, id), cannot be used effectively for the second column,
because index use stops after the first range condition. It should be enough to create an index like this:
CREATE INDEX index_time ON t (`time`)
More can be referenced here:
https://www.ibm.com/docs/en/informix-servers/12.10?topic=indexes-use-composite
https://orangematter.solarwinds.com/2019/02/05/the-left-prefix-index-rule/
First, I agree with @ErgestBasha's suggestion.
If you follow the general performance rules:
CREATE TABLE t (id INT, time DATE, amount DEC(3,1));
Query OK, 0 rows affected (0.02 sec)
mysql> INSERT INTO t VALUES
-> (1, '2022-11-04', 15),
-> (2, '2022-11-04', 10),
-> (3, '2022-11-05', 7),
-> (4, '2022-11-05', 19),
-> (5, '2022-11-06', 10);
Query OK, 5 rows affected (0.00 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM t;
+------+------------+--------+
| id | time | amount |
+------+------------+--------+
| 1 | 2022-11-04 | 15.0 |
| 2 | 2022-11-04 | 10.0 |
| 3 | 2022-11-05 | 7.0 |
| 4 | 2022-11-05 | 19.0 |
| 5 | 2022-11-06 | 10.0 |
+------+------------+--------+
5 rows in set (0.00 sec)
mysql> ALTER TABLE t ADD INDEX idx_time_id (time,id);
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> SHOW INDEXES FROM t;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| t | 1 | idx_time_id | 1 | time | A | 3 | NULL | NULL | YES | BTREE | | | YES | NULL |
| t | 1 | idx_time_id | 2 | id | A | 5 | NULL | NULL | YES | BTREE | | | YES | NULL |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
2 rows in set (0.01 sec)
mysql> SELECT * FROM t WHERE time > '2022-11-04' AND time < '2022-11-06' AND id > 3 ORDER BY id ASC LIMIT 10;
+------+------------+--------+
| id | time | amount |
+------+------------+--------+
| 4 | 2022-11-05 | 19.0 |
+------+------------+--------+
1 row in set (0.00 sec)
mysql> EXPLAIN SELECT * FROM t WHERE time > '2022-11-04' AND time < '2022-11-06' AND id > 3 ORDER BY id ASC LIMIT 10;
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+---------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+---------------------------------------+
| 1 | SIMPLE | t | NULL | range | idx_time_id | idx_time_id | 4 | NULL | 2 | 33.33 | Using index condition; Using filesort |
+----+-------------+-------+------------+-------+---------------+-------------+---------+------+------+----------+---------------------------------------+
1 row in set, 1 warning (0.00 sec)
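As you can see, the query uses the idx_time_id index on (time, id) with the range access method, and "Using index condition" in the Extra column confirms that the index is used to filter rows. Note that Extra also shows "Using filesort": the ORDER BY id is not served by the index, so the matching rows are sorted in a separate step.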
See this for iterating through a compound key:
http://mysql.rjweb.org/doc.php/deletebig#iterating_through_a_compound_key
It cannot be done with two ANDs; it needs one AND and one OR.
See this for why OFFSET should be avoided when paginating
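A minimal sketch of that "one AND and one OR" iteration, assuming the page order is changed from id alone to (time, id) so that it matches the (time, id) index. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the helper name fetch_page, the page size, and the date bounds are made up for illustration, and MySQL's optimizer may plan the query differently from SQLite's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, time TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "2022-11-04", 15), (2, "2022-11-04", 10), (3, "2022-11-05", 7),
     (4, "2022-11-05", 19), (5, "2022-11-06", 10)],
)
conn.execute("CREATE INDEX idx_time_id ON t (time, id)")


def fetch_page(start, end, cursor=None, page_size=2):
    """Return the next page of rows with start < time < end, ordered by (time, id).

    cursor is the (time, id) pair of the last row of the previous page,
    or None for the first page.
    """
    sql = "SELECT id, time, amount FROM t WHERE time > ? AND time < ?"
    params = [start, end]
    if cursor is not None:
        last_time, last_id = cursor
        # Resume strictly after the last row seen: a later time, OR the
        # same time with a larger id (the one AND and one OR).
        sql += " AND (time > ? OR (time = ? AND id > ?))"
        params += [last_time, last_time, last_id]
    sql += " ORDER BY time, id LIMIT ?"
    params.append(page_size)
    return conn.execute(sql, params).fetchall()


# Walk the whole range page by page, carrying the cursor forward.
seen_ids = []
cursor = None
while True:
    page = fetch_page("2022-11-03", "2022-11-07", cursor)
    if not page:
        break
    seen_ids.extend(row[0] for row in page)
    last_row = page[-1]
    cursor = (last_row[1], last_row[0])  # (time, id) of the last row

print(seen_ids)
```

Because the cursor carries both columns, rows that share the same time are neither skipped nor repeated at a page boundary.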

MySQL bigint number - different output in SQL and R

I have stored a value as varchar and as bigint in a MySQL DB:
userID_as_varchar varchar(50) DEFAULT NULL,
userID_as_bigint bigint(20) DEFAULT NULL,
+--------------------+---------------------------+
| userID_as_varchar | userID_as_bigint |
+--------------------+---------------------------+
| 917876131364446205 | 917876131364446200 |
+--------------------+---------------------------+
For some reason, I can't retrieve userID_as_bigint in full precision with SQL, but I can with R.
Behaviour SQL:
If I query the data or cast it, I always get the "rounded" value.
Tested in phpMyAdmin and directly with sql command in shell.
Behaviour R:
If I query the field with R (RMySQL package), the value is complete: 917876131364446205
Can anyone explain this behaviour, or does anyone know a way to get the full value with SQL?
Best regards.
Not quite sure what you mean, here's a test:
create table test(t1 varchar(50), t2 bigint);
Query OK, 0 rows affected (0.03 sec)
mysql> desc test
-> ;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| t1 | varchar(50) | YES | | NULL | |
| t2 | bigint(20) | YES | | NULL | |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0.02 sec)
mysql> insert into test values('917876131364446205', 917876131364446205);
Query OK, 1 row affected (0.01 sec)
mysql> select * from test;
+--------------------+--------------------+
| t1 | t2 |
+--------------------+--------------------+
| 917876131364446205 | 917876131364446205 |
+--------------------+--------------------+
1 row in set (0.00 sec)
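The test above suggests that BIGINT itself stores all 18 digits. One thing that can produce exactly this kind of trailing-digit loss is a round trip through an IEEE 754 double somewhere between the application and the column; whether that happened in the question is an assumption, not something the thread establishes. The sketch below only shows the arithmetic, in Python:

```python
stored = 917876131364446205   # the value from the question

# A double has a 53-bit significand, so near 9.2e17 it can only
# represent multiples of 128; the original value is not one of them.
as_double = float(stored)
after_round_trip = int(as_double)

# The "rounded" value from the question parses to the very same double,
# so a client that formats a double with about 16 significant digits
# could display it.
rounded_from_question = 917876131364446200
same_double = float(rounded_from_question) == as_double

print(after_round_trip)  # 917876131364446208
print(same_double)       # True
```

If a double is involved, checking how the value was inserted and how the client converts BIGINT columns would be the places to look.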

MySQL "Column count doesn't match value count" but the count DOES match

MySQL is issuing this error when I try to execute a query where the column count does match. Here is the structure of the table:
mysql> desc S_3068;
+-------------------+----------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+----------------------+------+-----+---------+-------+
| SfmID | smallint(5) unsigned | NO | PRI | 1 | |
| DatValue | float | NO | | 0 | |
| DatRawValue | int(10) unsigned | NO | | 0 | |
| DatTime | int(10) unsigned | NO | PRI | 0 | |
| DatBusOrder | tinyint(3) unsigned | NO | PRI | 1 | |
| DatFormulaVersion | tinyint(3) unsigned | NO | | 0 | |
+-------------------+----------------------+------+-----+---------+-------+
6 rows in set (0.00 sec)
I get the aforementioned error when I execute this query:
mysql> insert ignore into S_3068 values (133, 15.82, 5542, 1339309260, 0, 1);
ERROR 1136 (21S01): Column count doesn't match value count at row 1
As you can see, the column count does match the value count. Now what's even more puzzling is that the query works perfectly fine with SfmID = 132:
mysql> insert ignore into S_3068 values (132, 15.82, 5542, 1339309260, 0, 1);
Query OK, 1 row affected (0.00 sec)
SfmID being an unsigned smallint, that doesn't make any sense to me.
Any help on this matter would be greatly appreciated.
EDIT: The error was caused by a trigger associated with the table. Please see comments for more information.
The error was caused by a trigger associated with the table, which did a side insert into another table for value 133 but not for value 132. The error issued by MySQL was about that other table (whose column count was indeed wrong) and not about the main table into which I was inserting data.

Getting max value from many tables

There are two ways that I can think of to obtain similar results from multiple tables. One is UNION and the other is JOIN. The similar questions on SO have all been answered with a UNION. Here's the code I just found:
SELECT max(up.id) AS up, max(sc.id) AS sc, max(cl.id) AS cl
FROM updates up, chat_staff sc, change_log cl
explain:
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
My question is -- Is this better than the following?
SELECT "up.id" AS K, max(id) AS V FROM updates
UNION
SELECT "sc.id" AS K, max(id) AS V FROM chat_staff
UNION
SELECT "cl.id" AS K, max(id) AS V FROM change_log
explain:
+----+--------------+--------------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+--------------+------+---------------+------+---------+------+------+------------------------------+
| 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| 2 | UNION | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| 3 | UNION | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| NULL | UNION RESULT | <union1,2,3> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+--------------+------+---------------+------+---------+------+------+------------------------------+
Both of those methods are just fine. In fact, I have another method:
SELECT
IFNULL(maxidup,0) max_id_up,
IFNULL(maxidsc,0) max_sc_up,
IFNULL(maxidcl,0) max_cl_up
FROM
(SELECT max(id) maxidup FROM updates) up,
(SELECT max(id) maxidsc FROM chat_staff) sc,
(SELECT max(id) maxidcl FROM change_log) cl
;
This method presents the three values side by side like your first example. It also shows 0 in the event that one of the tables is empty.
mysql> DROP DATABASE IF EXISTS junk;
Query OK, 3 rows affected (0.11 sec)
mysql> CREATE DATABASE junk;
Query OK, 1 row affected (0.00 sec)
mysql> use junk
Database changed
mysql> CREATE TABLE updates (id int not null auto_increment primary key,x int);
Query OK, 0 rows affected (0.07 sec)
mysql> CREATE TABLE chat_staff LIKE updates;
Query OK, 0 rows affected (0.07 sec)
mysql> CREATE TABLE change_log LIKE updates;
Query OK, 0 rows affected (0.06 sec)
mysql> INSERT INTO updates (x) VALUES (37),(84),(12);
Query OK, 3 rows affected (0.06 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> INSERT INTO change_log (x) VALUES (37),(84),(12),(14),(35);
Query OK, 5 rows affected (0.09 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> SELECT
-> IFNULL(maxidup,0) max_id_up,
-> IFNULL(maxidsc,0) max_sc_up,
-> IFNULL(maxidcl,0) max_cl_up
-> FROM
-> (SELECT max(id) maxidup FROM updates) up,
-> (SELECT max(id) maxidsc FROM chat_staff) sc,
-> (SELECT max(id) maxidcl FROM change_log) cl
-> ;
+-----------+-----------+-----------+
| max_id_up | max_sc_up | max_cl_up |
+-----------+-----------+-----------+
| 3 | 0 | 5 |
+-----------+-----------+-----------+
1 row in set (0.00 sec)
mysql> explain SELECT IFNULL(maxidup,0) max_id_up, IFNULL(maxidsc,0) max_sc_up, IFNULL(maxidcl,0) max_cl_up FROM (SELECT max(id) maxidup FROM updates) up, (SELECT max(id) maxidsc FROM chat_staff) sc, (SELECT max(id) maxidcl FROM change_log) cl;
+----+-------------+------------+--------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+------+---------+------+------+------------------------------+
| 1 | PRIMARY | <derived2> | system | NULL | NULL | NULL | NULL | 1 | |
| 1 | PRIMARY | <derived3> | system | NULL | NULL | NULL | NULL | 1 | |
| 1 | PRIMARY | <derived4> | system | NULL | NULL | NULL | NULL | 1 | |
| 4 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| 3 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No matching min/max row |
| 2 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
+----+-------------+------------+--------+---------------+------+---------+------+------+------------------------------+
6 rows in set (0.02 sec)
In my EXPLAIN plan, it has Select tables optimized away just like yours. Why?
Since id is indexed in all the tables, the index is used to retrieve the max(id) rather than the table. Thus, Select tables optimized away is the correct response.
Six of one, half dozen of the other. How you present data from there is strictly your personal preference.
UPDATE 2011-10-20 15:32 EDT
You commented : Do you know how table locking would compromise this? Let's say one of the tables in question is locked. Would this query lock the other two and keep 'em locked until the first one was freed up?
This would depend on the storage engine. If all tables in question are MyISAM, that is a definite possibility, since MyISAM takes a full table lock on INSERT, UPDATE, and DELETE. If the three tables are InnoDB, you have the benefit of MVCC to provide transaction isolation, which gives everyone their own point-in-time view of the data. Aside from DDL and an explicit LOCK TABLES against InnoDB, your query should not be blocked.
Actually, while they're similar, there's a subtle difference. The first gives you a one-row, three-column table (with the values going "across") and the second gives you a three-row, two-column table (with the values going "down").
Provided you're happy processing or viewing that data in either form, it's probably going to come down to performance.
In my experience (and this has nothing to do specifically with MySQL), the latter query will probably be better. That's because the DBMSs I work with are able to run queries like that in parallel for efficiency, combining the results once all have completed. The fact that they're on different tables means that lock contention between them will be zero.
It may be that the query analysis engine of a DBMS could do a similar optimisation for the first query but it would require a lot more intelligence than I've seen from most of them.
One quick point: if you use UNION ALL instead of plain UNION, you tell the database not to remove duplicate rows. You won't get any duplicates in this case anyway, because the K column is different for all three sub-queries, so UNION ALL skips a de-duplication step that is not needed.
But, as with all optimisations, measure, don't guess! Certainly don't take as gospel the rants of random internet roamers (yes, even me).
Put together various candidate tables with the properties you're likely to have in production, and compare the performance of each.
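To make the "across" versus "down" shapes concrete, here is a small sketch that builds the three tables from the test session above and runs both forms. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so it says nothing about MySQL's performance or locking; the derived-table column alias m is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("updates", "chat_staff", "change_log"):
    conn.execute(
        f"CREATE TABLE {name} (id INTEGER PRIMARY KEY AUTOINCREMENT, x INTEGER)"
    )
# chat_staff is deliberately left empty, as in the session above.
conn.executemany("INSERT INTO updates (x) VALUES (?)", [(37,), (84,), (12,)])
conn.executemany(
    "INSERT INTO change_log (x) VALUES (?)", [(37,), (84,), (12,), (14,), (35,)]
)

# One row, three columns: each derived table yields exactly one row,
# so joining them still gives a single row even when a table is empty.
across = conn.execute("""
    SELECT IFNULL(up.m, 0), IFNULL(sc.m, 0), IFNULL(cl.m, 0)
    FROM (SELECT MAX(id) AS m FROM updates) AS up,
         (SELECT MAX(id) AS m FROM chat_staff) AS sc,
         (SELECT MAX(id) AS m FROM change_log) AS cl
""").fetchall()

# Three rows, two columns: one (K, V) pair per table. UNION ALL is
# enough because the K labels already make every row distinct.
down = conn.execute("""
    SELECT 'up.id' AS K, MAX(id) AS V FROM updates
    UNION ALL
    SELECT 'sc.id', MAX(id) FROM chat_staff
    UNION ALL
    SELECT 'cl.id', MAX(id) FROM change_log
""").fetchall()

print(across)
print(dict(down))
```

In the "down" form the empty table shows up as a NULL value (None in Python) unless it is wrapped in IFNULL as well.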

Mysql, as 1 query, if row does not exist, do other query

For a preferences module I have "system defaults", and "user preferences".
If there is no personal/user preference stored, then use the system default values instead.
Here is my system preferences table:
mysql> desc rbl;
+-------------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+-------+
| id | varchar(3) | NO | PRI | | |
| rbl_url | varchar(100) | NO | | | |
| description | varchar(100) | NO | | | |
| is_default | tinyint(1) unsigned | YES | | 1 | |
+-------------+---------------------+------+-----+---------+-------+
4 rows in set (0.00 sec)
Example data from system prefs:
mysql> select * from rbl;
+----+----------------------+------------------------------+------------+
| id | rbl_url | description | is_default |
+----+----------------------+------------------------------+------------+
| 1 | sbl-xbl.spamhaus.org | Spamhaus SBL-XBL | 1 |
| 2 | pbl.spamhaus.org | Spamhaus PBL | 1 |
| 3 | bl.spamcop.net | Spamcop Blacklist | 1 |
| 4 | rbl.example.com | Example RBL - not functional | 0 |
+----+----------------------+------------------------------+------------+
... and Query for system defaults:
mysql> SELECT rbl_url FROM rbl WHERE is_default='1';
+----------------------+
| rbl_url |
+----------------------+
| sbl-xbl.spamhaus.org |
| pbl.spamhaus.org |
| bl.spamcop.net |
+----------------------+
3 rows in set (0.01 sec)
So far so good.
OK. Now I need a user preferences table, and I came up with this:
mysql> desc rbl_pref;
+-----------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| domain_id | mediumint(8) unsigned | NO | | NULL | |
| rbl_id | tinyint(1) unsigned | NO | | NULL | |
+-----------+-----------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
(FYI - A "user" is represented by "domain_id". )
Let's view the preferences of a specific user who has personalized preferences saved:
mysql> select * from rbl_pref where domain_id='2277';
+----+-----------+--------+
| id | domain_id | rbl_id |
+----+-----------+--------+
| 4 | 2277 | 1 |
| 5 | 2277 | 2 |
| 6 | 2277 | 4 |
+----+-----------+--------+
3 rows in set (0.00 sec)
... again, but in a simpler format:
mysql> SELECT rbl.rbl_url FROM rbl_pref,rbl
WHERE rbl_pref.rbl_id=rbl.id AND domain_id='2277';
+----------------------+
| rbl_url |
+----------------------+
| sbl-xbl.spamhaus.org |
| pbl.spamhaus.org |
| rbl.example.com |
+----------------------+
3 rows in set (0.00 sec)
.. so far so good. If a user has stored a preference, a result is found.
The problem example now is, user 1999 has no custom preferences.
In place of the "Empty set" result, I want the system defaults.
mysql> SELECT rbl.rbl_url FROM rbl_pref,rbl
WHERE rbl_pref.rbl_id=rbl.id AND domain_id='1999';
Empty set (0.00 sec)
I was excited to find a very similar question:
mysql if row doesn't exist, grab default value
However after a couple of days trial and error and documentation review, I could not translate that answer over to here.
Like the above question, this must be done as a single MySQL query. I am not actually making this query from PHP, but from Exim macros (and it is a very picky language... best to feed it "one liners" as variable assignments, as I try to do here.. )
UPDATE: I tried one type of UNION query suggested by @Biff McGriff, below. The table did not display in my comment reply, so here it is again:
mysql> SELECT rbl.rbl_url FROM rbl_pref,rbl
WHERE rbl_pref.rbl_id=rbl.id AND domain_id='2277'
UNION SELECT rbl_url FROM rbl WHERE is_default='1';
+----------------------+
| rbl_url |
+----------------------+
| sbl-xbl.spamhaus.org |
| pbl.spamhaus.org |
| rbl.example.com |
| bl.spamcop.net |
+----------------------+
4 rows in set (0.00 sec)
As you can see above, user 2277 did not opt in to rbl_id 3 (bl.spamcop.net), but that's showing up anyways.
What my UNION query seems to be doing is combining the result set. So user_pref acts as "in addition to" global defaults, and I was assuming/expecting I would get a result set matching either half of the query.
So my question now is: is it better (or even possible, and if so how) to solve this as "either result set" (the subquery on one side of the UNION or the other)? Or do I really need a new field on rbl_pref, called "enabled" for example? The latter seems more correct: I need something in rbl_pref that explicitly designates opt-in or opt-out, rather than the implicit "that pref is not here (no rbl_id=3) in the overridden user result set".
UPDATE: All set, thanks @Imre L and everyone else. I learned something through this example.
You should be able to use a left join and then coalesce the user's field with the default field.
NOTE: you have to enter the domain_id in two places.
SELECT rbl.rbl_url FROM rbl
JOIN rbl_pref ON rbl_pref.rbl_id=rbl.id AND domain_id=2277
UNION
SELECT rbl.rbl_url FROM rbl
WHERE rbl.is_default
AND NOT EXISTS (SELECT 1 FROM rbl_pref WHERE domain_id=2277 LIMIT 1)
;
Now one side or the other of the UNION will be optimized away with "Impossible WHERE".
You also should not use varchar(3) for rbl.id but some sort of integer, preferably the same type as rbl_pref.rbl_id (for which tinyint is too tiny).
And when you compare integer fields in SQL code, as in domain_id='2277', you should not put ' or " around integer constants.
You can mostly get away with it, but sometimes it may confuse the MySQL optimizer.
Also, for optimal performance and consistency, I suggest you add this index:
ALTER TABLE rbl_pref
ADD UNIQUE INDEX ux_domain_rbl (domain_id, rbl_id);
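As a quick check of the UNION / NOT EXISTS query above, the sketch below loads the sample rows from the question and runs the query for a user with saved preferences (2277) and one without (1999). It uses Python's built-in sqlite3 module as a stand-in for MySQL; the helper name urls_for is made up for illustration, the description column is left out for brevity, and rbl.id is declared as an integer as suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rbl (
        id INTEGER PRIMARY KEY,
        rbl_url TEXT NOT NULL,
        is_default INTEGER DEFAULT 1
    );
    CREATE TABLE rbl_pref (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        domain_id INTEGER NOT NULL,
        rbl_id INTEGER NOT NULL
    );
    INSERT INTO rbl VALUES
        (1, 'sbl-xbl.spamhaus.org', 1),
        (2, 'pbl.spamhaus.org', 1),
        (3, 'bl.spamcop.net', 1),
        (4, 'rbl.example.com', 0);
    INSERT INTO rbl_pref (domain_id, rbl_id) VALUES
        (2277, 1), (2277, 2), (2277, 4);
""")

# The domain id appears in both halves: the first half returns the
# user's own choices, the second returns the defaults only when the
# user has no rows in rbl_pref at all.
QUERY = """
    SELECT rbl.rbl_url FROM rbl
    JOIN rbl_pref ON rbl_pref.rbl_id = rbl.id AND rbl_pref.domain_id = :domain
    UNION
    SELECT rbl.rbl_url FROM rbl
    WHERE rbl.is_default
      AND NOT EXISTS (SELECT 1 FROM rbl_pref WHERE domain_id = :domain)
"""


def urls_for(domain_id):
    """Return the sorted list of RBL URLs that apply to one domain."""
    rows = conn.execute(QUERY, {"domain": domain_id}).fetchall()
    return sorted(url for (url,) in rows)


print(urls_for(2277))  # the user's saved preferences only
print(urls_for(1999))  # no saved preferences, so the system defaults
```

User 2277 no longer picks up bl.spamcop.net, because the defaults half contributes rows only when the NOT EXISTS test succeeds.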