Find MySQL row identified by number in a warning message - mysql

The MySQL "show warnings" output identifies problematic rows by number. What's the best way to quickly see all the data for such a row?
For example, after running an update statement the result indicates "1 warning" and running show warnings gives a message like this: "Data truncated for column 'person' at row 65278". How can I select exactly that row?
Here is a concrete example exploring the limit solution:
create table test1 (
id mediumint,
value varchar(2)
);
insert into test1 (id, value) values
(11, "a"),
(12, "b"),
(13, "c"),
(14, "d"),
(15, "ee"),
(16, "ff");
update test1 set value = concat(value, "X") where id % 2 = 1;
show warnings;
That results in this warning output:
+---------+------+--------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------+
| Warning | 1265 | Data truncated for column 'value' at row 5 |
+---------+------+--------------------------------------------+
To get just that row 5 I can do this:
select * from test1 limit 4,1;
resulting in this:
+------+-------+
| id | value |
+------+-------+
| 15 | ee |
+------+-------+
So it seems that the limit offset (4) must be one less than the row number, and the row number given in the warning is for the source table of the update without regard to the where clause.
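A minimal sketch of that off-by-one conversion, in Python. The helper name limit_query_for_warning is made up for illustration, and the sketch assumes the warning text always contains "at row N". Without an ORDER BY the row order is not guaranteed, so treat the generated query as a quick inspection aid only:

```python
import re


def limit_query_for_warning(message: str, table: str) -> str:
    """Build a SELECT for the row named in a MySQL warning message.

    Hypothetical helper: assumes the message contains "at row N".
    """
    match = re.search(r"at row (\d+)", message)
    if match is None:
        raise ValueError(f"no row number found in: {message!r}")
    row_number = int(match.group(1))
    # Warning row numbers start at 1, LIMIT offsets start at 0.
    return f"SELECT * FROM {table} LIMIT {row_number - 1}, 1"


print(limit_query_for_warning("Data truncated for column 'value' at row 5", "test1"))
```

For the warning shown above this produces the same limit 4,1 query that was run by hand.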

As far as I'm aware, the only way to select those rows is to just SELECT them using the criteria from your original UPDATE query:
mysql> UPDATE foo SET bar = "bar" WHERE baz = "baz";
mysql> SHOW WARNINGS;
...
Message: Data truncated for column 'X' at row 420
...
mysql> SELECT * FROM foo WHERE baz = "baz" LIMIT 419,1;
Obviously, this doesn't work if you've modified one or more of the columns that were part of your original query.

LIMIT x,y skips the first x rows and returns the next y rows, based on the order of the result set of your SELECT query. However, without an ORDER BY clause that order is not guaranteed, so you have no way to be sure of the position of the row(s) you're trying to get.
You might want to add an auto-increment column to the table, or perhaps a trigger that fires before each insert, and then order by that column so the LIMIT offset is deterministic.

Not to raise this question from the dead, but I'll add one more method of finding the source of warning data that can be helpful in certain cases.
If you are importing a complete dataset from one table into another and receive a truncation warning on a specific field you can run a query joining the two tables on an ID value and then filter by records where the field in question doesn't match. Obviously this only will work if you are importing from a separate table and still have access to the unmodified source table.
So if the field in question is testfield and your import query looks like this:
INSERT INTO newtable (
id,
field1,
field2,
testfield
)
SELECT
id,
field1,
field2,
testfield
FROM oldtable;
The diagnostic query could look something like this:
SELECT newtable.testfield, oldtable.testfield
FROM newtable
INNER JOIN oldtable ON newtable.id = oldtable.id
WHERE newtable.testfield != oldtable.testfield;
This has the added advantage that the order of records in either table doesn't matter.
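Here is a small runnable sketch of the same comparison. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL, and because SQLite does not truncate on insert, substr() imitates a narrower column in newtable. The sample rows are invented for the demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE oldtable (id INTEGER PRIMARY KEY, testfield TEXT);
    CREATE TABLE newtable (id INTEGER PRIMARY KEY, testfield TEXT);
    INSERT INTO oldtable VALUES (1, 'ab'), (2, 'abcdef'), (3, 'xy');
    -- Stand-in for truncation into a VARCHAR(4) column.
    INSERT INTO newtable SELECT id, substr(testfield, 1, 4) FROM oldtable;
""")

# Join on id and keep only the rows whose values no longer match.
mismatches = conn.execute("""
    SELECT newtable.id, newtable.testfield, oldtable.testfield
    FROM newtable
    INNER JOIN oldtable ON newtable.id = oldtable.id
    WHERE newtable.testfield != oldtable.testfield
""").fetchall()
print(mismatches)
```

Only the row whose value was cut short comes back, regardless of where it sits in either table.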

Related

mysql query message save into table or external file

I have multiple stored procedures that do the ETL work in MySQL. Normally they run on the server overnight.
Inside the stored procedures there are multiple update statements like
update table1 set column1=3 where column2 = 4
Is there any way I can keep the MySQL Workbench result, like
Rows matched: 100 Changed: 50 Warnings: 0
for each statement I run, either in a MySQL table or in an external file?
I'd prefer a native MySQL method. If there isn't one, is there any Python I could use?
"Rows changed" can be retrieved with the ROW_COUNT() function.
"Rows matched" requires a trick that uses a user-defined variable.
CREATE TABLE test (id INT, val INT);
INSERT INTO test VALUES
(1,1), (1,2), (1,3), (2,4);
Now we want to perform UPDATE test SET val = 3 WHERE id = 1; and count the amounts.
UPDATE test
-- add user-defined variable for matched rows counting
CROSS JOIN ( SELECT @matched := 0 ) init_variable
-- increment matched rows counter (this expression is always TRUE)
SET val = CASE WHEN @matched := @matched + 1
-- update the column
THEN 3
END
WHERE id = 1;
SELECT @matched matched, ROW_COUNT() changed;
matched | changed
------: | ------:
3 | 2
If the query updates more than one column, attach the counter increment to only one of the expressions; otherwise each matched row would be counted more than once.
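On the Python side of the question, here is a hedged sketch that only covers turning the summary text into numbers you can log. It assumes you already have the "Rows matched: ... Changed: ... Warnings: ..." string in hand (the MySQL C API exposes it through mysql_info(); how your Python driver surfaces it, if at all, is something to check in its documentation). The function name parse_update_info is made up:

```python
import re

# Format of the summary line shown in the question.
INFO_PATTERN = re.compile(r"Rows matched: (\d+)\s+Changed: (\d+)\s+Warnings: (\d+)")


def parse_update_info(info: str) -> dict:
    """Split an UPDATE summary line into integer counters."""
    match = INFO_PATTERN.search(info)
    if match is None:
        raise ValueError(f"unrecognised summary line: {info!r}")
    matched, changed, warnings = (int(group) for group in match.groups())
    return {"matched": matched, "changed": changed, "warnings": warnings}


print(parse_update_info("Rows matched: 100 Changed: 50 Warnings: 0"))
```

The resulting dict can then be written to a log table or appended to a file after each statement.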

MYSQL retrieve old values when updating row by INSERT/UPDATE ON DUPLICATE KEY

I need to update a table daily by its primary key, inserting new rows when needed, so the INSERT ... ON DUPLICATE KEY UPDATE statement allows me to do it in a single query.
I'd also like to issue a text report listing the values that changed, but the only way I can think of involves 2 separate queries for each row: first a SELECT and then an UPDATE or INSERT.
(I could also SELECT and save the whole table in memory by a single query before starting with the updates, but the size of the table doesn't allow it)
Is there a way to SELECT and retrieve the old values, and UPDATE/INSERT in a single query?
Here's a solution but it's awfully hacky:
mysql> create table mytable (id int primary key, x int );
mysql> insert into mytable values (1, 42);
mysql> insert into mytable values (1, 47)
on duplicate key update x = case x when @old_x:=x then values(x) end;
Query OK, 2 rows affected (0.02 sec)
mysql> select * from mytable;
+----+------+
| id | x |
+----+------+
| 1 | 47 |
+----+------+
mysql> select @old_x;
+--------+
| @old_x |
+--------+
| 42 |
+--------+
This solution relies on a side-effect of using := to set a user-defined variable. After the query is done, the user-defined variable retains its value during the current session.

Inconsistent/Strange behavior of MySQL when using the group by clause and count() function

Due to a bug(?) in MySQL the COUNT() function along with the GROUP BY clause can cause MySQL to leak out db details like the following -
mysql> select count(*), floor(rand()*2)x from users group by x;
ERROR 1062 (23000): Duplicate entry '1' for key 'group_key'
Sensitive details can be revealed here with a well crafted query. This
is unexpected behavior, maybe a bug?
mysql> select count(*), floor(rand()*2)x from users group by x;
+----------+---+
| count(*) | x |
+----------+---+
| 8 | 0 |
| 5 | 1 |
+----------+---+
2 rows in set (0.00 sec) <-- Sometimes the query runs without any errors(Expected behavior)
Does anyone know what exactly causes this MySQL error?
The test bed that I am using is this excellent resource - https://github.com/Audi-1/sqli-labs
This looks to be a reported (and old!) bug: http://bugs.mysql.com/bug.php?id=58081
Description: A GROUP BY query returns this error under certain
circumstances:
Duplicate entry '107374182410737418241' for key 'group_key'
'group_key' is not a real column name in the table. It looks like a
name for the grouping column in the temporary table.
How to repeat: set names latin1; drop table if exists t1; create table
t1(a int) engine=myisam; insert into t1 values (0),(0),(1),(0),(0);
select count(*) from t1, t1 t2 group by insert('', t2.a,
t1.a,(@@global.max_binlog_size));
ERROR 1062 (23000): Duplicate entry '107374182410737418241' for key
'group_key'
Comments indicate a suggested work around is to increase the available heap and temp table size:
The workaround i found is to increase the size of the tmp_table:
SET SESSION max_heap_table_size=536870912; SET SESSION
tmp_table_size=536870912;
now my request work !
Or to check your available disk space

Return default value for non-existing rows

To the very basic query
SELECT id, column1, column2
FROM table1
WHERE id IN ("id1", "id2", "id3")
in which the arguments in the WHERE clause are passed as a variable, I need to also return values for rows where the id doesn't exist. In general, this is very similar to the problem outlined here: SQL: How to return an non-existing row? However, here there are multiple parameters in the WHERE clause.
The result right now when id2 doesn't exist:
-------------------------------
| id | column1 | column2 |
-------------------------------
| id1 | some text | some text |
| id3 | some text | some text |
-------------------------------
Desired outcome when id2 doesn't exist
-----------------------------------
| id | column1 | column2 |
-----------------------------------
| id1 | some text | some text |
| id2 | placeholder | placeholder |
| id3 | some text | some text |
-----------------------------------
My first thought was to create a temporary table and join it against the query. Unfortunately, I don't have the rights to create any kind of temporary table so that I am limited to a SELECT statement.
Is there a way to do that with a SQL SELECT query?
Edit:
Indeed, the above is a hypothetical situation. The WHERE clause can contain hundreds of ids, and the number of missing ones is unknown.
You can do a derived table to create something like a temp table, but it can only be used for this one query:
SELECT _dflt.id, COALESCE(t.column1, _dflt.column1) AS column1
FROM (
SELECT 'id1' AS id, 'placeholder text 1' as column1
UNION ALL
SELECT 'id2', 'placeholder text 2'
UNION ALL
SELECT 'id3', 'placeholder text 3'
) AS _dflt
LEFT OUTER JOIN table1 t USING (id);
Re comments:
I just tested the method above on MySQL 5.6.15 to see how many distinct SELECTs I can get with a series of UNION ALLs, one row per SELECT.
I got the derived table to return 5332 rows, but I think I could go higher if I had more RAM.
If I try one more UNION ALL, I get: ERROR 1064 (42000): memory exhausted near '' at line 10665. I only have 2.0GB of RAM configured on this VM.
It doesn't matter how many ids are unknown for this solution to work. Just put them all in the derived table. By using LEFT OUTER JOIN, it automatically finds those that exist in your table1, and for the ones that are missing, the entry from the derived table will be matched up with NULLs.
The COALESCE() function returns its first non-null argument, so it'll use columns from the matched rows if those are present. Where none is found, it'll default to the columns in the derived table.
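With hundreds of ids you would probably generate that derived table from your program. Below is a rough sketch of one way to do it in Python. The function build_default_query is hypothetical, it uses a single 'placeholder' literal instead of per-id placeholder text, and it defaults to the %s parameter marker that the common MySQL drivers use as far as I know. The demonstration runs against Python's built-in sqlite3 (marker ?) only so that the example is self-contained:

```python
import sqlite3


def build_default_query(ids, placeholder="%s"):
    """Return (sql, params): a derived table of ids LEFT JOINed to table1."""
    if not ids:
        raise ValueError("at least one id is required")
    # Only the first SELECT needs the column alias.
    selects = [f"SELECT {placeholder} AS id"]
    selects += [f"SELECT {placeholder}"] * (len(ids) - 1)
    derived = " UNION ALL ".join(selects)
    sql = (
        "SELECT _dflt.id, COALESCE(t.column1, 'placeholder') AS column1 "
        f"FROM ({derived}) AS _dflt "
        "LEFT OUTER JOIN table1 t ON t.id = _dflt.id"
    )
    return sql, list(ids)


# Stand-in database: id2 is deliberately missing from table1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id TEXT PRIMARY KEY, column1 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("id1", "some text"), ("id3", "some text")],
)
sql, params = build_default_query(["id1", "id2", "id3"], placeholder="?")
rows = conn.execute(sql + " ORDER BY _dflt.id", params).fetchall()
print(rows)
```

Passing the ids as parameters rather than pasting them into the SQL text also avoids quoting problems when the values come from outside.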
Create a stored procedure that would take as input id1, id2 and so on...
DELIMITER //
CREATE PROCEDURE P1(IN p_in varchar(5))
BEGIN
DECLARE count integer;
SELECT count(id) INTO count FROM table1
WHERE id = p_in;
IF count = 1 THEN
SELECT * from table1 where id = p_in;
ELSE
select p_in, 'some text', 'some text';
END IF;
END//
DELIMITER ;
Then call the procedure to get the desired output:
CALL P1('id1');
CALL P1('id2');
... and so on from your program.
Project a derived table containing all the candidate ids you want, then left join to it:
select ids.id, coalesce(table1.column1,'placeholder')
From
(Select 'id1' as id
Union
Select 'id2'
Union
Select 'id3') ids
left join table1
on ids.id = table1.id
and table1.id in (...);
If you are producing the list of candidate ids from an external source (e.g. an application), you could insert the ids into a temporary table and then join to it (MySQL doesn't support table variables yet).

A multitude of the same id in an WHERE id IN () statement

I have a simple query that increases the value of a field by 1.
I used to loop over all the ids and fire a query for each of them, but now that things are getting a bit resource-heavy I want to optimize this. Normally I would just do
UPDATE table SET field = field + 1 WHERE id IN (all the ids here)
but now I have the problem that some ids occur twice (or more; I can't know that beforehand).
Is there a way to have the update applied twice for id 4 if the query looks like this:
UPDATE table SET field = field + 1 WHERE id IN (1, 2, 3, 4, 4, 5)
Thanks,
lordstyx
Edit: sorry for not being clear enough.
The id here is an auto-increment field, so all the ids are unique. The ids that have to be updated come indirectly from users, so I can't predict which id is going to occur how often.
If the ids are (1, 2, 3, 4, 4, 5), I need the field of the row with id 4 to be incremented by 2, and all the rest by 1.
If (1, 2, 3, 4, 4, 5) comes from a SELECT id ... query, then you can do something like this:
UPDATE yourTable
JOIN
( SELECT id
, COUNT(id) AS counter
....
GROUP BY id
) AS data
ON yourTable.id = data.id
SET yourTable.field = yourTable.field + data.counter
;
Since the input comes from users, perhaps you can manipulate it a bit. Change (1, 2, 3, 4, 4, 5) to (1), (2), (3), (4), (4), (5).
Then (having created a temporary table):
CREATE TABLE TempUpdate
( id INT )
;
Do the following procedure:
add the values in the temporary table,
run the update and
delete the values.
Code:
INSERT INTO TempUpdate
VALUES (1), (2), (3), (4), (4), (5)
;
UPDATE yourTable
JOIN
( SELECT id
, COUNT(id) AS counter
FROM TempUpdate
GROUP BY id
) AS data
ON yourTable.id = data.id
SET yourTable.field = yourTable.field + data.counter
;
DELETE FROM TempUpdate
;
No. But you could perform something like
UPDATE table
SET field = field + (LENGTH(',1,2,3,4,4,5,') - LENGTH(REPLACE(',1,2,3,4,4,5,', CONCAT(',', id, ','), ''))) / LENGTH(CONCAT(',', id, ','))
WHERE id IN (1, 2, 3, 4, 4, 5)
if you need row with id = 4 specifically to be incremented twice
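One caveat with this counting trick: when the same id appears twice in a row, the two occurrences share a comma, and a left-to-right, non-overlapping replace only removes one of them. The sketch below mirrors the arithmetic in Python, whose str.replace is non-overlapping as well. I believe MySQL's REPLACE() behaves the same way, but check on your own server. Doubling every comma first is one possible workaround, not something from the original suggestion:

```python
def count_by_replace(id_list: str, item: str) -> int:
    """Mirror (LENGTH(s) - LENGTH(REPLACE(s, needle, ''))) / LENGTH(needle)."""
    needle = f",{item},"
    return (len(id_list) - len(id_list.replace(needle, ""))) // len(needle)


ids = ",1,2,3,4,4,5,"

# The adjacent 4s share one comma, so only one match is removed.
naive = count_by_replace(ids, "4")

# Doubling the commas gives every id its own pair of delimiters.
doubled = count_by_replace(ids.replace(",", ",,"), "4")

print(naive, doubled)  # 1 2
```

In SQL the same workaround would mean wrapping the list literal in REPLACE(list, ',', ',,') before counting.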
Here is solution you wanted, but I'm not sure this is what you need.
Let's say that your table is called test. You want to increase id. I've added a field idwas to easily show what the id was before the query:
CREATE TABLE `test` (
`id` int(10) unsigned NOT NULL auto_increment,
`idwas` int(8) unsigned default NULL,
PRIMARY KEY (`id`)
) ;
Let's fill it with data:
truncate table test;
insert into test(id) VALUES(1),(3),(15);
update test set idwas = id;
Now let's say that you have user input 1,3,5,3, so:
id 1 should be increased by 1
id 3 should be increased by 2
id 5 is missing, nothing to increase.
row with id 15 should not be changed because not in user input
We'll put the user input in a variable to be easier to use it:
SET @userInput = '1,3,5,3';
then do the magic:
SET @helperTable = CONCAT(
'SELECT us.id, count(us.id) as i FROM ',
'(SELECT ',REPLACE(@userInput, ',',' AS `id` UNION ALL SELECT '),
') AS us GROUP BY us.id');
SET @stmtText = CONCAT(
' UPDATE ',
'(',@helperTable,') AS h INNER JOIN test as t ON t.id = h.id',
' SET t.id = t.id + h.i');
PREPARE stmt FROM @stmtText;
EXECUTE stmt;
And this is the result:
mysql> SELECT * FROM test;
+----+-------+
| id | idwas |
+----+-------+
| 2 | 1 |
| 5 | 3 |
| 15 | 15 |
+----+-------+
3 rows in set (0.00 sec)
If it's reasonable, you could try doing a combination of what you had before and what you have now.
In whatever is creating this list, separate it into (depending on the language's constructs) some type of array. Then sort it, count how many times each id occurs, and do whatever else you need to get an array of the form (increment amount => list of ids), so that you run one query per increment amount. Thus, your example becomes
UPDATE table SET field = field + 1 WHERE id IN (1, 2, 3, 5)
UPDATE table SET field = field + 2 WHERE id IN (4)
In PHP, for example, I would take the array, sort it, use its contents as the keys for another array of the form (id => count), and then fold that over into the (count => list of ids) array.
It's not that efficient, but is definitely better than one query per id. It's also probably better than using iteration and string manipulation in SQL. Unless you're forced to use SQL to do everything (which it sounds like you're not), I wouldn't use it to do everything, when it's overly awkward to do so.
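The same folding can be sketched in Python instead of PHP. The function name grouped_updates is invented, the %s markers assume a driver that uses that parameter style, and the backticks are there only because table is a reserved word when used as a literal table name:

```python
from collections import Counter, defaultdict


def grouped_updates(ids):
    """Return one (sql, params) pair per distinct increment amount."""
    # id => count, then fold into count => sorted list of ids.
    by_increment = defaultdict(list)
    for id_, count in sorted(Counter(ids).items()):
        by_increment[count].append(id_)

    statements = []
    for increment, id_group in sorted(by_increment.items()):
        placeholders = ", ".join(["%s"] * len(id_group))
        sql = f"UPDATE `table` SET field = field + %s WHERE id IN ({placeholders})"
        statements.append((sql, [increment, *id_group]))
    return statements


for sql, params in grouped_updates([1, 2, 3, 4, 4, 5]):
    print(sql, params)
```

For the example list this yields two statements: one adding 1 to ids 1, 2, 3 and 5, and one adding 2 to id 4.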
You could use the following:
create temporary table temp1 (id integer);
insert into temp1 (id) values (1),(2),(3),(4),(4),(5);
update your_table set your_field = your_field + (select count(*) from temp1 where id = your_table.id)
This solution requires you to format the id list like (1),(2),(3),(4),(4),(5) but I don't think that is a problem, right?
This worked on my test database, hope it works for you too!
Regards,
Arthur