How to investigate MySQL errors - mysql

Below is a query I found to display my errors and warnings in MySQL:
SELECT
`DIGEST_TEXT` AS `query`,
`SCHEMA_NAME` AS `db`,
`COUNT_STAR` AS `exec_count`,
`SUM_ERRORS` AS `errors`,
(ifnull((`SUM_ERRORS` / nullif(`COUNT_STAR`,0)),0) * 100) AS `error_pct`,
`SUM_WARNINGS` AS `warnings`,
(ifnull((`SUM_WARNINGS` / nullif(`COUNT_STAR`,0)),0) * 100) AS `warning_pct`,
`FIRST_SEEN` AS `first_seen`,
`LAST_SEEN` AS `last_seen`,
`DIGEST` AS `digest`
FROM
performance_schema.events_statements_summary_by_digest
WHERE
((`SUM_ERRORS` > 0) OR (`SUM_WARNINGS` > 0))
ORDER BY
`SUM_ERRORS` DESC,
`SUM_WARNINGS` DESC;
Is there some way to drill down into performance_schema to find the exact error message that is associated with the errors or warnings above?
I was also curious what it means if the db column or query column shows up as NULL. Below are a few specific examples of what I'm talking about:
+-------------------------------------------+------+------------+--------+----------+--------+
| query                                     | db   | exec_count | errors | warnings | digest |
+-------------------------------------------+------+------------+--------+----------+--------+
| SHOW MASTER LOGS                          | NULL |        192 |    192 |        0 | ...    |
| NULL                                      | NULL |     553477 |     64 |    18783 | NULL   |
| SELECT COUNT ( * ) FROM `mysql` . `user`  | NULL |         48 |     47 |        0 | ...    |
+-------------------------------------------+------+------------+--------+----------+--------+
I am also open to using a different query that someone may have to display these errors/warnings.

The message will be in the performance_schema.events_statements_history.message_text column. You do need to make sure that the performance_schema_events_statements_history_size config variable is set to a positive and sufficiently large value, and that the history collection is enabled. To enable the history collection, run:
update performance_schema.setup_consumers set enabled='YES'
where name='events_statements_history';
To check if it is enabled:
select * from performance_schema.setup_consumers where
name='events_statements_history';
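And to check the configured history size (this variable is not dynamic, so if it is 0 or too small it has to be raised in the server configuration and the server restarted):
SHOW VARIABLES LIKE 'performance_schema_events_statements_history_size';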
A NULL value of db means there was no active database selected. Note that the active database does not have to be the same as the database of the table involved in a query. The active database is used as the default when it is not explicitly specified in a query.
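Your third example row fits this pattern: a client (for instance a monitoring tool) can connect without selecting any default database and still run a fully qualified query, for example:
-- connection made with no database argument and no USE statement;
-- the query works, but SCHEMA_NAME is recorded as NULL for the digest
SELECT COUNT(*) FROM mysql.user;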
This will only give you error messages, not warning messages. From a brief look at the code it appears that the warning text does not get logged anywhere - which is understandable given that one statement could produce millions of them. So you have a couple of options:
Extract the statement from events_statements_history.sql_text, re-execute it, and then run SHOW WARNINGS
Extract the statement, track it down in your application code, then instrument your code to log the output of SHOW WARNINGS in hopes of catching it live if manual execution fails to reproduce the warnings
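For the errors themselves, once the history consumer is enabled and a failing statement runs again, a query along these lines should surface the recorded messages (all standard performance_schema columns; trim the list to taste):
SELECT THREAD_ID, EVENT_ID, CURRENT_SCHEMA, SQL_TEXT, MYSQL_ERRNO, MESSAGE_TEXT
FROM performance_schema.events_statements_history
WHERE ERRORS > 0;
You can also filter on the DIGEST column to match a specific row from your summary query.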

Related

Monitoring progress of a SQL script with many UPDATEs in MariaDB

I'm running a script with several million update statements like this:
UPDATE data SET value = 0.9234 WHERE fId = 47616 AND modDate = '2018-09-24' AND valueDate = '2007-09-01' AND last_updated < '2018-10-01';
fId, modDate and valueDate are the 3 components of the data table's composite primary key.
I initially ran this with AUTOCOMMIT=1 but I figured it would speed up if I set AUTOCOMMIT=0 and wrapped the transactions into blocks of 25.
In autocommit mode, I used SHOW PROCESSLIST and I'd see the UPDATE statement in the output, so from the fId foreign key, I could tell how far the script had progressed.
However without autocommit, watching it running now, I haven't seen anything with SHOW PROCESSLIST, just this:
610257 schema_owner_2 201.177.12.57:53673 mydb Sleep 0 NULL 0.000
611020 schema_owner_1 201.177.12.57:58904 mydb Query 0 init show processlist 0.000
The Sleep status makes me paranoid that other users on the system are blocking the updates, but if I run SHOW OPEN TABLES I'm not sure whether there's a problem:
MariaDB [mydb]> SHOW OPEN TABLES;
+----------+----------------+--------+-------------+
| Database | Table          | In_use | Name_locked |
+----------+----------------+--------+-------------+
| mydb     | data           |      2 |           0 |
| mydb     | forecast       |      1 |           0 |
| mydb     | modification   |      0 |           0 |
| mydb     | data3          |      0 |           0 |
+----------+----------------+--------+-------------+
Is my script going to wait forever? Should I go back to using autocommit mode? Is there any way to see how far it's progressed? I guess I can inspect the data for the updates but that would be laborious to put together.
Check for progress by actually checking the data.
I assume you are doing COMMIT?
It is reasonable to see nothing -- each UPDATE will take very few milliseconds; there will be Sleep time between UPDATEs.
Time being 0 is your clue that it is progressing.
There won't necessarily be any clue in the PROCESSLIST of how far it has gotten.
You could add SELECT SLEEP(1), $fid; in the loop, where $fid (or whatever) is the last UPDATEd row id. That would slow down the progress by 1 second per 25 rows, so maybe you should do groups of 100 or 500.
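To make that concrete, here is a rough sketch of the batch shape I mean (the batch boundaries and literal values are just the ones from your own UPDATE; adjust as needed):
SET autocommit = 0;
UPDATE data SET value = 0.9234
  WHERE fId = 47616 AND modDate = '2018-09-24'
  AND valueDate = '2007-09-01' AND last_updated < '2018-10-01';
-- ... the remaining UPDATEs of this batch ...
COMMIT;
-- Progress marker: visible in SHOW PROCESSLIST for about one second.
SELECT SLEEP(1), 47616 AS last_fid;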

Updating to every number after column value, up to 9?

So I created a database table in MySQL that holds permission rights for permissions and commands. The command rights start with the prefix command_ in the permission_name column, and I have an extra column called allowed_ranks, which is a list of INT rank IDs that are required, separated by a , character.
The issue is that the command permissions should allow the given rank and anything higher, but I've only put one ID in allowed_ranks. Is there a way I can loop through all the rows whose permission_name starts with command_ and change allowed_ranks from that single ID to every number from that ID up to 9? 9 is the maximum rank ID.
I've already done part of the query; I'm just not sure how to do the updating:
UPDATE `permission_rights` SET `allowed_ranks` = '?' WHERE `permission_name` LIKE 'command_%';
How would I update it to every number from the column's value up to 9? Let's say I had these records... just a quick example to ensure you know what I mean:
| permission_name | allowed_ids |
---------------------------------
| command_hello   | 2           |
| command_junk    | 5           |
| command_delete  | 8           |
| command_update  | 1           |
Would become...
| permission_name | allowed_ids       |
---------------------------------------
| command_hello   | 2,3,4,5,6,7,8,9   |
| command_junk    | 5,6,7,8,9         |
| command_delete  | 8,9               |
| command_update  | 1,2,3,4,5,6,7,8,9 |
The better approach would be to use a number generator (some method which produces numbers from 1 to n), but stock MySQL has no such built-in capability.
If you use MariaDB you can use seq_1_to_1000, as suggested in the answer by O.Jones.
However, your use case seems to be simpler. Since you said that the highest rank is 9, I would just use:
update permission_rights a
set a.allowed_ids = RIGHT('1,2,3,4,5,6,7,8,9', 19 - 2*a.allowed_ids)
where a.permission_name like 'command_%';
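If the maximum ever changes, here is a rough MariaDB-only sketch that builds the list with the Sequence engine instead (assumes MariaDB 10.1+ where the seq_1_to_9 virtual table is available; table and column names are the ones from the question):
update permission_rights p
join (
    select p2.permission_name,
           group_concat(s.seq order by s.seq) as new_ids
    from permission_rights p2
    join seq_1_to_9 s on s.seq >= p2.allowed_ids  -- one row per rank from the current value up to 9
    where p2.permission_name like 'command_%'
    group by p2.permission_name
) x on x.permission_name = p.permission_name
set p.allowed_ids = x.new_ids;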

MySQL performance for version 5.7 vs. 5.6

I have noticed a particular performance issue that I am unsure on how to deal with.
I am in the process of migrating a web application from one server to another with very similar specifications. The new server typically outperforms the old server to be clear.
The old server is running MySQL 5.6.35
The new server is running MySQL 5.7.17
Both the new and old server have virtually identical MySQL configurations.
Both the new and old server are running the exact same database perfectly duplicated.
The web application in question is Magento 1.9.3.2.
In Magento, the following function
Mage_Catalog_Model_Category::getChildrenCategories()
is intended to list all the immediate children categories given a certain category.
In my case, this function bubbles down eventually to this query:
SELECT `main_table`.`entity_id`
, main_table.`name`
, main_table.`path`
, `main_table`.`is_active`
, `main_table`.`is_anchor`
, `url_rewrite`.`request_path`
FROM `catalog_category_flat_store_1` AS `main_table`
LEFT JOIN `core_url_rewrite` AS `url_rewrite`
ON url_rewrite.category_id=main_table.entity_id
AND url_rewrite.is_system=1
AND url_rewrite.store_id = 1
AND url_rewrite.id_path LIKE 'category/%'
WHERE (main_table.include_in_menu = '1')
AND (main_table.is_active = '1')
AND (main_table.path LIKE '1/494/%')
AND (`level` <= 2)
ORDER BY `main_table`.`position` ASC;
While the structure of this query is the same for any Magento installation, there will obviously be slight differences in values from installation to installation, and in which category the function is looking at.
My catalog_category_flat_store_1 table has 214 rows.
My url_rewrite table has 1,734,316 rows.
This query, when executed on its own directly into MySQL performs very differently between MySQL versions.
I am using SQLyog to profile this query.
In MySQL 5.6, the above query performs in 0.04 seconds. The profile for this query looks like this: https://codepen.io/Petce/full/JNKEpy/
In MySQL 5.7, the above query performs in 1.952 seconds. The profile for this query looks like this: https://codepen.io/Petce/full/gWMgKZ/
As you can see, the same query on almost the exact same setup is virtually 2 seconds slower, and I am unsure as to why.
For some reason, MySQL 5.7 does not want to use the table index to help produce the result set.
Anyone out there with more experience/knowledge can explain what is going on here and how to go about fixing it?
I believe the issue has something to do with the way the MySQL 5.7 optimizer works. For some reason, it appears to think that a full table scan is the way to go. I can drastically improve the query performance by setting max_seeks_for_key very low (like 100) or dropping range_optimizer_max_mem_size very low to force it to throw a warning.
Doing either of these increases the query speed by almost 10x, down to 0.2 sec; however, this is still orders of magnitude slower than MySQL 5.6, which executes in 0.04 seconds, and I don't think either of these is a good idea as I'm not sure if there would be other implications.
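For reference, session-level settings along these lines are what I mean (the exact values are only examples):
SET SESSION max_seeks_for_key = 100;
SET SESSION range_optimizer_max_mem_size = 16384;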
It is also very difficult to modify the query as it is generated by the Magento framework and would require customisation of the Magento codebase, which I'd like to avoid. I'm also not even sure if it is the only query that is affected.
I have included the minor versions for my MySQL installations. I am now attempting to update MySQL 5.7.17 to 5.7.18 (the latest build) to see if there is any update to the performance.
After upgrading to MySQL 5.7.18 I saw no improvement. In order to bring the system back to a stable high performing state, we decided to downgrade back to MySQL 5.6.30. After doing the downgrade we saw an instant improvement.
The above query executed in MySQL 5.6.30 on the NEW server executed in 0.036 seconds.
Wow! This is the first time I have seen something useful from Profiling. Dynamically creating an index is a new Optimization feature from Oracle. But it looks like that was not the best plan for this case.
First, I will recommend that you file a bug at http://bugs.mysql.com -- they don't like to have regressions, especially this egregious. If possible, provide EXPLAIN FORMAT=JSON SELECT... and "Optimizer trace". (I do not accept tweaking obscure tunables as an acceptable answer, but thanks for discovering them.)
Back to helping you...
If you don't need LEFT, don't use it. It returns NULLs when there are no matching rows in the 'right' table; will that happen in your case?
Please provide SHOW CREATE TABLE. Meanwhile, I will guess that you don't have INDEX(include_in_menu, is_active, path). The first two can be in either order; path needs to be last.
And INDEX(category_id, is_system, store_id, id_path) with id_path last.
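In DDL terms, something like this (index names are arbitrary; add prefix lengths if your string columns are very long):
ALTER TABLE catalog_category_flat_store_1
  ADD INDEX idx_menu_active_path (include_in_menu, is_active, path);
ALTER TABLE core_url_rewrite
  ADD INDEX idx_cat_sys_store_idpath (category_id, is_system, store_id, id_path);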
Your query seems to have a pattern that works well for turning into a subquery:
(Note: this even preserves the semantics of LEFT.)
SELECT `main_table`.`entity_id` , main_table.`name` , main_table.`path` ,
`main_table`.`is_active` , `main_table`.`is_anchor` ,
( SELECT `request_path`
FROM core_url_rewrite AS url_rewrite
WHERE url_rewrite.category_id=main_table.entity_id
AND url_rewrite.is_system = 1
AND url_rewrite.store_id = 1
AND url_rewrite.id_path LIKE 'category/%'
) as request_path
FROM `catalog_category_flat_store_1` AS `main_table`
WHERE (main_table.include_in_menu = '1')
AND (main_table.is_active = '1')
AND (main_table.path like '1/494/%')
AND (`level` <= 2)
ORDER BY `main_table`.`position` ASC
LIMIT 0, 1000
(The suggested indexes apply here, too.)
This is not an answer, only a comment for @Nigel Ren.
Here you can see that LIKE also uses an index.
mysql> SELECT *
-> FROM testdb
-> WHERE
-> vals LIKE 'text%';
+----+---------------------------------------+
| id | vals |
+----+---------------------------------------+
| 3 | text for line number 3 |
| 1 | textline 1 we rqwe rq wer qwer q wer |
| 2 | textline 2 asdf asd fas f asf wer 3 |
+----+---------------------------------------+
3 rows in set (0,00 sec)
mysql> EXPLAIN
-> SELECT *
-> FROM testdb
-> WHERE
-> vals LIKE 'text%';
+----+-------------+--------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | testdb | NULL | range | vals | vals | 515 | NULL | 3 | 100.00 | Using where; Using index |
+----+-------------+--------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0,01 sec)
mysql>
sample with LEFT()
mysql> SELECT *
-> FROM testdb
-> WHERE
-> LEFT(vals,4) = 'text';
+----+---------------------------------------+
| id | vals |
+----+---------------------------------------+
| 3 | text for line number 3 |
| 1 | textline 1 we rqwe rq wer qwer q wer |
| 2 | textline 2 asdf asd fas f asf wer 3 |
+----+---------------------------------------+
3 rows in set (0,01 sec)
mysql> EXPLAIN
-> SELECT *
-> FROM testdb
-> WHERE
-> LEFT(vals,4) = 'text';
+----+-------------+--------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+--------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | testdb | NULL | index | NULL | vals | 515 | NULL | 5 | 100.00 | Using where; Using index |
+----+-------------+--------+------------+-------+---------------+------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0,01 sec)
mysql>

MySQL - Select all geometries inside a rectangle

I have a table AREAGEOMETRY with the following structure:
+-----------------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+----------+------+-----+---------+-------+
| AREAGEOMETRY_ID | int(11) | NO | PRI | NULL | |
| AreaManagerId | int(11) | YES | | NULL | |
| AreaId | text | YES | | NULL | |
| EndDateArea | datetime | YES | | NULL | |
| StartDateArea | datetime | YES | | NULL | |
| AreaGeometryTxt | text | YES | | NULL | |
+-----------------+----------+------+-----+---------+-------+
It contains data from parking zones. Now what I am trying to do is selecting all rows within a bounding box.
A bounding box may be the following:
LatLngBounds{southwest=lat/lng: (52.35631327204287,4.881156384944916), northeast=lat/lng: (52.38006384519922,4.913054890930653)}
I came up with the following query for this:
SELECT * FROM AREAGEOMETRY WHERE ST_OVERLAPS(GeomFromText(AreaGeometryTxt), GeomFromText('LINESTRING(52.35631327204287 4.881156384944916, 52.38006384519922 4.881156384944916, 52.38006384519922 4.913054890930653, 52.35631327204287 4.913054890930653, 52.35631327204287 4.881156384944916)'))
However, it seems to return all the rows in the table, and I have no idea what is going wrong here.
Maybe someone could point me in the right direction.
Edit:
For example it returns this row:
# AREAGEOMETRY_ID, AreaManagerId, AreaId, EndDateArea, StartDateArea, AreaGeometryTxt
493, 299, 8721, 0000-00-00 00:00:00, 0000-00-00 00:00:00, POLYGON ((6.071624141 51.927465383, 6.071167939 51.927755315, 6.073816653 51.928513734, 6.07434586 51.928376592, 6.072239751 51.927748706, 6.072269225 51.927414931, 6.071624141 51.927465383))
Which it shouldn't.
Edit2:
Some possible viewports and their results:
Viewport 1:
LatLngBounds{southwest=lat/lng: (52.367693923958065,6.981273405253887), northeast=lat/lng: (52.3812037840295,6.99942022562027)}
Query:
SELECT * FROM AREAGEOMETRY WHERE ST_CONTAINS(GeomFromText('LINESTRING(52.367693923958065 6.981273405253887, 52.3812037840295 6.981273405253887, 52.3812037840295 6.99942022562027, 52.367693923958065 6.99942022562027, 52.367693923958065 6.981273405253887)'), GeomFromText(AreaGeometryTxt));
Expected Result: 0 rows returned
Actual Result: 0 rows returned
Viewport 2:
LatLngBounds{southwest=lat/lng: (52.20765248996001,6.881230026483536), northeast=lat/lng: (52.23028692988024,6.911527253687382)}
Query:
SELECT * FROM AREAGEOMETRY WHERE ST_CONTAINS(GeomFromText('LINESTRING(52.20765248996001 6.881230026483536, 52.23028692988024 6.881230026483536, 52.23028692988024 6.911527253687382, 52.20765248996001 6.911527253687382, 52.20765248996001 6.881230026483536)'), GeomFromText(AreaGeometryTxt));
Expected Result: About 25 rows returned
Actual Result: 0 rows returned
Another query same viewport:
SELECT * FROM AREAGEOMETRY WHERE ST_Overlaps(GeomFromText('LINESTRING(52.20765248996001 6.881230026483536, 52.23028692988024 6.881230026483536, 52.23028692988024 6.911527253687382, 52.20765248996001 6.911527253687382, 52.20765248996001 6.881230026483536)'), GeomFromText(AreaGeometryTxt));
Expected Result: About 25 rows returned
Actual Result: 1000 rows returned (limited by MySQL Workbench)
Edit 3:
The following query returns exactly what I want:
SELECT * FROM AREAGEOMETRY WHERE ST_Intersects(GeomFromText('Polygon((6.881230026483536 52.20765248996001, 6.881230026483536 52.23028692988024, 6.911527253687382 52.20765248996001, 6.881230026483536 52.20765248996001))'), GeomFromText(AreaGeometryTxt));
Seems like I mixed up Lat/Lng and had the parameters in the wrong order.
You should use Contains or Intersects, depending on whether you want objects on the border to be included, or whether you want full containment.
However, your main issue is that you have the geometries the wrong way round. If you look at the Contains documentation you will see that Contains(g1, g2) returns 1 if g1 contains g2, so you will want to put your bounding box first.
SELECT * FROM AREAGEOMETRY WHERE ST_CONTAINS(ST_GeomFromText('POLYGON((52.35631327204287 4.881156384944916, 52.38006384519922 4.881156384944916, 52.38006384519922 4.913054890930653, 52.35631327204287 4.913054890930653, 52.35631327204287 4.881156384944916))'), ST_GeomFromText(AreaGeometryTxt));
You might also want to consider storing AreaGeometryTxt as a geometry, rather than text, as this will give you two advantages:
You can then put a spatial index on it, which will lead to much faster query times as table size grows.
You will avoid the overhead of the GeomFromText conversion on each query, which in conjunction with point 1, will prevent doing a full table scan each time.
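A rough sketch of that migration (column and index names are only examples; assumes MySQL 5.7+ and that every row has a non-NULL AreaGeometryTxt):
ALTER TABLE AREAGEOMETRY ADD COLUMN AreaGeometry GEOMETRY NULL;
UPDATE AREAGEOMETRY SET AreaGeometry = ST_GeomFromText(AreaGeometryTxt);
ALTER TABLE AREAGEOMETRY MODIFY AreaGeometry GEOMETRY NOT NULL;  -- spatial indexes require NOT NULL
ALTER TABLE AREAGEOMETRY ADD SPATIAL INDEX sp_areageometry (AreaGeometry);
-- Queries can then filter on the indexed column directly, e.g. with a bounding box:
SELECT * FROM AREAGEOMETRY
WHERE ST_Contains(ST_GeomFromText('POLYGON((4.881156384944916 52.35631327204287, 4.881156384944916 52.38006384519922, 4.913054890930653 52.38006384519922, 4.913054890930653 52.35631327204287, 4.881156384944916 52.35631327204287))'), AreaGeometry);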
EDIT: I ran the following query, using the row you say should not be returned, and your original lat/lon bounding box:
select ST_Overlaps(ST_GeomFromText('POLYGON ((6.071624141 51.927465383, 6.071167939 51.927755315, 6.073816653 51.928513734, 6.07434586 51.928376592, 6.072239751 51.927748706, 6.072269225 51.927414931, 6.071624141 51.927465383))'),
ST_GeomFromText('POLYGON ((4.881156384944916 52.35631327204287, 4.881156384944916 52.38006384519922, 4.913054890930653 52.38006384519922, 4.913054890930653 52.35631327204287, 4.881156384944916 52.35631327204287))'));
This query returned 0 (false) for overlaps, intersects and contains as it should.

Fastest way to count the rows in any database table?

We assume that there is no primary key defined for a table T. In that case, how does one count all the rows in T quickly/efficiently for these databases - Oracle 11g, MySql, Mssql ?
It seems that count(*) can be slow and count(column_name) can be inaccurate. The following seems to be the fastest and most reliable way to do it:
select count(rowid) from MySchema.TableInMySchema;
Can you tell me if the above statement also has any shortcomings? If it is good, then do we have similar statements for mysql and mssql?
Thanks in advance.
Source -
http://www.thewellroundedgeek.com/2007/09/most-people-use-oracle-count-function.html
count(column_name) is not inaccurate, it's simply something completely different than count(*).
The SQL standard defines count(column_name) as equivalent to count(*) where column_name IS NOT NULL. So the result is bound to be different if column_name is nullable.
In Oracle (and possibly other DBMS as well), count(*) will use an available index on a not null column to count the rows (e.g. the PK index). So it will be just as fast.
Additionally there is nothing similar to the rowid in SQL Server or MySQL (in PostgreSQL it would be ctid).
Do use count(*). It's the best option to get the row count. Let the DBMS do any optimization in the background if adequate indexes are available.
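In other words, for all three databases you listed the portable statement is simply (table name taken from your own example):
SELECT COUNT(*) FROM MySchema.TableInMySchema;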
Edit
A quick demo on how Oracle automatically uses an index if available and how that reduces the amount of work done by the database:
The setup of the test table:
create table foo (id integer not null, c1 varchar(2000), c2 varchar(2000));
insert into foo (id, c1, c2)
select lvl, c1, c1 from
(
select level as lvl, dbms_random.string('A', 2000) as c1
from dual
connect by level < 10000
);
That generates just under 10,000 rows, with each row filling up some space in order to make sure the table has a realistic size.
Now in SQL*Plus I run the following:
SQL> set autotrace traceonly explain statistics;
SQL> select count(*) from foo;
Execution Plan
----------------------------------------------------------
Plan hash value: 1342139204
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2740 (1)| 00:00:33 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| FOO | 9999 | 2740 (1)| 00:00:33 |
-------------------------------------------------------------------
Statistics
----------------------------------------------------------
181 recursive calls
0 db block gets
10130 consistent gets
0 physical reads
0 redo size
430 bytes sent via SQL*Net to client
420 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
As you can see, a full table scan is done on the table, which requires 10130 "IO operations" (I know that is not the right term, but for the sake of the demo it should be a good enough explanation for someone who has never seen this before).
Now I create an index on that column and run the count(*) again:
SQL> create index i1 on foo (id);
Index created.
SQL> select count(*) from foo;
Execution Plan
----------------------------------------------------------
Plan hash value: 129980005
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | INDEX FAST FULL SCAN| I1 | 9999 | 7 (0)| 00:00:01 |
----------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
27 consistent gets
21 physical reads
0 redo size
430 bytes sent via SQL*Net to client
420 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL>
As you can see, Oracle did use the index on the (not null!) column and the amount of IO went down drastically (from 10130 to 27 - not something I'd call "grossly inefficient").
The "physical reads" stem from the fact that the index was just created and was not yet in the cache.
I would expect other DBMS to apply the same optimizations.
In Oracle, COUNT(*) is the most efficient. Realistically, COUNT(rowid), COUNT(1), or COUNT('fuzzy bunny') are likely to be equally efficient. But if there is a difference, COUNT(*) will be more efficient.
I only ever use SELECT COUNT(1) FROM anything; instead of the asterisk...
Some people are of the opinion that MySQL uses the asterisk to invoke the query optimizer and skips any optimization when "1" is used as a static scalar...
IMHO this is straightforward, because you don't use any variable and it's clear that you only count all rows.