What is this query supposed to do? (and why does it fail?) - mysql

I've been tasked with reviving an old piece of legacy software.
It used to run on an old server (2012) which has died the ugly way (hard disk failure).
Before this server died, the code worked without problems.
I've rebuilt the MySQL database and data from backups.
However, one query no longer works and fails with the error: Query preparation failed: Unknown column '_operationId' in 'where clause'. The query in question is:
SELECT
    @r AS _operationId
    , @r := (
        SELECT
            operationId
        FROM operations
        WHERE operationId = _operationId
    ) AS includesOperationId
FROM (SELECT @r := %i) AS tmp
INNER JOIN operations
WHERE @r > 0 AND @r IS NOT NULL
From what I understand, the query tries to join back onto itself building a tree of some sort??
For some reason, this query must have worked on some previous version of MySQL (5.0??) but with the current version (MySQL 5.7) the query fails.
Is there any 'mysql whisperer' out there who can explain to me:
what the query attempts to do?
why it worked on some previous version but not anymore?
how to change the query to make it work again?
thanks a million in advance.
Update:
The operations table definition and data:
+-------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------------------+------+-----+---------+----------------+
| operationId | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| operation | varchar(40) | NO | UNI | NULL | |
| description | text | YES | | NULL | |
+-------------+---------------------+------+-----+---------+----------------+
+-------------+-----------+-------------+
| operationId | operation | description |
+-------------+-----------+-------------+
| 1 | add | NULL |
| 2 | delete | NULL |
| 3 | edit | NULL |
| 4 | view | NULL |
| 5 | disable | NULL |
| 6 | execute | NULL |
+-------------+-----------+-------------+

The query is attempting to do some sort of tree traversal. I don't know that it would work in any version of MySQL, but my best guess is that the intention is something like this:
SELECT @r AS _operationId,
       @r := (SELECT operationId
              FROM operations
              WHERE operationId = @r
             ) AS includesOperationId
FROM operations CROSS JOIN
     (SELECT @r := %i) params
WHERE @r > 0 AND @r IS NOT NULL;
Having said that, if this happens to work, there is no guarantee that it will work again or in another version of MySQL. This violates two rules of using variables:
A variable assigned in one expression in a SELECT should not be used in another. The order of evaluation of expressions is not defined, so the expressions could be evaluated in any order.
There is no guarantee about when the conditions in the WHERE clause that use variables are evaluated, and definitely no guarantee of any sort of "sequential" evaluation with respect to the SELECT.
The subquery is also problematic.
The good news is that if operations has no column called _operationId, then the query should fail on all versions of MySQL with an undefined column type of error (although perhaps older versions did something funky).
The bad news is that if you want to walk through a hierarchy in MySQL, you either need to change the data structure or use a stored procedure.
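For what it's worth, on MySQL 8.0 and later (not the 5.7 from the question) a recursive common table expression is a third way to walk a hierarchy. The following is only a minimal sketch: the operation_includes table and its columns are hypothetical, since the operations table shown in the question has no parent/child column at all, and the starting id 3 stands in for the %i placeholder.

```sql
-- Hypothetical edge table: one row per "operation X includes operation Y" link.
-- Neither this table nor its columns exist in the schema from the question.
CREATE TABLE operation_includes (
    operationId         BIGINT UNSIGNED NOT NULL,
    includesOperationId BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (operationId, includesOperationId)
);

-- MySQL 8.0+ only: follow the "includes" links starting from operation 3.
WITH RECURSIVE chain (operationId, depth) AS (
    -- Anchor row; the CAST keeps the column type wide enough for BIGINT UNSIGNED ids.
    SELECT CAST(3 AS UNSIGNED), 0
    UNION ALL
    SELECT oi.includesOperationId, c.depth + 1
    FROM chain AS c
    JOIN operation_includes AS oi ON oi.operationId = c.operationId
    WHERE c.depth < 100   -- crude guard against cycles in the data
)
SELECT operationId, depth
FROM chain;
```

Unlike the user-variable trick, the evaluation order here is defined by the language, so the result does not depend on optimizer behaviour.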

Related

Unexpected result in MyISAM when grouping by bit and selecting distinct values

We have a MyISAM table with a single column, bit, and two rows containing 0 and 1. We group by this column, count the values and select the column as well. The following result is as expected.
select count( bit), bit from tab GROUP BY bit;
| count(bit) | bit |
|------------|-----|
| 1 | 0 |
| 1 | 1 |
But when using the distinct keyword, the output value of the column is always 1. Why?
select count(distinct bit), bit from tab GROUP BY bit;
| count(bit) | bit |
|------------|-----|
| 1 | 1 | # WHYYY
| 1 | 1 |
I've been crawling the documentation and the internet but with no luck.
Here is the setup:
CREATE TABLE `tab` (
`bit` bit(1) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8; # When using InnoDB everything's fine
INSERT INTO `tab` (`bit`) VALUES
(CONV('1', 2, 10) + 0),
(CONV('0', 2, 10) + 0);
PS: One more thing. I've been doing several experiments. When using group_concat, the values of the bit column come out distinct again.
select count(distinct bit), group_concat(bit) from tab GROUP BY bit;
| count(bit) | bit |
|------------|------------|
| 1 | 1 byte (0) |
| 1 | 1 byte (1) |
Thanks to the comments, I am now convinced not to use the bit column type at all. The more reliable alternative is tinyint(1).
Inspired by how the Adminer application handles bit columns, I recommend using the BIN function to convert the bit to the expected value whenever selecting it:
select count(distinct bit), BIN(bit) from tab GROUP BY bit;
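The tinyint(1) alternative mentioned above could be set up roughly like this. It is only a sketch: the table name tab_tinyint and the column name flag are made up for illustration, and I have not included an expected result because it should be checked on the affected server.

```sql
-- Hypothetical replacement table: same data as `tab`, but stored as TINYINT(1).
CREATE TABLE `tab_tinyint` (
  `flag` TINYINT(1) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

INSERT INTO `tab_tinyint` (`flag`) VALUES (1), (0);

-- Same shape as the problematic query, now against an integer column.
SELECT COUNT(DISTINCT `flag`), `flag`
FROM `tab_tinyint`
GROUP BY `flag`;
```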

Why is this MySQL query poor performance (DEPENDENT_SUBQUERY)

explain select id, nome from bea_clientes where id in (
select group_concat(distinct(bea_clientes_id)) as list
from bea_agenda
where bea_clientes_id>0
and bea_agente_id in(300006,300007,300008,300009,300010,300011,300012,300013,300014,300018,300019,300020,300021,300022)
)
When I run the above (without the explain), MySQL simply goes busy: it uses a DEPENDENT SUBQUERY, which makes this slow as hell. What I don't understand is why the optimizer evaluates the subquery for each id in bea_clientes. I even wrapped the IN argument in a group_concat, believing that would be the same as passing the result as a plain "string" and would avoid the scanning.
I thought this wouldn't be a problem for a MySQL server which is 5.5+?
Testing in MariaDB gives the same behaviour.
Is this a known bug? I know I can rewrite this as a join, but still this is terrible.
Generated by: phpMyAdmin 4.4.14 / MySQL 5.6.26
SQL command: explain select id, nome from bea_clientes where id in ( select group_concat(distinct(bea_clientes_id)) as list from bea_agenda where bea_clientes_id>0 and bea_agente_id in(300006,300007,300008,300009,300010,300011,300012,300013,300014,300018,300019,300020,300021,300022) );
Lines: 2
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|--------------------|--------------|-------|-------------------------------|---------------|---------|------|-------|------------------------------------|
| 1 | PRIMARY | bea_clientes | ALL | NULL | NULL | NULL | NULL | 30432 | Using where |
| 2 | DEPENDENT SUBQUERY | bea_agenda | range | bea_clientes_id,bea_agente_id | bea_agente_id | 5 | NULL | 2352 | Using index condition; Using where |
This is obviously hard to test without the data, but try something like the query below.
Subqueries are just not handled well in MySQL (though it's my preferred engine).
Note that GROUP_CONCAT returns a single comma-separated string, so id IN (...) compares each id against one string value rather than against a list of ids; selecting the distinct ids and joining on them avoids that.
I would also recommend indexing the relevant columns, which will improve performance for both queries.
For clarity, I also advise expanding queries over multiple lines:
select t2.id, t2.nome
from (
    select distinct bea_clientes_id
    from bea_agenda
    where bea_clientes_id > 0
      and bea_agente_id in (300006,300007,300008,300009,300010,300011,300012,300013,300014,300018,300019,300020,300021,300022)
) as t1
join bea_clientes as t2
    on t2.id = t1.bea_clientes_id
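As for the indexing advice, a composite index along these lines is one option. This is a sketch under assumptions: the index name is made up, the column order assumes the bea_agente_id filter is the selective part, and bea_clientes.id is assumed to be the primary key already. The IN list is shortened to two of the ids from the question purely to keep the example readable.

```sql
-- Hypothetical index name; covers both columns the subquery touches.
CREATE INDEX idx_agenda_agente_cliente
    ON bea_agenda (bea_agente_id, bea_clientes_id);

-- Re-check the plan afterwards.
EXPLAIN
SELECT c.id, c.nome
FROM bea_clientes AS c
JOIN (
    SELECT DISTINCT bea_clientes_id
    FROM bea_agenda
    WHERE bea_clientes_id > 0
      AND bea_agente_id IN (300006, 300007)
) AS a
    ON a.bea_clientes_id = c.id;
```

Whether the optimizer actually picks the new index depends on the data distribution, so the EXPLAIN output is the thing to look at.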

How to get MySQL command line tool to show booleans stored as BIT sensibly by default

I have a problem with selecting boolean types stored as BIT in MySQL. I know that I can get bit values shown in a sensible way with custom queries, such as SELECT CAST(1=1 AS SIGNED INTEGER) or SELECT BOOLFIELD + 0 ...
However, is there any way to get our booleans shown in a sensible way with command line client with queries like SELECT * FROM TABLE ?
UPDATE: At the moment I see only whitespace in the results. Example:
mysql> SELECT distinct foo, foo + 0 from table
+------+-------+
| foo | foo_0 |
+------+-------+
| | 0 | <-- Only space
| | 1 | <-- Space, one space less
+------+-------+
With some googling, I found some (maybe related) bugs in the MySQL bug DB (http://bugs.mysql.com/bug.php?id=28422, http://bugs.mysql.com/bug.php?id=43670), but no answer or fix?
To store booleans, one really ought to use MySQL's BOOLEAN type (which is an alias for TINYINT(1), given that MySQL doesn't have real boolean types): 0 represents false and non-zero represents true.
Whilst it might feel like storing a boolean in a byte is more wasteful than in a BIT(1) column, one must remember that a few saved bits will translate into more bit operations for the CPU on data storage & retrieval; and I'm unsure whether most storage engines pad BIT columns to the next byte boundary anyway.
If you insist on using BIT type columns, you should be aware that they are returned as binary strings. The MySQL command line client (stupidly) attempts to render binary strings as text (by applying its default character set), which is what causes the behaviour that you observe. There's no way to avoid this, other than to manipulate the field in the select list so that it is returned as something other than a binary string, as you are already doing.
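For reference, a few select-list manipulations of that kind are sketched below. The names my_table and foo are placeholders for a table with a BIT column, and the output aliases are made up for illustration.

```sql
-- my_table and its BIT column foo are placeholder names.
SELECT
    foo + 0               AS foo_numeric,   -- arithmetic forces a numeric context
    CAST(foo AS UNSIGNED) AS foo_unsigned,  -- explicit cast to an integer
    BIN(foo)              AS foo_binary     -- string of binary digits
FROM my_table;
```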
However, if you also insist on using SELECT * (which is bad practice, albeit somewhat more understandable from the command line client), you might consider defining a view in which the manipulation is performed and then SELECT from that. For example:
CREATE VIEW my_view AS SELECT foo + 0 AS foo, bar FROM my_table;
Then one could do:
SELECT * FROM my_view WHERE foo = 1 AND bar = 'wibble';
A BIT ugly, but maybe some workaround: CASE WHEN ... THEN ... END
Instead of
> select
guid,
consumed,
confirmed
from Account
where customerId = 'xxxx48' and name between xxxx and xxxx;
+--------------------------------------+----------+-----------+
| guid | consumed | confirmed |
+--------------------------------------+----------+-----------+
| xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | |
| xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | |
| xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | |
| xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | |
| xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | | |
+--------------------------------------+----------+-----------+
One could do:
> select
guid,
case when consumed then '1' when not consumed then '0' end as been_consumed,
case when confirmed then '1' when not confirmed then '0' end as been_confirmed
from Account
where customerId = 'xxxx48' and name between xxxx and xxxx;
+--------------------------------------+---------------+----------------+
| guid | been_consumed | been_confirmed |
+--------------------------------------+---------------+----------------+
| xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | 1 | 1 |
| xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | 1 | 0 |
| xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | 1 | 0 |
| xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | 1 | 1 |
| xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | 1 | 0 |
+--------------------------------------+---------------+----------------+

Can I SELECT this in a single statement?

I am a total SQL noob; sorry.
I have a table
mysql> describe activity;
+--------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------+------+-----+---------+-------+
| user | text | NO | | NULL | |
| time_stamp | int(11) | NO | | NULL | |
| activity | text | NO | | NULL | |
| item | int | NO | | NULL | |
+--------------+---------+------+-----+---------+-------+
Normally an activity is a two-step process: 1) "check out" and 2) "use".
An item cannot be checked out a second time unless it has been used.
Now I want to find any cases where an item was checked out but not used.
Being dumb, I would use two selects, one for check out and one for use, on the same item, then compare the timestamps.
Is there a SELECT statement that will help me select the items which were checked out but not used?
It's tricky with the possibility of multiple checkouts. Or should I just code:
loop over activity, from oldest to newest
if I find a checkout and there is no newer use time, then I have a hit
You could get the time of the last check out and of the last use for each item and then compare them:
SELECT item,
       MAX(IF(activity='check out', time_stamp, NULL)) AS last_co,
       MAX(IF(activity='use', time_stamp, NULL)) AS last_use
FROM activity
GROUP BY item
HAVING last_co IS NOT NULL AND (last_use IS NULL OR last_use < last_co);
The explicit last_use IS NULL test is there because of how NULL comparisons behave: if an item was never used, last_use is NULL, so both last_use < last_co and NOT(last_use >= last_co) evaluate to NULL and the row would be filtered out.
Without proper indexing, this query will not perform very well though. Plus you might want to bound the query using a WHERE condition.
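A rough sketch of both suggestions follows. The index name is hypothetical, the 30-day window is an arbitrary choice, and the WHERE clause assumes time_stamp holds Unix epoch seconds, which the question does not actually state.

```sql
-- Hypothetical index name. item and time_stamp are integer columns,
-- so no prefix length is needed (unlike the TEXT columns).
CREATE INDEX idx_activity_item_time ON activity (item, time_stamp);

-- Only look at rows from the last 30 days (assumed window, assumed epoch seconds).
SELECT item,
       MAX(IF(activity = 'check out', time_stamp, NULL)) AS last_co,
       MAX(IF(activity = 'use', time_stamp, NULL))       AS last_use
FROM activity
WHERE time_stamp >= UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY)
GROUP BY item
HAVING last_co IS NOT NULL
   AND (last_use IS NULL OR last_use < last_co);
```

Because a use always comes after its check out, a lower bound on time_stamp alone should not hide the use that belongs to a check out inside the window. Whether the index is actually used is something to confirm with EXPLAIN.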

MySQL InnoDB table returns 20% of rows then halts

I am trying to return a result set from a MySQL database table of email subscriptions.
The table is called subscribe and looks like this:
+-----------------+------------------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------------------+-------+
| eid | int(11) unsigned | NO | PRI | NULL | |
| subscribeStatus | varchar(4) | NO | PRI | NULL | |
| transDate | datetime | NO | PRI | 0000-00-00 00:00:00 | |
| senttoevDate | datetime | YES | | NULL | |
+-----------------+------------------+------+-----+---------------------+-------+
The PK is (eid,subscribeStatus,transDate) and there are the following extra indexes:
idxEid on eid,
idxTDate on transDate
The table is an InnoDB table. It contains about 480K rows.
The version of MySQL is 5.1.39 x86_64 and I'm running Windows 7 64bit.
The table has a row inserted each time a user subscribes or unsubscribes from email. I want to know what the latest subscription status is for all users. The query I want to run is:
select
eid, transDate from subscribe s
where
transDate = (select max(transDate) from subscribe si where si.eid = s.eid)
When I run this in MySQL Query Browser (and in TOAD for MySQL) it immediately returns about 98K rows (into the results grid), and then just hangs. I can see from MySQL Administrator GUI that the state is "Sending data". I have left it for up to an hour and it hasn't finished returning the results, or even moved on from the 98K.
I have adjusted the my.ini params for InnoDB to increase the innodb_buffer_pool_size to 3Gb (my machine has 4Gb) but I can see from Task Manager that mysqld is never using more than about 400K Mb.
I have created a MyISAM version of the table to see if that is any better, but it also hangs at around about the same number of rows returned.
Can anyone suggest why the query returns some rows but then "hangs", and also what I can do to get around this and get the query to return as it should?
Many thanks in advance for any help you can offer!
I have no idea why it would hang, but why don't you try to avoid the subquery with something like
select eid, max(transDate) from subscribe group by eid
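If the status itself is needed as well (the question asks for the latest subscription status), the grouped result can be joined back to the table. A minimal sketch, where the aliases latest and lastTransDate are my own names:

```sql
-- Latest row per eid: compute the max date once per eid, then join back.
SELECT s.eid, s.subscribeStatus, s.transDate
FROM subscribe AS s
JOIN (
    SELECT eid, MAX(transDate) AS lastTransDate
    FROM subscribe
    GROUP BY eid
) AS latest
    ON latest.eid = s.eid
   AND latest.lastTransDate = s.transDate;
```

Since the primary key is (eid, subscribeStatus, transDate), two different statuses with the same transDate for one eid would both be returned, so ties may need extra handling.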