Is this a MySQL bug or is my query wrong? - mysql

Yesterday I ran into some SQL weirdness. I had a query that melted the server so, trying to improve it, I wrote this query:
SELECT idEvent, MAX( fechaHora ) , codAgente, evento FROM eventos_centralita GROUP BY codAgente
And it seems to work for this schema:
CREATE TABLE IF NOT EXISTS `eventos_centralita` (
`idEvent` int(11) NOT NULL AUTO_INCREMENT,
`fechaHora` datetime NOT NULL,
`codAgente` varchar(8) DEFAULT NULL,
`extension` varchar(20) DEFAULT NULL,
`evento` varchar(45) DEFAULT NULL,
PRIMARY KEY (`idEvent`),
KEY `codAgente` (`codAgente`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=105847 ;
I mean that the time is indeed the MAX one for each agent. However, the id of the event and the event itself are wrong...
So, is this a bug or is this expected?

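You are mixing an aggregate function with plain, non-aggregated columns that are not in the GROUP BY. MySQL accepts this as an extension (when the ONLY_FULL_GROUP_BY SQL mode is off), but the values of idEvent and evento then come from an arbitrary row of each group, not from the row that holds MAX(fechaHora). Most other database systems reject such a query.
Normally you should group by specific columns and use aggregate functions for every other selected column. To get the whole row with the latest time per agent, join the table back to a grouped subquery. Example: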
SELECT e1.codAgente, e1.idEvent, e1.fechaHora, e1.evento
FROM eventos_centralita e1
inner join
(
select codAgente, MAX(fechaHora) as fechaHora
from eventos_centralita
group by codAgente
) e2
on e1.codAgente = e2.codAgente and e1.fechaHora = e2.fechaHora
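One caveat with the join above: if an agent has two events with exactly the same fechaHora, both rows are returned. A possible sketch for that case follows. It assumes the event with the higher idEvent should win, which is a guess and not something the question states; the aliases e, e1, m and t are only illustrative.

```sql
-- Sketch: one row per agent, ties on fechaHora broken by the highest idEvent.
SELECT e.codAgente, e.idEvent, e.fechaHora, e.evento
FROM eventos_centralita e
INNER JOIN (
    -- among the rows that carry the latest time, keep the highest id per agent
    SELECT e1.codAgente, MAX(e1.idEvent) AS idEvent
    FROM eventos_centralita e1
    INNER JOIN (
        -- latest event time per agent
        SELECT codAgente, MAX(fechaHora) AS fechaHora
        FROM eventos_centralita
        GROUP BY codAgente
    ) m ON e1.codAgente = m.codAgente AND e1.fechaHora = m.fechaHora
    GROUP BY e1.codAgente
) t ON e.idEvent = t.idEvent;
```

Rows whose codAgente is NULL drop out of both variants, because NULL never compares equal in the join condition.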

Related

Connecting two tables and having count, distinct, and average of time difference

First, I apologize if my question is not correctly organized.
I am trying to run an SQL Query in Java in order to return all the records of time difference. So to explain more:
I have two tables. Table A has the following structure:
Table `A` (
`interaction_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`job_id` int(11) NOT NULL,
`task_id` varchar(250) NOT NULL,
`task_time` datetime DEFAULT NULL,
`task_assessment` float DEFAULT NULL,
)
Table `B` (
`task_id` varchar(250) NOT NULL,
`task_type` varchar(250) DEFAULT NULL,
`task_weight` float DEFAULT NULL,
`task_due` datetime DEFAULT NULL,
`Job_id` int(11) NOT NULL
)
What I need is the count of distinct interactions from table A (using interaction_id) and their times (using task_time) for each user; I filter with "WHERE user_id='" + userId, where userId is a Java parameter.
After that I want to link table A with table B using Job_id so that I can get the date difference in hours; for that I used SELECT TIMEDIFF(Hour, A(task_time), B(task_due)).
Finally, I need the average of the time difference.
I realize this is a bit complicated to describe, but I would appreciate your help.
Thank you very much.
This query should gather the results that you are expecting:
select count(*) as countLines,
avg(time_to_sec(timediff(A.task_time, B.task_due)) / 3600)
from A
inner join B on A.job_id = B.job_id
where A.user_id = @userId
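A variant of the query above, as a sketch only. It counts distinct interaction_id values, as the question asks, and uses TIMESTAMPDIFF, which takes a unit argument and returns its second datetime minus its first. TIMEDIFF returns a TIME value, so its result is capped at 838:59:59, which could distort differences longer than about 35 days. The ? placeholder and the column aliases are assumptions: ? stands for the user id bound from Java, for example through a PreparedStatement, instead of concatenating it into the string.

```sql
-- Sketch: "?" is the bound user id; aliases are illustrative names.
SELECT COUNT(DISTINCT A.interaction_id) AS interactions,
       -- positive result = task_time lies after task_due, in fractional hours
       AVG(TIMESTAMPDIFF(SECOND, B.task_due, A.task_time) / 3600) AS avg_hours_after_due
FROM A
INNER JOIN B ON A.job_id = B.Job_id
WHERE A.user_id = ?;
```

If a job has several rows in B, this join pairs every row of A with each of them, so the average is taken over those pairs.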

SELECT query returns no result without ORDER BY clause

I have this query:
SELECT `Stocks`.`id` AS `Stocks.id` , `Locations`.`id` AS `Locations.id`
FROM `rowiusnew`.`c_stocks` AS `Stocks`
LEFT JOIN `rowiusnew`.`g_locations` AS `Locations` ON ( `Locations`.`ref_id` = `Stocks`.`id` AND `Locations`.`ref_type` = 'stock' )
GROUP BY `Stocks`.`id`
HAVING `Locations.id` IS NOT NULL
This returns 0 results.
When I add
ORDER BY Locations.id
to the exactly same query, I correctly get 3 results.
Noteworthy: When I discard the GROUP BY clause, I get the same 3 results. The grouping is necessary for the complete query with additional joins; this is the simplified one to demonstrate the problem.
My question is: Why do I not get a result with the original query?
Note that there are two conditions in the JOIN ON clause. Changing or removing the braces or changing the order of these conditions does not change the outcome.
Usually, you would suspect that the field id in g_locations is sometimes NULL, thus the ORDER BY clause makes the correct referenced result be displayed "at the top" of the group dataset. This is not the case; the id field is correctly set up as a primary key field and uses auto_increment.
The EXPLAIN statement shows that filesort is used instead of the index in those cases when I actually get a result.
[EXPLAIN output of the original query]
[EXPLAIN output of the modified, working query]
Below are the table definitions:
CREATE TABLE IF NOT EXISTS `c_stocks` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`id_stock_type` int(10) unsigned DEFAULT NULL,
`name` varchar(255) DEFAULT NULL,
`locality` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `StockType_idx` (`id_stock_type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `g_locations` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`ref_type` enum('stock','object','branch') DEFAULT NULL,
`ref_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UniqueLocation` (`ref_type`,`ref_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The ref_id field features a long comment that I omitted in this definition.
After being unable to reproduce the error on SQLFiddle.com and also on my second computer, I realized that there must be a bug involved.
Indeed, my used version 5.6.12 suffers from this bug:
Some LEFT JOIN queries with GROUP BY could return incorrect results. (Bug #68897, Bug #16620047)
See the change log of MySQL 5.6.13: http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-13.html
An upgrade to 5.6.17 solved my problem. I am now getting the results I expect, independent of ORDER clauses and aggregate functions.
Remove
having locations.id is not null
instead use
where locations.id is not null
The condition locations.id is not null has nothing to do with the grouping: you don't want those rows included at all, so filter them out before the groups are formed.
Also, you need to do something with the locations.id since it isn't in the group by clause. Do you want "max" locations.id?
so your query now becomes:
SELECT `Stocks`.`id` AS `Stocks.id` , max(`Locations`.`id`) AS `Locations.id`
FROM `rowiusnew`.`c_stocks` AS `Stocks`
LEFT JOIN `rowiusnew`.`g_locations` AS `Locations` ON ( `Locations`.`ref_id` = `Stocks`.`id` AND `Locations`.`ref_type` = 'stock' )
WHERE `Locations`.`id` IS NOT NULL
GROUP BY `Stocks`.`id`
Make those changes and it should work better for you.
FYI: I think that by adding the ORDER BY clause, you are letting the engine guess which locations.id you want; otherwise it has no clue. In most databases other than MySQL, the query would not run at all.

Ordering in MySQL Bogs Down

I've been working on a small Perl program that works with a table of articles, displaying them to the user if they have not been already read. It has been working nicely and it has been quite speedy, overall. However, this afternoon, the performance has degraded from fast enough that I wasn't worried about optimizing the query to a glacial 3-4 seconds per query. To select articles, I present this query:
SELECT channelitem.ciid, channelitem.cid, name, description, url, creationdate, author
FROM `channelitem`
WHERE ciid NOT
IN (
SELECT ciid
FROM `uninet_channelitem_read`
WHERE uid = '1030'
)
AND (
cid =117
OR cid =308
OR cid =310
)
ORDER BY `channelitem`.`creationdate` DESC
LIMIT 0 , 100
The list of possible cid's varies and could be quite a bit more. In any case, I noted that about 2-3 seconds of the total time to make the query is devoted to "ORDER BY." If I remove that, it only takes about a half second to give me the query back. If I drop the subquery, the performance goes back to normal... but the subquery didn't seem to be problematic until just this afternoon, after working fine for a week or so.
Any ideas what could be slowing it down so much? What might I do to try to get the performance back up to snuff? The table being queried has 45,000 rows. The subquery's table has fewer than 3,000 rows at present.
Update: Incidentally, if anyone has suggestions on how to do multiple queries or some other technique that would be more efficient to accomplish what I am trying to do, I am all ears. I'm really puzzled how to solve the problem at this point. Can I somehow apply the order by before the join to make it apply to the real table and not the derived table? Would that be more efficient?
Here is the latest version of the query, derived from suggestions from @Gordon, below:
SELECT channelitem.ciid, channelitem.cid, name, description, url, creationdate, author
FROM `channelitem`
LEFT JOIN (
SELECT ciid, dateRead
FROM `uninet_channelitem_read`
WHERE uid = '1030'
)alreadyRead ON channelitem.ciid = alreadyRead.ciid
WHERE (
alreadyRead.ciid IS NULL
)
AND `cid`
IN ( 6648, 329, 323, 6654, 6647 )
ORDER BY `channelitem`.`creationdate` DESC
LIMIT 0 , 100
Also, I should mention what my db structure looks like with regards to these two tables -- maybe someone can spot something odd about the structure:
CREATE TABLE IF NOT EXISTS `channelitem` (
`newsversion` int(11) NOT NULL DEFAULT '0',
`cid` int(11) NOT NULL DEFAULT '0',
`ciid` int(11) NOT NULL AUTO_INCREMENT,
`description` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`url` varchar(222) DEFAULT NULL,
`creationdate` datetime DEFAULT NULL,
`urgent` varchar(10) DEFAULT NULL,
`name` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`lastchanged` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`author` varchar(255) NOT NULL,
PRIMARY KEY (`ciid`),
KEY `newsversion` (`newsversion`),
KEY `cid` (`cid`),
KEY `creationdate` (`creationdate`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=1638554365 ;
CREATE TABLE IF NOT EXISTS `uninet_channelitem_read` (
`ciid` int(11) NOT NULL,
`uid` int(11) NOT NULL,
`dateRead` datetime NOT NULL,
PRIMARY KEY (`ciid`,`uid`),
KEY `ciid` (`ciid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
It never hurts to try the left outer join version of such a query:
SELECT ci.ciid, ci.cid, ci.name, ci.description, ci.url, ci.creationdate, ci.author
FROM `channelitem` ci left outer join
(SELECT ciid
FROM `uninet_channelitem_read`
WHERE uid = '1030'
) cr
on ci.ciid = cr.ciid
where cr.ciid is null and
ci.cid in (117, 308, 310)
ORDER BY ci.`creationdate` DESC
LIMIT 0 , 100
This query will be faster with an index on uninet_channelitem_read(ciid) and probably on channelitem(cid, ciid, creationdate).
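As a sketch of how that indexing advice could be tried out, the statements below add two composite indexes and express the same "not yet read" filter with NOT EXISTS. The index names are made up for illustration, and the column order (uid first, then ciid; cid first, then creationdate) is an assumption about what suits this query, so it is worth comparing the EXPLAIN output before and after.

```sql
-- Hypothetical index names; the columns come from the schema in the question.
CREATE INDEX idx_read_uid_ciid ON uninet_channelitem_read (uid, ciid);
CREATE INDEX idx_item_cid_date ON channelitem (cid, creationdate);

-- Same selection as above, written as an anti-join with NOT EXISTS.
SELECT ci.ciid, ci.cid, ci.name, ci.description, ci.url, ci.creationdate, ci.author
FROM channelitem ci
WHERE ci.cid IN (117, 308, 310)
  AND NOT EXISTS (
        SELECT 1
        FROM uninet_channelitem_read cr
        WHERE cr.uid = 1030
          AND cr.ciid = ci.ciid
      )
ORDER BY ci.creationdate DESC
LIMIT 0, 100;
```

The existing primary key on uninet_channelitem_read is (ciid, uid), so a lookup by uid alone cannot use its leading column; that is the reason for suggesting (uid, ciid) here.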
The problem could be that you need to create an index on the channelitem table for the column creationdate. Indexes help a database to run queries faster. Here is a link about MySQL Indexing

SQL query that fries the server

I'm having a really weird issue that burns my MySQL server. From my point of view (which is surely wrong), the query is pretty trivial.
I have a table to store PBX events and I try to get the last events for every agent to see his/her situation whenever my application is restarted or whatever.
Whenever I launch it, the server goes up to 99% CPU and takes about 5 minutes to recover by itself.
It seems that's because of the number of records, more than 100,000.
The table is as follows:
CREATE TABLE IF NOT EXISTS `eventos_centralita` (
`idEvent` int(11) NOT NULL AUTO_INCREMENT,
`fechaHora` datetime NOT NULL,
`codAgente` varchar(8) DEFAULT NULL,
`extension` varchar(20) DEFAULT NULL,
`evento` varchar(45) DEFAULT NULL,
PRIMARY KEY (`idEvent`),
KEY `codAgente` (`codAgente`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=105847 ;
And the query is as follows:
SELECT a.* FROM eventos_centralita a
LEFT JOIN eventos_centralita b ON b.codAgente = a.codAgente AND b.fechaHora > a.fechaHora
GROUP BY a.codAgente
I've tried to limit it by date, but with no luck: the query then returns nothing. How could I improve the query to avoid this?
Please try below:
SELECT a.* FROM eventos_centralita a
INNER JOIN
(
SELECT codAgente, MAX(fechaHora) AS fechaHora
FROM eventos_centralita
GROUP BY codAgente
) as b
ON a.codAgente = b.codAgente AND a.fechaHora = b.fechaHora
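Whichever form of the per-agent query is used, the table only has an index on codAgente, so finding the latest fechaHora for each agent still means reading that agent's rows. A composite index may help; the sketch below uses a made-up index name, and the benefit is an expectation to verify with EXPLAIN rather than a measured result.

```sql
-- Hypothetical index name; columns are from the eventos_centralita schema above.
ALTER TABLE eventos_centralita
  ADD INDEX idx_agente_fecha (codAgente, fechaHora);

-- Check how the per-agent maximum is resolved once the index exists.
EXPLAIN
SELECT codAgente, MAX(fechaHora) AS fechaHora
FROM eventos_centralita
GROUP BY codAgente;
```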

Mysql update statement executes too slowly

There are two tables as follows in my problem.
CREATE TABLE `t_user_relation` (
`User_id` INT(32) UNSIGNED NOT NULL ,
`Follow_id` INT(32) UNSIGNED NOT NULL ,
PRIMARY KEY (`User_id`,Follow_id)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE `t_user_info`(
`User_id` int(32) unsigned NOT NULL ,
`User_name` varchar(20) NOT NULL ,
`User_avatar` varchar(60) NOT NULL ,
`Msg_count` int(32) unsigned DEFAULT '0' ,
`Fans_count` int(32) unsigned DEFAULT '0' ,
`Follow_count` int(32) unsigned DEFAULT '0' ,
PRIMARY KEY (`User_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
What I am trying to do is to update the Fans_count field of the t_user_info table. My update statement is as follows:
UPDATE t_user_info set t_user_info.Fans_count=(SELECT COUNT(*) FROM t_user_relation
WHERE t_user_relation.Follow_id=t_user_info.User_id);
But it executes really slowly! The table t_user_info consists of 20,445 records and t_user_relation consists of 1,809,915 records. Can anyone help me improve the speed? Thanks for any advice!
I would try this:
UPDATE
t_user_info inner join
(SELECT Follow_id, COUNT(*) as cnt
FROM t_user_relation
GROUP BY Follow_id) s
on t_user_info.User_id=s.Follow_id
SET t_user_info.Fans_count=s.cnt
I'm using a subquery to calculate the count of rows for every Follow_id in table t_user_relation:
SELECT Follow_id, COUNT(*) as cnt
FROM t_user_relation
GROUP BY Follow_id
I am then joining the result of this query with t_user_info, and I am updating Fans_count where the join succeeds, setting it to the count calculated in the subquery.
A query written like this usually runs faster because the resulting rows from the subquery are calculated only once, before the join, while in your solution your subquery is calculated once for every user row.
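One difference from the original statement is worth noting: the inner join only touches users that appear as a Follow_id, so a user with no followers keeps whatever Fans_count was stored before, whereas the correlated subquery would have set it to 0. If that matters, a LEFT JOIN with COALESCE is one way to cover it. This is a sketch; the aliases u and s are only illustrative.

```sql
-- Sketch: also resets Fans_count to 0 for users without any followers.
UPDATE t_user_info u
LEFT JOIN (
    -- follower count per followed user, computed once
    SELECT Follow_id, COUNT(*) AS cnt
    FROM t_user_relation
    GROUP BY Follow_id
) s ON u.User_id = s.Follow_id
SET u.Fans_count = COALESCE(s.cnt, 0);
```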
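When dealing with a large number of records, make sure the columns you filter and group on are indexed. Here the primary key of t_user_relation starts with User_id, so a lookup or grouping by Follow_id cannot use its leading column; a separate index on Follow_id may help. COUNT(*) itself is not the problem: it counts rows and does not fetch every column the way SELECT * does.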