MySQL HAVING and WHERE queries return different results

I am just a MySQL beginner, and this is my first time asking on Stack Overflow. My question is about a query using HAVING and WHERE:
SELECT
BOXNUMBER
,COUNT(BOXNUMBER) AS QTY
,CDATETIME
FROM
HSS_SNO
WHERE
year(CDATETIME) IN ('2008','2010','2014')
GROUP BY
BOXNUMBER ;
/* Affected rows: 0 Found rows: 13,928 Warnings: 0 Duration for 1 query: 0.031 sec. (+ 2.782 sec. network) */
SELECT
BOXNUMBER
,COUNT(BOXNUMBER) AS QTY
,CDATETIME
FROM
HSS_SNO
GROUP BY
BOXNUMBER
HAVING
year(CDATETIME) IN ('2008','2010','2014');
/* Affected rows: 0 Found rows: 13,922 Warnings: 0 Duration for 1 query: 0.047 sec. (+ 2.594 sec. network) */
I thought these queries would give me the same result, but the 'Found rows' counts differ.
Could you tell me why?
Thanks
Tobing
(Sorry for my English)

.......
WHERE
year(CDATETIME) IN ('2008','2010','2014')
GROUP BY
BOXNUMBER ;
The query above returns more rows because WHERE filters individual rows before GROUP BY runs, so a BOXNUMBER is kept as long as at least one of its rows falls in one of the listed years. In the other query
......
GROUP BY
BOXNUMBER
HAVING
year(CDATETIME) IN ('2008','2010','2014');
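here the condition is applied after the rows have been grouped. CDATETIME is not aggregated, so MySQL (with ONLY_FULL_GROUP_BY disabled) takes its value from an arbitrary row of each group, and a BOXNUMBER is dropped whenever that row falls outside the listed years, even if other rows of the same box qualify. As a result you get fewer rows.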
Have a look at this link, which explains the order of execution of SQL queries in detail: http://social.msdn.microsoft.com/Forums/sqlserver/en-US/70efeffe-76b9-4b7e-b4a1-ba53f5d21916/order-of-execution-of-sql-queries
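If the goal is a HAVING query that keeps a box whenever any of its rows falls in the listed years (which is what the WHERE version does), one option is to put the year test inside an aggregate so it is evaluated over every row of the group. This is only a sketch: the aliases QTY_ALL_YEARS and QTY_MATCH are made up for illustration, and it relies on MySQL evaluating a comparison to 1 or 0.

```sql
-- Sketch: a HAVING condition on an aggregate looks at all rows of each group,
-- not at one arbitrarily chosen CDATETIME value.
SELECT
    BOXNUMBER,
    COUNT(BOXNUMBER) AS QTY_ALL_YEARS,                        -- every row of the box
    SUM(YEAR(CDATETIME) IN (2008, 2010, 2014)) AS QTY_MATCH   -- only rows in the listed years
FROM HSS_SNO
GROUP BY BOXNUMBER
HAVING SUM(YEAR(CDATETIME) IN (2008, 2010, 2014)) > 0;
```

This should return the same set of boxes as the WHERE version, with QTY_MATCH corresponding to the QTY that query reports.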

Related

How to use CASE to select MAX(date) WHEN active=1? (without subquery)

I'm trying to optimize some code. If this is possible, it's not only more elegant, but it would also save me from running several other queries to get the same data and would speed up my while loop considerably.
How would I use CASE to select the MAX(date) where the first column is also 1 from a dataset like this?
0 2020-06-30
0 2020-06-26
1 2020-06-25 <---- I want this guy
0 2020-06-24
0 2020-06-24
0 2020-06-23
0 2020-06-22
0 2020-06-22
0 2020-06-16
0 2020-06-16
0 2020-06-12
1 2020-06-12
0 2020-06-11
0 2020-06-01
0 2020-06-01
I tried something like this, but obviously this doesn't work.
CASE
WHEN aty.type_count = '1' AND ac.activity_date = MAX(ac.activity_date)
THEN ac.activity_date
ELSE 0
END
AS max_date_active
I can't just sort by both columns as sometimes there are no 1 results. I guess I could make the result set a query, but I am running other SUM(CASE())'s on the same data set, so I'm trying to make it all work together as a single, elegant query.
Any ideas?
EDIT: I updated the title to "without subquery", since once I'm using a subquery I might as well create a separate query to get the results. I'm currently thinking I'll just get the entire data set back and figure out what I want using a PHP loop. Not as elegant, but at least it saves several complex joined queries.
A LIMIT query might be the easiest option here:
SELECT *
FROM yourTable
WHERE type_count = '1'
ORDER BY activity_date DESC
LIMIT 1;
If there might be more than one record with a type count of 1, tied for the latest date, then we can use a subquery:
SELECT *
FROM yourTable t1
WHERE
type_count = '1' AND
activity_date = (SELECT MAX(activity_date) FROM yourTable WHERE type_count = '1');
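If a subquery is not wanted, conditional aggregation may be worth a try: the CASE goes inside MAX() instead of around it, so it can sit next to other SUM(CASE ...) columns in the same query. The sketch below reuses the placeholder table name yourTable from the queries above; the alias active_rows is made up for illustration.

```sql
-- Sketch: CASE returns NULL for rows where type_count is not 1, and MAX() ignores NULLs,
-- so only the dates of "1" rows take part in the maximum.
SELECT
    MAX(CASE WHEN type_count = '1' THEN activity_date END) AS max_date_active,
    SUM(CASE WHEN type_count = '1' THEN 1 ELSE 0 END)      AS active_rows
FROM yourTable;
```

For the sample data in the question, max_date_active would be 2020-06-25. When no row has a type count of 1, it comes back as NULL rather than 0.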
As far as I can tell it's not possible to do exactly what I wanted. Subqueries are possible but if I'm processing the query twice inside itself I'd rather handle them separately.
In the end I just kept the result set shown in my question, and then did a basic loop in PHP to extract the info I wanted.

Slow Query Time in MySQL

The following query is overloading my system. The problem seems to be the RAND() call. I have seen other posts dealing with similar issues, but I can't quite get their solutions working for this problem. The query runs on a table with 10M+ rows. I know that ORDER BY RAND() is the issue, but from what I have read, the usual workarounds are complicated by the fact that the auto-increment column (items.ID) increments by 2, not 1.
SELECT stores.phone, stores.storeID, stores.name, stores.ZIP,
stores.state,stores.city, storeID, GEOCODES.lon, GEOCODES.lat
FROM items
LEFT JOIN stores on stores.storeID = items.store_ID
LEFT JOIN GEOCODES on GEOCODES.address = CONCAT(stores.address1,', ',stores.ZIP)
WHERE stores.phone IS NOT NULL
GROUP BY items.store_ID
ORDER BY RAND( )
LIMIT 200
The other article that I was trying to follow was "How can i optimize MySQL's ORDER BY RAND() function?", but I can't seem to figure out how to adapt it to this query. Please note that this is done in PHP.
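If I were you, I would LIMIT first and then ORDER BY RAND() on the limited query. That way you aren't pulling everything out and randomizing it. I have used this exact method to speed up my queries considerably. Note that this shuffles the first 200 rows the inner query happens to return; it does not sample 200 rows from the whole result.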
SELECT *
FROM
( SELECT stores.phone, stores.storeID, stores.name, stores.ZIP,
stores.state,stores.city, storeID, GEOCODES.lon, GEOCODES.lat
FROM items
LEFT JOIN stores on stores.storeID = items.store_ID
LEFT JOIN GEOCODES on GEOCODES.address = CONCAT(stores.address1,', ',stores.ZIP)
WHERE stores.phone IS NOT NULL
GROUP BY items.store_ID
LIMIT 200
) t
ORDER BY RAND( )
Some proof:
CREATE table digits as (-- a digit table with 1 million rows)
1000000 row(s) affected Records: 1000000 Duplicates: 0 Warnings: 0
1.869 sec
SELECT * FROM digits ORDER BY RAND() LIMIT 200
200 row(s) returned
0.465 sec / 0.000 sec
SELECT * FROM (SELECT * FROM digits LIMIT 200)t ORDER BY RAND()
200 row(s) returned
0.000 sec / 0.000 sec
Using RAND() in your query has serious performance implications; avoiding it will speed up your query a lot.
Also, since you're using PHP, randomizing the order with PHP's shuffle() may be a significantly quicker alternative to doing it in MySQL.
See: http://php.net/manual/en/function.shuffle.php
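If a genuinely random set of stores is needed, another option is to do the random pick on the stores table alone and join afterwards, so that RAND() never has to sort the joined 10M-row result. This is a sketch under assumptions that are not stated in the question: that stores is much smaller than items, and that GEOCODES.address is unique (otherwise the outer join can return more than 200 rows).

```sql
-- Sketch: pick 200 random stores first, then attach the geocode.
SELECT s.phone, s.storeID, s.name, s.ZIP, s.state, s.city, g.lon, g.lat
FROM (
    SELECT storeID, phone, name, ZIP, state, city, address1
    FROM stores
    WHERE phone IS NOT NULL
      -- keep only stores that have at least one item, as the original join does
      AND EXISTS (SELECT 1 FROM items WHERE items.store_ID = stores.storeID)
    ORDER BY RAND()          -- sorts only the (assumed small) stores table
    LIMIT 200
) AS s
LEFT JOIN GEOCODES AS g
       ON g.address = CONCAT(s.address1, ', ', s.ZIP);
```

An index on items.store_ID would help the EXISTS check; whether one exists is not clear from the question.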

Update anomaly: MySQL reports a different affected-rows count

SELECT * FROM `attempts` WHERE date = '27-04-2014' LIMIT 0 , 30
This particular query gave 386 results (in phpMyAdmin), but on executing the query below
UPDATE `attempts` SET points = points *2 WHERE date = '27-04-2014'
I got 379 rows affected. Shouldn't I get the same number? Are there any other possible reasons? Or am I wrong somewhere?
The query won't affect the rows where points = 0, because doubling the value of points won't have any effect.
For example, try running this query:
UPDATE `attempts` SET points = points + 0 WHERE date = '27-04-2014'
and it will show 0 rows affected.
Also, the count shown by phpMyAdmin is an estimate, if you're using InnoDB. Use COUNT(*) to get the exact count.
SELECT COUNT(*) FROM `attempts` WHERE date = '27-04-2014'
"Affected rows" counts only rows that were changed. If you have records with 0 points, doubling the number of points has no effect and these records will not be included in the count.
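To check this explanation against the table, the matched rows and the rows whose value would actually change can be counted side by side. This is a sketch; the aliases matched_rows and rows_that_change are made up for illustration, and it relies on MySQL evaluating a comparison to 1 or 0.

```sql
-- Sketch: points * 2 differs from points only when points is non-zero
-- (NULL stays NULL and is not counted either).
SELECT
    COUNT(*)         AS matched_rows,      -- rows the WHERE clause selects
    SUM(points <> 0) AS rows_that_change   -- rows the UPDATE would really modify
FROM `attempts`
WHERE date = '27-04-2014';
```

If the explanation holds, rows_that_change should match the affected-rows figure reported by the UPDATE.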

MySQL groupby with sum

I have a query with GROUP BY and SUM on a table with close to 1 million records. When I run the query, it takes 2.5s. If I remove the GROUP BY clause, it takes 0.89s. Is there any way to optimize a query that uses GROUP BY and SUM together?
SELECT aggEI.ei_uuid AS uuid,aggEI.companydm_id AS companyId,aggEI.rating AS rating,aggEI.ei_name AS name,
compdm.company_name AS companyName,sum(aggEI.count) AS activity
FROM AGG_EXTERNALINDIVIDUAL AS aggEI
JOIN COMPANYDM AS compdm ON aggEI.companydm_id = compdm.companydm_id
WHERE aggEI.ei_uuid is not null
and aggEI.companydm_id IN (8)
and aggEI.datedm_id = 20130506
AND aggEI.topicgroupdm_id IN (1,2,3,4,5,6,7)
AND aggEI.rating >= 0
AND aggEI.rating <= 100
GROUP BY aggEI.ei_uuid,aggEI.companydm_id
LIMIT 0,200000
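The EXPLAIN result is as below:
Row 1
id: 1
select_type: SIMPLE
table: compdm
type: const
possible_keys: PRIMARY,companydm_id_UNIQUE,comp_idx
key: PRIMARY
key_len: 8
ref: const
rows: 1
Extra: Using temporary; Using filesort
Row 2
id: 1
select_type: SIMPLE
table: aggEI
type: ref
possible_keys: PRIMARY,datedm_id_UNIQUE,agg_ei_comdm_fk_idx,agg_ei_datedm_fk_idx,agg_ei_topgrp_fk_idx,uid_comp_ei_dt_idx,uid_comp_dt_idx,comp_idx
key: datedm_id_UNIQUE
key_len: 4
ref: const
rows: 197865
Extra: Using where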
Also, I don't understand why the compdm table is executed first. Can someone explain?
I have an index on the AGG_EXTERNALINDIVIDUAL table on the combination of ei_uuid, companydm_id and datedm_id. It is shown for the aggEI table under possible keys as uid_comp_dt_idx, but aggEI uses datedm_id_UNIQUE as its key. I don't understand this behavior.
Can someone explain?
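compdm is listed first because the optimizer treats it as a const table: the EXPLAIN output shows type const on its PRIMARY key, meaning at most one row can match, so that row is read once up front and its columns are treated as constants for the rest of the query.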
You need to check indexing on AGG_EXTERNALINDIVIDUAL.
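As a starting point for that check, one thing to try is a composite index that leads with the columns compared for equality in the WHERE clause and ends with the grouping column. The existing uid_comp_dt_idx starts with ei_uuid, which the query only tests with IS NOT NULL, so it does little to narrow the rows. The index name below is hypothetical, and whether the optimizer picks the index or drops the temporary table depends on the data and the MySQL version.

```sql
-- Sketch: equality columns first (datedm_id, companydm_id), then the GROUP BY column.
-- agg_ei_dt_comp_uuid_idx is a made-up name for illustration.
ALTER TABLE AGG_EXTERNALINDIVIDUAL
    ADD INDEX agg_ei_dt_comp_uuid_idx (datedm_id, companydm_id, ei_uuid);
```

After adding it, running EXPLAIN on the original query again shows whether the key column for aggEI changes and whether "Using temporary; Using filesort" is still reported.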

HOW to understand this EXPLAIN plan..? It is looking confusing to me

I have a table with 130 million rows in it. When I EXPLAIN the query, it shows 130 million in the rows column.
EXPLAIN SELECT * FROM TABLE1;
rows: 130 Mill
But when I add a WHERE condition with TABLE1.time_dim_id between 1900 and 2000:
EXPLAIN SELECT * FROM TABLE1 WHERE time_dim_id between 1900 and 2000 ;
rows: 60Mill
I am surprised because there are no NULL values in time_dim_id, and
min(time_dim_id) is 1900 and
max(time_dim_id) is 2000.
Why does the second EXPLAIN plan not show 130 million rows again in the rows column?
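The rows value in EXPLAIN is an estimate based on the table and index statistics, not an exact count. Try
ANALYZE TABLE TABLE1;
and then run EXPLAIN again. This assumes there is an index on TABLE1.time_dim_id.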
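To see how far the estimate is from reality, the refreshed EXPLAIN output can be compared with an exact count. This is a sketch of the sequence; the COUNT(*) has to read every matching row, so it may take a while on a table of this size.

```sql
-- Refresh the statistics the optimizer uses for its row estimates.
ANALYZE TABLE TABLE1;

-- Estimated number of rows examined (the "rows" column).
EXPLAIN SELECT * FROM TABLE1 WHERE time_dim_id BETWEEN 1900 AND 2000;

-- Exact number of matching rows, for comparison with the estimate.
SELECT COUNT(*) FROM TABLE1 WHERE time_dim_id BETWEEN 1900 AND 2000;
```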