SELECT MAX in GROUP BY but LIMIT results to 1 in MYSQL - mysql

I have the following tables:
Task (id,....)
TaskPlan (id, task_id,.......,end_at)
Note that end_at is a timestamp and that one Task has many TaskPlans. I need to query for the MAX end_at for each Task.
This query works fine, except when you have the same exact timestamp for different TaskPlans. In that case, I would be returned multiple TaskPlans with the MAX end_at for the same Task.
I know this is an unlikely situation, but is there any way I can limit the number of results for each task_id to 1?
My current code is:
SELECT * FROM Task AS t
INNER JOIN (
SELECT * FROM TaskPlan WHERE end_at in (SELECT MAX(end_at) FROM TaskPlan GROUP BY task_id )
) AS pt
ON pt.task_id = t.id
WHERE status = 'plan';
This works except in the situation above. How can this be achieved?
Also, in the subquery, instead of SELECT MAX(end_at) FROM TaskPlan GROUP BY task_id, is it possible to do something like this, so that I can use TaskPlan.id in the WHERE ... IN clause?
SELECT id, MAX(end_at) FROM TaskPlan GROUP BY task_id
When I try, it gives the following error:
SQL Error [1055] [42000]: Expression #1 of SELECT list is not in GROUP
BY clause and contains nonaggregated column 'TaskPlan.id' which is not
functionally dependent on columns in GROUP BY clause; this is
incompatible with sql_mode=only_full_group_by
Any explanation and suggestions would be very welcome!
Note on duplicate label: (Now reopened)
I already studied this question, but it does not provide an answer for my situation, where there are multiple max values in the result and they need to be filtered down to only one result row per group.

Use the id rather than the timestamp:
SELECT *
FROM Task AS t INNER JOIN
(SELECT tp.*
FROM TaskPlan tp
WHERE tp.id = (SELECT tp2.id FROM TaskPlan tp2 WHERE tp2.task_id = tp.task_id ORDER BY tp2.end_at DESC LIMIT 1)
) tp
ON tp.task_id = t.id
WHERE status = 'plan';
Or use IN with tuples (note that this still returns every row tied for the maximum end_at):
SELECT *
FROM Task AS t INNER JOIN
(SELECT tp.*
FROM TaskPlan tp
WHERE (tp.task_id, tp.end_at) in (SELECT tp2.task_id, MAX(tp2.end_at)
FROM TaskPlan tp2
GROUP BY tp2.task_id
)
) tp
ON tp.task_id = t.id
WHERE status = 'plan';
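To see how the two approaches above differ when timestamps tie, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, with made-up sample rows; the timestamp match is written as an equivalent join rather than a tuple IN, and the extra `tp2.id DESC` tie-break in the ORDER BY is my own assumption, added so that the chosen row is deterministic.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Task (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE TaskPlan (id INTEGER PRIMARY KEY, task_id INTEGER, end_at TEXT);
    INSERT INTO Task VALUES (1, 'plan'), (2, 'plan');
    -- Task 1 has two plans tied on the latest end_at.
    INSERT INTO TaskPlan VALUES
        (10, 1, '2020-01-01 00:00:00'),
        (11, 1, '2020-02-01 00:00:00'),
        (12, 1, '2020-02-01 00:00:00'),
        (20, 2, '2020-03-01 00:00:00');
""")

# Matching on (task_id, MAX(end_at)) keeps every tied row.
by_timestamp = conn.execute("""
    SELECT tp.task_id, tp.id
    FROM TaskPlan tp
    JOIN (SELECT task_id, MAX(end_at) AS max_end
          FROM TaskPlan
          GROUP BY task_id) m
      ON m.task_id = tp.task_id AND m.max_end = tp.end_at
    ORDER BY tp.task_id, tp.id
""").fetchall()

# Matching on id, with a tie-break on id, keeps exactly one row per task.
by_id = conn.execute("""
    SELECT tp.task_id, tp.id
    FROM TaskPlan tp
    WHERE tp.id = (SELECT tp2.id
                   FROM TaskPlan tp2
                   WHERE tp2.task_id = tp.task_id
                   ORDER BY tp2.end_at DESC, tp2.id DESC
                   LIMIT 1)
    ORDER BY tp.task_id
""").fetchall()

print(by_timestamp)  # task 1 appears twice
print(by_id)         # one row per task
```

In this sample the timestamp match returns both tied plans for task 1, while the id match returns a single plan per task.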

If you want a list of task IDs with the MAX end_at for each, run the query below:
SELECT t.id, MAX(tp.end_at) FROM Task t JOIN TaskPlan tp on t.id = tp.task_id GROUP BY t.id;
EDIT:
Now I understand what you are trying to do.
If the TaskPlan table is very large, you can avoid the GROUP BY and use user variables instead, which can be more efficient. Note that this relies on MySQL evaluating the variables in row order, which is not guaranteed, so test it on your version:
SET @first_row := 0;
SET @task_id := 0;
SELECT * FROM Task t JOIN (
SELECT tp.*
, IF(@task_id = tp.`task_id`, @first_row := 0, @first_row := 1) AS temp
, @first_row AS latest_record
, @task_id := tp.`task_id`
FROM TaskPlan tp ORDER BY task_id, end_at DESC) a ON t.id = a.task_id AND a.latest_record = 1;

Try this query:
select t.id, tp1.end_at
from Task t
left join TaskPlan tp1 on t.id = tp1.task_id
left join TaskPlan tp2 on t.id = tp2.task_id and tp1.end_at < tp2.end_at
where tp2.end_at is null;

Related

Mysql Select unique record based on multiple columns and display only group and sum amount

Hi, I am trying to query a table that contains multiple duplicates on Code, Amount and Status. How can I do this if I only want to get one result row per client_group name, with the sum of the amounts under that group?
SELECT `client`.`client_group`
, FORMAT(SUM(`Data_result`.`Data_result_amount` ),2) as sum
FROM
`qwer`.`Data_result`
INNER JOIN `qwer`.`Data`
ON (`Data_result`.`Data_result_lead` = `Data`.`Data_id`)
INNER JOIN `qwer`.`Data_status`
ON (`Data_result`.`Data_result_status_id` = `Data_status`.`Data_status_id`)
INNER JOIN `qwer`.`client`
ON (`Data`.`Data_client_id` = `client`.`client_id`)
WHERE `Data_status`.`Data_status_name` IN ('PAID') AND MONTH(`Data_result`.`result_ts`) = MONTH(CURRENT_DATE())
AND YEAR(`Data_result`.`result_ts`) = YEAR(CURRENT_DATE())
GROUP BY `client`.`client_group`
Result of said query:
Table
Try applying DISTINCT before running the SUM, and check whether this solves your problem:
SELECT `client_group` , FORMAT(SUM(`Data_result_amount` ),2) as sum from (
SELECT DISTINCT `client`.`client_group` , `Data_result`.`Data_result_amount`
FROM
`qwer`.`Data_result`
INNER JOIN `qwer`.`Data`
ON (`Data_result`.`Data_result_lead` = `Data`.`Data_id`)
INNER JOIN `qwer`.`Data_status`
ON (`Data_result`.`Data_result_status_id` = `Data_status`.`Data_status_id`)
INNER JOIN `qwer`.`client`
ON (`Data`.`Data_client_id` = `client`.`client_id`)
WHERE `Data_status`.`Data_status_name` IN ('PAID') AND MONTH(`Data_result`.`result_ts`) = MONTH(CURRENT_DATE())
AND YEAR(`Data_result`.`result_ts`) = YEAR(CURRENT_DATE())
) T
GROUP BY `client_group`
You can check the query here: http://sqlfiddle.com/#!9/36a3f8/6

Return MYSQL results based on a sub query count?

Is there a way to adjust the following MYSQL query so that it only shows results where the video count is greater than 0? I have tried the following but it doesn't recognise video_count:
SELECT channels.*,
(SELECT COUNT(*) FROM videos WHERE videos.video_publisher_id = channels.channel_id) as `video_count`
FROM channels
WHERE channel_active = 1
AND video_count > 0
AND channel_thumbnail IS NOT NULL
ORDER BY channel_subscribers DESC
Move your subquery from the SELECT clause to the FROM clause. An inner join guarantees matches.
SELECT c.*, v.video_count
FROM channels c
JOIN
(
SELECT video_publisher_id, COUNT(*) AS video_count
FROM videos
GROUP BY video_publisher_id
) v ON v.video_publisher_id = c.channel_id
WHERE c.channel_active = 1
AND c.channel_thumbnail IS NOT NULL
ORDER BY c.channel_subscribers DESC;
Maybe this will help you.
SELECT channels.*,video_count.count
FROM channels
left join (SELECT video_publisher_id, COUNT(*) count FROM videos GROUP BY video_publisher_id) as `video_count` on video_count.video_publisher_id = channels.channel_id
WHERE channel_active = 1
AND video_count.count > 0
AND channel_thumbnail IS NOT NULL
ORDER BY channel_subscribers DESC
To filter on a subquery result or an aggregate value, you can use HAVING. Your WHERE clause fails because a column alias defined in the SELECT list is not visible in WHERE:
SELECT channels.*,
(SELECT COUNT(*) FROM videos WHERE videos.video_publisher_id = channels.channel_id) as `video_count`
FROM channels
WHERE channel_active = 1
AND channel_thumbnail IS NOT NULL
HAVING video_count > 0
ORDER BY channel_subscribers DESC
Or:
SELECT channels.*, COUNT(*) AS video_count
FROM channels
INNER JOIN videos
ON videos.video_publisher_id = channels.channel_id
WHERE channel_active = 1
AND channel_thumbnail IS NOT NULL
GROUP BY channels.channel_id
ORDER BY channel_subscribers DESC
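As a quick check that the derived-table join and the INNER JOIN with GROUP BY shown above return the same channels, here is a minimal sketch. It uses Python's built-in sqlite3 module in place of MySQL, and the three channels and three videos are invented sample data.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE channels (channel_id INTEGER PRIMARY KEY, channel_active INTEGER,
                           channel_thumbnail TEXT, channel_subscribers INTEGER);
    CREATE TABLE videos (video_id INTEGER PRIMARY KEY, video_publisher_id INTEGER);
    INSERT INTO channels VALUES (1, 1, 'a.png', 50), (2, 1, 'b.png', 90), (3, 1, 'c.png', 10);
    -- Channel 2 has no videos and should be excluded.
    INSERT INTO videos VALUES (100, 1), (101, 1), (102, 3);
""")

# Aggregate first in a derived table, then inner join to the channels.
derived = conn.execute("""
    SELECT c.channel_id, v.video_count
    FROM channels c
    JOIN (SELECT video_publisher_id, COUNT(*) AS video_count
          FROM videos
          GROUP BY video_publisher_id) v
      ON v.video_publisher_id = c.channel_id
    WHERE c.channel_active = 1 AND c.channel_thumbnail IS NOT NULL
    ORDER BY c.channel_subscribers DESC
""").fetchall()

# Join first, then group by the channel's primary key.
grouped = conn.execute("""
    SELECT c.channel_id, COUNT(*) AS video_count
    FROM channels c
    JOIN videos v ON v.video_publisher_id = c.channel_id
    WHERE c.channel_active = 1 AND c.channel_thumbnail IS NOT NULL
    GROUP BY c.channel_id
    ORDER BY c.channel_subscribers DESC
""").fetchall()

print(derived)
print(grouped)
```

Both queries drop channel 2 because the inner join finds no matching videos, so no explicit `video_count > 0` filter is needed.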

SQL: How to get cells by 2 last dates from 3 different tables?

I have 3 tables (stars match the ids from the table before):
product:
prod_id* prod_name prod_a_id prod_b_id prod_user
keywords:
key_id** key_word key_prod* kay_country
data:
id dat_id** dat_date dat_rank_a dat_traffic_a dat_rank_b dat_traffic_b
I want to run a query (in a function that receives a $key_id) that outputs all these columns, but only for the last 2 dates (dat_date) in the 'data' table for the given key_id, so that for every key_word I have the two latest dat_dates plus all the other columns included in my SQL query.
This is what I have so far. I don't know how to get only the latest dates; I tried using "max(dat_date)" in different ways, none of which worked.
SELECT prod_id, prod_name, prod_a_id, prod_b_id, key_id, key_word, kay_country, dat_date, dat_rank_a, dat_rank_b, dat_traffic_a, dat_traffic_b
FROM keywords
INNER JOIN data
ON keywords.key_id = data.dat_id
INNER JOIN prods
ON keywords.key_prod = prods.prod_id
Is it possible to do this with only one query?
EDIT (FOR IgorM):
public function newnew() {
$query = $this->db->query('WITH CTE AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY dat_id ORDER BY dat_date ASC) AS
RowNo FROM data
)
SELECT *
FROM CTE
INNER JOIN keywords
ON keywords.key_id = CTE.dat_id
INNER JOIN prods
ON keywords.key_prod = prods.prod_id
WHERE RowNo < 3
');
$result = $query->result();
return $result;
}
This is the error on the output:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CTE AS ( SELECT *, ROW_NUMBER() OVER (' at line 1
WITH CTE AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY dat_id ORDER BY dat_date ASC) AS RowNo FROM data ) SELECT * FROM CTE INNER JOIN keywords ON keywords.key_id = CTE.dat_id INNER JOIN prods ON keywords.key_prod = prods.prod_id WHERE RowNo < 3
For SQL Server, or any engine with CTEs and window functions (MySQL 8.0+ included)
WITH CTE AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY dat_id ORDER BY dat_date DESC) AS
RowNo FROM data
)
SELECT *
FROM CTE
INNER JOIN keywords
ON keywords.key_id = CTE.dat_id
INNER JOIN prods
ON keywords.key_prod = prods.prod_id
WHERE RowNo < 3
For MySQL before 8.0 (not tested)
SET @row_number := 0;
SET @dat_id := '';
SELECT *
FROM (
SELECT d.*,
@row_number := CASE WHEN @dat_id = d.dat_id THEN @row_number + 1 ELSE 1 END AS row_num,
@dat_id := d.dat_id AS dat_id_marker
FROM data d
ORDER BY d.dat_id, d.dat_date DESC
) ranked
INNER JOIN keywords
ON keywords.key_id = ranked.dat_id
INNER JOIN prods
ON keywords.key_prod = prods.prod_id
WHERE ranked.row_num < 3
The other approach is a self join. I don't want to take credit for somebody else's work, so please look at the following example:
ROW_NUMBER() in MySQL
Look for the following there:
SELECT a.i, a.j, (
SELECT count(*) from test b where a.j >= b.j AND a.i = b.i
) AS row_number FROM test a
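The correlated-count idea can be adapted to the "last two dates per dat_id" case. The following is only a sketch under assumptions: it runs on Python's built-in sqlite3 module rather than MySQL, the sample rows are invented, the table is cut down to a few columns, and it assumes dat_date values are unique within each dat_id (ties would let more than two rows through).

```python
import sqlite3

# SQLite stands in for MySQL; the table is a reduced, hypothetical version of 'data'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (id INTEGER PRIMARY KEY, dat_id INTEGER,
                       dat_date TEXT, dat_rank_a INTEGER);
    INSERT INTO data (dat_id, dat_date, dat_rank_a) VALUES
        (1, '2016-01-01', 5), (1, '2016-01-02', 4), (1, '2016-01-03', 3),
        (2, '2016-01-01', 9), (2, '2016-01-05', 8);
""")

# For each row, count the rows of the same dat_id dated on or after it.
# A count of 1 means "latest", 2 means "second latest"; keep those two.
latest_two = conn.execute("""
    SELECT a.dat_id, a.dat_date, a.dat_rank_a
    FROM data a
    WHERE (SELECT COUNT(*)
           FROM data b
           WHERE b.dat_id = a.dat_id
             AND b.dat_date >= a.dat_date) <= 2
    ORDER BY a.dat_id, a.dat_date DESC
""").fetchall()

for row in latest_two:
    print(row)
```

The correlated subquery runs once per row, so on a large table an index on (dat_id, dat_date) would likely matter.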
If you only want to do this for one key_id at a time (as alluded to in your responses to other answers) and only want two rows, you can just do:
SELECT p.prod_id,
p.prod_name,
p.prod_a_id,
p.prod_b_id,
k.key_id,
k.key_word,
k.key_country,
d.dat_date,
d.dat_rank_a,
d.dat_rank_b,
d.dat_traffic_a,
d.dat_traffic_b
FROM keywords k
JOIN data d
ON k.key_id = d.dat_id
JOIN prods p
ON k.key_prod = p.prod_id
WHERE k.key_id = :key_id /* Bind in key id */
ORDER BY d.dat_date DESC
LIMIT 2;
Whether you want this depends on your data structure and whether there is more than one key/prod combination per date.
Another option limiting just the data rows would be:
SELECT p.prod_id,
p.prod_name,
p.prod_a_id,
p.prod_b_id,
k.key_id,
k.key_word,
k.key_country,
d.dat_date,
d.dat_rank_a,
d.dat_rank_b,
d.dat_traffic_a,
d.dat_traffic_b
FROM keywords k
JOIN (
SELECT dat_id,
dat_date,
dat_rank_a,
dat_rank_b,
dat_traffic_a,
dat_traffic_b
FROM data
WHERE dat_id = :key_id /* Bind in key id */
ORDER BY dat_date DESC
LIMIT 2
) d
ON k.key_id = d.dat_id
JOIN prods p
ON k.key_prod = p.prod_id;
If you want some kind of grouped results for all the keywords, you'll need to look at the other answers.
I think a window function is the best way to go. Without knowing much about the structure of the data, you can try a subquery that ranks the rows you want to restrict, join that to the rest of the data, and then restrict the rows in the WHERE clause.
select p.prod_id, p.prod_name, p.prod_a_id, p.prod_b_id,
t.key_id, t.key_word, t.kay_country, t.dat_date,
t.dat_rank_a, t.dat_rank_b, t.dat_traffic_a, t.dat_traffic_b
from
(
select
k.key_id, k.key_word, k.kay_country, k.key_prod, d.dat_date, d.dat_rank_a,
d.dat_rank_b, d.dat_traffic_a, d.dat_traffic_b,
row_number() over (partition by dat_id order by dat_date desc) as RowNum
from keywords as k
inner join
data as d on k.key_id = d.dat_id
) as t
inner join
prods as p on t.key_prod = p.prod_id
where t.RowNum <= 2
This is a "groupwise max" problem. Reference. CTEs do not exist in MySQL before version 8.0.
I'm not totally clear on how your tables are linked, but here is a stab:
SELECT
*
FROM
( SELECT @prev := '', @n := 0 ) init
JOIN
( SELECT @n := if(k.key_id != @prev, 1, @n + 1) AS n,
@prev := k.key_id,
d.*, k.*, p.*
FROM data d
JOIN keywords k ON k.key_id = d.dat_id
JOIN prods p ON k.key_prod = p.prod_id
ORDER BY
k.key_id ASC,
d.dat_date DESC
) x
WHERE n <= 2
ORDER BY x.key_id, n;
you can use this query:
select prod_id, prod_name, prod_a_id, prod_b_id, key_id, key_word,
kay_country, dat_date, dat_rank_a, dat_rank_b, dat_traffic_a, dat_traffic_b
from keywords where dat_date in (
SELECT MAX(dat_date) FROM keywords temp_1
where temp_1.prod_id = keywords.prod_id
union all
SELECT MAX(dat_date) FROM keywords
WHERE dat_date NOT IN (SELECT MAX(dat_date ) FROM keywords temp_2 where
temp_2.prod_id = keywords.prod_id)
)

Join between sub-queries in SQLAlchemy

In relation to the answer I accepted for this post, SQL Group By and Limit issue, I need to figure out how to create that query using SQLAlchemy. For reference, the query I need to run is:
SELECT t.id, t.creation_time, c.id, c.creation_time
FROM (SELECT id, creation_time
FROM thread
ORDER BY creation_time DESC
LIMIT 5
) t
LEFT OUTER JOIN comment c ON c.thread_id = t.id
WHERE 3 >= (SELECT COUNT(1)
FROM comment c2
WHERE c.thread_id = c2.thread_id
AND c.creation_time <= c2.creation_time
)
I have the first half of the query, but I am struggling with the syntax for the WHERE clause and how to combine it with the JOIN. Does anyone have any suggestions?
Thanks!
EDIT: First attempt seems to mess up around the .filter() call:
c = aliased(Comment)
c2 = aliased(Comment)
subq = db.session.query(Thread.id).filter_by(topic_id=122098).order_by(Thread.creation_time.desc()).limit(2).offset(2).subquery('t')
subq2 = db.session.query(func.count(1).label("count")).filter(c.id==c2.id).subquery('z')
q = db.session.query(subq.c.id, c.id).outerjoin(c, c.thread_id==subq.c.id).filter(3 >= subq2.c.count)
this generates the following SQL:
SELECT t.id AS t_id, comment_1.id AS comment_1_id
FROM (SELECT count(1) AS count
FROM comment AS comment_1, comment AS comment_2
WHERE comment_1.id = comment_2.id) AS z, (SELECT thread.id AS id
FROM thread
WHERE thread.topic_id = :topic_id ORDER BY thread.creation_time DESC
LIMIT 2 OFFSET 2) AS t LEFT OUTER JOIN comment AS comment_1 ON comment_1.thread_id = t.id
WHERE z.count <= 3
Notice the sub-query ordering is incorrect, and subq2 somehow is selecting from comment twice. Manually fixing that gives the right results, I am just unsure of how to get SQLAlchemy to get it right.
Try this:
c = db.aliased(Comment, name='c')
c2 = db.aliased(Comment, name='c2')
sq = (db.session
.query(Thread.id, Thread.creation_time)
.order_by(Thread.creation_time.desc())
.limit(5)
).subquery(name='t')
sq2 = (
db.session.query(db.func.count(1))
.select_from(c2)
.filter(c.thread_id == c2.thread_id)
.filter(c.creation_time <= c2.creation_time)
.correlate(c)
.as_scalar()
)
q = (db.session
.query(
sq.c.id, sq.c.creation_time,
c.id, c.creation_time,
)
.outerjoin(c, c.thread_id == sq.c.id)
.filter(3 >= sq2)
)

Slow MySQL query with subquery from table

I am trying to bring back a string based on an IF statement but it is extremely slow.
It has something to do with the first subquery but I am unsure of how to rearrange this as to bring back the same results but faster.
Here is my SQL:
SELECT IF
(
(
SELECT COUNT(*)
FROM
(
SELECT DISTINCT enquiryId, type
FROM parts_enquiries, parts_service_types AS pst
WHERE parts_enquiries.serviceTypeId = pst.id
) AS parts
WHERE parts.enquiryId = enquiries.id
) > 1, 'Mixed',
(
SELECT DISTINCT type
FROM parts_enquiries, parts_service_types AS pst
WHERE parts_enquiries.serviceTypeId = pst.id AND enquiryId = enquiries.id
)
) AS partTypes
FROM enquiries,
entities
WHERE enquiries.entityId = entities.id
How can I make it faster?
I have modified my original query below, but I am getting the error that subquery returns more than one row:
SELECT
(SELECT
CASE WHEN COUNT(DISTINCT type) > 1 THEN 'Mixed' ELSE `type` END AS type
FROM parts_enquiries
INNER JOIN parts_service_types AS pst ON parts_enquiries.serviceTypeId = pst.id
INNER JOIN enquiries ON parts_enquiries.enquiryId = enquiries.id
INNER JOIN entities ON enquiries.entityId = entities.id
GROUP BY enquiryId) AS partTypes
FROM enquiries,
entities
WHERE enquiries.entityId = entities.id
Please have a look if this query yields the same results:
SELECT
enquiryId,
CASE WHEN COUNT(DISTINCT type) > 1 THEN 'Mixed' ELSE `type` END AS type
FROM parts_enquiries
INNER JOIN parts_service_types AS pst ON parts_enquiries.serviceTypeId = pst.id
INNER JOIN enquiries ON parts_enquiries.enquiryId = enquiries.id
INNER JOIN entities ON enquiries.entityId = entities.id
GROUP BY enquiryId
But N.B.'s comment is still valid. To see whether an index is used, and other information, we need to see the EXPLAIN output and the table definitions.
This should get you what you want.
I would first pre-query your parts enquiries and parts service types, looking for both the distinct count and the MINIMUM of the part 'type', grouped by the enquiry ID.
Then, run your IF() against that result. If the distinct count is > 1, then 'Mixed'. If there is only one, since I did the MIN(), it would only have the description of that one value that you desire anyhow.
SELECT
E.ID,
IF ( PreQuery.DistTypes > 1, 'Mixed', PreQuery.FirstType ) as PartType
from
Enquiries E
JOIN ( SELECT
PE.EnquiryID,
COUNT( DISTINCT PE.ServiceTypeID ) as DistTypes,
MIN( PST.Type ) as FirstType
from
Parts_Enquiries PE
JOIN Parts_Service_Types PST
ON PE.ServiceTypeID = PST.ID
group by
PE.EnquiryID ) as PreQuery
ON E.ID = PreQuery.EnquiryID
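The pre-aggregation idea can be tried on a tiny data set. This is a sketch only, run on Python's built-in sqlite3 module rather than MySQL, with invented rows and a reduced schema. Unlike the query above, it counts distinct `type` values rather than distinct ServiceTypeID values, which is an assumption on my part that matches the DISTINCT type in the question; the two differ only if several service type IDs share one type name.

```python
import sqlite3

# SQLite stands in for MySQL; tables and rows are hypothetical and reduced.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enquiries (id INTEGER PRIMARY KEY);
    CREATE TABLE parts_service_types (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE parts_enquiries (enquiryId INTEGER, serviceTypeId INTEGER);
    INSERT INTO enquiries VALUES (1), (2);
    INSERT INTO parts_service_types VALUES (1, 'Repair'), (2, 'Supply');
    -- Enquiry 1 mixes two types; enquiry 2 repeats a single type.
    INSERT INTO parts_enquiries VALUES (1, 1), (1, 2), (2, 2), (2, 2);
""")

# Aggregate once per enquiry, then decide between 'Mixed' and the single type.
rows = conn.execute("""
    SELECT e.id,
           CASE WHEN pq.dist_types > 1 THEN 'Mixed' ELSE pq.first_type END AS part_type
    FROM enquiries e
    JOIN (SELECT pe.enquiryId,
                 COUNT(DISTINCT pst.type) AS dist_types,
                 MIN(pst.type) AS first_type
          FROM parts_enquiries pe
          JOIN parts_service_types pst ON pe.serviceTypeId = pst.id
          GROUP BY pe.enquiryId) pq
      ON e.id = pq.enquiryId
    ORDER BY e.id
""").fetchall()

print(rows)
```

Because the grouping happens once in the derived table, the outer query no longer runs a subquery per enquiry row, which is the point of the rewrite.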