This question already has answers here:
SQL select only rows with max value on a column [duplicate]
(27 answers)
Closed 4 years ago.
this one is driving me to drink so I would love some help.
I've got a table with:
act_Address, act_OrderID, act_Date
I'm trying to get the first act_Date for each address we shipped to.
Here's what I've tried but it's been running now for well over an hour so I'm thinking this isn't going to work...
SELECT c.act_Address,
(SELECT o.act_OrderID
FROM tbl_Activity o
WHERE c.act_Address = o.act_Address
ORDER BY o.act_Date
LIMIT 1) AS order_id,
(SELECT d.act_Date
FROM tbl_Activity d
WHERE c.act_Address = d.act_Address
ORDER BY d.act_Date
LIMIT 1) as order_date
FROM tbl_Activity c
I've got to be doing something very wrong, doesn't seem like getting the first date for an address would be that hard, but I'm not that smart.
Your query uses two correlated subqueries to get act_Date and act_OrderID values. Each subquery is executed once for every record of tbl_Activity.
You can use:
SELECT act_Address, MIN(act_Date) AS first_Date
FROM tbl_Activity
GROUP BY act_Address
to get the first date per address. Then you can use the above query as a derived table and join back to the original table to get the rest of the fields:
SELECT t1.act_Address, t1.act_OrderID, t1.act_date
FROM tbl_Activity AS t1
JOIN (
SELECT act_Address, MIN(act_Date) AS first_Date
FROM tbl_Activity
GROUP BY act_Address
) AS t2 ON t1.act_Address = t2.act_Address AND t1.act_Date = t2.first_Date
I also propose placing a composite index on (act_Address, act_Date).
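To try the join-back approach on a small scale, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite is only a stand-in for whatever server the question uses, and the sample rows and the index name idx_activity_addr_date are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_Activity (act_Address TEXT, act_OrderID INTEGER, act_Date TEXT);
    -- Composite index as proposed above; the index name is hypothetical.
    CREATE INDEX idx_activity_addr_date ON tbl_Activity (act_Address, act_Date);
    -- Made-up sample rows: two addresses, two orders each.
    INSERT INTO tbl_Activity VALUES
        ('1 Main St', 101, '2020-03-01'),
        ('1 Main St', 102, '2020-01-15'),
        ('9 Oak Ave', 103, '2020-02-10'),
        ('9 Oak Ave', 104, '2020-05-20');
""")

# Derived table finds the earliest date per address; the join pulls the full row.
first_orders = conn.execute("""
    SELECT t1.act_Address, t1.act_OrderID, t1.act_Date
    FROM tbl_Activity AS t1
    JOIN (
        SELECT act_Address, MIN(act_Date) AS first_Date
        FROM tbl_Activity
        GROUP BY act_Address
    ) AS t2 ON t1.act_Address = t2.act_Address AND t1.act_Date = t2.first_Date
    ORDER BY t1.act_Address
""").fetchall()

for row in first_orders:
    print(row)
```

Note that if an address has two orders on the same earliest date, this join returns both rows.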
You can do this by GROUP BY in a subselect:
SELECT a.act_Address, a.act_OrderID, a.act_Date
FROM (
SELECT a2.act_Address addr, MIN(a2.act_Date) mindate FROM tbl_Activity a2
GROUP BY a2.act_Address
) g, tbl_Activity a
WHERE a.act_Address = g.addr AND a.act_Date = g.mindate;
Related
I am trying to produce a result that shows duplicates in a table. One method I found for getting duplicates and showing them is to run the select statement again through an inner join. However, one of my columns needs to be the result of a function, and the only thing I can think to do is use an alias, however I can't use the alias twice in a SELECT statement.
I am not sure what the best way is to write this query to get the duplicates I need.
My code below
SELECT EXTRACT(YEAR_MONTH FROM date) as 'ndate', a.transponderID
FROM dispondo_prod_disposition.event a
inner JOIN (SELECT EXTRACT(YEAR_MONTH FROM date) as ???,
transponderID, COUNT(*)
FROM dispondo_prod_disposition.event
GROUP BY mdate, transponderID
HAVING count(*) > 1 ) b
ON ndate = ???
AND a.transponderID = b.transponderID
ORDER BY b.transponderID
SELECT b.ndate, transponderID
FROM dispondo_prod_disposition.event a
INNER JOIN ( SELECT EXTRACT(YEAR_MONTH FROM date) as ndate,
transponderID
FROM dispondo_prod_disposition.event
GROUP BY 1, 2
HAVING COUNT(*) > 1 ) b USING (transponderID)
WHERE b.ndate = ??? -- for example, WHERE b.ndate = 202201
ORDER BY transponderID
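As a rough illustration of the duplicate search, the sketch below names the computed month once inside the derived table and repeats the expression on the outer side of the join, so no alias has to be reused in the same SELECT. It runs on in-memory SQLite from Python, so strftime('%Y%m', date) is swapped in for MySQL's EXTRACT(YEAR_MONTH FROM date); the id column and the sample rows are assumptions.

```python
import sqlite3

# In-memory SQLite stand-in; the id column and the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, date TEXT, transponderID INTEGER);
    INSERT INTO event (date, transponderID) VALUES
        ('2022-01-03', 7),
        ('2022-01-19', 7),
        ('2022-02-02', 7),
        ('2022-01-05', 8);
""")

# b lists each (month, transponder) pair seen more than once;
# joining back on both columns returns every row that is part of a duplicate.
duplicates = conn.execute("""
    SELECT a.id, b.ndate, a.transponderID
    FROM event AS a
    INNER JOIN (
        SELECT strftime('%Y%m', date) AS ndate, transponderID
        FROM event
        GROUP BY ndate, transponderID
        HAVING COUNT(*) > 1
    ) AS b
      ON strftime('%Y%m', a.date) = b.ndate
     AND a.transponderID = b.transponderID
    ORDER BY a.id
""").fetchall()

for row in duplicates:
    print(row)
```

Only the two January rows for transponder 7 come back; the February row and transponder 8 appear once per month and are left out.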
This question already has answers here:
Select first row in each GROUP BY group?
(20 answers)
Closed 2 years ago.
I came across this interesting problem. I have a table named email_track to track email status for each category say (invitation, newsletter)
This is how my table data looks,
With the following queries I'm able to get the most recent record for each to_email:
with `et2` as (
select `et1`.`category`, `et1`.`to_email`, `et1`.`subject`, `et1`.`status`, ROW_NUMBER() OVER (partition by `to_email` order by `id` desc) as `rn`
from `email_track` `et1`
)
select * from `et2` where `rn` = 1;
select `et1`.`category`, `et1`.`to_email`, `et1`.`subject`, `et1`.`status`, `et2`.`id`
from `email_track` `et1`
left join `email_track` `et2` on (`et1`.`to_email` = `et2`.`to_email` and `et1`.`id` < `et2`.`id`)
where `et2`.`id` is null;
What I'm expecting is that for the email john#example.com I get two records: one for the category invitation and the other for newsletter. We don't get that result now, since we partition by to_email.
I should get two records one for category invitation and the other for the newsletter. Now, we won't get that result since we partition by to_email.
Adding the category to the partition by clause of the window function should be enough to give you the result that you want:
with et2 as (
select et1.category, et1.to_email, et1.subject, et1.status,
row_number() over(partition by to_email, category order by id desc) as rn
from email_track et1
)
select * from et2 where rn = 1;
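A quick way to check the effect of the wider partition is to run it on a few rows. The sketch below uses Python with an in-memory SQLite database (window functions need SQLite 3.25 or later) rather than the asker's server, and the sample rows and addresses are made up.

```python
import sqlite3

# In-memory SQLite stand-in; rows and addresses are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE email_track (
        id INTEGER PRIMARY KEY, category TEXT, to_email TEXT, subject TEXT, status TEXT);
    INSERT INTO email_track (category, to_email, subject, status) VALUES
        ('invitation', 'john@example.com', 'Invite',  'sent'),
        ('invitation', 'john@example.com', 'Invite',  'opened'),
        ('newsletter', 'john@example.com', 'News #1', 'sent'),
        ('newsletter', 'mary@example.com', 'News #1', 'bounced');
""")

# Numbering restarts for every (to_email, category) pair, newest id first,
# so rn = 1 keeps one row per address and category.
latest = conn.execute("""
    WITH et2 AS (
        SELECT category, to_email, subject, status,
               ROW_NUMBER() OVER (PARTITION BY to_email, category ORDER BY id DESC) AS rn
        FROM email_track
    )
    SELECT to_email, category, status
    FROM et2
    WHERE rn = 1
    ORDER BY to_email, category
""").fetchall()

for row in latest:
    print(row)
```

With these rows, the john address yields two results (the latest invitation and the latest newsletter) instead of one.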
This question already has answers here:
SQL select only rows with max value on a column [duplicate]
(27 answers)
Closed 2 years ago.
I'm trying to run a distinct on four columns in the query below:
select
full_records.id,
full_records.domain_id,
subdomains.name as subdomain_name,
types.name as type_name,
changelog.content as content,
changelog.changed_on
from full_records
inner join subdomains on full_records.subdomain_id = subdomains.id
inner join types on full_records.type_id = types.id
inner join changelog on full_records.id = changelog.full_record_id
where
full_records.domain_id = 2
order by changelog.changed_on desc
and this returns the following:
I'm not sure how to go about altering the query so that it only returns the records that are unique across these four fields.
full_records.domain_id,
subdomains.name as subdomain_name,
types.name as type_name,
changelog.content as content
So if they were unique across those four fields, the rows 2, 3, 4 and 7 would not be in the results. It's basically to identify the latest change for a domain record. Any help would be really appreciated. Thanks.
One pretty simple method is row_number():
with cte as (
select fr.id, fr.domain_id, sd.name as subdomain_name,
t.name as type_name, cl.content, cl.changed_on
from full_records fr join
subdomains sd
on fr.subdomain_id = sd.id join
types t
on fr.type_id = t.id join
changelog cl
on fr.id = cl.full_record_id
where fr.domain_id = 2
)
select cte.*
from (select cte.*,
row_number() over (partition by domain_id, subdomain_name, type_name, content
order by changed_on desc
) as seqnum
from cte
) cte
where seqnum = 1;
Note that I added table aliases so the query is easier to write and to read.
This question already has answers here:
SQL select only rows with max value on a column [duplicate]
(27 answers)
Closed 4 years ago.
I have the following query
SELECT DISTINCT XCS_TASK.WORKFLOW_ID,
XCS_TASK.COMPLETED_BY,
XCS_WORKFLOW.OBJECT_KEY,
XCS_WORKFLOW.OBJECT_TYPE_ID,
XCS_WORKFLOW.END_DATE_TIME,
XCS_WORKFLOW.START_DATE_TIME
FROM `XCS_TASK`
inner JOIN XCS_WORKFLOW ON
XCS_TASK.WORKFLOW_ID = XCS_WORKFLOW.WORKFLOW_ID
WHERE TASK_TYPE_ID = 124
GROUP BY XCS_WORKFLOW.OBJECT_KEY
ORDER BY XCS_WORKFLOW.START_DATE_TIME DESC
The problem is that I want to get the latest record for each OBJECT_KEY. I know the above query is wrong because it groups first and then sorts the grouped result. I looked into using the MAX(DATE) function but I couldn't get it to work in this scenario. Any help or pointers would be appreciated.
You could try joining the aggregated result for OBJECT_KEY and max date (eg: start_date_time)
SELECT
XCS_TASK.WORKFLOW_ID,
XCS_TASK.COMPLETED_BY,
XCS_WORKFLOW.OBJECT_KEY,
XCS_WORKFLOW.OBJECT_TYPE_ID,
XCS_WORKFLOW.END_DATE_TIME,
XCS_WORKFLOW.START_DATE_TIME
FROM `XCS_TASK`
INNER JOIN XCS_WORKFLOW ON XCS_TASK.WORKFLOW_ID = XCS_WORKFLOW.WORKFLOW_ID
INNER JOIN (
SELECT
XCS_WORKFLOW.OBJECT_KEY,
MAX( XCS_WORKFLOW.START_DATE_TIME ) max_date
FROM XCS_WORKFLOW
GROUP BY OBJECT_KEY
) t ON t.OBJECT_KEY = XCS_WORKFLOW.OBJECT_KEY
AND XCS_WORKFLOW.START_DATE_TIME = t.max_date
WHERE TASK_TYPE_ID = 124
This question already has answers here:
SQL - HAVING vs. WHERE
(9 answers)
Closed 4 years ago.
How to select rows where sum of a row is over 200?
I tried all kinds of combinations with grouping, setting AS something and using WHERE clause
My current attempt is as follows:
SELECT something.CustomerName, something.CustomerAge, cars.Prices,
SUM(cars.Price) AS Amount
FROM cars
INNER JOIN something ON something.CustomerNo=Cars.CustomerNo
GROUP BY AMOUNT
WHERE AMOUNT > '200'
I could not find a tutorial on how to do this
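In your current attempt the WHERE clause is in the wrong place: WHERE must come before GROUP BY. The query should also group by the customer columns rather than by the aggregate alias: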
SELECT something.CustomerName, something.CustomerAge,
SUM(cars.Price) AS Amount
FROM cars
INNER JOIN something ON something.CustomerNo=Cars.CustomerNo
GROUP BY something.CustomerName, something.CustomerAge
HAVING SUM(cars.Price) > 200;
However, what you actually need is a filter on Amount, and a WHERE clause cannot do that because it is evaluated before the aggregation. Use a HAVING clause instead of WHERE for that filter.
I'd also advise using table aliases, which make the query more readable and easier to write:
SELECT s.CustomerName, s.CustomerAge,
SUM(c.Price) AS Amount
FROM cars as c -- table alias to keep the query short and readable
INNER JOIN something as s ON s.CustomerNo = c.CustomerNo -- use the aliases everywhere in the query
GROUP BY s.CustomerName, s.CustomerAge
HAVING SUM(c.Price) > 200;
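To see WHERE and HAVING side by side, here is a minimal sketch in Python with in-memory SQLite standing in for the real database. The sample customers, prices, and the extra WHERE filter on individual prices are assumptions added purely to show the difference.

```python
import sqlite3

# In-memory SQLite stand-in; customers and prices are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE something (CustomerNo INTEGER PRIMARY KEY, CustomerName TEXT, CustomerAge INTEGER);
    CREATE TABLE cars (CustomerNo INTEGER, Price INTEGER);
    INSERT INTO something VALUES (1, 'Ann', 34), (2, 'Bob', 51);
    INSERT INTO cars VALUES (1, 150), (1, 120), (1, 40), (2, 90), (2, 60);
""")

# WHERE drops individual rows before grouping (here: the 40 car);
# HAVING drops whole groups after SUM has been computed.
big_spenders = conn.execute("""
    SELECT s.CustomerName, s.CustomerAge, SUM(c.Price) AS Amount
    FROM cars AS c
    INNER JOIN something AS s ON s.CustomerNo = c.CustomerNo
    WHERE c.Price >= 50
    GROUP BY s.CustomerName, s.CustomerAge
    HAVING SUM(c.Price) > 200
""").fetchall()

print(big_spenders)
```

Ann's total is 270 because the 40 row was removed by WHERE before summing; Bob's total of 150 fails the HAVING test, so his group is dropped.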
You could use a subquery to do your maths, then select the values you want
SELECT * FROM
(
SELECT (col1 + col2) AS SumOfCols, table.* FROM table
) AS TableSums -- most databases require an alias on a derived table
WHERE SumOfCols > 200
Or another similar approach is to join an ad-hoc table
SELECT table.* FROM table
INNER JOIN
(
SELECT ID, (col1 + col2) AS SumOfCols FROM table
) AS TableSums ON TableSums.ID = table.ID
WHERE TableSums.SumOfCols > 200
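For the per-row case (adding two columns of the same row rather than aggregating across rows), the derived-table approach can be sketched as follows. It runs on in-memory SQLite from Python; the table name readings and its rows are hypothetical, standing in for the placeholder table above.

```python
import sqlite3

# In-memory SQLite stand-in; "readings" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (ID INTEGER PRIMARY KEY, col1 INTEGER, col2 INTEGER);
    INSERT INTO readings VALUES (1, 150, 60), (2, 100, 100), (3, 20, 30);
""")

# The inner query names the computed column; the outer query can then
# filter on that name in an ordinary WHERE clause.
rows = conn.execute("""
    SELECT sums.ID, sums.SumOfCols
    FROM (
        SELECT r.*, (r.col1 + r.col2) AS SumOfCols
        FROM readings AS r
    ) AS sums
    WHERE sums.SumOfCols > 200
""").fetchall()

print(rows)
```

Row 2 sums to exactly 200 and is excluded, since the comparison is strictly greater than.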