Struggling with an SQL query to select the 5 most recent, unique, entries in a MySQL 5.7.22 table. For example, here's the 'activity' table:
uaid nid created
9222 29722 2018-05-17 03:19:33
9221 31412 2018-05-17 03:19:19
9220 31160 2018-05-16 23:47:34
9219 31160 2018-05-16 23:47:30
9218 31020 2018-05-16 22:35:59
9217 31020 2018-05-16 22:35:54
9216 28942 2018-05-16 22:35:20
...
The desired query should return the 5 most recent, unique entries by the 'nid' attribute, in this order (but only need the nid attribute):
uaid nid created
9222 29722 2018-05-17 03:19:33
9221 31412 2018-05-17 03:19:19
9220 31160 2018-05-16 23:47:34
9218 31020 2018-05-16 22:35:59
9216 28942 2018-05-16 22:35:20
I have tried a variety of combinations of DISTINCT but none work, e.g.:
select distinct nid from activity order by created desc limit 5
What is the proper query to return the 5 most recent, unique entries by nid?
Your problem is the simplest form of the top-N-per-group problem. In general, this problem is a real headache to handle in MySQL, which doesn't support analytic functions (at least not in most versions folks are using in production these days). However, since you only want the first record per group, we can do a join to a subquery which finds the max created value for each nid group, and then keep the five newest of those rows.
SELECT a1.*
FROM activity a1
INNER JOIN
(
SELECT nid, MAX(created) AS max_created
FROM activity
GROUP BY nid
) a2
ON a1.nid = a2.nid AND a1.created = a2.max_created
ORDER BY a1.created DESC
LIMIT 5;
You can use a subquery and join
select m.* from activity m
inner join (
select nid, max(created) max_date
from activity
group by nid
order by max_date desc
limit 5
) t on t.nid = m.nid and t.max_date = m.created
order by m.created desc
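Since only the nid column is needed, a plain GROUP BY ordered by each group's newest timestamp may already be enough. The sketch below is not taken from either answer; it loads the sample rows into an in-memory SQLite database (an assumption, used here as a stand-in for MySQL 5.7) so the statement can be checked quickly.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption made for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity (uaid INTEGER PRIMARY KEY, nid INTEGER, created TEXT)"
)
conn.executemany(
    "INSERT INTO activity VALUES (?, ?, ?)",
    [
        (9222, 29722, "2018-05-17 03:19:33"),
        (9221, 31412, "2018-05-17 03:19:19"),
        (9220, 31160, "2018-05-16 23:47:34"),
        (9219, 31160, "2018-05-16 23:47:30"),
        (9218, 31020, "2018-05-16 22:35:59"),
        (9217, 31020, "2018-05-16 22:35:54"),
        (9216, 28942, "2018-05-16 22:35:20"),
    ],
)

# One row per nid, ranked by that nid's newest timestamp, cut to the top 5.
latest_nids = [
    row[0]
    for row in conn.execute(
        "SELECT nid FROM activity GROUP BY nid ORDER BY MAX(created) DESC LIMIT 5"
    )
]
print(latest_nids)
```

The DISTINCT attempt fails because ORDER BY created has no single created value to sort by once duplicate nid rows are collapsed; aggregating with MAX(created) makes that choice explicit.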
My base query:
SELECT project_id,
name,
stories_produced,
on_date
FROM project_prod
WHERE on_date IN ('2017-03-01', '2017-06-10')
ORDER BY project_id
It can get me these outputs:
Output example:
id name stories_produced on_date
1042 project 1 1001 (wanted) 2017-03-01
1042 project 1 1801 (wanted) 2017-06-10
1568 project 2 355 (wanted) 2017-06-10
1405 project 3 1 (not wanted) 2017-03-01
1405 project 3 1 (not wanted) 2017-06-10
Note: there is a constraint on (id, on_date), so there can only ever be one production record per project on a given date.
Duplicate records that have the same id, exist on both dates and have different production values (wanted)
Single records that exist on only one of the dates (wanted)
The problem:
Duplicate records that have the same id, exist on both dates and have equal production values (not wanted)
My current query, which needs to change:
select project_id,
name,
CASE
WHEN max(stories_produced) - min(stories_produced) = 0
THEN max(stories_produced)
ELSE max(stories_produced) - min(stories_produced)
END AS 'stories_produced'
from project_prod
WHERE on_date IN ('2017-03-01', '2017-06-10')
group by project_id;
output example:
id name stories_produced
1042 project 1 800 (wanted)
1568 project 2 355 (wanted)
1405 project 3 1 (not wanted)
The CASE is currently not taking care of the third constraint (Duplicate records, that have the same id, and exist in both dates and have EQUAL production values (not wanted))
Is there any possible condition that can accommodate this?
One option uses not exists to drop rows that have the same id, and exist in both dates and have equal production values:
select
p.project_id,
p.name,
p.stories_produced,
p.on_date
from project_prod p
where
p.on_date in ('2017-03-01', '2017-06-10')
and not exists (
select 1
from project_prod p1
where
p1.on_date in ('2017-03-01', '2017-06-10')
and p1.on_date <> p.on_date
and p1.project_id = p.project_id
and p1.stories_produced = p.stories_produced
)
order by p.project_id
In MySQL 8.0, you can use window functions:
select
project_id,
name,
stories_produced,
on_date
from (
select
p.*,
min(stories_produced) over(partition by project_id) min_stories_produced,
max(stories_produced) over(partition by project_id) max_stories_produced,
count(*) over(partition by project_id) cnt
from project_prod p
where on_date in ('2017-03-01', '2017-06-10')
) t
where not (cnt = 2 and min_stories_produced = max_stories_produced)
order by project_id
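If the aggregated output of the original query is what's needed (one row per project with the CASE expression), the same "two rows with equal values" test could probably go into a HAVING clause. This is a sketch rather than part of either answer: it runs against in-memory SQLite as a stand-in for MySQL, and it assumes project 3's first row falls on 2017-03-01 so that it matches the date filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project_prod ("
    " project_id INTEGER, name TEXT, stories_produced INTEGER, on_date TEXT,"
    " PRIMARY KEY (project_id, on_date))"
)
conn.executemany(
    "INSERT INTO project_prod VALUES (?, ?, ?, ?)",
    [
        (1042, "project 1", 1001, "2017-03-01"),
        (1042, "project 1", 1801, "2017-06-10"),
        (1568, "project 2", 355, "2017-06-10"),
        # Assumed date: taken as 2017-03-01 so the row is inside the filter.
        (1405, "project 3", 1, "2017-03-01"),
        (1405, "project 3", 1, "2017-06-10"),
    ],
)

query = """
SELECT project_id,
       name,
       CASE
           WHEN MAX(stories_produced) - MIN(stories_produced) = 0
               THEN MAX(stories_produced)
           ELSE MAX(stories_produced) - MIN(stories_produced)
       END AS stories_produced
FROM project_prod
WHERE on_date IN ('2017-03-01', '2017-06-10')
GROUP BY project_id, name
-- Drop projects present on both dates with identical production values.
HAVING NOT (COUNT(*) = 2 AND MIN(stories_produced) = MAX(stories_produced))
ORDER BY project_id
"""
result = conn.execute(query).fetchall()
print(result)
```

Grouping by both project_id and name keeps the statement valid when MySQL runs with ONLY_FULL_GROUP_BY enabled.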
I have two tables in the database: one stores basic client info (name, location, phone number) and the other stores client-related transactions (date_sub, profile_sub, isPaid, date_exp, client_id). I also have an HTML table that shows each client's basic info and, if available, their transactions. My problem is that I can't write a query that selects the client info from internetClient and from internetclientdetails at the same time, because my query only returns rows when the client has transactions in the details table. The fields of the two tables are as follows:
internetClient
--------------------------------------------------------
id full_name location phone_number
-------------------------------------------------------
4 Joe Amine beirut 03776132
5 Mariam zoue beirut 03556133
and
internetclientdetails
--------------------------------------------------------------------------
incdid icid date_sub date_exp isPaid sub_price
----------------------------------------------------------------------------
6 4 2018-01-01 2018-01-30 0 2000
7 5 2017-01-01 2017-01-30 0 1000
8 4 2018-03-01 2018-03-30 1 50000
9 5 2018-05-01 2019-05-30 1 90000
// incdid > internetClientDetailsId
// icid> internetClientId
If the client has transactions in internetclientdetails, the query should return values like this:
client_id full_name date_sub date_exp isPaid sub_price
-------------------------------------------------------------------------------------
4 Joe Amine 2018-03-01 2018-03-30 1 50000
5 Mariam zoue 2018-05-01 2019-05-30 1 90000
Otherwise, if the client has no rows in internetclientdetails:
--------------------------------------------------------
icid full_name location phone_number
-------------------------------------------------------
4 Joe Amine beirut 03776132
5 Mariam zoue beirut 03556133
Thanks in advance
Try a left join. It will display all records from internetClient together with the related records from internetclientdetails.
Select internetClient.id, internetClient.full_name
, internetClient.location, internetClient.phone_number
, internetclientdetails.incdid, internetclientdetails.icid
, internetclientdetails.date_sub, internetclientdetails.date_exp
, internetclientdetails.isPaid, internetclientdetails.sub_price
from internetClient
left join internetclientdetails
on internetClient.id=internetclientdetails.icid
group by internetclientdetails.icid
order by internetclientdetails.incdid desc
If you only want the detail records of paid clients, you can try the following:
Select internetClient.id, internetClient.full_name
, internetClient.location, internetClient.phone_number
, internetclientdetails.icid, internetclientdetails.incdid
, internetclientdetails.date_sub, internetclientdetails.date_exp
, internetclientdetails.isPaid, internetclientdetails.sub_price
from internetClient
left join internetclientdetails
on internetClient.id=internetclientdetails.icid
and internetclientdetails.isPaid=1 group by internetclientdetails.icid
order by internetclientdetails.incdid desc
SUMMARY
We generate a dataset containing just the ICID and max(date_sub) (alias: mICD). We join this to internetclientdetails (alias: ICDi) to obtain just the max-date record per client. Then we left join that result (alias: ICD) to internetClient (alias: IC), ensuring we keep all internetClient records and only show the related max detail record.
The approach below should work in most MySQL versions. It does not use an analytic (window) function, which could replace the derived table for finding the max date if your MySQL version supports one.
FINAL ANSWER:
SELECT IC.id
, IC.full_name
, IC.location
, IC.phone_number
, ICD.icid
, ICD.incdid
, ICD.date_sub
, ICD.date_exp
, ICD.isPaid
, ICD.sub_price
FROM internetClient IC
LEFT JOIN (SELECT ICDi.*
FROM internetclientdetails ICDi
INNER JOIN (SELECT max(date_sub) MaxDateSub, ICID
FROM internetclientdetails
GROUP BY ICID) mICD
ON ICDi.ICID = mICD.ICID
AND ICDi.Date_Sub = mICD.MaxDateSub
) ICD
on IC.id=ICD.icid
ORDER BY ICD.incdid desc
BREAKDOWN / EXPLANATION
The query below gives us the max(date_sub) for each ICID in internetclientdetails. We need this so we can filter out all the records which are not the max date per client ID.
(SELECT max(date_sub) MaxDateSub, ICID
FROM internetclientdetails
GROUP BY ICID) mICD
Using that set we join to the details on the Client_ID's and the max date to eliminate all but the most recent detail for each client. We do this because we need the other detail attributes. This could be done using a join or exists. I prefer the join approach as it seems more explicit to me.
(SELECT ICDi.*
FROM internetclientdetails ICDi
INNER JOIN (SELECT max(date_sub) MaxDateSub, ICID
FROM internetclientdetails
GROUP BY ICID) mICD
ON ICDi.ICID = mICD.ICID
AND ICDi.Date_Sub = mICD.MaxDateSub
) ICD
Finally the full query joins the client to the detail keeping client even if there is no detail using a left join.
COMPONENTS:
You wanted all records from InternetClient (FROM internetClient IC)
You wanted related records from InternetClientDetail (LEFT JOIN InternetClientDetail ICD) while retaining the records from InternetClient.
You ONLY wanted the most current record from InternetClientDetail (INNER JOIN InternetClientDetail mICD as a derived table getting ICID and max(date))
The total record count should equal the total record count in InternetClient, which means every relationship in the table joins must be 1:1o (one-to-one optional).
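To sanity-check the row-count claim, here is a small sketch that runs the same derived-table join against the sample rows. It uses in-memory SQLite as a stand-in for MySQL, and client 6 is a made-up row (not in the question) added only to show that a client without details survives the LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE internetClient (
        id INTEGER PRIMARY KEY, full_name TEXT, location TEXT, phone_number TEXT
    );
    CREATE TABLE internetclientdetails (
        incdid INTEGER PRIMARY KEY, icid INTEGER, date_sub TEXT,
        date_exp TEXT, isPaid INTEGER, sub_price INTEGER
    );
    -- Client 6 is hypothetical: it has no detail rows at all.
    INSERT INTO internetClient VALUES
        (4, 'Joe Amine', 'beirut', '03776132'),
        (5, 'Mariam zoue', 'beirut', '03556133'),
        (6, 'No Details Yet', 'beirut', '00000000');
    INSERT INTO internetclientdetails VALUES
        (6, 4, '2018-01-01', '2018-01-30', 0, 2000),
        (7, 5, '2017-01-01', '2017-01-30', 0, 1000),
        (8, 4, '2018-03-01', '2018-03-30', 1, 50000),
        (9, 5, '2018-05-01', '2019-05-30', 1, 90000);
    """
)

query = """
SELECT IC.id, IC.full_name, ICD.date_sub, ICD.sub_price
FROM internetClient IC
LEFT JOIN (SELECT ICDi.*
           FROM internetclientdetails ICDi
           INNER JOIN (SELECT icid, MAX(date_sub) AS MaxDateSub
                       FROM internetclientdetails
                       GROUP BY icid) mICD
              ON ICDi.icid = mICD.icid
             AND ICDi.date_sub = mICD.MaxDateSub
          ) ICD
       ON IC.id = ICD.icid
ORDER BY IC.id
"""
rows = conn.execute(query).fetchall()
client_count = conn.execute("SELECT COUNT(*) FROM internetClient").fetchone()[0]
print(rows, client_count)
```

The one-row-per-client result holds only while a client never has two detail rows sharing the same max date_sub; if that can happen, a tie-breaker such as the highest incdid would be needed.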
I've got the following tables
menu_supp
menu_id supp_id supp_weight supp_load_follow
1 29 10.00 1
1 31 20.00 2
supps
supp_id user_id supp_name
29 1 Test supp 1
31 1 Test supp 2
supps_prop
supp_id supp_dry_w supp_price supp_date
29 95.00 125.00 2015-10-25
29 94.00 124.00 2015-11-06
29 94.00 128.00 2015-11-12
31 25.00 200.00 2015-06-25
Now I've got this query:
SELECT s.supp_id, s.supp_name, ms.supp_weight, sp.supp_price, sp.supp_dry_weight
FROM menu_supp ms
LEFT JOIN supps s ON ms.supp_id = s.supp_id
LEFT JOIN supps_prop sp ON ms.supp_id = sp.supp_id
WHERE menu_id = 1
GROUP BY s.supp_id
ORDER BY ms.supp_load_follow ASC
Which gives me this result:
supp_id supp_name supp_weight supp_price supp_dry_weight
29 Test supp 1 10.00 125.00 95.00
31 Test supp 2 20.00 200.00 25.00
For supp 29 it returns the oldest value, whereas it should take the value based on the current date. How can I achieve that?
If the supp_date is unique for a supp_id then you can use the following to get the value for the latest date:-
SELECT s.supp_id, s.supp_name, ms.supp_weight, sp.supp_price, sp.supp_dry_weight
FROM menu_supp ms
LEFT JOIN supps s
ON ms.supp_id = s.supp_id
LEFT JOIN
(
SELECT supp_id, MAX(supp_date) AS max_supp_date
FROM supps_prop
GROUP BY supp_id
) sub0
ON ms.supp_id = sub0.supp_id
LEFT OUTER JOIN supps_prop sp
ON sub0.supp_id = sp.supp_id
AND sub0.max_supp_date = sp.supp_date
WHERE menu_id = 1
ORDER BY ms.supp_load_follow ASC
This gets the max supp_date for each supp_id and joins that back to the supps_prop table to get the other fields from it.
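A quick way to try this against the sample rows is sketched below. It uses in-memory SQLite in place of MySQL (an assumption) and the column name supp_dry_w from the table listing rather than supp_dry_weight from the query text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE menu_supp (
        menu_id INTEGER, supp_id INTEGER, supp_weight REAL, supp_load_follow INTEGER
    );
    CREATE TABLE supps (supp_id INTEGER PRIMARY KEY, user_id INTEGER, supp_name TEXT);
    CREATE TABLE supps_prop (
        supp_id INTEGER, supp_dry_w REAL, supp_price REAL, supp_date TEXT
    );
    INSERT INTO menu_supp VALUES (1, 29, 10.00, 1), (1, 31, 20.00, 2);
    INSERT INTO supps VALUES (29, 1, 'Test supp 1'), (31, 1, 'Test supp 2');
    INSERT INTO supps_prop VALUES
        (29, 95.00, 125.00, '2015-10-25'),
        (29, 94.00, 124.00, '2015-11-06'),
        (29, 94.00, 128.00, '2015-11-12'),
        (31, 25.00, 200.00, '2015-06-25');
    """
)

query = """
SELECT s.supp_id, s.supp_name, ms.supp_weight, sp.supp_price, sp.supp_dry_w
FROM menu_supp ms
LEFT JOIN supps s
       ON ms.supp_id = s.supp_id
LEFT JOIN (SELECT supp_id, MAX(supp_date) AS max_supp_date
           FROM supps_prop
           GROUP BY supp_id) sub0
       ON ms.supp_id = sub0.supp_id
LEFT JOIN supps_prop sp
       ON sub0.supp_id = sp.supp_id
      AND sub0.max_supp_date = sp.supp_date
WHERE ms.menu_id = 1
ORDER BY ms.supp_load_follow ASC
"""
latest = conn.execute(query).fetchall()
print(latest)
```

Supp 29 now picks up the 2015-11-12 row instead of whichever row the GROUP BY happened to return.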
EDIT - Coping with either the highest date, or the lowest date after today is a bit more complicated.
I would suggest having 2 sub queries. One to get the highest date for each supp_id and one to get the lowest date on or after today for each supp_id. If the 2nd is found then use that, if not use the first. Not tested but:-
SELECT s.supp_id, s.supp_name, ms.supp_weight, COALESCE(sp1.supp_price, sp0.supp_price) AS supp_price, COALESCE(sp1.supp_dry_weight, sp0.supp_dry_weight) AS supp_dry_weight
FROM menu_supp ms
LEFT JOIN supps s
ON ms.supp_id = s.supp_id
LEFT JOIN
(
SELECT supp_id, MAX(supp_date) AS max_supp_date
FROM supps_prop
GROUP BY supp_id
) sub0
ON ms.supp_id = sub0.supp_id
LEFT OUTER JOIN supps_prop sp0
ON sub0.supp_id = sp0.supp_id
AND sub0.max_supp_date = sp0.supp_date
LEFT JOIN
(
SELECT supp_id, MIN(supp_date) AS min_supp_date
FROM supps_prop
WHERE supp_date >= CURDATE()
GROUP BY supp_id
) sub1
ON ms.supp_id = sub1.supp_id
LEFT OUTER JOIN supps_prop sp1
ON sub1.supp_id = sp1.supp_id
AND sub1.min_supp_date = sp1.supp_date
WHERE menu_id = 1
ORDER BY ms.supp_load_follow ASC
EDIT - An explanation of GROUP BY, etc:-
GROUP BY is used with aggregate functions; these are functions that give a value over a range of rows which share common field values. For example, SUM would be used to add up the values of a field over multiple rows, often for a shared value (e.g. the SUM of order values for a customer id). The field holding the shared value is the one given in the GROUP BY clause.
In standard SQL, all the non-aggregate fields returned by the SELECT statement must be mentioned in the GROUP BY clause. This makes logical sense: if they are not mentioned, the values for a group of rows could be different, and then there is the problem of which one to choose.
However there are times when this can be a bit too restrictive. For example if you are grouping by a customer id then the customer name is directly related to this customer id. MySQL does allow you to return non aggregate fields in the SELECT statement that are not specified in the GROUP BY clause, but if the values vary over the rows that are grouped together then which value is chosen is not specified; it could be from any of the rows, and indeed there is no reason that it might not change in the future or when using a different storage engine.
Sometimes GROUP BY is abused to return unique rows, in the way that DISTINCT is meant to be used.
In your original query
SELECT s.supp_id, s.supp_name, ms.supp_weight, sp.supp_price, sp.supp_dry_weight
FROM menu_supp ms
LEFT JOIN supps s ON ms.supp_id = s.supp_id
LEFT JOIN supps_prop sp ON ms.supp_id = sp.supp_id
WHERE menu_id = 1
GROUP BY s.supp_id
ORDER BY ms.supp_load_follow ASC
you are using GROUP BY s.supp_id. While s.supp_name is dependent on this, ms.supp_weight and sp.supp_price are not. There could be numerous values of each of these for any s.supp_id. MySQL has just used the value from one of the grouped rows for these and doesn't really care which row it chose to use.
Here is your query without the group by and using inner joins. It appears to me that no supp_id would be inserted into menu_supp that is not already defined in supps. I suppose it would be possible to have no entry in supps_prop but that looks doubtful also. If I am wrong, simply change it back.
SELECT s.supp_id, s.supp_name, ms.supp_weight, sp.supp_price,
sp.supp_dry_w, sp.supp_date
FROM menu_supp ms
JOIN supps s
ON s.supp_id = ms.supp_id
JOIN supps_prop sp
ON sp.supp_id = ms.supp_id
WHERE menu_id = 1
ORDER BY ms.supp_load_follow;
I've also added the date to make it easier to follow. The results are all four possible rows:
supp_id supp_name supp_weight supp_price supp_dry_w supp_date
------- --------- ----------- ---------- ---------- ---------
29 Test supp 1 10.00 125.00 95.00 2015-10-25
29 Test supp 1 10.00 124.00 94.00 2015-11-06
29 Test supp 1 10.00 128.00 94.00 2015-11-12
31 Test supp 2 20.00 200.00 25.00 2015-06-25
Obviously, you only want to join with the prop information contained in the row with the current or most recent date. That date is the largest value that is not in the future, which can be found like this:
SELECT s.supp_id, s.supp_name, ms.supp_weight, sp.supp_price,
sp.supp_dry_w, sp.supp_date
FROM menu_supp ms
JOIN supps s
ON s.supp_id = ms.supp_id
JOIN supps_prop sp
ON sp.supp_id = ms.supp_id
and sp.supp_date =(
select Max( supp_date )
from supps_prop
where supp_id = ms.supp_id
and supp_date <= NOW() )
WHERE menu_id = 1
ORDER BY ms.supp_load_follow;
Don't let the subquery concern you. Since the combination of supp_id and supp_date is the most obvious PK for the prop table, those fields should already be indexed, so the correlated lookup can be answered from the index and the query should be fast.
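The "largest date that is not in the future" rule is easier to test when the cut-off is passed in instead of taken from NOW(). The sketch below does that with a hypothetical helper, props_as_of, against in-memory SQLite (both are assumptions, not part of the answer); it only queries supps_prop to keep things short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supps_prop ("
    " supp_id INTEGER, supp_dry_w REAL, supp_price REAL, supp_date TEXT,"
    " PRIMARY KEY (supp_id, supp_date))"
)
conn.executemany(
    "INSERT INTO supps_prop VALUES (?, ?, ?, ?)",
    [
        (29, 95.00, 125.00, "2015-10-25"),
        (29, 94.00, 124.00, "2015-11-06"),
        (29, 94.00, 128.00, "2015-11-12"),
        (31, 25.00, 200.00, "2015-06-25"),
    ],
)


def props_as_of(conn, as_of):
    """Return (supp_id, supp_price, supp_date) for the row in force on as_of."""
    return conn.execute(
        """
        SELECT sp.supp_id, sp.supp_price, sp.supp_date
        FROM supps_prop sp
        WHERE sp.supp_date = (
            SELECT MAX(supp_date)
            FROM supps_prop
            WHERE supp_id = sp.supp_id
              AND supp_date <= ?
        )
        ORDER BY sp.supp_id
        """,
        (as_of,),
    ).fetchall()


mid_november = props_as_of(conn, "2015-11-10")
early_october = props_as_of(conn, "2015-10-01")
print(mid_november, early_october)
```

Note that a supp with no row on or before the cut-off simply drops out, which matches the inner-join behaviour of the query above; ISO-formatted date strings are what make the text comparison in SQLite safe here.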
I have the following three tables to look after support tickets in a small web application, but I need some help getting the data I need.
Table 1 (ticket):
user_ID site_ID support_ID timestamp priority title
12 25 3 2014-09-26 14:09:25 0 A Test Row
12 26 4 2014-09-27 09:41:18 0 A 2nd Test Row
Table 2 (ticket_reply):
reply_ID support_ID user_ID support_reply reply_timestamp
3 3 12 some really boring text 2014-09-26 14:09:25
4 3 25 some really boring reply 2014-09-26 15:35:18
5 4 12 some really boring text 2014-09-27 09:41:18
Table 3 (ticket_status):
ticket_status_ID support_ID status_ID status_timestamp
3 3 40 2014-09-26 14:09:25
4 3 41 2014-09-26 15:35:18
5 4 40 2014-09-27 09:41:18
The 1st table holds the key ticket information, the 2nd, any replies made to the corresponding ticket, and the third tracks the change in status (statuses are held in another table, but don't need anything from there).
What I need to do is get the number of tickets where the latest status is == 40, and if this is greater than 0, get the latest reply along with the data from the first table.
I've tried multiple ways of doing this, but I am stuck. Don't really want to paste them here as they will likely confuse people, and I doubt they are even close.
This one was rather tricky, however here is a working solution for you.
This query will get the most recent support_reply value for all tickets where the most recent status_ID is 40.
SELECT
ticket_status_ID,
support_ID,
status_ID,
status_timestamp,
reply_ID,
support_reply,
reply_timestamp,
`timestamp` ticket_timestamp,
`priority` ticket_priority,
title
FROM (
SELECT * FROM (
SELECT * FROM (
SELECT
ticket_status.ticket_status_ID,
ticket_status.support_ID,
ticket_status.status_ID,
ticket_status.status_timestamp,
ts1.reply_ID,
ts1.user_ID,
ts1.support_reply,
ts1.reply_timestamp
FROM
ticket_status
INNER JOIN (SELECT * FROM ticket_reply ORDER BY reply_timestamp DESC) ts1 ON ts1.support_ID = ticket_status.support_ID
GROUP BY support_ID, status_ID
ORDER BY status_timestamp DESC
) ts2
GROUP BY ts2.support_ID
) ts3
INNER JOIN (SELECT support_ID as `ticket_support_ID`, site_ID, `timestamp`, priority, title FROM ticket) ts4 ON ts4.ticket_support_ID = ts3.support_ID
WHERE ts3.status_ID = 40
) ts5
From the example given, it looks like all the timestamps coincide, so a query like this should be enough:
SELECT
ticket.*,
ticket_reply.*
FROM
(SELECT support_ID, MAX(status_timestamp) as max_timestamp
FROM ticket_status
GROUP BY support_ID) m
INNER JOIN ticket
ON m.support_ID=ticket.support_ID
AND m.max_timestamp=ticket.`timestamp`
INNER JOIN ticket_reply
ON m.support_ID=ticket_reply.support_ID
AND m.max_timestamp=ticket_reply.reply_timestamp
INNER JOIN ticket_status
ON m.support_ID=ticket_status.support_ID
AND m.max_timestamp=ticket_status.status_timestamp
WHERE
status_ID=40;
but depending on the logic of your application, it might happen that the last row in a table has a timestamp of 2014-09-27 09:41:18 and the last in another has for example 2014-09-27 09:41:19.
In this case, you should use a query like this one:
SELECT
ticket.*,
ticket_reply.*
FROM
(SELECT support_ID, MAX(status_timestamp) AS max_status_timestamp
FROM ticket_status
GROUP BY support_ID) m_status
INNER JOIN
(SELECT support_ID, MAX(reply_timestamp) AS max_reply_timestamp
FROM ticket_reply
GROUP BY support_ID) m_reply
ON m_status.support_ID=m_reply.support_ID
INNER JOIN
(SELECT support_ID, MAX(`timestamp`) AS max_ticket_timestamp
FROM ticket
GROUP BY support_ID) m_ticket
ON m_status.support_ID=m_ticket.support_ID
INNER JOIN ticket_status
ON ticket_status.support_ID=m_status.support_ID
AND ticket_status.status_timestamp=m_status.max_status_timestamp
INNER JOIN ticket_reply
ON ticket_reply.support_ID=m_reply.support_ID
AND ticket_reply.reply_timestamp=m_reply.max_reply_timestamp
INNER JOIN ticket
ON ticket.support_ID=m_ticket.support_ID
AND ticket.`timestamp`=m_ticket.max_ticket_timestamp
WHERE
ticket_status.status_ID=40;
You can try this one:
SELECT t.*, tr.support_reply, ts.status_timestamp
FROM ticket_status as ts
left join ticket_reply as tr on(ts.support_ID=tr.support_ID)
left join ticket as t on(t.support_ID=tr.support_ID)
where status_ID=40
order by status_timestamp desc
limit 1;
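For comparison, here is a compact sketch of the "latest status is 40, then fetch the latest reply" requirement, checked against the sample rows. It is not any of the answers verbatim: it resolves the latest status and the latest reply independently with two MAX() derived tables, and it runs on in-memory SQLite as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ticket (
        user_ID INTEGER, site_ID INTEGER, support_ID INTEGER PRIMARY KEY,
        "timestamp" TEXT, priority INTEGER, title TEXT
    );
    CREATE TABLE ticket_reply (
        reply_ID INTEGER PRIMARY KEY, support_ID INTEGER, user_ID INTEGER,
        support_reply TEXT, reply_timestamp TEXT
    );
    CREATE TABLE ticket_status (
        ticket_status_ID INTEGER PRIMARY KEY, support_ID INTEGER,
        status_ID INTEGER, status_timestamp TEXT
    );
    INSERT INTO ticket VALUES
        (12, 25, 3, '2014-09-26 14:09:25', 0, 'A Test Row'),
        (12, 26, 4, '2014-09-27 09:41:18', 0, 'A 2nd Test Row');
    INSERT INTO ticket_reply VALUES
        (3, 3, 12, 'some really boring text', '2014-09-26 14:09:25'),
        (4, 3, 25, 'some really boring reply', '2014-09-26 15:35:18'),
        (5, 4, 12, 'some really boring text', '2014-09-27 09:41:18');
    INSERT INTO ticket_status VALUES
        (3, 3, 40, '2014-09-26 14:09:25'),
        (4, 3, 41, '2014-09-26 15:35:18'),
        (5, 4, 40, '2014-09-27 09:41:18');
    """
)

query = """
SELECT t.support_ID, t.title, tr.reply_ID, tr.support_reply
FROM ticket t
JOIN ticket_status ts
  ON ts.support_ID = t.support_ID
JOIN (SELECT support_ID, MAX(status_timestamp) AS max_status_ts
      FROM ticket_status
      GROUP BY support_ID) ms
  ON ms.support_ID = ts.support_ID
 AND ms.max_status_ts = ts.status_timestamp
JOIN ticket_reply tr
  ON tr.support_ID = t.support_ID
JOIN (SELECT support_ID, MAX(reply_timestamp) AS max_reply_ts
      FROM ticket_reply
      GROUP BY support_ID) mr
  ON mr.support_ID = tr.support_ID
 AND mr.max_reply_ts = tr.reply_timestamp
WHERE ts.status_ID = 40
ORDER BY t.support_ID
"""
open_tickets = conn.execute(query).fetchall()
print(len(open_tickets), open_tickets)
```

Ticket 3 is excluded because its newest status is 41, even though it had status 40 earlier; the length of the result doubles as the requested count.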
I have a MySQL query like this:
SELECT SUM(bills.Amount) AS AmountExpense, SUM(assets.Amount) as AmountIncome
FROM bills, assets where bills.UserId = 11 and assets.UserId =11
Sample Bills table
id payee description UserId Amount
1 john advance 11 15.0
2 dave request 2 13.0
3 er request 11 12.0
Sample assets table
id payee description UserId Amount
1 john advance 11 40.2
2 dave request 2 13.0
3 ww request 11 14.00
I have a problem with AmountExpense: its rows are summed multiple times. AmountIncome comes out right. Any suggestions?
You have most likely more than one row per user on one or both of those tables. You'll need to join them after performing the aggregation. Also, please don't use old style non ANSI implicit joins:
SELECT AmountExpense, AmountIncome
FROM ( SELECT UserId,
SUM(Amount) AS AmountExpense
FROM bills
GROUP BY UserId) AS b
LEFT JOIN ( SELECT UserId,
SUM(Amount) AmountIncome
FROM assets
GROUP BY UserId) AS a
ON b.UserId = a.UserId
WHERE b.UserId = 11
If you have the possibility that users can be in either table, but not the other, then you want the equivalent of a full outer join. MySQL doesn't support that syntax, but it does support this:
select userid, sum(amountexpense) as amountexpense, sum(amountincome) as amountincome
from (select userid, amount as amountexpense, null as amountincome
from bills
union all
select userid, null, amount as amountincome
from assets
) ba
group by userid;
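The effect of the implicit cross join, and of the UNION ALL alternative, can be seen on the sample rows with the sketch below. It uses in-memory SQLite in place of MySQL (an assumption), and the assets row for user 7 is invented purely to show a user who appears in only one of the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bills (
        id INTEGER, payee TEXT, description TEXT, UserId INTEGER, Amount REAL
    );
    CREATE TABLE assets (
        id INTEGER, payee TEXT, description TEXT, UserId INTEGER, Amount REAL
    );
    INSERT INTO bills VALUES
        (1, 'john', 'advance', 11, 15.0),
        (2, 'dave', 'request', 2, 13.0),
        (3, 'er', 'request', 11, 12.0);
    -- The last assets row (user 7) is hypothetical: it has no matching bills.
    INSERT INTO assets VALUES
        (1, 'john', 'advance', 11, 40.2),
        (2, 'dave', 'request', 2, 13.0),
        (3, 'ww', 'request', 11, 14.00),
        (4, 'zz', 'request', 7, 5.0);
    """
)

# Original query: every bill row pairs with every asset row of user 11,
# so each amount is counted once per row on the other side.
naive = conn.execute(
    "SELECT SUM(bills.Amount), SUM(assets.Amount) "
    "FROM bills, assets WHERE bills.UserId = 11 AND assets.UserId = 11"
).fetchone()

# UNION ALL version: stack the rows first, then aggregate once per user.
totals = {
    user_id: (expense, income)
    for user_id, expense, income in conn.execute(
        """
        SELECT userid, SUM(amountexpense), SUM(amountincome)
        FROM (SELECT userid, amount AS amountexpense, NULL AS amountincome
              FROM bills
              UNION ALL
              SELECT userid, NULL, amount
              FROM assets) ba
        GROUP BY userid
        """
    )
}
print(naive, totals)
```

With two bills and two assets for user 11, the cross join doubles both sums, while the stacked version returns each total once and leaves the missing side as NULL for user 7.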