Get last 3 rows from SQL table without duplicates of a row - mysql

Let's say we have a table that looks like this:
+---------------+----------------+-------------------+
| ID | random_string | time |
+---------------+----------------+-------------------+
| 2 | K2K3KD9AJ |2022-07-21 20:41:15|
| 1 | SJQJ8JD0W |2022-07-17 23:46:13|
| 1 | JSDOAJD8 |2022-07-11 02:52:21|
| 3 | KPWJOFPSS |2022-07-11 02:51:57|
| 1 | DA8HWD8HHD |2022-07-11 02:51:49|
+---------------+----------------+-------------------+
I want to select the last 3 entries in the table, but they must all have distinct IDs.
Expected Result:
+---------------+----------------+-------------------+
| ID | random_string | time |
+---------------+----------------+-------------------+
| 2 | K2K3KD9AJ |2022-07-21 20:41:15|
| 1 | SJQJ8JD0W |2022-07-17 23:46:13|
| 3 | KPWJOFPSS |2022-07-11 02:51:57|
+---------------+----------------+-------------------+
I have already tried:
SELECT DISTINCT id FROM table ORDER BY time DESC LIMIT 3;
And:
SELECT MIN(id) as id FROM table GROUP BY time DESC LIMIT 3;

If you're not on MySQL 8, then I have two suggestions.
Using EXISTS:
SELECT m1.ID,
       m1.random_string,
       m1.time
FROM mytable m1
WHERE EXISTS
    (SELECT ID
     FROM mytable AS m2
     GROUP BY ID
     HAVING m1.ID = m2.ID
        AND m1.time = MAX(time)
    )
ORDER BY m1.time DESC
LIMIT 3
Using JOIN:
SELECT m1.ID,
       m1.random_string,
       m1.time
FROM mytable m1
JOIN
    (SELECT ID, MAX(time) AS mxtime
     FROM mytable
     GROUP BY ID) AS m2
  ON m1.ID = m2.ID
 AND m1.time = m2.mxtime
ORDER BY m1.time DESC
LIMIT 3
I haven't tested these on large data, so I don't know which performs better, but both should return the same result.
Of course, this assumes that no two rows share the exact same ID and time value. That seems very unlikely, but it is possible, and in that case every tied row would be returned.
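To check the JOIN variant against the sample data from the question, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for MySQL (an assumption made only so the example is self-contained), and it appends ORDER BY time DESC LIMIT 3 so that only the three most recent rows are kept.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL for this demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, random_string TEXT, time TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (2, "K2K3KD9AJ", "2022-07-21 20:41:15"),
        (1, "SJQJ8JD0W", "2022-07-17 23:46:13"),
        (1, "JSDOAJD8", "2022-07-11 02:52:21"),
        (3, "KPWJOFPSS", "2022-07-11 02:51:57"),
        (1, "DA8HWD8HHD", "2022-07-11 02:51:49"),
    ],
)

# Keep the latest row per ID, then take the three most recent of those rows.
latest_three = conn.execute(
    """
    SELECT m1.ID, m1.random_string, m1.time
    FROM mytable m1
    JOIN (SELECT ID, MAX(time) AS mxtime
          FROM mytable
          GROUP BY ID) AS m2
      ON m1.ID = m2.ID
     AND m1.time = m2.mxtime
    ORDER BY m1.time DESC
    LIMIT 3
    """
).fetchall()

for row in latest_three:
    print(row)
```

The rows come back newest first, one per ID, which is the shape of the expected result in the question.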

With MySQL 8, an easy solution is to assign a row number using a window function:
select Id, random_string, time
from (
    select *, Row_Number() over(partition by id order by time desc) rn
    from t
) t
where rn = 1
order by time desc
limit 3;

Related

Calculate date difference from previous row of each unique ID in MySQL

I am a SQL beginner and am learning the ropes of querying. I'm trying to find the date difference between purchases by the same customer. I have a dataset that looks like this:
ID | Purchase_Date
==================
1 | 08/10/2017
------------------
1 | 08/11/2017
------------------
1 | 08/17/2017
------------------
2 | 08/09/2017
------------------
3 | 08/08/2017
------------------
3 | 08/10/2017
I want to have a column that shows the difference in days for each unique customer purchase, so that the output will look like this:
ID | Purchase_Date | Difference
===============================
1 | 08/10/2017 | NULL
-------------------------------
1 | 08/11/2017 | 1
-------------------------------
1 | 08/17/2017 | 6
-------------------------------
2 | 08/09/2017 | NULL
-------------------------------
3 | 08/08/2017 | NULL
-------------------------------
3 | 08/10/2017 | 2
What would be the best way to go about this using a MySQL query?
Not so hard: use a subquery to find the previous purchase for each purchase by the same customer, and self-join to that record.
Select t.id, t.purchase_date, p.purchase_date as prev_purchase_date,
       DATEDIFF(t.purchase_date, p.purchase_date) Difference
From myTable t           -- t for This purchase record
left join myTable p      -- p for Previous purchase record
  on p.id = t.id
 and p.purchase_date =
     (Select Max(purchase_date)
      from myTable
      where id = t.id
        and purchase_date < t.purchase_date)
This is rather tricky in MySQL. Probably the best way to learn, if you are a beginner, is the correlated subquery method:
select t.*, datediff(purchase_date, prev_purchase_date) as diff
from (select t.*,
(select t2.purchase_date
from t t2
where t2.id = t.id and
t2.purchase_date < t.purchase_date
order by t2.purchase_date desc
limit 1
) as prev_purchase_date
from t
) t;
Performance should be okay if you have an index on (id, purchase_date).
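As a rough check of the correlated-subquery method on the question's data, here is a small Python sketch. The assumptions are mine: SQLite stands in for MySQL, dates are stored in ISO format (YYYY-MM-DD) rather than MM/DD/YYYY, the index name idx_t_id_date is hypothetical, and SQLite's julianday() replaces DATEDIFF(), which SQLite does not have.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; dates are ISO strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "2017-08-10"), (1, "2017-08-11"), (1, "2017-08-17"),
        (2, "2017-08-09"),
        (3, "2017-08-08"), (3, "2017-08-10"),
    ],
)
# Hypothetical index name; the (id, purchase_date) column order is what matters.
conn.execute("CREATE INDEX idx_t_id_date ON t (id, purchase_date)")

# For each row, look up the latest earlier purchase by the same id and
# subtract. julianday() is the SQLite substitute for MySQL's DATEDIFF().
diffs = conn.execute(
    """
    SELECT t.id,
           t.purchase_date,
           CAST(julianday(t.purchase_date) - julianday(
                    (SELECT t2.purchase_date
                     FROM t t2
                     WHERE t2.id = t.id
                       AND t2.purchase_date < t.purchase_date
                     ORDER BY t2.purchase_date DESC
                     LIMIT 1)
                ) AS INTEGER) AS diff
    FROM t
    ORDER BY t.id, t.purchase_date
    """
).fetchall()

for row in diffs:
    print(row)
```

The first purchase of each customer has no earlier row, so the subquery yields NULL and the difference comes back as None in Python.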
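It is also possible to avoid a dependent subquery. Note that this version measures the days since each id's first date rather than since the previous row, so it matches the expected output only for the first two purchases of each customer:
SELECT yt.id, yt.create_date,
       NULLIF(DATEDIFF(yt.create_date, tm.min_create_date), 0) AS difference
FROM your_table yt
JOIN
(
    SELECT id, MIN(create_date) min_create_date
    FROM your_table
    GROUP BY id
) tm ON tm.id = yt.id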

Translate MySQL query to BigQuery query

I am having trouble converting a MySQL query to a Google BigQuery query. This is my MySQL query:
SELECT id
FROM office_details
GROUP BY address
HAVING max(value)
ORDER BY id
This query runs perfectly in phpMyAdmin and with my PHP script. But when I convert it to BigQuery:
SELECT id
FROM Office_db.office_details
GROUP BY address
HAVING max(value)
ORDER BY id
it says column id is neither in the GROUP BY nor aggregated.
What I need is the id for each unique address where value is maximum, e.g.:
+-------------------------+
| id | address | value |
+-------------------------+
| 1 | a | 4 |
| 2 | a | 3 |
| 3 | b | 2 |
| 4 | b | 2 |
+-------------------------+
I need
+----+
| id |
+----+
| 1 |
| 3 |
+----+
#standardSQL
SELECT id FROM (
SELECT
id, address,
ROW_NUMBER() OVER(PARTITION BY address ORDER BY value DESC, id) AS flag
FROM office_details
)
WHERE flag = 1
Try this:
#standardSQL
SELECT ARRAY_AGG(id ORDER BY value DESC, id LIMIT 1)[OFFSET(0)] AS id
FROM office_details
GROUP BY address;
It's less prone to running out of memory than a solution using RANK will be (and may be faster), since it doesn't need to buffer all of the rows while computing ranks within a partition. As a working example:
#standardSQL
WITH office_details AS (
SELECT 1 AS id, 'a' AS address, 4 AS value UNION ALL
SELECT 2, 'a', 3 UNION ALL
SELECT 3, 'b', 2 UNION ALL
SELECT 4, 'b', 2
)
SELECT
address,
ARRAY_AGG(id ORDER BY value DESC, id LIMIT 1)[OFFSET(0)] AS id
FROM office_details
GROUP BY address
ORDER BY address;
This gives the result:
address | id
------------
a | 1
b | 3
A valid query might look as follows:
SELECT MIN(x.id) id
FROM office_details x
JOIN
   ( SELECT address
          , MAX(value) value
     FROM office_details
     GROUP
        BY address
   ) y
  ON y.address = x.address
 AND y.value = x.value
GROUP
   BY x.address
    , x.value

Get second highest values from a table

I have a table like this:
+----+---------+------------+
| id | conn_id | read_date |
+----+---------+------------+
| 1 | 1 | 2010-02-21 |
| 2 | 1 | 2011-02-21 |
| 3 | 2 | 2011-02-21 |
| 4 | 2 | 2013-02-21 |
| 5 | 2 | 2014-02-21 |
+----+---------+------------+
I want the second-highest read_date for each conn_id, i.e. I want to group by conn_id. Please help me figure this out.
Here's a solution for a particular conn_id (note that MySQL by default does not allow a space between max and its opening parenthesis):
select max(read_date) from my_table
where conn_id=1
and read_date<(
    select max(read_date) from my_table
    where conn_id=1
)
If you want to get it for all conn_id using group by, do this:
select t.conn_id, (select max(i.read_date) from my_table i
where i.conn_id=t.conn_id and i.read_date<max(t.read_date))
from my_table t group by conn_id;
The following answer should work in MSSQL (the derived table needs an alias, here x):
select id,conn_id,read_date from (
    select *,ROW_NUMBER() over(Partition by conn_id order by read_date desc) as RN
    from my_table
) x
where RN = 2
There is an interesting article on the use of rank functions in MySQL here: ROW_NUMBER() in MySQL
If your table is designed so that id and date increase together (i.e. a larger id always has a later date), you can group by id; otherwise do the following:
$sql_max = '(select conn_id, max(read_date) max_date from tab group by 1) as tab_max';
$sql_max2 = "(select tab.conn_id,max(tab.read_date) max_date2 from tab, $sql_max
where tab.conn_id = tab_max.conn_id and tab.read_date < tab_max.max_date
group by 1) as tab_max2";
$sql = "select tab.* from tab, $sql_max2
where tab.conn_id = tab_max2.conn_id and tab.read_date = tab_max2.max_date2";
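The nested approach above (find the maximum per conn_id, then the largest date below that maximum, then join back for the full row) can be tried out with a short Python sketch. SQLite is used in place of MySQL as an assumption, explicit JOIN syntax replaces the comma joins, and the aliases t and s are mine.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL for this demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, conn_id INTEGER, read_date TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [
        (1, 1, "2010-02-21"),
        (2, 1, "2011-02-21"),
        (3, 2, "2011-02-21"),
        (4, 2, "2013-02-21"),
        (5, 2, "2014-02-21"),
    ],
)

# Innermost: max date per conn_id. Middle: largest date strictly below that
# maximum. Outer: join back to fetch the full row holding that date.
second_latest = conn.execute(
    """
    SELECT t.id, t.conn_id, t.read_date
    FROM tab AS t
    JOIN (
        SELECT s.conn_id, MAX(s.read_date) AS max_date2
        FROM tab AS s
        JOIN (
            SELECT conn_id, MAX(read_date) AS max_date
            FROM tab
            GROUP BY conn_id
        ) AS tab_max ON s.conn_id = tab_max.conn_id
        WHERE s.read_date < tab_max.max_date
        GROUP BY s.conn_id
    ) AS tab_max2
      ON t.conn_id = tab_max2.conn_id
     AND t.read_date = tab_max2.max_date2
    ORDER BY t.conn_id
    """
).fetchall()

for row in second_latest:
    print(row)
```

A conn_id with only one reading has no date below its maximum, so it simply drops out of the result.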

Query with subquery not returning all results

I am running the following query:
SELECT id, name, keyt
FROM table
WHERE id = (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND() LIMIT 1)
Supposing table is like this:
| id | name | keyt |
+ ------------------------- +
| 1 | Hello | 21 |
| 3 | Katzet | 1 |
| 1 | Welcome | 1 |
| 2 | Two | 21 |
| 2 | Other | 1 |
It should return one of these pairs:
Hello | Welcome (id 1 in common)
Two | Other (id 2 in common)
So, the idea is:
Get one id, which has the keyt value set to 21
Then, get all the rows with this selected id (independently of all the other keyt values)
If I do as you suggested... I would get mixed id values, and all result rows must have the same id.
SELECT x.*
FROM my_table x
JOIN
( SELECT id
FROM my_table
WHERE keyt = 21
ORDER
BY RAND() LIMIT 1
) y
ON y.id = x.id;
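A quick way to see that the derived-table join picks one id and returns every row for it is the following Python sketch. It assumes SQLite in place of MySQL, so RANDOM() replaces MySQL's RAND().

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; RANDOM() replaces RAND().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT, keyt INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "Hello", 21),
        (3, "Katzet", 1),
        (1, "Welcome", 1),
        (2, "Two", 21),
        (2, "Other", 1),
    ],
)

# The derived table y is evaluated once, so exactly one id is picked at
# random; the join then returns all rows that carry that id.
rows = conn.execute(
    """
    SELECT x.id, x.name, x.keyt
    FROM my_table x
    JOIN (SELECT id
          FROM my_table
          WHERE keyt = 21
          ORDER BY RANDOM()
          LIMIT 1) y
      ON y.id = x.id
    """
).fetchall()

for row in rows:
    print(row)
```

Each run prints either the Hello/Welcome pair or the Two/Other pair, never a mix of ids.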
The subquery in this query
SELECT id, name, keyt
FROM table
WHERE id = (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND() LIMIT 1)
would return only one id, because of the LIMIT 1 at the end, so the outer query can match only the rows that share that single id.
If you want the rows for every id that has keyt = 21, remove the LIMIT. In that case you may rephrase your query as:
SELECT id, name, keyt
FROM table
WHERE id IN (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND())
I hope this is what you expected; your actual goal is not very clear from the question.
Your table has two rows with 21 in the keyt column, so without the LIMIT the subquery in the WHERE clause returns two id values, 1 and 2. So instead of the equality operator "=", use the IN operator in the WHERE clause.
SELECT id, name, keyt FROM table WHERE id IN (SELECT t2.id FROM table t2 WHERE t2.keyt=21 ORDER BY RAND())

Sort data before using GROUP BY?

I have read that grouping happens before ordering. Is there any way to order first, before grouping, without having to wrap my whole query in another query just to do this?
Let's say I have this data:
id | user_id | date_recorded
1 | 1 | 2011-11-07
2 | 1 | 2011-11-05
3 | 1 | 2011-11-06
4 | 2 | 2011-11-03
5 | 2 | 2011-11-06
Normally, I'd have to do this query in order to get what I want:
SELECT
*
FROM (
SELECT * FROM table ORDER BY date_recorded DESC
) t1
GROUP BY t1.user_id
But I'm wondering if there's a better solution.
Your question is somewhat unclear, but I suspect that what you really want is not GROUP BY aggregation at all, but rather ordering by user ID first, then by date descending:
SELECT
    id,
    user_id,
    date_recorded
FROM tbl
ORDER BY user_id ASC, date_recorded DESC
This would be the result. Note the reordering by date_recorded compared with your original example:
id | user_id | date_recorded
1 | 1 | 2011-11-07
3 | 1 | 2011-11-06
2 | 1 | 2011-11-05
5 | 2 | 2011-11-06
4 | 2 | 2011-11-03
Update
To retrieve the full latest record per user_id, a JOIN is needed. The subquery (mx) locates the latest date_recorded per user_id, and that result is joined to the full table to retrieve the remaining columns.
SELECT
mx.user_id,
mx.maxdate,
t.id
FROM (
SELECT
user_id,
MAX(date_recorded) AS maxdate
FROM tbl
GROUP BY user_id
) mx JOIN tbl t ON mx.user_id = t.user_id AND mx.maxdate = t.date_recorded
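Here is a minimal Python sketch of that join run against the sample rows from the question. It assumes SQLite in place of MySQL, joins on the maxdate alias produced by the subquery, and adds an ORDER BY only to make the output order predictable.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL for this demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, user_id INTEGER, date_recorded TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, 1, "2011-11-07"),
        (2, 1, "2011-11-05"),
        (3, 1, "2011-11-06"),
        (4, 2, "2011-11-03"),
        (5, 2, "2011-11-06"),
    ],
)

# mx holds the latest date per user; joining back on (user_id, maxdate)
# recovers the id of the row that carries that date.
latest_per_user = conn.execute(
    """
    SELECT mx.user_id, mx.maxdate, t.id
    FROM (SELECT user_id, MAX(date_recorded) AS maxdate
          FROM tbl
          GROUP BY user_id) mx
    JOIN tbl t
      ON mx.user_id = t.user_id
     AND mx.maxdate = t.date_recorded
    ORDER BY mx.user_id
    """
).fetchall()

for row in latest_per_user:
    print(row)
```

If a user had two rows on the same latest date, both would be returned, so an extra tie-breaker would be needed in that case.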
I am just using this technique:
"Apply the ordering before the GROUP BY by putting an ORDER BY clause inside GROUP_CONCAT"
SELECT SUBSTRING_INDEX(group_concat(cast(id as char)
ORDER BY date_recorded desc),',',1),
user_id,
SUBSTRING_INDEX(group_concat(cast(`date_recorded` as char)
ORDER BY `date_recorded` desc),',',1)
FROM data
GROUP BY user_id