I am a SQL beginner and am learning the ropes of querying. I'm trying to find the date difference between purchases by the same customer. I have a dataset that looks like this:
ID | Purchase_Date
==================
1 | 08/10/2017
------------------
1 | 08/11/2017
------------------
1 | 08/17/2017
------------------
2 | 08/09/2017
------------------
3 | 08/08/2017
------------------
3 | 08/10/2017
I want to have a column that shows the difference in days for each unique customer purchase, so that the output will look like this:
ID | Purchase_Date | Difference
===============================
1 | 08/10/2017 | NULL
-------------------------------
1 | 08/11/2017 | 1
-------------------------------
1 | 08/17/2017 | 6
-------------------------------
2 | 08/09/2017 | NULL
-------------------------------
3 | 08/08/2017 | NULL
-------------------------------
3 | 08/10/2017 | 2
What would be the best way to go about this using a MySQL query?
Not so hard: use a subquery to find the previous purchase for each of the customer's purchases, and self-join to that record.
Select t.ID, t.Purchase_Date, p.Purchase_Date Previous_Purchase_Date,
DATEDIFF(t.Purchase_Date, p.Purchase_Date) Difference
From myTable t -- t for This purchase record
left join myTable p -- p for Previous purchase record
on p.ID = t.ID
and p.Purchase_Date =
(Select Max(Purchase_Date)
from myTable
where ID = t.ID
and Purchase_Date <
t.Purchase_Date)
This is rather tricky in MySQL. Probably the best way to learn if you are a beginner is the correlated subquery method:
select t.*, datediff(purchase_date, prev_purchase_date) as diff
from (select t.*,
(select t2.purchase_date
from t t2
where t2.id = t.id and
t2.purchase_date < t.purchase_date
order by t2.purchase_date desc
limit 1
) as prev_purchase_date
from t
) t;
Performance should be okay if you have an index on (id, purchase_date).
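To try the correlated subquery method without a MySQL server, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite is only a stand-in: it has no DATEDIFF, so the sketch subtracts julianday() values instead, and the index name idx_t_id_date is made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; julianday() subtraction replaces DATEDIFF().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, purchase_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "2017-08-10"), (1, "2017-08-11"), (1, "2017-08-17"),
     (2, "2017-08-09"), (3, "2017-08-08"), (3, "2017-08-10")],
)
# The (id, purchase_date) index mentioned above; the name is hypothetical.
conn.execute("CREATE INDEX idx_t_id_date ON t (id, purchase_date)")

rows = conn.execute("""
    SELECT id, purchase_date,
           CAST(julianday(purchase_date) - julianday(prev_purchase_date)
                AS INTEGER) AS diff
    FROM (SELECT t.*,
                 -- latest earlier purchase by the same customer, or NULL
                 (SELECT t2.purchase_date
                  FROM t t2
                  WHERE t2.id = t.id
                    AND t2.purchase_date < t.purchase_date
                  ORDER BY t2.purchase_date DESC
                  LIMIT 1) AS prev_purchase_date
          FROM t) AS s
    ORDER BY id, purchase_date
""").fetchall()

for row in rows:
    print(row)
```

The first purchase of each customer has no earlier row, so its diff comes back as NULL (None in Python), matching the output asked for in the question.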
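It is possible to avoid a dependent subquery by joining to each customer's earliest date. Note that this measures the days since the customer's first purchase rather than since the previous one, and that DATEDIFF is needed because subtracting two DATE values with - does not yield a day count in MySQL:
SELECT yt.id, yt.create_date, NULLIF(DATEDIFF(yt.create_date, tm.min_create_date), 0) AS difference
FROM your_table yt
JOIN
(
SELECT id, MIN(create_date) min_create_date
FROM your_table
GROUP BY id
) tm ON tm.id = yt.id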
Let's say we have a table that looks like this:
+---------------+----------------+-------------------+
| ID | random_string | time |
+---------------+----------------+-------------------+
| 2 | K2K3KD9AJ |2022-07-21 20:41:15|
| 1 | SJQJ8JD0W |2022-07-17 23:46:13|
| 1 | JSDOAJD8 |2022-07-11 02:52:21|
| 3 | KPWJOFPSS |2022-07-11 02:51:57|
| 1 | DA8HWD8HHD |2022-07-11 02:51:49|
------------------------------------------------------
I want to select the last 3 entries in the table, but they must all have distinct IDs.
Expected Result:
+---------------+----------------+-------------------+
| ID | random_string | time |
+---------------+----------------+-------------------+
| 2 | K2K3KD9AJ |2022-07-21 20:41:15|
| 1 | SJQJ8JD0W |2022-07-17 23:46:13|
| 3 | KPWJOFPSS |2022-07-11 02:51:57|
------------------------------------------------------
I have already tried:
SELECT DISTINCT id FROM table ORDER BY time DESC LIMIT 3;
And:
SELECT MIN(id) as id FROM table GROUP BY time DESC LIMIT 3;
If you're not on MySQL 8, then I have two suggestions.
Using EXISTS:
SELECT m1.ID,
m1.random_string,
m1.time
FROM mytable m1
WHERE EXISTS
(SELECT ID
FROM mytable AS m2
GROUP BY ID
HAVING m1.ID=m2.ID
AND m1.time= MAX(time)
)
ORDER BY m1.time DESC
LIMIT 3
Using JOIN:
SELECT m1.ID,
m1.random_string,
m1.time
FROM mytable m1
JOIN
(SELECT ID, MAX(time) AS mxtime
FROM mytable
GROUP BY ID) AS m2
ON m1.ID=m2.ID
AND m1.time=m2.mxtime
ORDER BY m1.time DESC
LIMIT 3
I haven't tested these on large data, so I don't know which performs better, but both should return the same result.
This assumes no two rows share the same ID and time value, which seems unlikely but is still possible; in that case both queries return every tied row.
With MySQL 8, an easy solution is to assign a row number using a window function:
select Id, random_string, time
from (
select *, Row_Number() over(partition by id order by time desc) rn
from t
)t
where rn = 1
order by time desc
limit 3;
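As a quick check of the row-number approach, here is a small sketch that loads the sample rows from the question into an in-memory SQLite database (a stand-in for MySQL 8; it needs SQLite 3.25 or later for window functions) and runs the same query. The derived-table alias ranked is just a name chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, random_string TEXT, time TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (2, "K2K3KD9AJ", "2022-07-21 20:41:15"),
    (1, "SJQJ8JD0W", "2022-07-17 23:46:13"),
    (1, "JSDOAJD8", "2022-07-11 02:52:21"),
    (3, "KPWJOFPSS", "2022-07-11 02:51:57"),
    (1, "DA8HWD8HHD", "2022-07-11 02:51:49"),
])

latest = conn.execute("""
    SELECT id, random_string, time
    FROM (SELECT t.*,
                 -- rn = 1 marks the newest row within each id
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY time DESC) AS rn
          FROM t) AS ranked
    WHERE rn = 1
    ORDER BY time DESC
    LIMIT 3
""").fetchall()

for row in latest:
    print(row)
```

Because ROW_NUMBER() never assigns the same number twice within a partition, this returns at most one row per ID even when two rows share the same time.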
I have a MySQL DB and in it there's a table with activity logs of employees.
+-------------------------------------------------+
| log_id | employee_id | date_time | action_type |
+-------------------------------------------------+
| 1 | 1 | 2015/02/03 | action1 |
| 2 | 2 | 2015/02/01 | action1 |
| 3 | 2 | 2017/01/02 | action2 |
| 4 | 3 | 2016/02/12 | action1 |
| 5 | 1 | 2016/10/12 | action2 |
+-------------------------------------------------+
I need two queries. The first should return the last action of every employee; from this example table that would be rows 3, 4 and 5 with all columns. The second should return the latest action of a specified employee only.
Any ideas how to achieve this? I'm using Spring Data JPA, but raw SQL Query would be also great.
Thank you in advance.
Join the table to a subquery that finds the latest date_time of each employee:
SELECT x.*
FROM my_table x
JOIN
( SELECT employee_id
, MAX(date_time) date_time
FROM my_table
GROUP
BY employee_id
) y
ON y.employee_id = x.employee_id
AND y.date_time = x.date_time;
For your first query, assuming log_id increases with date_time, simply:
SELECT t1.*
FROM tableName t1
WHERE t1.log_id = (SELECT MAX(t2.log_id)
FROM tableName t2
WHERE t2.employee_id = t1.employee_id)
For the second one, where X is the ID of the specified employee:
SELECT t1.*
FROM tableName t1
WHERE t1.employee_id=X and t1.log_id = (SELECT MAX(t2.log_id)
FROM tableName t2
WHERE t2.employee_id = t1.employee_id);
You can get the expected output by doing a self join
select a.*
from demo a
left join demo b on a.employee_id = b.employee_id
and a.date_time < b.date_time
where b.employee_id is null
Note that it may return multiple rows for a single employee if several rows share the same date_time. To handle that, you might need a CASE statement and another attribute to decide which row should be picked.
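To illustrate the tie problem with the self join, here is a minimal sketch using an in-memory SQLite database as a stand-in for MySQL. It uses the sample rows from the question, then adds a hypothetical sixth row that ties with row 4. Breaking ties on log_id is only one possible choice and is an assumption of this sketch, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE demo (
    log_id INTEGER PRIMARY KEY, employee_id INTEGER,
    date_time TEXT, action_type TEXT)""")
conn.executemany("INSERT INTO demo VALUES (?, ?, ?, ?)", [
    (1, 1, "2015-02-03", "action1"),
    (2, 2, "2015-02-01", "action1"),
    (3, 2, "2017-01-02", "action2"),
    (4, 3, "2016-02-12", "action1"),
    (5, 1, "2016-10-12", "action2"),
])

# The self join from the answer: keep rows for which no later row exists.
PLAIN = """
    SELECT a.log_id
    FROM demo a
    LEFT JOIN demo b
      ON a.employee_id = b.employee_id
     AND a.date_time < b.date_time
    WHERE b.employee_id IS NULL
    ORDER BY a.log_id
"""

# Hypothetical tie-breaker: on equal date_time, the higher log_id wins.
TIE_BROKEN = """
    SELECT a.log_id
    FROM demo a
    LEFT JOIN demo b
      ON a.employee_id = b.employee_id
     AND (a.date_time < b.date_time
          OR (a.date_time = b.date_time AND a.log_id < b.log_id))
    WHERE b.employee_id IS NULL
    ORDER BY a.log_id
"""

def latest_log_ids(query):
    """Return the log_id values selected by the given query."""
    return [row[0] for row in conn.execute(query)]

before_tie = latest_log_ids(PLAIN)

# Add a second action for employee 3 on the same date as log_id 4.
conn.execute("INSERT INTO demo VALUES (6, 3, '2016-02-12', 'action2')")
with_tie_plain = latest_log_ids(PLAIN)
with_tie_broken = latest_log_ids(TIE_BROKEN)

print(before_tie, with_tie_plain, with_tie_broken)
```

With the tie in place, the plain query keeps both rows for employee 3, while the extended join condition keeps only the one with the higher log_id.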
I have a table of the following structure:
ID | COMPANY_ID | VERSION | TEXT
---------------------------------
1 | 1 | 1 | hello
2 | 1 | 2 | world
3 | 2 | 1 | foo
Is there a way to get only the most recent version of each company's records, i.e. a result set with the IDs 2 and 3?
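I'm sure there are better ways, but I tend to use this kind of query. Note that it relies on MySQL returning the first row of each group, which is not guaranteed, and that it is rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5):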
SELECT *
FROM
(SELECT * FROM test ORDER BY VERSION DESC) AS my_table
GROUP BY COMPANY_ID
Produces this result set:
ID | COMPANY_ID | VERSION | TEXT
---------------------------------
2 | 1 | 2 | world
3 | 2 | 1 | foo
Try this:
SELECT *
FROM (
SELECT company_id, MAX(version) maxVersion
FROM table
GROUP BY company_id ) as val
JOIN table t ON (val.company_id = t.company_id AND t.version = val.maxversion)
If your IDs are ordered (newer version iff higher id):
SELECT t.*, a.maxversion
FROM (
SELECT MAX(id) maxid, MAX(version) maxversion
FROM table
GROUP BY company_id
) a
INNER JOIN table t
ON a.maxid = t.id
However, if your IDs are not properly ordered, you need to use the following query:
SELECT t.*
FROM (
SELECT company_id, MAX(version) maxversion
FROM table
GROUP BY company_id
) v
INNER JOIN table t
ON v.company_id = t.company_id
AND v.maxversion = t.version
(assuming there's a UNIQUE constraint/index on (company_id, version))
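To see why the MAX(id) shortcut only works when IDs follow version order, here is a small sketch on an in-memory SQLite database (standing in for MySQL). The data is a deliberately reordered variant of the question's table, invented for this sketch: company 1's newest version now sits on the lower id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE test (
    id INTEGER PRIMARY KEY, company_id INTEGER,
    version INTEGER, "text" TEXT,
    UNIQUE (company_id, version))""")
# Hypothetical data: for company 1, id order does NOT follow version order.
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", [
    (1, 1, 2, "world"),
    (2, 1, 1, "hello"),
    (3, 2, 1, "foo"),
])

# Shortcut: pick the row with the highest id per company.
by_max_id = conn.execute("""
    SELECT t.id, t.version, a.maxversion
    FROM (SELECT MAX(id) AS maxid, MAX(version) AS maxversion
          FROM test
          GROUP BY company_id) a
    INNER JOIN test t ON a.maxid = t.id
    ORDER BY t.id
""").fetchall()

# General form: pick the row whose version equals the company's maximum.
by_max_version = conn.execute("""
    SELECT t.id, t.version
    FROM (SELECT company_id, MAX(version) AS maxversion
          FROM test
          GROUP BY company_id) v
    INNER JOIN test t
      ON v.company_id = t.company_id
     AND v.maxversion = t.version
    ORDER BY t.id
""").fetchall()

print(by_max_id)
print(by_max_version)
```

On this data the shortcut returns id 2 for company 1 even though that row holds version 1, while the join on (company_id, version) returns id 1, the row that actually carries the highest version.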
I have read that grouping happens before ordering. Is there any way to order first and group afterwards, without having to wrap my whole query in another query just to do this?
Let's say I have this data:
id | user_id | date_recorded
1 | 1 | 2011-11-07
2 | 1 | 2011-11-05
3 | 1 | 2011-11-06
4 | 2 | 2011-11-03
5 | 2 | 2011-11-06
Normally, I'd have to do this query in order to get what I want:
SELECT
*
FROM (
SELECT * FROM table ORDER BY date_recorded DESC
) t1
GROUP BY t1.user_id
But I'm wondering if there's a better solution.
Your question is somewhat unclear but I have a suspicion what you really want is not any GROUP aggregates at all, but rather ordering by date first, then user ID:
SELECT
id,
user_id,
date_recorded
FROM tbl
ORDER BY date_recorded DESC, user_id ASC
Here is the result. Note the reordering by date_recorded compared with your original example:
id | user_id | date_recorded
1 | 1 | 2011-11-07
3 | 1 | 2011-11-06
5 | 2 | 2011-11-06
2 | 1 | 2011-11-05
4 | 2 | 2011-11-03
Update
To retrieve the full latest record per user_id, a JOIN is needed. The subquery (mx) locates the latest date_recorded per user_id, and that result is joined to the full table to retrieve the remaining columns.
SELECT
mx.user_id,
mx.maxdate,
t.id
FROM (
SELECT
user_id,
MAX(date_recorded) AS maxdate
FROM tbl
GROUP BY user_id
) mx JOIN tbl t ON mx.user_id = t.user_id AND mx.maxdate = t.date_recorded
I use this technique: put the ORDER BY inside the GROUP_CONCAT call, so that the ordering is applied within each group, and take the first element of the concatenated list.
SELECT SUBSTRING_INDEX(group_concat(cast(id as char)
ORDER BY date_recorded desc),',',1),
user_id,
SUBSTRING_INDEX(group_concat(cast(`date_recorded` as char)
ORDER BY `date_recorded` desc),',',1)
FROM data
GROUP BY user_id
Let's say we have this query
SELECT * FROM table
And this result from it.
id | user_id
------------
1 | 1
------------
2 | 1
------------
3 | 2
------------
4 | 1
How could I get the count of how often a user_id appears as another field (without some major SQL query)?
id | user_id | count
--------------------
1 | 1 | 3
--------------------
2 | 1 | 3
--------------------
3 | 2 | 1
--------------------
4 | 1 | 3
We have this value currently in code, but we are implementing sorting to this table and I would like to be able to sort in the SQL query.
BTW if this is not possible without some major trick, we are just going to skip sorting on that field.
You'll just want to add a subquery on the end, I believe:
SELECT
t.id,
t.user_id,
(SELECT COUNT(*) FROM table WHERE user_id = t.user_id) AS `count`
FROM table t;
SELECT o.id, o.user_id, (
SELECT COUNT(id)
FROM table i
WHERE i.user_id = o.user_id
GROUP BY i.user_id
) AS `count`
FROM table o
I suspect this query is no performance monster, but it should work.
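Since the goal was to sort on that count in SQL, here is a minimal sketch of the correlated-subquery version with an ORDER BY on the computed column, run against an in-memory SQLite database as a stand-in for MySQL. The table name tbl and the alias user_count are placeholders chosen for this sketch (table itself is a reserved word and would need quoting).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 2), (4, 1)])

rows = conn.execute("""
    SELECT t.id,
           t.user_id,
           -- how many rows share this row's user_id
           (SELECT COUNT(*) FROM tbl WHERE user_id = t.user_id) AS user_count
    FROM tbl t
    ORDER BY user_count DESC, t.id
""").fetchall()

for row in rows:
    print(row)
```

Sorting by the alias keeps the ordering in the database, so the count no longer has to be computed in application code.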