Use most recent creation date in group by statement - mysql

I have an invoicing system and am trying to generate reports on hours spent. I'm saving every instance of a change to the order, so there are multiple entries for almost every item on every invoice. Due to this, I'm filtering out the old changes and am trying to only use the most recent.
Each instance sharing a project_id, phase_id, and the same weekstart are the same item on the invoice. I want to generate a report and only grab the most recent versions of those items.
Example table:
id project_id phase_id weekstart created
---------------------------------------------------------------
1 6 apple 2017-04-20 2017-04-23
2 6 apple 2017-04-20 2017-04-24
3 8 banana 2017-04-20 2017-04-23
4 9 pear 2017-04-20 2017-04-23
5 9 pear 2017-04-20 2017-04-25
I want to be able to run a query to get:
id project_id phase_id weekstart created
---------------------------------------------------------------
2 6 apple 2017-04-20 2017-04-24
3 8 banana 2017-04-20 2017-04-23
5 9 pear 2017-04-20 2017-04-25
Currently I'm using something like:
SELECT * from invoiceitems where employee_id = 10
group by project_id, phase_id, weekstart
But this doesn't account for the creation date.
Ordering the results doesn't have any effect on the GROUP BY statement. I've checked for similar posts, but the only two I found either order by the highest creation date overall or don't group the results by multiple columns.

Join to a subquery which finds the latest creation time for each item, that is, for each combination of project_id, phase_id, and weekstart. Note that we use GROUP BY here, but only in the subquery, to aggregate over those items.
SELECT t1.*
FROM invoiceitems t1
INNER JOIN
(
SELECT project_id, phase_id, weekstart, MAX(created) AS max_created
FROM invoiceitems
GROUP BY project_id, phase_id, weekstart
) t2
ON t1.project_id = t2.project_id AND
t1.phase_id = t2.phase_id AND
t1.weekstart = t2.weekstart AND
t1.created = t2.max_created
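As a rough check of the join-to-subquery idea, the sketch below loads the question's example rows into an in-memory SQLite database (standing in for MySQL, which is an assumption of this example) and groups on all three item columns. The sample table has no employee_id column, so that filter is left out here.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the query uses only standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoiceitems ("
    "id INTEGER PRIMARY KEY, project_id INTEGER, phase_id TEXT, "
    "weekstart TEXT, created TEXT)"
)
conn.executemany(
    "INSERT INTO invoiceitems VALUES (?, ?, ?, ?, ?)",
    [
        (1, 6, "apple", "2017-04-20", "2017-04-23"),
        (2, 6, "apple", "2017-04-20", "2017-04-24"),
        (3, 8, "banana", "2017-04-20", "2017-04-23"),
        (4, 9, "pear", "2017-04-20", "2017-04-23"),
        (5, 9, "pear", "2017-04-20", "2017-04-25"),
    ],
)

# The subquery finds the newest `created` per item; the join keeps
# only the rows that carry that newest value.
LATEST_PER_ITEM = """
SELECT t1.*
FROM invoiceitems t1
INNER JOIN (
    SELECT project_id, phase_id, weekstart, MAX(created) AS max_created
    FROM invoiceitems
    GROUP BY project_id, phase_id, weekstart
) t2
  ON  t1.project_id = t2.project_id
  AND t1.phase_id   = t2.phase_id
  AND t1.weekstart  = t2.weekstart
  AND t1.created    = t2.max_created
ORDER BY t1.id
"""

latest_ids = [row[0] for row in conn.execute(LATEST_PER_ITEM)]
print(latest_ids)  # [2, 3, 5]
```

If two versions of the same item ever share an identical created value, both rows would come back, so a timestamp with finer resolution than a date may be worth considering.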

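Tested and works perfectly, provided that the row with the highest id in each group is also the most recently created one:
SELECT MAX(`id`) as `id`, `project_id`, `phase_id`,
`weekstart`, MAX(`created`) as `created`
FROM `invoiceitems`
GROUP BY `project_id`, `phase_id`, `weekstart`
ORDER BY `project_id` ASC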

Related

calculate the maximum and minimum number of days between each person's travels

People travel in pairs. How can I find the maximum and minimum number of days between trips for every user?
People:
id user
-------------
1 Harry
2 George
3 Thomas
4 Jacob
5 Jack
6 Oliver
Travels:
id date user1 user2
------------------------------
1 2005-10-03 2 3
2 2005-10-04 1 4
3 2005-10-05 5 6
4 2005-10-06 1 3
5 2005-10-07 2 4
6 2005-10-08 3 5
7 2005-10-10 1 4
8 2005-10-11 5 2
9 2005-10-15 6 4
I tried to solve this problem in the following way, but I still do not understand how to solve this problem:
select People.id,People.user, count(*)
from People
INNER join
(SELECT MIN(TIMESTAMPDIFF(day, t1.date, t2.date)) as mintime,max(TIMESTAMPDIFF(day, t1.date, t2.date))
from Travels as t1
join Travels as t2 on t1.PERSON_1 = t2.PERSON_1
WHERE t1.date< t2.date
GROUP BY t1.PERSON_1) as t3
group by People.id
One idea is to use the position function to iterate over each user and then look at the dates to find the minimum and maximum, but I still don't understand how to do this.
Best is to do it in steps with subqueries, as below (comments are in the query):
select user, min(dateDiff), max(dateDiff)
from (
select
user,
-- get the diff between previous and current row date to get diff between trips
datediff(date, lag(date) over (partition by user order by date)) dateDiff
from (
-- full flat list of all users and trip dates
select date, user1 `user`
from Travels
union all
select date, user2
from Travels
) a
) a group by user
Note that I used window functions, which are not available in MySQL 5.x and below (they were added in MySQL 8.0).
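For servers without window functions, one possible alternative is a correlated subquery that looks up each traveller's previous trip date. The sketch below is only an illustration under assumptions: it runs on SQLite rather than MySQL, the `trips` view is a hypothetical helper, and `julianday` is SQLite's way of subtracting dates (MySQL would use DATEDIFF instead).

```python
import sqlite3

# Assumption: SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (id INTEGER PRIMARY KEY, user TEXT);
CREATE TABLE Travels (
    id INTEGER PRIMARY KEY, date TEXT, user1 INTEGER, user2 INTEGER
);
INSERT INTO People VALUES
    (1, 'Harry'), (2, 'George'), (3, 'Thomas'),
    (4, 'Jacob'), (5, 'Jack'), (6, 'Oliver');
INSERT INTO Travels VALUES
    (1, '2005-10-03', 2, 3), (2, '2005-10-04', 1, 4),
    (3, '2005-10-05', 5, 6), (4, '2005-10-06', 1, 3),
    (5, '2005-10-07', 2, 4), (6, '2005-10-08', 3, 5),
    (7, '2005-10-10', 1, 4), (8, '2005-10-11', 5, 2),
    (9, '2005-10-15', 6, 4);
-- Hypothetical helper view: one row per (traveller, trip date).
CREATE VIEW trips AS
    SELECT user1 AS user_id, date AS trip_date FROM Travels
    UNION ALL
    SELECT user2, date FROM Travels;
""")

# For every trip, find the same traveller's previous trip date and
# take the difference in days; the first trip has no gap (NULL).
GAPS = """
SELECT p.user, MIN(g.gap), MAX(g.gap)
FROM People p
JOIN (
    SELECT t.user_id,
           CAST(julianday(t.trip_date) - julianday(
               (SELECT MAX(prev.trip_date)
                FROM trips prev
                WHERE prev.user_id = t.user_id
                  AND prev.trip_date < t.trip_date)
           ) AS INTEGER) AS gap
    FROM trips t
) g ON g.user_id = p.id
WHERE g.gap IS NOT NULL
GROUP BY p.id, p.user
ORDER BY p.id
"""

gaps = conn.execute(GAPS).fetchall()
for name, shortest, longest in gaps:
    print(name, shortest, longest)
```

On the sample data this gives, for example, a shortest gap of 2 days and a longest of 4 days for Harry. A traveller with a single trip has no gap and is left out of the result.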

Trying to get latest status for related shipment but the results I receive are incorrect

I am currently working on a project while trying to learn MySQL and I would like to join three tables and get the latest status for each related shipment. Here are the tables I'm working with (with example data):
shipments
id  consignee    tracking_number  shipper  weight  import_no
--------------------------------------------------------------
1   JOHN BROWN   TBA99900000121   AMAZON   1       101
2   HELEN SMITH  TBA99900000190   AMAZON   1       102
3   JACK BLACK   TBA99900000123   AMAZON   1       103
4   JOE BROWM    TBA99900000812   AMAZON   1       104
5   JULIA KERR   TBA99900000904   AMAZON   1       105
statuses
id  name              slug
------------------------------------------
1   At Warehouse      at_warehouse
2   Ready For Pickup  ready_for_pickup
3   Delivered         delivered
shipment_status (pivot table)
id  shipment_id  status_id
----------------------------
1   1            1
2   2            1
3   3            1
4   4            1
5   5            1
6   1            2
7   2            2
8   3            2
9   4            2
10  5            2
All tables also have created_at and updated_at timestamp columns.
Example of the results I'm trying to achieve
slug              shipment_id  status_id
------------------------------------------
ready_for_pickup  1            2
ready_for_pickup  2            2
ready_for_pickup  3            2
ready_for_pickup  4            2
ready_for_pickup  5            2
Here's the query I wrote to try to achieve what I'm looking for, based on examples and research from the past couple of days. I find that there is sometimes a mismatch between a shipment and the latest status reported for it:
SELECT
statuses.slug AS slug,
MAX(shipments.id) AS shipment_id,
statuses.id AS status_id
FROM
`shipments`
INNER JOIN `shipment_status` ON `shipment_status`.`shipment_id` = `shipments`.`id`
INNER JOIN `statuses` ON `shipment_status`.`status_id` = `statuses`.`id`
GROUP BY
`shipment_id`
Because we need to reference other fields from the same record that the MAX aggregation identifies, you need to do it in two steps. There are other ways, but I find this syntax simpler:
SELECT
shipments.id AS id,
statuses.slug AS slug,
statuses.id AS status_id,
shipment_status.shipment_id as shipment_id
FROM
`shipments`
INNER JOIN `shipment_status` ON `shipment_status`.`shipment_id` = `shipments`.`id`
INNER JOIN `statuses` ON `shipment_status`.`status_id` = `statuses`.`id`
WHERE
shipment_status.id = (
SELECT MAX(shipment_status.id)
FROM `shipment_status`
WHERE shipment_status.shipment_id = shipments.id
)
This query assumes that the id field is an auto-incrementing column, so that MAX(shipment_status.id) identifies the most recent status row for the given shipment_id.
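To see the correlated-subquery approach against the question's sample rows, here is a small sketch. It uses an in-memory SQLite database in place of MySQL (an assumption of this example), trims the tables to the columns the query needs, and relies on the same assumption as above that a higher shipment_status.id means a later status.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; tables are trimmed to the
# columns the query actually touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shipments (id INTEGER PRIMARY KEY, consignee TEXT);
CREATE TABLE statuses (id INTEGER PRIMARY KEY, slug TEXT);
CREATE TABLE shipment_status (
    id INTEGER PRIMARY KEY, shipment_id INTEGER, status_id INTEGER
);
INSERT INTO shipments VALUES
    (1, 'JOHN BROWN'), (2, 'HELEN SMITH'), (3, 'JACK BLACK'),
    (4, 'JOE BROWM'), (5, 'JULIA KERR');
INSERT INTO statuses VALUES
    (1, 'at_warehouse'), (2, 'ready_for_pickup'), (3, 'delivered');
INSERT INTO shipment_status VALUES
    (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 4, 1), (5, 5, 1),
    (6, 1, 2), (7, 2, 2), (8, 3, 2), (9, 4, 2), (10, 5, 2);
""")

# Keep only the pivot row with the highest id for each shipment.
LATEST_STATUS = """
SELECT statuses.slug, shipments.id AS shipment_id, statuses.id AS status_id
FROM shipments
INNER JOIN shipment_status ON shipment_status.shipment_id = shipments.id
INNER JOIN statuses ON shipment_status.status_id = statuses.id
WHERE shipment_status.id = (
    SELECT MAX(ss.id)
    FROM shipment_status ss
    WHERE ss.shipment_id = shipments.id
)
ORDER BY shipments.id
"""

latest = conn.execute(LATEST_STATUS).fetchall()
for row in latest:
    print(row)
```

Each of the five shipments comes back once, paired with ready_for_pickup, which matches the result the question asks for.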
You can use window functions:
SELECT s.id AS shipment_id, st.slug, st.id AS status_id
FROM shipments s JOIN
(SELECT ss.*,
ROW_NUMBER() OVER (PARTITION BY ss.shipment_id ORDER BY ss.id DESC) as seqnum
FROM shipment_status ss
) ss
ON ss.shipment_id = s.id JOIN
statuses st
ON ss.status_id = st.id
WHERE ss.seqnum = 1;
Also note the use of table aliases so the query is easier to write and to read.

MySQL status_id changed over time select just the latest based on the needed status

I know this sounds a little bit strange but I don't really know how to explain this better without an example.
I have the following table
ID contract_id status_id created
1 1 1 2015-10-14
2 1 2 2015-10-15
3 1 1 2016-02-02
4 1 4 2017-03-01
If the query is something like
SELECT * FROM table WHERE status_id = 1 AND created BETWEEN '2015-10-10' AND '2017-03-05'
the item with contract_id = 1 should not display, because the latest status in that date interval is 4.
But if the query is something like this
SELECT * FROM table WHERE status_id = 1 AND created BETWEEN '2015-10-10' AND '2017-02-28'
the item with contract_id = 1 should show up, because the latest status_id is 1.
Basically what I need is something like this: get me the latest item if its status_id = 1 at the end date.
I know this is quite simple, but I am running around in circles right now. I tried abs(datediff(end, start)), selecting based on IF, and a select within a select, but I am not getting the result I am looking for.
Thank you very much for your help.
One of the approaches would be to use an INNER JOIN of the latest date a given contract_id had the required status, and test if the date is earlier than the border date:
EDIT: After further reading the comments, most probably this one will do exactly what you need
SELECT t.*
FROM statuses AS t
INNER JOIN (
SELECT max(id) AS lastid, max(created) AS lastdate
FROM statuses
WHERE status_id = 1 AND created < '2016-01-01'
GROUP BY contract_id
) AS latest ON t.ID = latest.lastid
Note that it will work only if the rows were inserted in chronological order, in other words that for every contract_id: ID' < ID'' ≡ created' < created''
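If the requirement is read literally as "show a contract only when its latest status inside the date range is the requested one", a possible variation is to pick the latest row per contract first and test the status afterwards. The sketch below is an illustration under assumptions: SQLite stands in for MySQL, and the table name `contract_status` and the helper `contracts_with_status` are hypothetical. It joins on MAX(created) rather than MAX(id), so it does not depend on ids being chronological.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; `contract_status` is a
# hypothetical name for the table shown in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract_status (
    ID INTEGER PRIMARY KEY, contract_id INTEGER,
    status_id INTEGER, created TEXT
);
INSERT INTO contract_status VALUES
    (1, 1, 1, '2015-10-14'),
    (2, 1, 2, '2015-10-15'),
    (3, 1, 1, '2016-02-02'),
    (4, 1, 4, '2017-03-01');
""")

# Step 1 (subquery): newest `created` per contract inside the range.
# Step 2 (outer WHERE): keep that newest row only if its status matches.
QUERY = """
SELECT t.*
FROM contract_status AS t
INNER JOIN (
    SELECT contract_id, MAX(created) AS last_created
    FROM contract_status
    WHERE created BETWEEN :start_date AND :end_date
    GROUP BY contract_id
) AS latest
  ON  t.contract_id = latest.contract_id
  AND t.created     = latest.last_created
WHERE t.status_id = :status_id
"""


def contracts_with_status(status_id, start_date, end_date):
    """Return rows whose latest status in the range equals status_id."""
    params = {
        "status_id": status_id,
        "start_date": start_date,
        "end_date": end_date,
    }
    return conn.execute(QUERY, params).fetchall()


print(contracts_with_status(1, "2015-10-10", "2017-03-05"))  # []
print(contracts_with_status(1, "2015-10-10", "2017-02-28"))
# [(3, 1, 1, '2016-02-02')]
```

With the wider range the latest row has status 4, so nothing is returned; with the narrower range the latest row is the status 1 entry from 2016-02-02. Two rows for one contract on the same created date would both be treated as latest, which is a limitation of this sketch.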
Oracle :
SELECT * FROM (SELECT * FROM table WHERE status_id = 1 AND created BETWEEN DATE '2015-10-10' AND DATE '2017-03-05' ORDER BY created DESC) WHERE ROWNUM <= 1
MySQL :
SELECT * FROM table WHERE status_id = 1 AND created BETWEEN '2015-10-10' AND '2017-03-05' ORDER BY created DESC LIMIT 1
SQL Server :
SELECT TOP 1 * FROM table WHERE status_id = 1 AND created BETWEEN '2015-10-10' AND '2017-03-05' ORDER BY created DESC
I understood your question to mean that you want the item with status_id = 1 that has the most recent created date.

MySQL - add column to table and insert "tag" if order is from new customer

I have simple table:
Order_ID Client_ID Date Order_Status
1 1 01/01/2015 3
2 2 05/01/2015 3
3 1 06/01/2015 3
4 2 10/01/2015 3
5 1 12/01/2015 4
6 1 05/02/2015 3
I want to identify orders from new customers, that is, orders placed in the same month in which that customer made their first order with Order_Status = 3.
So the output table should look like this:
Order_ID Client_ID Date Order_Status Order_from_new_customer
1 1 01/01/2015 3 yes
2 2 05/01/2015 3 yes
3 1 06/01/2015 3 yes
4 2 10/01/2015 3 yes
5 1 12/01/2015 4 NULL
6 1 05/02/2015 3 no
I wasn't able to successfully figure out the query. Thanks a lot for any help.
Join with a subquery that gets the date of the first order by each customer.
SELECT o.*, IF(MONTH(o.date) = MONTH(f.date) AND YEAR(o.date) = YEAR(f.date),
'yes', 'no') AS order_from_new_customer
FROM orders AS o
JOIN (SELECT Client_ID, MIN(date) AS date
FROM orders
WHERE Order_Status = 3
GROUP BY Client_ID) AS f
ON o.Client_ID = f.Client_ID
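The expected output in the question shows NULL for orders whose Order_Status is not 3, so one possible refinement is a CASE expression that checks the status before comparing months. The sketch below is only an illustration: it runs on SQLite instead of MySQL, stores the question's dd/mm/yyyy dates as ISO strings, and compares year and month with SQLite's `strftime` (in MySQL the YEAR() and MONTH() comparison from the answer would take its place).

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; dates are stored as ISO
# strings (the question shows them as dd/mm/yyyy).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    Order_ID INTEGER PRIMARY KEY, Client_ID INTEGER,
    Date TEXT, Order_Status INTEGER
);
INSERT INTO orders VALUES
    (1, 1, '2015-01-01', 3),
    (2, 2, '2015-01-05', 3),
    (3, 1, '2015-01-06', 3),
    (4, 2, '2015-01-10', 3),
    (5, 1, '2015-01-12', 4),
    (6, 1, '2015-02-05', 3);
""")

# f holds each client's first status-3 order date; the CASE yields
# NULL for other statuses, else compares the year-month of both dates.
QUERY = """
SELECT o.Order_ID,
       CASE
           WHEN o.Order_Status <> 3 THEN NULL
           WHEN strftime('%Y-%m', o.Date) = strftime('%Y-%m', f.first_date)
               THEN 'yes'
           ELSE 'no'
       END AS Order_from_new_customer
FROM orders AS o
JOIN (
    SELECT Client_ID, MIN(Date) AS first_date
    FROM orders
    WHERE Order_Status = 3
    GROUP BY Client_ID
) AS f ON o.Client_ID = f.Client_ID
ORDER BY o.Order_ID
"""

flags = conn.execute(QUERY).fetchall()
print(flags)
# [(1, 'yes'), (2, 'yes'), (3, 'yes'), (4, 'yes'), (5, None), (6, 'no')]
```

Because the join is an inner join, a client with no status-3 order at all would drop out of the result; a LEFT JOIN would keep such rows.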
Use a CASE statement along with a SELF JOIN like below
select t1.*,
case when t1.Order_Status = 3 and MONTH(t1.`date`) = 1 then 'yes'
when t1.Order_Status = 3 and MONTH(t1.`date`) <> 1 then 'no'
else null end as Order_from_new_customer
from order_table t1 join order_table t2
on t1.Order_ID < t2.Order_ID
and t1.Client_ID = t2.Client_ID;
If your order table gets big, the solutions from Rahul and Barmar will tend to get slow.
I would hope your shop will get many orders and you will run into performance trouble ;-). So I would suggest marking the very first order of a new customer with a tinyint column, and when you have the comfort of a tinyint, you could code it like:
0 : unknown
1 : very first order
2 : order in first month
3 : order in "grown-up" mode.
The very first order you could probably mark easily; everyone loves a bright new customer enough to store this event somehow during first ordering. The other orders you can identify in a background job / cronjob by their "0" for unknown, or you mark your old customers and store the "3" on their orders.
The result-set can be achieved without any table-join or subquery:
select
if(Order_Status<>3,null,if((@first_date:=if(@prev_client_id!=Client_ID,month(date),@first_date))=month(date),"yes","no")) as Order_from_new_customer
,Order_ID,Client_ID,date,Order_Status,@prev_client_id:=client_id
from
t1,
(select @prev_client_id:="",@first_date:="")t
order by Client_ID ,date
One extra column added for computation and order by clause is used.
Verify result at http://sqlfiddle.com/#!9/83c29f/24

Ignore Group if LIMIT is not reached in MySQL

I am working on a rather tricky SQL query for my level of knowledge. I have searched and searched for an answer but haven't come across anything. Hopefully someone can shed some light on this.
How can you stop SQL from outputting group of rows if the limit set is not reached?
For example -
Data
Fruits Ordered Date
Orange 4 2015-05-01
Orange 2 2015-05-01
Orange 20 2015-05-01
Apple 30 2015-05-02
Apple 40 2015-05-02
Apple 24 2015-05-02
Apple 19 2015-05-02
Apple 22 2015-05-02
From the data I would like to select and group by Date, but only have a LIMIT of 5.
If there aren't five rows in that group, I want SQL to ignore that group.
So if I did a SUM of all ordered values for each Date group, and SQL ignored the group that didn't consist of 5 values, the desired results would look like the following:
Desired Result
Fruits SUM(Ordered) Date
Apple 135 2015-05-02
Hope this makes sense, please ask any questions if required!
You can use the HAVING clause to filter out the groups you don't need, keeping only the groups that contain more than 4 rows:
SELECT Fruits, SUM(Ordered), Date
FROM table
GROUP BY Date
HAVING COUNT(Date) > 4
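A quick way to try the HAVING filter on the question's rows is sketched below. It makes some assumptions: SQLite stands in for MySQL, `fruit_orders` is a hypothetical table name, and Fruits is added to the GROUP BY so the query also passes under strict grouping rules.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; `fruit_orders` is a
# hypothetical name for the table in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit_orders (Fruits TEXT, Ordered INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO fruit_orders VALUES (?, ?, ?)",
    [
        ("Orange", 4, "2015-05-01"),
        ("Orange", 2, "2015-05-01"),
        ("Orange", 20, "2015-05-01"),
        ("Apple", 30, "2015-05-02"),
        ("Apple", 40, "2015-05-02"),
        ("Apple", 24, "2015-05-02"),
        ("Apple", 19, "2015-05-02"),
        ("Apple", 22, "2015-05-02"),
    ],
)

# HAVING runs after grouping, so groups with fewer than five rows
# are dropped before the result is returned.
QUERY = """
SELECT Fruits, SUM(Ordered), Date
FROM fruit_orders
GROUP BY Fruits, Date
HAVING COUNT(*) >= 5
"""

totals = conn.execute(QUERY).fetchall()
print(totals)  # [('Apple', 135, '2015-05-02')]
```

The Orange group has only three rows and is skipped, while the five Apple rows add up to 135.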
select Fruits,sum(Ordered),Date from Table
where Fruits in (select Fruits from Table
group by Fruits having count(*) >= 5)
group by Fruits, Date
I think you want something like this:
SELECT
Fruits, SUM(Ordered), Date
FROM (
SELECT
*,
CASE WHEN (SELECT COUNT(*) FROM t ti WHERE ti.Fruits = t.Fruits) < 5 THEN Ordered END As gID
FROM t) dt
GROUP BY
Fruits, gID
Actually you need to use your PK column instead of Ordered in the CASE like this:
CASE WHEN (SELECT COUNT(*) FROM t ti WHERE ti.Fruits = t.Fruits) < 5 THEN `PK` END As gID