Selecting latest rows in subgroups - mysql

I have the following table created by a join and some conditionals:
product_id date
11111 2012-06-05
11111 2012-05-01
22222 2011-05-01
22222 2011-07-02
33333 2011-01-01
I am trying to get the rows such that I have a result set with the latest date per product:
GOAL
product_id date
11111 2012-06-05
22222 2011-07-02
33333 2011-01-01
I could extract the data as is and do a manual sort, but I'd rather not. I cannot seem to find a way to do a SELECT MAX() without returning only a single row, and I'd rather not run a query for each product id.
The table is generated by this query:
SELECT item_id, sales_price, item, description, transaction_date
FROM db.invoice_line AS t1 INNER JOIN db.invoices AS t2
ON t1.invoice_id = t2.id_invoices WHERE item IS NOT NULL
AND item_id != '800001E9-1325703142' AND item_id != '800002C3-1326830147'
AND invoice_id IN
(SELECT id_invoices FROM db.invoices
WHERE customer_id = '[variable customer id]'
AND transaction_date >= DATE_SUB(NOW(), INTERVAL 360 DAY));
I use a join to 'add' the date column. After that, I disregard useless items, and select from invoices from a particular customer from a year ago to date.
Thanks for any guidance.
Dane

Looks like a group by would fit the bill:
select product_id
, max(date)
from YourTable
group by
product_id
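To check the GROUP BY answer against the sample rows from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module instead of MySQL, an assumption made only so the example runs anywhere; the query itself is plain SQL. The alias latest_date is illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (product_id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [
        (11111, "2012-06-05"),
        (11111, "2012-05-01"),
        (22222, "2011-05-01"),
        (22222, "2011-07-02"),
        (33333, "2011-01-01"),
    ],
)

# One row per product, carrying the most recent date in its group.
latest = conn.execute(
    """
    SELECT product_id, MAX(date) AS latest_date
    FROM YourTable
    GROUP BY product_id
    ORDER BY product_id
    """
).fetchall()

for product_id, latest_date in latest:
    print(product_id, latest_date)
```

ISO-formatted dates stored as text sort correctly as strings, which is why MAX works here without a date type.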

Related

Cumulative Counts by Date Issue

I have a table that shows, for each date, a list of customer ids, i.e. the customers who were active on that particular day. The same id can therefore appear under more than one date.
bdate customer_id
2012-01-12 111
2012-01-13 222
2012-01-13 333
2012-01-14 111
2012-01-14 333
2012-01-14 666
2012-01-14 777
I am looking to write a query which calculates the total number of unique ids between two dates - the starting date is the row date and the ending date is a particular date in the future.
My query looks like this:
select
bdate,
count(distinct customer_id) as cts
from users
where bdate between bdate and current_date
group by 1
order by 1
But this produces a count of unique users for each date like this:
bdate customer_id
2012-01-12 1
2012-01-13 2
2012-01-14 4
My desired result is (for a count of users between the starting row date and 2012-01-14):
bdate customer_id
2012-01-12 5 - includes (111,222,333,666,777)
2012-01-13 5 - includes (222,333,111,666,777)
2012-01-14 4 - includes (111,333,666,777)
Like @Strawberry said, you can make a join like this:
select
t1.bdate,
count(distinct t2.customer_id) as cts
from users t1
join users t2 on t2.bdate >= t1.bdate
where t1.bdate between t1.bdate and current_date
group by t1.bdate
order by t1.bdate
Joining to t2 pulls in every row dated on or after each t1 day, so counting the distinct t2.customer_id values gives the total for that day onward.
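The self-join can be checked against the sample data with a small sketch. It runs on Python's built-in sqlite3 module rather than MySQL (an assumption for portability), and it replaces current_date with an explicit end_date parameter, a hypothetical name, so the result matches the 2012-01-14 cut-off used in the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (bdate TEXT, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        ("2012-01-12", 111),
        ("2012-01-13", 222),
        ("2012-01-13", 333),
        ("2012-01-14", 111),
        ("2012-01-14", 333),
        ("2012-01-14", 666),
        ("2012-01-14", 777),
    ],
)

end_date = "2012-01-14"  # hypothetical fixed upper bound instead of current_date

# For each start date, count distinct customers seen from that date to end_date.
counts = conn.execute(
    """
    SELECT t1.bdate, COUNT(DISTINCT t2.customer_id) AS cts
    FROM users t1
    JOIN users t2
      ON t2.bdate >= t1.bdate AND t2.bdate <= ?
    WHERE t1.bdate <= ?
    GROUP BY t1.bdate
    ORDER BY t1.bdate
    """,
    (end_date, end_date),
).fetchall()

for bdate, cts in counts:
    print(bdate, cts)
```

The join multiplies rows, so on a large table it may be worth first reducing users to distinct (bdate, customer_id) pairs.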

SQL Duplicate Count if Customer has Spent on Specific Date and Returned?

I've spent a fair amount of time trying to get my head round how to do this, and I can't. I'm making it far too complicated for myself: I understand the code, just not how it all flows together.
If I have table "Customers" with columns for "customer_id", "store_id", "visited", and "date" - I want to identify Customers who visited (visited = yes) a specific store (store_id="NEA") on a set date "2015-05-14" - and then have returned to the same store since then, and count the number of customers who have returned - can anyone help me out?
I know I would need to select customer_id for those who have a store_id of "NEA", a date of "2015-05-14" and a "yes" for visited, but how do I then identify those who returned, and count them - so how many customers visited on that day and then returned?
So for example:
customer_id | store_id | date | visited
123 NEA 2015-05-14 yes
456 NEA 2015-05-14 yes
789 ABC 2015-05-16 no
123 NEA 2015-05-14 yes
654 TDF 2015-05-12 yes
987 PEH 2015-05-14 yes
123 NEA 2015-05-14 no
456 NEA 2015-05-17 yes
987 LEA 2015-05-14 yes
159 NEA 2015-05-16 yes
123 NEA 2015-05-19 yes
or something like this:
SELECT count(*) AS cnt,t.*
FROM yourTable AS t
WHERE
`date` = '2015-05-14'
AND
store_id = 'NEA'
AND
visited = 'YES'
GROUP BY customer_id
HAVING cnt >1;
SELECT DISTINCT customer_id, date
FROM Customers
WHERE visited = 'yes'
GROUP BY customer_id, store_id, date
HAVING COUNT(*) >= 2
The above query yields a list of duplicate customers and the dates on which they visited the same store twice or more. If you want a count of duplicate customers by date, you can wrap it in a subquery:
SELECT t.date, COUNT(*) AS duplicateCount
FROM
(
SELECT DISTINCT customer_id, date
FROM Customers
WHERE visited = 'yes'
GROUP BY customer_id, store_id, date
HAVING COUNT(*) >= 2
) t
GROUP BY t.date
Update:
Based on your feedback, the following query might be what you had in mind:
SELECT DISTINCT customer_id
FROM Customers
WHERE visited = 'yes'
GROUP BY customer_id, store_id
HAVING SUM(CASE WHEN date = '2015-05-14' THEN 1 ELSE 0 END) >= 1 AND
SUM(CASE WHEN date > '2015-05-14' THEN 1 ELSE 0 END) >= 1
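As a rough check of the conditional HAVING approach, the sketch below loads the question's sample rows and counts customers who visited store NEA on 2015-05-14 and came back later. It runs on Python's sqlite3 module rather than MySQL, and it adds a store_id filter that the query above leaves out; both are assumptions on my part, based on the question's wording.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers "
    "(customer_id INTEGER, store_id TEXT, date TEXT, visited TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?, ?)",
    [
        (123, "NEA", "2015-05-14", "yes"),
        (456, "NEA", "2015-05-14", "yes"),
        (789, "ABC", "2015-05-16", "no"),
        (123, "NEA", "2015-05-14", "yes"),
        (654, "TDF", "2015-05-12", "yes"),
        (987, "PEH", "2015-05-14", "yes"),
        (123, "NEA", "2015-05-14", "no"),
        (456, "NEA", "2015-05-17", "yes"),
        (987, "LEA", "2015-05-14", "yes"),
        (159, "NEA", "2015-05-16", "yes"),
        (123, "NEA", "2015-05-19", "yes"),
    ],
)

store, day = "NEA", "2015-05-14"

# Keep customers with at least one visit on the given day
# and at least one later visit to the same store.
returned = conn.execute(
    """
    SELECT customer_id
    FROM Customers
    WHERE visited = 'yes' AND store_id = ?
    GROUP BY customer_id
    HAVING SUM(CASE WHEN date = ? THEN 1 ELSE 0 END) >= 1
       AND SUM(CASE WHEN date > ? THEN 1 ELSE 0 END) >= 1
    ORDER BY customer_id
    """,
    (store, day, day),
).fetchall()

returned_ids = [row[0] for row in returned]
returned_count = len(returned_ids)
print(returned_ids, returned_count)
```

Customer 159 is excluded because the only visit is after the target day, with none on it.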

How to consolidate several transaction tables into a query with each table representing a status?

I have transaction tables that are related to a main table Order. I would like to consolidate all these transactions into an order history query, such that each transaction and its date is presented as a status of the order at a point in time.
What query would provide the following output?
Order Table
Order ID
1
2
Order Confirmation Table
Order Confirmation ID Date
1 2015-08-01
2 2015-08-01
Order Cancellation Table
Order Cancellation ID Date
1 2015-08-02
Order Completion Table
Order Completion ID Date
2 2015-08-02
Output:
Order ID Date Status
1 2015-08-01 Confirmed
2 2015-08-01 Confirmed
1 2015-08-02 Cancelled
2 2015-08-02 Completed
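-- Table and column names are assumed from the layout in the question.
-- ORDER is a reserved word in MySQL, so the table name is quoted.
select o.orderid,
oc.date,
'Confirmed' as status
from `order` o
join order_confirmation oc
on o.orderid = oc.orderid
union all
select o.orderid,
ox.date,
'Cancelled' as status
from `order` o
join order_cancellation ox
on o.orderid = ox.orderid
union all
select o.orderid,
ol.date,
'Completed' as status
from `order` o
join order_completed ol
on o.orderid = ol.orderid
order by date, orderid
You can use a union query to achieve this: one SELECT per transaction table, each tagging its rows with a constant status.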
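The union approach can be tried end to end with the sketch below, which includes a branch for the confirmation table so that every row of the expected output appears. It runs on Python's sqlite3 module rather than MySQL. The table and column names (orders, order_confirmation, order_cancellation, order_completed, order_id) are hypothetical, since the question only gives display names.

```python
import sqlite3

# In-memory SQLite database; all table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY);
    CREATE TABLE order_confirmation (order_id INTEGER, date TEXT);
    CREATE TABLE order_cancellation (order_id INTEGER, date TEXT);
    CREATE TABLE order_completed (order_id INTEGER, date TEXT);

    INSERT INTO orders VALUES (1), (2);
    INSERT INTO order_confirmation VALUES (1, '2015-08-01'), (2, '2015-08-01');
    INSERT INTO order_cancellation VALUES (1, '2015-08-02');
    INSERT INTO order_completed VALUES (2, '2015-08-02');
    """
)

# One SELECT per transaction table, each labelled with a constant status.
history = conn.execute(
    """
    SELECT o.order_id AS order_id, c.date AS date, 'Confirmed' AS status
    FROM orders o JOIN order_confirmation c ON c.order_id = o.order_id
    UNION ALL
    SELECT o.order_id, x.date, 'Cancelled'
    FROM orders o JOIN order_cancellation x ON x.order_id = o.order_id
    UNION ALL
    SELECT o.order_id, d.date, 'Completed'
    FROM orders o JOIN order_completed d ON d.order_id = o.order_id
    ORDER BY date, order_id
    """
).fetchall()

for row in history:
    print(row)
```

UNION ALL is used because the constant status column already makes the branches disjoint, so there are no duplicates to remove.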

Need to sum transaction totals from one table using customer information in another

I have spent the last hour looking for something I can use to implement here, but haven't found exactly what I need.
I have 2 tables: TRANSACTIONS & CUSTOMERS
CUSTOMER
internal_id | name | email
TRANSACTIONS
internal_id | customer_id | transaction_date | total_amount
I would like to cycle through all CUSTOMERS, then sum up the total TRANSACTIONS for each by month and year. I thought it would be as easy as just adding select statements as columns to the initial query, but that isn't working obviously:
NOT WORKING:
select customer.internal_id,
(sum(total_amount) as 'total' from TRANSACTIONS where transactions.customer_id = customer.internal_id and transaction_date >= DATE_SUB(NOW(),INTERVAL 1 month)),
(sum(total_amount) as 'total' from TRANSACTIONS where transactions.customer_id = customer.internal_id and transaction_date >= DATE_SUB(NOW(),INTERVAL 1 year))
from CUSTOMER join TRANSACTIONS on CUSTOMER.internal_id = TRANSACTIONS.customer_id
Basically I would like the output to look like this:
CUSTOMER.name | TRANSACTIONS.total_amount_month | TRANSACTIONS.total_amount_year
ABC Company | $335.00 | $8900.34
Is this possible with a single query? I have it implemented with multiple queries using PHP and would just prefer a single query if possible for performance sake.
Thanks!
SELECT c.name,
SUM(IF(transaction_date >= DATE_SUB(NOW(), INTERVAL 1 MONTH), total_amount, 0)) AS total_amount_month,
SUM(total_amount) AS total_amount_year
FROM transactions AS t
JOIN customer AS c ON c.internal_id = t.customer_id
WHERE transaction_date >= DATE_SUB(NOW(), INTERVAL 1 YEAR)
GROUP BY t.customer_id, c.name
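The idea in this answer is conditional aggregation: restrict the rows to the last year in WHERE, then sum only the last month's rows inside a conditional. The sketch below demonstrates it with Python's sqlite3 module rather than MySQL, so IF becomes CASE and DATE_SUB becomes SQLite's date() modifier. The sample rows and the fixed reference date today are invented for illustration.

```python
import sqlite3

# In-memory SQLite database; the sample rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (internal_id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE transactions (
        internal_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        transaction_date TEXT,
        total_amount REAL
    );
    INSERT INTO customer VALUES (1, 'ABC Company', 'abc@example.com'),
                                (2, 'XYZ Ltd', 'xyz@example.com');
    INSERT INTO transactions (customer_id, transaction_date, total_amount) VALUES
        (1, '2013-01-10', 100.0),
        (1, '2012-12-20', 235.0),
        (1, '2012-06-01', 50.0),
        (1, '2011-01-01', 999.0),
        (2, '2012-03-01', 40.0);
    """
)

today = "2013-01-15"  # fixed reference date so the result is reproducible

# WHERE limits rows to the last year; CASE sums only the last month's rows.
totals = conn.execute(
    """
    SELECT c.name,
           SUM(CASE WHEN t.transaction_date >= date(:today, '-1 month')
                    THEN t.total_amount ELSE 0.0 END) AS total_amount_month,
           SUM(t.total_amount) AS total_amount_year
    FROM transactions AS t
    JOIN customer AS c ON c.internal_id = t.customer_id
    WHERE t.transaction_date >= date(:today, '-1 year')
    GROUP BY c.internal_id, c.name
    ORDER BY c.name
    """,
    {"today": today},
).fetchall()

for row in totals:
    print(row)
```

Because this is an inner join filtered to the last year, customers with no transactions in that window do not appear at all; a LEFT JOIN from customer would be needed to list them with zero totals.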

Group BY product using MAX(price) or MAX(date) according to time interval

I've been searching for answers for 2 days and still have nothing. Please help me.
I have a database with products, their prices, and the date when each price was registered:
product_id | price | date
-------------------------
1 | 8.95 | 2012-12-01
2 | 3.40 | 2012-12-01
1 | 9.05 | 2012-12-19
3 | 2.34 | 2012-12-24
3 | 2.15 | 2012-12-01
1 | 8.80 | 2012-12-19
1 | 8.99 | 2012-12-02
2 | 3.45 | 2012-12-02
Observe that it is possible to have different price values for a product on the same day (rows 3 and 6). This is because there are many suppliers for a single product. There is a supplier column in the database too, but I considered it irrelevant to the solution. You can add it to the solution if I'm wrong.
Basically what I want is to write a query that returns two combined sets of data, as follow:
The first set consists of the minimum price of products inserted in the last month. As today is Jan 15, the query should read rows 3, 4 and 6, apply the minimum price, and return only rows 4 and 6, each with the minimum price for that product in the last month.
The second set consists of the last inserted products with no price registered in the last month, i.e., for products not shown in the first set, the query should return the most recently inserted row.
I hope that is clear. Ask me more if it isn't.
The query result for this database should be:
product_id | price | date
-------------------------
1 | 8.80 | 2012-12-19 <-Min price for product 1 on last month
3 | 2.34 | 2012-12-24 <-Min price for product 3 on last month
2 | 3.45 | 2012-12-02 <-No reg for product 2 on last month, show last reg.
I've tried everything: UNION, DATE_SUB(CURDATE(), INTERVAL 1 MONTH), MIN(price), MAX(date), etc. Nothing works. I don't know where to search now, please help me.
(SELECT product_id, MIN(price), date
FROM products
WHERE date + INTERVAL 1 MONTH > NOW()
GROUP BY product_id)
UNION
(SELECT product_id, price, MAX(date)
FROM products
WHERE product_id NOT IN (SELECT product_id
FROM products
WHERE date + INTERVAL 1 MONTH > NOW()
GROUP BY product_id)
GROUP BY product_id)
This should work but I'm not sure it's the most optimized way to do it.
something like this will do the trick:
SELECT * FROM (
SELECT DISTINCT b.product_id,
IF(c.min IS NULL,
(SELECT ROUND(e.price,2) FROM products AS e WHERE e.product_id = b.product_id ORDER BY e.date DESC LIMIT 1),
c.min) AS min,
IF(c.date IS NULL,
(SELECT f.date FROM products AS f WHERE f.product_id = b.product_id ORDER BY f.date DESC LIMIT 1),
c.date) AS date,
IF(c.min IS NULL,'<-No reg for product 2 on last month, show last reg.','<-Min price for product 1 on last month') AS text
FROM products AS b
LEFT JOIN
(SELECT a.product_id, round(MIN(a.price),2) AS min, a.date FROM products AS a WHERE a.date BETWEEN DATE_SUB(CURDATE(), INTERVAL 1 MONTH) AND CURDATE() GROUP BY a.product_id) AS c
ON (b.product_id = c.product_id)
) AS d
ORDER BY d.text, d.product_id
Gives output:
product_id|min|date|text
1|8.80|2012-12-19|<-Min price for product 1 on last month
3|2.34|2012-12-24|<-Min price for product 1 on last month
2|3.45|2012-12-02|<-No reg for product 2 on last month, show last reg.
Break it down into several sub-queries:
Products with prices in the last month, min price
join in date for that price
UNION
Products with no-prices in the last month, max date
join in price on that date
Query
SELECT MINPRICE.product_id, P.date, MINPRICE.price
FROM
(
-- Min price in last 31 days
SELECT product_id, MIN(price) AS price
FROM Prices
WHERE DATEDIFF(CURDATE(), date) < 31
GROUP BY product_id
) MINPRICE
-- Join in to get the date that the price occurred on
INNER JOIN Prices P ON
P.product_id = MINPRICE.product_id
AND
P.price = MINPRICE.price
UNION
SELECT MAXDATE.product_id, MAXDATE.date, P.price
FROM
(
-- Product with no price in last 31 days - get most recent date
SELECT product_id, MAX(date) AS date
FROM Prices
WHERE product_id NOT IN
(
SELECT product_id
FROM Prices
WHERE DATEDIFF(CURDATE(), date) < 31
)
-- Without this, MAX(date) collapses all remaining products into one row
GROUP BY product_id
) MAXDATE
-- join in price on that date
INNER JOIN Prices P ON
P.product_id = MAXDATE.product_id
AND
P.date = MAXDATE.date
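The breakdown above (minimum price within the last month, otherwise the most recent row) can be sketched against the question's sample data as follows. This uses Python's sqlite3 module rather than MySQL, a fixed reference date today of 2013-01-15 taken from the question, and common table expressions named recent and latest; those names, and the lowercase table name prices, are my own choices for the illustration.

```python
import sqlite3

# In-memory SQLite database loaded with the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (product_id INTEGER, price REAL, date TEXT)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        (1, 8.95, "2012-12-01"),
        (2, 3.40, "2012-12-01"),
        (1, 9.05, "2012-12-19"),
        (3, 2.34, "2012-12-24"),
        (3, 2.15, "2012-12-01"),
        (1, 8.80, "2012-12-19"),
        (1, 8.99, "2012-12-02"),
        (2, 3.45, "2012-12-02"),
    ],
)

today = "2013-01-15"  # the "today" mentioned in the question

result = conn.execute(
    """
    WITH recent AS (
        -- Minimum price per product among rows from the last month
        SELECT product_id, MIN(price) AS price
        FROM prices
        WHERE date >= date(:today, '-1 month')
        GROUP BY product_id
    ),
    latest AS (
        -- Most recent date for products with no row in the last month
        SELECT product_id, MAX(date) AS date
        FROM prices
        WHERE product_id NOT IN (SELECT product_id FROM recent)
        GROUP BY product_id
    )
    SELECT p.product_id, p.price, p.date
    FROM prices p
    JOIN recent r ON r.product_id = p.product_id AND r.price = p.price
    WHERE p.date >= date(:today, '-1 month')
    UNION
    SELECT p.product_id, p.price, p.date
    FROM prices p
    JOIN latest l ON l.product_id = p.product_id AND l.date = p.date
    ORDER BY 1
    """,
    {"today": today},
).fetchall()

for row in result:
    print(row)
```

If two suppliers share the same minimum price within the month, or the same latest date, this returns one row per distinct (product_id, price, date) combination, so a tie-breaking rule would still be needed to guarantee a single row per product.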
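Not that I tested, but you can try...
-- Relies on MySQL returning the first row of each ordered subquery per group,
-- which is non-standard and not guaranteed.
SELECT * FROM
(SELECT * FROM (
SELECT * FROM products ORDER BY date DESC)
as tmp GROUP BY product_id) t1
LEFT JOIN
(SELECT * FROM (
SELECT * FROM products WHERE date >= DATE_SUB(CURDATE(), INTERVAL 1 MONTH) ORDER BY price)
as tmp2 GROUP BY product_id) t2
ON t1.product_id = t2.product_id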