Difficult order / group by / count - mysql

I've got 3 simple tables, but my query is difficult
Sellers table :
seller_id | name
1 john
2 paul
5 fred
6 robert
etc ...
Transactions table (only 3 values for the moment) :
trans_id | name
1 BUY
2 SELL
3 EXCHANGE
Operations table :
seller_id | trans_id | datetime
2 1 ....
2 2 ....
6 1 ....
2 3 ....
6 1 ....
This table records all the sellers' transactions and when they happened.
I would like to obtain, for the last day or for a given time interval:
seller name, number of buy transactions, ordered by the number of buy transactions in the interval
seller name, number of buy-or-sell transactions, ordered by the number of buy-or-sell transactions in the interval
I've tried many things, including strange things that MySQL doesn't like, but I can't get it to work. Thanks!

Here is a solution for your first query, i.e. "seller name, number of buy transactions, ordered by the number of buy transactions in the interval":
SELECT
S.name,
COUNT(1) AS total
FROM operations O
JOIN sellers S on O.seller_id = S.seller_id
JOIN transactions T on O.trans_id = T.trans_id
WHERE O.datetime >= CURDATE() AND T.name = 'BUY'
GROUP BY S.name
ORDER BY total
The second query is nearly the same; only the WHERE clause changes a little:
SELECT
S.name,
COUNT(1) AS total
FROM operations O
JOIN sellers S on O.seller_id = S.seller_id
JOIN transactions T on O.trans_id = T.trans_id
WHERE O.datetime >= CURDATE() - INTERVAL 1 DAY AND T.name IN ('BUY','SELL')
GROUP BY S.name
ORDER BY total
To be honest, you could even remove the join with the transactions table and filter on O.trans_id in your WHERE clause.
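As a way to try the idea without a MySQL server, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's sqlite3 module and computes both counts in one pass with conditional aggregation. The timestamps, the interval bounds and the function name counts_per_seller are assumptions made for illustration, since the question elides the actual datetime values.

```python
import sqlite3

def counts_per_seller(conn, start, end):
    """Return (name, buys, buys_or_sells) per seller for start <= datetime < end."""
    sql = """
        SELECT s.name,
               SUM(CASE WHEN t.name = 'BUY' THEN 1 ELSE 0 END) AS buys,
               SUM(CASE WHEN t.name IN ('BUY', 'SELL') THEN 1 ELSE 0 END) AS buys_or_sells
        FROM operations o
        JOIN sellers s ON s.seller_id = o.seller_id
        JOIN transactions t ON t.trans_id = o.trans_id
        WHERE o.datetime >= ? AND o.datetime < ?
        GROUP BY s.seller_id, s.name
        ORDER BY buys DESC, s.name
    """
    return conn.execute(sql, (start, end)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sellers (seller_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE transactions (trans_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE operations (seller_id INTEGER, trans_id INTEGER, datetime TEXT);
    INSERT INTO sellers VALUES (1,'john'),(2,'paul'),(5,'fred'),(6,'robert');
    INSERT INTO transactions VALUES (1,'BUY'),(2,'SELL'),(3,'EXCHANGE');
    -- Timestamps are invented; the question does not show them.
    INSERT INTO operations VALUES
        (2,1,'2024-01-10 09:00:00'),
        (2,2,'2024-01-10 10:00:00'),
        (6,1,'2024-01-10 11:00:00'),
        (2,3,'2024-01-10 12:00:00'),
        (6,1,'2024-01-10 13:00:00');
""")
result = counts_per_seller(conn, '2024-01-10 00:00:00', '2024-01-11 00:00:00')
print(result)
```

Sellers without any operation in the interval do not appear, because the query starts from the operations table; a LEFT JOIN from sellers would be needed to list them with a count of zero.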

Here is a solution for your first query (@start_datetime and @end_datetime are user variables holding the bounds of the interval):
Select a.name, a.total from
(select sellers.name name, Count(transactions.trans_id) total
from sellers, transactions, operations
where sellers.seller_id = operations.seller_id
and transactions.trans_id = operations.trans_id
and transactions.trans_id = 1
and operations.datetime between @start_datetime and @end_datetime
group by sellers.name) a
order by a.total
The second solution is (again with @start_datetime and @end_datetime as user variables holding the bounds of the interval):
Select a.name, a.total from
(select sellers.name name, Count(transactions.trans_id) total
from sellers, transactions, operations
where sellers.seller_id = operations.seller_id
and transactions.trans_id = operations.trans_id
and transactions.trans_id in (1, 2)
and operations.datetime between @start_datetime and @end_datetime
group by sellers.name) a
order by a.total

Related

Finding missing data in a sequence in MySQL

Is there an efficient way to find missing data not just in one sequence, but many sequences?
This is probably unavoidably O(N**2), so efficient here is defined as relatively few queries using MySQL
Let's say I have a table of temporary employees and their starting and ending months.
employees | start_month | end_month
------------------------------------
Jane 2017-05 2017-07
Bob 2017-10 2017-12
And there is a related table of monthly payments to those employees
employee | paid_month
---------------------
Jane 2017-05
Jane 2017-07
Bob 2017-11
Bob 2017-12
Now, it's clear that we're missing a month for Jane (2017-06) and one for Bob too (2017-10).
Is there a way to somehow find the gaps in their payment record, without lots of trips back and forth?
In the case where there's just one sequence to check, some people generate a temporary table of valid values, and then LEFT JOIN to find the gaps. But here we have different sequences for each employee.
One possibility is that we could do an aggregate query to find the COUNT() of paid_months for each employee, and then check it versus the expected delta of months. Unfortunately the data here is a bit dirty so we actually have payment dates that could be before or after that employee start or end date. But we're verifying that the official sequence definitely has payments.
Form a Cartesian product of employees and months, then left join the actual data to that, then the missing data is revealed when there is no matched payment to the Cartesian product.
You need a list of every month. This might come from a "calendar table" you already have, or it might be possible to use a subquery (if every month is represented in the source data).
e.g.
select
m.paid_month, e.employee
from (select distinct paid_month from payments) m
cross join (select employee from employees) e
left join payments p on m.paid_month = p.paid_month and e.employee = p.employee
where p.employee is null
The subquery m can be substituted by the calendar table or some other technique for generating a series of months. e.g.
select
DATE_FORMAT(m1, '%Y-%m')
from (
select
'2017-01-01'+ INTERVAL m MONTH as m1
from (
select @rownum:=@rownum+1 as m
from (select 1 union select 2 union select 3 union select 4) t1
cross join (select 1 union select 2 union select 3 union select 4) t2
## cross join (select 1 union select 2 union select 3 union select 4) t3
## cross join (select 1 union select 2 union select 3 union select 4) t4
cross join(select @rownum:=-1) t0
) d1
) d2
where m1 < '2018-01-01'
order by m1
The subquery e could contain other logic (e.g. to determine which employees are still currently employed, or that are "temporary employees")
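The calendar-table variant of this approach can be sketched end to end with Python's sqlite3 module and an in-memory database. The months table, the helper month_range and the extra filter that keeps only months inside each employee's start/end range are assumptions added for illustration; the filter is what stops Jane from being reported as unpaid in months she did not work.

```python
import sqlite3

def month_range(start, end):
    """Yield 'YYYY-MM' strings from start to end inclusive."""
    year, month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    while (year, month) <= (end_year, end_month):
        yield f"{year:04d}-{month:02d}"
        month += 1
        if month > 12:
            year, month = year + 1, 1

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee TEXT, start_month TEXT, end_month TEXT);
    CREATE TABLE payments (employee TEXT, paid_month TEXT);
    CREATE TABLE months (month TEXT PRIMARY KEY);  -- hypothetical calendar table
    INSERT INTO employees VALUES ('Jane','2017-05','2017-07'),
                                 ('Bob','2017-10','2017-12');
    INSERT INTO payments VALUES ('Jane','2017-05'),('Jane','2017-07'),
                                ('Bob','2017-11'),('Bob','2017-12');
""")
# Fill the calendar with every month of 2017.
conn.executemany("INSERT INTO months VALUES (?)",
                 [(m,) for m in month_range("2017-01", "2017-12")])

# Employees x months, limited to each employee's range, minus the paid months.
missing = conn.execute("""
    SELECT e.employee, m.month
    FROM employees e
    CROSS JOIN months m
    LEFT JOIN payments p
           ON p.employee = e.employee AND p.paid_month = m.month
    WHERE m.month BETWEEN e.start_month AND e.end_month
      AND p.employee IS NULL
    ORDER BY e.employee, m.month
""").fetchall()
print(missing)
```

Because the range filter is applied to the calendar side, payments dated before the start month or after the end month are simply ignored rather than counted against the expected sequence.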
First, generate all the months between the start date and the end date for each employee in a derived table. Then left outer join it with the payments table on the paid month and keep only the non-matching months (where the payment's employee name is null). This uses Oracle syntax:
select e.employee, e.yearmonth as missing_paid_month from (
with t as (
select e.employee, to_date(e.start_date, 'YYYY-MM') as start_date, to_date(e.end_date, 'YYYY-MM') as end_date from employees e
)
select distinct t.employee,
to_char(add_months(trunc(start_date,'MM'),level - 1),'YYYY-MM') yearmonth
from t
connect by trunc(end_date,'mm') >= add_months(trunc(start_date,'mm'),level - 1)
order by t.employee, yearmonth
) e
left outer join payments p
on p.paid_month = e.yearmonth and p.employee = e.employee
where p.employee is null
output
EMPLOYEE MISSING_PAID_MONTH
Bob 2017-10
Jane 2017-06
SQL Fiddle http://sqlfiddle.com/#!4/2b2857/35

Customer partitioning in sql query

I have a table with following format -
Customer_id Purchase_date
c1 2015-01-11
c2 2015-02-12
c3 2015-11-12
c1 2016-01-01
c2 2016-12-29
c4 2016-11-28
c4 2015-03-15
... ...
The table contains customer_id values with their purchase_date. A customer_id repeats once for every purchase that customer made. The above is just sample data; the real table contains about 100,000 records.
Is there a way to partition the customers based on pre-defined categories?
Category Partitioning
- Category-1: Customer who has not made a purchase in the last 10 weeks, but made a purchase before that
- Category-2: Customer who has not made a purchase in the last 5 weeks, but made a purchase before that
- Category-3: Customer who has made one or more purchases in the last 4 weeks, or it has been 8 weeks since the first purchase
- Category-4: Customer who has made only one purchase in the last 1 week
- Category-5: Customer who has made only one purchase
What I'm looking for is a query that returns each customer and their category:
Customer_id Category
C1 Category-1
... ...
The query can target Oracle, Postgres or SQL Server.
From your question it seems that a customer can fall into multiple categories. So let's find the customers in each category and then take the UNION of the results.
SELECT DISTINCT Customer_Id, 'CATEGORY-1' AS Category FROM mytable GROUP BY
Customer_Id HAVING DATEDIFF(ww,MAX(Purchase_date),GETDATE()) > 10
UNION
SELECT DISTINCT Customer_Id, 'CATEGORY-2' AS Category FROM mytable GROUP BY
Customer_Id HAVING DATEDIFF(ww,MAX(Purchase_date),GETDATE()) > 5
UNION
SELECT DISTINCT Customer_Id, 'CATEGORY-3' AS Category FROM mytable GROUP BY
Customer_Id HAVING DATEDIFF(ww,MAX(Purchase_date),GETDATE()) < 4 OR
DATEDIFF(ww,MIN(Purchase_date),GETDATE()) =8
UNION
SELECT DISTINCT Customer_Id, 'CATEGORY-4' AS Category FROM mytable WHERE
DATEDIFF(ww,Purchase_date,GETDATE())<=1 GROUP BY Customer_Id having
COUNT(*) =1
UNION
SELECT DISTINCT Customer_Id, 'CATEGORY-5' AS Category FROM mytable GROUP BY
Customer_Id HAVING COUNT(*) =1
ORDER BY Category
Hope this serves your purpose.
Thanks
You can use something like this:
with myTab as (
SELECT Customer_id ,MIN(Purchase_date) AS Min_Purchase_date,MAX(Purchase_date) AS Max_Purchase_date
, SUM(CASE WHEN Purchase_date >= DATEADD(WEEK,-1,GETDATE()) THEN 1 ELSE 0 END) AS Count_LastWeek
, COUNT(*) AS Count_All
FROM Purchases_Table
GROUP BY Customer_id
)
SELECT Customer_id
, CASE WHEN Max_Purchase_date < DATEADD(WEEK,-10,GETDATE()) THEN 'Category-1'
WHEN Max_Purchase_date < DATEADD(WEEK,-5,GETDATE()) THEN 'Category-2'
WHEN Max_Purchase_date >= DATEADD(WEEK,-4,GETDATE())
OR DATEDIFF(WEEK, Min_Purchase_date, GETDATE()) >= 8 THEN 'Category-3'
WHEN Count_LastWeek = 1 THEN 'Category-4'
WHEN Count_All = 1 THEN 'Category-5'
ELSE 'No Category'
END AS Category
FROM myTab
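The first-match behaviour of a CASE expression like the one above can be illustrated with a small Python sketch. It is not a translation of the SQL Server date functions: it uses plain date comparisons, measures "8 weeks since the first purchase" against the reference date as the question words it, and the function name categorize and the reference date are assumptions for illustration.

```python
from datetime import date, timedelta

def categorize(purchase_dates, today):
    """Return the first matching category for one customer's purchase dates."""
    first, last = min(purchase_dates), max(purchase_dates)
    last_week = sum(1 for d in purchase_dates if d >= today - timedelta(weeks=1))
    # Branches are tested in order, like WHEN clauses in a CASE expression.
    if last < today - timedelta(weeks=10):
        return "Category-1"
    if last < today - timedelta(weeks=5):
        return "Category-2"
    if last >= today - timedelta(weeks=4) or first <= today - timedelta(weeks=8):
        return "Category-3"
    if last_week == 1:
        return "Category-4"
    if len(purchase_dates) == 1:
        return "Category-5"
    return "No Category"

# Sample rows from the question; "today" is an assumed reference date.
purchases = {
    "c1": [date(2015, 1, 11), date(2016, 1, 1)],
    "c2": [date(2015, 2, 12), date(2016, 12, 29)],
    "c3": [date(2015, 11, 12)],
    "c4": [date(2016, 11, 28), date(2015, 3, 15)],
}
today = date(2017, 1, 15)
categories = {c: categorize(d, today) for c, d in purchases.items()}
print(categories)
```

Since a purchase in the last week is also a purchase in the last 4 weeks, the Category-4 branch is only reached if it is tested before Category-3, so the order of the branches should follow the intended priority of the categories.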

How to optimise an SQL query using a join between multiple tables

I have two tables with the following structure:
Table A
itemId categoryId orderDate
==========================================
1 23 2016-11-08
1 23 2016-11-12
1 23 2016-11-16
Table B has the structure
categoryId stock price
==========================================
23 500 600
My desired output is:
Result C
price stock orderdate qty
600 500 2016-11-08 (first order date) 3 (3 time appearance in first table)
Here is what I have tried so far:
select b.price,b.stock from B b, A a
where b.categoryId = (
select a.categoryId
from A
GROUP BY categoryId
HAVING COUNT(categoryId)>1
)
and (a.orderdate = (
select MIN(orderdate)
from A
where categoryId = b.categoryId)
)
I get the following result:
price stock orderdate
600 500 2016-11-08
I have no idea how to find qty, i.e. the fact that the category appears 3 times in the first table.
I think you want the records in table a grouped by item id and category id, so include these two in your GROUP BY clause. The other columns then have to be aggregated using MIN, MAX, AVG, SUM, etc. I use MIN, which gives you the smallest value in the group for that column, although for price and stock it shouldn't matter in this case whether you use MIN, MAX or AVG, because every row in the group carries the same value. COUNT(*) then simply counts the number of records in the group.
Also, joins are generally preferred over listing tables with commas.
SELECT a.itemid, a.categoryid, MIN(b.price), MIN(b.stock), min(a.orderdate), count(*) as qty
FROM a
INNER JOIN b ON a.categoryid = b.categoryid
GROUP BY a.itemid, a.categoryid
You also need to select COUNT(*)
How about the following SQL:
select min(price), min(stock), min(orderDate), COUNT(A.categoryId)
from A,B where A.categoryId = B.categoryId
GROUP by A.categoryId
You could create views for your subqueries and give them meaningful names, e.g. CategoriesUsedInMultipleOrders, MostRecentOrderByCategory. This would 'optimize' your query by abstracting away complexity and making it easier for a human reader to understand.
This is the query with an explicit join:
SELECT B.price, B.stock, MIN( A.orderDate ) AS orderdate, COUNT( * ) AS qty
FROM TableA A
INNER JOIN TableB B ON A.categoryId = B.categoryId
GROUP BY A.categoryId, B.price, B.stock
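To check that a grouped join returns the row the question asks for, the sketch below loads the sample data into an in-memory SQLite database via Python's sqlite3 module. SQLite stands in for the real database here, which is an assumption made so the example runs anywhere; the table names TableA and TableB follow the last query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (itemId INTEGER, categoryId INTEGER, orderDate TEXT);
    CREATE TABLE TableB (categoryId INTEGER PRIMARY KEY, stock INTEGER, price INTEGER);
    INSERT INTO TableA VALUES (1,23,'2016-11-08'),(1,23,'2016-11-12'),(1,23,'2016-11-16');
    INSERT INTO TableB VALUES (23,500,600);
""")

# One output row per category: first order date and number of orders.
rows = conn.execute("""
    SELECT b.price, b.stock, MIN(a.orderDate) AS orderdate, COUNT(*) AS qty
    FROM TableA a
    INNER JOIN TableB b ON b.categoryId = a.categoryId
    GROUP BY a.categoryId, b.price, b.stock
""").fetchall()
print(rows)
```

Listing b.price and b.stock in the GROUP BY keeps the query valid on databases that reject non-aggregated columns outside the grouping, and it does not change the grouping because each categoryId has a single price and stock.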

add a column in MySQL rank by deal by day

I am just learning MySQL. I need to find out rank of deals by day. Here I am adding the corresponding MYSQL query for my requirement that currently ranks all sales highest to lowest by day. Please help me to add a column that gives the rank to the deal highest to lowest and resetting the next day.
Here is my current working query:
single day with title, price
SELECT
DATE(order_items.created_at) AS the_day,
order_items.deal_id,
SUM(order_items.item_total) AS daily_total,
SUM(order_items.qty) AS units_sold,
deals.price,
deals.title
FROM
order_items JOIN deals ON order_items.deal_id = deals.id
WHERE
order_items.created_at >= '2016-01-01 00:00:00' AND order_items.created_at < '2016-01-30 00:00:00'
AND
order_items.status=1
AND
order_items.paid=1
GROUP BY
order_items.deal_id
ORDER BY
the_day,
daily_total DESC;
The easiest way to do that is:
Use your existing SQL. I guess you need to update it so that every non-aggregated column in the SELECT also appears in the GROUP BY.
Use PHP to loop (1-5); this works for multiple days.
If you are happy to get the top 5 for a single day, you can add LIMIT 5 at the end of your SQL.
If you need the top 5 results for each day across multiple days in one SQL statement, the SQL becomes more complicated. A hint is to use a row counter; see this example:
select increment counter in mysql
OK - Since you updated your question to return top 1 result per day, this is easier:
Step 1: get each day, each deal, report:
SELECT deal_id, date(created_at) ymd, sum(item_total) daily_total, sum(qty) units_sold
FROM order_items
WHERE substr(created_at,1,7) = '2016-01'
AND status = 1
AND paid = 1
GROUP BY 1,2
Step 2: Find the best deal of each day from step 1:
SELECT aa.ymd, max(aa.daily_total) max_total
FROM (
SELECT deal_id, date(created_at) ymd, sum(item_total) daily_total, sum(qty) units_sold
FROM order_items
WHERE substr(created_at,1,7) = '2016-01'
AND status = 1
AND paid = 1
GROUP BY 1,2
) as aa
GROUP BY 1;
Please note that the row with max(item_total) is not necessarily the same row as the one with max(units_sold), so you need to choose one; you cannot use both together.
Step 3: Join step 2 with step 1 and deals to find out the rest of information:
SELECT aa.*, deals.price, deals.title
FROM (
SELECT aa.ymd, max(aa.daily_total) max_total
FROM (
SELECT deal_id, date(created_at) ymd, sum(item_total) daily_total, sum(qty) units_sold
FROM order_items
WHERE substr(created_at,1,7) = '2016-01'
AND status = 1
AND paid = 1
GROUP BY 1,2
) as aa
GROUP BY 1
) as bb
JOIN (
SELECT deal_id, date(created_at) ymd, sum(item_total) daily_total, sum(qty) units_sold
FROM order_items
WHERE substr(created_at,1,7) = '2016-01'
AND status = 1
AND paid = 1
GROUP BY 1,2
) as aa ON bb.ymd = aa.ymd and bb.max_total = aa.daily_total
JOIN deals ON aa.deal_id = deals.id
ORDER BY aa.ymd, bb.max_total
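For the original request, a rank column that restarts every day, one portable option is to count how many deals on the same day have a higher total. The sketch below tries this on an in-memory SQLite database through Python's sqlite3 module; the sample rows, the view name daily and the column name day_rank are invented for illustration. On MySQL 8.0 and later, the RANK() window function with PARTITION BY on the day is an alternative to the correlated subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_items (deal_id INTEGER, item_total INTEGER, qty INTEGER,
                              status INTEGER, paid INTEGER, created_at TEXT);
    -- Invented sample rows for illustration.
    INSERT INTO order_items VALUES
        (1, 50, 1, 1, 1, '2016-01-01 09:00:00'),
        (1, 30, 1, 1, 1, '2016-01-01 15:00:00'),
        (2, 60, 2, 1, 1, '2016-01-01 10:00:00'),
        (3, 99, 1, 1, 0, '2016-01-01 11:00:00'),  -- unpaid, filtered out
        (1, 10, 1, 1, 1, '2016-01-02 09:00:00'),
        (2, 40, 1, 1, 1, '2016-01-02 12:00:00');
    -- Step 1 of the answer: one row per day and deal.
    CREATE TEMP VIEW daily AS
        SELECT date(created_at) AS the_day, deal_id,
               SUM(item_total) AS daily_total, SUM(qty) AS units_sold
        FROM order_items
        WHERE status = 1 AND paid = 1
        GROUP BY date(created_at), deal_id;
""")

# Rank = 1 + number of deals on the same day with a strictly higher total.
ranked = conn.execute("""
    SELECT d.the_day, d.deal_id, d.daily_total,
           1 + (SELECT COUNT(*) FROM daily x
                WHERE x.the_day = d.the_day
                  AND x.daily_total > d.daily_total) AS day_rank
    FROM daily d
    ORDER BY d.the_day, day_rank
""").fetchall()
print(ranked)
```

Deals with equal totals on the same day share a rank with this formula, which matches the behaviour of RANK() rather than ROW_NUMBER().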

How to get rows with max date when grouping in MySQL?

I have a table with prices and dates on product:
id
product
price
date
I create a new record whenever a price changes, so I have a table like this:
id product price date
1 1 10 2014-01-01
2 1 20 2014-02-17
3 1 5 2014-03-28
4 2 25 2014-01-05
5 2 12 2014-02-08
6 2 30 2014-03-12
I want to get the last price for all products. But when I group by "product", I can't get the price from the row with the maximum date.
I can use the MAX(), MIN() or COUNT() functions in the query, but I need a result based on another column's value.
I want something like this in final:
product price date
1 5 2014-03-28
2 30 2014-03-12
But I don't know how. Maybe something like this:
SELECT product, {price with max date}, {max date}
FROM table
GROUP BY product
Alternatively, you can use a subquery to get the latest date for every product and join the result back to the table itself to get the other columns.
SELECT a.*
FROM tableName a
INNER JOIN
(
SELECT product, MAX(date) mxdate
FROM tableName
GROUP BY product
) b ON a.product = b.product
AND a.date = b.mxdate
I think the easiest way is a substring_index()/group_concat() trick:
SELECT product,
substring_index(group_concat(price order by date desc), ',', 1) as PriceOnMaxDate,
max(date)
FROM table
GROUP BY product;
Another way, which might be more efficient than a GROUP BY, is:
select t.*
from table t
where not exists (select 1
from table t2
where t2.product = t.product and
t2.date > t.date
);
This says: "Get me all rows from the table where the same product does not have a larger date." That is a fancy way of saying "get me the row with the maximum date for each product."
Note that there is a subtle difference: the second form will return all rows that fall on the maximum date, if there are duplicates.
Also, for performance an index on table(product, date) is recommended.
You can use a subquery that groups by product and return the maximum date for every product, and join this subquery back to the products table:
SELECT
p.product,
p.price,
p.date
FROM
products p INNER JOIN (
SELECT
product,
MAX(date) AS max_date
FROM
products
GROUP BY
product) m
ON p.product = m.product AND p.date = m.max_date
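Both the join on MAX(date) and the NOT EXISTS form can be tried against the question's sample data with Python's sqlite3 module and an in-memory database. The table name prices is a placeholder chosen for illustration, and SQLite is used here only so the sketch runs without a MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prices (id INTEGER PRIMARY KEY, product INTEGER,
                         price INTEGER, date TEXT);
    INSERT INTO prices VALUES
        (1,1,10,'2014-01-01'),(2,1,20,'2014-02-17'),(3,1,5,'2014-03-28'),
        (4,2,25,'2014-01-05'),(5,2,12,'2014-02-08'),(6,2,30,'2014-03-12');
""")

# Variant 1: join each row to the per-product maximum date.
via_join = conn.execute("""
    SELECT p.product, p.price, p.date
    FROM prices p
    INNER JOIN (SELECT product, MAX(date) AS max_date
                FROM prices
                GROUP BY product) m
            ON m.product = p.product AND m.max_date = p.date
    ORDER BY p.product
""").fetchall()

# Variant 2: keep rows for which no later row exists for the same product.
via_not_exists = conn.execute("""
    SELECT t.product, t.price, t.date
    FROM prices t
    WHERE NOT EXISTS (SELECT 1 FROM prices t2
                      WHERE t2.product = t.product AND t2.date > t.date)
    ORDER BY t.product
""").fetchall()
print(via_join)
```

Both variants return every row that shares the maximum date, so a product with two price records on its latest date would appear twice.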
SELECT
product,
price,
date
FROM
(SELECT
product,
price,
date
FROM table_name ORDER BY date DESC) AS t1
GROUP BY product;