MySQL Select Top Row Grouping By Another Row - mysql

I have a sales table and I want to get each member's most frequently shopped store in the last 3 months. The following query returns every member with every store, but I want just one store per member.
SELECT member_id, store_id, COUNT(DISTINCT docket) as docket_count, SUM(dollar_amount) as dollars
FROM sales
WHERE TIMESTAMPDIFF(MONTH, sale_date, CURDATE()) < 3
GROUP BY member_id, store_id
ORDER BY member_id, docket_count DESC, dollars DESC
Or, to get the top store for a single member:
SELECT store_id, COUNT(DISTINCT docket) as docket_count, SUM(dollar_amount) as dollars
FROM sales
WHERE TIMESTAMPDIFF(MONTH, sale_date, CURDATE()) < 3
AND member_id = 1
GROUP BY store_id
ORDER BY docket_count DESC, dollars DESC

This is tricky. In MySQL, the easiest approach is often the group_concat()/substring_index() trick:
SELECT member_id,
SUBSTRING_INDEX(GROUP_CONCAT(store_id ORDER BY docket_count DESC, dollars DESC), ',', 1) as Most_Common_Store
FROM (SELECT member_id, store_id, COUNT(DISTINCT docket) as docket_count,
SUM(dollar_amount) as dollars
FROM sales
WHERE sale_date >= CURDATE() - interval 3 month
GROUP BY member_id, store_id
) ms
GROUP BY member_id;
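On MySQL 8.0 or later, a window function is another way to keep one row per member. The sketch below is not the answer's method: it swaps in ROW_NUMBER() and runs against an in-memory SQLite database through Python's sqlite3 module, purely so it can be executed as-is. The sample rows are made up, and the 3-month date filter is left out to keep the data small.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (member_id INT, store_id INT, docket INT, dollar_amount REAL)"
)
# Hypothetical sample data: (member_id, store_id, docket, dollar_amount)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, 10, 100, 5.0),
        (1, 10, 101, 7.0),
        (1, 20, 102, 50.0),
        (2, 20, 103, 3.0),
        (2, 30, 104, 9.0),
    ],
)

top_store_sql = """
SELECT member_id, store_id
FROM (
    SELECT member_id, store_id,
           -- rank each member's stores: most dockets first, dollars break ties
           ROW_NUMBER() OVER (
               PARTITION BY member_id
               ORDER BY docket_count DESC, dollars DESC
           ) AS rn
    FROM (
        SELECT member_id, store_id,
               COUNT(DISTINCT docket) AS docket_count,
               SUM(dollar_amount) AS dollars
        FROM sales
        GROUP BY member_id, store_id
    ) AS ms
) AS ranked
WHERE rn = 1
ORDER BY member_id
"""
top_stores = conn.execute(top_store_sql).fetchall()
print(top_stores)
```

Member 1 has two dockets at store 10 and one at store 20, so store 10 wins. Member 2 has one docket at each store, so the higher dollar total decides in favour of store 30.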

Related

SQL query involving partial group with condition?

Below is a SQL query problem for which I can't work out the correct approach:
DB tables:
Employee: emp_id, emp_name
Credit: credit_id, emp_id, credit_date, credit_amount
debit: debit_id, emp_id, debit_date, debit_amount
Here, each person can have multiple incomes and expenses.
Query requirement: At the end of each day, each employee has some asset ('credit till now' - 'debit till now'). We need to find the top five employees by maximum asset, along with the date on which each of them had that maximum.
I have tried the below query but seems like I am missing something:
select Credit.emp_id, Credit.date, (Credit.income_amount - Debit.credit_amount) from
(select emp_id, sum(amount) as credit_amount
from credit) Credit
LEFT JOIN LATERAL (
select emp_id, sum(amount) as debit_amount
from debits
where debits.emp_id = Credit.emp_id and Credit.date >= debits.date
group by debits.emp_id
) Debit
ON true
Here I'm breaking the query to make it more readable.
First of all, we need to get the total amount per day for both credit and debit, so that we can join the two on the same day and the same emp_id.
with
credit as(
select emp_id,credit_date date,sum(credit_amount) as amount
from credit
group by 1,2),
debit as(
select emp_id,debit_date as date,sum(debit_amount) as amount
from debit
group by 1,2),
Now we need to full outer join the "credit" and "debit" subqueries
payments as (
select distinct
case when c.emp_id is null then d.emp_id else c.emp_id end as emp_id ,
case when c.emp_id is null then d.date else c.date end as date,
case when c.emp_id is null then 0 else c.amount end as credit ,
case when d.emp_id is null then 0 else d.amount end as debit
from credit c
full outer join debit d on d.emp_id=c.emp_id and d.date=c.date
),
Now we will take day-wise cumulative sum for credit, debit and total balance as shown below.
total_balance as(
SELECT emp_id, date,
sum(credit) OVER (PARTITION BY emp_id ORDER BY date asc) AS total_credit,
sum(debit) OVER (PARTITION BY emp_id ORDER BY date asc) AS total_debit,
(sum(credit) OVER (PARTITION BY emp_id ORDER BY date asc) -
sum(debit) OVER (PARTITION BY emp_id ORDER BY date asc)) as total_balance
FROM payments
ORDER BY emp_id, date),
Now we need to use the rank() function to assign rank based on total balance (desc) for an emp_id (ie. rank=1 will be assigned to the largest total balance on a day for a particular emp_id). The query is shown below.
ranks as (select emp_id,date,total_balance,
rank() over (partition by emp_id order by total_balance desc) as rank
from total_balance ),
Now pick the rows having rank=1 (ie. MAX of total_balance on a day for an emp_id and the date on which it was MAX).
Order it by total_balance descending and pick the top 5 rows
emp_order as (select emp_id,date,total_balance
from ranks
where rank=1
order by 3 desc
limit 5)
Now pick the name from the employee table.
select e.emp_id, e.emp_name, eo.date, eo.total_balance as balance
from emp_order eo
join Employee e on e.emp_id = eo.emp_id
order by 4 desc
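For anyone who wants to run the idea end to end, here is a compact sketch under some assumptions. It uses an in-memory SQLite database via Python's sqlite3 module, with invented sample rows. Instead of the full outer join above, it stacks credits and negated debits with UNION ALL before taking the running sum, which is a swapped-in technique and not the query from this answer. ROW_NUMBER() is used so each employee yields exactly one peak row, with the earliest date winning ties.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema follows the question; the rows are hypothetical sample data.
conn.executescript("""
CREATE TABLE employee (emp_id INT, emp_name TEXT);
CREATE TABLE credit (credit_id INT, emp_id INT, credit_date TEXT, credit_amount REAL);
CREATE TABLE debit (debit_id INT, emp_id INT, debit_date TEXT, debit_amount REAL);
INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO credit VALUES (1, 1, '2024-01-01', 100), (2, 1, '2024-01-03', 50),
                          (3, 2, '2024-01-02', 80);
INSERT INTO debit VALUES (1, 1, '2024-01-02', 30), (2, 2, '2024-01-03', 100);
""")

peak_sql = """
WITH movements AS (
    -- credits count as positive, debits as negative
    SELECT emp_id, credit_date AS day, credit_amount AS delta FROM credit
    UNION ALL
    SELECT emp_id, debit_date, -debit_amount FROM debit
),
daily AS (
    -- net movement per employee per day
    SELECT emp_id, day, SUM(delta) AS net
    FROM movements
    GROUP BY emp_id, day
),
balances AS (
    -- running balance at the end of each day
    SELECT emp_id, day,
           SUM(net) OVER (PARTITION BY emp_id ORDER BY day) AS balance
    FROM daily
),
ranked AS (
    SELECT emp_id, day, balance,
           ROW_NUMBER() OVER (
               PARTITION BY emp_id ORDER BY balance DESC, day
           ) AS rn
    FROM balances
)
SELECT e.emp_id, e.emp_name, r.day, r.balance
FROM ranked r
JOIN employee e ON e.emp_id = r.emp_id
WHERE r.rn = 1
ORDER BY r.balance DESC
LIMIT 5
"""
peak_balances = conn.execute(peak_sql).fetchall()
print(peak_balances)
```

With this data Ann's balance goes 100, 70, 120 and peaks on 2024-01-03, while Bob's goes 80, -20 and peaks on 2024-01-02.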
Group by and sum allows you to get the total credit for each person into 1 record. You can do a similar thing in a subquery to subtract the debit.
Select top 5 emp_id, credit_date, (sum(credit_amount) -
(select sum(debit_amount) from debit d
where c.emp_id = d.emp_id and c.credit_date = d.debit_date)
) as total
from Credit c group by emp_id, credit_date order by total desc

MYSQL DB error Expression #2 of SELECT list is not in GROUP BY clause and contains nonaggregated column

SELECT CUST_ID, AVG(freq), AVG(amount) ,month from
(SELECT CUST_ID, DATE_FORMAT(CDATE, "%Y%m") as month,
COUNT(*) as freq, SUM(BILLS/count(*)) as amount FROM PROCESSED
where CDATE>= DATE(NOW() - INTERVAL 6 MONTH) GROUP BY CUST_ID, month having count(*) >=3
order by cust_id, month) T where CUST_ID != 2750 and CUST_ID != 1 group by CUST_ID
I understand that the GROUP BY clause does not allow non-aggregated columns, but I need the month as a column. How can I include it?
SELECT CUST_ID, AVG(freq), AVG(amount),MONTH from
(SELECT CUST_ID, DATE_FORMAT(CDATE, "%Y%m") as month,
COUNT(*) as freq, SUM(BILLS)/count(*) as amount FROM PROCESSED
where CDATE>= DATE(NOW() - INTERVAL 6 MONTH) GROUP BY CUST_ID, MONTH having count(*) >=3
order by cust_id, month) T where CUST_ID != 2750 and CUST_ID != 1 group by CUST_ID,MONTH
I found the solution: the GROUP BY clause did not include the month column, so MySQL could not tell which row's month value to attach to each aggregated result.
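To see what the grouping choice does to the result, here is a small sketch using Python's sqlite3 module with an in-memory database and made-up rows (SQLite and strftime() stand in for MySQL and DATE_FORMAT(), and the date filter and HAVING clause are omitted). Grouping by customer and month gives one row per month, whereas grouping by customer alone, with month dropped from the SELECT list, averages across that customer's months.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed (cust_id INT, cdate TEXT, bills REAL)")
# Hypothetical sample data: two bills in January, one in February.
conn.executemany(
    "INSERT INTO processed VALUES (?, ?, ?)",
    [
        (7, "2024-01-05", 10.0),
        (7, "2024-01-20", 30.0),
        (7, "2024-02-10", 50.0),
    ],
)

# Inner query: one row per customer per month.
monthly = """
SELECT cust_id, strftime('%Y%m', cdate) AS month,
       COUNT(*) AS freq, SUM(bills) / COUNT(*) AS amount
FROM processed
GROUP BY cust_id, month
"""

# Month kept in the grouping: one row per (customer, month).
per_month = conn.execute(monthly + " ORDER BY cust_id, month").fetchall()

# Month dropped from both SELECT and GROUP BY: one row per customer.
per_customer = conn.execute(
    f"SELECT cust_id, AVG(freq), AVG(amount) FROM ({monthly}) AS t GROUP BY cust_id"
).fetchall()

print(per_month)
print(per_customer)
```

Which of the two shapes is wanted depends on whether the report should be per month or per customer.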

MySQL Finding the MAX of the SUM of each month of the year [MYSQL]

Currently I am able to get the sum of amount for each month of each year. What I want instead is the SUM for the month that has the highest total in each year.
SELECT year(paymentDate), month(paymentDate) , SUM(amount)
FROM classicmodels.payments
GROUP BY year(paymentDate), month(paymentDate)
ORDER BY SUM(amount) DESC;
This orders the rows by SUM(amount) in descending order, but I only want the highest month for each year. There are only 3 years in my database.
One method uses a having clause:
SELECT year(p.paymentDate), month(p.paymentDate), SUM(p.amount)
FROM classicmodels.payments p
GROUP BY year(p.paymentDate), month(p.paymentDate)
HAVING SUM(p.amount) = (SELECT SUM(p2.amount)
FROM classicmodels.payments p2
WHERE year(p2.paymentDate) = year(p.paymentDate)
GROUP BY month(p2.paymentDate)
ORDER BY SUM(p2.amount) DESC
LIMIT 1
)
ORDER BY SUM(amount) DESC;
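On MySQL 8.0 or later, a window function is another option. The sketch below is not the HAVING query above: it aggregates per month first and then applies RANK() within each year. It runs on an in-memory SQLite database through Python's sqlite3 module, with strftime() standing in for YEAR() and MONTH(), and the payment rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (paymentDate TEXT, amount REAL)")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [
        ("2003-01-10", 100.0),
        ("2003-01-25", 50.0),
        ("2003-06-01", 120.0),
        ("2004-03-03", 70.0),
        ("2004-11-11", 90.0),
    ],
)

best_month_sql = """
WITH monthly AS (
    -- total per (year, month)
    SELECT strftime('%Y', paymentDate) AS yr,
           strftime('%m', paymentDate) AS mon,
           SUM(amount) AS total
    FROM payments
    GROUP BY 1, 2
),
ranked AS (
    -- rank months inside each year by their total
    SELECT yr, mon, total,
           RANK() OVER (PARTITION BY yr ORDER BY total DESC) AS rnk
    FROM monthly
)
SELECT yr, mon, total
FROM ranked
WHERE rnk = 1
ORDER BY yr
"""
best_months = conn.execute(best_month_sql).fetchall()
print(best_months)
```

Because RANK() is used, two months that tie for the highest total in a year would both be returned; ROW_NUMBER() would keep only one of them.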

To find the maximum number of order count that occur in any 1 hour of the day from the database?

I have a food-selling website with an order table that records every user's orders. It has columns for user id, user name, order id, and the timestamp of the order. I want to know the maximum number of orders made in any one-hour span throughout the day. Any formula, algorithm, or SQL query for this would help.
SQL Server:
with CTE as
(
select cast(t1.timestamp as date) as o_date, datepart(hh, t1.timestamp) as o_hour, count(*) as orders
from MyTable t1
group by cast(t1.timestamp as date), datepart(hh, t1.timestamp)
)
select o_date, o_hour, orders
from CTE
where orders = (select max(orders) from CTE)
Oracle:
with CTE as
(
select to_char(t1.timestamp, 'YYYYMMDD') as o_date, to_char(t1.timestamp, 'HH24') as o_hour, count(*) as orders
from MyTable t1
group by to_char(t1.timestamp, 'YYYYMMDD'), to_char(t1.timestamp, 'HH24')
)
select o_date, o_hour, orders
from CTE
where orders = (select max(orders) from CTE)
You can get the count by day and hour like this.
For SQL Server
SELECT TOP 1
COUNT(*)
FROM myTable
GROUP BY DATEPART(day, [column_date]), DATEPART(hour, [column_date])
ORDER BY COUNT(*) DESC;
For MySQL
SELECT
COUNT(*)
FROM myTable
GROUP BY HOUR(column_date), DAY(column_date)
ORDER BY COUNT(*) DESC
LIMIT 1;
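Note that the queries above count orders per clock hour (12:00 to 12:59, 13:00 to 13:59, and so on). If "any one-hour span" is meant literally, so that 12:40 to 13:40 also counts, a sliding window over the sorted timestamps is one possible approach. The following is a minimal Python sketch of that algorithm; the function name and sample timestamps are made up, and it assumes the order timestamps have already been fetched from the table.

```python
from datetime import datetime, timedelta


def max_orders_in_any_hour(timestamps):
    """Return the largest number of orders inside any window [t, t + 1 hour)."""
    times = sorted(timestamps)
    best = 0
    start = 0  # index of the oldest order still inside the window
    for end, current in enumerate(times):
        # Drop orders that are a full hour (or more) older than the current one.
        while current - times[start] >= timedelta(hours=1):
            start += 1
        best = max(best, end - start + 1)
    return best


# Hypothetical sample data: four orders between 12:40 and 13:30, one at 15:00.
orders = [
    datetime(2024, 1, 1, 12, 40),
    datetime(2024, 1, 1, 12, 55),
    datetime(2024, 1, 1, 13, 10),
    datetime(2024, 1, 1, 13, 30),
    datetime(2024, 1, 1, 15, 0),
]
peak = max_orders_in_any_hour(orders)
print(peak)
```

For this data the sliding window finds 4 orders within one hour, while grouping by clock hour would report at most 2, because the busy period straddles 13:00.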

Is it possible to get all these stats with one query?

I have a vote-scoring system in which each user can score any product each day (a maximum of 10 points a day, but they can vote on the same product each day).
The schema for my vote table is like so:
Vote: ID, user_id, product_id, score, date.
What I'd like to do is not only fetch the total score and the number of individual votes (so I can work out an average), but also get the number of unique voters (DISTINCT user_id's) in the current time frame (in this example, a month). The current query I have is:
SELECT
SUM(`Vote`.`score`) AS `score`,
COUNT(*) AS `votes`,
CONCAT(YEAR(`Vote`.`date`), '-', MONTH(Vote.date)) AS `month`
FROM
`votes` AS `Vote`
WHERE
`product_id` = 4
GROUP BY
month
ORDER BY `Vote`.`date` DESC
Thanks in advance.
Use COUNT(DISTINCT user_id) in your SELECT list.
You can also have AVG(score) calculated.
SELECT
SUM(score) AS totalScore,
COUNT(*) AS totalVotes,
COUNT(DISTINCT user_id) AS voters,
AVG(score) AS averageScore,
CONCAT(YEAR(`date`), '-', MONTH(`date`)) AS `month`
FROM
votes
WHERE
product_id = 4
GROUP BY
`month`
ORDER BY MAX(`date`) DESC
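As a quick check of how these aggregates behave together, here is a small runnable sketch using Python's sqlite3 module with an in-memory database. SQLite stands in for MySQL, strftime() replaces the CONCAT(YEAR(), MONTH()) expression, and the vote rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE votes (id INTEGER PRIMARY KEY, user_id INT, "
    "product_id INT, score INT, date TEXT)"
)
# Hypothetical sample data: (user_id, product_id, score, date)
conn.executemany(
    "INSERT INTO votes (user_id, product_id, score, date) VALUES (?, ?, ?, ?)",
    [
        (1, 4, 5, "2024-01-02"),
        (1, 4, 3, "2024-01-03"),   # same user voting again on another day
        (2, 4, 10, "2024-01-05"),
        (2, 4, 4, "2024-02-01"),
        (3, 9, 7, "2024-01-05"),   # different product, filtered out
    ],
)

stats_sql = """
SELECT strftime('%Y-%m', date) AS month,
       SUM(score) AS total_score,
       COUNT(*) AS total_votes,
       COUNT(DISTINCT user_id) AS voters,
       AVG(score) AS average_score
FROM votes
WHERE product_id = 4
GROUP BY month
ORDER BY month DESC
"""
monthly_stats = conn.execute(stats_sql).fetchall()
print(monthly_stats)
```

In January, user 1 voted twice and user 2 once, so total_votes is 3 while voters is 2; that gap is exactly what COUNT(DISTINCT user_id) captures.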