Customizing the ROLLUP row in MySQL

I want to summarize the sales data and show the grand total in the last row. I'm using GROUP BY with WITH ROLLUP, but the result is:
+--------+--------------------+------------+--------+-----------+
| id | name | date | amount | total |
+--------+--------------------+------------+--------+-----------+
| Z00015 | Mebel Harmonis | 2019-05-09 | 2 | 10000000 |
| Z00016 | Mebel Harmonis | 2019-05-09 | 10 | 45000000 |
| Z00017 | Mebel Tunggal Jaya | 2019-05-10 | 3 | 12000000 |
| (null) | Mebel Tunggal Jaya | 2019-05-10 | 29 | 131000000 |
+--------+--------------------+------------+--------+-----------+
The last row that I want:
+--------+--------+--------+----+-----------+
| (null) | (null) | (null) | 29 | 131000000 |
+--------+--------+--------+----+-----------+
This is my query:
SELECT
order2.id_order AS id,
customer.name_customer AS name,
DATE( order2.date_order ) AS date ,
Count( order_detail.id_detail ) AS amount,
SUM( harga ) AS total
FROM
order_detail
INNER JOIN order2 ON order2.id_order = order_detail.id_order
INNER JOIN customer ON order2.id_customer = customer.id_customer
INNER JOIN produk ON produk.id_produk = order_detail.id_produk
INNER JOIN sofa ON sofa.id_sofa = produk.id_sofa
WHERE
date( date_order ) >= '2019-05-01'
AND date( date_order ) <= '2019-05-31'
GROUP BY
order2.id_order WITH ROLLUP;

You need to list in your GROUP BY clause all the columns that should be rolled up together for the grand total:
GROUP BY id, name, date WITH ROLLUP
However, this also creates intermediate subtotal rows for each id and for each (id, name) pair. Those rows have a NULL date but a non-NULL id, so (assuming date_order itself is never NULL) you can filter them out and keep only the detail rows and the grand total with:
HAVING date IS NOT NULL OR id IS NULL
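As an alternative to filtering ROLLUP output, the grand-total row can also be appended with UNION ALL. The sketch below is not the ROLLUP approach itself: it runs on Python's sqlite3 module (SQLite has no WITH ROLLUP), and the flattened table sales and the helper column is_total are hypothetical stand-ins for the joined tables in the question.

```python
import sqlite3

# Hypothetical, flattened stand-in for the joined tables in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id TEXT, name TEXT, date TEXT, amount INTEGER, total INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        ("Z00015", "Mebel Harmonis", "2019-05-09", 2, 10000000),
        ("Z00016", "Mebel Harmonis", "2019-05-09", 10, 45000000),
        ("Z00017", "Mebel Tunggal Jaya", "2019-05-10", 3, 12000000),
    ],
)

# Detail rows carry is_total = 0, the single grand-total row carries is_total = 1,
# so ordering by is_total keeps the total in the last position.
query = """
SELECT id, name, date, amount, total
FROM (
    SELECT id, name, date, amount, total, 0 AS is_total FROM sales
    UNION ALL
    SELECT NULL, NULL, NULL, SUM(amount), SUM(total), 1 FROM sales
) AS combined
ORDER BY is_total, id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The last row has NULL (None in Python) in id, name and date, which is the shape the question asks for.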

Related

Joining 2 SQL SELECT result into one query

I wanted to know if there's a way to join two or more result sets into one.
I have the following two queries.
First query:
SELECT
CONCAT(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)) as day_month_year,
db.country.country ,
count(concat(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on))) as count ,
COUNT(DISTINCT db.prod_id.email) AS MAIL
from db.prod_id
left join db.country on db.prod_id.branch_id = db.country.id
where db.prod_id.created_on > '2020-11-17' and (db.country.type = 1 or db.country.type = 2)
group by
concat(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)),
db.country.country
order by db.prod_id.created_on
The second query:
select
CONCAT(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)) as day_month_year,
db.country.country,
count(CONCAT(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on))) as count_BUY
from db.prod_id
left join db.prod_evaluations on db.prod_id.id = db.prod_evaluations.id
left join db.country on db.prod_id.branch_id = db.country.id
left join (Select prod_properties.prod_id, prod_properties.value From prod_properties Where prod_properties.property_id = 5) as db3 on db3.prod_id = db.prod_id.id
where db.prod_id.created_on > '2020-11-17'
and db3.value = 'online-buy' and db.prod_id.status_id <> 25
group by
concat(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)),
db.country.country
order by db.prod_id.created_on
The first query gives the following result:
+------------+---------+-------+------+
| day | Country | Count | Mail |
+------------+---------+-------+------+
| 17-11-2020 | IT | 200 | 100 |
| 17-11-2020 | US | 250 | 100 |
| 18-11-2020 | IT | 350 | 300 |
| 18-11-2020 | US | 200 | 100 |
+------------+---------+-------+------+
The second query gives:
+------------+---------+-----------+
| day | Country | Count_BUY |
+------------+---------+-----------+
| 17-11-2020 | IT | 50 |
| 17-11-2020 | US | 70 |
| 18-11-2020 | IT | 200 |
| 18-11-2020 | US | 50 |
+------------+---------+-----------+
Now I want to merge these two results into one:
+------------+---------+-------+------+-----------+
| day | Country | Count | Mail | Count_BUY |
+------------+---------+-------+------+-----------+
| 17-11-2020 | IT | 200 | 100 | 50 |
| 17-11-2020 | US | 250 | 100 | 70 |
| 18-11-2020 | IT | 350 | 300 | 200 |
| 18-11-2020 | US | 200 | 100 | 50 |
+------------+---------+-------+------+-----------+
How can I perform this query?
I'm using MySQL.
Thanks
The simple way: You can join queries.
select *
from ( <your first query here> ) first_query
join ( <your second query here> ) second_query using (day_month_year, country)
order by day_month_year, country;
This is an inner join. You can also use an outer join, of course. MySQL doesn't support full outer joins, though; if you need one, you'll have to emulate it, for example with a UNION of a left join and a right join.
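A minimal sketch of joining two aggregated result sets on their shared keys, assuming Python's sqlite3 module in place of MySQL. The tables visits and buys and their rows are hypothetical, and a LEFT JOIN is used so that a day/country pair without purchases is still listed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables standing in for the two source queries.
conn.executescript("""
CREATE TABLE visits (day TEXT, country TEXT, email TEXT);
CREATE TABLE buys (day TEXT, country TEXT);
INSERT INTO visits VALUES
    ('2020-11-17', 'IT', 'a@example.com'),
    ('2020-11-17', 'IT', 'a@example.com'),
    ('2020-11-17', 'US', 'b@example.com'),
    ('2020-11-18', 'IT', 'c@example.com');
INSERT INTO buys VALUES
    ('2020-11-17', 'IT'),
    ('2020-11-17', 'US'),
    ('2020-11-17', 'US');
""")

# Each subquery is aggregated on its own, then the results are matched
# on (day, country).
query = """
SELECT day, country, first_query.cnt, first_query.mail, second_query.count_buy
FROM (
    SELECT day, country, COUNT(*) AS cnt, COUNT(DISTINCT email) AS mail
    FROM visits
    GROUP BY day, country
) AS first_query
LEFT JOIN (
    SELECT day, country, COUNT(*) AS count_buy
    FROM buys
    GROUP BY day, country
) AS second_query USING (day, country)
ORDER BY day, country
"""
merged = conn.execute(query).fetchall()
for row in merged:
    print(row)
```

The pair without a match on the right side comes back with NULL in count_buy; wrapping that column in COALESCE(..., 0) would turn it into 0 if that reads better in the report.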
The hard way ;-) Merge the queries.
If I am not mistaken, your two queries can be reduced to
select
date(p.created_on),
p.branch_id as country,
count(*) as count_products,
count(distinct p.email) as count_emails
from db.prod_id p
where p.created_on >= date '2020-11-17'
-- branch_id references db.country.id in the original join
and p.branch_id in (select id from db.country where type in (1, 2))
group by date(p.created_on), p.branch_id
order by date(p.created_on), p.branch_id;
and
select
date(p.created_on),
p.branch_id as country,
count(*) as count_buy
from db.prod_id p
where p.created_on >= date '2020-11-17'
and p.status_id <> 25
-- property 5 must carry the value 'online-buy', as in the original query
and p.id in (select prod_id from prod_properties where property_id = 5 and value = 'online-buy')
group by date(p.created_on), p.branch_id
order by date(p.created_on), p.branch_id;
The two combined should be
select
date(p.created_on),
p.branch_id as country,
sum(p.branch_id in (select id from db.country where type in (1, 2))) as count_products,
count(distinct case when p.branch_id in (select id from db.country where type in (1, 2)) then p.email end) as count_emails,
sum(p.status_id <> 25 and p.id in (select prod_id from prod_properties where property_id = 5 and value = 'online-buy')) as count_buy
from db.prod_id p
where p.created_on >= date '2020-11-17'
group by date(p.created_on), p.branch_id
order by date(p.created_on), p.branch_id;
You see, the conditions the queries have in common remain in the where clause and the other conditions go inside the aggregation functions.
sum(boolean) is short for sum(case when boolean then 1 else 0 end), i.e. this counts the rows where the condition is met in MySQL.
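The sum(boolean) idea can be tried on a small scale. This is a sketch under assumptions: it uses Python's sqlite3 module (where a comparison also evaluates to 0 or 1), and the single table prod with the columns channel and status_id is a hypothetical simplification of the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table simplification of the question's schema.
conn.execute(
    "CREATE TABLE prod (day TEXT, country TEXT, email TEXT, channel TEXT, status_id INTEGER)"
)
conn.executemany(
    "INSERT INTO prod VALUES (?, ?, ?, ?, ?)",
    [
        ("2020-11-17", "IT", "a@example.com", "online-buy", 1),
        ("2020-11-17", "IT", "a@example.com", "offline", 1),
        ("2020-11-17", "IT", "b@example.com", "online-buy", 25),
        ("2020-11-17", "US", "c@example.com", "offline", 1),
        ("2020-11-18", "IT", "a@example.com", "online-buy", 1),
    ],
)

# The shared filter would stay in WHERE; the condition that applies to only
# one of the counts moves inside SUM(), which adds 1 per matching row.
query = """
SELECT day,
       country,
       COUNT(*) AS cnt,
       COUNT(DISTINCT email) AS mail,
       SUM(channel = 'online-buy' AND status_id <> 25) AS count_buy
FROM prod
GROUP BY day, country
ORDER BY day, country
"""
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
```

One pass over the table yields all three counts, so no join between two aggregated result sets is needed.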

How can I retrieve all the columns on a timerange aggregation?

I am currently struggling to aggregate my daily data into other time granularities (weeks, months, quarters, etc.).
Here is what my raw data looks like:
| date | traffic_type | visits |
|----------|--------------|---------|
| 20180101 | 1 | 1221650 |
| 20180101 | 2 | 411424 |
| 20180101 | 4 | 108407 |
| 20180101 | 5 | 298117 |
| 20180101 | 6 | 26806 |
| 20180101 | 7 | 12033 |
| 20180101 | 8 | 80368 |
| 20180101 | 9 | 69544 |
| 20180101 | 10 | 39919 |
| 20180101 | 11 | 26291 |
| 20180102 | 1 | 1218490 |
| 20180102 | 2 | 410965 |
| 20180102 | 4 | 108037 |
| 20180102 | 5 | 297727 |
| 20180102 | 6 | 26719 |
| 20180102 | 7 | 12019 |
| 20180102 | 8 | 80074 |
First, I would like to check the sum of visits regardless of traffic_type:
SELECT date, SUM(visits) as visits_per_day
FROM visits_tbl
GROUP BY date
Here is the outcome:
| ymd | visits_per_day |
|:--------:|:--------------:|
| 20180101 | 2294563 |
| 20180102 | 2289145 |
| 20180103 | 2300367 |
| 20180104 | 2310256 |
| 20180105 | 2368098 |
| 20180106 | 2372257 |
| 20180107 | 2373863 |
| 20180108 | 2364236 |
However, if I want to find the specific day on which visits_per_day was the highest for each time aggregation (e.g. month), I am struggling to retrieve the right output.
Here is what I did:
SELECT
(date div 100) as y_month, MAX(visits_per_day) as max_visit_per_day
FROM
(SELECT date, SUM(visits) as visits_per_day
FROM visits_tbl
GROUP BY date) as t1
GROUP BY
y_month
And here is the output of my query:
| y_month | max_visit_per_day |
|:-------:|:-----------------:|
| 201801 | 2435845 |
| 201802 | 2519000 |
| 201803 | 2528097 |
| 201804 | 2550645 |
However, I cannot know what was the exact day where the visits_per_day was the highest.
Desired output:
| y_month | max_visit_per_day | ymd |
|:-------:|:-----------------:|:--------:|
| 201801 | 2435845 | 20180130 |
| 201802 | 2519000 | 20180220 |
| 201803 | 2528097 | 20180325 |
| 201804 | 2550645 | 20180406 |
ymd would represent the day in which the visits_per_day was the highest.
This logic would be used in a dashboard with the help of programming in order to automatically select the time aggregation.
Can someone please help me?
This is a job for the structured part of structured query language. That is, you will write some subqueries and treat them as tables.
You already know how to find the number of visits per day. Let's add the month for each day to that query (http://sqlfiddle.com/#!9/a8455e/13/0).
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
Next you need to find the largest number of daily visits in each month. (http://sqlfiddle.com/#!9/a8455e/12/0)
SELECT month, MAX(visits) max_daily_visits
FROM (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
) dayvisits
GROUP BY month
Then, the trick is retrieving the date on which that maximum occurred in each month. That requires a join. Without common table expressions (which MySQL only supports from version 8.0 on) you need to repeat the first subquery. (http://sqlfiddle.com/#!9/a8455e/11/0)
SELECT detail.*
FROM (
SELECT month, MAX(visits) max_daily_visits
FROM (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
) dayvisits
GROUP BY month
) maxvisits
JOIN (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
) detail ON detail.visits = maxvisits.max_daily_visits
AND detail.month = maxvisits.month
The outline of this rather complex query helps explain it. Instead of that subquery, we'll use an imaginary table called dayvisits.
SELECT detail.*
FROM (
SELECT month, MAX(visits) max_daily_visits
FROM dayvisits
GROUP BY month
) maxvisits
JOIN dayvisits detail ON detail.visits = maxvisits.max_daily_visits
AND detail.month = maxvisits.month
You're seeking an extreme value for each month in the subquery. (This is a fairly standard sort of SQL operation.) To do that you find that value with a MAX() ... GROUP BY query. Then you join that to the subquery itself to find the other values corresponding to the extreme value.
If you have common table expressions available (MySQL 8.0 and later, or MariaDB 10.2 and later), the query looks like this.
WITH dayvisits AS (
SELECT date DIV 100 as month, date,
SUM(visits) as visits
FROM visits_tbl
GROUP BY date
)
SELECT dayvisits.*
FROM (
SELECT month, MAX(visits) max_daily_visits
FROM dayvisits
GROUP BY month
) maxvisits
JOIN dayvisits ON dayvisits.visits = maxvisits.max_daily_visits
AND dayvisits.month = maxvisits.month
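The CTE version can be run as a quick check. This sketch assumes Python's sqlite3 module rather than MySQL, so date / 100 (integer division in SQLite) stands in for date DIV 100, and the rows in visits_tbl are made-up values, not the data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits_tbl (date INTEGER, traffic_type INTEGER, visits INTEGER)"
)
# Made-up sample rows: two traffic types per day, two days per month.
conn.executemany(
    "INSERT INTO visits_tbl VALUES (?, ?, ?)",
    [
        (20180101, 1, 100), (20180101, 2, 50),   # day total 150
        (20180102, 1, 120), (20180102, 2, 60),   # day total 180
        (20180201, 1, 90), (20180201, 2, 10),    # day total 100
        (20180202, 1, 40), (20180202, 2, 30),    # day total 70
    ],
)

# Step 1 (CTE): total visits per day, tagged with the month.
# Step 2 (maxvisits): the largest daily total in each month.
# Step 3 (join): look up the day that produced that maximum.
query = """
WITH dayvisits AS (
    SELECT date / 100 AS month, date, SUM(visits) AS visits
    FROM visits_tbl
    GROUP BY date
)
SELECT maxvisits.month, maxvisits.max_daily_visits, dayvisits.date
FROM (
    SELECT month, MAX(visits) AS max_daily_visits
    FROM dayvisits
    GROUP BY month
) AS maxvisits
JOIN dayvisits
  ON dayvisits.visits = maxvisits.max_daily_visits
 AND dayvisits.month = maxvisits.month
ORDER BY maxvisits.month
"""
best_days = conn.execute(query).fetchall()
for row in best_days:
    print(row)
```

If two days in a month tie for the maximum, the join returns both of them, which may or may not be what a dashboard wants.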
The following query was checked on MSSQL; it's quick and efficient.
select visit_sum_day_wise.date
, visit_sum_day_wise.Max_Visits
, visit_sum_day_wise.traffic_type
, LAST_VALUE(visit_sum_day_wise.visits) OVER(PARTITION BY
visit_sum_day_wise.date ORDER BY visit_sum_day_wise.date ROWS BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS max_visit_per_day
from (
select visits_tbl.date , visits_tbl.visits , visits_tbl.traffic_type
,max(visits_tbl.visits ) OVER ( PARTITION BY visits_tbl.date ORDER
BY visits_tbl.date ROWS BETWEEN UNBOUNDED PRECEDING AND 0
PRECEDING) Max_visits
from visits_tbl
) as visit_sum_day_wise
where visit_sum_day_wise.visits = (select max(visits_B.visits ) from
visits_tbl visits_B where visits_B.Date = visit_sum_day_wise.date )

MySQL - Return Latest Date and Total Sum from two rows in a column for multiple entries

For every ID_Number, there is a bill_date and then two types of bills that happen. I want to return the latest date (max date) for each ID number and then add together the two types of bill amounts. So, based on the table below, it should return:
| 1 | 201604 | 10.00 |
| 2 | 201701 | 28.00 |
tbl_charges
+-----------+-----------+-----------+--------+
| ID_Number | Bill_Date | Bill_Type | Amount |
+-----------+-----------+-----------+--------+
| 1 | 201601 | A | 5.00 |
| 1 | 201601 | B | 7.00 |
| 1 | 201604 | A | 4.00 |
| 1 | 201604 | B | 6.00 |
| 2 | 201701 | A | 15.00 |
| 2 | 201701 | B | 13.00 |
+-----------+-----------+-----------+--------+
Then, if possible, I want to be able to do this in a join in another query, using ID_Number as the column for the join. Would that change the query here?
Note: I am initially only wanting to run the query for about 200 distinct ID_Numbers out of about 10 million. I will be adding an 'IN' clause for those IDs. When I do the join for the final product, I will need to know how to get those latest dates out of all the other join possibilities. (ie, how do I get ID_Number 1 to join with 201604 and not 201601?)
I would use NOT EXISTS and GROUP BY
select t1.id_number, max(t1.bill_date), sum(t1.amount)
from tbl_charges t1
where not exists (
select 1
from tbl_charges t2
where t1.id_number = t2.id_number and
t1.bill_date < t2.bill_date
)
group by t1.id_number
The NOT EXISTS filters out the rows that are not on the latest bill_date for their id_number, and GROUP BY does the sum.
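The NOT EXISTS plus GROUP BY approach can be checked against the sample rows from the question. This is a sketch that assumes Python's sqlite3 module in place of MySQL; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_charges (id_number INTEGER, bill_date INTEGER, bill_type TEXT, amount REAL)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO tbl_charges VALUES (?, ?, ?, ?)",
    [
        (1, 201601, "A", 5.00),
        (1, 201601, "B", 7.00),
        (1, 201604, "A", 4.00),
        (1, 201604, "B", 6.00),
        (2, 201701, "A", 15.00),
        (2, 201701, "B", 13.00),
    ],
)

# A row survives only if no later bill_date exists for the same id_number.
query = """
SELECT t1.id_number, MAX(t1.bill_date), SUM(t1.amount)
FROM tbl_charges t1
WHERE NOT EXISTS (
    SELECT 1
    FROM tbl_charges t2
    WHERE t1.id_number = t2.id_number
      AND t1.bill_date < t2.bill_date
)
GROUP BY t1.id_number
ORDER BY t1.id_number
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

To restrict the run to a few IDs, an extra condition such as t1.id_number IN (...) can go into the outer WHERE clause, and the whole query can serve as a derived table joined on id_number.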
I would be inclined to filter in the where:
select c.id_number, max(c.bill_date) as bill_date, sum(c.amount) as total_amount
from tbl_charges c
where c.bill_date = (select max(c2.bill_date)
from tbl_charges c2
where c2.id_number = c.id_number and c2.bill_type = c.bill_type
)
group by c.id_number;
Or, another fun way is to use in with tuples:
select c.id_number, sum(c.amount)
from tbl_charges c
where (c.id_number, c.bill_type, c.bill_date) in
(select c2.id_number, c2.bill_type, max(c2.bill_date)
from tbl_charges c2
group by c2.id_number, c2.bill_type
)
group by c.id_number;

SQL Help: How come the total from this query is different than a summation query?

This query does a group by on lead_source_id:
SELECT ch.lead_source_id,
Count(DISTINCT ch.repurchased_date)
FROM customers_history ch
WHERE ch.repurchased_date >= '2014-04-01'
AND ch.repurchased_date < '2014-05-01'
AND ch.lead_source_id IS NOT NULL
GROUP BY ch.lead_source_id;
And this query totals the records in the table:
SELECT Count(DISTINCT( repurchased_date ))
FROM customers_history
INNER JOIN (SELECT DISTINCT( customer_id ) AS xcid
FROM customers_history
WHERE repurchased_date >= '2014-04-01'
AND repurchased_date < '2014-05-01'
AND lead_source_id IS NOT NULL) AS Temp
ON Temp.xcid = customer_id
WHERE repurchased_date >= '2014-04-01'
AND repurchased_date < '2014-05-01'
AND lead_source_id IS NOT NULL;
On our production data, the totals from Query1 come to 7963, but the second query prints 7905. Why the difference and how can we fix our queries?
Here's our table layout:
+--------+-------------+----------------+---------------------+--------+
| id | customer_id | lead_source_id | repurchased_date | Rating |
+--------+-------------+----------------+---------------------+--------+
| 422923 | 420450 | 4 | 2014-04-14 09:16:48 | Warm |
| 422924 | 420450 | 4 | 2014-04-14 09:16:48 | Cold |
| 422956 | 420450 | 4 | 2014-04-14 09:16:49 | Hot |
| 422933 | 420451 | 37 | 2014-04-14 09:18:41 | Hot |
| 422938 | 420452 | 1 | 2014-04-10 20:50:30 | Hot |
| 422984 | 420452 | 1 | 2014-04-12 20:50:30 | Warm |
| 422940 | 420453 | 47 | 2014-04-14 09:20:27 | Hot |
+--------+-------------+----------------+---------------------+--------+
EDIT
To answer some of the possibilities about nulls:
select count(id) from customers_history where customer_id is null: 0
select count(id) from customers_history where lead_source_id is null: 5103
select count(id) from customers_history where repurchased_date is null: 0
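The most obvious conclusion is that some lead_source_ids share values of repurchased_date. The first query counts distinct dates within each lead source, so a timestamp that occurs under two lead sources is counted twice when you add up the groups, while the second query counts it only once.
Another possibility is that you have NULL values for customer_id and the second query filters these out.
The third possibility is that NULL values of lead_source_id are adding additional values in the first query.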
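The first explanation is easy to reproduce. The sketch below assumes Python's sqlite3 module and a cut-down customers_history table; the row that shares a timestamp across two lead sources is hypothetical, added only to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers_history (lead_source_id INTEGER, repurchased_date TEXT)"
)
conn.executemany(
    "INSERT INTO customers_history VALUES (?, ?)",
    [
        (4, "2014-04-14 09:16:48"),
        (4, "2014-04-14 09:16:48"),
        (4, "2014-04-14 09:16:49"),
        (37, "2014-04-14 09:16:48"),  # hypothetical: same timestamp, other lead source
        (37, "2014-04-14 09:18:41"),
    ],
)

# Distinct timestamps counted separately inside each lead source.
per_source = conn.execute(
    "SELECT lead_source_id, COUNT(DISTINCT repurchased_date) "
    "FROM customers_history GROUP BY lead_source_id ORDER BY lead_source_id"
).fetchall()

# Distinct timestamps counted once across the whole table.
overall = conn.execute(
    "SELECT COUNT(DISTINCT repurchased_date) FROM customers_history"
).fetchone()[0]

summed = sum(count for _, count in per_source)
print(per_source, summed, overall)
```

The shared timestamp is counted once per lead source in the grouped query, so the summed figure is larger than the overall distinct count by one here.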

Get the balance of my users in the same table

Help please, I have a table like this:
| ID | userId | amount | type |
-------------------------------------
| 1 | 10 | 10 | expense |
| 2 | 10 | 22 | income |
| 3 | 3 | 25 | expense |
| 4 | 3 | 40 | expense |
| 5 | 3 | 63 | income |
I'm looking for a way to use one query to retrieve the balance of each user.
The hard part is that the amounts have to be added for incomes and subtracted for expenses.
This would be the result table:
| userId | balance |
--------------------
| 10 | 12 |
| 3 | -2 |
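You need to compute the total income and the total expense per user in two subqueries, then join them so you can subtract the expense from the income. Note that the INNER JOIN drops users who have only income rows or only expense rows.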
SELECT a.UserID,
(b.totalIncome - a.totalExpense) `balance`
FROM
(
SELECT userID, SUM(amount) totalExpense
FROM myTable
WHERE type = 'expense'
GROUP BY userID
) a INNER JOIN
(
SELECT userID, SUM(amount) totalIncome
FROM myTable
WHERE type = 'income'
GROUP BY userID
) b on a.userID = b.userid
This is easiest to do with a single group by:
select userId,
sum(case when type = 'income' then amount else -amount end) as balance
from myTable
group by userId
You could have 2 sub-queries, each grouped by id: one sums the incomes, the other the expenses. Then you could join these together, so that each row had an id, the sum of the expenses and the sum of the income(s), from which you can easily compute the balance.
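The single GROUP BY variant can be verified against the sample table from the question. This is a sketch assuming Python's sqlite3 module in place of MySQL, with the table name myTable taken from the first answer and an ORDER BY added for a stable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (ID INTEGER, userId INTEGER, amount INTEGER, type TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        (1, 10, 10, "expense"),
        (2, 10, 22, "income"),
        (3, 3, 25, "expense"),
        (4, 3, 40, "expense"),
        (5, 3, 63, "income"),
    ],
)

# Incomes count positively, every other type counts negatively.
query = """
SELECT userId,
       SUM(CASE WHEN type = 'income' THEN amount ELSE -amount END) AS balance
FROM myTable
GROUP BY userId
ORDER BY userId
"""
balances = conn.execute(query).fetchall()
for row in balances:
    print(row)
```

Unlike the two-subquery join, this form also returns users who have only expenses or only incomes.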