I'm trying to count the number of sales orders that have been canceled in a time period. The problem is that my query doesn't return the users whose count is zero.
My table
+---------------+------------+------------------+
| metrausername | signupdate | cancellationdate |
+---------------+------------+------------------+
| GLO00026 | 2017-06-22 | 2017-03-20 |
| GLO00055 | 2017-06-22 | 2017-04-18 |
| GLO00022 | 2017-06-27 | NULL |
| GLO00044 | 2017-06-24 | NULL |
| GLO00005 | 2017-06-26 | NULL |
+---------------+------------+------------------+
The statement I'm trying to count with:
SELECT metrausername, COUNT(*) AS count FROM salesdata2
WHERE cancellationdate IS NOT NULL
AND signupDate >= '2017-6-21' AND signupDate <= '2017-7-20'
GROUP BY metrausername;
Let me know if any additional information would help
If the metrausername is filtered out by the where, it won't appear. Left join to the aggregation to get round this:
select distinct a1.metrausername, coalesce(a2.counted,0) as counted -- coalesce replaces null with a value
from salesdata2 a1
left join
(
SELECT metrausername, COUNT(*) AS counted
FROM salesdata2
WHERE cancellationdate IS NOT NULL
AND signupDate >= '2017-6-21' AND signupDate <= '2017-7-20'
GROUP BY metrausername
) a2
on a1.metrausername = a2.metrausername
I would just do this by moving the filtering clause to the select. Assuming you really do want the date range (as opposed to having users outside the range), then:
SELECT metrausername, COUNT(cancellationdate ) AS count
FROM salesdata2
WHERE signupDate >= '2017-06-21' AND signupDate <= '2017-07-20'
GROUP BY metrausername;
COUNT(<colname>) counts the non-NULL values, so this seems like the simplest approach.
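As a rough check of the difference between the two approaches, here is a minimal sketch that loads the sample rows into an in-memory SQLite database (an assumption: SQLite stands in for MySQL, and the dates are stored as zero-padded ISO text). It only illustrates that COUNT(column) skips NULLs while a WHERE filter removes the rows before grouping.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; COUNT(col) ignores NULLs in both.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE salesdata2 (metrausername TEXT, signupdate TEXT, cancellationdate TEXT)"
)
conn.executemany(
    "INSERT INTO salesdata2 VALUES (?, ?, ?)",
    [
        ("GLO00026", "2017-06-22", "2017-03-20"),
        ("GLO00055", "2017-06-22", "2017-04-18"),
        ("GLO00022", "2017-06-27", None),
        ("GLO00044", "2017-06-24", None),
        ("GLO00005", "2017-06-26", None),
    ],
)

# Filtering in WHERE drops the non-cancelled rows before GROUP BY sees them.
where_filtered = conn.execute("""
    SELECT metrausername, COUNT(*) FROM salesdata2
    WHERE cancellationdate IS NOT NULL
      AND signupdate >= '2017-06-21' AND signupdate <= '2017-07-20'
    GROUP BY metrausername ORDER BY metrausername
""").fetchall()

# Counting the column keeps every user in the range and yields 0 for NULLs.
count_column = conn.execute("""
    SELECT metrausername, COUNT(cancellationdate) FROM salesdata2
    WHERE signupdate >= '2017-06-21' AND signupdate <= '2017-07-20'
    GROUP BY metrausername ORDER BY metrausername
""").fetchall()

print(where_filtered)
print(count_column)
```

The second result keeps all five users, three of them with a count of 0.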
Related
I have a MySql table of users order and it has columns such as:
user_id | timestamp | is_order_Parent | Status |
1 | 10-02-2020 | N | C |
2 | 11-02-2010 | Y | D |
3 | 11-02-2020 | N | C |
1 | 12-02-2010 | N | C |
1 | 15-02-2020 | N | C |
2 | 15-02-2010 | N | C |
I want to count the number of new customers per day, where a new customer is one who places a non-parent order with status C. Once a user has been counted on one day, they must not be counted again on any other day.
An ideal resulted table will be:
Timestamp: Day | Distinct values of User ID
10-02-2020 | 1
11-02-2010 | 1
12-02-2010 | 0 <--- already counted user_id = 1 above, so no need to count it here
15-02-2010 | 1
table name is cscart_orders
If you are running MySQL 8.0, you can do this with window functions and aggregation:
select timestamp, sum(timestamp = timestamp0) new_users
from (
select
t.*,
min(case when is_order_parent = 'N' and status = 'C' then timestamp end) over(partition by user_id) timestamp0
from mytable t
) t
group by timestamp
The window min() computes the timestamp when each user became a "new user". Then, the outer query aggregates by date, and counts how many new users were found on that date.
A nice thing about this approach is that it does not require enumerating the dates separately.
You can use two levels of aggregation:
select first_timestamp, count(*)
from (select t.user_id, min(timestamp) as first_timestamp
from t
where is_order_parent = 'N' and status = 'C'
group by t.user_id
) t
group by first_timestamp;
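A minimal sketch of the two-level aggregation, run against in-memory SQLite (an assumption standing in for MySQL). The sample rows are illustrative: the dates are rewritten as ISO text and normalised to a single year, which is not how the question lists them.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, and dates are ISO text in one year.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (user_id INTEGER, timestamp TEXT, is_order_parent TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "2020-02-10", "N", "C"),
        (2, "2020-02-11", "Y", "D"),
        (3, "2020-02-11", "N", "C"),
        (1, "2020-02-12", "N", "C"),
        (1, "2020-02-15", "N", "C"),
        (2, "2020-02-15", "N", "C"),
    ],
)

# Inner level: first qualifying order date per user.
# Outer level: how many users had their first qualifying order on each date.
new_users = conn.execute("""
    SELECT first_timestamp, COUNT(*)
    FROM (SELECT user_id, MIN(timestamp) AS first_timestamp
          FROM t
          WHERE is_order_parent = 'N' AND status = 'C'
          GROUP BY user_id) firsts
    GROUP BY first_timestamp
    ORDER BY first_timestamp
""").fetchall()

print(new_users)
```

Note that a day with no new users (2020-02-12 in this data) produces no row at all with this form; getting an explicit 0 for such days would need a separate list of dates to join against.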
How can I query from this data:
parking_place | number_of_month | from_date | end_date | monthly_unit_price
A | 3 | 2018-01 | 2018-03 | 3000000
The desired result:
parking_place | month | monthly_unit_price
A | 2018-01 | 3000000
A | 2018-02 | 3000000
A | 2018-03 | 3000000
Please suggest how to write this query.
You may join using a calendar table:
SELECT
t.parking_place,
months.month,
t.monthly_unit_price
FROM
(
SELECT '2018-01' AS month UNION ALL
SELECT '2018-02' UNION ALL
SELECT '2018-03'
) months
INNER JOIN yourTable t
ON months.month BETWEEN t.from_date AND t.end_date
ORDER BY
months.month;
Note that it would be better to store actual valid date literals to represent each month. For example, instead of storing the text '2018-01', you could store 2018-01-01 as a date literal.
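If hard-coding the calendar rows is not practical, one alternative (a different technique from the fixed UNION ALL list above) is to generate the months with a recursive CTE. The sketch below runs on in-memory SQLite as a stand-in; the table name parking is hypothetical, and MySQL 8.0 would need its own date arithmetic in place of SQLite's date(..., '+1 month').

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; "parking" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parking (parking_place TEXT, number_of_month INTEGER, "
    "from_date TEXT, end_date TEXT, monthly_unit_price INTEGER)"
)
conn.execute("INSERT INTO parking VALUES ('A', 3, '2018-01', '2018-03', 3000000)")

# Turn each 'YYYY-MM' into a real first-of-month date, then step one month
# at a time until the end month is reached.
expanded = conn.execute("""
    WITH RECURSIVE months(parking_place, month_start, last_start, monthly_unit_price) AS (
        SELECT parking_place, from_date || '-01', end_date || '-01', monthly_unit_price
        FROM parking
        UNION ALL
        SELECT parking_place, date(month_start, '+1 month'), last_start, monthly_unit_price
        FROM months
        WHERE month_start < last_start
    )
    SELECT parking_place, substr(month_start, 1, 7) AS month, monthly_unit_price
    FROM months
    ORDER BY parking_place, month_start
""").fetchall()

print(expanded)
```

Working with first-of-month dates inside the CTE is the same idea as the note above: real dates make the month stepping straightforward.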
I have created 3 indexes:
click_date - transaction table, daily_metric table
order_date - transaction table
I want to check whether my query uses these indexes. I ran EXPLAIN and got this result:
+----+--------------+--------------+-------+---------------+------------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+--------------+-------+---------------+------------+---------+------+--------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 668 | Using temporary; Using filesort |
| 2 | DERIVED | <derived3> | ALL | NULL | NULL | NULL | NULL | 645 | |
| 2 | DERIVED | <derived4> | ALL | NULL | NULL | NULL | NULL | 495 | |
| 4 | DERIVED | transaction | ALL | order_date | NULL | NULL | NULL | 291257 | Using where; Using temporary; Using filesort |
| 3 | DERIVED | daily_metric | range | click_date | click_date | 3 | NULL | 812188 | Using where; Using temporary; Using filesort |
| 5 | UNION | <derived7> | ALL | NULL | NULL | NULL | NULL | 495 | |
| 5 | UNION | <derived6> | ALL | NULL | NULL | NULL | NULL | 645 | Using where; Not exists |
| 7 | DERIVED | transaction | ALL | order_date | NULL | NULL | NULL | 291257 | Using where; Using temporary; Using filesort |
| 6 | DERIVED | daily_metric | range | click_date | click_date | 3 | NULL | 812188 | Using where; Using temporary; Using filesort |
| NULL | UNION RESULT | <union2,5> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+--------------+-------+---------------+------------+---------+------+--------+----------------------------------------------+
In the EXPLAIN results I see that the index order_date of the transaction table is not used. Do I understand that correctly?
The index click_date of the daily_metric table was used, correct?
Please tell me how to read the EXPLAIN result to see whether the indexes I created are used properly by the query.
My query:
SELECT
partner_id,
the_date,
SUM(clicks) as clicks,
SUM(total_count) as total_count,
SUM(count) as count,
SUM(total_sum) as total_sum,
SUM(received_sum) as received_sum,
SUM(partner_fee) as partner_fee
FROM (
SELECT
clicks.partner_id,
clicks.click_date as the_date,
clicks,
orders.total_count,
orders.count,
orders.total_sum,
orders.received_sum,
orders.partner_fee
FROM
(SELECT
partner_id, click_date, sum(clicks) as clicks
FROM
daily_metric WHERE DATE(click_date) BETWEEN '2013-04-01' AND '2013-04-30'
GROUP BY partner_id , click_date) as clicks
LEFT JOIN
(SELECT
partner_id,
DATE(order_date) as order_dates,
SUM(order_sum) as total_sum,
SUM(customer_paid_sum) as received_sum,
SUM(partner_fee) as partner_fee,
count(*) as total_count,
count(CASE
WHEN status = 1 THEN 1
ELSE NULL
END) as count
FROM
transaction WHERE DATE(order_date) BETWEEN '2013-04-01' AND '2013-04-30'
GROUP BY DATE(order_date) , partner_id) as orders ON orders.partner_id = clicks.partner_id AND clicks.click_date = orders.order_dates
UNION ALL SELECT
orders.partner_id,
orders.order_dates as the_date,
clicks,
orders.total_count,
orders.count,
orders.total_sum,
orders.received_sum,
orders.partner_fee
FROM
(SELECT
partner_id, click_date, sum(clicks) as clicks
FROM
daily_metric WHERE DATE(click_date) BETWEEN '2013-04-01' AND '2013-04-30'
GROUP BY partner_id , click_date) as clicks
RIGHT JOIN
(SELECT
partner_id,
DATE(order_date) as order_dates,
SUM(order_sum) as total_sum,
SUM(customer_paid_sum) as received_sum,
SUM(partner_fee) as partner_fee,
count(*) as total_count,
count(CASE
WHEN status = 1 THEN 1
ELSE NULL
END) as count
FROM
transaction WHERE DATE(order_date) BETWEEN '2013-04-01' AND '2013-04-30'
GROUP BY DATE(order_date) , partner_id) as orders ON orders.partner_id = clicks.partner_id AND clicks.click_date = orders.order_dates
WHERE
clicks.partner_id is NULL
ORDER BY the_date DESC
) as t
GROUP BY the_date ORDER BY the_date DESC LIMIT 50 OFFSET 0
Although I can't explain what the EXPLAIN has dumped, I thought there must be an easier solution to what you have and came up with the following. I would suggest the following indexes to optimize your existing query for the WHERE date range and grouping by partner.
Additionally, when a query applies a FUNCTION to a column, such as your DATE(order_date) and DATE(click_date), it can't take full advantage of an index on that column. To let the index be used, compare the raw column against the full date/time range, from 12:00am on the first day up to 11:59:59pm on the last. I would typically do this via
x >= someDate (at 12:00am) AND x < firstDayAfterRange
In your example that would be (notice the less-than May 1st, which covers everything up to April 30th at 11:59:59pm):
click_date >= '2013-04-01' AND click_date < '2013-05-01'
Table Index
transaction (order_date, partner_id)
daily_metric (click_date, partner_id)
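To illustrate why the half-open range is a safe replacement for the DATE() filter, here is a small sketch on in-memory SQLite (an assumption standing in for MySQL, with date/times stored as ISO text and made-up rows). It only compares which rows each predicate selects; it does not measure index usage.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_metric (partner_id INTEGER, click_date TEXT, clicks INTEGER)")
conn.execute("CREATE INDEX idx_click ON daily_metric (click_date, partner_id)")
conn.executemany(
    "INSERT INTO daily_metric VALUES (?, ?, ?)",
    [
        (1, "2013-03-31 23:59:59", 5),
        (1, "2013-04-01 00:00:00", 7),
        (2, "2013-04-30 23:59:59", 3),
        (2, "2013-05-01 00:00:00", 9),
    ],
)

def clicks_where(predicate):
    # Return the click counts of the rows matched by the given predicate.
    sql = f"SELECT clicks FROM daily_metric WHERE {predicate} ORDER BY click_date"
    return [row[0] for row in conn.execute(sql)]

# Function on the column: correct rows, but the column value is hidden from the index.
function_filter = clicks_where("date(click_date) BETWEEN '2013-04-01' AND '2013-04-30'")
# Half-open range on the raw column: same rows, and the predicate is index-friendly.
half_open = clicks_where("click_date >= '2013-04-01' AND click_date < '2013-05-01'")
# Plain BETWEEN on the raw column loses everything after midnight on the last day.
naive_between = clicks_where("click_date BETWEEN '2013-04-01' AND '2013-04-30'")

print(function_filter, half_open, naive_between)
```

The half-open form matches the DATE() filter exactly, whereas a plain BETWEEN on the raw column silently drops the late-evening row on April 30th.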
Now, an adjustment. Since your clicks table may have entries the transactions dont, and vice-versa, I would adjust this query to do a pre-query of all possible date/partners, then left-join to respective aggregate queries such as:
SELECT
AllPartners.Partner_ID,
AllPartners.the_Date,
coalesce( clicks.clicks, 0 ) Clicks,
coalesce( orders.total_count, 0 ) TotalCount,
coalesce( orders.count, 0 ) OrderCount,
coalesce( orders.total_sum, 0 ) OrderSum,
coalesce( orders.received_sum, 0 ) ReceivedSum,
coalesce( orders.partner_fee, 0 ) PartnerFee
from
( select distinct
dm.partner_id,
DATE( dm.click_date ) as the_Date
FROM
daily_metric dm
WHERE
dm.click_date >= '2013-04-01' AND dm.click_date < '2013-05-01'
UNION
select
t.partner_id,
DATE(t.order_date) as the_Date
FROM
transaction t
WHERE
t.order_date >= '2013-04-01' AND t.order_date < '2013-05-01' ) AllPartners
LEFT JOIN
( SELECT
dm.partner_id,
DATE( dm.click_date ) sumDate,
sum( dm.clicks) as clicks
FROM
daily_metric dm
WHERE
dm.click_date >= '2013-04-01' AND dm.click_date < '2013-05-01'
GROUP BY
dm.partner_id,
DATE( dm.click_date ) ) as clicks
ON AllPartners.partner_id = clicks.partner_id
AND AllPartners.the_date = clicks.sumDate
LEFT JOIN
( SELECT
t.partner_id,
DATE(t.order_date) as sumDate,
SUM(t.order_sum) as total_sum,
SUM(t.customer_paid_sum) as received_sum,
SUM(t.partner_fee) as partner_fee,
count(*) as total_count,
count(CASE WHEN t.status = 1 THEN 1 ELSE NULL END) as COUNT
FROM
transaction t
WHERE
t.order_date >= '2013-04-01' AND t.order_date < '2013-05-01'
GROUP BY
t.partner_id,
DATE(t.order_date) ) as orders
ON AllPartners.partner_id = orders.partner_id
AND AllPartners.the_date = orders.sumDate
order by
AllPartners.the_date DESC
limit 50 offset 0
This way, the first query will be quick on the index to get all possible combinations from EITHER table. Then the left-join will AT MOST join to one row per set. If found, get the number, if not, I am applying COALESCE() so if null, defaults to zero.
CLARIFICATION.
Just as you built your pre-aggregate queries "clicks" and "orders", "AllPartners" is the ALIAS for the result of the select distinct of partners and dates within the date range you are interested in. Its resulting columns are "partner_id" and "the_date", and these are the basis for joining to the aggregates "clicks" and "orders". Since those two columns always exist in the alias "AllPartners", I used them for the field list: the other aliases are LEFT-JOINed and may have no matching row for a given partner and date.
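The shape of that query (a distinct list of partner/date keys, then one LEFT JOIN per aggregate) can be tried out on a small scale. This is a sketch on in-memory SQLite with made-up rows; the table is named transactions here only because TRANSACTION is a keyword in SQLite, and only a subset of the columns is included.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; "transactions" is a renamed,
# trimmed-down version of the transaction table; all rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_metric (partner_id INTEGER, click_date TEXT, clicks INTEGER)")
conn.execute("CREATE TABLE transactions (partner_id INTEGER, order_date TEXT, order_sum INTEGER)")
conn.executemany(
    "INSERT INTO daily_metric VALUES (?, ?, ?)",
    [(1, "2013-04-01", 10), (1, "2013-04-01", 5), (2, "2013-04-02", 7)],
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, "2013-04-01 09:30:00", 100), (3, "2013-04-02 14:00:00", 50)],
)

report = conn.execute("""
    SELECT k.partner_id, k.the_date,
           COALESCE(c.clicks, 0), COALESCE(o.total_count, 0), COALESCE(o.total_sum, 0)
    FROM (  -- every partner/date pair that appears in EITHER table
            SELECT partner_id, date(click_date) AS the_date FROM daily_metric
            WHERE click_date >= '2013-04-01' AND click_date < '2013-05-01'
            UNION
            SELECT partner_id, date(order_date) FROM transactions
            WHERE order_date >= '2013-04-01' AND order_date < '2013-05-01') k
    LEFT JOIN (SELECT partner_id, date(click_date) AS sum_date, SUM(clicks) AS clicks
               FROM daily_metric
               WHERE click_date >= '2013-04-01' AND click_date < '2013-05-01'
               GROUP BY partner_id, date(click_date)) c
           ON c.partner_id = k.partner_id AND c.sum_date = k.the_date
    LEFT JOIN (SELECT partner_id, date(order_date) AS sum_date,
                      COUNT(*) AS total_count, SUM(order_sum) AS total_sum
               FROM transactions
               WHERE order_date >= '2013-04-01' AND order_date < '2013-05-01'
               GROUP BY partner_id, date(order_date)) o
           ON o.partner_id = k.partner_id AND o.sum_date = k.the_date
    ORDER BY k.the_date DESC, k.partner_id
""").fetchall()

for row in report:
    print(row)
```

Partner 2 has clicks but no orders and partner 3 has an order but no clicks; both still appear once, with zeros filled in by COALESCE.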
All right, so here's a challenge for all you SQL pros:
I have a table with two columns of interest, group and birthdate. Only some rows have a group assigned to them.
I now want to print all rows sorted by birthdate, but I also want all rows with the same group to end up next to each other. The only semi-sensible way of doing this would be to use the groups' average birthdates for all the rows in the group when sorting. The question is, can this be done with pure SQL (MySQL in this instance), or will some scripting logic be required?
To illustrate, with the given table:
id | group | birthdate
---+-------+-----------
1 | 1 | 1989-12-07
2 | NULL | 1990-03-14
3 | 1 | 1987-05-25
4 | NULL | 1985-09-29
5 | NULL | 1988-11-11
and let's say that the "average" of 1987-05-25 and 1989-12-07 is 1988-08-30 (this can be found by averaging the UNIX timestamp equivalents of the dates and then converting back to a date. This average doesn't have to be completely correct!).
The output should then be:
id | group | birthdate | [sort_by_birthdate]
---+-------+------------+--------------------
4 | NULL | 1985-09-29 | 1985-09-29
3 | 1 | 1987-05-25 | 1988-08-30
1 | 1 | 1989-12-07 | 1988-08-30
5 | NULL | 1988-11-11 | 1988-11-11
2 | NULL | 1990-03-14 | 1990-03-14
Any ideas?
Cheers,
Jon
I normally program in T-SQL, so please forgive me if I don't translate the date functions perfectly to MySQL:
SELECT
T.id,
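T.`group`
FROM
Some_Table T
LEFT OUTER JOIN (
SELECT
`group`,
-- MySQL's DATEDIFF(a, b) returns a - b in days, so birthdate goes first
'1970-01-01' +
INTERVAL AVG(DATEDIFF(birthdate, '1970-01-01')) DAY AS avg_birthdate
FROM
Some_Table T2
GROUP BY
`group`
) SQ ON SQ.`group` = T.`group`
ORDER BY
COALESCE(SQ.avg_birthdate, T.birthdate),
-- group is a reserved word in MySQL, hence the backticks
T.`group`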
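To check the ordering against the expected output in the question, here is a sketch on in-memory SQLite rather than MySQL. It averages julianday() values instead of DATEDIFF offsets, and it renames the group column to grp and the table to people; both names are hypothetical. A final sort on birthdate is added so rows inside a group come out in a stable order.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; "people" and "grp" are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, grp INTEGER, birthdate TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, 1, "1989-12-07"),
        (2, None, "1990-03-14"),
        (3, 1, "1987-05-25"),
        (4, None, "1985-09-29"),
        (5, None, "1988-11-11"),
    ],
)

# Grouped rows sort by their group's average birthdate; ungrouped rows
# fall back to their own birthdate via COALESCE.
rows = conn.execute("""
    SELECT p.id,
           date(COALESCE(g.avg_jd, julianday(p.birthdate))) AS sort_by_birthdate
    FROM people p
    LEFT JOIN (SELECT grp, AVG(julianday(birthdate)) AS avg_jd
               FROM people
               WHERE grp IS NOT NULL
               GROUP BY grp) g
           ON g.grp = p.grp
    ORDER BY COALESCE(g.avg_jd, julianday(p.birthdate)), p.grp, p.birthdate
""").fetchall()

ordered_ids = [row[0] for row in rows]
print(ordered_ids)
```

With the sample data the two members of group 1 land between ids 4 and 5, as in the question's expected output.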
I currently have quite a messy query, which joins data from multiple tables involving two subqueries. I now have a requirement to group this data by DAY(), WEEK(), MONTH(), and QUARTER().
I have three tables: days, qos and employees. An employee is self-explanatory, a day is a summary of an employee's performance on a given day, and qos is a random quality inspection, which can be performed many times a day.
At the moment, I am selecting all employees, and LEFT JOINing day and qos, which works well. However, now, I need to group the data in order to breakdown a team or individual's performance over a date range.
Taking this data:
Employee
id | name
------------------
1 | Bob Smith
Day
id | employee_id | day_date | calls_taken
---------------------------------------------
1 | 1 | 2011-03-01 | 41
2 | 1 | 2011-03-02 | 24
3 | 1 | 2011-04-01 | 35
Qos
id | employee_id | qos_date | score
----------------------------------------
1 | 1 | 2011-03-03 | 85
2 | 1 | 2011-03-03 | 95
3 | 1 | 2011-04-01 | 91
If I were to start by grouping by DAY(), I would need to see the following results:
Day__date | Day__Employee__id | Day__calls | Day__qos_score
------------------------------------------------------------
2011-03-01 | 1 | 41 | NULL
2011-03-02 | 1 | 24 | NULL
2011-03-03 | 1 | NULL | 90
2011-04-01 | 1 | 35 | 91
As you see, Day__calls should be SUM(calls_taken) and Day__qos_score is AVG(score). I've tried using a similar method as above, but as the date isn't known until one of the tables has been joined, it's only displaying a record where there's a day saved.
Is there any way of doing this, or am I going about things the wrong way?
Edit: As requested, here's what I've come up with so far. However, it only shows dates where there's a day.
SELECT COALESCE(`day`.day_date, qos.qos_date) AS Day__date,
employee.id AS Day__Employee__id,
`day`.calls_taken AS Day__Day__calls,
qos.score AS Day__Qos__score
FROM faults_employees `employee`
LEFT JOIN (SELECT `day`.employee_id AS employee_id,
SUM(`day`.calls_taken) AS `calls_in`,
FROM faults_days AS `day`
WHERE employee.id = 7
GROUP BY (`day`.day_date)
) AS `day`
ON `day`.employee_id = `employee`.id
LEFT JOIN (SELECT `qos`.employee_id AS employee_id,
AVG(qos.score) AS `score`
FROM faults_qos qos
WHERE employee.id = 7
GROUP BY (qos.qos_date)
) AS `qos`
ON `qos`.employee_id = `employee`.id AND `qos`.qos_date = `day`.day_date
WHERE employee.id = 7
GROUP BY Day__date
ORDER BY `day`.day_date ASC
The solution I'm coming up with looks like:
SELECT
`date`,
`employee_id`,
SUM(`union`.`calls_taken`) AS `calls_taken`,
AVG(`union`.`score`) AS `score`
FROM ( -- select from union table
(SELECT -- first select all calls taken, leaving qos_score null
`day`.`day_date` AS `date`,
`day`.`employee_id`,
`day`.`calls_taken`,
NULL AS `score`
FROM `employee`
LEFT JOIN
`day`
ON `day`.`employee_id` = `employee`.`id`
)
UNION ALL -- union both tables; ALL keeps identical rows so SUM() and AVG() see every record
(
SELECT -- now select qos score, leaving calls taken null
`qos`.`qos_date` AS `date`,
`qos`.`employee_id`,
NULL AS `calls_taken`,
`qos`.`score`
FROM `employee`
LEFT JOIN
`qos`
ON `qos`.`employee_id` = `employee`.`id`
)
) `union`
GROUP BY `union`.`date`, `union`.`employee_id` -- group union table by date and employee
For the UNION to work, we have to select NULL as the score column in the day branch and NULL as the calls_taken column in the qos branch. If we don't, both calls_taken and score would be selected into the same column by the UNION statement.
After this, I selected the required fields with the aggregation functions SUM() and AVG() from the union'd table, grouping by the date field in the union table.
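The same union-then-aggregate idea can be checked against the sample data from the question. This sketch uses in-memory SQLite as a stand-in for MySQL, the hypothetical table names days and qos, and UNION ALL so that identical rows are not collapsed before aggregation; the join to the employee table is left out for brevity.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; "days" and "qos" are hypothetical
# table names holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE days (employee_id INTEGER, day_date TEXT, calls_taken INTEGER)")
conn.execute("CREATE TABLE qos (employee_id INTEGER, qos_date TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO days VALUES (?, ?, ?)",
    [(1, "2011-03-01", 41), (1, "2011-03-02", 24), (1, "2011-04-01", 35)],
)
conn.executemany(
    "INSERT INTO qos VALUES (?, ?, ?)",
    [(1, "2011-03-03", 85), (1, "2011-03-03", 95), (1, "2011-04-01", 91)],
)

# Each branch pads the column it lacks with NULL; the aggregates then ignore
# those NULLs, so a date present in only one table still yields a row.
summary = conn.execute("""
    SELECT the_date, employee_id, SUM(calls_taken), AVG(score)
    FROM (SELECT day_date AS the_date, employee_id, calls_taken, NULL AS score
          FROM days
          UNION ALL
          SELECT qos_date, employee_id, NULL, score
          FROM qos) u
    GROUP BY the_date, employee_id
    ORDER BY the_date
""").fetchall()

for row in summary:
    print(row)
```

The row for 2011-03-03 has no calls but an average score of 90, which is the case the original joined query could not produce.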