If I don't use GROUP_CONCAT(), I have no difficulty grouping and ordering the rows by day, month and year.
With the following code:
SELECT FROM_UNIXTIME(orders.date_time,'%d %m %Y') AS date,
SUM(orders.net_amount) AS total_sales,
COUNT(FROM_UNIXTIME(orders.date_time,'%D %b %Y')) AS total_orders
FROM orders
JOIN users ON orders.user_id = users.id
WHERE FROM_UNIXTIME(orders.date_time,'%d %m %Y') != DATE_FORMAT(users.reg_date_time, '%d %m %Y')
GROUP BY date
ORDER BY Month(1)
O/P:
21 12 2019 1092 1 pinky
04 01 2020 1050 1 harshit
30 12 2019 21 1 robin
05 01 2020 987 2 chetan
31 12 2019 1239 2 rahul
30 11 2019 157.5 1 rahul
01 01 2020 651 1 rahul
15 12 2019 1575 1 isha
03 01 2020 598.5 1 manvi
See, the names are not concatenated.
But as soon as I add this line:
GROUP_CONCAT(users.firstname SEPARATOR '-') AS names
like this:
SELECT FROM_UNIXTIME(orders.date_time,'%d %m %Y') AS date,
SUM(orders.net_amount) AS total_sales,
GROUP_CONCAT(users.firstname SEPARATOR '-') AS names,
COUNT(FROM_UNIXTIME(orders.date_time,'%D %b %Y')) AS total_orders
FROM orders
JOIN users ON orders.user_id = users.id
WHERE FROM_UNIXTIME(orders.date_time,'%d %m %Y') != DATE_FORMAT(users.reg_date_time, '%d %m %Y')
GROUP BY date
ORDER BY Month(1)
O/P:
01 01 2020 651 1 rahul
03 01 2020 598.5 1 manvi
04 01 2020 1050 1 harshit
05 01 2020 987 2 chetan-saurabh
15 12 2019 1575 1 isha
21 12 2019 1092 1 pinky
30 11 2019 157.5 1 rahul
30 12 2019 21 1 robin
31 12 2019 1239 2 rahul-manvi
then the rows come out ordered by day only (without proper month and year order), although the grouping is correct.
Am I doing something wrong?
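Use ORDER BY MONTH(FROM_UNIXTIME(orders.date_time)), since orders.date_time is a Unix timestamp and MONTH() needs a date. The problem is that your date column is a formatted string such as '21 12 2019', not a valid MySQL date, so the month cannot be extracted from it correctly. Note also that ORDER BY Month(1) applies MONTH() to the constant 1 rather than to the first column, so it does not sort by anything useful. If the data spans more than one year, sort by YEAR(FROM_UNIXTIME(orders.date_time)) first.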
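To see why the formatted column cannot be used for ordering, here is a small Python sketch (not MySQL) that sorts a few of the labels from the question twice: once as plain strings and once after parsing them back into dates. The list name `labels` is only illustrative.

```python
from datetime import datetime

# A few formatted dates ('%d %m %Y') copied from the question's output.
labels = ["21 12 2019", "04 01 2020", "30 11 2019", "01 01 2020", "15 12 2019"]

# A plain string sort compares the day digits first, so months and years get mixed up.
as_strings = sorted(labels)

# Parsing each label back into a real date gives chronological order.
as_dates = sorted(labels, key=lambda s: datetime.strptime(s, "%d %m %Y"))

print(as_strings)
print(as_dates)
```

The string sort reproduces the day-first order seen in the second output, while the date sort starts with 30 11 2019 and ends with 04 01 2020.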
I am trying to find the last entry for the previous quarter.
All I have access to is the year, e.g. 2021, and the quarter, e.g. 1.
Here is the data in my database:
id  name            start       end
16  April 2021      2021-04-01  2021-04-30
15  March 2021      2021-03-01  2021-03-31
14  February 2021   2021-02-01  2021-02-28
57  November 2020   2020-11-01  2020-11-30
55  October 2020    2020-10-01  2020-10-31
29  September 2020  2020-09-01  2020-09-30
27  July 2020       2020-07-01  2020-07-31
24  April 2020      2020-04-01  2020-04-30
23  March 2020      2020-03-01  2020-03-31
22  February 2020   2020-02-01  2020-02-29
21  January 2020    2020-01-01  2020-01-31
Using the MySQL quarter function I can get it to print out the quarter as an integer in another column:
SET @given_year = 2021;
SET @given_quarter = 1;
SELECT
id, name, start, end, QUARTER(end) as "Q"
FROM
submissions
id  name            start       end         Q
16  April 2021      2021-04-01  2021-04-30  2
15  March 2021      2021-03-01  2021-03-31  1
14  February 2021   2021-02-01  2021-02-28  1
57  November 2020   2020-11-01  2020-11-30  4
55  October 2020    2020-10-01  2020-10-31  4
29  September 2020  2020-09-01  2020-09-30  3
27  July 2020       2020-07-01  2020-07-31  3
24  April 2020      2020-04-01  2020-04-30  2
23  March 2020      2020-03-01  2020-03-31  1
22  February 2020   2020-02-01  2020-02-29  1
21  January 2020    2020-01-01  2020-01-31  1
I tried using WHERE and LIKE but it is returning 0 rows:
SELECT * FROM (
SELECT
id, name, start, end, QUARTER(end) as "Q"
FROM
submissions as s
) AS vs
WHERE
vs.end
LIKE
@given_year
AND
vs.Q < @given_quarter
I also need to account for the possibility that there may be no rows this year and I need to find the previous year.
For example with these two rows, if I was passed the year 2021 and quarter 1 I would need to return November of the previous year and a different quarter.
id  name           start       end         Q
14  February 2021  2021-02-01  2021-02-28  1
57  November 2020  2020-11-01  2020-11-30  4
If I understand correctly, you want all the rows from the quarter in the data before a given quarter. You can filter and use dense_rank():
select s.*
from (select s.*,
dense_rank() over (order by year(start) desc, quarter(start) desc) as seqnum
from submissions s
where year(start) < @given_year or
(year(start) = @given_year and quarter(start) < @given_quarter)
) s
where seqnum = 1;
The above returns all rows from the previous quarter (which is what I thought you wanted). If you want only one row:
select s.*
from submissions s
where year(start) < @given_year or
(year(start) = @given_year and quarter(start) < @given_quarter)
order by start desc
limit 1;
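The filter in both queries is a comparison of (year, quarter) pairs. As a rough illustration outside SQL, the Python sketch below applies the same rule to a few rows from the question; the helper names `quarter` and `last_before` are hypothetical and exist only for this example.

```python
from datetime import date

# Sample (id, name, start) rows copied from the question.
submissions = [
    (14, "February 2021", date(2021, 2, 1)),
    (57, "November 2020", date(2020, 11, 1)),
    (55, "October 2020", date(2020, 10, 1)),
    (29, "September 2020", date(2020, 9, 1)),
]

def quarter(d):
    """Quarter number 1-4, the same value MySQL's QUARTER() returns."""
    return (d.month - 1) // 3 + 1

def last_before(rows, given_year, given_quarter):
    """Return the latest row whose (year, quarter) precedes the given pair, or None."""
    earlier = [r for r in rows
               if (r[2].year, quarter(r[2])) < (given_year, given_quarter)]
    return max(earlier, key=lambda r: r[2], default=None)

print(last_before(submissions, 2021, 1))
```

Tuple comparison covers both branches of the WHERE clause: an earlier year wins outright, and within the same year the quarter decides. For year 2021 and quarter 1 this picks the November 2020 row, which is the case the question asks about.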
I am working on a table 'booking'.
I want to add a column 'date_of_stay' to this table, holding one row for each date of the stay of a booking_id, as given by the number of nights in column 'nights'.
For example:
booking_id booking_date nights date_of_stay
5001 Thu, 03 Nov 2016 7 Thu, 03 Nov 2016
5001 Thu, 03 Nov 2016 7 Fri, 04 Nov 2016
5001 Thu, 03 Nov 2016 7 Sat, 05 Nov 2016
5001 Thu, 03 Nov 2016 7 Sun, 06 Nov 2016
5001 Thu, 03 Nov 2016 7 Mon, 07 Nov 2016
5001 Thu, 03 Nov 2016 7 Tue, 08 Nov 2016
5001 Thu, 03 Nov 2016 7 Wed, 09 Nov 2016
5002 Thu, 03 Nov 2016 2 Thu, 03 Nov 2016
5002 Thu, 03 Nov 2016 2 Fri, 04 Nov 2016
What can be the simplest way of viewing my table like this without altering it?
Use a recursive CTE:
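with recursive cte as (
      select booking_id, booking_date, nights, 1 as n
      from booking
      union all
      select booking_id, booking_date, nights, 1 + n
      from cte
      where n < nights
     )
select cte.*,
       -- assumes booking_date is a DATE column; n counts the nights from 1
       booking_date + interval (n - 1) day as date_of_stay
from cte;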
According to this article, you can use the following query:
select booking.*, date_of_stay
from (
select adddate('2015-01-01', t3*1000 + t2*100 + t1*10 + t0) date_of_stay from
(select 0 t0 union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t0,
(select 0 t1 union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t1,
(select 0 t2 union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t2,
(select 0 t3 union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) t3
) v
join booking ON date_of_stay between
booking.booking_date and date_add(booking.booking_date, interval nights-1 day)
order by booking_id, date_of_stay
;
The above query works in MySQL 5 as well as MySQL 8. It is valid for all dates in the interval:
select '2015-01-01' from_date, adddate('2015-01-01',9*1000 + 9*100 + 9*10 + 9) to_date;
+------------+------------+
| from_date | to_date |
+------------+------------+
| 2015-01-01 | 2042-05-18 |
+------------+------------+
The query can be tested here: SQLize.online
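The upper bound of that interval follows from the four digit tables, which together produce offsets from 0 to 9999 days. A quick way to check the arithmetic outside MySQL is this small Python sketch:

```python
from datetime import date, timedelta

start = date(2015, 1, 1)

# The digit tables t3..t0 combine into day offsets 0..9999.
max_offset = 9 * 1000 + 9 * 100 + 9 * 10 + 9

last_day = start + timedelta(days=max_offset)
print(start, last_day)
```

Bookings that fall outside this window would need a different start date or an extra digit table.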
You can do it with a recursive CTE:
WITH RECURSIVE cte AS (
SELECT booking_id, booking_date, nights, 0 nr,
STR_TO_DATE(booking_date, '%a, %d %b %Y') date_of_stay
FROM booking
UNION ALL
SELECT booking_id, booking_date, nights, nr + 1,
date_of_stay + interval 1 day
FROM cte
WHERE nr < nights - 1
)
SELECT c.booking_id, c.booking_date, c.nights,
DATE_FORMAT(c.date_of_stay, '%a, %d %b %Y') date_of_stay
FROM cte c
ORDER BY c.booking_id, c.date_of_stay
See the demo.
Results:
> booking_id | booking_date | nights | date_of_stay
> ---------: | :--------------- | -----: | :---------------
> 5001 | Thu, 03 Nov 2016 | 7 | Thu, 03 Nov 2016
> 5001 | Thu, 03 Nov 2016 | 7 | Fri, 04 Nov 2016
> 5001 | Thu, 03 Nov 2016 | 7 | Sat, 05 Nov 2016
> 5001 | Thu, 03 Nov 2016 | 7 | Sun, 06 Nov 2016
> 5001 | Thu, 03 Nov 2016 | 7 | Mon, 07 Nov 2016
> 5001 | Thu, 03 Nov 2016 | 7 | Tue, 08 Nov 2016
> 5001 | Thu, 03 Nov 2016 | 7 | Wed, 09 Nov 2016
> 5002 | Thu, 03 Nov 2016 | 2 | Thu, 03 Nov 2016
> 5002 | Thu, 03 Nov 2016 | 2 | Fri, 04 Nov 2016
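For comparison, the same expansion can be sketched in plain Python, which may help when checking the query's output by hand. This is only an illustration under assumptions: the `bookings` list and the `expand` helper are hypothetical, and the date parsing relies on English day and month names (the default C locale).

```python
from datetime import datetime, timedelta

FMT = "%a, %d %b %Y"  # e.g. "Thu, 03 Nov 2016"; assumes English names

# (booking_id, booking_date, nights) rows from the question.
bookings = [(5001, "Thu, 03 Nov 2016", 7), (5002, "Thu, 03 Nov 2016", 2)]

def expand(rows):
    """Yield one (booking_id, booking_date, nights, date_of_stay) row per night."""
    for booking_id, booking_date, nights in rows:
        first = datetime.strptime(booking_date, FMT)
        for offset in range(nights):
            stay = first + timedelta(days=offset)
            yield booking_id, booking_date, nights, stay.strftime(FMT)

for row in expand(bookings):
    print(row)
```

The inner loop plays the role of the recursive member of the CTE: it starts at offset 0 and stops after nights - 1.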
Difficult query. Not sure if this is possible.
I am trying to figure out how to format a query that needs to accomplish several things.
(1) I need to parse the DATE field that is a VARCHAR field and not a sql date field to get the month of each date.
(2) I then need to AVG all the PTS fields by NAME and month. So with my example data below I would have a row for John, and John would have 2 in the JAN column and 3 in the APR column, and the AVG column would be an average of all the months. So the months are an average of all the entries in that month and the AVG column is an average of all the columns in the row.
Table:
Name (VARCHAR) PTS (INT) DATE (VARCHAR)
---------------------------------------------
John 3 Tue Apr 14 17:56:02 2020
Chris 2 Tue Apr 14 19:44:03 2020
John 2 Mon Jan 30 15:23:03 2020
Chris 4 Fri Feb 28 16:15:15 2020
John 3 Tue Apr 14 17:56:02 2020
Table Layout on web page:
Name Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Average
Not impossible, just convoluted. You can use STR_TO_DATE to convert your strings into DATETIME values, from which you can then use MONTH to get the month number. Note though (as @DRapp commented) that you should be storing DATETIME values in their native form, not as VARCHAR; then you wouldn't have to deal with STR_TO_DATE. Having got the month number, you can then use conditional aggregation to get the results you want:
SELECT name,
COALESCE(AVG(CASE WHEN mth = 1 THEN PTS END), 0) AS Jan,
COALESCE(AVG(CASE WHEN mth = 2 THEN PTS END), 0) AS Feb,
COALESCE(AVG(CASE WHEN mth = 3 THEN PTS END), 0) AS Mar,
COALESCE(AVG(CASE WHEN mth = 4 THEN PTS END), 0) AS Apr,
-- repeat for May to November
COALESCE(AVG(CASE WHEN mth = 12 THEN PTS END), 0) AS `Dec`,
AVG(PTS) AS AVG
FROM (
SELECT name, PTS AS PTS, MONTH(STR_TO_DATE(DATE, '%a %b %e %H:%i:%s %Y')) AS mth
FROM data
) d
GROUP BY name
Output (for your sample data):
name Jan Feb Mar Apr Dec AVG
Chris 0 4 0 2 0 3
John 0 0 0 2.6667 0 2.6667
Demo on SQLFiddle
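The conditional aggregation can also be sketched in plain Python, which makes the per-month bucketing explicit. This is an illustration only: the `pivot` helper is hypothetical, the month is read from the second token of the string instead of a full date parse, and the Average value is the mean of all PTS entries per name (like AVG(PTS) in the query), not the mean of the month columns.

```python
from collections import defaultdict

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# (name, pts, date) rows copied from the question's sample table.
rows = [
    ("John", 3, "Tue Apr 14 17:56:02 2020"),
    ("Chris", 2, "Tue Apr 14 19:44:03 2020"),
    ("John", 2, "Mon Jan 30 15:23:03 2020"),
    ("Chris", 4, "Fri Feb 28 16:15:15 2020"),
    ("John", 3, "Tue Apr 14 17:56:02 2020"),
]

def pivot(rows):
    """Return {name: {month: average, ..., 'Average': mean of all PTS}}."""
    by_month = defaultdict(lambda: defaultdict(list))
    for name, pts, date_text in rows:
        month = date_text.split()[1]  # second token is the month abbreviation
        by_month[name][month].append(pts)
    result = {}
    for name, months in by_month.items():
        all_pts = [p for pts in months.values() for p in pts]
        # Months without entries get 0, like COALESCE(..., 0) in the query.
        line = {m: (sum(months[m]) / len(months[m]) if m in months else 0)
                for m in MONTHS}
        line["Average"] = sum(all_pts) / len(all_pts)
        result[name] = line
    return result

for name, line in pivot(rows).items():
    print(name, line)
```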
I have combinations of domain and month with their total orders in the corresponding month. I would like to impute the missing combinations with 0 values. What are the least expensive aggregation commands that can be used in PySpark to achieve this?
I have the following input table:
domain month year total_orders
google.com 01 2017 20
yahoo.com 02 2017 30
google.com 03 2017 30
yahoo.com 03 2017 40
a.com 04 2017 50
a.com 05 2017 50
a.com 06 2017 50
Expected Output:
domain month year total_orders
google.com 01 2017 20
yahoo.com 02 2017 30
google.com 03 2017 30
yahoo.com 03 2017 40
a.com 04 2017 50
a.com 05 2017 50
a.com 06 2017 50
google.com 02 2017 0
google.com 04 2017 0
yahoo.com 04 2017 0
google.com 05 2017 0
yahoo.com 05 2017 0
google.com 06 2017 0
yahoo.com 06 2017 0
The order of the rows in the expected output does not really matter.
The simplest method is to combine all months and years for each domain:
select my.year, my.month, d.domain, coalesce(t.total_orders, 0) as total_orders
from (select distinct month, year from input) my cross join
     (select distinct domain from input) d left join
     input t
     on t.month = my.month and t.year = my.year and t.domain = d.domain;
Note: This assumes that each year/month combination occurs at least once, somewhere in the data.
Getting values within a range is a pain because you have split the date into multiple columns. Let me assume the years are all the same, as in your example:
select my.year, my.month, d.domain, coalesce(t.total_orders, 0) as total_orders
from (select distinct month, year from input) my join
     (select domain, min(month) as min_month, max(month) as max_month
      from input
      group by domain
     ) d
     on my.month >= d.min_month and my.month <= d.max_month left join
     input t
     on t.month = my.month and t.year = my.year and t.domain = d.domain;
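The first approach above (cross join of all domains with all year/month pairs, then a left join back to the data) can be sketched in plain Python as follows. This is not PySpark code, only an illustration of the logic; the `fill_missing` helper is hypothetical.

```python
from itertools import product

# (domain, month, year, total_orders) rows copied from the question.
rows = [
    ("google.com", "01", "2017", 20),
    ("yahoo.com", "02", "2017", 30),
    ("google.com", "03", "2017", 30),
    ("yahoo.com", "03", "2017", 40),
    ("a.com", "04", "2017", 50),
    ("a.com", "05", "2017", 50),
    ("a.com", "06", "2017", 50),
]

def fill_missing(rows):
    """Pair every domain with every (year, month) seen in the data; default to 0."""
    known = {(d, m, y): n for d, m, y, n in rows}
    domains = sorted({d for d, _, _, _ in rows})
    periods = sorted({(y, m) for _, m, y, _ in rows})
    # product() is the cross join; known.get(..., 0) is the left join plus coalesce.
    return [(d, m, y, known.get((d, m, y), 0))
            for d, (y, m) in product(domains, periods)]

for row in fill_missing(rows):
    print(row)
```

With three domains and six distinct months this yields 18 rows, so it also fills combinations such as a.com in month 01 that the expected output in the question leaves out.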
I have output like the one below:
ID Date
Null 2012-10-01
1 2012-10-02
2 2012-10-03
NULL 2012-10-04
3 2012-10-05
NULL 2012-10-06
4 2012-10-07
NULL 2012-10-08
5 2012-10-10
NULL 2012-10-11
NULL 2012-10-12
6 2012-10-13
NULL 2012-10-16
The missing dates have NULL as their ID. I need to show the final output as:
2012-10-01 - 2012-10-01 (1 day )
2012-10-04 - 2012-10-04(1 day )
2012-10-06 - 2012-10-06(1 day )
2012-10-08 - 2012-10-08(1 day )
2012-10-11 - 2012-10-12(2 day )
2012-10-14 - 2012-10-14(1 day )
You can generate the date ranges using the following query:
select
min(date) as start,
max(date) as end,
datediff(max(date), min(date)) + 1 as numDays
from
(select @curRow := @curRow + 1 AS row_number, id, date
from Table1 join (SELECT @curRow := 0) r where ID is null) T
group by
datediff(date, '2012-10-01 00:00:00') - row_number;
The logic is based on a clever trick for grouping consecutive ranges. First, we filter and number the rows in the subquery. Then, the rows that are grouped together are found by comparing the number of days after 2012-10-01 to the row number. If any rows share this value, then they must be consecutive, otherwise there would be a "jump" between two rows and the expression datediff(date, '2012-10-01 00:00:00') - row_number would no longer match.
Sample output (DEMO):
START END NUMDAYS
October, 01 2012 00:00:00+0000 October, 01 2012 00:00:00+0000 1
October, 04 2012 00:00:00+0000 October, 04 2012 00:00:00+0000 1
October, 06 2012 00:00:00+0000 October, 06 2012 00:00:00+0000 1
October, 08 2012 00:00:00+0000 October, 08 2012 00:00:00+0000 1
October, 11 2012 00:00:00+0000 October, 12 2012 00:00:00+0000 2
October, 16 2012 00:00:00+0000 October, 16 2012 00:00:00+0000 1
From there I think it should be pretty trivial for you to get the exact output you are looking for.
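The grouping trick is easy to verify outside MySQL. The Python sketch below numbers the NULL-ID dates from the question and groups them by "days since the reference date minus row number", which is the same key the query uses. The names `null_dates`, `reference` and `ranges` are only illustrative.

```python
from datetime import date
from itertools import groupby

# Dates whose ID is NULL in the question's data.
null_dates = [date(2012, 10, d) for d in (1, 4, 6, 8, 11, 12, 16)]
reference = date(2012, 10, 1)

def ranges(dates):
    """Return (start, end, num_days) for each run of consecutive dates."""
    numbered = list(enumerate(sorted(dates), start=1))
    # Consecutive dates advance the day offset and the row number together,
    # so their difference stays constant within a run.
    key = lambda pair: (pair[1] - reference).days - pair[0]
    out = []
    for _, group in groupby(numbered, key=key):
        days = [d for _, d in group]
        out.append((days[0], days[-1], (days[-1] - days[0]).days + 1))
    return out

for start, end, num_days in ranges(null_dates):
    print(start, "-", end, f"({num_days} day)")
```

The sketch sorts the dates before numbering them; the SQL version depends on the rows being numbered in date order as well.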