How to select every month even if it has no values [duplicate] - mysql

This question already has answers here:
What is the most straightforward way to pad empty dates in sql results (on either mysql or perl end)?
(9 answers)
Closed 5 years ago.
I want to select the company id, date, and count from a table, but this query does not show the months in which a company has a count of 0.
Here is the query:
SELECT c.name, date_format(e.created, '%y_%m') AS date, count(*)
FROM company c
JOIN edited e
on c.id=e.company_id AND e.created >='2016-12-13 00:00:00' AND e.created <='2017-05-20 00:00:00'
GROUP BY c.id, date
Some of the results look like this:
4 16_12 2
4 17_01 4
4 17_04 2
4 17_05 2
March (17_03) is missing. How can I make it show 17_03 with 0?

If your join does not match any rows according to its ON clause, it returns no tuples, and consequently there are no rows to count. If you have a fixed set of companies and dates, you can set the values manually with a UNION statement:
(SELECT 'company_name1', 'date1', COUNT(*)
FROM company c
JOIN edited e
on c.id=e.company_id AND e.created >='2016-12-13 00:00:00' AND e.created <='2017-05-20 00:00:00')
UNION
(SELECT 'company_name2', 'date2', COUNT(*)
FROM company c
JOIN edited e
on c.id=e.company_id AND e.created >='2016-12-13 00:00:00' AND e.created <='2017-05-20 00:00:00')
# repeat this for each company and month that would otherwise be missing from the result
UNION
(SELECT c.name, date_format(e.created, '%y_%m') AS date, count(*)
FROM company c
JOIN edited e
on c.id=e.company_id AND e.created >='2016-12-13 00:00:00' AND e.created <='2017-05-20 00:00:00'
GROUP BY c.id, date)

SELECT c.id
, ym.y
, ym.m
, count(e.created) cnt
FROM ( SELECT @startDate := date_add(@startDate, interval 1 month) date
, year(@startDate) y
, month(@startDate) m
FROM HugeTable
JOIN ( SELECT @startDate := '2016-11-01'
, @endDate := '2017-05-01'
) months
WHERE @startDate < @endDate
) ym
LEFT JOIN edited e
ON year(e.created) = ym.y
AND month(e.created) = ym.m
LEFT JOIN company c
ON c.id = e.company_id
GROUP BY ym.y
, ym.m
, c.id
HugeTable can be any table with at least as many rows as the number of months to be displayed.
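Here is a minimal sketch of the same idea with one row per company and month, so that months without edits show 0. It runs on SQLite through Python's sqlite3 module purely as a self-contained stand-in for MySQL. The company name 'Acme' and the sample rows are made up, and the month list comes from a recursive CTE (available in MySQL 8.0+ with slightly different date functions) rather than from HugeTable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE edited (company_id INTEGER, created TEXT);
    -- Hypothetical sample data: nothing in February, March or May.
    INSERT INTO company VALUES (4, 'Acme');
    INSERT INTO edited VALUES
        (4, '2016-12-14 10:00:00'), (4, '2016-12-20 09:00:00'),
        (4, '2017-01-05 12:00:00'),
        (4, '2017-04-02 08:00:00');
""")

query = """
WITH RECURSIVE months(month_start) AS (
    SELECT '2016-12-01'
    UNION ALL
    SELECT date(month_start, '+1 month') FROM months
    WHERE month_start < '2017-05-01'
)
SELECT c.id, strftime('%Y_%m', m.month_start) AS ym, COUNT(e.created) AS cnt
FROM company c
CROSS JOIN months m              -- every company paired with every month
LEFT JOIN edited e               -- edits attach only where they exist
       ON e.company_id = c.id
      AND e.created >= m.month_start
      AND e.created < date(m.month_start, '+1 month')
GROUP BY c.id, m.month_start
ORDER BY c.id, m.month_start
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

COUNT(e.created) counts only the joined rows that matched, so a month with no edits is reported as 0 instead of being dropped.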

Related

How to set default value from mysql join interval yearmonth

I have a problem with my query. I have two tables and I want to join them to get results based on the primary key of the first table, but one row from the first table is missing.
This is my fiddle.
As you can see, "xx3" is missing from month 1.
I have tried switching between left and right joins, but the results are still the same.
As you can see, I use coalesce(sum(b.sd_qty),0) as total so that 0 is the default when there is no quantity.
You should cross join the table to the distinct dates also:
SELECT a.item_code,
COALESCE(SUM(b.sd_qty), 0) total,
DATE_FORMAT(d.sd_date, '%m-%Y') month_year
FROM item a
CROSS JOIN (
SELECT DISTINCT sd_date
FROM sales_details
WHERE sd_date >= '2020-04-01' - INTERVAL 3 MONTH AND sd_date < '2020-05-01'
) d
LEFT JOIN sales_details b
ON a.item_code = b.item_code AND b.sd_date = d.sd_date
GROUP BY month_year, a.item_code
ORDER BY month_year, a.item_code;
Or, for MySQL 8.0+, use a recursive CTE that returns the starting date of every month you want in the results, which can then be cross joined to the table:
WITH RECURSIVE dates AS (
SELECT '2020-04-01' - INTERVAL 3 MONTH AS sd_date
UNION ALL
SELECT sd_date + INTERVAL 1 MONTH
FROM dates
WHERE sd_date + INTERVAL 1 MONTH < '2020-05-01'
)
SELECT a.item_code,
COALESCE(SUM(b.sd_qty), 0) total,
DATE_FORMAT(d.sd_date, '%m-%Y') month_year
FROM item a CROSS JOIN dates d
LEFT JOIN sales_details b
ON a.item_code = b.item_code AND DATE_FORMAT(b.sd_date, '%m-%Y') = DATE_FORMAT(d.sd_date, '%m-%Y')
GROUP BY month_year, a.item_code
ORDER BY month_year, a.item_code;
See the demo.

How can I create this query?

I have this MySQL Query here:
SELECT
COUNT(*) ReleasePerMonth,
d.name as DevGroup_REGION
FROM
release_summary r
inner join
gti_server_info g
on r.gti_server_id = g.gti_server_id
inner join
dev_group d
on d.dev_group_id = g.dev_group_id
WHERE
r.testingFinishedOn_timestamp >= '2020-05-01 00:00:00'
AND r.testingFinishedOn_timestamp <= '2020-05-31 00:00:00'
AND r.test_type != 14
GROUP BY
d.name ;
Now, I want this to run for every month. That is,
r.testingFinishedOn_timestamp >= '2019-01-01 00:00:00'
AND r.testingFinishedOn_timestamp <= '2019-01-31 00:00:00'
and
r.testingFinishedOn_timestamp >= '2019-02-01 00:00:00'
AND r.testingFinishedOn_timestamp <= '2019-02-28 00:00:00'
Till end of year. Currently, I am doing this manually. Is there any way I can do it in an automated manner?
Clarification:
I'd want 12 separate tables, one for each of the 12 months.
Try the following:
SELECT
YEAR(r.testingFinishedOn_timestamp) year,
MONTH(r.testingFinishedOn_timestamp) month,
COUNT(*) ReleasePerMonth,
d.name as DevGroup_REGION
FROM
release_summary r
inner join
gti_server_info g
on r.gti_server_id = g.gti_server_id
inner join
dev_group d
on d.dev_group_id = g.dev_group_id
WHERE
r.test_type != 14
GROUP BY
YEAR(r.testingFinishedOn_timestamp),
MONTH(r.testingFinishedOn_timestamp),
d.name
ORDER BY year, month;
A possible solution is to query data for the entire year and group by MONTH(testingFinishedOn_timestamp).
I added the query below but it's not tested:
SELECT
MONTH(r.testingFinishedOn_timestamp) ReleaseMonth,
COUNT(*) ReleasePerMonth,
d.name as DevGroup_REGION
FROM
release_summary r
inner join
gti_server_info g
on r.gti_server_id = g.gti_server_id
inner join
dev_group d
on d.dev_group_id = g.dev_group_id
WHERE
r.testingFinishedOn_timestamp >= '2020-01-01 00:00:00'
AND r.testingFinishedOn_timestamp < '2021-01-01 00:00:00'
AND r.test_type != 14
GROUP BY
d.name, MONTH(r.testingFinishedOn_timestamp)
ORDER BY
MONTH(r.testingFinishedOn_timestamp), d.name;
According to the documentation, the MONTH() function returns the number of the month; for January, for instance, it returns 1.
If you want the name of the month, you can use the MONTHNAME() function instead of MONTH().
You can try the query below, which uses LAST_DAY():
SELECT year(r.testingFinishedOn_timestamp),month(r.testingFinishedOn_timestamp),
COUNT(*) ReleasePerMonth
FROM
release_summary r
inner join
gti_server_info g
on r.gti_server_id = g.gti_server_id
inner join
dev_group d
on d.dev_group_id = g.dev_group_id
WHERE
r.testingFinishedOn_timestamp >= date_add(date_add(LAST_DAY(now()),interval 1 DAY),interval -12 MONTH)
AND r.testingFinishedOn_timestamp <= LAST_DAY(now())
AND r.test_type != 14
GROUP BY
year(r.testingFinishedOn_timestamp),month(r.testingFinishedOn_timestamp) ;
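One detail worth checking in the ranges above: an upper bound such as `<= '2020-05-31 00:00:00'` stops at midnight, so anything finished later on the last day of the month is left out, whereas a half-open range (`<` the first day of the next month, as in the second answer) keeps it. Below is a small sketch of the difference, run on SQLite through Python's sqlite3 module as a stand-in for MySQL. The single table with `region` and `finished_on` columns and its rows are hypothetical simplifications of the joined tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical stand-in for the joined tables in the question.
    CREATE TABLE release_summary (region TEXT, finished_on TEXT);
    INSERT INTO release_summary VALUES
        ('EU', '2020-05-03 09:00:00'),
        ('EU', '2020-05-31 18:30:00'),
        ('US', '2020-05-31 23:59:59'),
        ('EU', '2020-06-01 00:00:00');
""")

# Closed range ending at midnight of the 31st: misses the rest of that day.
closed = conn.execute("""
    SELECT COUNT(*) FROM release_summary
    WHERE finished_on >= '2020-05-01 00:00:00'
      AND finished_on <= '2020-05-31 00:00:00'
""").fetchone()[0]

# Half-open range: everything in May, nothing from June.
half_open = conn.execute("""
    SELECT COUNT(*) FROM release_summary
    WHERE finished_on >= '2020-05-01 00:00:00'
      AND finished_on <  '2020-06-01 00:00:00'
""").fetchone()[0]

print(closed, half_open)
```

The half-open form also avoids having to know how many days each month has.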

How to calculate percent?

Could you help me calculate the percentage of users who made payments?
I've got two tables:
activity
user_id login_time
201 01.01.2017
202 01.01.2017
255 04.01.2017
255 05.01.2017
256 05.01.2017
260 15.03.2017
2
payments
user_id payment_date
200 01.01.2017
202 01.01.2017
255 05.01.2017
I tried this query, but it calculates the wrong percentage:
SELECT activity.login_time, (select COUNT(distinct payments.user_id)
from payments where payments.payment_time between '2017-01-01' and
'2017-01-05') / COUNT(distinct activity.user_id) * 100
AS percent
FROM payments INNER JOIN activity ON
activity.user_id = payments.user_id and activity.login_time between
'2017-01-01' and '2017-01-05'
GROUP BY activity.login_time;
I need this result:
01.01.2017 100 %
02.01.2017 0%
03.01.2017 0%
04.01.2017 0%
05.01.2017 - 50%
If you want the ratio of users who have made payments to those with activity, just summarize each table individually:
select p.cnt / a.cnt
from (select count(distinct user_id) as cnt from activity) a cross join
(select count(distinct user_id) as cnt from payments) p;
EDIT:
You need a table with all dates in the range. That is the biggest problem.
Then I would recommend:
SELECT d.dte,
( ( SELECT COUNT(DISTINCT p.user_id)
FROM payments p
WHERE p.payment_date >= d.dte and p.payment_date < d.dte + INTERVAL 1 DAY
) /
NULLIF( (SELECT COUNT(DISTINCT a.user_id)
FROM activity a
WHERE a.login_time >= d.dte and a.login_time < d.dte + INTERVAL 1 DAY
), 0
)
) as ratio
FROM (SELECT date('2017-01-01') dte UNION ALL
SELECT date('2017-01-02') dte UNION ALL
SELECT date('2017-01-03') dte UNION ALL
SELECT date('2017-01-04') dte UNION ALL
SELECT date('2017-01-05') dte
) d;
Notes:
This returns NULL on days where there is no activity. That makes more sense to me than 0.
This uses logic on the dates that works for both dates and date/time values.
The logic for dates can make use of an index, which can be important for this type of query.
I don't recommend using LEFT JOINs. That will multiply the data which can make the query expensive.
First you need a table with all days in the range. Since the range is small you can build an ad hoc derived table using UNION ALL. Then left join the payments and activities. Group by the day and calculate the percentage using the count()s.
SELECT x.day,
concat(CASE count(DISTINCT a.user_id)
WHEN 0 THEN
1
ELSE
count(DISTINCT p.user_id)
/
count(DISTINCT a.user_id)
END
*
100,
'%')
FROM (SELECT cast('2017-01-01' AS date) day
UNION ALL
SELECT cast('2017-01-02' AS date) day
UNION ALL
SELECT cast('2017-01-03' AS date) day
UNION ALL
SELECT cast('2017-01-04' AS date) day
UNION ALL
SELECT cast('2017-01-05' AS date) day) x
LEFT JOIN payments p
ON p.payment_date = x.day
LEFT JOIN activity a
ON a.login_time = x.day
GROUP BY x.day;
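To make the per-day calculation concrete, here is a minimal sketch of the correlated-subquery approach run against the sample rows from the question. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL, and it assumes the dates are stored in ISO format (YYYY-MM-DD) rather than the DD.MM.YYYY shown in the question. The list of days comes from a recursive CTE instead of a hand-written UNION ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (user_id INTEGER, login_time TEXT);
    CREATE TABLE payments (user_id INTEGER, payment_date TEXT);
    INSERT INTO activity VALUES
        (201, '2017-01-01'), (202, '2017-01-01'), (255, '2017-01-04'),
        (255, '2017-01-05'), (256, '2017-01-05'), (260, '2017-03-15');
    INSERT INTO payments VALUES
        (200, '2017-01-01'), (202, '2017-01-01'), (255, '2017-01-05');
""")

query = """
WITH RECURSIVE days(dte) AS (
    SELECT '2017-01-01'
    UNION ALL
    SELECT date(dte, '+1 day') FROM days WHERE dte < '2017-01-05'
)
SELECT d.dte,
       -- paying users that day divided by active users that day;
       -- NULLIF turns a zero divisor into NULL instead of an error or 0
       100.0 * (SELECT COUNT(DISTINCT p.user_id) FROM payments p
                WHERE p.payment_date = d.dte)
             / NULLIF((SELECT COUNT(DISTINCT a.user_id) FROM activity a
                       WHERE a.login_time = d.dte), 0) AS pct
FROM days d
ORDER BY d.dte
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Days without any activity come back as NULL (None in Python); wrap the expression in COALESCE(..., 0) if 0% is preferred for those days, as in the desired output of the question.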

SQL aggregation select using SUM and COUNT on different tables

I have a table emails
id date sent_to
1 2013-01-01 345
2 2013-01-05 990
3 2013-02-05 1000
table2 is responses
email_id email response
1 xyz@email.com xxxx
1 xyzw@email.com yyyy
.
.
.
I want a result with the following format:
Month total_number_of_subscribers_sent total_responded
2013-01 1335 2
.
.
this is my query:
SELECT
DATE_FORMAT(e.date, '%Y-%m')AS `Month`,
count(*) AS total_responded,
SUM(e.sent_to) AS total_sent
FROM
responses r
LEFT JOIN emails e ON e.id = r.email_id
WHERE
e.date > '2012-12-31' AND e.date < '2013-10-01'
GROUP BY
DATE_FORMAT(e.date, '%Y %m')
It works OK for total_responded, but total_sent runs into the millions, obviously because the joined result repeats the sent_to value once per response.
So basically, can I do a SUM and a COUNT in the same query on separate tables?
If you want to count duplicates in each table, then the query is a little complicated.
You need to aggregate the sends and responses separately, before joining them together. The join is on the date, which necessarily comes from the "sent" information:
select e.`Month`, coalesce(total_sent, 0) as total_sent, coalesce(total_emails, 0) as total_emails,
coalesce(total_responses, 0) as total_responses,
coalesce(total_email_responses, 0) as total_email_responses
from (select DATE_FORMAT(e.date, '%Y-%m') as `Month`,
count(*) as total_sent, count(distinct email) as total_emails
from emails e
where e.date > '2012-12-31' AND e.date < '2013-10-01'
group by DATE_FORMAT(e.date, '%Y-%m')
) e left outer join
(select DATE_FORMAT(e.date, '%Y-%m') as `Month`,
count(*) as total_responses, count(distinct r.email) as total_email_responses
from emails e join
responses r
on e.email = r.email
where e.date > '2012-12-31' AND e.date < '2013-10-01'
group by DATE_FORMAT(e.date, '%Y-%m')
) r
on e.`Month` = r.`Month`;
The apparent fact that your responses have no link to the "sent" information -- not even the date -- suggests a real problem with your operations and data.
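As a sketch of "aggregate separately, then join", the following runs against the sample rows from the question. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL, and it assumes, as the question's own query does, that responses.email_id refers to emails.id. The aliases `sent` and `resp` are just illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emails (id INTEGER PRIMARY KEY, date TEXT, sent_to INTEGER);
    CREATE TABLE responses (email_id INTEGER, email TEXT, response TEXT);
    INSERT INTO emails VALUES
        (1, '2013-01-01', 345), (2, '2013-01-05', 990), (3, '2013-02-05', 1000);
    INSERT INTO responses VALUES
        (1, 'xyz@email.com', 'xxxx'), (1, 'xyzw@email.com', 'yyyy');
""")

query = """
SELECT sent.month, sent.total_sent,
       COALESCE(resp.total_responded, 0) AS total_responded
FROM (
    -- one row per month, summed before any join can duplicate sent_to
    SELECT strftime('%Y-%m', date) AS month, SUM(sent_to) AS total_sent
    FROM emails
    WHERE date > '2012-12-31' AND date < '2013-10-01'
    GROUP BY strftime('%Y-%m', date)
) AS sent
LEFT JOIN (
    -- responses counted per month of the email they answer
    SELECT strftime('%Y-%m', e.date) AS month, COUNT(*) AS total_responded
    FROM responses r
    JOIN emails e ON e.id = r.email_id
    GROUP BY strftime('%Y-%m', e.date)
) AS resp ON resp.month = sent.month
ORDER BY sent.month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because each side is reduced to one row per month before the join, sent_to is summed exactly once per email no matter how many responses that email received.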

MySQL Select, join, comparing dates fails

I have these two tables:
DateRanges
some_id start_date end_date
---------------------------------
1 2012-12-01 2012-12-15
1 2013-01-01 2013-01-15
3 2013-01-03 2013-01-10
Items
id name
----------------
1 Some name
2 Other name
3 So on...
What I try to achieve is to get, for each element in Items table, the biggest start_date (ignoring the smaller dates/date ranges for that Item) and check if the current date is in that range, like in the next table (let's say today's 02 January 2013):
id name TodayIsInTheRange
---------------------------------------------
1 Some name true
2 Other name false
3 So on... false
I have tried to obtain the 3rd table with the next query:
SELECT A.*, (B.`start_date` <= CURRENT_DATE AND CURRENT_DATE <= B.`end_date`) AS `TodayIsInTheRange`
FROM `Items` as A
LEFT JOIN `DateRanges` as B ON
A.id = B.some_id
ORDER BY B.`end_date` DESC
But with this query my items repeat themselves because I have two records in DateRanges for the same item.
I use SQL Server, but I think something like this should be pretty close:
SELECT
I.Id,
I.Name,
(DR.start_date <= CURRENT_DATE AND CURRENT_DATE <= DR.end_date) AS `TodayIsInTheRange`
FROM `Items` AS I
LEFT JOIN
(SELECT Some_Id, MAX(Start_Date) as MaxStartDate
FROM `DateRanges`
GROUP BY Some_ID) AS HDR ON I.Id = HDR.Some_Id
LEFT JOIN `DateRanges` AS DR ON HDR.Some_Id = DR.Some_Id AND HDR.MaxStartDate = DR.Start_Date
select * from items join date_ranges dr0 on items.id = dr0.some_id
where start_date =
(select max(start_date) from date_ranges dr1 where dr0.some_id = dr1.some_id);
SELECT
a.*,
( b.start_date <= CURRENT_DATE
AND CURRENT_DATE <= b.end_date ) AS TodayIsInTheRange
FROM
Items AS a
LEFT JOIN
( SELECT some_id, MAX(start_date) AS start_date
FROM DateRanges
GROUP BY some_id
) AS m
JOIN
DateRanges AS b
ON b.some_id = m.some_id
AND b.start_date = m.start_date
ON a.id = m.some_id
ORDER BY b.end_date DESC ;
Try using GROUP BY with the MAX function. GROUP BY gives you only one row for each item id, and MAX tells you whether at least one of the item's date ranges contains the current date:
SELECT A.*, MAX(B.`start_date` <= CURRENT_DATE AND CURRENT_DATE <= B.`end_date`) AS `TodayIsInTheRange`
FROM `Items` as A
LEFT JOIN `DateRanges` as B ON
A.id = B.some_id
GROUP BY A.id
ORDER BY B.`end_date` DESC
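For comparison, here is a minimal sketch of the "latest start_date per item" approach from the first answer, run against the sample tables from the question. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL. The `:today` parameter replaces CURRENT_DATE so the result is reproducible for 2 January 2013, and the COALESCE that maps items without any range to 0 is an addition of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE DateRanges (some_id INTEGER, start_date TEXT, end_date TEXT);
    INSERT INTO Items VALUES (1, 'Some name'), (2, 'Other name'), (3, 'So on...');
    INSERT INTO DateRanges VALUES
        (1, '2012-12-01', '2012-12-15'),
        (1, '2013-01-01', '2013-01-15'),
        (3, '2013-01-03', '2013-01-10');
""")

query = """
SELECT i.id, i.name,
       COALESCE(dr.start_date <= :today AND :today <= dr.end_date, 0) AS in_range
FROM Items i
LEFT JOIN (
    -- the most recent start_date for each item
    SELECT some_id, MAX(start_date) AS max_start
    FROM DateRanges
    GROUP BY some_id
) AS latest ON latest.some_id = i.id
LEFT JOIN DateRanges dr           -- fetch the full row of that latest range
       ON dr.some_id = latest.some_id
      AND dr.start_date = latest.max_start
ORDER BY i.id
"""
rows = conn.execute(query, {"today": "2013-01-02"}).fetchall()
for row in rows:
    print(row)
```

Unlike the GROUP BY/MAX variant just above, this only looks at the range with the greatest start_date, which is what the question asks for; the two can differ when an older range contains today but the newest one does not.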