I have two tables in MySQL
table1(Date(full_date), app_id, type(free, paid))
table2(Date_fk, Year, month, day, quarter)
The query for a single count is:
select Year, count(*)
from Table1, Table2
where Table1.Date = Table2.Date_fk and Table1.Type='Free'
GROUP BY YEAR
---------------------
| year | free_count |
---------------------
| 2019 | 10 |
---------------------
I want output as
---------------------------------
| year | free_count | Paid_count |
----------------------------------
| 2019 | 10 | 12 |
----------------------------------
Here's one option using conditional aggregation:
select year,
count(case when t1.type='free' then 1 end) as freecount,
count(case when t1.type='paid' then 1 end) as paidcount
from table1 t1
join table2 t2 on t1.date = t2.date_fk
group by year
Also please take a look at the join syntax. In general, I'd highly recommend not using commas in your from clause.
Try this out:
SELECT
d.year,
SUM(CASE WHEN a.Type = 'Free' THEN 1 ELSE 0 END) AS free_count,
SUM(CASE WHEN a.Type = 'Paid' THEN 1 ELSE 0 END) AS paid_count
FROM Table2 d -- Dates table
LEFT JOIN Table1 a -- Apps table
ON d.Date_fk = a.Date
GROUP BY d.year;
The LEFT JOIN guarantees that you'll still get results for those years without any apps.
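As a rough illustration of the conditional aggregation described above, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. The sample rows and the helper names count_by_year and demo are made up for the example, and SQLite stands in for MySQL, so treat it as a sketch rather than a drop-in solution.

```python
import sqlite3


def count_by_year(conn):
    """Return (year, free_count, paid_count), one row per year in table2."""
    return conn.execute(
        """
        SELECT d.year,
               SUM(CASE WHEN a.type = 'free' THEN 1 ELSE 0 END) AS free_count,
               SUM(CASE WHEN a.type = 'paid' THEN 1 ELSE 0 END) AS paid_count
        FROM table2 AS d              -- dates table
        LEFT JOIN table1 AS a         -- apps table
               ON a.date = d.date_fk
        GROUP BY d.year
        ORDER BY d.year
        """
    ).fetchall()


def demo():
    # Hypothetical sample data: 2018 has a date row but no apps at all.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table1 (date TEXT, app_id INTEGER, type TEXT);
        CREATE TABLE table2 (date_fk TEXT, year INTEGER);
        INSERT INTO table2 VALUES
            ('2018-06-01', 2018), ('2019-01-01', 2019), ('2019-01-02', 2019);
        INSERT INTO table1 VALUES
            ('2019-01-01', 1, 'free'),
            ('2019-01-01', 2, 'paid'),
            ('2019-01-02', 3, 'free');
        """
    )
    return count_by_year(conn)
```

Because of the LEFT JOIN, the year without any apps still appears, with both counts at 0.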
I have two tables with identical structure. From those tables I need to get, for each fix_id, the row with the highest value in the rate column.
Table1
fix_id | rate | proc | unique_id
2 | 72 | 50 | 23_tab1
3 | 98 | 70 | 24_tab1
4 | 78 | 80 | 25_tab1
table2
fix_id | rate | proc | unique_id
2 | 75 | 999 | 23_tab2
3 | 80 | 179 | 24_tab2
4 | 82 | 898 | 25_tab2
Expected result
fix_id | rate | proc | unique_id
2 | 75 | 999 | 23_tab2
3 | 98 | 70 | 24_tab1
4 | 82 | 898 | 25_tab2
I've tried this...
Select fix_id,proc,unique_id,MAX(rate) rate from
(Select fix_id,proc,unique_id,MAX(rate) rate from table1 group by fix_id
UNION ALL SELECT fix_id,proc,unique_id,MAX(rate) rate from table2 group by fix_id ) group by fix_id
I get the highest values from rate column but the values from other columns are incorrect.
This can be done with CASE expressions.
Try this query:
select
(case
when T1.rate > T2.rate then T1.fix_id else T2.fix_id
end) as fix_id,
(case
when T1.rate > T2.rate then T1.rate else T2.rate
end) as rate,
(case
when T1.rate > T2.rate then T1.proc else T2.proc
end) as proc,
(case
when T1.rate > T2.rate then T1.unique_id else T2.unique_id
end) as unique_id
from table1 as T1 join table2 as T2 on T1.fix_id = T2.fix_id
You can use row_number():
select t.*
from (select fix_id, proc, unique_id, rate,
row_number() over (partition by fix_id order by rate desc) as seqnum
from ((select fix_id, proc, unique_id, rate from table1
) union all
(select fix_id, proc, unique_id, rate from table2
)
) t
) t
where seqnum = 1;
As fix_id is unique in both tables, the answer with CASE statements (https://stackoverflow.com/a/65609931/53341) is likely the fastest (so, I've upvoted that)...
Join once
Compare rates, on each row
Pick which table to read from, on each row
For large numbers of columns, however, it's unwieldy to type all the CASE statements. So, here is a shorter version, though it probably takes twice as long to run...
SELECT t1.*
FROM table1 AS t1 INNER JOIN table2 AS t2 ON t1.fix_id = t2.fix_id
WHERE t1.rate >= t2.rate
UNION ALL
SELECT t2.*
FROM table1 AS t1 INNER JOIN table2 AS t2 ON t1.fix_id = t2.fix_id
WHERE t1.rate < t2.rate
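To check the two-join UNION ALL idea against the sample rows from the question, here is a small sketch using Python's sqlite3 module. The function names best_rows and demo are illustrative only, and the query lists the columns explicitly instead of using t1.* so that the result can be ordered by fix_id.

```python
import sqlite3


def best_rows(conn):
    """For each fix_id, return the row from whichever table has the higher rate."""
    return conn.execute(
        """
        SELECT t1.fix_id AS fix_id, t1.rate, t1.proc, t1.unique_id
        FROM table1 AS t1 JOIN table2 AS t2 ON t1.fix_id = t2.fix_id
        WHERE t1.rate >= t2.rate          -- table1 wins, including ties
        UNION ALL
        SELECT t2.fix_id AS fix_id, t2.rate, t2.proc, t2.unique_id
        FROM table1 AS t1 JOIN table2 AS t2 ON t1.fix_id = t2.fix_id
        WHERE t1.rate < t2.rate           -- table2 wins
        ORDER BY fix_id
        """
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table1 (fix_id INTEGER, rate INTEGER, proc INTEGER, unique_id TEXT);
        CREATE TABLE table2 (fix_id INTEGER, rate INTEGER, proc INTEGER, unique_id TEXT);
        INSERT INTO table1 VALUES (2, 72, 50, '23_tab1'), (3, 98, 70, '24_tab1'), (4, 78, 80, '25_tab1');
        INSERT INTO table2 VALUES (2, 75, 999, '23_tab2'), (3, 80, 179, '24_tab2'), (4, 82, 898, '25_tab2');
        """
    )
    return best_rows(conn)
```

Note that this only returns fix_id values present in both tables, which matches the assumption that fix_id is unique and shared between them.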
I'm trying to create a query with conditional logic where I only calculate revenue for the most recent records by each month using a datetime column (start_date), but only if there are multiple records in that month from the same account_id.
Here's a basic example of the schema after I join two tables (full schema in sqlfiddle link).
| account_id | plan_id | start_date | plan_interval | price |
|------------|---------|----------------------|---------------|-------|
| 1 | 1 | 2018-01-03T14:52:13Z | month | 39 |
| 1 | 3 | 2018-02-07T11:10:17Z | year | 999 |
| 1 | 2 | 2018-02-07T11:11:17Z | month | 99 |
In the above example, I would only like to include rows 1 and 3 in my output, as it's the one record from account_id 1 in January and the most recent of two records for account_id 1 in February.
SELECT
MONTH(start_date) AS month,
SUM(CASE WHEN plan_interval = 'month'
THEN price * .01
ELSE (price * .01)/12 END) AS mrr
FROM subscriptions
JOIN plans
ON plans.id = subscriptions.plan_id
WHERE Year(start_date) = 2018 AND
CASE WHEN (account_id = account_id
AND MONTH(start_date) = MONTH(start_date))
THEN (SELECT MAX(start_date) FROM subscriptions)
ELSE (SELECT start_date FROM subscriptions)
END
GROUP BY month
ORDER BY month ASC;
The case statement in the subquery above does not seem to work in doing this. It returns the data without filtering out records when the first condition is met.
Here is an example: sqlfiddle
This query returns the rows that you are asking for in the question:
SELECT s.*, p.plan_interval, p.price,
(CASE WHEN p.plan_interval = 'month'
THEN p.price * 0.01
ELSE (p.price * 0.01)/12
END) AS mrr
FROM subscriptions s JOIN
plans p
ON p.id = s.plan_id
WHERE YEAR(s.start_date) = 2018 AND
s.start_date = (SELECT MAX(s2.start_date)
FROM subscriptions s2
WHERE s2.account_id = s.account_id AND
EXTRACT(YEAR_MONTH FROM s2.start_date) = EXTRACT(YEAR_MONTH FROM s.start_date)
)
ORDER BY s.start_date ASC;
This uses a correlated subquery to keep only the most recent record for each account in each month.
You can then aggregate this however you wish.
Notes about the query:
Table aliases make the query easier to write and to read.
The subquery uses the handy YEAR_MONTH option of EXTRACT(), so it handles both years and months.
For numeric constants between -1 and 1, I always prepend a 0, so 0.12 rather than .12. I find that this makes the decimal point more obvious.
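The correlated-subquery filter can be tried on the three sample rows from the question with the sketch below. It runs in SQLite from Python, so strftime('%Y-%m', ...) stands in for MySQL's EXTRACT(YEAR_MONTH FROM ...). Prices are assumed to be stored in cents, and the function names are made up for the example.

```python
import sqlite3


def latest_per_month(conn):
    """Keep only each account's most recent subscription within each calendar month."""
    return conn.execute(
        """
        SELECT s.account_id, s.plan_id, s.start_date,
               CASE WHEN p.plan_interval = 'month'
                    THEN p.price * 0.01
                    ELSE p.price * 0.01 / 12
               END AS mrr
        FROM subscriptions AS s
        JOIN plans AS p ON p.id = s.plan_id
        WHERE s.start_date = (
                SELECT MAX(s2.start_date)
                FROM subscriptions AS s2
                WHERE s2.account_id = s.account_id
                  AND strftime('%Y-%m', s2.start_date) = strftime('%Y-%m', s.start_date)
              )
        ORDER BY s.start_date
        """
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE plans (id INTEGER PRIMARY KEY, plan_interval TEXT, price INTEGER);
        CREATE TABLE subscriptions (account_id INTEGER, plan_id INTEGER, start_date TEXT);
        -- Assumed prices in cents, matching 39 / 99 / 999 in the question.
        INSERT INTO plans VALUES (1, 'month', 3900), (2, 'month', 9900), (3, 'year', 99900);
        INSERT INTO subscriptions VALUES
            (1, 1, '2018-01-03 14:52:13'),
            (1, 3, '2018-02-07 11:10:17'),
            (1, 2, '2018-02-07 11:11:17');
        """
    )
    return latest_per_month(conn)
```

The yearly plan started one minute before the monthly plan in February, so only the January row and the later February row survive the filter.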
First work out the last entry by account and month (subquery a), join it back to subscriptions to get the plan_id, and then join to plans to get the plan. Monthly plans count in full and yearly plans are divided by 12.
SELECT S.ACCOUNT_id,s.plan_id,s.start_date,p.Price,p.plan_interval,
case when p.plan_interval = 'month' then p.price * .01 else p.price * .01 /12 end as rev
from subscriptions s
join (select s.account_id,month(s.start_date), max(s.start_date) start_date
from subscriptions s
group by account_id,month(start_date)) a on a.account_id = s.account_id and a.start_date = s.start_date
join plans p on p.id = s.plan_id;
+------------+---------+---------------------+----------+---------------+--------------+
| ACCOUNT_id | plan_id | start_date | Price | plan_interval | rev |
+------------+---------+---------------------+----------+---------------+--------------+
| 1 | 1 | 2018-01-03 14:52:13 | 3900.00 | month | 39.00000000 |
| 1 | 2 | 2018-02-07 11:11:17 | 9900.00 | month | 99.00000000 |
| 2 | 3 | 2018-01-03 17:40:05 | 99900.00 | year | 83.25000000 |
+------------+---------+---------------------+----------+---------------+--------------+
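In your case, the WHERE clause does not filter anything because the CASE expression always evaluates as true. account_id = account_id and MONTH(start_date) = MONTH(start_date) compare each column with itself, so the first branch is always taken, and the MAX(start_date) it returns is a non-zero value that MySQL treats as true.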
CASE WHEN (account_id = account_id
AND MONTH(start_date) = MONTH(start_date))
THEN (SELECT MAX(start_date) FROM subscriptions)
ELSE (SELECT start_date FROM subscriptions)
END
Another approach to what you are building would involve using a subquery to order the columns the way you want within the groups.
SELECT
account_id,
month,
CASE WHEN plan_interval = 'month'
THEN price * .01
ELSE (price * .01)/12
END AS mrr
FROM (
SELECT *, MONTH(start_date) AS month
FROM subscriptions
INNER JOIN plans ON plans.id = subscriptions.plan_id
ORDER BY account_id, start_date DESC
) sq
GROUP BY account_id, month
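This relies on MySQL's nonstandard handling of GROUP BY, where a column that is neither grouped nor aggregated takes its value from an arbitrary row of the group, often the first row returned by the subquery. That choice is not guaranteed, and the query is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5), so treat it as a shortcut rather than a reliable technique.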
I'd like to get the Date and ID of the rows with the earliest and latest time of day, i.e. the two extreme rows in the table below, with IDs 5 and 4.
Please note the following:
Dates are stored as values in ms
The ID reflects the Order By Date ASC
Below I have split the Time to make it clear
* indicates the two rows to return.
Values should be returned as columns, i.e.: SELECT minID, minDate, maxID, maxDate FROM myTable
| ID | Date | TimeOnly |
|----|---------------------|-----------|
| 5 | 14/11/2019 10:01:29 | 10:01:29* |
| 10 | 15/11/2019 10:01:29 | 10:01:29 |
| 6 | 14/11/2019 10:03:41 | 10:03:41 |
| 7 | 14/11/2019 10:07:09 | 10:07:09 |
| 11 | 15/11/2019 12:01:43 | 12:01:43 |
| 8 | 14/11/2019 14:37:16 | 14:37:16 |
| 1 | 12/11/2019 15:04:50 | 15:04:50 |
| 9 | 14/11/2019 15:04:50 | 15:04:50 |
| 2 | 13/11/2019 18:10:41 | 18:10:41 |
| 3 | 13/11/2019 18:10:56 | 18:10:56 |
| 4 | 13/11/2019 18:11:03 | 18:11:03* |
In earlier versions of MySQL, you can use a couple of inline queries. This is a straightforward option that could be quite efficient here:
select
(select ID from mytable order by TimeOnly limit 1) minID,
(select Date from mytable order by TimeOnly limit 1) minDate,
(select ID from mytable order by TimeOnly desc limit 1) maxID,
(select Date from mytable order by TimeOnly desc limit 1) maxDate
One option for MySQL 8+, using ROW_NUMBER with pivoting logic:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY TimeOnly) rn_min,
ROW_NUMBER() OVER (ORDER BY TimeOnly DESC) rn_max
FROM yourTable
)
SELECT
MAX(CASE WHEN rn_min = 1 THEN ID END) AS minID,
MAX(CASE WHEN rn_min = 1 THEN Date END) AS minDate,
MAX(CASE WHEN rn_max = 1 THEN ID END) AS maxID,
MAX(CASE WHEN rn_max = 1 THEN Date END) AS maxDate
FROM cte;
Here is an option for MySQL 5.7 or earlier:
SELECT
MAX(CASE WHEN pos = 1 THEN ID END) AS minID,
MAX(CASE WHEN pos = 1 THEN Date END) AS minDate,
MAX(CASE WHEN pos = 2 THEN ID END) AS maxID,
MAX(CASE WHEN pos = 2 THEN Date END) AS maxDate
FROM
(
SELECT ID, Date, 1 AS pos FROM yourTable
WHERE TimeOnly = (SELECT MIN(TimeOnly) FROM yourTable)
UNION ALL
SELECT ID, Date, 2 FROM yourTable
WHERE TimeOnly = (SELECT MAX(TimeOnly) FROM yourTable)
) t;
This second option, for 5.7, uses similar pivoting logic, but instead of ROW_NUMBER it uses subqueries to identify the min and max records. These records are brought together using a union, along with an identifier that keeps track of which record is the min and which is the max.
You could simply do this:
SELECT minval.ID, minval.Date, maxval.ID, maxval.Date
FROM (
SELECT ID, Date
FROM t
ORDER BY CAST(Date AS TIME)
LIMIT 1
) AS minval
CROSS JOIN (
SELECT ID, Date
FROM t
ORDER BY CAST(Date AS TIME) DESC
LIMIT 1
) AS maxval
If you want two rows then change CROSS JOIN query to a UNION ALL query.
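Here is a minimal sketch of the CROSS JOIN approach, run against the sample rows from the question in SQLite via Python. It assumes the dates are stored as ISO text (the question stores them as millisecond values) and uses SQLite's time() function where the MySQL answer uses CAST(Date AS TIME). The ID tie-breaker is an addition for this example, because IDs 5 and 10 share the same earliest time.

```python
import sqlite3

ROWS = [
    (5, "2019-11-14 10:01:29"), (10, "2019-11-15 10:01:29"),
    (6, "2019-11-14 10:03:41"), (7, "2019-11-14 10:07:09"),
    (11, "2019-11-15 12:01:43"), (8, "2019-11-14 14:37:16"),
    (1, "2019-11-12 15:04:50"), (9, "2019-11-14 15:04:50"),
    (2, "2019-11-13 18:10:41"), (3, "2019-11-13 18:10:56"),
    (4, "2019-11-13 18:11:03"),
]


def time_extremes(conn):
    """Return (minID, minDate, maxID, maxDate) by time of day, as a single row."""
    return conn.execute(
        """
        SELECT lo.id AS minID, lo.dt AS minDate, hi.id AS maxID, hi.dt AS maxDate
        FROM (SELECT id, dt FROM t ORDER BY time(dt), id LIMIT 1) AS lo
        CROSS JOIN
             (SELECT id, dt FROM t ORDER BY time(dt) DESC, id LIMIT 1) AS hi
        """
    ).fetchone()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, dt TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
    return time_extremes(conn)
```

Without the extra sort key on id, either of the two rows at 10:01:29 could be returned as the minimum.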
I want to return all rows that were public in May (2019-05), so if a row was turned to draft (and not back to public) at any point before the end of May, I don't want it. For example:
id | post_id | status | date
-------------------------
1 | 1 | draft | 2019-03-25
2 | 1 | public | 2019-04-02
3 | 1 | draft | 2019-05-25
4 | 2 | draft | 2019-03-10
5 | 2 | public | 2019-04-01
6 | 2 | draft | 2019-06-01
The desired result for the above would return post_id 2 because its last status change prior to the end of May was to public.
post_id 1 was put back in draft before the end of May, so it would not be included.
I'm not sure how to use the correct join or sub-queries to do this as efficiently as possible.
You seem to want the status as of 2019-05-31. A correlated subquery seems like the simplest solution:
select t.*
from t
where t.date = (select max(t2.date)
from t t2
where t2.post_id = t.post_id and
t2.date <= '2019-05-31'
);
To get the ones that are public, just add a WHERE condition:
select t.*
from t
where t.date = (select max(t2.date)
from t t2
where t2.post_id = t.post_id and
t2.date <= '2019-05-31'
) and
t.status = 'public';
For performance, you want an index on (post_id, date).
You can also phrase this using a JOIN:
select t.*
from t join
(select t2.post_id, max(t2.date) as max_date
from t t2
where t2.date <= '2019-05-31'
group by t2.post_id
) t2
on t2.post_id = t.post_id and t2.max_date = t.date
where t.status = 'public';
I would expect the correlated subquery to have better performance with the right indexes. However, sometimes MySQL surprises me.
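The "status as of a cutoff date" logic can be checked against the sample data from the question with the sketch below, which runs the correlated-subquery version in SQLite from Python. The function names and the cutoff parameter are illustrative. Dates are assumed to be stored as ISO text, so they compare correctly as strings.

```python
import sqlite3


def public_as_of(conn, cutoff):
    """Return rows holding each post's latest status on or before cutoff, if it is public."""
    return conn.execute(
        """
        SELECT t.id, t.post_id, t.status, t.date
        FROM t
        WHERE t.date = (SELECT MAX(t2.date)
                        FROM t AS t2
                        WHERE t2.post_id = t.post_id
                          AND t2.date <= ?)
          AND t.status = 'public'
        ORDER BY t.post_id
        """,
        (cutoff,),
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, post_id INTEGER, status TEXT, date TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?, ?)",
        [
            (1, 1, "draft", "2019-03-25"),
            (2, 1, "public", "2019-04-02"),
            (3, 1, "draft", "2019-05-25"),
            (4, 2, "draft", "2019-03-10"),
            (5, 2, "public", "2019-04-01"),
            (6, 2, "draft", "2019-06-01"),
        ],
    )
    # Matches the index suggested above for larger tables.
    conn.execute("CREATE INDEX idx_post_date ON t (post_id, date)")
    return public_as_of(conn, "2019-05-31")
```

Post 1 drops out because its last change before the cutoff was to draft, while post 2 only went back to draft in June.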
We need to determine:
which post_ids have a status recorded before May (the subquery with max(date)),
which of those post_ids have a status other than public within May,
and then exclude the post_ids that satisfy point 2.
So, you can use:
select distinct t1.post_id
from tab t1
where t1.post_id not in
(
select distinct t1.post_id
from tab t1
join
(
select post_id, max(date) as date
from tab
where '2019-05-01'> date
group by post_id ) t2
on t1.post_id = t2.post_id
where t1.status != 'public'
and t1.date < '2019-06-01'
and t1.date > '2019-04-30'
);
+---------+
| POST_ID |
+---------+
| 2 |
+---------+
When I run a single query using the following formula to have the first column give back the month/year, the second give back the number of people signing per month, and the third give back the running total of signers, it works great:
SET @runtot1:=0;
SELECT
1rt.MONTH,
1rt.1signed,
(@runtot1 := @runtot1 + 1rt.1signed) AS 1rt
FROM
(SELECT
DATE_FORMAT(STR_TO_DATE(s.datecontacted,'%m/%d/%Y'),'%Y-%m') AS MONTH,
IFNULL(COUNT(DISTINCT CASE WHEN s.surveyid = 791796 THEN s.id ELSE NULL END),0) AS 1signed
FROM table1 s
JOIN table2 m ON s.id = m.id AND m.current = "Yes"
WHERE STR_TO_DATE(s.datecontacted,'%m/%d/%Y') > '2015-03-01'
GROUP BY MONTH
ORDER BY MONTH) AS 1rt
With the query above, I get the following results table, which would be exactly what I want if I only needed to count one thing:
MONTH 1signed 1rt
2015-03 0 0
2015-04 1 1
2015-05 0 1
2015-08 1 2
2015-10 1 3
2015-11 1 4
2016-01 0 4
2016-02 0 4
But I can't figure out how to do that with multiple subqueries since I need this to happen for multiple columns at the same time. For example, I was attempting things like this (which doesn't work):
SET @runtot1:=0;
SET @runtot2:=0;
select
DATE_FORMAT(STR_TO_DATE(s1.datecontacted,'%m/%d/%Y'),'%Y-%m') AS MONTH,
t1.1signed,
(@runtot1 := @runtot1 + t1.1signed) AS 1rt,
t2.2signed,
(@runtot2 := @runtot2 + t2.2signed) AS 2rt
from
(select
DATE_FORMAT(STR_TO_DATE(s.datecontacted,'%m/%d/%Y'),'%Y-%m') AS MONTH,
IFNULL(COUNT(DISTINCT CASE WHEN s.surveyid = 791796 THEN s.id ELSE NULL END),0) AS 1signed
from table1 s
left join table2 m ON m.id = s.id
where m.current = "Yes"
GROUP BY MONTH
ORDER BY MONTH) as T1,
(select
DATE_FORMAT(STR_TO_DATE(s.datecontacted,'%m/%d/%Y'),'%Y-%m') AS MONTH,
IFNULL(COUNT(DISTINCT CASE WHEN s.surveyid = 846346 THEN s.id ELSE NULL END),0) AS 2signed
from table1 s
left join table2 m ON m.id = s.id
where m.current = "Yes"
GROUP BY MONTH
ORDER BY MONTH) as T2,
table1 s1
LEFT JOIN table2 m1 ON m1.id = s1.id AND m1.current = "Yes"
WHERE STR_TO_DATE(s1.datecontacted,'%m/%d/%Y') > '2015-03-01'
GROUP BY DATE_FORMAT(STR_TO_DATE(s1.datecontacted,'%m/%d/%Y'),'%Y-%m')
ORDER BY DATE_FORMAT(STR_TO_DATE(s1.datecontacted,'%m/%d/%Y'),'%Y-%m')
That blew up my results badly. I also tried LEFT JOINs to get those two next to each other, but that didn't work either.
Here's a SQL Fiddle with a few values with the query at the top that works, but not the query needed to look like the idea below.
If the multiple subquery version of the code worked, below would be the ideal end-result:
MONTH 1signed 1rt 2signed 2rt
2015-03 0 0 1 1
2015-04 1 1 0 1
2015-05 0 1 1 2
2015-08 1 2 0 2
2015-10 1 3 0 2
2015-11 1 4 0 2
2016-01 0 4 0 2
2016-02 0 4 1 3
Just trying to figure out a way to get counts by month and rolling totals since March 2015 for two different survey questions using the same query. Any help would be greatly appreciated!
Your attempt was actually pretty close. I just got rid of S1 and joined the two subqueries together on their MONTH columns:
SET @runtot1:=0;
SET @runtot2:=0;
select
T1.MONTH,
t1.1signed,
(@runtot1 := @runtot1 + t1.1signed) AS 1rt,
t2.2signed,
(@runtot2 := @runtot2 + t2.2signed) AS 2rt
from
(select
DATE_FORMAT(STR_TO_DATE(s.datecontacted,'%m/%d/%Y'),'%Y-%m') AS MONTH,
IFNULL(COUNT(DISTINCT CASE WHEN s.surveyid = 791796 THEN s.id ELSE NULL END),0) AS 1signed
from table1 s
left join table2 m ON m.id = s.id
where m.current = "Yes" and STR_TO_DATE(s.datecontacted,'%m/%d/%Y') > '2015-03-01'
GROUP BY MONTH
ORDER BY MONTH) as T1,
(select
DATE_FORMAT(STR_TO_DATE(s.datecontacted,'%m/%d/%Y'),'%Y-%m') AS MONTH,
IFNULL(COUNT(DISTINCT CASE WHEN s.surveyid = 846346 THEN s.id ELSE NULL END),0) AS 2signed
from table1 s
left join table2 m ON m.id = s.id
where m.current = "Yes" and STR_TO_DATE(s.datecontacted,'%m/%d/%Y') > '2015-03-01'
GROUP BY MONTH
ORDER BY MONTH) as T2
WHERE
T1.MONTH=T2.MONTH
GROUP BY T1.MONTH
ORDER BY T1.MONTH
I haven't tested Strawberry's solution, which looks more elegant. But I thought you'd like to know that your approach (solving the running totals individually, then joining the results together) would have worked too.
It seems that you're after something like this...
The data set:
DROP TABLE IF EXISTS table1;
CREATE TABLE table1
( id INT NOT NULL
, date_contacted DATE NOT NULL
, survey_id INT NOT NULL
, PRIMARY KEY(id,survey_id)
);
DROP TABLE IF EXISTS table2;
CREATE TABLE table2
(id INT NOT NULL PRIMARY KEY
,is_current TINYINT NOT NULL DEFAULT 0
);
INSERT INTO table1 VALUES
(1,"2015-03-05",846346),
(2,"2015-04-15",791796),
(2,"2015-05-04",846346),
(3,"2015-06-07",791796),
(3,"2015-06-08",846346),
(4,"2015-08-02",791796),
(5,"2015-10-15",791796),
(6,"2015-11-25",791796),
(6,"2016-01-02", 11235),
(6,"2016-02-06",846346);
INSERT INTO table2 (id,is_current) VALUES
(1,1),
(2,1),
(3,0),
(4,1),
(5,1),
(6,1);
The query:
SELECT x.*
, @a:=@a+a rt_a
, @b:=@b+b rt_b
FROM
( SELECT DATE_FORMAT(date_contacted,'%Y-%m') month
, SUM(survey_id = 791796) a
, SUM(survey_id = 846346) b
FROM table1 x
JOIN table2 y
ON y.id = x.id
WHERE y.is_current = 1
GROUP
BY month
) x
JOIN (SELECT @a:=0,@b:=0) vars
ORDER
BY month;
+---------+------+------+------+------+
| month | a | b | rt_a | rt_b |
+---------+------+------+------+------+
| 2015-03 | 0 | 1 | 0 | 1 |
| 2015-04 | 1 | 0 | 1 | 1 |
| 2015-05 | 0 | 1 | 1 | 2 |
| 2015-08 | 1 | 0 | 2 | 2 |
| 2015-10 | 1 | 0 | 3 | 2 |
| 2015-11 | 1 | 0 | 4 | 2 |
| 2016-01 | 0 | 0 | 4 | 2 |
| 2016-02 | 0 | 1 | 4 | 3 |
+---------+------+------+------+------+
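For comparison, here is a sketch that reproduces the table above without user variables, using the same data set in SQLite from Python. It swaps in a different technique: the running totals come from correlated subqueries over a CTE of monthly counts. That is an assumption of this example rather than part of the answer. The function names are hypothetical, and strftime replaces DATE_FORMAT.

```python
import sqlite3


def monthly_running_totals(conn):
    """Return (month, a, b, rt_a, rt_b): per-month counts plus running totals."""
    return conn.execute(
        """
        WITH monthly AS (
            SELECT strftime('%Y-%m', x.date_contacted) AS month,
                   SUM(x.survey_id = 791796) AS a,
                   SUM(x.survey_id = 846346) AS b
            FROM table1 AS x
            JOIN table2 AS y ON y.id = x.id
            WHERE y.is_current = 1
            GROUP BY month
        )
        SELECT m.month, m.a, m.b,
               (SELECT SUM(m2.a) FROM monthly AS m2 WHERE m2.month <= m.month) AS rt_a,
               (SELECT SUM(m2.b) FROM monthly AS m2 WHERE m2.month <= m.month) AS rt_b
        FROM monthly AS m
        ORDER BY m.month
        """
    ).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table1 (id INT NOT NULL, date_contacted TEXT NOT NULL,
                             survey_id INT NOT NULL, PRIMARY KEY (id, survey_id));
        CREATE TABLE table2 (id INT NOT NULL PRIMARY KEY, is_current INT NOT NULL DEFAULT 0);
        INSERT INTO table1 VALUES
            (1,'2015-03-05',846346), (2,'2015-04-15',791796), (2,'2015-05-04',846346),
            (3,'2015-06-07',791796), (3,'2015-06-08',846346), (4,'2015-08-02',791796),
            (5,'2015-10-15',791796), (6,'2015-11-25',791796), (6,'2016-01-02',11235),
            (6,'2016-02-06',846346);
        INSERT INTO table2 (id, is_current) VALUES (1,1), (2,1), (3,0), (4,1), (5,1), (6,1);
        """
    )
    return monthly_running_totals(conn)
```

On MySQL 8+, a window function such as SUM(a) OVER (ORDER BY month) would be the more usual way to express the same running total.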