Compare date in two year - mysql

I have a population table and I want to compare the population in two years.
My table structure:
id (auto increment), type (man, woman, child), population (1 to 10000), date
I want to run the two queries below and show the results in one table:
query1:
SELECT type,count(population) as count_of_year1
FROM population
where date between '2013-01-01' and '2013-01-24'
GROUP BY type
query2:
SELECT type, count(population) as count_of_year2
FROM population
where date between '2014-01-01' and '2014-01-24'
GROUP BY type
I need this result:
| Type | population in year 2013 | population in year 2014 |
How can I do this?

Use case expressions to do conditional counting:
SELECT type,
count(case when date between '2013-01-01' and '2013-01-24' then population end) as count_of_year1,
count(case when date between '2014-01-01' and '2014-01-24' then population end) as count_of_year2
FROM population
GROUP BY type
Add this where clause to speed things up if needed:
where date between '2013-01-01' and '2013-01-24'
or date between '2014-01-01' and '2014-01-24'
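To see where that WHERE clause goes, here is a minimal sketch of the conditional count run against a few made-up rows. It assumes MySQL 8+ (for the WITH clause); the sample values are hypothetical and only the column names come from the question.

```sql
-- Hypothetical sample rows standing in for the real population table.
WITH population (type, population, date) AS (
    SELECT 'man',   120, DATE '2013-01-05' UNION ALL
    SELECT 'man',   300, DATE '2014-01-10' UNION ALL
    SELECT 'woman', 150, DATE '2013-01-20' UNION ALL
    SELECT 'woman', 200, DATE '2014-01-02' UNION ALL
    SELECT 'child',  80, DATE '2014-01-15'
)
SELECT type,
       -- CASE yields NULL outside the range, and COUNT ignores NULLs
       COUNT(CASE WHEN date BETWEEN '2013-01-01' AND '2013-01-24' THEN population END) AS count_of_year1,
       COUNT(CASE WHEN date BETWEEN '2014-01-01' AND '2014-01-24' THEN population END) AS count_of_year2
FROM population
-- Optional filter: restricts the scan to the two periods of interest
WHERE date BETWEEN '2013-01-01' AND '2013-01-24'
   OR date BETWEEN '2014-01-01' AND '2014-01-24'
GROUP BY type;
```

With these rows, 'child' gets 0 for the first period and 1 for the second, while 'man' and 'woman' get 1 in each.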

As population can have a value from 1 to 10000, I assume you want SUM() here not COUNT().
I'd have a separate table for types:
population_type - id, title
population - id, type_id (references population_type.id), population, date
Then I prefer using JOINs here:
SELECT pt.title AS type,
       COALESCE(y1.total_population, 0) AS population_2013,
       COALESCE(y2.total_population, 0) AS population_2014
FROM population_type pt
LEFT JOIN (
    -- total per type for the 2013 period
    SELECT type_id,
           SUM(population) AS total_population
    FROM population
    WHERE date >= '2013-01-01'
      AND date < '2013-01-24' + INTERVAL 1 DAY
    GROUP BY type_id
) y1
    ON y1.type_id = pt.id
LEFT JOIN (
    -- total per type for the 2014 period
    SELECT type_id,
           SUM(population) AS total_population
    FROM population
    WHERE date >= '2014-01-01'
      AND date < '2014-01-24' + INTERVAL 1 DAY
    GROUP BY type_id
) y2
    ON y2.type_id = pt.id
This way you are only summing through what you need each time and the query is more modular.
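One side effect of joining from the types table is that a type with no rows in either period still shows up, with zeros. A small sketch of that, assuming MySQL 8+ and invented sample data (the two CTEs stand in for the real tables):

```sql
-- Hypothetical stand-ins for population_type and population.
WITH population_type (id, title) AS (
    SELECT 1, 'man' UNION ALL
    SELECT 2, 'woman' UNION ALL
    SELECT 3, 'child'
),
population (type_id, population, date) AS (
    SELECT 1, 120, DATE '2013-01-05' UNION ALL
    SELECT 1, 300, DATE '2014-01-10' UNION ALL
    SELECT 2, 150, DATE '2013-01-20' UNION ALL
    SELECT 2, 200, DATE '2014-01-02'
)
SELECT pt.title AS type,
       COALESCE(y1.total_population, 0) AS population_2013,
       COALESCE(y2.total_population, 0) AS population_2014
FROM population_type pt
LEFT JOIN (
    SELECT type_id, SUM(population) AS total_population
    FROM population
    WHERE date >= '2013-01-01' AND date < '2013-01-24' + INTERVAL 1 DAY
    GROUP BY type_id
) y1 ON y1.type_id = pt.id
LEFT JOIN (
    SELECT type_id, SUM(population) AS total_population
    FROM population
    WHERE date >= '2014-01-01' AND date < '2014-01-24' + INTERVAL 1 DAY
    GROUP BY type_id
) y2 ON y2.type_id = pt.id
ORDER BY pt.id;
```

Here 'child' has no population rows at all, so both LEFT JOINs find nothing and COALESCE turns the NULLs into 0.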

Related

Avg function not returning proper value

I expect this query to give me the average number of daily active users to date, grouped by month (from October to December). But the result is approximately 164K when it should be 128K. Why is AVG not working? The average should be the SUM of the values divided by the number of days in the month up to today.
SELECT sq.month_year AS 'month_year', AVG(number)
FROM
(
SELECT CONCAT(MONTHNAME(date), "-", YEAR(DATE)) AS 'month_year', count(distinct id_user) AS number
FROM table1
WHERE date between '2020-10-01' and '2020-12-31 23:59:59'
GROUP BY EXTRACT(year_month FROM date)
) sq
GROUP BY 1
OK, thanks for your help. The problem was that the subquery was pulling the data by month and not by day. I should pull the data by day there and group by month in the outer query. This finally worked:
SELECT sq.day_month, AVG(number)
FROM (SELECT date(date) AS day_month,
count(distinct id_user) AS number
FROM table_1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY 1
) sq
GROUP BY EXTRACT(year_month FROM day_month)
Do not use single quotes for column aliases!
SELECT sq.month_year, AVG(number)
FROM (SELECT CONCAT(MONTHNAME(date), '-', YEAR(DATE)) AS month_year,
count(distinct id_user) AS number
FROM table1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY month_year
) sq
GROUP BY 1;
Note the fixes to the query:
The GROUP BY uses the same columns as the SELECT. Your query should return an error (although it works in older versions of MySQL).
The date comparisons have been simplified.
No single quotes on column aliases.
Note that the outer query is not needed. I assume it is there just to illustrate the issue you are having.
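For reference, a minimal sketch of the "count per day first, then average per month" idea, assuming MySQL 8+. The rows and the names daily, activity_day and active_users are made up for illustration.

```sql
-- Hypothetical activity rows standing in for table1.
WITH table1 (id_user, date) AS (
    SELECT 1, TIMESTAMP '2020-10-01 08:00:00' UNION ALL
    SELECT 2, TIMESTAMP '2020-10-01 09:30:00' UNION ALL
    SELECT 1, TIMESTAMP '2020-10-02 10:00:00' UNION ALL
    SELECT 1, TIMESTAMP '2020-11-03 11:00:00' UNION ALL
    SELECT 2, TIMESTAMP '2020-11-03 12:00:00' UNION ALL
    SELECT 3, TIMESTAMP '2020-11-03 13:00:00'
),
daily AS (
    -- step 1: distinct users per calendar day
    SELECT DATE(date) AS activity_day,
           COUNT(DISTINCT id_user) AS active_users
    FROM table1
    WHERE date >= '2020-10-01' AND date < '2021-01-01'
    GROUP BY DATE(date)
)
-- step 2: average of the daily counts per month
SELECT EXTRACT(YEAR_MONTH FROM activity_day) AS ym,
       AVG(active_users) AS avg_daily_users
FROM daily
GROUP BY EXTRACT(YEAR_MONTH FROM activity_day)
ORDER BY ym;
```

With these rows October averages 1.5 (days with 2 and 1 users) and November averages 3. Keep in mind that days with no activity produce no row in daily, so this averages over active days only, which may differ from dividing by the number of calendar days.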

SUM subquery with condition depends on parent query columns returns NULL

Hi everyone!
I'm trying to calculate the sum of deal prices for each day. What I do:
SET @symbols_set = "A,B,C,D";
DROP TABLE IF EXISTS temp_deals;
CREATE TABLE temp_deals AS SELECT Deal, TimeMsc, Price, VolumeExt, Symbol FROM deals WHERE TimeMsc >= "2019-04-01" AND TimeMsc <= "2019-06-30" AND FIND_IN_SET(Symbol, @symbols_set) > 0;
SELECT
DATE_FORMAT(TimeMsc, "%d/%m/%Y") AS Date,
Symbol,
(SELECT SUM(Price) FROM temp_deals dap WHERE dap.TimeMsc BETWEEN Date AND Date + INTERVAL 1 DAY AND dap.Symbol = Symbol) AS AvgPrice
FROM temp_deals
ORDER BY Date;
DROP TABLE IF EXISTS temp_deals;
But in the result I get NULL in the AvgPrice column. I can't understand what I'm doing wrong.
It looks like I can't pass the parent query's columns to the subquery, am I right?
Qualify your column names. But mostly, don't use a string for comparing dates:
SELECT DATE_FORMAT(d.TimeMsc, '%d/%m/%Y') AS Date,
d.Symbol,
(SELECT SUM(dap.Price)
FROM temp_deals dap
WHERE dap.TimeMsc >= d.TimeMsc AND
dap.TimeMsc < d.TimeMsc + INTERVAL 2 DAY AND -- not sure if you want 1 day or 2 day
dap.Symbol = d.Symbol
) AS AvgPrice
FROM temp_deals d
ORDER BY d.TimeMsc;
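If the goal is simply one total per symbol per calendar day, a plain GROUP BY might be enough and avoids the correlated subquery altogether. This is an alternative to the query above, not the same thing; a sketch assuming MySQL 8+ with invented rows:

```sql
-- Hypothetical rows standing in for temp_deals.
WITH temp_deals (Deal, TimeMsc, Price, Symbol) AS (
    SELECT 1, TIMESTAMP '2019-04-01 09:00:00', 10.0, 'A' UNION ALL
    SELECT 2, TIMESTAMP '2019-04-01 15:30:00', 12.5, 'A' UNION ALL
    SELECT 3, TIMESTAMP '2019-04-01 10:00:00', 20.0, 'B' UNION ALL
    SELECT 4, TIMESTAMP '2019-04-02 11:00:00', 11.0, 'A'
)
SELECT DATE(TimeMsc) AS deal_date,   -- group on the date part, not on a formatted string
       Symbol,
       SUM(Price)    AS total_price
FROM temp_deals
GROUP BY DATE(TimeMsc), Symbol
ORDER BY deal_date, Symbol;
```

This returns one row per (day, symbol) instead of one row per deal; DATE_FORMAT can still be applied in the SELECT list for display.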

Getting total average between dates

I have a table named sales with the following format.
sale_id user_id sale_date sale_cost
j847bv-6ggd bd48ta36-cn5x 2017-01-10 15:43:12 30
vf87x2-15gr bd48ta36-cn5x 2017-01-05 13:41:16 60
3gfd7f-2cdd 8g4f5ccf-1fet 2017-01-15 14:10:12 100
4bgfd5-12vn 8g4f5ccf-1fet 2017-01-20 19:47:14 20
b58e32-bf87 8g4f5ccf-1fet 2017-01-20 17:35:13 15
bg87db-127g gr4gg1f4-3gbb 2017-01-20 12:26:15 80
How could I get the average amount that a user (user_id) spends within the first X days since their first purchase? I don't want an average for every user, but a total average for all.
I know that I can get the average using select avg(sale_cost) but I'm unsure how to find out the average for a date period.
You can find the average of the per-user totals within a 10-day range from each user's initial sale date like this:
select avg(sale_cost)
from (
select sum(t.sale_cost) sale_cost
from your_table t
join (
select user_id, min(sale_date) start_date, date_add(min(sale_date), interval 10 day) end_date
from your_table
group by user_id
) t2 on t.user_id = t2.user_id
and t.sale_date between t2.start_date and t2.end_date
group by t.user_id
) t;
It finds the first sale_date and date 10 days after this for each user. Then joins it with the table to get total for each user within that range and then finally average of the above calculated totals.
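As a worked check, here is a sketch of the same idea run on the six rows from the question, assuming MySQL 8+. The CTE names first_sale and per_user are made up, and it uses a half-open range (>= start, < start + 10 days) rather than BETWEEN, which is an assumption about how the boundary should behave.

```sql
WITH sales (sale_id, user_id, sale_date, sale_cost) AS (
    SELECT 'j847bv-6ggd', 'bd48ta36-cn5x', TIMESTAMP '2017-01-10 15:43:12', 30 UNION ALL
    SELECT 'vf87x2-15gr', 'bd48ta36-cn5x', TIMESTAMP '2017-01-05 13:41:16', 60 UNION ALL
    SELECT '3gfd7f-2cdd', '8g4f5ccf-1fet', TIMESTAMP '2017-01-15 14:10:12', 100 UNION ALL
    SELECT '4bgfd5-12vn', '8g4f5ccf-1fet', TIMESTAMP '2017-01-20 19:47:14', 20 UNION ALL
    SELECT 'b58e32-bf87', '8g4f5ccf-1fet', TIMESTAMP '2017-01-20 17:35:13', 15 UNION ALL
    SELECT 'bg87db-127g', 'gr4gg1f4-3gbb', TIMESTAMP '2017-01-20 12:26:15', 80
),
first_sale AS (
    -- each user's first purchase
    SELECT user_id, MIN(sale_date) AS start_date
    FROM sales
    GROUP BY user_id
),
per_user AS (
    -- each user's spend in the 10 days from their first purchase
    SELECT s.user_id, SUM(s.sale_cost) AS total_cost
    FROM sales s
    JOIN first_sale f ON f.user_id = s.user_id
    WHERE s.sale_date >= f.start_date
      AND s.sale_date <  f.start_date + INTERVAL 10 DAY
    GROUP BY s.user_id
)
SELECT AVG(total_cost) AS avg_spend_first_10_days
FROM per_user;
```

For this data the per-user totals are 90, 135 and 80, so the overall average comes out at about 101.67.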
If you want to find the average between overall first sale_date (not individual) and 10 days from it, use:
select avg(sale_cost)
from (
select sum(t.sale_cost) sale_cost
from your_table t
join (
select min(sale_date) start_date, date_add(min(sale_date), interval 10 day) end_date
from your_table
) t2 on t.sale_date between t2.start_date and t2.end_date
group by t.user_id
) t;
The BETWEEN operator comes in handy whenever you need to check ranges:
SELECT column_name(s)
FROM table_name
WHERE column_name BETWEEN value1 AND value2;
In this case value1 and value2 will be replaced by your dates, using either:
'2011-01-01 00:00:00' AND '2011-01-31 23:59:59'
or
sale_date AND DATE_ADD(sale_date, INTERVAL 10 DAY)
The first form compares against constants, so it is likely to be faster. Note also that BETWEEN is inclusive at both ends.
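Because BETWEEN is inclusive and a date-only string is treated as midnight, the end of the range needs some care with DATETIME columns. A small sketch comparing three ways of writing "January 2011", assuming MySQL 8+; the orders rows are invented:

```sql
-- Hypothetical rows: one at the start, one late on the last day, one just outside.
WITH orders (id, sale_date) AS (
    SELECT 1, TIMESTAMP '2011-01-01 00:00:00' UNION ALL
    SELECT 2, TIMESTAMP '2011-01-31 18:30:00' UNION ALL
    SELECT 3, TIMESTAMP '2011-02-01 00:00:00'
)
SELECT
    -- '2011-01-31' means 2011-01-31 00:00:00, so row 2 is missed
    SUM(sale_date BETWEEN '2011-01-01' AND '2011-01-31')                   AS between_dates_only,
    -- explicit end-of-day time catches row 2
    SUM(sale_date BETWEEN '2011-01-01 00:00:00' AND '2011-01-31 23:59:59') AS between_with_times,
    -- half-open range: no end-of-day arithmetic needed
    SUM(sale_date >= '2011-01-01' AND sale_date < '2011-02-01')            AS half_open
FROM orders;
```

This gives 1, 2 and 2 respectively. The half-open form also keeps working if the column stores fractional seconds.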

MySQL Join three different results and count

I have three results extracted from three different tables.
Each table is a product: loans, credits and discounts.
loans and credits have the following columns: clientid, type, productid, date and expiration (days to go).
discounts has: clientid, date and expiration.
Each result is, for every client, the number of products (count) that expire in 10 days or less and were registered between two dates.
Example (just for loans):
SELECT clientid, COUNT(*)
FROM loans
WHERE ((type LIKE 'TITULAR') AND(date BETWEEN 'ccyy-mm-dd' AND 'ccyy-mm-dd') AND (expires <= 10))
GROUP BY clientid
ORDER BY clientid;
Obviously, not all clients have loans, credits and discounts at the same time, but I need a result that sums the number of times any client has any of the products expiring in 10 days or less between the limit dates.
So, for example, if client #200 has 3 loans, 2 credits and just one discount, all of them between date1 and date2 and expiring in 10 days or less, the result should be 6.
So far I've tried:
SELECT loansr.clienteid, (loansr.count + creditsr.count + discountsr.count)
FROM
(SELECT clienteid, COUNT(*) AS "count"
FROM loans
WHERE (type LIKE 'TITULAR')
AND (date BETWEEN '2009-08-01' AND '2009-10-30')
AND (expires <= 10)
GROUP BY clienteid) loansr,
(SELECT clienteid, COUNT(*) AS "count"
FROM credits
WHERE (type LIKE 'TITULAR')
AND (date BETWEEN '2009-08-01' AND '2009-10-30')
AND (expires <= 10)
GROUP BY clienteid) creditsr,
(SELECT clienteid, COUNT(*) AS "count"
FROM discounts
WHERE (date BETWEEN '2009-08-01' AND '2009-10-30')
AND (expires <= 10)
GROUP BY clienteid) discountsr
WHERE
(loansr.clienteid = creditsr.clienteid = discountsr.clienteid)
ORDER BY loansr.clienteid;
Edit 18:25
I think that if I use UNION ALL to combine the three results and then group by clienteid, I will get what I'm looking for, won't I?
SELECT clienteid AS "CLIENTE", SUM(COUNT) AS "NUM_VECES_INCI_10_ACT_U3M" FROM
((SELECT clienteid, COUNT(*) AS "COUNT"
FROM loans
WHERE (titularidad_tipo LIKE 'TITULAR')
AND (date BETWEEN '2009-08-01' AND '2009-10-30')
AND (expires >= 11)
GROUP BY clienteid)
UNION ALL
(SELECT clienteid, COUNT(*) AS "COUNT"
FROM credits
WHERE (titularidad_tipo LIKE 'TITULAR')
AND (date BETWEEN '2009-08-01' AND '2009-10-30')
AND (expires >= 11)
GROUP BY clienteid)
UNION ALL
(SELECT clienteid, COUNT(*) AS "COUNT"
FROM discounts
WHERE (date BETWEEN '2009-08-01' AND '2009-10-30')
AND (expires >= 11)
GROUP BY clienteid)) orig
GROUP BY clienteid
ORDER BY clienteid;
I'd post it in the comment if I could :)
If you use UNION ALL, you should get the desired results. Just make sure to have proper indexes: I suggest (titularidad_tipo, date, expires) for the loans and credits tables, and (date, expires) for the discounts table. With proper indexing, your results should come back quickly.
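To illustrate why UNION ALL followed by GROUP BY gives the combined total even for clients missing from one of the tables, here is a stripped-down sketch. It assumes MySQL 8+, leaves out the type/date/expires filters, and uses invented rows:

```sql
-- Hypothetical rows: client 200 has 3 loans, 2 credits, 1 discount;
-- client 300 has a single credit and nothing else.
WITH loans (clienteid) AS (
    SELECT 200 UNION ALL SELECT 200 UNION ALL SELECT 200
),
credits (clienteid) AS (
    SELECT 200 UNION ALL SELECT 200 UNION ALL SELECT 300
),
discounts (clienteid) AS (
    SELECT 200
)
SELECT clienteid, SUM(n) AS total
FROM (
    SELECT clienteid, COUNT(*) AS n FROM loans     GROUP BY clienteid
    UNION ALL
    SELECT clienteid, COUNT(*) AS n FROM credits   GROUP BY clienteid
    UNION ALL
    SELECT clienteid, COUNT(*) AS n FROM discounts GROUP BY clienteid
) orig
GROUP BY clienteid
ORDER BY clienteid;
```

Client 200 comes out with 6 and client 300 with 1. An inner join of the three per-table results would drop client 300, because that client has no rows in loans or discounts.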

Calculating a Moving Average MySQL?

Good Day,
I am using the following code to calculate the 9 Day Moving average.
SELECT SUM(close)
FROM tbl
WHERE date <= '2002-07-05'
AND name_id = 2
ORDER BY date DESC
LIMIT 9
But it does not work because it first calculates all of the returned fields before the limit is called. In other words it will calculate all the closes before or equal to that date, and not just the last 9.
So I need to calculate the SUM from the returned select, rather than calculate it straight.
IE. Select the SUM from the SELECT...
Now how would I go about doing this and is it very costly or is there a better way?
If you want the moving average for each date, then try this:
SELECT date, SUM(close),
(select avg(close) from tbl t2 where t2.name_id = t.name_id and t2.date <= t.date and datediff(t.date, t2.date) < 9
) as mvgAvg
FROM tbl t
WHERE date <= '2002-07-05' and
name_id = 2
GROUP BY date
ORDER BY date DESC
It uses a correlated subquery to calculate the average over the 9 calendar days ending at each date. Without the t2.date <= t.date condition, every earlier row would be included, because datediff is negative for them.
Starting from MySQL 8, you should use window functions for this. Using the window RANGE clause, you can create a logical window over an interval, which is very powerful. Something like this:
SELECT
date,
close,
AVG (close) OVER (ORDER BY date DESC RANGE INTERVAL 9 DAY PRECEDING)
FROM tbl
WHERE date <= DATE '2002-07-05'
AND name_id = 2
ORDER BY date DESC
For example:
WITH t (date, `close`) AS (
SELECT DATE '2020-01-01', 50 UNION ALL
SELECT DATE '2020-01-03', 54 UNION ALL
SELECT DATE '2020-01-05', 51 UNION ALL
SELECT DATE '2020-01-12', 49 UNION ALL
SELECT DATE '2020-01-13', 59 UNION ALL
SELECT DATE '2020-01-15', 30 UNION ALL
SELECT DATE '2020-01-17', 35 UNION ALL
SELECT DATE '2020-01-18', 39 UNION ALL
SELECT DATE '2020-01-19', 47 UNION ALL
SELECT DATE '2020-01-26', 50
)
SELECT
date,
`close`,
COUNT(*) OVER w AS c,
SUM(`close`) OVER w AS s,
AVG(`close`) OVER w AS a
FROM t
WINDOW w AS (ORDER BY date DESC RANGE INTERVAL 9 DAY PRECEDING)
ORDER BY date DESC
Leading to:
date |close|c|s |a |
----------|-----|-|---|-------|
2020-01-26| 50|1| 50|50.0000|
2020-01-19| 47|2| 97|48.5000|
2020-01-18| 39|3|136|45.3333|
2020-01-17| 35|4|171|42.7500|
2020-01-15| 30|4|151|37.7500|
2020-01-13| 59|5|210|42.0000|
2020-01-12| 49|6|259|43.1667|
2020-01-05| 51|3|159|53.0000|
2020-01-03| 54|3|154|51.3333|
2020-01-01| 50|3|155|51.6667|
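Note that a RANGE frame with an interval counts calendar days, not rows. If what you are after is the last 9 rows regardless of gaps between dates (closer to the LIMIT 9 attempt in the question), a ROWS frame may fit better. A sketch assuming MySQL 8+, reusing the sample data above; the alias avg_last_9_rows is made up:

```sql
WITH t (date, `close`) AS (
    SELECT DATE '2020-01-01', 50 UNION ALL
    SELECT DATE '2020-01-03', 54 UNION ALL
    SELECT DATE '2020-01-05', 51 UNION ALL
    SELECT DATE '2020-01-12', 49 UNION ALL
    SELECT DATE '2020-01-13', 59 UNION ALL
    SELECT DATE '2020-01-15', 30 UNION ALL
    SELECT DATE '2020-01-17', 35 UNION ALL
    SELECT DATE '2020-01-18', 39 UNION ALL
    SELECT DATE '2020-01-19', 47 UNION ALL
    SELECT DATE '2020-01-26', 50
)
SELECT date,
       `close`,
       -- current row plus up to 8 earlier rows, however far apart the dates are
       COUNT(*)     OVER w AS rows_in_frame,
       AVG(`close`) OVER w AS avg_last_9_rows
FROM t
WINDOW w AS (ORDER BY date ROWS BETWEEN 8 PRECEDING AND CURRENT ROW)
ORDER BY date DESC;
```

The first eight dates have fewer than 9 rows available, so rows_in_frame shows how many values each average is actually based on.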
Use something like
SELECT
sum(close) as sum,
avg(close) as average
FROM (
SELECT
(close)
FROM
tbl
WHERE
date <= '2002-07-05'
AND name_id = 2
ORDER BY
date DESC
LIMIT 9 ) temp
The inner query returns all filtered rows in desc order, and then you avg, sum up those rows returned.
The reason why the query given by you doesn't work is due to the fact that the sum is calculated first and the LIMIT clause is applied after the sum has already been calculated, giving you the sum of all the rows present
Another technique is to create a helper table of numbers:
CREATE TABLE `tinyint_asc` (
`value` tinyint(3) unsigned NOT NULL default '0',
PRIMARY KEY (value)
) ;
INSERT INTO `tinyint_asc` VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41),(42),(43),(44),(45),(46),(47),(48),(49),(50),(51),(52),(53),(54),(55),(56),(57),(58),(59),(60),(61),(62),(63),(64),(65),(66),(67),(68),(69),(70),(71),(72),(73),(74),(75),(76),(77),(78),(79),(80),(81),(82),(83),(84),(85),(86),(87),(88),(89),(90),(91),(92),(93),(94),(95),(96),(97),(98),(99),(100),(101),(102),(103),(104),(105),(106),(107),(108),(109),(110),(111),(112),(113),(114),(115),(116),(117),(118),(119),(120),(121),(122),(123),(124),(125),(126),(127),(128),(129),(130),(131),(132),(133),(134),(135),(136),(137),(138),(139),(140),(141),(142),(143),(144),(145),(146),(147),(148),(149),(150),(151),(152),(153),(154),(155),(156),(157),(158),(159),(160),(161),(162),(163),(164),(165),(166),(167),(168),(169),(170),(171),(172),(173),(174),(175),(176),(177),(178),(179),(180),(181),(182),(183),(184),(185),(186),(187),(188),(189),(190),(191),(192),(193),(194),(195),(196),(197),(198),(199),(200),(201),(202),(203),(204),(205),(206),(207),(208),(209),(210),(211),(212),(213),(214),(215),(216),(217),(218),(219),(220),(221),(222),(223),(224),(225),(226),(227),(228),(229),(230),(231),(232),(233),(234),(235),(236),(237),(238),(239),(240),(241),(242),(243),(244),(245),(246),(247),(248),(249),(250),(251),(252),(253),(254),(255);
You can then use it like this:
select
date_add(tbl.date, interval tinyint_asc.value day) as mydate,
count(*),
sum(myvalue)
from tbl
inner join tinyint_asc on tinyint_asc.value < 30 -- offsets 0 to 29, for a 30 day moving window
where date( date_add(tbl.date, interval tinyint_asc.value day ) ) between '2016-01-01' and current_date()
group by mydate
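On MySQL 8+ the numbers 0 to 255 could also be generated on the fly with a recursive CTE instead of being stored in a table. This is a different technique from the helper table above; seq and n are made-up names:

```sql
WITH RECURSIVE seq (n) AS (
    SELECT 0                               -- anchor row
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 255    -- stop once 255 has been produced
)
SELECT COUNT(*) AS row_count, MIN(n) AS lowest, MAX(n) AS highest
FROM seq;
```

This returns 256 rows' worth of values (0 through 255), well under the default recursion limit. A stored table with a primary key may still be the better choice if the sequence is joined against often.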
This query is fast:
select date, name_id,
case @i when name_id then @i:=name_id else (@i:=name_id)
and (@n:=0)
and (@a0:=0) and (@a1:=0) and (@a2:=0) and (@a3:=0) and (@a4:=0) and (@a5:=0) and (@a6:=0) and (@a7:=0) and (@a8:=0)
end as a,
case @n when 9 then @n:=9 else @n:=@n+1 end as n,
@a0:=@a1,@a1:=@a2,@a2:=@a3,@a3:=@a4,@a4:=@a5,@a5:=@a6,@a6:=@a7,@a7:=@a8,@a8:=close,
(@a0+@a1+@a2+@a3+@a4+@a5+@a6+@a7+@a8)/@n as av
from tbl,
(select @i:=0, @n:=0,
@a0:=0, @a1:=0, @a2:=0, @a3:=0, @a4:=0, @a5:=0, @a6:=0, @a7:=0, @a8:=0) a
where name_id=2
order by name_id, date
If you need an average over 50 or 100 values, it's tedious to write, but
worth the effort. The speed is close to the ordered select.