MySQL - Full outer join on same table using COUNT - mysql

I am trying to generate a table in the following format.
Proday | 2014-04-01 | 2014-03-01
--------------------------------
1 | 12 | 17
2 | 6 | 0
7 | 0 | 24
13 | 3 | 7
Prodays (the duration between two timestamps) is a calculated value, and the data for each month is a COUNT. I can output the data for a single month, but I am having trouble joining queries for additional months. The index (prodays) may not match for each month, e.g. 2014-04-01 may not have any data for Proday 7, whereas 2014-03-01 may not have Proday 2. Missing values should be indicated with 0 or NULL.
I suspect FULL OUTER JOIN is what should do the trick, but I have read that it is not possible in MySQL?
This is the query to get data for a single month:
SELECT round((protime - createtime) / 86400) AS prodays, COUNT(id) AS '2014-04-01'
FROM `tbl_users` as t1
WHERE status = 1 AND DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') >= '2014-04-01'
AND DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') <= LAST_DAY('2014-04-01')
GROUP BY prodays
ORDER BY `prodays` ASC
How can I join/union an additional query to create a column for 2014-03-01?

You want to use conditional aggregation -- that is, move the filtering logic from the where clause to the select clause:
SELECT round((protime - createtime) / 86400) AS prodays,
sum(DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') >= '2014-04-01' AND
DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') <= LAST_DAY('2014-04-01')
) as `2014-04-01`,
sum(DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') >= '2014-03-01' AND
DATE_FORMAT(FROM_UNIXTIME(createtime),'%Y-%m-%d') <= LAST_DAY('2014-03-01')
) as `2014-03-01`
FROM `tbl_users` as t1
WHERE status = 1
GROUP BY prodays
ORDER BY `prodays` ASC;
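To see the conditional aggregation idea in action, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here: the sample rows are made up, `strftime(..., 'unixepoch')` replaces `DATE_FORMAT(FROM_UNIXTIME(...))`, and the division uses `86400.0` because SQLite would otherwise do integer division.

```python
import sqlite3
from datetime import datetime, timezone

DAY = 86400

def ts(day):
    """Unix timestamp (UTC midnight) for a 'YYYY-MM-DD' string."""
    parsed = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_users (id INTEGER PRIMARY KEY, status INTEGER,"
    " createtime INTEGER, protime INTEGER)"
)
# (status, creation date, days until protime): hypothetical sample rows
sample = [
    (1, "2014-04-03", 1),
    (1, "2014-04-10", 1),
    (1, "2014-04-12", 2),
    (1, "2014-03-05", 1),
    (1, "2014-03-20", 7),
    (0, "2014-03-21", 7),  # status 0, removed by the WHERE clause
]
conn.executemany(
    "INSERT INTO tbl_users (status, createtime, protime) VALUES (?, ?, ?)",
    [(s, ts(d), ts(d) + n * DAY) for s, d, n in sample],
)

# Each month becomes one SUM over a 0/1 condition instead of a WHERE filter.
rows = conn.execute("""
    SELECT CAST(ROUND((protime - createtime) / 86400.0) AS INTEGER) AS prodays,
           SUM(strftime('%Y-%m', createtime, 'unixepoch') = '2014-04') AS "2014-04-01",
           SUM(strftime('%Y-%m', createtime, 'unixepoch') = '2014-03') AS "2014-03-01"
    FROM tbl_users
    WHERE status = 1
    GROUP BY prodays
    ORDER BY prodays
""").fetchall()

for row in rows:
    print(row)
```

Every proday that occurs in either month gets a row, and a month without matching users shows 0, which is the effect a FULL OUTER JOIN of two monthly queries would have produced.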

Related

Mysql: Subtraction between rows and sum with other table

I have two tables, both with a Time column as timestamp type which is filled by default when the row is created: Table1 is updated approximately every 10 seconds:
Time | Val_1a | Val_2a | Val_3a
2021-11-06 13:59:53 | 15 | 10 | 35
2021-11-06 14:00:02 | 12 | 15 | 34
.................
2021-11-06 14:05:25 | 11 | 13 | 35
2021-11-06 14:05:35 | 11 | 17 | 36
Table2 is updated every hour after mathematical operations on table1:
Time | Var_1b | Var_2b | Var_3b
2021-11-06 11:00:00 | 2 | 15 | 30
2021-11-06 12:00:00 | 8 | 12 | 32
2021-11-06 13:00:00 | 12 | 11 | 35
What I would like to get but I'm not able to do in any way, is:
1) Check that the last table1.Val_2a value is greater than the first table1.Val_2a value written at the beginning of the current hour (with the tables above, check if 17 > 15). If this condition is not met, the entire query must return 0; otherwise:
2a) If the last row in table2 refers to the previous day, then the query result is simply the difference of the two table1.Val_2a values (17 - 15 = 2)
2b) Otherwise their difference is calculated as at point 2a (17-15 = 2) and it is added to the table2.Var_1b value (2 + 12 = 14)
I hope I was able to explain it clearly, and that all of this is possible with a single query. Thanks everyone for the support.
Sorry for adding this as an answer, but I couldn't add the image to a comment.
This is the query I used to test the CASE clause:
SELECT t1.dtm, t1.Val_2a2, t1.Val_2a1,
CASE WHEN Val_2a2 > Val_2a1
THEN Val_2a2-Val_2a1 ELSE 0 END AS ValF FROM (SELECT DATE_FORMAT(time, '%Y-%m-%d %H:00:00') dtm,
SUBSTRING_INDEX(GROUP_CONCAT(Val_2a ORDER BY time),',',1) Val_2a1,
SUBSTRING_INDEX(GROUP_CONCAT(Val_2a ORDER BY time DESC),',',1) Val_2a2 FROM table1 GROUP BY dtm) t1
and this is the unexpected result:
[image: query result]
It is possible in a single query, but different people will have different methods of doing it. Whatever the method is, I personally think that the most important part is to keep the logic intact. The details you've provided in your question lead me to assume that this might be the kind of query you're looking for:
SELECT t1.dtm, t1.Val_2a2, t1.Val_2a1, t2.Val_1b2,
CASE WHEN Val_2a2 > Val_2a1
THEN Val_2a2-Val_2a1+Val_1b2 ELSE 0 END AS ValF
FROM
(SELECT DATE_FORMAT(time, '%Y-%m-%d %H:00:00') dtm,
SUBSTRING_INDEX(GROUP_CONCAT(Val_2a ORDER BY time),',',1) Val_2a1 ,
SUBSTRING_INDEX(GROUP_CONCAT(Val_2a ORDER BY time DESC),',',1) Val_2a2
FROM table1
GROUP BY dtm) t1
LEFT JOIN
(SELECT DATE(time) dtm,
SUBSTRING_INDEX(GROUP_CONCAT(Val_1b ORDER BY time DESC),',',1) Val_1b2
FROM table2
GROUP BY dtm) t2
ON DATE(t1.dtm)=t2.dtm;
Demo fiddle
Hoping it can help someone else: after some more testing, this is the final query I got, considering that I just need a value on the fly without needing to store it.
Of course, any comments from the experts are more than appreciated.
Thanks to all.
SELECT
CASE WHEN
(ABS(t1.Val_2a2) - ABS(t1.Val_2a1)) BETWEEN 0 AND 30
THEN t1.Val_2a2-t1.Val_2a1+t2.Val_1b2
ELSE t2.Val_1b2
END AS My_result
FROM
(SELECT DATE_FORMAT(Time, '%Y-%m-%d %H:00:00') dtm,
(SELECT Val_2a FROM table1 WHERE Time >= DATE_FORMAT(NOW(),"%Y-%m-%d %H:00:00") ORDER BY Time LIMIT 1) Val_2a1,
(SELECT Val_2a FROM table1 WHERE Time >= DATE_FORMAT(NOW(),"%Y-%m-%d %H:00:00") ORDER BY Time DESC LIMIT 1) Val_2a2
FROM table1
GROUP BY dtm
ORDER BY Time DESC LIMIT 1) t1
LEFT JOIN
(SELECT (Time) dtm,
(Val_1b) Val_1b2
FROM table2
GROUP BY dtm ORDER BY dtm DESC LIMIT 1) t2
ON DATE(t1.dtm)= DATE(t2.dtm)

MySQL: select entries with a certain count within a certain period

I have a MySQL table with a datetime column. How can I find all groups with at least 5 entries within 10 minutes?
My only idea is to write a program (in whatever language) and loop over the timestamps, check always 5 (..) successive entries, calculate the time span between the last and the first and check whether it is below the limit.
Can this be done using a single SQL query too?
(The scenario is simplified and the numbers are just examples.)
As requested, here comes an example:
id | timestamp | other_column
---|---------------------|-------------
3 | 2017-01-01 11:00:00 | thank
2 | 2017-01-01 11:01:00 | you
1 | 2017-01-01 11:02:00 | for
* 6 | 2017-01-01 11:20:00 | your
* 5 | 2017-01-01 11:21:00 | efforts
* 4 | 2017-01-01 11:22:00 | to
* 7 | 2017-01-01 11:23:00 | help
* 8 | 2017-01-01 11:24:00 | me
9 | 2017-01-01 11:40:00 | :
10 | 2017-01-01 11:41:00 | )
If the count limit is 5 and the timespan limit is 10 minutes, I'd like to get the entries marked with "*". The "id" column is the primary key of the table, but the order is not always the order of the timestamps. The "other_column" is used for a where clause. The table has about 1 million entries.
Try to break this down logically.
select t1.id, t1.timestamp, t2.timestamp
from yourtable t1
inner join yourtable t2 on t2.timestamp >= t1.timestamp and t2.timestamp < t1.timestamp + interval 10 minute
This gives you a relatively giant list of every ID joined to every other ID within the following 10 minutes (including one row for itself). At this point each t1 row is only treated as the first row of a potential group; it is easier to grab the 'header row' here and worry about the rest in the next step. If we group by the ID and time, we get a count of how many rows fall within those 10 minutes:
select t1.id, t1.timestamp, count(1)
from yourtable t1
inner join yourtable t2 on t2.timestamp >= t1.timestamp and t2.timestamp < t1.timestamp + interval 10 minute
group by t1.id, t1.timestamp
having count(1) > 4
This now gives you a list of every ID, with its timestamp, that has itself and 4 or more other records within 10 minutes after that timestamp. From here it depends on how you want to group. If you want each of the 5 lines, we can use the query above as a subquery and join it back to the main table to get the rows you want returned:
-- distinct, because a row can fall into more than one qualifying window
select distinct t3.*
from
(select t1.id, t1.timestamp, count(1)
from yourtable t1
inner join yourtable t2
on t2.timestamp >= t1.timestamp and t2.timestamp < t1.timestamp + interval 10 minute
group by t1.id, t1.timestamp
having count(1) > 4) a
inner join yourtable t3 on t3.timestamp >= a.timestamp and t3.timestamp < a.timestamp + interval 10 minute
And that should give you IDs 4-8 and their info returned (order as you see fit).
My apologies that I don't have the time to test, but the logic should work.
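Since the answer was posted untested, here is a small sketch that checks the self-join logic against the sample rows from the question. It uses Python's `sqlite3` module as a stand-in for MySQL, with the question's limits (5 entries, 10 minutes). The table name `events`, the column name `ts`, and SQLite's `datetime(x, '+10 minutes')` in place of MySQL interval arithmetic are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, ts TEXT, other_column TEXT)"
)
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    (3, "2017-01-01 11:00:00", "thank"),
    (2, "2017-01-01 11:01:00", "you"),
    (1, "2017-01-01 11:02:00", "for"),
    (6, "2017-01-01 11:20:00", "your"),
    (5, "2017-01-01 11:21:00", "efforts"),
    (4, "2017-01-01 11:22:00", "to"),
    (7, "2017-01-01 11:23:00", "help"),
    (8, "2017-01-01 11:24:00", "me"),
    (9, "2017-01-01 11:40:00", ":"),
    (10, "2017-01-01 11:41:00", ")"),
])

rows = conn.execute("""
    SELECT DISTINCT t3.id, t3.ts
    FROM (
        -- "header rows": rows with at least 5 rows (itself included)
        -- in the 10 minutes starting at their own timestamp
        SELECT t1.id, t1.ts
        FROM events t1
        JOIN events t2
          ON t2.ts >= t1.ts
         AND t2.ts < datetime(t1.ts, '+10 minutes')
        GROUP BY t1.id, t1.ts
        HAVING COUNT(*) >= 5
    ) a
    -- join back to fetch every row inside a qualifying window
    JOIN events t3
      ON t3.ts >= a.ts
     AND t3.ts < datetime(a.ts, '+10 minutes')
    ORDER BY t3.ts
""").fetchall()

ids = [row[0] for row in rows]
print(ids)
```

On this data only the 11:20 row qualifies as a header, so the join back returns the five starred rows. Note that the self-join compares every row with every other row; on a table with about 1 million entries an index on the timestamp column would likely matter, though that is not measured here.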

mysql: get daily average price of product table with version

My products table:
ProductId(inc.key) | Price | VersionCreatedDate | MainProductId
1 | 15 | 1-11-2016 | 1
2 | 20 | 1-11-2016 | 2
3 | 30 | 1-11-2016 | 3
4 | 10 | 2-11-2016 | 1 -> mainProductId 1 changed price(-5$)
5 | 20 | 3-11-2016 | 3 -> mainProductId 3 changed price(-10$)
6 | 30 | 4-11-2016 | 3 -> mainProductId 3 changed price(+10$)
I want to display the output like this:
Date | AvgPrice
1-11-2016 | 21.67 ((15+20+30)/3)
2-11-2016 | 20 ((10+20+30)/3)
3-11-2016 | 16.67 ((10+20+20)/3)
4-11-2016 | 20 ((10+20+30)/3)
How do I get the output with sql code?
This assumes you have a calendar table with all the dates you need, and a main_products table with MainProductId as primary/unique key. The following query should return average prices for every day in October 2016.
select sub.date, avg(sub.Price) as Price
from (
select
c.date,
m.MainProductId,
(
select p.Price
from products p
where p.MainProductId = m.MainProductId
and p.VersionCreatedDate < c.date + interval 1 day
order by p.VersionCreatedDate desc
limit 1
) as Price
from callendar c
cross join main_products m
where c.date between '2016-10-01' and '2016-10-31'
) sub
group by sub.date
order by sub.date
The subquery (the derived table aliased as sub) returns the combination of all dates in the range and all "main products" from the main_products table. The most recent price of each "main product" for a specific date is determined in the subselect (a correlated subquery in the SELECT clause) using ORDER BY and LIMIT 1. This allows us to group the subquery result by date and calculate the average price per date.
It is even possible to eliminate the derived table and hope that MySQL can use an index to GROUP BY date instead of working on a temp table:
select c.date, avg((
select p.Price
from products p
where p.MainProductId = m.MainProductId
and p.VersionCreatedDate < c.date + interval 1 day
order by p.VersionCreatedDate desc
limit 1
)) as Price
from callendar c
cross join main_products m
where c.date between '2016-10-01' and '2016-10-31'
group by c.date
order by c.date
I have no clue whether that query can be executed efficiently (in particular, whether MySQL can do so). You should, however, have at least the following indexes: callendar(date), products(MainProductId, VersionCreatedDate)
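As a rough check of the "latest version per day" logic, the derived-table form can be run against the question's sample rows. This sketch uses Python's `sqlite3` module rather than MySQL and makes a few assumptions: the dates are read as day-month-year and stored in ISO form, the calendar table is named `calendar` with a column `day`, and the list of main products is derived with `SELECT DISTINCT` instead of a separate main_products table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        ProductId INTEGER PRIMARY KEY,
        Price INTEGER,
        VersionCreatedDate TEXT,
        MainProductId INTEGER
    );
    INSERT INTO products VALUES
        (1, 15, '2016-11-01', 1),
        (2, 20, '2016-11-01', 2),
        (3, 30, '2016-11-01', 3),
        (4, 10, '2016-11-02', 1),
        (5, 20, '2016-11-03', 3),
        (6, 30, '2016-11-04', 3);
    CREATE TABLE calendar (day TEXT PRIMARY KEY);
    INSERT INTO calendar VALUES
        ('2016-11-01'), ('2016-11-02'), ('2016-11-03'), ('2016-11-04');
""")

rows = conn.execute("""
    SELECT sub.day, AVG(sub.Price) AS AvgPrice
    FROM (
        SELECT c.day,
               -- latest version of this main product on or before the day
               (SELECT p.Price
                FROM products p
                WHERE p.MainProductId = m.MainProductId
                  AND p.VersionCreatedDate <= c.day
                ORDER BY p.VersionCreatedDate DESC
                LIMIT 1) AS Price
        FROM calendar c
        CROSS JOIN (SELECT DISTINCT MainProductId FROM products) m
    ) sub
    GROUP BY sub.day
    ORDER BY sub.day
""").fetchall()

for day, avg_price in rows:
    print(day, round(avg_price, 2))
```

The four averages match the ones asked for in the question (21.67, 20, 16.67, 20). A main product with no version on or before a given day yields NULL in the subselect, and AVG ignores NULLs, so such a product simply does not count for that day.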

MySQL function IFNULL not working with GROUP BY

I've got a standard table with the list of users and I've got a column lastactivity with UNIX Timestamp (which shows when they have logged in) and column timestamp with UNIX Timestamp that shows when they have registered.
I've built a SQL query that shows how many users were active within 24 hours (86400 seconds) from now, with the results grouped by week, so the counter counts how many of those users registered in each week:
SELECT
IFNULL(COUNT(*),0) as `counter`,
(WEEK(`timestamp`)) as `week`
FROM
`clients`
WHERE
(CAST(UNIX_TIMESTAMP() as signed) - CAST(`lastactivity` as signed)) <= 86400
GROUP BY
WEEK(`timestamp`);
The issue is that function IFNULL(COUNT(*),0) is not working as I intended. This SQL query won't display the week if there is NULL / 0 on the counter even with IFNULL() MySQL function. That is probably because of how GROUP BY works. So for example I will get this kind of result:
counter | week
2 | 11
1 | 13
9 | 14
6 | 17
But I would like to show each week like this:
counter | week
2 | 11
0 | 12
1 | 13
9 | 14
0 | 15
0 | 16
6 | 17
Does anyone have an idea how I can fix this issue?
Gordon is trying to help me with a LEFT JOIN query, but I still get the same results. Maybe I am doing something wrong here:
SELECT
COUNT(a.id) as `counter`,
(WEEK(b.timestamp)) as `week`
FROM
`users` a
LEFT JOIN
`users` b
ON
a.id = b.id
WHERE
(CAST(UNIX_TIMESTAMP() as signed) - CAST(a.lastactivity as signed)) <= 86400
GROUP BY
WEEK(b.timestamp);
The problem is that you don't understand how the query works. IFNULL() (or the standard version, COALESCE()) converts a column value that is NULL to some other value. However, COUNT() never returns NULL. So, leave it out:
SELECT COUNT(*) as `counter`, WEEK(`timestamp`) as `week`
FROM `clients`
WHERE (CAST(UNIX_TIMESTAMP() as signed) - CAST(`lastactivity` as signed)) <= 86400
GROUP BY WEEK(`timestamp`);
Your problem is missing rows, not NULL values. You would have to solve this with a LEFT JOIN.
EDIT:
You need a left join to include all the weeks:
SELECT COUNT(c.`timestamp`) as `counter`, w.wk as `week`
FROM (SELECT 11 as wk UNION ALL
SELECT 12 UNION ALL
SELECT 13 UNION ALL
SELECT 14 UNION ALL
SELECT 15 UNION ALL
SELECT 16 UNION ALL
SELECT 17
) w LEFT JOIN
`clients` c
-- the activity filter belongs in ON; in WHERE it would discard the weeks without matching clients
ON WEEK(c.`timestamp`) = w.wk AND
(CAST(UNIX_TIMESTAMP() as signed) - CAST(c.`lastactivity` as signed)) <= 86400
GROUP BY w.wk;
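One detail of the LEFT JOIN approach is easy to get wrong: where the activity filter is placed. The sketch below compares the two placements on made-up data, using Python's `sqlite3` module as a stand-in for MySQL. To keep it dialect-neutral, the registration week is stored in a hypothetical `reg_week` column instead of being computed with WEEK(), the week list comes from a recursive CTE, and the current time is a fixed constant `NOW`.

```python
import sqlite3

NOW = 1_400_000_000  # fixed "current time" so the result is reproducible
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clients (id INTEGER PRIMARY KEY,"
    " reg_week INTEGER, lastactivity INTEGER)"
)
active, stale = NOW - 3600, NOW - 10 * 86400
conn.executemany(
    "INSERT INTO clients (reg_week, lastactivity) VALUES (?, ?)",
    [(11, active), (11, active), (12, stale), (13, active),
     (14, active), (14, active), (17, active)],
)

# Generates the weeks 11..17, one row per week.
WEEKS = """
    WITH RECURSIVE weeks(wk) AS (
        SELECT 11 UNION ALL SELECT wk + 1 FROM weeks WHERE wk < 17
    )
"""

# Filter in ON: weeks without active clients survive with a count of 0.
filter_in_on = conn.execute(WEEKS + """
    SELECT COUNT(c.id) AS counter, w.wk AS week
    FROM weeks w
    LEFT JOIN clients c
           ON c.reg_week = w.wk
          AND (:now - c.lastactivity) <= 86400
    GROUP BY w.wk
    ORDER BY w.wk
""", {"now": NOW}).fetchall()

# Filter in WHERE: the NULL-extended rows fail the test and are dropped.
filter_in_where = conn.execute(WEEKS + """
    SELECT COUNT(c.id) AS counter, w.wk AS week
    FROM weeks w
    LEFT JOIN clients c ON c.reg_week = w.wk
    WHERE (:now - c.lastactivity) <= 86400
    GROUP BY w.wk
    ORDER BY w.wk
""", {"now": NOW}).fetchall()

print(filter_in_on)
print(filter_in_where)
```

With the condition in the ON clause all seven weeks appear, including the empty ones. With the same condition in WHERE, only weeks 11, 13, 14 and 17 remain, which is the symptom described in the question.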

Given a table with time periods, query for a list of sum per day

Let's say I have a table that says how many items of something are valid between two dates.
Additionally, there may be multiple such periods.
For example, given a table:
itemtype | count | start | end
A | 10 | 2014-01-01 | 2014-01-10
A | 10 | 2014-01-05 | 2014-01-08
This means that there are 10 items of type A valid 2014-01-01 - 2014-01-10 and additionally, there are 10 valid 2014-01-05 - 2014-01-08.
So for example, the sum of valid items on 2014-01-06 is 20.
How can I query the table to get the sum per day? I would like a result such as
2014-01-01 10
2014-01-02 10
2014-01-03 10
2014-01-04 10
2014-01-05 20
2014-01-06 20
2014-01-07 20
2014-01-08 20
2014-01-09 10
2014-01-10 10
Can this be done with SQL? Either Oracle or MySQL would be fine
The basic syntax you are looking for is as follows:
For my example below I've defined a new table called DateTimePeriods which has a column for StartDate and EndDate both of which are DATE columns.
SELECT
SUM(NumericColumnName)
, DateTimePeriods.StartDate
, DateTimePeriods.EndDate
FROM
TableName
INNER JOIN DateTimePeriods ON TableName.dateColumnName BETWEEN DateTimePeriods.StartDate and DateTimePeriods.EndDate
GROUP BY
DateTimePeriods.StartDate
, DateTimePeriods.EndDate
Obviously the above code won't work on your database but should give you a reasonable place to start. You should look into GROUP BY and Aggregate Functions. I'm also not certain of how universal BETWEEN is for each database type, but you could do it using other comparisons such as <= and >=.
There are several ways to go about this. First, you need a list of dense dates to query. Using a row generator statement can provide that:
select date '2014-01-01' + level -1 d
from dual
connect by level <= 15;
Then for each date, select the sum of inventory:
with
sample_data as
(select 'A' itemtype, 10 item_count, date '2014-01-01' start_date, date '2014-01-10' end_date from dual union all
select 'A', 10, date '2014-01-05', date '2014-01-08' from dual),
periods as (select date '2014-01-01' + level -1 d from dual connect by level <= 15)
select
periods.d,
(select sum(item_count) from sample_data where periods.d between start_date and end_date) available
from periods
where periods.d = date '2014-01-06';
You would need to dynamically set the number of date rows to generate.
If you only needed a single row, then a query like this would work:
with
sample_data as
(select 'A' itemtype, 10 item_count, date '2014-01-01' start_date, date '2014-01-10' end_date from dual union all
select 'A', 10, date '2014-01-05', date '2014-01-08' from dual)
select sum(item_count)
from sample_data
where date '2014-01-06' between start_date and end_date;
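The Oracle queries above rely on CONNECT BY to generate the dates. As a sketch of the same idea with a recursive CTE, and with the date range taken from the data rather than hard-coded, the following runs against SQLite through Python's `sqlite3` module. The table name `items` and its column names are assumptions, and SQLite's `date(d, '+1 day')` stands in for the date arithmetic of Oracle or MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (itemtype TEXT, item_count INTEGER,"
    " start_date TEXT, end_date TEXT)"
)
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?)", [
    ("A", 10, "2014-01-01", "2014-01-10"),
    ("A", 10, "2014-01-05", "2014-01-08"),
])

# Derive the range of dates to generate from the data itself.
first, last = conn.execute(
    "SELECT MIN(start_date), MAX(end_date) FROM items"
).fetchone()

rows = conn.execute("""
    WITH RECURSIVE days(d) AS (
        SELECT :first
        UNION ALL
        SELECT date(d, '+1 day') FROM days WHERE d < :last
    )
    SELECT days.d, SUM(items.item_count) AS available
    FROM days
    JOIN items ON days.d BETWEEN items.start_date AND items.end_date
    GROUP BY days.d
    ORDER BY days.d
""", {"first": first, "last": last}).fetchall()

for day, available in rows:
    print(day, available)
```

This prints one row per day from 2014-01-01 to 2014-01-10, with 20 on the four overlapping days and 10 elsewhere. Because it is an inner join, a day covered by no period would be missing from the output; a LEFT JOIN with COALESCE would be needed to show such days as 0.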