Selecting from 2 tables with possibly corresponding dates - mysql

I am looking for the correct query for my MySQL database, which has 2 separate tables for heights and weights.
I want the result returned by 1 query with 3 columns: datetime, height and weight.
The query should also allow specifying the user.
Eg.:
Table heights:
id user_id created_on height
1 2 2019-01-01 00:00:01 180
2 2 2019-01-02 00:00:01 181
3 3 2019-01-03 00:00:01 182
4 3 2019-01-04 00:00:01 183
5 2 2019-01-07 00:00:01 184
Table weights:
id user_id created_on weight
1 2 2019-01-01 00:00:01 80
2 2 2019-01-04 00:00:01 81
3 3 2019-01-05 00:00:01 82
4 3 2019-01-06 00:00:01 83
5 2 2019-01-07 00:00:01 84
I am looking to get the following result with a single query:
user_id created_on weight height
2 2019-01-01 00:00:01 80 180
2 2019-01-02 00:00:01 null 181
2 2019-01-04 00:00:01 81 null
2 2019-01-07 00:00:01 84 184
I have tried working with JOIN statements but fail to get the required result.
This join statement
SELECT w.* , h.* FROM weight w
JOIN height h
ON w.created_on=h.created_on
AND w.user_id=h.user_id AND w.user_id=2
will return only those rows that have both a height and a weight entry for the same user_id and created_on.
A full outer join would do the trick; however, this is not supported by MySQL.
The following query seems to be returning the required result, however it is very slow:
SELECT r.* FROM
(SELECT w.user_id as w_user, w.created_on as weightdate, w.value as weight, h.created_on as heightdate ,h.user_id as h_user, h.value as height FROM weight w
LEFT JOIN height h ON w.user_id = h.user_id
AND w.created_on=h.created_on
UNION
SELECT w.user_id as w_user, w.created_on as weightdate, w.value as weight, h.created_on as heightdate ,h.user_id as h_user, h.value as height FROM weight w
RIGHT JOIN height h ON w.user_id = h.user_id
AND w.created_on=h.created_on ) r
WHERE h_user=2 OR w_user =2
The query takes more than 3 seconds if the 2 tables have around 3000 entries.
Is there a way to speed this up, possibly using a different approach?
For extra bonus points: is it possible to allow for a small time discrepancy between the two created_on datetimes (e.g. 10 minutes, or within the same hour)? For example, if table weights has an entry for 2019-01-01 00:00:00 and table heights has an entry for 2019-01-01 00:04:00, they would appear in the same row.

Instead of using a calendar table to select dates of interest, you can use a UNION to select all the distinct dates from the heights and weights tables. To deal with matching times within an hour of each other, you can compare the times using TIMESTAMPDIFF and truncate the created_on time to the hour. Since this might create duplicate entries, we add the DISTINCT qualifier to the query:
SELECT DISTINCT COALESCE(h.user_id, w.user_id) AS user_id,
DATE_FORMAT(COALESCE(h.created_on, w.created_on), '%y-%m-%d %H:00:00') AS created_on,
w.weight,
h.height
FROM (SELECT created_on FROM heights
UNION
SELECT created_on FROM weights) d
LEFT JOIN heights h ON ABS(TIMESTAMPDIFF(HOUR, h.created_on, d.created_on)) = 0 AND h.user_id = 2
LEFT JOIN weights w ON ABS(TIMESTAMPDIFF(HOUR, w.created_on, d.created_on)) = 0 AND w.user_id = 2
WHERE h.user_id IS NOT NULL OR w.user_id IS NOT NULL
ORDER BY created_on
Output (from my demo, where I've modified your times to allow for matching within the hour):
user_id created_on weight height
2 19-01-01 01:00:00 80 180
2 19-01-02 00:00:00 null 181
2 19-01-04 04:00:00 81 null
2 19-01-07 06:00:00 84 184
Demo on dbfiddle
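The exact-match case (without the one-hour tolerance) can be sketched as a small runnable script. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the helper name merged_measurements is hypothetical, and the user filter is pushed into the UNION subquery, which differs slightly from the query above.

```python
import sqlite3

# Sample data copied from the question; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE heights (id INTEGER, user_id INTEGER, created_on TEXT, height INTEGER);
CREATE TABLE weights (id INTEGER, user_id INTEGER, created_on TEXT, weight INTEGER);
INSERT INTO heights VALUES
  (1, 2, '2019-01-01 00:00:01', 180),
  (2, 2, '2019-01-02 00:00:01', 181),
  (3, 3, '2019-01-03 00:00:01', 182),
  (4, 3, '2019-01-04 00:00:01', 183),
  (5, 2, '2019-01-07 00:00:01', 184);
INSERT INTO weights VALUES
  (1, 2, '2019-01-01 00:00:01', 80),
  (2, 2, '2019-01-04 00:00:01', 81),
  (3, 3, '2019-01-05 00:00:01', 82),
  (4, 3, '2019-01-06 00:00:01', 83),
  (5, 2, '2019-01-07 00:00:01', 84);
""")


def merged_measurements(conn, user_id):
    """Emulate a full outer join: collect every (user, timestamp) key that
    occurs in either table, then left join both tables onto those keys."""
    sql = """
        SELECT d.user_id, d.created_on, w.weight, h.height
        FROM (SELECT user_id, created_on FROM heights WHERE user_id = :uid
              UNION
              SELECT user_id, created_on FROM weights WHERE user_id = :uid) AS d
        LEFT JOIN heights h
               ON h.user_id = d.user_id AND h.created_on = d.created_on
        LEFT JOIN weights w
               ON w.user_id = d.user_id AND w.created_on = d.created_on
        ORDER BY d.created_on
    """
    return conn.execute(sql, {"uid": user_id}).fetchall()


rows = merged_measurements(conn, 2)
for row in rows:
    print(row)
```

Filtering by user inside the subquery keeps the key list small, so the two left joins only probe rows that can actually appear in the result.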

This is probably best handled using a calendar table, containing all dates of interest for the query. We can start the query with the calendar table, then left join to the heights and weights tables:
SELECT
COALESCE(h.user_id, w.user_id) AS user_id,
d.dt AS created_on,
w.weight,
h.height
FROM
(
SELECT '2019-01-01 00:00:01' AS dt UNION ALL
SELECT '2019-01-02 00:00:01' UNION ALL
SELECT '2019-01-03 00:00:01' UNION ALL
SELECT '2019-01-04 00:00:01' UNION ALL
SELECT '2019-01-05 00:00:01' UNION ALL
SELECT '2019-01-06 00:00:01' UNION ALL
SELECT '2019-01-07 00:00:01'
) d
LEFT JOIN heights h
ON d.dt = h.created_on AND h.user_id = 2
LEFT JOIN weights w
ON d.dt = w.created_on AND w.user_id = 2
WHERE
h.user_id IS NOT NULL OR w.user_id IS NOT NULL
ORDER BY
d.dt;
Demo


The most efficient query to select min elements from table and date_format displayed [duplicate]

I have a table with 1 million rows, and I want to get the rows with the minimum date in each hour, formatted by hour.
My table is:
Id Created_at Value
1 2019-04-08 10:35:32 254
1 2019-04-08 10:31:23 241
1 2019-04-08 11:47:32 258
2 2019-04-08 10:32:42 276
2 2019-04-08 10:34:23 280
2 2019-04-08 11:34:23 290
And I would like to get (the min created_at values for each hour and format by hour):
Id Created_at Value
1 2019-04-08 10:00:00 241
1 2019-04-08 11:00:00 258
2 2019-04-08 10:00:00 276
2 2019-04-08 11:00:00 290
I have MySQL 5.7, so I can't use window functions. I'm looking for the most efficient way to select these rows.
I would do something like:
select
t.id, m.h, t.value
from my_table t
join (
select
id,
from_unixtime(floor(unix_timestamp(created_at) / 3600) * 3600) as h,
min(created_at) as min_created_at
from my_table
group by id, from_unixtime(floor(unix_timestamp(created_at) / 3600) * 3600)
) m on m.id = t.id and m.min_created_at = t.created_at
In MySQL 5.7 you can join against a subquery that finds the minimum created_at per group:
select m.id, date(m.created_at), m.value
from my_table m
INNER JOIN (
select id, min(created_at) min_date
from my_table
group by id, date(created_at), hour(created_at)
) t on t.id = m.id and t.min_date = m.created_at
Be sure you have a composite index on my_table columns (created_at, id, value).
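As a rough check of the join-on-subquery idea, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL 5.7. The assumptions: strftime replaces the from_unixtime/floor expression for truncating to the hour, and the helper name first_value_per_hour is made up for illustration.

```python
import sqlite3

# Sample rows from the question; SQLite stands in for MySQL 5.7.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table (id INTEGER, created_at TEXT, value INTEGER);
INSERT INTO my_table VALUES
  (1, '2019-04-08 10:35:32', 254),
  (1, '2019-04-08 10:31:23', 241),
  (1, '2019-04-08 11:47:32', 258),
  (2, '2019-04-08 10:32:42', 276),
  (2, '2019-04-08 10:34:23', 280),
  (2, '2019-04-08 11:34:23', 290);
""")


def first_value_per_hour(conn):
    """For each id and hour bucket, return the value of the earliest row."""
    sql = """
        SELECT t.id, m.h, t.value
        FROM my_table t
        JOIN (
            -- One row per (id, hour) holding the earliest timestamp in it.
            SELECT id,
                   strftime('%Y-%m-%d %H:00:00', created_at) AS h,
                   MIN(created_at) AS min_created_at
            FROM my_table
            GROUP BY id, strftime('%Y-%m-%d %H:00:00', created_at)
        ) m ON m.id = t.id AND m.min_created_at = t.created_at
        ORDER BY t.id, m.h
    """
    return conn.execute(sql).fetchall()


hourly = first_value_per_hour(conn)
for row in hourly:
    print(row)
```

Grouping by id as well as by hour matters here: without it, two ids sharing an hour would collapse into a single minimum.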

SQL-Case When Issue

I am trying to sort the transaction dates into an aging policy. When a part has been in its location for longer than the Aging Days policy limit (counted from LastDate to the current date), it should show up as Overage; otherwise it should show as Within.
Here is the current table:
+-----------+------+----------+------------+
| LastDate  | Part | Location | Aging Days |
+-----------+------+----------+------------+
| 12/1/2016 | 123  | VVV      | 90         |
| 8/10/2017 | 444  | RRR      | 10         |
| 8/01/2017 | 144  | PR       | 21         |
| 7/15/2017 | 12   | RRR      | 10         |
+-----------+------+----------+------------+
Here is the query:
select
q.lastdate,
r.part, r.location,
a.agingpolicy as 'Aging Days'
from opsintranexcel r (nolock)
left join InventoryAging a (nolock) on r.location=a.location
left join (select part,MAX(trandate) as lastdate from opsintran group by
part) q on r.part=q.part
Here is the extra column I want added in:
+-----------+------+----------+------------+---------+
| LastDate  | Part | Location | Aging Days | Age     |
+-----------+------+----------+------------+---------+
| 12/1/2016 | 123  | VVV      | 90         | Overage |
| 8/10/2017 | 444  | RRR      | 10         | Within  |
| 8/01/2017 | 144  | PR       | 21         | Within  |
| 7/15/2017 | 12   | RRR      | 10         | Overage |
+-----------+------+----------+------------+---------+
I appreciate your help.
I think the code below will work for you:
SELECT
q.lastdate,
r.part,
r.location,
a.agingpolicy as 'Aging Days',
'Age' =
CASE
WHEN DATEDIFF( day, q.LastDate, GETDATE() ) > a.agingpolicy THEN 'Overage'
ELSE 'Within'
END
FROM opsintranexcel r (nolock)
LEFT JOIN InventoryAging a (nolock) on r.location=a.location
LEFT JOIN (
SELECT part,MAX(trandate) as lastdate
FROM opsintran
WHERE trantype='II' and PerPost>='201601'
GROUP BY part) q ON r.part=q.part
You can compare the difference between the current date and the lastdate value with the aging days (MySQL syntax shown):
CASE WHEN DATEDIFF(NOW(), q.lastdate) > a.agingpolicy
THEN 'Overage'
ELSE 'Within'
END AS age
You should modify your query as:
select
q.lastdate,
r.part, r.location,
a.agingpolicy as 'Aging Days',
if(DATEDIFF(NOW(), q.lastdate) > a.agingpolicy, 'Overage','Within') as 'Age'
from opsintranexcel r (nolock)
left join InventoryAging a (nolock) on r.location=a.location
left join (select part,MAX(trandate) as lastdate from opsintran where
trantype='II' and PerPost>='201601' group by part) q on r.part=q.part
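The CASE comparison itself can be tried in isolation. The following is a minimal sketch, not the original schema: it uses Python's built-in sqlite3 module, a single hypothetical table named inventory that already holds the joined columns, ISO-formatted dates, and a fixed "today" of 2017-08-15 (an assumed date chosen so the result is reproducible and matches the expected table in the question).

```python
import sqlite3

# Hypothetical flattened table: the real query joins three tables to get here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (lastdate TEXT, part TEXT, location TEXT, agingpolicy INTEGER);
INSERT INTO inventory VALUES
  ('2016-12-01', '123', 'VVV', 90),
  ('2017-08-10', '444', 'RRR', 10),
  ('2017-08-01', '144', 'PR',  21),
  ('2017-07-15', '12',  'RRR', 10);
""")


def classify_age(conn, today):
    """Label each part 'Overage' when the days since lastdate exceed its policy."""
    sql = """
        SELECT part,
               CASE
                 WHEN CAST(julianday(:today) - julianday(lastdate) AS INTEGER)
                      > agingpolicy THEN 'Overage'
                 ELSE 'Within'
               END AS age
        FROM inventory
        ORDER BY rowid
    """
    return conn.execute(sql, {"today": today}).fetchall()


# The reference date is passed in instead of using the current date,
# so the output does not change from day to day.
ages = classify_age(conn, "2017-08-15")
for row in ages:
    print(row)
```

Passing the reference date as a parameter is a design choice for testing; in production the database's own current-date function would normally take its place.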

MySQL get count of periods where date in row

I have a MySQL table, similar to this example:
c_id date value
66 2015-07-01 1
66 2015-07-02 777
66 2015-08-01 33
66 2015-08-20 200
66 2015-08-21 11
66 2015-09-14 202
66 2015-09-15 204
66 2015-09-16 23
66 2015-09-17 0
66 2015-09-18 231
What I need is the count of periods of consecutive dates. I don't have a fixed start or end date; there can be any.
For example: 2015-07-01 - 2015-07-02 is the first period, 2015-08-01 is the second period, 2015-08-20 - 2015-08-21 is the third period and 2015-09-14 - 2015-09-18 is the fourth period. So in this example there are four periods.
SELECT
SUM(value) as value_sum,
... as period_count
FROM my_table
WHERE cid = 66
Cant figure this out all day long.. Thx.
I don't have enough reputation to comment to the above answer.
If all you need is the NUMBER of splits, then you can simply reword your question: "How many entries have a date D, such that the date D - 1 DAY does not have an entry?"
In which case, this is all you need:
SELECT
COUNT(*) as PeriodCount
FROM
`periods`
WHERE
DATE_ADD(`date`, INTERVAL - 1 DAY) NOT IN (SELECT `date` from `periods`);
In your PHP, just select the "PeriodCount" column from the first row.
You had me working on some crazy stored procedure approach until that clarification :P
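The "count the rows whose previous day is missing" idea can be checked with a short script. This is a sketch under assumptions: Python's built-in sqlite3 module replaces MySQL, the date column is renamed to day to avoid clashing with SQLite's date() function, NOT EXISTS is used in place of NOT IN, and count_periods is a hypothetical helper name.

```python
import sqlite3

# Sample rows from the question; the date column is called "day" here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_table (c_id INTEGER, day TEXT, value INTEGER);
INSERT INTO my_table VALUES
  (66, '2015-07-01', 1),
  (66, '2015-07-02', 777),
  (66, '2015-08-01', 33),
  (66, '2015-08-20', 200),
  (66, '2015-08-21', 11),
  (66, '2015-09-14', 202),
  (66, '2015-09-15', 204),
  (66, '2015-09-16', 23),
  (66, '2015-09-17', 0),
  (66, '2015-09-18', 231);
""")


def count_periods(conn, c_id):
    """Count rows that start a run, i.e. rows with no entry for the day before."""
    sql = """
        SELECT COUNT(*)
        FROM my_table p
        WHERE p.c_id = :cid
          AND NOT EXISTS (
              SELECT 1
              FROM my_table q
              WHERE q.c_id = p.c_id
                AND q.day = date(p.day, '-1 day')
          )
    """
    return conn.execute(sql, {"cid": c_id}).fetchone()[0]


period_count = count_periods(conn, 66)
print(period_count)
```

Each run of consecutive dates has exactly one first day, so counting first days gives the number of periods without any grouping.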
I should get deservedly flamed for this, but anyway, consider the following...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(date DATE NOT NULL PRIMARY KEY
,value INT NOT NULL
);
INSERT INTO my_table VALUES
('2015-07-01',1),
('2015-07-02',777),
('2015-08-01',33),
('2015-08-20',200),
('2015-08-21',11),
('2015-09-14',202),
('2015-09-15',204),
('2015-09-16',23),
('2015-09-17',0),
('2015-09-18',231);
SELECT x.*
, SUM(y.value) total
FROM
( SELECT a.date start
, MIN(c.date) end
FROM my_table a
LEFT
JOIN my_table b
ON b.date = a.date - INTERVAL 1 DAY
LEFT
JOIN my_table c
ON c.date >= a.date
LEFT
JOIN my_table d
ON d.date = c.date + INTERVAL 1 DAY
WHERE b.date IS NULL
AND c.date IS NOT NULL
AND d.date IS NULL
GROUP
BY a.date
) x
JOIN my_table y
ON y.date BETWEEN x.start AND x.end
GROUP
BY x.start;
+------------+------------+-------+
| start | end | total |
+------------+------------+-------+
| 2015-07-01 | 2015-07-02 | 778 |
| 2015-08-01 | 2015-08-01 | 33 |
| 2015-08-20 | 2015-08-21 | 211 |
| 2015-09-14 | 2015-09-18 | 660 |
+------------+------------+-------+
4 rows in set (0.00 sec) -- <-- This is the number of periods
There is a simpler way of doing this, see here SQLfiddle:
SELECT min(date) start,max(date) end,sum(value) total FROM
(SELECT @i:=@i+1 i,
ROUND(Unix_timestamp(date)/(24*60*60))-@i diff,
date,value
FROM tbl, (SELECT @i:=0)n WHERE c_id=66 ORDER BY date) t
GROUP BY diff
This select groups over the same difference between sequential number and date value.
Edit
As Strawberry remarked quite rightly, there was a flaw in my approach when a period spans a month change or indeed a change into the next year. The unix_timestamp() function can cure this though: it returns the seconds since 1970-01-01, so by dividing this number by 24*60*60 you get the days since that particular date. The rest is simple ...
If you only need the count, as your last comment stated, you can do it even simpler:
SELECT count(distinct diff) period_count FROM
(SELECT @i:=@i+1 i,
ROUND(Unix_timestamp(date)/(24*60*60))-@i diff,
date,value
FROM tbl,(SELECT @i:=0)n WHERE c_id=66 ORDER BY date) t
Thanks. @cars10's solution worked in MySQL, but I could not get the period count to echo in PHP; it returned 0. I got it working thanks to @jarkinstall. So my final select looks something like this:
SELECT
sum(coalesce(count_tmp,coalesce(count_reserved,0))) as sum
,(SELECT COUNT(*) FROM my_table WHERE cid='.$cid.' AND DATE_ADD(date, INTERVAL - 1 DAY) NOT IN (SELECT date from my_table WHERE cid='.$cid.' AND coalesce(count_tmp,coalesce(count_reserved,0))>0)) as periods
,count(*) as count
,(min(date)) as min_date
,(max(date)) as max_date
FROM my_table WHERE cid=66
AND coalesce(count_tmp,coalesce(count_reserved,0))>0
ORDER BY date;

Summing data for last 7 day look back window

I want a query that returns, for each date, the sum of impressions over a 7-day look-back window.
The output should contain the date and the sum of impressions for the last 7 days up to and including that date.
e.g. I have a table tblFactImps with below data:
dateFact impressions id
2015-07-01 4022 30
2015-07-02 4021 33
2015-07-03 4011 34
2015-07-04 4029 35
2015-07-05 1023 39
2015-07-06 3023 92
2015-07-07 8027 66
2015-07-08 2024 89
I need output with 2 columns:
dateFact impressions_last_7
The query I have so far:
select dateFact, sum(if(datediff(curdate(), dateFact)<=7, impressions,0)) impressions_last_7 from tblFactImps group by dateFact;
Thanks!
If your fact table is not too big, then a correlated subquery is a simple way to do what you want:
select i.dateFact,
(select sum(i2.impressions)
from tblFactImps i2
where i2.dateFact >= i.dateFact - interval 6 day
and i2.dateFact <= i.dateFact
) as impressions_last_7
from tblFactImps i;
You can achieve this by LEFT OUTER JOINing the table with itself on a date range, and summing the impressions grouped by date, as follows:
SELECT
t1.dateFact,
SUM(t2.impressions) AS impressions_last_7
FROM
tblFactImps t1
LEFT OUTER JOIN
tblFactImps t2
ON
t2.dateFact BETWEEN
DATE_SUB(t1.dateFact, INTERVAL 6 DAY)
AND t1.dateFact
GROUP BY
t1.dateFact;
This should give you a sliding 7-day sum for each date in your table.
Assuming your dateFact column is indexed, this query should also be relatively fast.
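To see the sliding window on the sample data, here is a minimal sketch of the self-join. It assumes Python's built-in sqlite3 module in place of MySQL, with date(x, '-6 day') standing in for DATE_SUB; the helper name rolling_sums is made up for illustration.

```python
import sqlite3

# Sample rows from the question; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblFactImps (dateFact TEXT, impressions INTEGER, id INTEGER);
INSERT INTO tblFactImps VALUES
  ('2015-07-01', 4022, 30),
  ('2015-07-02', 4021, 33),
  ('2015-07-03', 4011, 34),
  ('2015-07-04', 4029, 35),
  ('2015-07-05', 1023, 39),
  ('2015-07-06', 3023, 92),
  ('2015-07-07', 8027, 66),
  ('2015-07-08', 2024, 89);
""")


def rolling_sums(conn):
    """For each date, sum impressions over that date and the 6 days before it."""
    sql = """
        SELECT t1.dateFact, SUM(t2.impressions) AS impressions_last_7
        FROM tblFactImps t1
        LEFT JOIN tblFactImps t2
               ON t2.dateFact BETWEEN date(t1.dateFact, '-6 day') AND t1.dateFact
        GROUP BY t1.dateFact
        ORDER BY t1.dateFact
    """
    return conn.execute(sql).fetchall()


sums = rolling_sums(conn)
for row in sums:
    print(row)
```

The window only drops a day once more than seven dates are available, which is why the first seven sums simply accumulate and the eighth excludes 2015-07-01.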

Difference over sums in SQL

I have a table like the following:
PARENTREF TRANSTYPE(BIT(1)) DUEDATE(DateTime) TOTAL
2038 0 2015-01-01 1000
2038 1 2015-03-05 500
2039 0 2015-01-01 1000
2040 0 2015-01-01 1000
2041 0 2015-01-01 1000
2040 1 2015-04-07 200
I want a SELECT query that returns SUM(TOTAL) when TRANSTYPE=1 subtracted from SUM(TOTAL) when TRANSTYPE=0 for each distinct PARENTREF. I also would like to get in a separate column the DUEDATE for the PARENTREF when TRANSTYPE=0. There may be only one PARENTREF with TRANSTYPE=0 so that won't be a problem. In other words, I should get the following table:
PARENTREF DUEDATE(DateTime) TOTAL
2038 2015-01-01 500
2039 2015-01-01 1000
2040 2015-01-01 800
2041 2015-01-01 1000
(1-transtype*2) is 1 when transtype=0 and -1 when transtype=1, so the query subtracts the totals where transtype=1 from the totals where transtype=0. MAX ignores NULL values, so it selects only the non-null duedate, which is the one where transtype=0.
select
parentref,
sum((1-transtype*2)*total) as total,
max(if(transtype=0,duedate,null)) as duedate
from tablename
group by parentref
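The signed-sum trick can be verified against the expected table with a short script. This is a sketch with assumptions: Python's built-in sqlite3 module replaces the original database, TRANSTYPE is stored as a plain integer rather than BIT(1), a CASE expression replaces MySQL's if(), and net_totals is a hypothetical helper name.

```python
import sqlite3

# Sample rows from the question; transtype is an INTEGER here, not BIT(1).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tablename (parentref INTEGER, transtype INTEGER, duedate TEXT, total INTEGER);
INSERT INTO tablename VALUES
  (2038, 0, '2015-01-01', 1000),
  (2038, 1, '2015-03-05', 500),
  (2039, 0, '2015-01-01', 1000),
  (2040, 0, '2015-01-01', 1000),
  (2041, 0, '2015-01-01', 1000),
  (2040, 1, '2015-04-07', 200);
""")


def net_totals(conn):
    """Per parentref: transtype=0 totals minus transtype=1 totals,
    plus the duedate of the transtype=0 row."""
    sql = """
        SELECT parentref,
               -- NULL for transtype=1 rows, so MAX picks the transtype=0 date.
               MAX(CASE WHEN transtype = 0 THEN duedate END) AS duedate,
               -- (1 - transtype * 2) is +1 for transtype=0 and -1 for transtype=1.
               SUM((1 - transtype * 2) * total) AS total
        FROM tablename
        GROUP BY parentref
        ORDER BY parentref
    """
    return conn.execute(sql).fetchall()


totals = net_totals(conn)
for row in totals:
    print(row)
```

A single pass with one GROUP BY avoids the self-join, at the cost of relying on transtype being exactly 0 or 1.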
Try this:
select t.PARENTREF,t.DueDate,(t.Total-isnull(m.Total,0)) as total
from tabl t LEFT outer join tabl m on t.PARENTREF=m.PARENTREF and t.TRANSTYPE <> m.TRANSTYPE
where (t.Transtype=0 ) and (isnull(m.Transtype,1)=1 )
Please check out this fiddle: http://sqlfiddle.com/#!3/d4988/1
Or use this:
select t.PARENTREF,t.DueDate,(sum(t.Total)-sum(isnull(m.Total,0))) as total
from tabl t LEFT outer join tabl m on t.PARENTREF=m.PARENTREF and t.TRANSTYPE <> m.TRANSTYPE
where (t.Transtype=0 ) and (isnull(m.Transtype,1)=1 )
group by t.PARENTREF,t.DueDate
Check this fiddle http://sqlfiddle.com/#!3/d4988/2