I have the table below, which is just a snapshot, and all I want to do is calculate the number of open items per date.
I used to do it in Excel with a simple formula, =COUNTIFS($A$2:$A$30000,"<="&E2,$B$2:$B$30000,">="&E2), where column A held the Open_Date dates and column B the Close_Date dates. I want to use SQL to get the same results.
This is my Excel snapshot. Formula above.
In MySQL I have replicated it with table T1:
CREATE TABLE T1
(
ID int (10),
Open_Date date,
Close_Date date);
insert into T1 values (1, '2018-12-17', '2018-12-18');
insert into T1 values (2, '2018-12-18', '2018-12-18');
insert into T1 values (3, '2018-12-18', '2018-12-18');
insert into T1 values (4, '2018-12-19', '2018-12-20');
insert into T1 values (5, '2018-12-19', '2018-12-21');
insert into T1 values (6, '2018-12-20', '2018-12-22');
insert into T1 values (7, '2018-12-20', '2018-12-22');
insert into T1 values (8, '2018-12-21', '2018-12-25');
insert into T1 values (9, '2018-12-22', '2018-12-26');
insert into T1 values (10, '2018-12-23', '2018-12-27');
The first step was to create a table of dates in case there are any gaps in Open_Date. So my code at the moment is:
SELECT
d.dt, Temp_T1.*
FROM
(
SELECT '2018-12-17' AS dt UNION ALL
SELECT '2018-12-18' UNION ALL
SELECT '2018-12-19' UNION ALL
SELECT '2018-12-20' UNION ALL
SELECT '2018-12-21' UNION ALL
SELECT '2018-12-22' UNION ALL
SELECT '2018-12-23' UNION ALL
SELECT '2018-12-24'
) d
LEFT JOIN
(SELECT * FROM T1) AS Temp_T1
ON Temp_T1.Open_Date = d.dt
I am lost on how to calculate the same values as I do in Excel.
Join d to the T1 table where d.dt is between the open and close dates, then use GROUP BY to make one row for each date in your d derived table.
Count T1.ID rather than *, so that a date with no open items shows 0 instead of 1.
SELECT
d.dt, COUNT(T1.ID) AS open_items
FROM
(
SELECT '2018-12-17' AS dt UNION ALL
SELECT '2018-12-18' UNION ALL
SELECT '2018-12-19' UNION ALL
SELECT '2018-12-20' UNION ALL
SELECT '2018-12-21' UNION ALL
SELECT '2018-12-22' UNION ALL
SELECT '2018-12-23' UNION ALL
SELECT '2018-12-24'
) d
LEFT JOIN T1 ON d.dt BETWEEN T1.Open_Date AND T1.Close_Date
GROUP BY d.dt;
Output:
+------------+------------+
| dt | open_items |
+------------+------------+
| 2018-12-17 | 1 |
| 2018-12-18 | 3 |
| 2018-12-19 | 2 |
| 2018-12-20 | 4 |
| 2018-12-21 | 4 |
| 2018-12-22 | 4 |
| 2018-12-23 | 3 |
| 2018-12-24 | 3 |
+------------+------------+
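The hard-coded UNION ALL list of dates gets unwieldy for longer ranges. As a minimal sketch, assuming recursive CTEs are available (MySQL 8.0+ has WITH RECURSIVE), the calendar can be generated instead. The example below runs the idea against an in-memory SQLite database through Python's sqlite3 module so it can be tried without a server. The function name open_items_per_day and the use of SQLite's date() are assumptions of this sketch, not part of the answer above; in MySQL the date step would be written with DATE_ADD or + INTERVAL 1 DAY.

```python
import sqlite3

# Sample rows copied from the question's T1 table.
ROWS = [
    (1, "2018-12-17", "2018-12-18"),
    (2, "2018-12-18", "2018-12-18"),
    (3, "2018-12-18", "2018-12-18"),
    (4, "2018-12-19", "2018-12-20"),
    (5, "2018-12-19", "2018-12-21"),
    (6, "2018-12-20", "2018-12-22"),
    (7, "2018-12-20", "2018-12-22"),
    (8, "2018-12-21", "2018-12-25"),
    (9, "2018-12-22", "2018-12-26"),
    (10, "2018-12-23", "2018-12-27"),
]

SQL = """
WITH RECURSIVE d(dt) AS (
    SELECT :first_day                       -- first calendar day
    UNION ALL
    SELECT date(dt, '+1 day') FROM d        -- SQLite date arithmetic
    WHERE dt < :last_day
)
SELECT d.dt, COUNT(T1.ID) AS open_items     -- COUNT(T1.ID) is 0 when nothing matches
FROM d
LEFT JOIN T1 ON d.dt BETWEEN T1.Open_Date AND T1.Close_Date
GROUP BY d.dt
ORDER BY d.dt
"""


def open_items_per_day(first_day, last_day):
    """Return (date, open_items) for every day in [first_day, last_day]."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE T1 (ID INTEGER, Open_Date TEXT, Close_Date TEXT)")
        conn.executemany("INSERT INTO T1 VALUES (?, ?, ?)", ROWS)
        params = {"first_day": first_day, "last_day": last_day}
        return conn.execute(SQL, params).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for dt, open_items in open_items_per_day("2018-12-17", "2018-12-28"):
        print(dt, open_items)
```

Extending the range to 2018-12-28 shows why COUNT(T1.ID) matters: that day has no open items and is reported as 0.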
Related
I'm trying to calculate the days since the last different order so for example let's say I have the following table:
cust_id|Product_id|Order_date|
1 |a |10/02/2020|
2 |b |10/01/2020|
3 |c |09/07/2020|
4 |d |09/02/2020|
1 |a |08/29/2020|
1 |f |08/02/2020|
2 |g |07/01/2020|
3 |t |06/06/2020|
4 |j |05/08/2020|
1 |w |04/20/2020|
I want to find the difference between the most recent date and the previous date that has a product ID that doesn't match the most recent product ID.
So the output should be something like this:
cust_id|latest_Product_id|time_since_last_diff_order_days|
1 |a |30 |
2 |b |92 |
3 |c |91 |
4 |d |123 |
Here's the query that I tried to use but got an error (error code 1064)
SELECT a.cust_id, a.Product_ID as latest_Product_id, DATEDIFF(MAX(a.Order_date),MAX(b.Order_date)) as time_since_last_diff_order_days
FROM database_customers.cust_orders a
INNER JOIN
database_customers.cust_orders b
on
a.cust_id = b.cust_id
WHERE a.product_id =! b.prodcut_id;
Thank you for any help!
It isn't pretty, but it will do the job.
CREATE TABLE tab1
(`cust_id` int, `Product_id` varchar(1), `Order_date` datetime)
;
INSERT INTO tab1
(`cust_id`, `Product_id`, `Order_date`)
VALUES
(1, 'a', '2020-10-02 02:00:00'),
(2, 'b', '2020-10-01 02:00:00'),
(3, 'c', '2020-09-07 02:00:00'),
(4, 'd', '2020-09-02 02:00:00'),
(1, 'a', '2020-08-29 02:00:00'),
(1, 'f', '2020-08-02 02:00:00'),
(2, 'g', '2020-07-01 02:00:00'),
(3, 't', '2020-06-06 02:00:00'),
(4, 'j', '2020-05-08 02:00:00'),
(1, 'w', '2020-04-20 02:00:00')
;
WITH CTE AS (SELECT `cust_id`, `Product_id`,`Order_date`,ROW_NUMBER() OVER(PARTITION BY `cust_id` ORDER BY `Order_date` DESC) rn
FROM tab1)
SELECT t1.`cust_id`, t1.`Product_id`, t2.time_since_last_diff_order_days
FROM
(SELECT
`cust_id`, `Product_id`
FROM
CTE
WHERE rn = 1 ) t1
JOIN
( SELECT `cust_id`,DATEDIFF(MAX(`Order_date`), MIN(`Order_date`)) time_since_last_diff_order_days
FROM CTE WHERE rn in (1,2) GROUP BY `cust_id`) t2 ON t1.cust_id = t2.cust_id
cust_id | Product_id | time_since_last_diff_order_days
------: | :--------- | ------------------------------:
1 | a | 34
2 | b | 92
3 | c | 93
4 | d | 117
I want to find the difference between the most recent date and the previous date that has a product ID that doesn't match the most recent product ID.
You can use first_value() to get the last product and then aggregate:
select cust_id, last_product_id, max(order_date),
datediff(max(order_date), max(case when product_id <> last_product_id then order_date end)) as diff_from_last_product
from (select co.*,
first_value(product_id) over (partition by cust_id order by order_date desc) as last_product_id
from cust_orders co
) co
group by cust_id, last_product_id;
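To see what the first_value() approach returns on the question's sample data, here is a small sketch that runs it in an in-memory SQLite database via Python's sqlite3 (window functions need SQLite 3.25 or later). It assumes the window is ordered by order_date descending so that first_value() picks the most recent product, and it swaps MySQL's DATEDIFF for a julianday() difference because SQLite has no DATEDIFF. The function name days_since_last_different_product is made up for this illustration.

```python
import sqlite3

# Sample rows from the question, with dates rewritten in ISO format.
ORDERS = [
    (1, "a", "2020-10-02"), (2, "b", "2020-10-01"),
    (3, "c", "2020-09-07"), (4, "d", "2020-09-02"),
    (1, "a", "2020-08-29"), (1, "f", "2020-08-02"),
    (2, "g", "2020-07-01"), (3, "t", "2020-06-06"),
    (4, "j", "2020-05-08"), (1, "w", "2020-04-20"),
]

SQL = """
SELECT cust_id,
       last_product_id,
       CAST(julianday(MAX(order_date))
            - julianday(MAX(CASE WHEN product_id <> last_product_id
                                 THEN order_date END)) AS INTEGER) AS diff_days
FROM (
    SELECT co.*,
           -- newest order first, so first_value() is the latest product
           FIRST_VALUE(product_id) OVER (
               PARTITION BY cust_id ORDER BY order_date DESC
           ) AS last_product_id
    FROM cust_orders co
) co
GROUP BY cust_id, last_product_id
ORDER BY cust_id
"""


def days_since_last_different_product():
    """Return (cust_id, latest product, days since last order of another product)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE cust_orders (cust_id INTEGER, product_id TEXT, order_date TEXT)"
        )
        conn.executemany("INSERT INTO cust_orders VALUES (?, ?, ?)", ORDERS)
        return conn.execute(SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in days_since_last_different_product():
        print(row)
```

For customer 1 this gives 61 days (2020-10-02 back to 2020-08-02, the last order of product f), because the 2020-08-29 order is for the same product a and is skipped. That differs from the 34 days produced by simply comparing the two most recent orders.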
Let's say I have 2 tables like so:
MyTable1:
Name ID Timestamp TestNum Grade
Alex 1101 2020-10-01 12:00:00 1 85
Alex 1101 2020-10-02 13:00:00 2 90
Alex 1101 2020-10-03 8:00:00 3 95
Alex 1101 2020-10-04 10:00:00 4 90
MyTable2:
ID Avg StDev
1101 90 4.08
I am trying to get the row of the first (Timestamp) instance where the grade was X standard deviations away.
ExpectedResults:
Name ID Timestamp TestNum StDevsAway
Alex 1101 2020-10-01 12:00:00 1 -1.23
Alex 1101 2020-10-02 13:00:00 2 0
Alex 1101 2020-10-03 8:00:00 3 1.23
The 4th row should not be returned as its Standard Deviations Away was already found at a previous Timestamp.
I'm still fairly new to MySQL, but this is where I'm at so far:
select a.Name
, a.ID
, a.Timestamp
, a.TestNum
, round( ( a.Grade - b.Avg ) / b.StDev, 2 ) as StDevsAway
from MyTable1 as a
join MyTable2 as b
on a.ID = b.ID
group
by round( ( a.Grade - b.Avg ) / b.StDev, 2 );
I think the question is just about finding the "first" row for each id/grade tuple. So (assuming MySQL 8.0):
select t1.*
from (
select t1.*, row_number() over(partition by id, grade order by timestamp) rn
from mytable1 t1
) t1
where rn = 1
Then, you can bring the second table with a join if you like:
select t1.*, round((t1.grade - t2.avg) / t2.stdev, 2) stdevsaway
from (
select t1.*, row_number() over(partition by id, grade order by timestamp) rn
from mytable1 t1
) t1
inner join mytable2 t2 on t2.id = t1.id
where rn = 1
In earlier versions, you can filter with a subquery:
select t1.*, round((t1.grade - t2.avg) / t2.stdev, 2) stdevsaway
from mytable1 t1
inner join mytable2 t2 on t2.id = t1.id
where t1.timestamp = (
select min(t11.timestamp) from mytable1 t11 where t11.id = t1.id and t11.grade = t1.grade
)
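As a quick check of the "first row per id/grade" idea, the sketch below runs the row_number() variant on the question's sample data in an in-memory SQLite database via Python's sqlite3 (SQLite 3.25+ for window functions). The function name first_rows_per_grade is hypothetical, and the identifiers are double-quoted only to keep SQLite from confusing them with built-in names.

```python
import sqlite3

# Sample data from the question.
GRADES = [
    ("Alex", 1101, "2020-10-01 12:00:00", 1, 85),
    ("Alex", 1101, "2020-10-02 13:00:00", 2, 90),
    ("Alex", 1101, "2020-10-03 08:00:00", 3, 95),
    ("Alex", 1101, "2020-10-04 10:00:00", 4, 90),
]
STATS = [(1101, 90, 4.08)]

SQL = """
SELECT t1."Name", t1."ID", t1."Timestamp", t1."TestNum",
       ROUND((t1."Grade" - t2."Avg") / t2."StDev", 2) AS StDevsAway
FROM (
    SELECT m.*,
           -- number the rows of each (ID, Grade) group, earliest first
           ROW_NUMBER() OVER (
               PARTITION BY m."ID", m."Grade" ORDER BY m."Timestamp"
           ) AS rn
    FROM MyTable1 m
) t1
JOIN MyTable2 t2 ON t2."ID" = t1."ID"
WHERE t1.rn = 1
ORDER BY t1."Timestamp"
"""


def first_rows_per_grade():
    """Return the earliest row for each (ID, Grade) with its deviation score."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            'CREATE TABLE MyTable1 ("Name" TEXT, "ID" INTEGER, "Timestamp" TEXT, '
            '"TestNum" INTEGER, "Grade" INTEGER)'
        )
        conn.execute('CREATE TABLE MyTable2 ("ID" INTEGER, "Avg" INTEGER, "StDev" REAL)')
        conn.executemany("INSERT INTO MyTable1 VALUES (?, ?, ?, ?, ?)", GRADES)
        conn.executemany("INSERT INTO MyTable2 VALUES (?, ?, ?)", STATS)
        return conn.execute(SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in first_rows_per_grade():
        print(row)
```

Because the average and standard deviation are fixed per ID, partitioning by grade is equivalent to partitioning by the rounded StDevsAway value; the fourth test (grade 90 again) is dropped.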
In earlier versions, and of course in MySQL 8 as well, you can do this.
It excludes every TestNum whose grade equals the average (zero standard deviations away), except the first one for that user.
Schema (MySQL v5.5)
CREATE TABLE MyTable1 (
`Name` VARCHAR(4),
`ID` INTEGER,
`Timestamp` DATETIME,
`TestNum` VARCHAR(7),
`Grade` INTEGER
);
INSERT INTO MyTable1
(`Name`, `ID`, `Timestamp`, `TestNum`, `Grade`)
VALUES
('Alex', '1101', '2020-10-01 12:00:00', '1', '85'),
('Alex', '1101', '2020-10-02 13:00:00', '2', '90'),
('Alex', '1101', '2020-10-03 08:00:00', '3','95'),
('Alex', '1101', '2020-10-04 10:00:00', '4', '90');
CREATE TABLE MyTable2 (
`ID` INTEGER,
`Avg` INTEGER,
`StDev` FLOAT
);
INSERT INTO MyTable2
(`ID`, `Avg`, `StDev`)
VALUES
('1101', '90', '4.08');
Query #1
select
a.Name
, a.ID
, a.Timestamp
, a.TestNum
, round( ( a.Grade - b.Avg ) / b.StDev, 2 ) as StDevsAway
from MyTable1 as a join MyTable2 as b on a.ID = b.ID
WHERE
TestNum NOT IN (SELECT TestNum
FROM MyTable1 c
WHERE c.`ID` = a.`ID`
AND c.`Grade` = b.Avg
AND c.`TestNum`<> (SELECT MIN(TestNum)
FROM MyTable1 d
WHERE d.`ID` = a.`ID`
AND d.`Grade` = b.Avg)
);
| Name | ID | Timestamp | TestNum | StDevsAway |
| ---- | ---- | ------------------- | ------- | ---------- |
| Alex | 1101 | 2020-10-01 12:00:00 | 1 | -1.23 |
| Alex | 1101 | 2020-10-02 13:00:00 | 2 | 0 |
| Alex | 1101 | 2020-10-03 08:00:00 | 3 | 1.23 |
There are 2 MariaDB (Ver 15.1 Distrib 5.5.64-MariaDB, for Linux (x86_64)) tables:
CREATE TABLE Table1
(`phone` int, `calldate` datetime)
;
INSERT INTO Table1
(`phone`, `calldate`)
VALUES
(123, '2020-01-01 10:00:00'),
(123, '2020-01-01 11:00:00'),
(123, '2020-01-01 12:00:00')
;
CREATE TABLE Table2
(`phone` int, `calldate` datetime)
;
INSERT INTO Table2
(`phone`, `calldate`)
VALUES
( 123, '2020-01-01 09:01:00'),
( 123, '2020-01-01 09:02:00'),
( 123, '2020-01-01 10:15:00'),
( 123, '2020-01-01 10:20:00'),
( 123, '2020-01-01 10:23:00'),
( 123, '2020-01-01 11:05:00'),
( 123, '2020-01-01 11:12:00'),
( 123, '2020-01-01 11:25:00')
;
How can I get the result shown below?
The calldate of the first record in table1 (2020-01-01 10:00:00) is later than the calldate of two records in table2.
Similarly, for the second one the count is 5 (from 09:01:00 to 10:23:00).
But the two records from table2 with calldate 09:01:00 and 09:02:00 are already "overlapped" by the first record from table1, so the result should be 3 instead of 5.
+-------+---------------------+-------+
| phone | calldate            | count |
+-------+---------------------+-------+
| 123   | 2020-01-01 09:02:00 | 2     |
| 123   | 2020-01-01 10:23:00 | 3     |
| 123   | 2020-01-01 11:25:00 | 3     |
+-------+---------------------+-------+
Also, the calldate in the result set should be the latest calldate from the "overlapped" subset.
You can do this using window functions:
select t1.phone, t1.calldate, count(t2.phone)
from (select t1.*,
lead(calldate) over (partition by phone order by calldate) as next_calldate
from table1 t1
) t1 left join
table2 t2
on t2.phone = t1.phone and
t2.calldate >= t1.calldate and
(t2.calldate < t1.next_calldate or t1.next_calldate is null)
group by t1.phone, t1.calldate;
EDIT:
You can follow the same idea with a correlated subquery:
select t1.phone, t1.calldate, count(t2.phone)
from (select t1.*,
(select min(tt1.calldate)
from table1 tt1
where tt1.phone = t1.phone and tt1.calldate > t1.calldate
) as next_calldate
from table1 t1
) t1 left join
table2 t2
on t2.phone = t1.phone and
t2.calldate >= t1.calldate and
(t2.calldate < t1.next_calldate or t1.next_calldate is null)
group by t1.phone, t1.calldate;
This will be even less efficient than the window functions version.
Join the tables and use NOT EXISTS in the ON clause like this:
select t1.phone, t1.calldate, count(t2.calldate) count
from Table1 t1 left join Table2 t2
on t2.phone = t1.phone and t2.calldate < t1.calldate
and not exists (
select 1 from Table1
where calldate < t1.calldate and t2.calldate < calldate
)
group by t1.phone, t1.calldate
Results:
| phone | calldate | count |
| ----- | ------------------- | ----- |
| 123 | 2020-01-01 10:00:00 | 2 |
| 123 | 2020-01-01 11:00:00 | 3 |
| 123 | 2020-01-01 12:00:00 | 3 |
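The question also asks for the latest table2 calldate within each "overlapped" subset. One possible sketch, assuming window functions are available (MariaDB 10.2+ or MySQL 8.0+, so not the MariaDB 5.5 mentioned in the question), uses LAG() to find the previous table1 call and then counts the table2 calls that fall between the two. It is run here in an in-memory SQLite database via Python's sqlite3; the function name calls_between and the column aliases cnt and last_overlapped are made up for this illustration.

```python
import sqlite3

TABLE1 = [
    (123, "2020-01-01 10:00:00"),
    (123, "2020-01-01 11:00:00"),
    (123, "2020-01-01 12:00:00"),
]
TABLE2 = [
    (123, "2020-01-01 09:01:00"), (123, "2020-01-01 09:02:00"),
    (123, "2020-01-01 10:15:00"), (123, "2020-01-01 10:20:00"),
    (123, "2020-01-01 10:23:00"), (123, "2020-01-01 11:05:00"),
    (123, "2020-01-01 11:12:00"), (123, "2020-01-01 11:25:00"),
]

SQL = """
SELECT t1.phone,
       t1.calldate,
       COUNT(t2.calldate) AS cnt,
       MAX(t2.calldate)   AS last_overlapped
FROM (
    SELECT phone, calldate,
           -- previous Table1 call for the same phone, NULL for the first one
           LAG(calldate) OVER (PARTITION BY phone ORDER BY calldate) AS prev_calldate
    FROM Table1
) t1
LEFT JOIN Table2 t2
       ON t2.phone = t1.phone
      AND t2.calldate < t1.calldate
      AND (t1.prev_calldate IS NULL OR t2.calldate >= t1.prev_calldate)
GROUP BY t1.phone, t1.calldate
ORDER BY t1.phone, t1.calldate
"""


def calls_between():
    """Count Table2 calls that fall between consecutive Table1 calls."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE Table1 (phone INTEGER, calldate TEXT)")
        conn.execute("CREATE TABLE Table2 (phone INTEGER, calldate TEXT)")
        conn.executemany("INSERT INTO Table1 VALUES (?, ?)", TABLE1)
        conn.executemany("INSERT INTO Table2 VALUES (?, ?)", TABLE2)
        return conn.execute(SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in calls_between():
        print(row)
```

The counts match the results above, and last_overlapped gives the 09:02:00, 10:23:00 and 11:25:00 values the question asks for.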
Following query...
SELECT event_id, user_id FROM EventUser WHERE user_id IN (1, 2)
...gives me the following result:
+----------+---------+
| event_id | user_id |
+----------+---------+
| 3 | 1 |
| 2 | 1 |
| 1 | 1 |
| 5 | 1 |
| 4 | 1 |
| 6 | 1 |
| 4 | 2 |
| 2 | 2 |
| 1 | 2 |
| 5 | 2 |
+----------+---------+
Now, I want to modify the above query so that I only get for example two rows for each user_id, eg:
+----------+---------+
| event_id | user_id |
+----------+---------+
| 3 | 1 |
| 2 | 1 |
| 4 | 2 |
| 5 | 2 |
+----------+---------+
I am thinking about something like this, which of course does not work:
SELECT event_id, user_id FROM EventUser WHERE user_id IN (1, 2) LIMIT 2 by user_id
Ideally, this should work with offsets as well because I want to use it for paginations.
For performance reasons it is essential to use the WHERE user_id IN (1, 2) part of the query.
One method -- assuming you have at least two rows for each user -- would be:
(select min(event_id) as event_id, user_id
from t
where user_id in (1, 2)
group by user_id
) union all
(select max(event_id) as event_id, user_id
from t
where user_id in (1, 2)
group by user_id
);
Admittedly, this is not a "general" solution, but it might be the simplest solution for what you want.
If you want the two biggest or smallest, then an alternative also works:
select t.*
from t
where t.user_id in (1, 2) and
t.event_id >= (select t2.event_id
from t t2
where t2.user_id = t.user_id
order by t2.event_id desc
limit 1, 1
);
Here is a dynamic example for such problems. Please note that this example works in SQL Server; I could not try it on MySQL for now. Please let me know how it works.
CREATE TABLE mytable
(
number INT,
score INT
)
INSERT INTO mytable VALUES ( 1, 100)
INSERT INTO mytable VALUES ( 2, 100)
INSERT INTO mytable VALUES ( 2, 120)
INSERT INTO mytable VALUES ( 2, 110)
INSERT INTO mytable VALUES ( 3, 120)
INSERT INTO mytable VALUES ( 3, 150)
SELECT *
FROM mytable m
WHERE
(
SELECT COUNT(*)
FROM mytable m2
WHERE m2.number = m.number AND
m2.score >= m.score
) <= 2
How about this?
SELECT event_id, user_id
FROM (
SELECT event_id, user_id, row_number() OVER (PARTITION BY user_id) AS row_num
FROM EventUser WHERE user_id in (1,2)) AS t WHERE row_num <= n;
And n can be whatever
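Since the question also asks for offsets, here is a sketch of how the row_number() idea could be paged. It assumes rows are ordered by event_id within each user (the question does not say which order it wants, and without an ORDER BY in the window the numbering is arbitrary) and it needs window functions (MySQL 8.0+). The example runs in an in-memory SQLite database via Python's sqlite3; the function name events_page and its parameters are hypothetical.

```python
import sqlite3

# (event_id, user_id) rows from the question.
EVENT_USER = [
    (3, 1), (2, 1), (1, 1), (5, 1), (4, 1), (6, 1),
    (4, 2), (2, 2), (1, 2), (5, 2),
]


def events_page(user_ids, per_user, offset=0):
    """Return up to per_user rows for each user, skipping the first offset rows."""
    placeholders = ", ".join("?" for _ in user_ids)
    sql = f"""
        SELECT event_id, user_id
        FROM (
            SELECT event_id, user_id,
                   -- number each user's rows in a deterministic order
                   ROW_NUMBER() OVER (
                       PARTITION BY user_id ORDER BY event_id
                   ) AS row_num
            FROM EventUser
            WHERE user_id IN ({placeholders})
        ) AS ranked
        WHERE row_num > ? AND row_num <= ?
        ORDER BY user_id, event_id
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE EventUser (event_id INTEGER, user_id INTEGER)")
        conn.executemany("INSERT INTO EventUser VALUES (?, ?)", EVENT_USER)
        params = [*user_ids, offset, offset + per_user]
        return conn.execute(sql, params).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(events_page([1, 2], per_user=2))            # first page
    print(events_page([1, 2], per_user=2, offset=2))  # second page
```

The WHERE user_id IN (...) filter stays inside the derived table, so only the requested users are numbered.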
A late answer, but it may help: it uses a derived table and a cross join.
For the example in this post the query will be this:
SELECT
@row_number:=CASE
WHEN @user_no = user_id
THEN
@row_number + 1
ELSE
1
END AS num,
@user_no:=user_id userid, event_id
FROM
EventUser,
(SELECT @user_no:=0,@row_number:=0) as t
group by user_id,event_id
having num < 3;
I have three tables to join, one of them with one-to-many values.
CREATE TABLE Table1 (`id` int, `name` varchar(3));
INSERT INTO Table1 (`id`, `name`)
VALUES (1, 'A'), (2, 'B'), (3, 'C');
CREATE TABLE Table2 (`id` int, `status` int, `date` varchar(9));
INSERT INTO Table2 (`id`, `status`, `date`)
VALUES (1, 1, '''.11..'''), (1, 2, '''.12..'''), (1, 3, '''.13..'''),
(2, 3, '''.23..'''), (3, 1, '''.31..'''), (3, 3, '''.33..''')
;
CREATE TABLE Table3 (`id` int, `value` int);
INSERT INTO Table3 (`id`, `value`)
VALUES (1, 34), (2, 22), (3, 17);
Query 1:
select * from table1
| id | name |
|----|------|
| 1 | A |
| 2 | B |
| 3 | C |
Query 2:
select * from table2;
| id | status | date |
|----|--------|---------|
| 1 | 1 | '.11..' |
| 1 | 2 | '.12..' |
| 1 | 3 | '.13..' |
| 2 | 3 | '.23..' |
| 3 | 1 | '.31..' |
| 3 | 3 | '.33..' |
Query 3:
select * from table3
| id | value |
|----|-------|
| 1 | 34 |
| 2 | 22 |
| 3 | 17 |
I need a query that returns, for each id:
TABLE1.name, TABLE2.status, TABLE2.date, TABLE3.value
with this condition:
If TABLE2.status = 1 exists, return ONLY that row of TABLE2
Else, if TABLE2.status = 1 does not exist, look for status = 2 and return ONLY that row of TABLE2
If neither of those values is present in TABLE2, skip that id in the results
EDIT: TABLE2 has a UNIQUE key on (id, status), so there can be only one row with id=1 and status=1
Thanks for your help!
Something like this maybe:
select table1.id, table1.name,
coalesce(table2_status1.status, table2_status2.status) as status,
coalesce(table2_status1.date, table2_status2.date) as date,
table3.value
from table1
left join table2 table2_status1 on table2_status1.id = table1.id and table2_status1.status = 1
left join table2 table2_status2 on table2_status2.id = table1.id and table2_status2.status = 2
join table3 on table3.id = table1.id
where (table2_status1.id is not null or table2_status2.id is not null);
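Another way to express "prefer status 1, else status 2" is to pick the smallest status among (1, 2) per id with a correlated subquery. This is only a sketch and relies on the UNIQUE key on (id, status) mentioned in the question, so the chosen status matches at most one row. It runs in an in-memory SQLite database via Python's sqlite3; the function name preferred_status_rows and its table2_rows parameter are made up for this illustration.

```python
import sqlite3

TABLE1 = [(1, "A"), (2, "B"), (3, "C")]
TABLE2 = [
    (1, 1, "'.11..'"), (1, 2, "'.12..'"), (1, 3, "'.13..'"),
    (2, 3, "'.23..'"), (3, 1, "'.31..'"), (3, 3, "'.33..'"),
]
TABLE3 = [(1, 34), (2, 22), (3, 17)]

SQL = """
SELECT t1.name, t2.status, t2.date, t3.value
FROM Table1 t1
JOIN Table3 t3 ON t3.id = t1.id
JOIN Table2 t2 ON t2.id = t1.id
WHERE t2.status = (
    -- 1 if present, otherwise 2; NULL (no match) if neither exists
    SELECT MIN(s.status)
    FROM Table2 s
    WHERE s.id = t1.id AND s.status IN (1, 2)
)
ORDER BY t1.id
"""


def preferred_status_rows(table2_rows=TABLE2):
    """Return one row per id, using status 1 if it exists, else status 2."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE Table1 (id INTEGER, name TEXT)")
        conn.execute("CREATE TABLE Table2 (id INTEGER, status INTEGER, date TEXT)")
        conn.execute("CREATE TABLE Table3 (id INTEGER, value INTEGER)")
        conn.executemany("INSERT INTO Table1 VALUES (?, ?)", TABLE1)
        conn.executemany("INSERT INTO Table2 VALUES (?, ?, ?)", table2_rows)
        conn.executemany("INSERT INTO Table3 VALUES (?, ?)", TABLE3)
        return conn.execute(SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in preferred_status_rows():
        print(row)
```

With the sample data, id 2 only has status 3, so the subquery returns NULL and that id is skipped.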
Not performant, since it uses subselects, but it works (rlanvins answer, https://stackoverflow.com/a/48235077/7505395, is nicer):
A,B,C instead of First, Second, ...
select
TABLE1.name,
case
when exists( select 1 from table2 where id = table1.id and status = 1)
then 1
when exists( select 1 from table2 where id = table1.id and status = 2)
then 2
end as T2status,
case
when exists( select 1 from table2 where id = table1.id and status = 1)
then ( select date from table2 where id = table1.id and status = 1)
when exists( select 1 from table2 where id = table1.id and status = 2)
then ( select date from table2 where id = table1.id and status = 2)
end as T2date,
TABLE3.value
from table1
join table3 on table1.id = table3.id
where
exists( select 1 from table2 where id = table1.id and status = 1)
or exists( select 1 from table2 where id = table1.id and status = 2)
Output
Name T2status T2date value
A 1 '.11..' 34
C 1 '.31..' 17