Selecting Non-existent Data With MySQL

I'm trying to select data between two dates. However, data is not inserted every day. Below is a sample of the table:
mysql> SELECT * FROM attendance;
+------------+-------+
| date | total |
+------------+-------+
| 2012-07-02 | 100 |
| 2012-07-04 | 70 |
| 2012-07-05 | 78 |
+------------+-------+
3 rows in set (0.00 sec)
The scenario is that I want to get the attendance totals from 2012-07-02 to 2012-07-04. Based on the data above, I get:
mysql> SELECT * FROM attendance WHERE date BETWEEN '2012-07-02' AND '2012-07-04';
+------------+-------+
| date | total |
+------------+-------+
| 2012-07-02 | 100 |
| 2012-07-04 | 70 |
+------------+-------+
2 rows in set (0.00 sec)
However, my objective is to have 2012-07-03 included in the result:
+------------+-------+
| date | total |
+------------+-------+
| 2012-07-02 | 100 |
| 2012-07-03 | 0 |
| 2012-07-04 | 70 |
+------------+-------+
Is this possible in MySQL? I looked into temporary tables but was still unable to achieve this.

You can enumerate the dates as a derived pseudo-table (with UNION ALL) and then LEFT JOIN your data to it:
SELECT dates.date, COALESCE(attendance.total,0) AS total FROM (
SELECT '2012-07-02' AS date
UNION ALL SELECT '2012-07-03'
UNION ALL SELECT '2012-07-04'
) AS dates
LEFT JOIN attendance USING(date)
Edit: added COALESCE to return 0 instead of NULL on missing records.
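On MySQL 8.0 or later, the hard-coded UNION list could instead be generated with a recursive common table expression. The following is only a sketch under that version assumption. The CTE name dates and its column d are made up for illustration, and attendance(date, total) is the table from the question.

```sql
-- Sketch, assuming MySQL 8.0+ (recursive CTEs are not available before that).
-- "dates" and "d" are illustrative names, not part of the original answer.
WITH RECURSIVE dates (d) AS (
    -- Anchor row: first day of the requested range.
    SELECT CAST('2012-07-02' AS DATE)
    UNION ALL
    -- Recursive step: add one day until the last day of the range is reached.
    SELECT d + INTERVAL 1 DAY
    FROM dates
    WHERE d < '2012-07-04'
)
SELECT dates.d AS date,
       COALESCE(a.total, 0) AS total   -- 0 instead of NULL for missing days
FROM dates
LEFT JOIN attendance a ON a.date = dates.d
ORDER BY dates.d;
```

For ranges longer than 1000 days, the session variable cte_max_recursion_depth (default 1000) would need to be raised first.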

This is a common problem with a simple solution.
Create a regular table, say REF_DATE, and store in it all dates for about 3 years, or whatever time span you need.
Then use this table on the left side of a LEFT OUTER JOIN:
SELECT REF.date, IFNULL(A.total, 0) AS total
FROM REF_DATE REF
LEFT OUTER JOIN attendance A
ON REF.date = A.date
WHERE REF.date BETWEEN '2012-07-02' AND '2012-07-04';
DATE is a keyword in MySQL (though not a reserved one); I have used it here for readability. Use a different column name in practice.
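One possible way to create and fill such a table, as a sketch that assumes MySQL 8.0+. The column name ref_day is a hypothetical replacement for date, and the 2012 to 2014 span is only an example of "whatever time span you need".

```sql
-- Hypothetical calendar table; ref_day stands in for the column called "date" above.
CREATE TABLE REF_DATE (
    ref_day DATE NOT NULL PRIMARY KEY
);

-- The default recursion limit is 1000 iterations; three years of days need more.
SET SESSION cte_max_recursion_depth = 2000;

-- Fill the table with one row per day from 2012-01-01 through 2014-12-31.
INSERT INTO REF_DATE (ref_day)
WITH RECURSIVE days (d) AS (
    SELECT CAST('2012-01-01' AS DATE)
    UNION ALL
    SELECT d + INTERVAL 1 DAY FROM days WHERE d < '2014-12-31'
)
SELECT d FROM days;

-- The join from the answer, written against the renamed column.
SELECT REF.ref_day, IFNULL(A.total, 0) AS total
FROM REF_DATE REF
LEFT OUTER JOIN attendance A ON REF.ref_day = A.date
WHERE REF.ref_day BETWEEN '2012-07-02' AND '2012-07-04';
```

The table only has to be filled once, so on older servers the rows could just as well be loaded from a script or a spreadsheet export.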

MySQL cannot generate data that isn't there. If you want non-existent dates, you'll need a temporary table containing the full date range to join against. Another alternative is to maintain a user-defined session variable and do some date math for each row, which is ugly:
SELECT @dateval := '2012-07-02';
SELECT @dateval := @dateval + INTERVAL 1 DAY FROM ...

Related

SQL Query with all data from left column and fill blank with previous row value

After searching a lot on this forum and the web, I have an issue that I cannot solve without your help.
The requirement looks simple, but the code is not :-(
Basically, I need to make a report on cumulative sales by product by week.
I have a table with the calendar (including all the weeks) and a view which gives me all the cumulative values by product, sorted by week. What I need the query to do is give me all the weeks for each product and then add, in a column, the cumulative values from the view. If a value does not exist, it should give me the last known record.
Can you help?
Thanks,
The principle is to establish all the weeks in which a product could have had sales, sum the amounts grouped by week, add the missing weeks, and use SUM as a window function to get a cumulative sum:
DROP TABLE IF EXISTS T;
CREATE TABLE T
(PROD INT, DT DATE, AMOUNT INT);
INSERT INTO T VALUES
(1,'2022-01-01', 10),(1,'2022-01-01', 10),(1,'2022-01-20', 10),
(2,'2022-01-10', 10);
WITH CTE AS
(SELECT MIN(YEARWEEK(DT)) MINYW, MAX(YEARWEEK(DT)) MAXYW FROM T),
CTE1 AS
(SELECT DISTINCT YEARWEEK(DTE) YW ,PROD
FROM DATES
JOIN CTE ON YEARWEEK(DTE) BETWEEN MINYW AND MAXYW
CROSS JOIN (SELECT DISTINCT PROD FROM T) C
)
SELECT CTE1.YW,CTE1.PROD
,SUMAMT,
SUM(SUMAMT) OVER(PARTITION BY CTE1.PROD ORDER BY CTE1.YW) CUMSUM
FROM CTE1
LEFT JOIN
(SELECT YEARWEEK(DT) YW,PROD ,SUM(AMOUNT) SUMAMT
FROM T
GROUP BY YEARWEEK(DT),PROD
) S ON S.PROD = CTE1.PROD AND S.YW = CTE1.YW
ORDER BY CTE1.PROD,CTE1.YW
;
+--------+------+--------+--------+
| YW | PROD | SUMAMT | CUMSUM |
+--------+------+--------+--------+
| 202152 | 1 | 20 | 20 |
| 202201 | 1 | NULL | 20 |
| 202202 | 1 | NULL | 20 |
| 202203 | 1 | 10 | 30 |
| 202152 | 2 | NULL | NULL |
| 202201 | 2 | NULL | NULL |
| 202202 | 2 | 10 | 10 |
| 202203 | 2 | NULL | 10 |
+--------+------+--------+--------+
8 rows in set (0.021 sec)
Your calendar table may be slightly different from mine (here it is DATES, with a date column DTE), but you should get the general idea.

Write query on sql

I have a monitoring table with a client ID column (ClientID) and the time of the last login to the application (Time). I wrote a query that shows at what times clients had access to the system, in the form: Time - Number of entries at this time - Client IDs.
Query:
select Time, count(*) as Quantity, group_concat(ClientID) from monitoring group by Time;
How do I write the following query? Display the table in the same form, but for each time when at least one client had access, show the IDs of all clients who did not have access at that time.
UPD.
+---------------------+-------------+----------------+
| Time | Quantity | ClientID |
+---------------------+-------------+----------------+
| 2018-06-14 15:51:03 | 3 | 311,240,528 |
| 2018-06-14 15:51:20 | 3 | 314,312,519 |
| 2019-01-14 06:00:07 | 1 | 359 |
| 2019-08-21 14:30:04 | 1 | 269 |
+---------------------+-------------+----------------+
These are the IDs of the clients who had access at each time. I need to display the IDs of all clients who did not have access at that particular time.
That is, in this case:
+---------------------+-------------+-----------------------------+
| Time | Quantity | ClientID |
+---------------------+-------------+-----------------------------+
| 2018-06-14 15:51:03 | 5 | 269,359,314,312,519 |
| 2018-06-14 15:51:20 | 5 | 311,240,528,359,269 |
| 2019-01-14 06:00:07 | 7 | 311,240,528,314,312,519,269 |
| 2019-08-21 14:30:04 | 7 | 311,240,528,314,312,519,359 |
+---------------------+-------------+-----------------------------+
Ideally, only the year and month would be taken into account rather than the day and time, but I'll take whatever works. Thanks.
You can generate all possible combinations of clients and time with a cross join of two select distinct subqueries, and then filter out those that exist in the table with not exists. The final step is aggregation:
select t.time, count(*) as quantity, group_concat(c.clientid) as clientids
from (select distinct time from monitoring) t
cross join (select distinct clientid from monitoring) c
where not exists (
select 1
from monitoring m
where m.time = t.time and m.clientid = c.clientid
)
group by t.time
It is unclear to me what you mean by the last sentence in the question. The above query would generate the results that you showed for your sample data.
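If the last sentence of the question means that logins should be compared per year and month rather than per exact timestamp, one possible reading could be sketched as follows. This is an assumption about the intent, not something the question confirms. The alias ym and the derived-table names p and c are made up for illustration.

```sql
-- Sketch: same cross join / NOT EXISTS idea, but bucketed by year-month.
-- Assumes the monitoring(time, clientid) table from the question.
SELECT p.ym,
       COUNT(*) AS quantity,
       GROUP_CONCAT(c.clientid) AS clientids
FROM (SELECT DISTINCT DATE_FORMAT(time, '%Y-%m') AS ym FROM monitoring) p
CROSS JOIN (SELECT DISTINCT clientid FROM monitoring) c
WHERE NOT EXISTS (
    -- Drop the pair if this client logged in at any time during that month.
    SELECT 1
    FROM monitoring m
    WHERE m.clientid = c.clientid
      AND DATE_FORMAT(m.time, '%Y-%m') = p.ym
)
GROUP BY p.ym;
```

A month in which every known client logged in would not appear at all, since no rows survive the NOT EXISTS filter for it.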

How to get the average time between multiple dates

What I'm trying to do is bucket my customers based on their transaction frequency. I have the date recorded for every time they transact, but I can't work out how to get the average delta between consecutive dates. What I effectively want is a table showing me:
| User | Average Frequency
| 1 | 15
| 2 | 15
| 3 | 35
...
The data I currently have is formatted like this:
| User | Transaction Date
| 1 | 2018-01-01
| 1 | 2018-01-15
| 1 | 2018-02-01
| 2 | 2018-06-01
| 2 | 2018-06-18
| 2 | 2018-07-01
| 3 | 2019-01-01
| 3 | 2019-02-05
...
So basically, each customer will have multiple transactions, and I want to understand how to get the delta between consecutive dates and then the average of the deltas.
I know the datediff function and how it works, but I can't work out how to split the transactions up. I also know that an offset function is available in tools like Looker, but I don't know the syntax behind it.
Thanks
In MySQL 8+ you can use LAG to get each user's previous Transaction Date and then use DATEDIFF to get the difference between two consecutive dates. You can then take the average of those values:
SELECT User, AVG(delta) AS `Average Frequency`
FROM (SELECT User,
DATEDIFF(`Transaction Date`, LAG(`Transaction Date`) OVER (PARTITION BY User ORDER BY `Transaction Date`)) AS delta
FROM transactions) t
GROUP BY User
Output:
User Average Frequency
1 15.5
2 15
3 35
Demo on dbfiddle.com
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(user INT NOT NULL
,transaction_date DATE
,PRIMARY KEY(user,transaction_date)
);
INSERT INTO my_table VALUES
(1,'2018-01-01'),
(1,'2018-01-15'),
(1,'2018-02-01'),
(2,'2018-06-01'),
(2,'2018-06-18'),
(2,'2018-07-01'),
(3,'2019-01-01'),
(3,'2019-02-05');
SELECT user
, AVG(delta) avg_delta
FROM
( SELECT x.*
, DATEDIFF(x.transaction_date,MAX(y.transaction_date)) delta
FROM my_table x
JOIN my_table y
ON y.user = x.user
AND y.transaction_date < x.transaction_date
GROUP
BY x.user
, x.transaction_date
) a
GROUP
BY user;
+------+-----------+
| user | avg_delta |
+------+-----------+
| 1 | 15.5000 |
| 2 | 15.0000 |
| 3 | 35.0000 |
+------+-----------+
I don't know what to say other than use a GROUP BY.
SELECT User, AVG(DATEDIFF(...))
FROM ...
GROUP BY User
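A plain GROUP BY can in fact be enough here, because the consecutive deltas telescope: their sum is simply the last date minus the first date, and there is one delta fewer than there are transactions. A sketch under the assumption that the table is called transactions with the columns User and Transaction Date, as in the LAG answer above:

```sql
-- Sketch: average gap = (last date - first date) / (number of gaps).
-- Needs no window functions, so it should also run on MySQL versions before 8.0.
SELECT User,
       DATEDIFF(MAX(`Transaction Date`), MIN(`Transaction Date`))
           / NULLIF(COUNT(*) - 1, 0) AS `Average Frequency`
FROM transactions
GROUP BY User;
```

NULLIF turns the divisor into NULL for a user with a single transaction, so such a user gets NULL rather than a division by zero, which matches what AVG over the LAG deltas would return.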

MySQL - Selecting a week's worth of data from week beginning date

Trying to do a select but can't seem to master the art of this particular one.
This is what I have tried:
select user_id,date,monday_am_task from users,week,timesheet_submission where user_id='1' and date='2015-04-06';
However, it says the column is ambiguous. This is basically what I want to do: if user_id = 1 and the date falls within the week commencing 2015-04-06, then show the data. By that I mean the following: I have set up a week table, which includes week_id, week (the week number) and date. The date is the week commencing date. So with my select statement I am trying to pass in the week commencing date and pull all the data for that week, if that makes sense?
Week Table:
mysql> select * from week;
+---------+------+------------+
| week_id | week | date |
+---------+------+------------+
| 1 | 1 | 2014-12-29 |
| 2 | 2 | 2015-01-05 |
| 3 | 3 | 2015-01-12 |
| 4 | 4 | 2015-01-19 |
| 5 | 5 | 2015-01-26 |
| 6 | 6 | 2015-02-02 |
| 7 | 7 | 2015-02-09 | etc...
Users:
mysql> select user_id, username, level from users;
+---------+----------+-------+
| user_id | username | level |
+---------+----------+-------+
| 1 | tom | 1 |
| 2 | owain | 2 |
+---------+----------+-------+
2 rows in set (0.00 sec)
mysql> select user_id, date, timesheet_id, monday_am_task from timesheet_submission;
+---------+---------------------+--------------+----------------+
| user_id | date | timesheet_id | monday_am_task |
+---------+---------------------+--------------+----------------+
| 1 | 2015-04-10 12:44:54 | 34 | 5 |
+---------+---------------------+--------------+----------------+
1 row in set (0.00 sec)
This error happens when a column name exists in more than one of the selected tables. In this case, user_id is in both users and timesheet_submission, and date is in both week and timesheet_submission.
In addition, while it is a matter of personal preference, many people prefer the explicit JOIN syntax to listing multiple tables in the FROM clause. Listing the tables without join conditions creates a Cartesian product, which is what your query does, and unless you set the proper conditions to relate the tables, you can get very odd results.
In short, the join syntax may look like this:
SELECT [columns]
FROM table1 t1
JOIN table2 t2 ON t2.relatedColumn = t1.relatedColumn;
For your tables, you can join users to timesheet_submission, and timesheet_submission to week, although that one is tricky because there is no direct link. To break my answer down a bit, I would start by getting all timesheet submissions for user_id 1 with a join like this:
SELECT u.user_id, ts.date, ts.monday_am_task
FROM users u
JOIN timesheet_submission ts ON ts.user_id = u.user_id AND u.user_id = 1;
Don't forget to use the table alias in the select list, or you'll get the ambiguity error again. As for the date, ts.date holds a date and time, so BETWEEN with a bare end date would stop at midnight at the start of the last day and miss submissions made later that day. If I were writing this query, I would instead require that the timesheet date is on or after your given date and strictly before the date seven days later, which covers the whole week without including the start of the next one. Try this:
SELECT u.user_id, ts.date, ts.monday_am_task
FROM users u
JOIN timesheet_submission ts ON ts.user_id = u.user_id AND u.user_id = 1
WHERE ts.date >= '2015-04-06'
AND ts.date < DATE_ADD('2015-04-06', INTERVAL 7 DAY);
Here is more on JOINS and on the DATE_ADD function.
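Although there is no direct key between timesheet_submission and week, the week table could still be joined on a date range. This is a sketch only; it assumes week.date always holds the week commencing date, as described in the question, and the aliases week_commencing and submitted_at are made up for illustration.

```sql
-- Sketch: relate each submission to the week whose 7-day window contains it.
SELECT ts.user_id,
       w.week,
       w.date  AS week_commencing,
       ts.date AS submitted_at,
       ts.monday_am_task
FROM week w
JOIN timesheet_submission ts
  ON ts.date >= w.date                     -- on or after the week start
 AND ts.date <  w.date + INTERVAL 7 DAY    -- strictly before the next week start
WHERE w.date = '2015-04-06'
  AND ts.user_id = 1;
```

Every column is qualified with its table alias, which avoids the ambiguity error for user_id and date.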
