Get value difference between two dates in MySQL - mysql

I have the following table structure:
page_id view_count date
1 30 2018-08-30
1 33 2018-08-31
1 1 2018-09-01
1 5 2018-09-02
...
The view count is reset on the 1st of every month, and its current value is stored daily, so on 31 August it increased by 3 (33 - 30).
I need to retrieve the view count (the difference) between two dates with a SQL query. For two dates in the same month this is simple: subtract the value at the earlier date from the value at the later date. What I am not sure how to do is handle two dates in different months. To retrieve the views between 2018-08-13 and 2018-09-13, I would have to take the difference between 2018-08-31 and 2018-08-13 and add it to the value at 2018-09-13.
I would also like to do this for all page_id values at once, between the same dates, if possible within a single query.

Assuming that the counter is unique per page and that a row is inserted for every page_id every day, I think a solution like this would work.
The dates are taken from the example and should be replaced by the relevant parameters.
Note that it assumes the two dates fall in consecutive months.
SELECT
v1.page_id,
v1.view_count + eom.view_count - v2.view_count AS views
FROM
view_counts v1
INNER JOIN view_counts v2 ON v2.page_id = v1.page_id AND v2.`date` = '2018-08-13'
INNER JOIN view_counts eom ON eom.page_id = v1.page_id AND eom.`date` = LAST_DAY(DATE_ADD(v1.`date`, INTERVAL -1 MONTH))
WHERE
v1.`date` = '2018-09-13'
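To check the arithmetic, here is a minimal runnable sketch of the same idea. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so LAST_DAY() is replaced by SQLite's date modifiers, and the sample rows are made up for illustration (they are not from the question).

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE view_counts (page_id INTEGER, view_count INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO view_counts VALUES (?, ?, ?)",
    [
        (1, 10, "2018-08-13"), (1, 33, "2018-08-31"), (1, 7, "2018-09-13"),
        (2, 5, "2018-08-13"), (2, 20, "2018-08-31"), (2, 4, "2018-09-13"),
    ],
)

# SQLite has no LAST_DAY(); date(d, 'start of month', '-1 day') gives the
# last day of the month before d, which is what the MySQL expression computes.
query = """
SELECT v1.page_id,
       v1.view_count + eom.view_count - v2.view_count AS views
FROM view_counts v1
JOIN view_counts v2  ON v2.page_id = v1.page_id AND v2.date = :start_date
JOIN view_counts eom ON eom.page_id = v1.page_id
                    AND eom.date = date(v1.date, 'start of month', '-1 day')
WHERE v1.date = :end_date
ORDER BY v1.page_id
"""
views = conn.execute(
    query, {"start_date": "2018-08-13", "end_date": "2018-09-13"}
).fetchall()
print(views)  # [(1, 30), (2, 19)]
```

For page 1 this is 7 + 33 - 10 = 30: the September value plus what was gained between 13 and 31 August. The sketch only covers dates in consecutive months; a longer range would need the end-of-month value of every month in between.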

Related

SQL query to calculate number of active users at the end of every day

I have three columns: User_ID, New_Status and DATETIME.
New_Status contains 0 (inactive) or 1 (active) for each user.
Every user starts in the active status, i.e. 1.
Subsequently, the table stores each status change and the datetime at which the user was activated or deactivated.
How can I calculate the number of active users at the end of each date, including dates for which no records were written to the table?
Sample data:
| ID | New_Status | DATETIME |
+----+------------+---------------------+
| 1 | 1 | 2019-01-01 21:00:00 |
| 1 | 0 | 2019-02-05 17:00:00 |
| 1 | 1 | 2019-03-06 18:00:00 |
| 2 | 1 | 2019-01-02 01:00:00 |
| 2 | 0 | 2019-02-03 13:00:00 |
Format the date time value to a date only string and group by it
SELECT DATE_FORMAT(DATETIME, '%Y-%m-%d') as day, COUNT(*) as active
FROM test
WHERE New_Status = 1
GROUP BY day
ORDER BY day
In MySQL 8 you can use the row_number() window function to get the last status of each user per day. Then filter for the rows that indicate the user was active, GROUP BY the day and count them.
SELECT date(x.datetime),
count(*)
FROM (SELECT date(t.datetime) datetime,
t.new_status,
row_number() OVER (PARTITION BY t.user_id, date(t.datetime)
ORDER BY t.datetime DESC) rn
FROM elbat t) x
WHERE x.rn = 1
AND x.new_status = 1
GROUP BY x.datetime;
If not all days are in the table you need to create a (possibly derived) table with all days and cross join it.
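As a rough illustration of the "last status per user per day" step, the sketch below runs a row_number() query through Python's sqlite3 module (SQLite 3.25 or newer is assumed for window function support). The table name status_log, the column dt and user 3 are hypothetical; user 3 is added only to show a same-day activation followed by a deactivation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE status_log (user_id INTEGER, new_status INTEGER, dt TEXT)"
)
conn.executemany(
    "INSERT INTO status_log VALUES (?, ?, ?)",
    [
        (1, 1, "2019-01-01 21:00:00"),
        (1, 0, "2019-02-05 17:00:00"),
        (1, 1, "2019-03-06 18:00:00"),
        (2, 1, "2019-01-02 01:00:00"),
        (2, 0, "2019-02-03 13:00:00"),
        # Hypothetical user: activated and deactivated on the same day.
        (3, 1, "2019-01-01 09:00:00"),
        (3, 0, "2019-01-01 10:00:00"),
    ],
)

# rn = 1 marks each user's latest row on each day; only rows whose latest
# status is 1 are counted.
query = """
SELECT day, COUNT(*) AS activated
FROM (
    SELECT date(dt) AS day,
           new_status,
           ROW_NUMBER() OVER (
               PARTITION BY user_id, date(dt) ORDER BY dt DESC
           ) AS rn
    FROM status_log
) AS latest
WHERE rn = 1 AND new_status = 1
GROUP BY day
ORDER BY day
"""
activated_per_day = conn.execute(query).fetchall()
print(activated_per_day)
```

Note what this counts: users whose last change on a given day was an activation. It is not yet the running number of active users, and days without any rows are still missing from the result.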
Find out the last activity status of users whose activity was changed for each day
select User_ID, New_Status, DATE_FORMAT(DATETIME, '%Y-%m-%d')
from activity_table
where not exists
(
select 1
from activity_table at
where at.User_ID = activity_table.User_ID and
DATE_FORMAT(at.DATETIME, '%Y-%m-%d') = DATE_FORMAT(activity_table.DATETIME, '%Y-%m-%d') and
at.DATETIME > activity_table.DATETIME
)
order by DATE_FORMAT(activity_table.DATETIME, '%Y-%m-%d');
This is not the full solution yet, but it is very useful information to have first. Note that not all dates are covered yet, and the rows are individual records (each user's last status on each day), ordered by date.
Let's get aggregate numbers
Using the query above as a subselect and aliasing it as a table, you can group by the date and select sum(New_Status) as activity, count(*) as total, and the date. Then activity - (total - activity), i.e. activations minus deactivations, is the difference compared with the previous day.
Knowing the delta for each day present in the result
In the previous section we saw how the delta can be calculated. If the whole query from the previous section is aliased, you can self join it with a left join on pairs of (previous date, current date); the result still has gaps between dates, but we will not worry about that just yet. For the first date, its delta is the number of active users. For each subsequent record, adding the previous day's total to its delta yields the result you need. To achieve this you can use a recursive query, supported by MySQL 8, or alternatively a subquery that sums the deltas of all previous days (paying special attention to the first date, as described earlier); adding the current date's delta to that sum yields the result we need.
Fill the gaps
The previous section would already work perfectly (assuming no integrity problems) if there were activity changes on every day, but we will not continue with that assumption. Here we know that the figures are correct for each date where a figure is present, and we just need to add the missing dates to the result. If the results are properly ordered, as they should be, one can use a cursor and loop over the results. At each record after the first one, we can determine which dates are missing. There might be zero or more such dates between two consecutive records. What we do know about the gaps is that their values are exactly the same as in the previous record that does have data: if there were no activity changes on a given date, then the number of active users is exactly the same as on the previous day. Using some structure, such as a table, you can generate the results with the knowledge described here.
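The delta and gap-filling steps can be sketched outside the database as well. The following is a minimal Python sketch, not the cursor-based implementation described above: it assumes the per-day deltas have already been computed, and the values in daily_delta are what the question's sample data would give under the assumption that every stored row is a real status change.

```python
from datetime import date, timedelta

# Assumed per-day deltas (activations minus deactivations), derived by hand
# from the question's sample data.
daily_delta = {
    date(2019, 1, 1): 1,   # user 1 activated
    date(2019, 1, 2): 1,   # user 2 activated
    date(2019, 2, 3): -1,  # user 2 deactivated
    date(2019, 2, 5): -1,  # user 1 deactivated
    date(2019, 3, 6): 1,   # user 1 activated again
}


def active_per_day(deltas):
    """Return {day: active users} for every day from the first to the last
    recorded date, carrying the previous total forward over gaps."""
    days = sorted(deltas)
    result = {}
    active = 0
    current = days[0]
    while current <= days[-1]:
        active += deltas.get(current, 0)  # days without changes add 0
        result[current] = active
        current += timedelta(days=1)
    return result


totals = active_per_day(daily_delta)
print(totals[date(2019, 2, 2)])  # 2, carried forward from 2019-01-02
print(totals[date(2019, 2, 5)])  # 0
```

The running total is the sum of all deltas up to and including the current day, and a day with no rows simply contributes a delta of 0, which is why its value equals that of the previous day.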
Solving possible integrity problems
There are several possibilities for such problems:
First, a data item might have existed before records started being written to this table.
Second, bugs or other causes might have paused the creation of records in this activity table.
Third, the addition of a user does or did not necessarily generate an activity change, since the user's popping into existence renders its previous state of activity undefined and subject to human standards, which might change over time.
Fourth, the removal of a user does or did not necessarily generate an activity change, since the user's popping out of existence renders its current state of activity undefined and subject to human standards, which might change over time.
Fifth, there are countless other issues that might cause data integrity problems.
To cope with these you will need to comprehensively analyze whatever you can from the source-code and the history of the project, including database records, logs and humanly available information to detect such anomalies, the time they were effective and figure out what their solution is if they exist.
EDIT
In the meantime I was thinking about the possibility of a user who was active at the start of the day being deactivated and then activated again by the end of the day. Similarly, a user who was inactive at the start of the day might be activated and then deactivated again by the end of the day. For users that have more than one status change during a day, we need to compare their activity status at the start and at the end of the day to find out what the net difference was.
SELECT
DATE(DATETIME),
COUNT(*)
FROM your_table
WHERE New_Status = 1
GROUP BY User_ID,
DATE(DATETIME)
For MySQL 8+:
WITH RECURSIVE
cte AS (
SELECT MIN(DATE(DT)) dt
FROM src
UNION ALL
SELECT dt + INTERVAL 1 DAY
FROM cte
WHERE dt < ( SELECT MAX(DATE(DT)) dt
FROM src )
),
cte2 AS
(
SELECT users.id,
cte.dt,
SUM( CASE src.New_Status WHEN 1 THEN 1
WHEN 0 THEN -1
ELSE 0
END ) OVER ( PARTITION BY users.id
ORDER BY cte.dt ) status
FROM cte
CROSS JOIN ( SELECT DISTINCT id
FROM src ) users
LEFT JOIN src ON src.id = users.id
AND DATE(src.dt) = cte.dt
)
SELECT dt, SUM(status)
FROM cte2
GROUP BY dt;
Do not forget to adjust the maximum recursion depth (cte_max_recursion_depth, 1000 by default) if the date range spans more days than that.
Here is what I believe is a good solution for this problem of yours:
SELECT SUM(New_Status) "Number of active users"
, DATE_FORMAT(DATEC, '%Y-%m-%d') "Date"
FROM TEST T1
WHERE DATE_FORMAT(DATEC,'%H:%i:%s') =
(SELECT MAX(DATE_FORMAT(T2.DATEC,'%H:%i:%s'))
FROM TEST T2
WHERE T2.ID = T1.ID
AND DATE_FORMAT(T1.DATEC, '%Y-%m-%d') = DATE_FORMAT(T2.DATEC, '%Y-%m-%d')
GROUP BY ID
, DATE_FORMAT(DATEC, '%Y-%m-%d'))
GROUP BY DATE_FORMAT(DATEC, '%Y-%m-%d');

SQL Query - Find out how many times a row changes from 0 to another value

I am using MySQL 8 and need to create a stored procedure.
I have a single table that has a DATE field and a value field, which can be 0 or any other number. The value field represents the amount of rain for that day.
The table stores data from today up to 10 years ahead.
I need to find out how many periods of rain there will be in the next 10 years.
So, for example, if my table contains the following data:
Date - Value
2018-06-09 - 0
2018-06-10 - 50
2018-06-11 - 0
2018-06-12 - 15
2018-06-13 - 17
2018-06-14 - 0
2018-06-15 - 0
2018-06-16 - 12
2018-06-17 - 123
2018-06-18 - 17
Then the SP should return 3, because there were 3 periods of rain.
Any help in getting me closer to the answer will be appreciated!
You don't need a stored procedure for this.
Here is a solution using MySQL 8.0's LEAD function; it also supports dates with gaps.
The complete table needs to be scanned, but I don't think that is a huge problem with ~3650 records.
Query
SELECT
SUM(filter_match = 1) AS number
FROM (
SELECT
((t.value = 0) AND (LEAD(t.value) OVER (ORDER BY t.date ASC) != 0)) AS filter_match
FROM
t
) t
see demo https://www.db-fiddle.com/f/sev4NqgLsFPgtNgwzruwy/2
By the way, would you mind expanding your answer to understand how
LEAD and SUM work together?
LEAD(t.value) OVER (ORDER BY t.date ASC) simply means: get the value from the next record, ordered by date.
This demo shows it nicely: https://www.db-fiddle.com/f/sev4NqgLsFPgtNgwzruwy/6
SUM(filter_match = 1) is a conditional sum; in this case it counts the rows where the alias filter_match is true.
See what filter_match is in this demo: https://www.db-fiddle.com/f/sev4NqgLsFPgtNgwzruwy/8
In MySQL, aggregate functions can take a boolean SQL expression, such as 1 = 1 (which is always true, i.e. 1) or 1 = 0 (which is always false, i.e. 0).
The conditional sum only adds 1 when the condition is true.
See demo https://www.db-fiddle.com/f/sev4NqgLsFPgtNgwzruwy/7
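To see LEAD and the conditional sum work together on the question's sample data, here is a small runnable sketch. It uses Python's sqlite3 module instead of MySQL (SQLite 3.25 or newer is assumed for window functions); the table name t follows the answer, everything else is taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2018-06-09", 0), ("2018-06-10", 50), ("2018-06-11", 0),
        ("2018-06-12", 15), ("2018-06-13", 17), ("2018-06-14", 0),
        ("2018-06-15", 0), ("2018-06-16", 12), ("2018-06-17", 123),
        ("2018-06-18", 17),
    ],
)

# filter_match is 1 on a dry day that is followed by a rainy day, i.e. on
# the row just before each rain period starts; summing it counts the periods.
query = """
SELECT SUM(filter_match) AS number
FROM (
    SELECT (value = 0 AND LEAD(value) OVER (ORDER BY date) != 0) AS filter_match
    FROM t
) AS marked
"""
periods = conn.execute(query).fetchone()[0]
print(periods)  # 3
```

The three matches are the dry days 2018-06-09, 2018-06-11 and 2018-06-15. One limitation worth knowing: a rain period that is already in progress on the very first row has no preceding dry day and would not be counted by this approach.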
Use MySQL join:
SELECT COUNT(*) Number_of_Periods
FROM yourTable A JOIN yourTable B
ON DATE(A.`DATE`)=DATE(B.`DATE` - INTERVAL 1 DAY)
AND A.`VALUE`=0 AND B.`VALUE`>0;

How can I get the month that is not yet updated in SQL by inserting another row on every update?

I have a table that contains records of different transactions that need to be updated monthly. Once the records for a specific month have been successfully updated, a new record is inserted into that table to indicate that the month is already updated. Take this example:
**date_of_transaction** **type**
2015-04-21 1 //A deposit record
2015-04-24 2 //A withdrawal record
2015-04-29 1
2015-04-30 2
2015-04-30 3 //3, means an update record
2015-05-14 1
2015-05-22 1
2015-05-27 2
2015-05-30 2
2015-06-09 1
2015-06-12 2
2015-06-17 2
2015-06-19 2
Let's suppose that today is July 23, 2015. I can only get data from before the current month, so the only data I can get are the records from June and earlier.
As you can see, an update was performed in April, as shown by the '3' in the type attribute, but no updates occurred in May and June. How can I get the months that are not yet updated?
This will return the months that have no type=3 rows:
SELECT MONTH([trans-date]) FROM [table] GROUP BY MONTH([trans-date]) HAVING MAX([trans-type])<3
Note: this will not work if 3 is not max value in the column
My approach would be to find all the months first, then find the months whose records were updated, and then select only those months whose records weren't updated (a set minus operation).
The MySQL query would be something like this:
select distinct extract(MONTH FROM date_of_transaction) from your_table_name where extract(MONTH FROM date_of_transaction) not in (select extract(MONTH FROM date_of_transaction) from your_table_name where type = 3);
You can try this;
select *
from tbl
where date_of_transaction < '2015-07-23'
and
date_format(date_of_transaction, '%M-%Y') in (
select
date_format(date_of_transaction, '%M-%Y')
from tbl
group by date_format(date_of_transaction, '%M-%Y')
having max(type) != 3
)
date_format(date_of_transaction, '%M-%Y') takes the month and year into consideration, and the HAVING clause filters out the months that have a type = 3 row.
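A variant that does not depend on 3 being the largest type value is to count the update rows per month and keep the months where that count is zero. The sketch below is one way to do that, run through Python's sqlite3 module as a stand-in for MySQL; strftime('%Y-%m', ...) plays the role of DATE_FORMAT, and the cutoff parameter first_of_current_month is a name introduced here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (date_of_transaction TEXT, type INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [
        ("2015-04-21", 1), ("2015-04-24", 2), ("2015-04-29", 1),
        ("2015-04-30", 2), ("2015-04-30", 3),
        ("2015-05-14", 1), ("2015-05-22", 1), ("2015-05-27", 2),
        ("2015-05-30", 2),
        ("2015-06-09", 1), ("2015-06-12", 2), ("2015-06-17", 2),
        ("2015-06-19", 2),
    ],
)

# SUM(type = 3) counts the update rows in each month; months with none are
# the ones still waiting for an update.
query = """
SELECT strftime('%Y-%m', date_of_transaction) AS month
FROM tbl
WHERE date_of_transaction < :first_of_current_month
GROUP BY month
HAVING SUM(type = 3) = 0
ORDER BY month
"""
pending = [
    row[0]
    for row in conn.execute(query, {"first_of_current_month": "2015-07-01"})
]
print(pending)  # ['2015-05', '2015-06']
```

Grouping on year and month together, rather than on the month number alone, keeps the same month of different years apart.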

Get stats for each day in a month without ignoring days with no data

I want to get stats for each day in a given month. However, if a day has no rows in the table, it doesn't show up in the results. How can I include days with no data, and show all days until the current date?
This is the query I have now:
SELECT DATE_FORMAT(FROM_UNIXTIME(timestamp), '%d'), COUNT(*)
FROM data
WHERE EXTRACT(MONTH FROM FROM_UNIXTIME(timestamp)) = 6
GROUP BY EXTRACT(DAY FROM FROM_UNIXTIME(timestamp))
So if I have
Row 1 | 01-06
Row 2 | 02-06
Row 3 | 03-06
Row 4 | 05-06
Row 5 | 05-06
(I changed the timestamp values to day/month dates just to explain.)
It should output
01 | 1
02 | 1
03 | 1
04 | 0
05 | 2
06 | 0
...Instead of ignoring day 4 and today (day 6).
You will need a calendar table to do something of the form
SELECT c.`date`, COUNT(d.`date`)
FROM Input_Calendar c
LEFT JOIN Data d ON c.`date` = d.`date`
GROUP BY c.`date`
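Here is a minimal sketch of the calendar join that reproduces the output asked for in the question. It assumes ISO date strings instead of Unix timestamps, picks June 2013 as the month, builds the calendar on the fly with a recursive CTE instead of a stored table, and runs on Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (date TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?)",
    [("2013-06-01",), ("2013-06-02",), ("2013-06-03",),
     ("2013-06-05",), ("2013-06-05",)],
)

# The recursive CTE generates one row per day in the requested range.
# COUNT(d.date) ignores the NULLs produced by the LEFT JOIN, so days
# without data report 0 instead of 1.
query = """
WITH RECURSIVE calendar(day) AS (
    SELECT :first_day
    UNION ALL
    SELECT date(day, '+1 day') FROM calendar WHERE day < :last_day
)
SELECT strftime('%d', c.day) AS day_of_month, COUNT(d.date) AS entries
FROM calendar c
LEFT JOIN data d ON d.date = c.day
GROUP BY c.day
ORDER BY c.day
"""
stats = conn.execute(
    query, {"first_day": "2013-06-01", "last_day": "2013-06-06"}
).fetchall()
print(stats)
```

The important detail is counting a column from the data side of the join: COUNT(*) would report 1 for an empty day, because the calendar row itself still exists.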
I keep a full copy of a calendar table in my database and used a WHILE loop to fill it but you can populate one on the fly for use based on the different solutions out there like http://crazycoders.net/2012/03/using-a-calendar-table-in-mysql/
In MySQL, you can use user-defined variables (they act like inline programming variables). You set them once and can manipulate them as needed.
select
dayofmonth( DynamicCalendar.CalendarDay ) as `Day`,
count( c.`date` ) as Entries
from
( select
@startDate := date_add( @startDate, interval 1 day ) CalendarDay
from
( select @startDate := '2013-05-31' ) sqlvars,
AnyTableThatHasAsManyDaysYouExpectToReport
limit
6 ) DynamicCalendar
LEFT JOIN Input_Calendar c
on DynamicCalendar.CalendarDay = date( from_unixtime( c.date ))
group by
DynamicCalendar.CalendarDay
In the above sample, the inner query can join against as the name implies "Any Table" in your database that has at least X number of records you are trying to generate for... in this case, you are dealing with only the current month of June and only need 6 records worth... But if you wanted to do an entire year, just make sure the "Any Table" has 365 records(or more).
The inner query will start by setting "@startDate" to the day BEFORE June 1st (May 31). Then, just by having the other table, every record is joined to this variable (creating a simulated for/next loop), with a limit of 6 records (the days you are generating the report for). So now, as the records are being queried, the start date keeps adding 1 day... the first record results in June 1st, the next record in June 2nd, etc.
So now you have a simulated calendar with 6 records dated from June 1 to June 6. Take that and join it to your "data" table; you are already qualifying your dates via the join and get only those dates of activity. I'm joining on the DATE() of the from unix time since you care about anything that happened on June 1, and June 1 @ 12:00:00AM is different from June 1 @ 8:45am, so by matching on the date-only portion, they should remain in the proper grouping.
You could expand this answer by changing the inner '2013-05-31' to some MySQL Date function to get the last day of the prior month, and the limit based on whatever day in the current month you are doing so these are not hard-coded.
Create a Time dimension. This is a standard OLAP reporting trick. You don't need a cube in order to do OLAP tricks, though. Simply find a script on the internet to generate a Calendar table and join to that table.
Also, I think your query is missing a WHERE clause.
Other useful tricks include creating a "Tally" table that is a list of numbers from 1 to N where N is usually the max of the bigint on your database management system.
No code provided here, as I am not a MySQL guru.
Pseudo-code is:
Select * from TimeDimension left join Data on Data.date = TimeDimension.date

How to select the field's increment from mysql

I have a table recording the cumulative total visit numbers of some web pages every day. I want to fetch the actual number of visits on a specific day for all these pages. The table looks like this:
- record_id page_id date addup_number
- 1 1 2012-9-20 2110
- 2 2 2012-9-20 1160
- ... ... ... ...
- n 1 2012-9-21 2543
- n+1 2 2012-9-21 1784
the result I'd like to fetch is like:
- page_id date increment_num(the real visit numbers on this date)
- 1 2012-9-21 X
- 2 2012-9-21 X
- ... ... ...
- N 2012-9-21 X
But I don't want to do this in PHP, because it's time consuming. Can I get what I want with SQL statements or with some MySQL functions?
Ok. You need to join the table on itself by joining on the date column and adding a day to one side of the join.
Assuming:
date column is a legitimate DATE Type and not a string
Every day is accounted for each page (no gaps)
addup_number is an INT of some type (BIGINT, INT, SMALLINT, etc...)
table_name is substituted for your actual table name which you don't indicate
Only one record per day for each page... i.e. no pages have multiple counts on the same day
You can do this:
SELECT t2.page_id, t2.date, t2.addup_number - t1.addup_number AS increment_num
FROM table_name t1
JOIN table_name t2 ON t1.date + INTERVAL 1 DAY = t2.date
WHERE t1.page_id = t2.page_id
One thing to note is if this is a huge table and date is an indexed column, you'll suffer on the join by having to transform it by adding a day in the ON clause, but you'll get your data.
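A quick way to sanity-check the self join is to run it on the two days shown in the question. This sketch uses Python's sqlite3 module as a stand-in for MySQL, so `t1.date + INTERVAL 1 DAY` becomes `date(t1.date, '+1 day')`; the table name visits is a placeholder and the dates are written in zero-padded ISO form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (page_id INTEGER, date TEXT, addup_number INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1, "2012-09-20", 2110), (2, "2012-09-20", 1160),
        (1, "2012-09-21", 2543), (2, "2012-09-21", 1784),
    ],
)

# t1 is the previous day's row and t2 the requested day's row for the same
# page; the difference of the two cumulative totals is that day's visits.
query = """
SELECT t2.page_id, t2.date,
       t2.addup_number - t1.addup_number AS increment_num
FROM visits t1
JOIN visits t2 ON t2.page_id = t1.page_id
              AND t2.date = date(t1.date, '+1 day')
WHERE t2.date = :day
ORDER BY t2.page_id
"""
increments = conn.execute(query, {"day": "2012-09-21"}).fetchall()
print(increments)  # [(1, '2012-09-21', 433), (2, '2012-09-21', 624)]
```

Because it is an inner join, a page with no row for the previous day simply drops out of the result, which matches the "no gaps" assumption listed above.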
UPDATED:
SELECT today.page_id, today.date, (today.addup_number - yesterday.addup_number) as increment
FROM myvisits_table today, myvisits_table yesterday
WHERE today.page_id = yesterday.page_id
AND today.date='2012-9-21'
AND yesterday.date='2012-9-20'
GROUP BY today.page_id, today.date, yesterday.page_id, yesterday.date
ORDER BY page_id
Something like this:
SELECT date, SUM(addup_number)
FROM your_table
GROUP BY date