I'm working on a project that scans for Wi-Fi signals (from cellphones etc.). It detects every MAC address within the Wi-Fi sensor's radius. This data is then sent from a server to a database, and a reporting tool uses it to show statistics. This can be used in stores to study customer behavior.
This is what the data looks like:
The thing is, I want to know the time between visits. The problem is that if a person stays in the store for 10 minutes, their address shows up in a lot of entries; what I want to calculate is the difference between visits, not between entries. So what I need to do is measure the time between the current day and the next day on which they came.
I would then like to display this in a table like the one below.
current time | next time they came | Time between visits
-------------------------------------------------------------------
*current time* | *other day* | *Time between visits*
I have no idea how I should do this; abstract thinking is always welcome.
P.S. If any info is missing, please comment. I'm new to the forums.
First of all you have to translate that time field into its readable date part
select date(from_unixtime(`time`)) from yourTable;
This value can be the joining criteria of a self join
select t1.address,
from_unixtime(t1.`time`),
min(from_unixtime(t2.`time`))
from yourTable t1
left join
yourTable t2
on t1.address = t2.address and
date(from_unixtime(t1.`time`)) < date(from_unixtime(t2.`time`))
group by t1.address, from_unixtime(t1.`time`)
This would get you, for each address and each visit time, the earliest visit time on a different day.
From there you could return the time difference
select tt.address,
tt.visit,
tt.next_visit,
timediff(coalesce(tt.next_visit, tt.visit), tt.visit) as `interval`
from (
select t1.address,
from_unixtime(t1.`time`) as visit,
min(from_unixtime(t2.`time`)) as next_visit
from yourTable t1
left join
yourTable t2
on t1.address = t2.address and
date(from_unixtime(t1.`time`)) < date(from_unixtime(t2.`time`))
group by t1.address, from_unixtime(t1.`time`)
) tt
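The coalesce is to avoid having a null in the timediff function, which would happen for each address's last visit. Note that timediff returns a TIME value, which MySQL limits to 838:59:59, so gaps longer than about 34 days are clipped; timestampdiff(second, tt.visit, tt.next_visit) avoids that limit.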
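To try the self-join idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The table name sightings, the MAC addresses and the timestamps are made up for illustration, and SQLite's date(..., 'unixepoch') stands in for MySQL's date(from_unixtime(...)). It also works in UTC, whereas from_unixtime uses the session time zone, so treat it as an illustration of the logic rather than the exact MySQL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per detection of a MAC address.
conn.execute("CREATE TABLE sightings (address TEXT, time INTEGER)")
conn.executemany(
    "INSERT INTO sightings VALUES (?, ?)",
    [
        ("aa:bb", 1700000000),  # 2023-11-14 22:13:20 UTC
        ("aa:bb", 1700000600),  # same day, ten minutes later
        ("aa:bb", 1700100000),  # 2023-11-16 02:00:00 UTC
        ("cc:dd", 1700000300),  # seen only once
    ],
)

# For every detection, find the earliest detection of the same address
# on a strictly later calendar day, and the gap in seconds.
gaps = conn.execute(
    """
    SELECT t1.address,
           t1.time                AS visit,
           MIN(t2.time)           AS next_visit,
           MIN(t2.time) - t1.time AS gap_seconds
    FROM sightings t1
    LEFT JOIN sightings t2
           ON t1.address = t2.address
          AND date(t1.time, 'unixepoch') < date(t2.time, 'unixepoch')
    GROUP BY t1.address, t1.time
    ORDER BY t1.address, t1.time
    """
).fetchall()

for row in gaps:
    print(row)
```

Detections with no later day come back with None for next_visit and gap_seconds, which is the case the coalesce handles in the MySQL version.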
Related
I am making a COVID log DB for easy contact tracing.
These are my tables:
log_tbl (fk_UserID, fk_EstID, log_date, log_time)
est_tbl (EstID, EstName)
user_tbl (User_ID, Name, Address, MobileNumber)
I wanted to write a statement that shows when and where an individual (User_ID)
enters an Establishment (EstID),
SELECT l.*
FROM log_tbl l
WHERE (l.fk_EstID, l.log_date) IN (SELECT l2.fk_EstID, l2.log_date
FROM log_tbl l2
WHERE l2.fk_UserID = 'LIN78JFF5WG'
);
[Image: result of query]
This currently works,
but it still has to be filtered to ±2 hours around the time when the User_ID was logged in log_tbl, so that the result is narrowed down when the first query spits out 1000 logs. Because the people in these results will be contacted, and to reduce costs, the result needs to be narrowed down to less than 50%.
So the table below should not include the first two rows and the last one, because they don't match (1) the date and (2) the time of the searched user LIN78JFF5WG.
[Image: unfiltered result]
FROM log_tbl
WHERE User_ID = 'LIN78JFF5WG'
AND (BETWEEN subtime(log_tbl.log_time, '02:00:00') AND addtime(log_tbl.log_time, '02:00:00'
I know this is wrong, but I don't have any idea how to join the two queries.
The result should include
EstID, Name, Address, MobileNumber, log_date, log_time, sorted by date.
Imagine it like this:
There are 3 baskets full of tomatoes,
2 of the baskets have rotten tomatoes inside.
Do you throw away the whole basket full of tomatoes?
No, you select the rotten tomato, and the others close to it, and throw them away.
I need that for the DB: instead of getting the result for the whole day,
I only need the people who were in close contact with the target user.
Is it possible to do this in MySQL? I have to use MySQL because of reasons.
Here is a fiddle with sample data:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=050b2103d3adf5828524f49066c12e74
MySQL supports window functions with the range window frame specification. I would suggest:
select l.*
from (select l.*,
sum(case when fk_UserID = 'LIN78JFF5WG' then 1 else 0 end) over
(partition by log_date
order by log_time
range between interval 2 hour preceding and interval 2 hour following
) as cnt_user
from log_tbl l
) l
where cnt_user > 0;
Here is a db<>fiddle.
You can then annotate the results with other columns from other tables to get your final result.
This should be much faster than alternative methods.
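Note, however, that this logic has a flaw: because the window is partitioned by log_date, it never looks across midnight, so logs between 0:00 and 2:00 are not compared with the previous day's logs, and logs between 22:00 and 0:00 are not compared with the next day's. Storing the date and time in a single column would make it easier to get a more accurate list.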
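As a rough illustration of the single-column suggestion, here is a sketch using Python's built-in sqlite3 module rather than MySQL. The combined logged_at column, the sample rows, and the extra condition that a contact must be at the same establishment are all assumptions made for this example; they are not part of the original schema or fiddle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed variant of log_tbl with date and time merged into logged_at.
conn.execute(
    "CREATE TABLE log_tbl (fk_UserID TEXT, fk_EstID TEXT, logged_at TEXT)"
)
conn.executemany(
    "INSERT INTO log_tbl VALUES (?, ?, ?)",
    [
        ("LIN78JFF5WG", "EST1", "2021-01-05 23:30:00"),  # target user
        ("USER_A", "EST1", "2021-01-06 00:45:00"),  # 75 min later, across midnight
        ("USER_B", "EST1", "2021-01-05 20:00:00"),  # 3.5 hours earlier
        ("USER_C", "EST2", "2021-01-05 23:40:00"),  # different establishment
    ],
)

# Self join: t is the target user's log rows, o is everybody else's.
# strftime('%s', ...) converts the text timestamp to epoch seconds.
contacts = conn.execute(
    """
    SELECT DISTINCT o.fk_UserID, o.fk_EstID, o.logged_at
    FROM log_tbl t
    JOIN log_tbl o
      ON o.fk_EstID = t.fk_EstID
     AND o.fk_UserID <> t.fk_UserID
     AND ABS(CAST(strftime('%s', o.logged_at) AS INTEGER)
           - CAST(strftime('%s', t.logged_at) AS INTEGER)) <= 2 * 60 * 60
    WHERE t.fk_UserID = ?
    ORDER BY o.logged_at
    """,
    ("LIN78JFF5WG",),
).fetchall()

print(contacts)
```

Because the comparison is on full timestamps, USER_A is found even though the log falls on the next calendar day. In MySQL the same comparison could probably be written with TIMESTAMPDIFF on a DATETIME column.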
I do not fully understand your requirements,
but I have written some sample SQL so that we can make it clear.
-- solution one: a single query.
-- the derived table t holds the target user's log timestamps; every log row is compared against each of them
-- (add DISTINCT if the target user has several logs close together)
select l.*, u.*, e.*, t.ts as target_time
from
log_tbl as l
cross join (select UNIX_TIMESTAMP(CONCAT(log_date, ' ', log_time)) as ts from log_tbl where fk_UserID = 'LIN78JFF5WG') as t
-- simple joins to get the intended information
left join user_tbl as u on (u.User_ID = l.fk_UserID)
left join est_tbl as e on (l.fk_EstID = e.EstID)
-- MySQL's DATEDIFF only returns whole days, so we convert to timestamps to compute the difference
where UNIX_TIMESTAMP(CONCAT(l.log_date, ' ', l.log_time)) - t.ts between -60*60*2 and 60*60*2;
-- solution two
-- I suggest you split it into two queries like this.
select UNIX_TIMESTAMP(CONCAT(log_date, ' ', log_time)) as ts from log_tbl where fk_UserID = 'LIN78JFF5WG';
-- we get the user's log timestamp and use it in the next query
select *
from
log_tbl as l
-- simple joins to get the intended information
left join user_tbl as u on (u.User_ID = l.fk_UserID)
left join est_tbl as e on (l.fk_EstID = e.EstID)
-- MySQL's DATEDIFF only returns whole days, so we convert to timestamps to compute the difference
where UNIX_TIMESTAMP(CONCAT(l.log_date, ' ', l.log_time)) - [target_time(passed by code)] between -60*60*2 and 60*60*2
I have three columns: User_ID, New_Status and DATETIME.
New_Status contains 0 (inactive) and 1 (active) for users.
Every user starts with active status, i.e. 1.
Subsequently, the table stores their status and the datetime at which they were activated/deactivated.
How can I calculate the number of active users at the end of each date, including dates when no records were written to the table?
Sample data:
| ID | New_Status | DATETIME |
+----+------------+---------------------+
| 1 | 1 | 2019-01-01 21:00:00 |
| 1 | 0 | 2019-02-05 17:00:00 |
| 1 | 1 | 2019-03-06 18:00:00 |
| 2 | 1 | 2019-01-02 01:00:00 |
| 2 | 0 | 2019-02-03 13:00:00 |
Format the date time value to a date only string and group by it
SELECT DATE_FORMAT(DATETIME, '%Y-%m-%d') as day, COUNT(*) as active
FROM test
WHERE New_Status = 1
GROUP BY day
ORDER BY day
In MySQL 8 you can use the row_number() window function to get the last status of each user per day. Then filter for the rows that indicate the user was active, GROUP BY the day and count them.
SELECT date(x.datetime),
count(*)
FROM (SELECT date(t.datetime) datetime,
t.new_status,
-- partition by user as well, so rn = 1 marks each user's last status of the day
row_number() OVER (PARTITION BY t.user_id, date(t.datetime)
ORDER BY t.datetime DESC) rn
FROM elbat t) x
WHERE x.rn = 1
AND x.new_status = 1
GROUP BY x.datetime;
If not all days are in the table you need to create a (possibly derived) table with all days and cross join it.
Find out the last activity status of users whose activity was changed for each day
select User_ID, New_Status, DATE_FORMAT(DATETIME, '%Y-%m-%d')
from activity_table
where not exists
(
select 1
from activity_table at
where at.User_ID = activity_table.User_ID and
DATE_FORMAT(at.DATETIME, '%Y-%m-%d') = DATE_FORMAT(activity_table.DATETIME, '%Y-%m-%d') and
at.DATETIME > activity_table.DATETIME
)
order by DATE_FORMAT(activity_table.DATETIME, '%Y-%m-%d');
This is not the solution yet, but it is very useful information on the way to one. Note that not all dates are covered yet, and that the values are individual records (more precisely, each user's last value on each day), ordered by date.
Let's get aggregate numbers
Using the query above as a subselect and aliasing it as a table, you can group by the date and select sum(New_Status) as activity, count(*) as total, and the date, so you will know that activity - (total - activity) is the difference in comparison to the previous day.
Knowing the delta for each day present in the result
In the previous section we saw how the delta can be calculated. If the whole query from the previous section is aliased, you can self join it using a left join on pairs of (previous date, current date). The gaps between dates are still there, but we don't worry about them just yet. For the first date, its activity is the delta. For subsequent records, adding the running total of the previous days' deltas to their own delta yields the result you need. To achieve this you can use a recursive query, supported by MySQL 8, or, alternatively, a subquery that sums the deltas of the previous days (with special attention to the first date, as described earlier); adding the current date's delta to that sum yields the result we need.
Fill the gaps
The previous section would already work perfectly (assuming there are no integrity problems) if there were activity changes on every day, but we will not rely on that assumption. Here we know that the figures are correct for each date where a figure is present, and we just need to add the missing dates to the result. If the results are properly ordered, as they should be, one can use a cursor and loop over the results. At each record after the first one, we can determine the dates that are missing. There might be zero or more such dates between two consecutive records. What we do know about the gaps is that their values are exactly the same as those of the previous record that does have data: if there were no activity changes on a given date, the number of active users is exactly the same as on the previous day. Using some structure, such as a table, you can generate the results with the knowledge described here.
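The gap-filling loop can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses the sample rows from the question, and instead of summing deltas it carries each user's latest status forward day by day, which gives the same end-of-day counts. The function name active_users_per_day is made up for this example.

```python
from datetime import date, datetime, timedelta

# (user id, new status, timestamp) rows taken from the question's sample data.
events = [
    (1, 1, "2019-01-01 21:00:00"),
    (1, 0, "2019-02-05 17:00:00"),
    (1, 1, "2019-03-06 18:00:00"),
    (2, 1, "2019-01-02 01:00:00"),
    (2, 0, "2019-02-03 13:00:00"),
]


def active_users_per_day(events):
    """Return {date: active users at end of day}, including days without rows."""
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), uid, status)
        for uid, status, ts in events
    )
    if not parsed:
        return {}
    day = parsed[0][0].date()
    last_day = parsed[-1][0].date()
    status_by_user = {}  # latest known status of each user
    counts = {}
    i = 0
    while day <= last_day:
        # Apply every status change that happened on this day, in time order.
        while i < len(parsed) and parsed[i][0].date() == day:
            _, uid, status = parsed[i]
            status_by_user[uid] = status
            i += 1
        # Days without changes simply repeat the previous day's figure.
        counts[day] = sum(status_by_user.values())
        day += timedelta(days=1)
    return counts


counts = active_users_per_day(events)
print(counts[date(2019, 2, 3)], counts[date(2019, 2, 4)])
```

A user who is deactivated and reactivated on the same day is handled naturally here, because only the last status of the day is kept.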
Solving possible integrity problems
There are several possibilities for such problems:
First, a data item might have existed before records started being written to this table.
Second, bugs or other causes might have paused the creation of records in this activity table.
Third, the addition of a user does not (or did not) necessarily generate an activity change, since its popping into existence leaves its previous state of activity undefined and subject to human conventions, which might change over time.
Fourth, the removal of a user does not (or did not) necessarily generate an activity change, since its popping out of existence leaves its current state of activity undefined and subject to human conventions, which might change over time.
Fifth, there is an endless list of other issues which might cause data integrity problems.
To cope with these you will need to comprehensively analyze whatever you can from the source code and the history of the project, including database records, logs and information available from people, to detect such anomalies and the period in which they were in effect, and to figure out how to correct them if they exist.
EDIT
In the meantime I was thinking about the possibility of a user who was active at the start of the day being deactivated and then activated again by the end of the day. Similarly, a user who was inactive at the start of a day might be activated and then deactivated again by the end of the day. For users that have more than one status change on a given day, we need to compare their activity status at the start and at the end of the day to find out what the difference was.
SELECT
DATE(DATETIME),
COUNT(*)
FROM your_table
WHERE New_Status = 1
GROUP BY User_ID,
DATE(DATETIME)
For MySQL 8+ (recursive CTEs and window functions are required):
WITH RECURSIVE
cte AS (
SELECT MIN(DATE(DT)) dt
FROM src
UNION ALL
SELECT dt + INTERVAL 1 DAY
FROM cte
WHERE dt < ( SELECT MAX(DATE(DT)) dt
FROM src )
),
cte2 AS
(
SELECT users.id,
cte.dt,
SUM( CASE src.New_Status WHEN 1 THEN 1
WHEN 0 THEN -1
ELSE 0
END ) OVER ( PARTITION BY users.id
ORDER BY cte.dt ) status
FROM cte
CROSS JOIN ( SELECT DISTINCT id
FROM src ) users
LEFT JOIN src ON src.id = users.id
AND DATE(src.dt) = cte.dt
)
SELECT dt, SUM(status)
FROM cte2
GROUP BY dt;
fiddle
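Do not forget to adjust the max recursion depth (the cte_max_recursion_depth system variable, 1000 by default) if the date range spans more days than that.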
Here is what I believe is a good solution for this problem of yours:
SELECT SUM(New_Status) "Number of active users"
, DATE_FORMAT(DATEC, '%Y-%m-%d') "Date"
FROM TEST T1
WHERE DATE_FORMAT(DATEC,'%H:%i:%s') =
(SELECT MAX(DATE_FORMAT(T2.DATEC,'%H:%i:%s'))
FROM TEST T2
WHERE T2.ID = T1.ID
AND DATE_FORMAT(T1.DATEC, '%Y-%m-%d') = DATE_FORMAT(T2.DATEC, '%Y-%m-%d')
GROUP BY ID
, DATE_FORMAT(DATEC, '%Y-%m-%d'))
GROUP BY DATE_FORMAT(DATEC, '%Y-%m-%d');
Here is the DEMO
This site has answered many SQL questions for me. I finally signed up to ask one of my own and get active here.
Anyway, I'm working with a table that has Date_Effective and Date_Lapse. A client can have multiple rows, so what I'm trying to get is the number of days between a Date_Lapse and the next Date_Effective for the same client. The date values in this table are ints that I'll convert to dates later.
The below code doesn't work. It doesn't like the second value I'm joining on. Why can't I get it to give me min date_effective that's greater than each date_effective? If I run the below I just get no results because it's seeing it as there are no effective dates greater than the min effective date.
SELECT ClientID, c1.Date_Lapse, c2.Date_Effective
FROM Fact_Episodes c1
LEFT JOIN (
SELECT ClientID, min(Date_Effective) as Date_Effective
FROM Fact_Episodes
GROUP BY ClientID
) c2
ON c1.ClientID = c2.ClientID
AND c2.Date_Effective > c1.Date_Effective
If you wanted to stick with the left join, this would work.
select c1.ClientID, c1.Date_Lapse, min(c2.Date_Effective) as Date_Effective
from Fact_Episodes c1
left join Fact_Episodes c2
on c1.ClientID = c2.ClientID
and c2.Date_Effective > c1.Date_Effective
group by c1.ClientId, c1.Date_Lapse
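Here is a small sketch of that left join running against made-up data, using Python's built-in sqlite3 module. The sample rows and the yyyymmdd layout of the integer dates are assumptions (the question does not say how the ints are encoded), and the conversion to a day count is done in Python only to keep the SQL identical to the query above.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Fact_Episodes "
    "(ClientID INTEGER, Date_Effective INTEGER, Date_Lapse INTEGER)"
)
conn.executemany(
    "INSERT INTO Fact_Episodes VALUES (?, ?, ?)",
    [
        (1, 20200101, 20200131),
        (1, 20200215, 20200301),
        (1, 20200401, 20200430),
        (2, 20200105, 20200110),  # client with a single episode
    ],
)

# Same shape as the answer's query: earliest later Date_Effective per episode.
rows = conn.execute(
    """
    SELECT c1.ClientID, c1.Date_Lapse, MIN(c2.Date_Effective) AS Next_Effective
    FROM Fact_Episodes c1
    LEFT JOIN Fact_Episodes c2
           ON c1.ClientID = c2.ClientID
          AND c2.Date_Effective > c1.Date_Effective
    GROUP BY c1.ClientID, c1.Date_Lapse
    ORDER BY c1.ClientID, c1.Date_Lapse
    """
).fetchall()


def to_date(yyyymmdd):
    """Convert an assumed yyyymmdd integer into a date."""
    return datetime.strptime(str(yyyymmdd), "%Y%m%d").date()


# Days between each lapse and the next effective date (None for the last episode).
gap_days = [
    (client, (to_date(nxt) - to_date(lapse)).days if nxt is not None else None)
    for client, lapse, nxt in rows
]
print(gap_days)
```

The join has to happen before the MIN, which is why aggregating inside the subquery (as in the question) returned nothing useful: the per-client minimum can never be greater than any of that client's own effective dates.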
I have a table that stores the statuses an application goes through. Some applications go through the same status multiple times. Each time an application goes through a status, the time of the status change is recorded.
How can I pull a list of applications based on the first time an application goes through a particular status within a specified date range? Below is what I have tried thus far:
SELECT d1.STATUS,
d1.APPL_ID
FROM APP_STATUS d1
LEFT JOIN APP_STATUS d2 ON d1.APPL_ID = d2.APPL_ID AND d1.STATUS = 'AT_CUSTOMER' AND d2.STATUS = 'AT_CUSTOMER'
WHERE DATE(d1.STATUS_CREATE_DT) >= '2014-10-26'
AND DATE(d1.STATUS_CREATE_DT) <= '2014-11-25'
AND d2.STATUS IS NULL
GROUP BY d1.APPL_ID;
To get the first time a status goes through, try this query:
select a.appl_id, min(a.status_create_dt) as first_dt
from APP_STATUS a
where a.STATUS = 'AT_CUSTOMER' and
a.STATUS_CREATE_DT >= '2014-10-26' and
a.STATUS_CREATE_DT < date('2014-11-25') + interval 1 day
group by a.appl_id;
I think this does what you need. If you want more columns, then you can join this back to APP_STATUS.
Note that I changed the date logic a bit. The date functions are only on the constant side of the dates. This allows the query to take advantage of an index on STATUS_CREATE_DT, if appropriate.
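The question can be read two ways: "the first occurrence inside the range" or "applications whose very first occurrence falls inside the range". The sketch below, using Python's built-in sqlite3 module with made-up rows, shows both readings side by side with a half-open date range (the upper bound is the day after the last wanted day). The lowercase table name, the sample data and the variable names are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app_status (appl_id INTEGER, status TEXT, status_create_dt TEXT)"
)
conn.executemany(
    "INSERT INTO app_status VALUES (?, ?, ?)",
    [
        (1, "AT_CUSTOMER", "2014-10-01 09:00:00"),  # before the range
        (1, "AT_CUSTOMER", "2014-11-01 09:00:00"),  # second pass, inside the range
        (2, "AT_CUSTOMER", "2014-11-25 15:00:00"),  # afternoon of the last day
        (3, "AT_CUSTOMER", "2014-12-01 09:00:00"),  # after the range
    ],
)

lower, upper = "2014-10-26", "2014-11-26"  # half-open: lower <= dt < upper

# Reading 1: first occurrence of the status inside the range.
first_in_range = conn.execute(
    """
    SELECT appl_id, MIN(status_create_dt)
    FROM app_status
    WHERE status = 'AT_CUSTOMER'
      AND status_create_dt >= ? AND status_create_dt < ?
    GROUP BY appl_id
    ORDER BY appl_id
    """,
    (lower, upper),
).fetchall()

# Reading 2: applications whose first-ever occurrence is inside the range.
first_ever_in_range = conn.execute(
    """
    SELECT appl_id, MIN(status_create_dt)
    FROM app_status
    WHERE status = 'AT_CUSTOMER'
    GROUP BY appl_id
    HAVING MIN(status_create_dt) >= ? AND MIN(status_create_dt) < ?
    ORDER BY appl_id
    """,
    (lower, upper),
).fetchall()

print(first_in_range)
print(first_ever_in_range)
```

Application 1 appears only under the first reading, because it had already been AT_CUSTOMER before the range started. Application 2 is kept in both cases because the exclusive upper bound covers the whole of 2014-11-25.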
I'm trying to figure out a way to perform a query which will obtain all data greater than six months old, without any data that is newer. I will see if I can appropriately summarize:
select u.USER_FirstName, u.USER_LastName,
u.USER_LastSession, c.Login_Name
FROM USER u
JOIN Customer c
ON u.USER_Customer_Identity=c.Customer_Identity
Where u.USER_LastSession < getdate()-180
Order by USER_LastSession
This is what I've found on SO so far, but the issue lies in that the USER.USER_LastSession records values for each log in (so some Customer.Login_Name values are unnecessary to return). I only want the ones which are greater than six months, with no result returned if they are also recorded at time less than six months. Example data:
USER_LastSession Login_Name
2012-08-29 21:33:30.000 TEST/TEST
2012-12-25 13:12:23.346 EXAMPLE/EXAMPLE
2013-10-30 17:13:45.000 TEST/TEST
I would not want to return TEST/TEST, since there is data in the past six months. I would, however, like to return EXAMPLE/EXAMPLE, since it only has data that is older than six months. I imagine there is probably something that I have overlooked - please forgive me if there is already an answer up for this (I was only able to find a "get older than six months" reply). Any and all help is greatly appreciated!
SELECT ...
FROM User u
JOIN Customer c ON u.USER_Customer_Identity=c.Customer_Identity
WHERE u.USER_Customer_Identity NOT IN
(SELECT USER_Customer_Identity
FROM User
WHERE USER_LastSession >= getdate() - 180)
ORDER BY USER_LastSession
with cte as (
select Login_Name, max(USER_LastSession) LastSession
FROM USER u
JOIN Customer c
ON u.USER_Customer_Identity = c.Customer_Identity
group by Login_Name
)
select *
from cte
where LastSession < getdate()-180
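The "latest session per login, then filter" idea from the CTE can be checked against the example data with a short sketch using Python's built-in sqlite3 module. The flattened sessions table, the fixed "now" of 2013-11-15 and the variable names are assumptions for this illustration; SQL Server's getdate() is replaced by a cutoff computed in Python.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
# Assumed flattened form of the USER/Customer join from the question.
conn.execute("CREATE TABLE sessions (login_name TEXT, last_session TEXT)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [
        ("TEST/TEST", "2012-08-29 21:33:30.000"),
        ("EXAMPLE/EXAMPLE", "2012-12-25 13:12:23.346"),
        ("TEST/TEST", "2013-10-30 17:13:45.000"),
    ],
)

# Pretend "now" is 2013-11-15 so the example is reproducible.
now = datetime(2013, 11, 15)
cutoff = (now - timedelta(days=180)).strftime("%Y-%m-%d %H:%M:%S")

# Keep a login only if its most recent session is older than the cutoff.
# ISO-formatted timestamps compare correctly as plain strings.
stale = conn.execute(
    """
    SELECT login_name, MAX(last_session) AS last_session
    FROM sessions
    GROUP BY login_name
    HAVING MAX(last_session) < ?
    ORDER BY login_name
    """,
    (cutoff,),
).fetchall()

print(stale)
```

TEST/TEST is dropped because its most recent session is newer than the cutoff, even though it also has an old row.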