I'd like to insert visits like this:
id - user - visit - data
1 - 1 - 2 - date
2 - 1 - 3 - date
3 - 1 - 2 - date more than 5 minutes after the first (id 1); only insert if more than 5 minutes have passed since the last similar record.
user 1 visited user 2 and 3.
The problem is that I don't want to insert a repeated visit within 5 minutes of the previous one. Once 5 minutes have passed, the same visit may be inserted again.
I tried:
INSERT INTO visits (user, visit, data)
SELECT '1', '2', NOW() WHERE NOT EXISTS (SELECT 1 FROM visits WHERE user = '1' AND visit = '2' AND data >= DATE_SUB(NOW(), INTERVAL 5 MINUTE))
but it is not working. Any ideas?
You can express the logic in the insert:
INSERT INTO visits (user, visit, data)
    SELECT u.user, u.visit, u.data
    FROM (SELECT 1 AS user, 2 AS visit, NOW() AS data) u
    WHERE NOT EXISTS (SELECT 1
                      FROM visits v
                      WHERE v.user = u.user AND v.visit = u.visit AND
                            -- test the stored row's time (v.data), not the new row's
                            v.data >= DATE_SUB(NOW(), INTERVAL 5 MINUTE)
                     );
This solves the problem for this particular INSERT, and that might be good enough. However, you are then relying on the application to ensure data integrity. A trigger would ensure that no insert or update violates your rule.
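The trigger idea can be sketched roughly as follows. This is only an illustration under stated assumptions: it is written in Python against SQLite (standing in for MySQL so that it runs on its own), the trigger name visits_cooldown and the sample timestamps are made up, and the trigger syntax and date functions are SQLite's. A MySQL trigger would look different; for instance, it would typically reject the row with SIGNAL rather than skip it silently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (
    id    INTEGER PRIMARY KEY,
    user  INTEGER NOT NULL,
    visit INTEGER NOT NULL,
    data  TEXT    NOT NULL          -- 'YYYY-MM-DD HH:MM:SS'
);

-- Hypothetical trigger: skip the new row when the same (user, visit)
-- pair already has a row stored within the previous 5 minutes.
CREATE TRIGGER visits_cooldown
BEFORE INSERT ON visits
BEGIN
    SELECT RAISE(IGNORE)
    WHERE EXISTS (
        SELECT 1
        FROM visits v
        WHERE v.user  = NEW.user
          AND v.visit = NEW.visit
          AND v.data >= datetime(NEW.data, '-5 minutes')
    );
END;
""")

attempts = [
    (1, 2, "2024-01-01 10:00:00"),  # first visit of user 1 to user 2
    (1, 3, "2024-01-01 10:01:00"),  # different visited user
    (1, 2, "2024-01-01 10:03:00"),  # repeat after 3 minutes
    (1, 2, "2024-01-01 10:06:00"),  # repeat after 6 minutes
]
conn.executemany(
    "INSERT INTO visits (user, visit, data) VALUES (?, ?, ?)", attempts
)

stored = conn.execute(
    "SELECT user, visit, data FROM visits ORDER BY id"
).fetchall()
for row in stored:
    print(row)
```

Because the rule lives in the database, every INSERT statement is subject to it, whichever part of the application issues it.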
Related
I need to extract data from a MySQL table, but am not allowed to include a record if there's a previous record less than a year old.
Given the following records, only the records 1, 3 and 5 should be included (because record 2 was created 1 month after record 1, and record 4 was created 1 month after record 3):
1 2019-12-21
2 2020-01-21
3 2021-12-21
4 2022-01-21
5 2023-12-21
I came up with the following non-functional solution:
SELECT
    *
FROM
    table t
WHERE
    created_at > DATE_ADD(
        (SELECT
             created_at
         FROM
             table t2
         WHERE
             t2.created_at < t.created_at
         ORDER BY
             t2.created_at DESC
         LIMIT 1),
        INTERVAL 1 YEAR)
But this returns only the first and the last record, not the third:
1 2019-12-21
5 2023-12-21
I know why: the third record gets excluded because record 2 is less than a year old. But record 2 shouldn't be taken into account, because it won't make the list itself.
How can I solve this?
Using LAG, assuming your MySQL supports window functions (8.0 or later), you can calculate the difference in months with PERIOD_DIFF:
with d as (
    select *,
           period_diff(extract(year_month from date),
                       extract(year_month from lag(date, 1, date) over (order by date))
           ) as m
    from t
)
select id, date
from d
where m = 0 or m > 12
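Note that LAG compares each row with the row immediately before it, whether or not that earlier row is itself kept. On the sample data this yields records 1, 3 and 5, but the rule in the question ("a skipped record should not be taken into account") depends on the last kept record. A minimal sketch of that rule in plain Python, with hypothetical helper names one_year_after and keep_spaced, assuming "a year" means the same calendar date one year later:

```python
from datetime import date


def one_year_after(d):
    """Same calendar day one year later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)


def keep_spaced(records):
    """Keep a record only if it is more than a year after the last KEPT one."""
    kept = []
    threshold = None
    for rec_id, created in sorted(records, key=lambda r: r[1]):
        if threshold is None or created > threshold:
            kept.append(rec_id)
            threshold = one_year_after(created)
    return kept


sample = [
    (1, date(2019, 12, 21)),
    (2, date(2020, 1, 21)),
    (3, date(2021, 12, 21)),
    (4, date(2022, 1, 21)),
    (5, date(2023, 12, 21)),
]
print(keep_spaced(sample))

# A case where comparing with the immediately preceding row differs:
# record 12 is 8 months after record 11, but 14 months after record 10,
# which is the last record that was kept.
tricky = [
    (10, date(2020, 1, 1)),
    (11, date(2020, 7, 1)),
    (12, date(2021, 3, 1)),
]
print(keep_spaced(tricky))
```

Doing this in application code is only one option; the same "compare with the last kept row" logic could presumably also be expressed in SQL with a recursive query.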
What I am trying to accomplish in my query is:
Users can store their mobile number in my database. If the same mobile number is submitted again, the query should check whether 5 minutes have passed since it was last stored: if they have, it stores another record for the same number; if not, it does not store the record. The columns in my table are: name, contact and date (timestamp).
What I have tried so far:
INSERT INTO users (name, contact) SELECT * FROM (SELECT 'test', '123') AS 1 WHERE NOT EXISTS ( SELECT time FROM users WHERE time <= now() + INTERVAL 5 MINUTE ) LIMIT 1
This always fails to save the record, because the time check is not tied to any specific row, so some row always satisfies it. So my question is: how can I match the 5-minute interval to a specific mobile number? That is, I want to check whether the mobile number '123' was already stored more than 5 minutes ago; if it was, '123' can be used again in the mobile number field.
Consider the following. I used 1 minute because it's quicker to test:
drop table if exists my_table;
create table my_table
( id serial primary key
, number varchar(12) not null
, dt datetime not null
);
insert into my_table
select null
, '1111'
, now()
from (select 1)x
left
join my_table y
on y.number = '1111'
and y.dt >= now() - interval 1 minute
where y.id is null
limit 1;
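The same LEFT JOIN anti-join can be tried out without a MySQL server. The sketch below is an assumption-laden illustration: it uses Python's sqlite3 module (so datetime(x.dt, '-5 minutes') replaces MySQL's now() - interval), passes the current time in as a parameter to make the behaviour repeatable, and the function name insert_if_cooled_down is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    " id INTEGER PRIMARY KEY,"
    " number TEXT NOT NULL,"
    " dt TEXT NOT NULL)"           # 'YYYY-MM-DD HH:MM:SS'
)


def insert_if_cooled_down(number, now, minutes=5):
    """Insert (number, now) unless the same number has a row in the last `minutes`.

    Returns True when a row was inserted.
    """
    cur = conn.execute(
        """INSERT INTO my_table (number, dt)
           SELECT x.number, x.dt
           FROM (SELECT ? AS number, ? AS dt) AS x
           LEFT JOIN my_table AS y
                  ON y.number = x.number
                 AND y.dt >= datetime(x.dt, ?)
           WHERE y.id IS NULL      -- no recent row for this number
           LIMIT 1""",
        (number, now, f"-{minutes} minutes"),
    )
    return cur.rowcount == 1


results = [
    insert_if_cooled_down("123", "2024-01-01 10:00:00"),
    insert_if_cooled_down("123", "2024-01-01 10:03:00"),
    insert_if_cooled_down("456", "2024-01-01 10:03:00"),
    insert_if_cooled_down("123", "2024-01-01 10:06:00"),
]
print(results)
```

The join condition carries both the number and the time window, which is what ties the 5-minute check to one specific mobile number.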
I want to write a window function in MySQL that gives a rolling 30-day count of unique ids. To be more precise, my database has many entries per day, each with a timestamp, for many different ids. For each day, I want to count how many different ids connected, and also get the total number of ids that have been online in the last 30 days.
Consider the following table:
CREATE TABLE `my_database` (
`timestamp` BIGINT(20) UNSIGNED NOT NULL,
`id` VARCHAR(32) NOT NULL);
INSERT INTO my_database (timestamp,id) VALUES (CURDATE(),1);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 1 DAY),2);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 2 DAY),1);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 2 DAY),3);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 29 DAY),4);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 300 DAY),2);
INSERT INTO my_database (timestamp,id) VALUES (DATE_SUB(CURDATE(), INTERVAL 1000 DAY),5);
Which looks like:
timestamp id
20190730 1
20190729 2
20190728 1
20190728 3
20190701 4
20181003 2
20161102 5
The result I want to get is the following:
date count_day count_30day
2019-07-30 1 4
2019-07-29 1 4
2019-07-28 2 3
2019-07-01 1 1
2018-10-03 1 1
2016-11-02 1 1
I don't know how to get the count_30day column. So far I have written the following:
SELECT DATE(a.`timestamp`) AS 'date',
COUNT(DISTINCT a.id) AS 'count_day',
COUNT(DISTINCT a.id) OVER (ORDER BY DATE(a.`timestamp`) ROWS BETWEEN 30 PRECEDING AND CURRENT ROW) AS 'count_30day'
FROM my_database AS a
GROUP
BY DATE(a.`timestamp`)
ORDER
BY DATE(a.`timestamp`) DESC
However that does not work for the count_30day column. I have been looking at other questions and the documentation and the syntax for the window functions seems to be correct as far as I have seen, but clearly is not as this does not work. How should I write the window function properly? Is there a better way to do this other than COUNT(DISTINCT)? Thanks!!
ROWS ... PRECEDING counts rows, not days, so it cannot express a 30-day window here.
You need a subquery:
SELECT DATE(a.`timestamp`) AS 'date',
       COUNT(DISTINCT a.id) AS 'count_day',
       MAX( (SELECT COUNT(DISTINCT ID)
             FROM my_database db2
             WHERE db2.timestamp BETWEEN DATE_SUB(a.timestamp, INTERVAL 30 DAY)
                                     AND a.timestamp
            )
       ) AS count30
FROM my_database AS a
GROUP BY DATE(a.`timestamp`)
ORDER BY DATE(a.`timestamp`) DESC
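To check the subquery approach against the expected result from the question, here is a small sketch using Python's sqlite3 module in place of MySQL. The assumptions: dates are stored as 'YYYY-MM-DD' text in a column renamed ts, SQLite's date(day, '-30 days') plays the role of DATE_SUB, and the query is restructured around a list of distinct days rather than GROUP BY with MAX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_database (ts TEXT NOT NULL, id TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO my_database (ts, id) VALUES (?, ?)",
    [
        ("2019-07-30", "1"),
        ("2019-07-29", "2"),
        ("2019-07-28", "1"),
        ("2019-07-28", "3"),
        ("2019-07-01", "4"),
        ("2018-10-03", "2"),
        ("2016-11-02", "5"),
    ],
)

rows = conn.execute(
    """
    SELECT d.day,
           -- distinct ids seen on that day
           (SELECT COUNT(DISTINCT a.id)
              FROM my_database a
             WHERE date(a.ts) = d.day) AS count_day,
           -- distinct ids seen from 30 days earlier up to that day
           (SELECT COUNT(DISTINCT b.id)
              FROM my_database b
             WHERE date(b.ts) BETWEEN date(d.day, '-30 days') AND d.day) AS count_30day
    FROM (SELECT DISTINCT date(ts) AS day FROM my_database) AS d
    ORDER BY d.day DESC
    """
).fetchall()

for row in rows:
    print(row)
```

As in the answer above, BETWEEN is inclusive at both ends, so the window covers the day itself plus the 30 days before it.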
I'm tracking a user's visits to a website by recording them in a database. A visit has a cooldown period of 6 hours. For this reason, I want to add a row to the table visits only if the user last visited the current website over 6 hours ago. If the last visit was less than 6 hours ago, do nothing.
I've looked around for answers to this and found plenty of quite similar issues, but none of those worked for me.
This is the last query I tried:
INSERT INTO visits (user_id, web_id)
SELECT (66, 2) FROM websites WHERE NOT EXISTS (
SELECT 1 FROM visits WHERE web_id = 2 and user_id = 66 and added_on >= NOW() - INTERVAL 6 HOUR
)
I'm getting a syntax error near WHERE NOT EXISTS.
You might want to enforce this rule with a trigger rather than in the application. But I think the problem is the parentheses in the SELECT clause:
INSERT INTO visits (user_id, web_id)
    SELECT 66, 2
    FROM websites
    WHERE NOT EXISTS (SELECT 1
                      FROM visits
                      WHERE web_id = 2 AND user_id = 66 AND
                            added_on >= NOW() - INTERVAL 6 HOUR
                     );
Hmmm . . . Your query is strange. What is websites? Your query is going to insert one row for every row in that table. It seems unlikely that you want this behavior. Perhaps you just want this:
INSERT INTO visits (user_id, web_id)
    SELECT w.user_id, w.web_id
    FROM (SELECT 66 AS user_id, 2 AS web_id) w
    WHERE NOT EXISTS (SELECT 1
                      FROM visits v
                      WHERE v.web_id = w.web_id AND
                            v.user_id = w.user_id AND
                            v.added_on >= NOW() - INTERVAL 6 HOUR
                     );
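Both points above (selecting constants FROM websites yields one row per website, and a one-row derived table avoids that) can be seen in a small sketch. It is an illustration only, run through Python's sqlite3 module rather than MySQL; the three sample websites, the explicit added_on parameter and the function name record_visit are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE websites (web_id INTEGER PRIMARY KEY);
INSERT INTO websites (web_id) VALUES (1), (2), (3);
CREATE TABLE visits (user_id INTEGER, web_id INTEGER, added_on TEXT);
""")

# Selecting constants FROM websites returns one row per row in websites.
rows_from_websites = conn.execute("SELECT 66, 2 FROM websites").fetchall()
print(len(rows_from_websites))


def record_visit(user_id, web_id, now):
    """Insert a visit unless the same user/website pair has one in the last 6 hours.

    Returns True when a row was inserted.
    """
    cur = conn.execute(
        """INSERT INTO visits (user_id, web_id, added_on)
           SELECT w.user_id, w.web_id, w.added_on
           FROM (SELECT ? AS user_id, ? AS web_id, ? AS added_on) AS w
           WHERE NOT EXISTS (
               SELECT 1
               FROM visits v
               WHERE v.web_id = w.web_id
                 AND v.user_id = w.user_id
                 AND v.added_on >= datetime(w.added_on, '-6 hours')
           )""",
        (user_id, web_id, now),
    )
    return cur.rowcount == 1


outcomes = [
    record_visit(66, 2, "2024-01-01 08:00:00"),
    record_visit(66, 2, "2024-01-01 12:00:00"),  # 4 hours later
    record_visit(66, 2, "2024-01-01 14:30:00"),  # 6.5 hours after the first
]
print(outcomes)
```

The row count reported for the INSERT tells the application whether the visit was actually recorded.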
You might try this:
INSERT visits (user_id, web_id)
SELECT distinct user_id, web_id
FROM websites t join websites b
on b.user_id = t.user_id
and b.web_id = t.web_id
and b.added_on = (Select Max(added_on) From websites
Where id = t.user_id
and web_id = t.web_id
and added_on < t.added_on)
WHERE t.added_on >= NOW() - INTERVAL 6 HOUR
and user_id = 66
and web_id = 2
Try the SELECT part by itself first, to see if it returns the correct result.
By the way, if you only have the user ID and the web ID in the visits table, without the datetime of the visit, how do you interpret the data when there are multiple rows for the same user/website combination, resulting from visits more than six hours apart?
We have a statistical database for our Facebook application. One of our outputs is Unique Facebook Users based on time range. If our customers select daily usage, we show them a graph of unique Facebook users per hour.
My problem is with the unique values. First, here are the relevant columns from the table:
timestamp---facebookID---actionID---producerID
My current query is:
SELECT HOUR(timestamp) as Hour, COUNT(DISTINCT facebookID) as Events
FROM `e4s_analytic_data`
WHERE actionID = 'ax' AND producerID = '2' AND timestamp BETWEEN '12-06-11 0:00:00' AND '12-06-11 23:59:59'
GROUP BY HOUR(timestamp)
This gives unique visitors (based on facebookID) per hour. But if id = 123 visited at 14:00 and then again at 17:00, he will be counted twice: first at 14 and then at 17.
To solve this, I've tried to add an inner query that returns all the ids that are already in the table from earlier hours.
My idea was to fetch all facebookIDs already listed in the table from hour 0 (the start of the day) up to the current hour (taken from each row of the outer SELECT) and exclude them from the outer SELECT, so that every COUNT only includes new Facebook IDs. Here is what I've tried:
SELECT HOUR(timestamp) as Hour, COUNT(DISTINCT facebookID) as Events
FROM `e4s_analytic_data`
WHERE actionID = 'ax' AND producerID = '2' AND timestamp between '12-06-11 0:00:00' and '12-06-11 23:59:59'
AND facebookID NOT IN
( SELECT facebookID FROM `e4s_analytic_data`
WHERE actionID = 'ax' AND producerID = '2' AND
HOUR(timestamp) >= 0 AND HOUR(timestamp) < Hour
)
GROUP BY HOUR(timestamp)
But it gives me this error:
Unknown column 'Hour' in 'where clause'
How can I solve this?
Thanks.
EDIT: Sample data:
timestamp--------------facebookID--------producerID-------actionID
2012-06-13 12:38:55 ******6513406 2 ax
2012-06-13 08:49:55 ******6513406 2 ax
The query returns 1 unique visitor at 8 and 1 unique visitor at 12. I want to return only 1 unique at 8, because at 12 it is the same visitor from 8.
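One way to get the behaviour described here, offered only as a sketch, is to find each visitor's first hour of the day in an inner query and then count visitors per first hour. The example below runs through Python's sqlite3 module rather than MySQL, so strftime('%H', ...) stands in for HOUR(); the IDs fb_A and fb_B and the second visitor's row are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE e4s_analytic_data ("
    " timestamp TEXT, facebookID TEXT, producerID TEXT, actionID TEXT)"
)
conn.executemany(
    "INSERT INTO e4s_analytic_data VALUES (?, ?, ?, ?)",
    [
        ("2012-06-13 12:38:55", "fb_A", "2", "ax"),  # same visitor again at 12
        ("2012-06-13 08:49:55", "fb_A", "2", "ax"),  # first seen at 8
        ("2012-06-13 12:05:00", "fb_B", "2", "ax"),  # new visitor at 12
    ],
)

rows = conn.execute(
    """
    SELECT first_hour AS Hour, COUNT(*) AS Events
    FROM (
        -- one row per visitor: the earliest hour they appeared that day
        SELECT facebookID,
               MIN(CAST(strftime('%H', timestamp) AS INTEGER)) AS first_hour
        FROM e4s_analytic_data
        WHERE actionID = 'ax' AND producerID = '2'
          AND timestamp BETWEEN '2012-06-13 00:00:00' AND '2012-06-13 23:59:59'
        GROUP BY facebookID
    ) AS firsts
    GROUP BY first_hour
    ORDER BY first_hour
    """
).fetchall()

print(rows)
```

Each visitor contributes exactly one row to the inner query, so a visitor seen at 8 and again at 12 is counted only at 8.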