Calculate visits from pageviews in MySQL

I am recording each page that is viewed by logged-in users in a MySQL table. I would like to calculate how many visits the site has had within a time period (e.g. day, week, month, between 2 dates, etc.) in a similar way to Google Analytics.
Google Analytics defines a visit as user activity separated by at least 30 minutes of inactivity. I have the user ID, URL and date/time of each pageview so I need a query that can calculate a visit defined in this way.
I can easily count the pageviews between 2 dates, but how can I dynamically work out whether a pageview from a user is within 30 minutes of another pageview, and only count it once?
Here is a small sample of the data:
http://sqlfiddle.com/#!2/56695/2
Many thanks.

First, note that doing this kind of analysis in SQL is indeed not the best idea: it has very high computational complexity. There are several ways to eliminate that complexity.
Since we're talking about analytics data, or something more akin to the access logs of a typical web server, we could just add a cookie value to it: a simple piece of front-end code creates the cookie with a random id unless it already exists, and sets its expiry to whatever you want your session length to be, which is 30 minutes by default. (Note that you can change the session length in GA.) Now your task is as simple as counting unique ids grouped by user: complexity O(N), the favourite complexity of most DBMSes.
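For instance, with such a session id stored alongside every hit (here a hypothetical session_id column added to the pageview table; it is not in the question's schema), counting visits per user becomes a plain aggregation:

SELECT `user`, COUNT(DISTINCT session_id) AS visits
FROM uri_history
GROUP BY `user`;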
Now, if you really want to solve this as a gaps-and-islands problem, you can look at the classical solutions to it, as well as some examples here on SO: SQL Server - Counting Sessions - Gaps and islands
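As a minimal sketch of that approach in MySQL 8+ (window functions required), applied to the uri_history table shown below together with the question's 30-minute rule: a pageview opens a new visit when it is the user's first, or comes at least 1800 seconds after their previous one.

SELECT `user`, COUNT(*) AS visits
FROM (
    SELECT `user`, `timestamp`,
           LAG(`timestamp`) OVER (PARTITION BY `user` ORDER BY `timestamp`) AS prev_ts
    FROM uri_history
) AS ordered_hits
WHERE prev_ts IS NULL                 -- the user's first pageview starts a visit
   OR `timestamp` - prev_ts >= 1800   -- 30+ minutes of inactivity starts a new visit
GROUP BY `user`;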
Finally, the 'proper' way of tracking the session id would be generating a random string on every hit and setting it to a certain custom dimension, while having it as a session-level dimension for GA UA. Here's a more detailed explanation.
GA4 is gracious enough to surface the session id more properly, and here is how.

First, I would also index the uri column and make each column "not nullable":
CREATE TABLE IF NOT EXISTS `uri_history` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user` int(10) unsigned NOT NULL, /* cannot be NULL */
`timestamp` int(10) unsigned NOT NULL, /* cannot be NULL */
`uri` varchar(255) NOT NULL, /* cannot be NULL */
PRIMARY KEY (`id`),
KEY `user` (`user`),
KEY `timestamp` (`timestamp`),
KEY `uri` (`uri`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
However, I am a bit bewildered by your timestamp column having an int(10) definition and values such as 1389223839. I would expect an integer timestamp to be a value created with the UNIX_TIMESTAMP function call, but 1389223839 would then represent a value of '2014-01-08 18:30:39' for the 'America/New_York' time zone. I would have expected a sample timestamp to be more "contemporary." But I will have to assume that this column is a Unix timestamp value.
Let's say I was interested in gathering statistics for the month of June of this year:
SELECT * FROM uri_history
WHERE DATE(FROM_UNIXTIME(`timestamp`)) between '2022-06-01' and '2022-06-30'
ORDER BY `uri`, `user`, `timestamp`
From this point on I would process the returned rows in sequence recognizing breaks on the uri and user columns. For any returned uri and user combination, it should be very simple to compare the successive timestamp values and see if they differ by at least 30 minutes (i.e. 1800 seconds). In Python this would look like:
current_uri = None
current_user = None
current_timestamp = None
counter = None

# Process each returned row:
for row in returned_rows:
    uri = row['uri']
    user = row['user']
    timestamp = row['timestamp']
    if uri != current_uri:
        # We have a new `uri` column:
        if current_uri:
            # Statistics for previous uri:
            print(f'Visits for uri {current_uri} = {counter}')
        current_uri = uri
        current_user = user
        counter = 1
    elif user != current_user:
        # We have a new user for the current uri:
        current_user = user
        counter += 1
    elif timestamp - current_timestamp >= 1800:
        # New visit is at least 30 minutes after the user's
        # previous visit for this uri:
        counter += 1
    # Remember this row's timestamp for the next comparison:
    current_timestamp = timestamp

# Output final statistics, if any:
if current_uri:
    print(f'Visits for uri {current_uri} = {counter}')

Am I correct that you want to count how many logged-in users visit the site within a 30-minute period, but only count once per user even if the user visits more pages in that period? If so, you could filter the rows and then group them by 30-minute periods.
First, convert the integer timestamp into a date using FROM_UNIXTIME, extract the minute of the visit, compute how many minutes have passed within the current 30-minute bucket, and derive the start and end of each period:
SELECT DATE_FORMAT(FROM_UNIXTIME(timestamp), '%e %b %Y %H:%i:%s') visit_time,
FROM_UNIXTIME(timestamp) create_at,
MINUTE(FROM_UNIXTIME(timestamp)) create_minute,
MINUTE(FROM_UNIXTIME(timestamp)) % 30 create_minute_has_past_group,
DATE_FORMAT(FROM_UNIXTIME(timestamp) - INTERVAL MINUTE(FROM_UNIXTIME(timestamp)) % 30 MINUTE, '%H:%i') AS period_start,
DATE_FORMAT(FROM_UNIXTIME(timestamp) + INTERVAL 30 - MINUTE(FROM_UNIXTIME(timestamp)) % 30 MINUTE, '%H:%i') AS period_end
FROM uri_history
FROM uri_history
After that, group by the period start and count distinct users:
SELECT DATE_FORMAT(FROM_UNIXTIME(timestamp) - INTERVAL MINUTE(FROM_UNIXTIME(timestamp)) % 30 MINUTE, '%H:%i') AS period_start,
DATE_FORMAT(FROM_UNIXTIME(timestamp) + INTERVAL 30 - MINUTE(FROM_UNIXTIME(timestamp)) % 30 MINUTE, '%H:%i') AS period_end,
COUNT(DISTINCT user) count
FROM uri_history
GROUP BY period_start
ORDER BY period_start ASC;
I adapted these from this answer.

Related

What is the best way to handle millions of rows inside the Visits table?

Regarding that question: the answer is correct and made the queries better, but it does not solve the whole problem.
CREATE TABLE `USERS` (
`ID` char(255) COLLATE utf8_unicode_ci NOT NULL,
`NAME` char(255) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
There are only 5 rows inside the USERS table:

ID                           | NAME
-----------------------------|-----
C9XzpOxWtuh893z1GFB2sD4BIko2 | ...
I2I7CZParyMatRKnf8NiByujQ0F3 | ...
EJ12BBKcjAr2I0h0TxKvP7uuHtEg | ...
VgqUQRn3W6FWAutAnHRg2K3RTvVL | ...
M7jwwsuUE156P5J9IAclIkeS4p3L | ...
CREATE TABLE `VISITS` (
`USER_ID` char(255) COLLATE utf8_unicode_ci NOT NULL,
`VISITED_IN` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
KEY `USER_ID` (`USER_ID`,`VISITED_IN`),
CONSTRAINT `VISITS_ibfk_1` FOREIGN KEY (`USER_ID`) REFERENCES `USERS` (`ID`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
The indexes inside the VISITS table:

Keyname | Type  | Unique | Packed | Column     | Cardinality | Collation | Null | Comment
--------|-------|--------|--------|------------|-------------|-----------|------|--------
USER_ID | BTREE | No     | No     | USER_ID    | 3245        | A         | No   |
        |       |        |        | VISITED_IN | 5283396     | A         | No   |
There are 5,740,266 rows inside the VISITS table:
C9XzpOxWtuh893z1GFB2sD4BIko2 = 4,359,264 profile visits
I2I7CZParyMatRKnf8NiByujQ0F3 = 1,237,286 profile visits
EJ12BBKcjAr2I0h0TxKvP7uuHtEg = 143,716 profile visits
VgqUQRn3W6FWAutAnHRg2K3RTvVL = 0 profile visits
M7jwwsuUE156P5J9IAclIkeS4p3L = 0 profile visits
The time taken for the queries (seconds will change according to the number of rows):
SELECT COUNT(*) FROM VISITS WHERE USER_ID = 'C9XzpOxWtuh893z1GFB2sD4BIko2'
Before applying Rick James' answer, the query took between 90 and 105 seconds.
After applying Rick James' answer, the query took between 55 and 65 seconds.
SELECT COUNT(*) FROM VISITS WHERE USER_ID = 'I2I7CZParyMatRKnf8NiByujQ0F3'
Before applying Rick James' answer, the query took between 90 and 105 seconds.
After applying Rick James' answer, the query took between 20 and 30 seconds.
SELECT COUNT(*) FROM VISITS WHERE USER_ID = 'EJ12BBKcjAr2I0h0TxKvP7uuHtEg'
Before applying Rick James' answer, the query took between 90 and 105 seconds.
After applying Rick James' answer, the query took between 4 and 8 seconds.
SELECT COUNT(*) FROM VISITS WHERE USER_ID = 'VgqUQRn3W6FWAutAnHRg2K3RTvVL'
Before applying Rick James' answer, the query took between 90 and 105 seconds.
After applying Rick James' answer, the query took between 1 and 3 seconds.
SELECT COUNT(*) FROM VISITS WHERE USER_ID = 'M7jwwsuUE156P5J9IAclIkeS4p3L'
Before applying Rick James' answer, the query took between 90 and 105 seconds.
After applying Rick James' answer, the query took between 1 and 3 seconds.
As you can see, before applying the index it took between 90 and 105 seconds to count the visits of a specific user, even if the user had only a few rows (visits).
After applying the index things became better, but the problem is:
If I visit the C9XzpOxWtuh893z1GFB2sD4BIko2 profile, it takes between 55 and 65 seconds to get the profile visits.
If I visit the I2I7CZParyMatRKnf8NiByujQ0F3 profile, it takes between 20 and 30 seconds to get the profile visits.
Etc.
A user who has only a few rows (visits) is lucky, because their profile loads faster.
I could ignore everything above and create a column inside the USERS table that counts the user's visits, incrementing it on each new visit instead of creating millions of rows, but that will not work for me because I allow the user to filter the visits like this:
Last 60 minutes
Last 24 hours
Last 7 days
Last 30 days
Last 6 months
Last 12 months
All-time
What should I do?
The problem is that you are evaluating, and continually re-evaluating, very large row counts that are actually part of history and can never change. You cannot count these rows every time, because that takes too long. You want to provide counts for:
Last 60 minutes
Last 24 hours
Last 7 days
Last 30 days
Last six months
All-time
You need four tables (a schema sketch follows this list):
Table 1: A small, fast table holding the records of visits today and yesterday
Table 2: An even smaller, very fast table holding per-user counts for the period from the day before yesterday ("D2") through "D7" (field D2toD7), plus the periods D8toD30, D31toD183, and D184andEarlier
Table 3: A table holding the visit counts for each user on each day
Table 4: The very large and slow table you already have, with each visit logged against a timestamp
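As a rough sketch of what Tables 1 to 3 could look like (all table and column names here are my assumptions, not part of the original answer; Table 4 is the existing VISITS table):

-- Table 1: raw visits for today and yesterday only (assumed names)
CREATE TABLE recent_visits (
    user_id CHAR(255) NOT NULL,
    visited_in DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    KEY (user_id, visited_in)
);

-- Table 2: one row per user with pre-aggregated period counts
CREATE TABLE period_counts (
    user_id CHAR(255) NOT NULL PRIMARY KEY,
    D2toD7 INT UNSIGNED NOT NULL DEFAULT 0,
    D8toD30 INT UNSIGNED NOT NULL DEFAULT 0,
    D31toD183 INT UNSIGNED NOT NULL DEFAULT 0,
    D184andEarlier INT UNSIGNED NOT NULL DEFAULT 0
);

-- Table 3: visit counts per user per day
CREATE TABLE visit_counts_by_day (
    user_id CHAR(255) NOT NULL,
    visit_date DATE NOT NULL,
    visit_count INT UNSIGNED NOT NULL,
    PRIMARY KEY (user_id, visit_date)
);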
You can then get the 'Last 60 minutes' and 'Last 24 hours' counts by doing a direct query on Table 1, which will be very fast.
‘Last 7 days’ is the count of all records in Table 1 (for your user) plus the D2toD7 value (for your user) in Table 2.
‘Last 30 days’ is the count of all records in Table 1 (for your user) plus D2toD7, plus D8toD30.
‘Last six months’ is Table 1 plus D2toD7, plus D8toD30, plus D31toD183.
‘All-time’ is Table 1 plus D2toD7, plus D8toD30, plus D31toD183, plus D184andEarlier.
I’d be running php scripts to retrieve these values – there’s no need to try and do it all in one complex query. A few, even several, very quick hits on the database, collect up the numbers, return the result. The script will run in very much less than one second.
So, how do you keep the counts in Table 2 updated? This is where you need Table 3, which holds counts of visits by each user on each day. Create Table 3 and populate it with COUNT values for the data in your enormous table of all visits, GROUP BY User and Date, so you have the number of visits by each user on each day. You only need to create and populate Table 3 once.
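Under the assumed names from the sketch above, that one-off population of Table 3 could look like this (VISITS being the existing large table):

INSERT INTO visit_counts_by_day (user_id, visit_date, visit_count)
SELECT USER_ID, DATE(VISITED_IN), COUNT(*)
FROM VISITS
GROUP BY USER_ID, DATE(VISITED_IN);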
You now need a CRON job/script, or similar, to run once a day, deleting from Table 1 the rows recording visits made the day before yesterday. This script needs to (a sketch follows the list):
Identify the counts of visits for each user the day before yesterday
Insert those counts in Table 3 with the ‘day before yesterday’ date.
Add the count values to the ‘D2toD7’ values for each user in Table 2.
Delete the 'day before yesterday' rows from Table 1.
Look up the value for (what just became) D8 for each user in Table 3. Decrement this value from the ‘D2 to D7’ value for each user.
For each of the ‘D8toD30’, ’D31toD183’ etc. fields, increment for the day that is now part of the time period, decrement as per the day that drops out of the time period. Using the values stored in Table 3.
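A hedged sketch of those steps, again under the assumed names from above (the D8toD30 and later buckets would be maintained the same way, with their own boundary days):

-- 1. Copy day-before-yesterday's counts from Table 1 into Table 3:
INSERT INTO visit_counts_by_day (user_id, visit_date, visit_count)
SELECT user_id, DATE(visited_in), COUNT(*)
FROM recent_visits
WHERE DATE(visited_in) = CURDATE() - INTERVAL 2 DAY
GROUP BY user_id, DATE(visited_in);

-- 2. Add the new D2 counts to D2toD7 and subtract the day that just became D8:
UPDATE period_counts p
LEFT JOIN visit_counts_by_day d2
       ON d2.user_id = p.user_id AND d2.visit_date = CURDATE() - INTERVAL 2 DAY
LEFT JOIN visit_counts_by_day d8
       ON d8.user_id = p.user_id AND d8.visit_date = CURDATE() - INTERVAL 8 DAY
SET p.D2toD7 = p.D2toD7 + COALESCE(d2.visit_count, 0) - COALESCE(d8.visit_count, 0);

-- 3. Purge the moved rows from Table 1:
DELETE FROM recent_visits WHERE visited_in < CURDATE() - INTERVAL 1 DAY;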
Remember to keep a sense of proportion; a period of 183 days approximates to six months well enough for any real-world visit counting purpose.
Overview: you cannot count millions of rows quickly. Use the fact that these are historical figures that will never change. Because you have Table 1 for the up-to-the-minute counts, you only need to update the historic period counts once a day. Multiple (even dozens of) very, very fast queries will get you accurate results very quickly.
This may not be the answer, but a suggestion.
If you do not require real-time data, can't we run a scheduler that inserts these into a summary table every x minutes? Then we can access that summary table for your count.
Note: We can add a sync-time column to your table if you need a time-wise login count. (Then your summary table also grows dynamically.)
Example table columns:
PK_Column, user_id, number_of_visits, sync_time
We can use an asynchronous (reactive) implementation for your front end. That means the data will load after some time, but the user will never experience that delay in their work.
Create a summary table and, every day at 12:00 AM, run a job that puts the user-wise and date-wise visit summary into that table.
user_visit_Summary Table:
PK_Column, User ID, Number_of_Visites, VISIT_Date
Note: Create indexes for User ID and the Date fields
When you're retrieving the data, you can access it via a DB function, or a single query like this:
SELECT COUNT(*) + (SELECT Number_of_Visites FROM user_visit_Summary
    WHERE user_id = xxx AND VISIT_Date <= ['DATE 12:00 AM' - 1]
    ORDER BY PK_Column DESC LIMIT 1) AS total_visits
FROM VISITS
WHERE USER_ID = xxx AND VISITED_IN > 'DATE 12:00 AM';
For any query of a day or longer, use a Summary table.
That is, build and maintain a summary table with 3 columns: user_id, date, count; PRIMARY KEY(user_id, date). For "all time" and "last month", the queries will be:
SELECT SUM(count) FROM summary WHERE user_id='...';
SELECT SUM(count) FROM summary
WHERE user_id='...'
AND date >= CURDATE() - INTERVAL 1 MONTH;
At midnight each night, roll your current table up into one row per user per day in the summary table, then clear the current table. That table will continue to be used for shorter timespans.
This achieves speed for every user for every time range.
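A minimal sketch of that midnight rollup, assuming the live table is called `recent` with a `datetime` column, as in the queries further down:

INSERT INTO summary (user_id, `date`, `count`)
SELECT user_id, DATE(`datetime`), COUNT(*)
FROM recent
GROUP BY user_id, DATE(`datetime`);

-- then purge; per the compromise below, keep 24 hours rather than clearing outright:
DELETE FROM recent WHERE `datetime` < NOW() - INTERVAL 24 HOUR;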
But, there is a "bug". I am forcing "day"/"week"/etc. to be midnight to midnight, and not really allowing you to say "the past 24 hours".
I suggest the following compromise for that "bug":
For long timespans, use the summary table, plus count today's hits from the other table.
For allowing "24 hours" to reach into yesterday, change the other table to reach back to yesterday morning. That is, purge only after 24 hours, not 1 calendar day.
To fetch all counters at once, do all the work in subqueries. There are two approaches, probably equally fast, but the result is either in rows or columns:
-- rows:
SELECT 'hour', COUNT(*) FROM recent ...
UNION ALL
SELECT '24 hr', COUNT(*) FROM recent ...
UNION ALL
SELECT 'month', SUM(count) FROM summary ...
UNION ALL
SELECT 'all', SUM(count) FROM summary ...
;
-- columns:
SELECT
( SELECT COUNT(*) FROM recent ... ) AS 'hour',
( SELECT COUNT(*) FROM recent ... ) AS '24 hr',
( SELECT SUM(count) FROM summary ... ) AS 'last month',
( SELECT SUM(count) FROM summary ... ) AS 'all time'
;
The "..." is
WHERE user_id = '...'
AND datetime >= ... -- except for "all time"
There is an advantage in rolling the several queries into a single query (either way) -- This avoids multiple round trips to the server and multiple invocations of the Optimizer.
forpas provided another approach https://stackoverflow.com/a/72424133/1766831 but it needs to be adjusted to reach into two different tables.

How to get a rolling data set by week with sql

I had a SQL query I would run to get a rolling-sum (or moving-window) data set. I would run this query once for every 7 days of data, increasing the interval number by 7 (28 in the example below) until I reached the start of the data. It gave me the data split by week so I could loop through it in the view to create a weekly graph.
SELECT *
FROM `table`
WHERE `row_date` >= DATE_SUB(NOW(), INTERVAL 28 DAY)
AND `row_date` <= DATE_SUB(NOW(), INTERVAL 21 DAY)
This is of course very slow once you have several weeks' worth of data. I wanted to replace it with a single query. I came up with this.
SELECT *,
CONCAT(YEAR(row_date), '/', WEEK(row_date)) as week_date
FROM `table`
GROUP BY week_date
ORDER BY row_date DESC
It appeared mostly accurate, except I noticed the counts for the current week and the last week of the year were much lower than usual. That's because this query groups by calendar week starting on Sunday (or Monday?), meaning that it resets weekly.
Here's a data set of employees that you can use to demonstrate the behavior.
CREATE TABLE employees (
id INT NOT NULL,
first_name VARCHAR(14) NOT NULL,
last_name VARCHAR(16) NOT NULL,
row_date DATE NOT NULL,
PRIMARY KEY (id)
);
INSERT INTO `employees` VALUES
(1,'Bezalel','Simmel','2016-12-25'),
(2,'Bezalel','Simmel','2016-12-31'),
(3,'Bezalel','Simmel','2017-01-01'),
(4,'Bezalel','Simmel','2017-01-05');
Assuming you run it today, 2017-01-06, this data will return the last 3 rows in the same data point with the old query (last 7 days), but only the last 2 rows in the same data point with the new query (Sunday to Saturday).
For more information on what I mean by rolling or moving window, see this English stack exchange link.
https://english.stackexchange.com/questions/362791/word-for-graph-that-counts-backwards-vs-graph-that-counts-forwards
How can I write a query in MySQL that will bring me rolling data, where the last data point is the last 7 days of data, the previous point is the previous 7 days, and so on?
I've had to interpret your question a lot, so this answer might be unsuitable. It sounds like you are trying to get a graph showing data historically grouped into 7-day periods. Your current attempt does this by grouping on calendar week instead of by 7-day period, leading to inconsistent period sizes.
So using a modification of your dataset on sql fiddle ( http://sqlfiddle.com/#!9/90f1f2 ) I have come up with this
SELECT
-- Figure out how many periods of 7 days ago this record applies to
FLOOR( DATEDIFF( CURRENT_DATE , row_date ) / 7 ) AS weeks_ago,
-- Count the number of ids in this group
COUNT( DISTINCT id ) AS number_in_week,
-- Because this is grouped, make sure to have some consistency on what we select instead of leaving it to chance
MIN( row_date ) AS min_date_in_week_in_dataset
FROM `sample_data`
-- Groups by weeks ago because that's what you are interested in
GROUP BY weeks_ago
ORDER BY
min_date_in_week_in_dataset DESC;

How to count entries in a mysql table grouped by time

I've found lots of not-quite answers to this question, but nothing I can base my rather limited SQL skills on...
I've got a gas meter which gives a pulse for every cm3 of gas used; the time each pulse happens is captured by a Pi and stored in a MySQL db, and I'm trying to graph that data. In order to graph it, I want to sum how many pulses are received in every n-minute period, where n may be 5 minutes for a graph covering a day, or up to 24 hours for a graph covering a year.
The data are in a table which has two columns, a primary key/auto inc called "pulse_ref" and "pulse_time" which stores a unix timestamp of the time a pulse was received.
Can anyone suggest a SQL query to count how many pulses occurred, grouped into, say, 5-minute intervals?
Create table:
CREATE TABLE `gas_pulse` (
`pulse_ref` int(11) NOT NULL AUTO_INCREMENT,
`pulse_time` int(11) DEFAULT NULL,
PRIMARY KEY (`pulse_ref`));
Populate some data:
INSERT INTO `gas_pulse` VALUES (1,1477978978),(2,1477978984),(3,1477978990),(4,1477978993),(5,1477979016),(6,1477979063),(7,1477979111),(8,1477979147),(9,1477979173),(10,1477979195),(11,1477979214),(12,1477979232),(13,1477979249),(14,1477979267),(15,1477979285),(16,1477979302),(17,1477979320),(18,1477979337),(19,1477979355),(20,1477979372),(21,1477979390),(22,1477979408),(23,1477979425),(24,1477979443),(25,1477979461),(26,1477979479),(27,1477979497),(28,1477979515),(29,1477979533),(30,1477979551),(31,1477979568),(32,1477979586),(33,1477980142),(34,1477980166),(35,1477981433),(36,1477981474),(37,1477981526),(38,1477981569),(39,1477981602),(40,1477981641),(41,1477981682),(42,1477981725),(43,1477981770),(44,1477981816),(45,1477981865),(46,1477981915),(47,1477981966),(48,1477982017),(49,1477982070),(50,1477982124),(51,1477982178),(52,1477982233),(53,1477988261),(54,1477988907),(55,1478001784),(56,1478001807),(57,1478002385),(58,1478002408),(59,1478002458),(60,1478002703),(61,1478002734),(62,1478002784),(63,1478002831),(64,1478002863),(65,1478002888),(66,1478002909),(67,1478002928),(68,1478002946),(69,1478002964),(70,1478002982),(71,1478003000),(72,1478003018),(73,1478003036),(74,1478003054),(75,1478003072),(76,1478003090),(77,1478003108),(78,1478003126),(79,1478003145),(80,1478003163),(81,1478003181),(82,1478003199),(83,1478003217),(84,1478003235),(85,1478003254),(86,1478003272),(87,1478003290),(88,1478003309),(89,1478003327),(90,1478003346),(91,1478003366),(92,1478003383),(93,1478003401),(94,1478003420),(95,1478003438),(96,1478003457),(97,1478003476),(98,1478003495),(99,1478003514),(100,1478003533),(101,1478003552),(102,1478003572),(103,1478003592),(104,1478003611),(105,1478003632),(106,1478003652),(107,1478003672),(108,1478003693),(109,1478003714),(110,1478003735),(111,1478003756),(112,1478003778),(113,1478003799),(114,1478003821),(115,1478003844),(116,1478003866),(117,1478003889),(118,1478003912),(119,1478003936),(120,1478003960),(121,1478003984),(122,1478004008),(123,1478004033),(124,1478004058),(125,1478004084),(126,1478004109),(127,1478004135),(128,1478004161),(129,1478004187),(130,1478004214),(131,1478004241),(132,1478004269),(133,1478004296),(134,1478004324),(135,1478004353),(136,1478004381),(137,1478004410),(138,1478004439),(139,1478004469),(140,1478004498),(141,1478004528),(142,1478004558),(143,1478004589),(144,1478004619),(145,1478004651),(146,1478004682),(147,1478004714),(148,1478004746),(149,1478004778),(150,1478004811),(151,1478004844),(152,1478004877),(153,1478004911),(154,1478004945),(155,1478004979),(156,1478005014),(157,1478005049),(158,1478005084),(159,1478005120),(160,1478005156),(161,1478005193),(162,1478005231),(163,1478005268),(164,1478005306),(165,1478005344),(166,1478005383),(167,1478005422),(168,1478005461),(169,1478005501),(170,1478005541),(171,1478005582),(172,1478005622),(173,1478005663),(174,1478005704),(175,1478005746),(176,1478005788),(177,1478005831),(178,1478005873),(179,1478005917),(180,1478005960),(181,1478006004),(182,1478006049),(183,1478006094),(184,1478006139),(185,1478006186),(186,1478006231),(187,1478006277),(188,1478010694),(189,1478010747),(190,1478010799),(191,1478010835),(192,1478010862),(193,1478010884),(194,1478010904),(195,1478010924),(196,1478010942),(197,1478010961),(198,1478010980),(199,1478010999),(200,1478011018),(201,1478011037),(202,1478011056),(203,1478011075),(204,1478011094),(205,1478011113),(206,1478011132),(207,1478011151),(208,1478011170),(209,1478011189),(210,1478011208),(211,1478011227),(212,1478011246),(213,1478011265),(214,1478011285),(215,1478011304),(216,1478011324),(217,1478011344),(218,1478011363),(219,1478011383),(220,1478011403),(221,1478011423),(222,1478011443),(223,1478011464),(224,1478011485),(225,1478011506),(226,1478011528),(227,1478011549),(228,1478011571),(229,1478011593),(230,1478011616),(231,1478011638),(232,1478011662),(233,1478011685),(234,1478011708),(235,1478011732),(236,1478011757),(237,1478011782),(238,1478011807),(239,1478011832),(240,1478011858),(241,1478011885),(242,1478011912),(243,1478011939),(244,1478011967),(245,1478011996),(246,1478012025),(247,1478012054),(248,1478012086),(249,1478012115),(250,1478012146),(251,1478012178),(252,1478012210),(253,1478012244),(254,1478012277),(255,1478012312),(256,1478012347),(257,1478012382),(258,1478012419),(259,1478012456),(260,1478012494),(261,1478012531),(262,1478012570),(263,1478012609),(264,1478012649),(265,1478012689),(266,1478012730),(267,1478012771),(268,1478012813),(269,1478012855),(270,1478012898),(271,1478012941),(272,1478012984),(273,1478013028),(274,1478013072),(275,1478013117),(276,1478013163),(277,1478013209),(278,1478013255),(279,1478013302),(280,1478013350),(281,1478013399),(282,1478013449),(283,1478013500),(284,1478013551),(285,1478013604),(286,1478013658),(287,1478013714),(288,1478013771),(289,1478013830),(290,1478013891),(291,1478013954),(292,1478014019),(293,1478014086),(294,1478014156),(295,1478014228),(296,1478014301),(297,1478014373),(298,1478014446),(299,1478014518),(300,1478014591),(301,1478014664),(302,1478014736),(303,1478014809),(304,1478014882),(305,1478015377),(306,1478015422),(307,1478015480),(308,1478015543),(309,1478015608),(310,1478015676),(311,1478015740),(312,1478015803),(313,1478015864),(314,1478015921),(315,1478015977),(316,1478016030),(317,1478016081),(318,1478016129),(319,1478016176);
I assume you need to get the pulse count in n-minute intervals (in your case, 5 minutes). To achieve this, please try the following query:
SELECT
COUNT(*) AS gas_pulse_count,
FROM_UNIXTIME(pulse_time - MOD(pulse_time, 5 * 60)) from_time,
FROM_UNIXTIME((pulse_time - MOD(pulse_time, 5 * 60)) + 5 * 60) to_time
FROM
gas_pulse
GROUP BY from_time
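The same bucketing idea scales to any period; for a year-long graph you might use 24-hour buckets instead, changing only the bucket size (this variant is my extrapolation, not part of the original answer):

SELECT
COUNT(*) AS gas_pulse_count,
FROM_UNIXTIME(pulse_time - MOD(pulse_time, 24 * 60 * 60)) from_time
FROM
gas_pulse
GROUP BY from_time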

SQL average number of requests per user over time period

I have a table (Requests) with the following fields:
id, requestType, userEmail, date
I want to find the average number of requests per user over a given period (i.e. over the last month). Any suggestions?
Thanks!
Greg
Something like the COUNT aggregate will work. Might be a little slow.
SELECT COUNT(*) FROM Requests WHERE `userEmail` = '...' AND `date` BETWEEN 'first-date YYYY-MM-DD' AND 'second-date YYYY-MM-DD';
SQL SUM
I would also recommend, if you have a lot of requests, keeping one row per user per day and just updating the running total for that user.
Edit: If you want the last 30 days, something like this query should work. It worked on my test table.
SELECT COUNT(*) FROM Requests WHERE `userEmail` = '...' AND `date` BETWEEN CURDATE() - INTERVAL 30 DAY AND CURDATE();
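Since the question asks for the average per user rather than one user's total, a minimal sketch of that (dividing total requests by the number of distinct users in the window):

SELECT COUNT(*) / COUNT(DISTINCT userEmail) AS avg_requests_per_user
FROM Requests
WHERE `date` BETWEEN CURDATE() - INTERVAL 30 DAY AND CURDATE();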

Selecting records timestamped in a certain time range of each other

How would you go about selecting records timestamped within a certain amount of time of each other?
Application and sought solution:
I have a table with records of clicks, I am wanting to go through and find the clicks from the same IP that occurred within a certain time period.
e.g.: SELECT ALL ip_address WHERE 5 or more of the same ip_address, occurred/are grouped within/timestamped, within 10 minutes of each other
You can select records like this:
$recorddate = date("Y-m-d");
SELECT * FROM table WHERE date > UNIX_TIMESTAMP('$recorddate');
The UNIX_TIMESTAMP function converts a date to a timestamp, and you can easily use it in your queries.
If you want to grab the records in a 10-minute interval, you can do something like this:
$starttime = "2012-08-30 19:00:00";
$endtime = "2012-08-30 19:10:00";
SELECT * FROM table WHERE date >= UNIX_TIMESTAMP('$starttime') AND date <= UNIX_TIMESTAMP('$endtime') ;
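Neither snippet directly answers the "5 or more clicks from the same IP within 10 minutes" part. A minimal sketch of that using fixed 10-minute buckets, an approximation of "within 10 minutes of each other" (assuming a `clicks` table with an `ip_address` column and the Unix-timestamp `date` column used above):

SELECT ip_address,
FROM_UNIXTIME(`date` - MOD(`date`, 600)) AS bucket_start,
COUNT(*) AS clicks_in_bucket
FROM clicks
GROUP BY ip_address, bucket_start
HAVING COUNT(*) >= 5;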
I decided not to try for a single query on the raw data.
After discussing with a friend, and then reading about options such as the MEMORY engine and PHP memcache, I decided to go with a regular table that records click counts with a time-to-live timestamp. After that timestamp passes, a new TTL is assigned and the count is reset.
One catch for my application is that I can't be exactly sure how long the parameter configuration settings will be; if they are larger and the memory gets cleared, things start over.
It isn't a perfect solution if it is run on every user link click, but it should be pretty good at catching click-fraud storms, and do the job.
Some managing PHP/MySQL code ("Drupalized queries"):
$timeLimit = $clickQualityConfigs['edit-submitted-within-x-num-of-same-ip-clicks']." ".$clickQualityConfigs['edit-submitted-time-period-same-ip-ban']; // => 1 days // e.g.
$filterEndTime = strtotime("+".$timeLimit);
$timeLimitUpdate_results = db_query('UPDATE {ip_address_count}
SET ttl_time_stamp = :filterendtime, click_count = :clickcountfirst WHERE ttl_time_stamp < :timenow', array(':filterendtime' => $filterEndTime, ':clickcountfirst' => '0', ':timenow' => time()));
$clickCountUpdate_results = db_query('INSERT INTO {ip_address_count} (ip_address,ttl_time_stamp,click_count)
VALUES (:ipaddress,:timestamp,:clickcountfirst)
ON DUPLICATE KEY UPDATE click_count = click_count + 1', array(':ipaddress' => $ip_address,':timestamp' => $filterEndTime,':clickcountfirst' => '1'));
DB info:
CREATE TABLE `ip_address_count` (
`ip_address` varchar(24) NOT NULL DEFAULT '',
`ttl_time_stamp` int(11) DEFAULT NULL,
`click_count` int(11) DEFAULT NULL,
PRIMARY KEY (`ip_address`)
)