Once an hour results from MySQL

Consider a table such as this:
CREATE TABLE records (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
eventOccured DATETIME NOT NULL,
eventType INT NOT NULL,
eventDescription VARCHAR(32)
)
I would like to retrieve the eventOccured and eventType fields from the entire table, but if two or more events occurred within an hour of each other then I want only the first of them. This would be simple if everything from N:00 to N:59 were considered the same hour, but in this case an event at 12:15 of eventType "5" would be considered to have occurred less than an hour after an 11:45 eventType "5", and so should not be returned. Can this be done in MySQL without a stored procedure? I could write such a procedure, but I worry that it would be rather resource-intensive, and I would love to learn whether MySQL has this ability out of the box.

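What you are describing is every event that has no other event in the following 60 minutes. Note that this keeps the last event of each cluster; to keep the first instead, look backwards by testing r2.eventOccured < r.eventOccured and swapping the two arguments to timestampdiff. This query does it in what should be a MySQL-friendly way: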
select *
from records r
where not exists (select 1
from records r2
where r2.eventOccured > r.eventOccured and
timestampdiff(minute, r.eventOccured, r2.eventOccured) < 60
)
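The NOT EXISTS test only compares each row with its neighbours. If the intent is a greedy pass (keep an event, then skip later events of the same type until a full hour has elapsed), the logic can be sketched outside SQL. The following is a minimal Python illustration, not MySQL; the function name first_per_hour, the sample rows, and the assumption that the one-hour window applies per eventType are all hypothetical.

```python
from datetime import datetime, timedelta

def first_per_hour(events, window=timedelta(hours=1)):
    """Keep an event only if the last kept event of the same type is at least `window` older.

    `events` is an iterable of (event_occurred, event_type) tuples.
    """
    last_kept = {}  # event_type -> timestamp of the most recently kept event
    kept = []
    for occurred, event_type in sorted(events):
        previous = last_kept.get(event_type)
        if previous is None or occurred - previous >= window:
            kept.append((occurred, event_type))
            last_kept[event_type] = occurred
    return kept

# Hypothetical sample rows: (eventOccured, eventType)
sample = [
    (datetime(2020, 1, 1, 11, 45), 5),
    (datetime(2020, 1, 1, 12, 15), 5),  # 30 minutes after 11:45, dropped
    (datetime(2020, 1, 1, 12, 50), 5),  # 65 minutes after 11:45, kept
    (datetime(2020, 1, 1, 12, 0), 7),   # different type, tracked separately
]
result = first_per_hour(sample)
```

Note that the 12:50 event is kept here because the last kept event (11:45) is more than an hour older, whereas a pure neighbour comparison would drop it for being within an hour of 12:15.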

Related

Select one piece of data from every day at a specific hour MySQL

My database has data inserted every minute, stored in the format 2020-04-05 16:20:04 in a column called timestamp.
I need a MySQL query to select data from every day at a specific hour (the seconds do not matter); for example, I want to get the data from 16:00 of every day for the past 30 days.
Currently the query just grabs the data from the past 30 days and the PHP application filters it. This causes very slow loading times, so I want to select only the wanted data from the database.
Please try the following sql:
select
d.timestamp, hour(d.timestamp)
from
demo1 d
where
DATEDIFF(NOW(), d.timestamp) < 30 and hour(d.timestamp) = 16;
The CREATE statement and sample data are as follows:
CREATE TABLE `demo1` (
`id` int(11) not null auto_increment primary key,
`serverid` int(11) not null,
`timestamp` datetime not null,
KEY `idx_timestamp` (`timestamp`)
) engine = InnoDB;
insert into `demo1` (serverid, timestamp)
VALUES (1, "2020-07-05 16:20:04"),
(2, "2020-07-06 17:20:04"),
(3, "2020-07-07 16:40:04"),
(4, "2020-07-08 08:20:04"),
(5, "2020-07-05 15:20:04"),
(5, "2020-07-05 16:59:04"),
(5, "2020-06-04 16:59:04");
Zhiyong's response will work, but won't perform well, because wrapping the column in DATEDIFF() prevents MySQL from using an index on timestamp. You need to find a way to get the query to use indexes.
You can add a simple index on timestamp and run the query this way:
SELECT
d.timestamp, d.*
FROM demo1 d
WHERE 1
AND d.timestamp > CURDATE() - INTERVAL 30 DAY
AND hour(d.timestamp) = 16;
In MySQL 5.7 and up, you can create a generated column (also called a calculated column) to store the hour of the timestamp in a separate column. You can then index this column, perhaps as a composite index on hour + timestamp, so that the query above runs quickly.
ALTER TABLE demo1
ADD COLUMN hour1 tinyint GENERATED ALWAYS AS (HOUR(timestamp)) STORED,
ADD KEY (hour1, timestamp);
The result query would be:
SELECT
d.timestamp, d.*
FROM demo1 d
WHERE 1
AND d.timestamp > CURDATE() - INTERVAL 30 DAY
AND hour1 = 16;
More info on that here:
https://dev.mysql.com/doc/refman/5.7/en/create-table-generated-columns.html
https://dev.mysql.com/doc/refman/5.7/en/generated-column-index-optimizations.html

How to count entries in a mysql table grouped by time

I've found lots of not quite the answers to this question, but nothing I can base my rather limited sql skills on...
I've got a gas meter, which gives a pulse every cm3 of gas used - the time the pulses happen is obtained by a pi and stored in a mysql db. I'm trying to graph the db. In order to graph the data, I want to sum how many pulses are received every n time period. Where n may be 5 mins for a graph covering a day or n may be up to 24hours for a graph covering a year.
The data are in a table which has two columns, a primary key/auto inc called "pulse_ref" and "pulse_time" which stores a unix timestamp of the time a pulse was received.
Can anyone suggest a sql query to count how many pulses occurred grouped up into, say, 5minutely intervals?
Create table:
CREATE TABLE `gas_pulse` (
`pulse_ref` int(11) NOT NULL AUTO_INCREMENT,
`pulse_time` int(11) DEFAULT NULL,
PRIMARY KEY (`pulse_ref`));
Populate some data:
INSERT INTO `gas_pulse` VALUES
(1,1477978978),(2,1477978984),(3,1477978990),(4,1477978993),(5,1477979016),(6,1477979063),(7,1477979111),(8,1477979147),(9,1477979173),(10,1477979195),
(11,1477979214),(12,1477979232),(13,1477979249),(14,1477979267),(15,1477979285),(16,1477979302),(17,1477979320),(18,1477979337),(19,1477979355),(20,1477979372),
(21,1477979390),(22,1477979408),(23,1477979425),(24,1477979443),(25,1477979461),(26,1477979479),(27,1477979497),(28,1477979515),(29,1477979533),(30,1477979551),
(31,1477979568),(32,1477979586),(33,1477980142),(34,1477980166),(35,1477981433),(36,1477981474),(37,1477981526),(38,1477981569),(39,1477981602),(40,1477981641),
(41,1477981682),(42,1477981725),(43,1477981770),(44,1477981816),(45,1477981865),(46,1477981915),(47,1477981966),(48,1477982017),(49,1477982070),(50,1477982124),
(51,1477982178),(52,1477982233),(53,1477988261),(54,1477988907),(55,1478001784),(56,1478001807),(57,1478002385),(58,1478002408),(59,1478002458),(60,1478002703),
(61,1478002734),(62,1478002784),(63,1478002831),(64,1478002863),(65,1478002888),(66,1478002909),(67,1478002928),(68,1478002946),(69,1478002964),(70,1478002982),
(71,1478003000),(72,1478003018),(73,1478003036),(74,1478003054),(75,1478003072),(76,1478003090),(77,1478003108),(78,1478003126),(79,1478003145),(80,1478003163),
(81,1478003181),(82,1478003199),(83,1478003217),(84,1478003235),(85,1478003254),(86,1478003272),(87,1478003290),(88,1478003309),(89,1478003327),(90,1478003346),
(91,1478003366),(92,1478003383),(93,1478003401),(94,1478003420),(95,1478003438),(96,1478003457),(97,1478003476),(98,1478003495),(99,1478003514),(100,1478003533),
(101,1478003552),(102,1478003572),(103,1478003592),(104,1478003611),(105,1478003632),(106,1478003652),(107,1478003672),(108,1478003693),(109,1478003714),(110,1478003735),
(111,1478003756),(112,1478003778),(113,1478003799),(114,1478003821),(115,1478003844),(116,1478003866),(117,1478003889),(118,1478003912),(119,1478003936),(120,1478003960),
(121,1478003984),(122,1478004008),(123,1478004033),(124,1478004058),(125,1478004084),(126,1478004109),(127,1478004135),(128,1478004161),(129,1478004187),(130,1478004214),
(131,1478004241),(132,1478004269),(133,1478004296),(134,1478004324),(135,1478004353),(136,1478004381),(137,1478004410),(138,1478004439),(139,1478004469),(140,1478004498),
(141,1478004528),(142,1478004558),(143,1478004589),(144,1478004619),(145,1478004651),(146,1478004682),(147,1478004714),(148,1478004746),(149,1478004778),(150,1478004811),
(151,1478004844),(152,1478004877),(153,1478004911),(154,1478004945),(155,1478004979),(156,1478005014),(157,1478005049),(158,1478005084),(159,1478005120),(160,1478005156),
(161,1478005193),(162,1478005231),(163,1478005268),(164,1478005306),(165,1478005344),(166,1478005383),(167,1478005422),(168,1478005461),(169,1478005501),(170,1478005541),
(171,1478005582),(172,1478005622),(173,1478005663),(174,1478005704),(175,1478005746),(176,1478005788),(177,1478005831),(178,1478005873),(179,1478005917),(180,1478005960),
(181,1478006004),(182,1478006049),(183,1478006094),(184,1478006139),(185,1478006186),(186,1478006231),(187,1478006277),(188,1478010694),(189,1478010747),(190,1478010799),
(191,1478010835),(192,1478010862),(193,1478010884),(194,1478010904),(195,1478010924),(196,1478010942),(197,1478010961),(198,1478010980),(199,1478010999),(200,1478011018),
(201,1478011037),(202,1478011056),(203,1478011075),(204,1478011094),(205,1478011113),(206,1478011132),(207,1478011151),(208,1478011170),(209,1478011189),(210,1478011208),
(211,1478011227),(212,1478011246),(213,1478011265),(214,1478011285),(215,1478011304),(216,1478011324),(217,1478011344),(218,1478011363),(219,1478011383),(220,1478011403),
(221,1478011423),(222,1478011443),(223,1478011464),(224,1478011485),(225,1478011506),(226,1478011528),(227,1478011549),(228,1478011571),(229,1478011593),(230,1478011616),
(231,1478011638),(232,1478011662),(233,1478011685),(234,1478011708),(235,1478011732),(236,1478011757),(237,1478011782),(238,1478011807),(239,1478011832),(240,1478011858),
(241,1478011885),(242,1478011912),(243,1478011939),(244,1478011967),(245,1478011996),(246,1478012025),(247,1478012054),(248,1478012086),(249,1478012115),(250,1478012146),
(251,1478012178),(252,1478012210),(253,1478012244),(254,1478012277),(255,1478012312),(256,1478012347),(257,1478012382),(258,1478012419),(259,1478012456),(260,1478012494),
(261,1478012531),(262,1478012570),(263,1478012609),(264,1478012649),(265,1478012689),(266,1478012730),(267,1478012771),(268,1478012813),(269,1478012855),(270,1478012898),
(271,1478012941),(272,1478012984),(273,1478013028),(274,1478013072),(275,1478013117),(276,1478013163),(277,1478013209),(278,1478013255),(279,1478013302),(280,1478013350),
(281,1478013399),(282,1478013449),(283,1478013500),(284,1478013551),(285,1478013604),(286,1478013658),(287,1478013714),(288,1478013771),(289,1478013830),(290,1478013891),
(291,1478013954),(292,1478014019),(293,1478014086),(294,1478014156),(295,1478014228),(296,1478014301),(297,1478014373),(298,1478014446),(299,1478014518),(300,1478014591),
(301,1478014664),(302,1478014736),(303,1478014809),(304,1478014882),(305,1478015377),(306,1478015422),(307,1478015480),(308,1478015543),(309,1478015608),(310,1478015676),
(311,1478015740),(312,1478015803),(313,1478015864),(314,1478015921),(315,1478015977),(316,1478016030),(317,1478016081),(318,1478016129),(319,1478016176);
I assume you need the pulse count in n-minute intervals (5 minutes in your case). To achieve this, please try the following query:
SELECT
COUNT(*) AS gas_pulse_count,
FROM_UNIXTIME(pulse_time - MOD(pulse_time, 5 * 60)) from_time,
FROM_UNIXTIME((pulse_time - MOD(pulse_time, 5 * 60)) + 5 * 60) to_time
FROM
gas_pulse
GROUP BY from_time, to_time
ORDER BY from_time;
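The arithmetic behind the query is that pulse_time - MOD(pulse_time, width) rounds each timestamp down to the start of its bucket. As a rough illustration of that step only, here is a small Python sketch; count_per_bucket is a hypothetical helper name, and the seven values are the first seven pulse_time entries from the sample data.

```python
from collections import Counter

def count_per_bucket(pulse_times, bucket_seconds=5 * 60):
    """Count pulses per fixed-width bucket, keyed by the bucket start (Unix seconds)."""
    counts = Counter(t - t % bucket_seconds for t in pulse_times)
    return dict(sorted(counts.items()))

# First seven pulse_time values from the sample data.
pulses = [1477978978, 1477978984, 1477978990, 1477978993,
          1477979016, 1477979063, 1477979111]

five_minute = count_per_bucket(pulses)               # 5-minute buckets
daily = count_per_bucket(pulses, 24 * 60 * 60)       # 24-hour buckets
```

Buckets computed this way are aligned to multiples of the width since the Unix epoch, so 24-hour buckets start at midnight UTC rather than at local midnight.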

Retrieve one out of every n records

I have a table containing thousands of records representing the temperature of a room in a certain moment. Up to now I have been rendering a client side graph of the temperature with JQuery. However, as the amount of records increases, I think it makes no sense to provide so much data to the view, if it is not going to be able to represent them all in a single graph.
I would like to know if there exists a single MySQL query that returns one out of every n records in the table. If so, I think I could get a representative sample of the temperatures measured during a certain lapse of time.
Any ideas? Thanks in advance.
Edit: add table structure.
CREATE TABLE IF NOT EXISTS `temperature` (
`nid` int(10) unsigned NOT NULL COMMENT 'Node identifier',
`temperature` float unsigned NOT NULL COMMENT 'Temperature in Celsius degrees',
`timestamp` int(10) unsigned NOT NULL COMMENT 'Unix timestamp of the temperature record',
PRIMARY KEY (`nid`,`timestamp`)
)
You could do this, where the subquery is your query, and you add a row number to it:
SET @rows=0;
SELECT * from(
SELECT @rows:=@rows+1 AS rowNumber,nid,temperature,`timestamp`
FROM temperature
) yourQuery
WHERE MOD(rowNumber, 5)=0
The MOD condition selects every 5th row; the 5 here is your n. So you get the 5th row, then the 10th, the 15th, and so on.
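The same selection rule (number the rows from 1, keep those whose number is divisible by n) can be shown in a few lines of Python. This is only a sketch of the rule; every_nth and the generated readings are hypothetical, with tuples laid out as (nid, temperature, timestamp).

```python
def every_nth(rows, n):
    """Return rows number n, 2n, 3n, ... (1-based), mirroring MOD(rowNumber, n) = 0."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [row for number, row in enumerate(rows, start=1) if number % n == 0]

# Twelve hypothetical readings, one per minute: (nid, temperature, timestamp)
readings = [(1, 20.0 + i * 0.5, 1000 + 60 * i) for i in range(12)]
sampled = every_nth(readings, 5)  # keeps the 5th and 10th readings
```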
I'm not really sure what you're asking, but you have multiple options:
You can limit your results to n (n representing the amount of temperatures you want to display)
just a simple query with the limit in the end:
select * from tablename limit 1000
You could use a time/date constraint so that you display only the results of the last n days.
Here is an example that uses date functions. The following query selects all rows with a date_col value from within the last 30 days:
mysql> SELECT something FROM tbl_name
-> WHERE DATE_SUB(CURDATE(),INTERVAL 30 DAY) <= date_col;
You could select an average temperature of a certain period, the shorter the period the more results you'll get. You can group by date, yearweek, month etc. to "create the periods"

MySQL select records using MAX(datefield) minus three days

Clearly, I am missing the forest for the trees...I am missing something obvious here!
Scenario:
I've a typical table asset_locator with multiple fields:
id, int(11) PRIMARY
logref, int(11)
unitno, int(11)
tunits, int(11)
operator, varchar(24)
lineid, varchar(24)
uniqueid, varchar(64)
timestamp, timestamp
My current challenge is to SELECT records from this table based on a date range. More specifically, a date range using the MAX(timestamp) field.
So...when selecting I need to start with the latest timestamp value and go back 3 days.
EX: I select all records WHERE the lineid = 'xyz' and going back 3 days from the latest timestamp. Below is an actual example (of the dozens) I've been trying to run.
MySQL returns a single row with all NULL values for the following:
SELECT id, logref, unitno, tunits, operator, lineid,
uniqueid, timestamp, MAX( timestamp ) AS maxdate
FROM asset_locator
WHERE 'maxdate' < DATE_ADD('maxdate',INTERVAL -3 DAY)
ORDER BY uniqueid DESC
There MUST be something obvious I am missing. If anyone has any ideas, please share.
Many thanks!
MAX() is an aggregate function, which means your SELECT will always return one row containing the maximum value, unless you use GROUP BY, but it looks like that's not what you need.
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_max
If you need all the entries between MAX(timestamp) and 3 days before, then you need to do a subselect to obtain the max date, and after that use it in the search condition. Like this:
SELECT id, logref, unitno, tunits, operator, lineid, uniqueid, timestamp
FROM asset_locator
WHERE timestamp >= DATE_ADD( (SELECT MAX(timestamp) FROM asset_locator), INTERVAL -3 DAY)
It will still run efficiently as long as you have an index defined on timestamp column.
Note: In your example
WHERE 'maxdate' < DATE_ADD('maxdate',INTERVAL -3 DAY)
Here you are actually comparing the string literal 'maxdate', not the column alias, because of the quotes, so the condition is never true. Since MAX() still produces one row when no rows match, you were seeing NULL for all fields.
Edit: Oops, forgot the "FROM asset_locator" in query. It got lost at some point when writing the answer :)
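The two steps in the subselect approach (find the newest timestamp first, then filter against it) can be sketched in Python as follows. This is an illustration of the logic under assumed data, not a replacement for the query; within_days_of_latest and the sample rows are hypothetical.

```python
from datetime import datetime, timedelta

def within_days_of_latest(rows, days=3):
    """Return rows whose timestamp is within `days` days of the newest timestamp.

    `rows` is a list of dicts with a "timestamp" key holding a datetime.
    """
    if not rows:
        return []
    # Step 1: the equivalent of SELECT MAX(timestamp)
    cutoff = max(row["timestamp"] for row in rows) - timedelta(days=days)
    # Step 2: the equivalent of WHERE timestamp >= <max minus 3 days>
    return [row for row in rows if row["timestamp"] >= cutoff]

# Hypothetical rows; only the columns needed for the example are included.
rows = [
    {"id": 1, "timestamp": datetime(2013, 5, 1, 8, 0)},
    {"id": 2, "timestamp": datetime(2013, 5, 7, 12, 0)},
    {"id": 3, "timestamp": datetime(2013, 5, 9, 9, 30)},
    {"id": 4, "timestamp": datetime(2013, 5, 10, 12, 0)},
]
recent_ids = [row["id"] for row in within_days_of_latest(rows)]
```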

MYSQL Finding accounts that have not had log entries updated in n days

I have an SQL query that I need help with...
Basically I have two tables I need to work with. One contains customer accounts and the other contains a log of customer service reps' interactions with customers. I want this query to give me the id of any account that has not had a log entry (interaction) in the last 14 days. I also want to filter out a few rep accounts that are irrelevant (using the assignedto field, as you will see). Also, the date format in the log table is non-standard and I cannot change it, as software I have not written also uses this database.
The two tables are cm.dbs (customer accounts) and cm.log (interaction log).
This is the query I came up with but it takes FOREVER to run. The subquery works perfectly and takes a fraction of a second, but when the main query runs with the subquery it is just impossibly slow. I'm guessing this is because the subquery is being run for every row in the main query (and it doesn't need to be) but I am kind of clueless as to how to fix this, as I am not an expert in SQL, I know enough to create basic to intermediate queries and this is not something I have done before.
Here is the query I created so far:
SELECT id FROM cm.dbs WHERE id NOT IN (SELECT filenumber FROM cm.log
WHERE STR_TO_DATE(logdate, '%m/%d/%Y')
BETWEEN DATE_SUB(NOW(), INTERVAL 14 DAY)
AND NOW()
GROUP BY filenumber)
AND assignedto != 'OLD_ACCTS'
AND assignedto != 'HOUSE_ACCOUNTS'
AND assignedto != 'PAID_ACCOUNTS';
The subquery finds all the accounts that have entries in the log table within the last two weeks. It does this job perfectly. The trick is then to get the main query to find all the accounts that do not have entries.
Note also, that the filenumber field in cm.log corresponds to id in the cm.dbs table.
I may have approached this in a completely silly way and I am not above admitting that. Any input on making this work correctly and efficiently is appreciated. I'd also love the fixes/changes anyone recommends explained. I am not simply wanting a query built for me, I want to learn what I did wrong and how to do it better so next time I can figure this out for myself. I rarely ever ask questions like this, I usually figure things out on my own but this has me stumped.
EDIT: Here is a partial schema for the relevant fields in the tables:
cm.dbs:
id int(10) UN PK AI
title varchar(45)
firstname varchar(200)
middlename varchar(200)
lastname varchar(200)
fullname varchar(200)
address varchar(200)
address2 varchar(200)
city varchar(200)
state varchar(200)
zip varchar(50)
assignedto varchar(200)
...
cm.log:
id int(10) UN PK AI
filenumber varchar(200)
agentname varchar(200)
logtime varchar(200)
logdateandtime varchar(200)
logdate varchar(200)
logmessage mediumtext
Your query looks correct to me apart from the change below: since you are checking multiple assignedto values, use a NOT IN list instead of separate != conditions.
SELECT id FROM cm.dbs WHERE id NOT IN (SELECT filenumber FROM cm.log
WHERE STR_TO_DATE(logdate, '%m/%d/%Y')
BETWEEN DATE_SUB(NOW(), INTERVAL 14 DAY)
AND NOW()
GROUP BY filenumber)
AND assignedto NOT IN ('OLD_ACCTS','HOUSE_ACCOUNTS','PAID_ACCOUNTS');
This is the best I can do without a database schema, but should hopefully be pretty close to what you were looking for (or at least point you in the right direction):
SELECT DISTINCT dbs.id
FROM cm.dbs, cm.log
WHERE dbs.id = log.filenumber
AND STR_TO_DATE(log.logdate, '%m/%d/%Y') NOT BETWEEN DATE_SUB(NOW(), INTERVAL 14 DAY) AND NOW()
AND dbs.assignedto NOT IN ('OLD_ACCTS','HOUSE_ACCOUNTS','PAID_ACCOUNTS');
If you get a chance run EXPLAIN on your query and add the output to your question, so we can profile it better (and include the database schema).
I think you are attacking this in the wrong way. Let's break down what you're looking for.
First thing is the filenumber and max logdate:
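-- logdate is stored as 'MM/DD/YYYY' text, so parse it before taking MAX()
SELECT filenumber, MAX(STR_TO_DATE(logdate, '%m/%d/%Y')) AS logdate
FROM cm.log
GROUP BY filenumber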
So now we just need to join it to the other table:
SELECT log.filenumber, MAX(STR_TO_DATE(log.logdate, '%m/%d/%Y')) AS logdate, dbs.assignedto
FROM cm.log as log
INNER JOIN cm.dbs as dbs ON log.filenumber = dbs.id
GROUP BY log.filenumber, dbs.assignedto
Now we want to apply some conditions on what we just selected (older than 2 weeks, not in those 3 groups) :
SELECT * FROM (
-- logdate is stored as 'MM/DD/YYYY' text, so parse it before taking MAX()
SELECT log.filenumber, MAX(STR_TO_DATE(log.logdate, '%m/%d/%Y')) as logdate, dbs.assignedto
FROM cm.log as log
INNER JOIN cm.dbs as dbs ON log.filenumber = dbs.id
GROUP BY log.filenumber, dbs.assignedto) t
WHERE logdate < DATE_SUB(NOW(), INTERVAL 14 DAY)
AND assignedto NOT IN ('OLD_ACCTS','HOUSE_ACCOUNTS','PAID_ACCOUNTS')
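To make the "latest log date per account, then filter" idea concrete, here is a small Python sketch of the same logic. It is an illustration under assumptions: inactive_accounts and the sample rows are hypothetical, and logdate is taken to be a 'MM/DD/YYYY' string as in the question. Two details it highlights: the text dates must be parsed before comparing (as strings, '12/30/2019' sorts after '06/28/2020'), and an INNER JOIN only returns accounts that have at least one log row, so accounts that were never logged need separate handling (for example with a LEFT JOIN).

```python
from datetime import datetime, timedelta

def inactive_accounts(accounts, log_rows, now, days=14,
                      excluded=("OLD_ACCTS", "HOUSE_ACCOUNTS", "PAID_ACCOUNTS")):
    """Return ids of accounts with no log entry in the last `days` days.

    accounts: list of (id, assignedto) tuples.
    log_rows: list of (filenumber, logdate) tuples, logdate as 'MM/DD/YYYY' text.
    """
    cutoff = now - timedelta(days=days)
    latest = {}  # account id -> most recent parsed log date
    for filenumber, logdate in log_rows:
        parsed = datetime.strptime(logdate, "%m/%d/%Y")
        key = int(filenumber)  # filenumber is stored as text in cm.log
        if key not in latest or parsed > latest[key]:
            latest[key] = parsed
    return [
        account_id
        for account_id, assignedto in accounts
        if assignedto not in excluded
        and (account_id not in latest or latest[account_id] < cutoff)
    ]

# Hypothetical sample data
accounts = [(1, "alice"), (2, "bob"), (3, "OLD_ACCTS"), (4, "carol")]
log_rows = [("1", "06/01/2020"), ("1", "06/28/2020"),
            ("2", "12/30/2019"), ("3", "01/01/2019")]
stale = inactive_accounts(accounts, log_rows, now=datetime(2020, 7, 1))
```

Here account 2 is stale because its last entry is months old, and account 4 is stale because it has no entries at all.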