Get latest row from each row in a set (5M+ rows) - mysql

I have 2 tables - sensors and readings. There is a one to many relation from sensors to readings.
I need to query for all rows from sensors and then get the newest (i.e. MAX timestamp) data from readings for each row. I've tried with:
SELECT sensors.*, readings.value, readings.timestamp
FROM sensors
LEFT JOIN readings ON readings.sensor_id = sensors.id
GROUP BY readings.sensor_id
The problem is, I have 6 million rows of data and the query is taking nearly two minutes to execute. Is there a more efficient way I can get hold of the last reading/value for each sensor?

This is how I'd go about the problem:
it involves another table that I named latest_readings
it involves a trigger that populates the latest_readings table
The table
I made sensor_id unique because I assumed you want exactly one latest reading per sensor. Readings can be categorized by type by adding an additional column.
Reason for the unique index: we'll be using MySQL's INSERT INTO ... ON DUPLICATE KEY UPDATE to have all the hard work done for us. If there's already a row for a particular sensor, it gets updated; otherwise, it gets inserted (in one query).
You can also make sensor_id a foreign key. I skipped that part.
CREATE TABLE latest_readings (
id int unsigned not null auto_increment,
sensor_id int unsigned not null,
reading_id int unsigned not null,
primary key(id),
unique (sensor_id)
) ENGINE = InnoDB;
The trigger
Trigger type is after insert. I will assume that the table is named readings and that it contains sensor_id column. Adjust accordingly.
DELIMITER $$
CREATE
TRIGGER `readings_after_insert` AFTER INSERT ON `readings`
FOR EACH ROW BEGIN
    -- Upsert into latest_readings (not readings): one row per sensor
    INSERT INTO latest_readings
        (sensor_id, reading_id)
    VALUES
        (NEW.sensor_id, NEW.id)
    ON DUPLICATE KEY UPDATE reading_id = NEW.id
    ;
END$$
DELIMITER ;
How to query for latest sensor reading
Once more, I assumed what column names were, so adjust accordingly.
SELECT
    r.reading_value
FROM readings r
INNER JOIN latest_readings latest
    -- join on the stored reading id so only the latest row is returned
    ON latest.reading_id = r.id
WHERE latest.sensor_id = 12345;
Disclaimer: this is just an example and it probably contains bugs, which means it's not a copy paste solution. If something doesn't work, and it's easy to fix - please do it :)
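Two things the example above leaves out are loading latest_readings for readings that already exist, and answering the original question for all sensors at once. A rough sketch of both, assuming readings has the id, sensor_id, value and timestamp columns used in the question, and (a bigger assumption) that a higher readings.id always means a newer reading:

```sql
-- One-time backfill for rows inserted before the trigger existed.
-- Assumption: readings.id grows with time, so MAX(id) is the newest reading.
INSERT INTO latest_readings (sensor_id, reading_id)
SELECT sensor_id, MAX(id)
FROM readings
GROUP BY sensor_id
ON DUPLICATE KEY UPDATE reading_id = VALUES(reading_id);

-- All sensors with their latest reading.
-- LEFT JOINs keep sensors that have no readings yet.
SELECT s.*, r.value, r.timestamp
FROM sensors s
LEFT JOIN latest_readings latest ON latest.sensor_id = s.id
LEFT JOIN readings r ON r.id = latest.reading_id;
```

The second query only touches one readings row per sensor through the primary key, which is where the speed-up over grouping 6 million rows would come from. If ids do not follow timestamps in your data, the backfill needs a different rule for picking the newest row.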

Related

mysql trigger to keep track of how often a column was updated

I have a column "processed_at" on a table. This can get reset from multiple places in the code in order to indicate to a job that this row needs to be processed. I would like to find out how often processed_at is set to null.
What is the easiest way to do this? Ideally I would know how often this happens by row id, but it would also be ok to just know a number for all rows combined over a certain period.
Can this be done like this:
A trigger that reacts to the update and then stores id and reset timestamp to a separate table?
Would this have a noticeable effect on the performance of the original query?

Something like this:
create table mytable_resets (
    id serial primary key,
    mytable_id bigint unsigned not null,
    reset_at datetime not null
);
delimiter ;;
create trigger t after update on mytable
for each row begin
    -- log only real resets: the value was set before and is null now
    if NEW.processed_at is null and OLD.processed_at is not null then
        insert into mytable_resets values (default, NEW.id, NOW());
    end if;
end;;
delimiter ;
Yes, it will impact the performance of the original query.
The cost of database writes is roughly proportional to the number of indexes they update. If your query executes a trigger that inserts into another table, it adds another index update. In this case, that is the primary key index of the mytable_resets table.
But it shouldn't be significantly greater overhead than if your mytable table had one more index.
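Once the trigger has been collecting rows, both numbers asked for in the question can be read from mytable_resets. A small sketch, where the 7-day window is only an illustrative assumption:

```sql
-- Resets per row id over the last 7 days (the window is an arbitrary example)
SELECT mytable_id, COUNT(*) AS resets
FROM mytable_resets
WHERE reset_at >= NOW() - INTERVAL 7 DAY
GROUP BY mytable_id
ORDER BY resets DESC;

-- One combined number for all rows over the same period
SELECT COUNT(*) AS total_resets
FROM mytable_resets
WHERE reset_at >= NOW() - INTERVAL 7 DAY;
```

If these reports run often, an index on reset_at would probably help, at the price of one more index to maintain on each reset.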

How to update a row and insert one if it doesn't exist, without wrongly raising auto_increment [duplicate]

I have a table structure like this
When I insert a row into the table I'm using this query:
INSERT INTO table_blah ( material_item, ... hidden ) VALUES ( data, ... data ) ON DUPLICATE KEY UPDATE id = id, material_item = data, ... hidden = data;
When I first insert data without triggering ON DUPLICATE KEY, the id increments fine:
But when ON DUPLICATE KEY triggers and I then insert a new row, the id looks odd to me:
How can I keep the auto increment incrementing properly even when ON DUPLICATE KEY triggers?

This behavior is documented (paragraph in parentheses):
If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that
would cause a duplicate value in a UNIQUE index or PRIMARY KEY, MySQL
performs an UPDATE of the old row. For example, if column a is
declared as UNIQUE and contains the value 1, the following two
statements have similar effect:
INSERT INTO table (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=c+1;
UPDATE table SET c=c+1 WHERE a=1;
(The effects are not identical for
an InnoDB table where a is an auto-increment column. With an
auto-increment column, an INSERT statement increases the
auto-increment value but UPDATE does not.)
Here is a simple explanation. MySQL attempts to do the insert first. This is when the id gets auto incremented. Once incremented, it stays. Then the duplicate is detected and the update happens, but the auto-increment value that was reserved for the insert is never used, which leaves a gap.
You should not depend on auto_increment having no gaps. If that is a requirement, the overhead on the updates and inserts is much larger. Essentially, you need to put a lock on the entire table, and renumber everything that needs to be renumbered, typically using a trigger. A better solution is to calculate incremental values on output.
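A minimal sketch of calculating incremental values on output, assuming MySQL 8.0 or later (window functions are not available before that) and the id and material_item columns from the question:

```sql
-- Gap-free numbering computed at read time; the stored id keeps its gaps
SELECT
    ROW_NUMBER() OVER (ORDER BY id) AS seq,
    id,
    material_item
FROM table_blah
ORDER BY id;
```

The seq column is recomputed on every read, so it shifts when rows are deleted. It is suitable for display, not as a stable identifier.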

This question is a fairly old one, but I'm answering in case it helps someone. To solve the auto-increment problem, run the following code before the insert / on duplicate update part, and execute it all together:
SET @NEW_AI = (SELECT MAX(`the_id`)+1 FROM `table_blah`);
SET @ALTER_SQL = CONCAT('ALTER TABLE `table_blah` AUTO_INCREMENT =', @NEW_AI);
PREPARE NEWSQL FROM @ALTER_SQL;
EXECUTE NEWSQL;
Put together with the insert, it should look something like this:
SET @NEW_AI = (SELECT MAX(`the_id`)+1 FROM `table_blah`);
SET @ALTER_SQL = CONCAT('ALTER TABLE `table_blah` AUTO_INCREMENT =', @NEW_AI);
PREPARE NEWSQL FROM @ALTER_SQL;
EXECUTE NEWSQL;
INSERT INTO `table_blah` (`the_col`) VALUES("the_value")
ON DUPLICATE KEY UPDATE `the_col` = "the_value";

I had the same frustration with gaps in the auto increment, but I found a way to avoid it.
In terms of the previously discussed "overheads": when I first wrote my DB query code, it did so many separate queries that it took 5 hours. Once I put on
"ON DUPLICATE KEY UPDATE"
it got it down to about 50 seconds. Amazing! Anyway, the way I solved it was by using 2 queries, which doubles the time it takes to about 2 minutes, which is still fine.
First I ran an SQL query for writing all the data (updates and inserts), but I included "IGNORE" in this first query, so it just bypasses the updates and only inserts the new stuff. So, assuming your auto_increment previously had no gaps, it will still have no gaps, because only new records are written. I believe it is the updates that cause the gaps. So for inserts:
"INSERT IGNORE INTO mytablename(stuff,stuff2) VALUES "
Next I ran the "ON DUPLICATE KEY UPDATE" variation of that SQL query. It keeps the IDs intact, because all the records being updated have IDs already. The only thing it breaks is the auto_increment value, which gets incremented when a new record is added (or updated). So the solution is to patch this auto_increment value back to what it was before, once you have applied the updates.
To patch the auto increment value, use this SQL in your PHP:
"ALTER TABLE mytablename AUTO_INCREMENT = " . ($TableCount + 1);
This works because when you do the updates you are not increasing the number of records. Therefore we can use the table count to know what the next ID should be. You set $TableCount to the table's row count, then add 1, and that's the next auto increment number.
This is cheap and dirty, but it seems to work. It could be bad to use this while something else is writing to the DB, though.

Changing the database engine from InnoDB to MyISAM will resolve your issue.

I often deal with this by creating a temporary table, recording in the temporary table whether the record is new or not, doing an UPDATE only on the rows that are not new, and doing an INSERT with the new rows. Here's a complete example:
## THE SETUP
# This is the table we're trying to insert into
DROP TABLE IF EXISTS items;
CREATE TABLE items (
id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) UNIQUE,
price INT
);
# Put a few rows into the table
INSERT INTO items (name, price) VALUES
("Bike", 200),
("Basketball", 10),
("Fishing rod", 25)
;
## THE INSERT/UPDATE
# Create a temporary table to help with the update
DROP TEMPORARY TABLE IF EXISTS itemUpdates;
CREATE TEMPORARY TABLE itemUpdates (
name VARCHAR(100) UNIQUE,
price INT,
isNew BOOLEAN DEFAULT true
);
# Change the price of the Bike and Basketball and add a new Tent item
INSERT INTO itemUpdates (name, price) VALUES
("Bike", 150),
("Basketball", 8),
("Tent", 100)
;
# For items that already exist, set isNew false
UPDATE itemUpdates
JOIN items
ON items.name = itemUpdates.name
SET isNew = false;
# UPDATE the already-existing items
UPDATE items
JOIN itemUpdates
ON items.name = itemUpdates.name
SET items.price = itemUpdates.price
WHERE itemUpdates.isNew = false;
# INSERT the new items
INSERT INTO items (name, price)
SELECT name, price
FROM itemUpdates
WHERE itemUpdates.isNew = true;
# Check the results
SELECT * FROM items;
# Results:
# ID | Name | Price
# 1 | Bike | 150
# 2 | Basketball | 8
# 3 | Fishing rod | 25
# 4 | Tent | 100
The INSERT IGNORE INTO approach is simpler, but it ignores any error, which isn't what I want. And I agree that this is strange behavior on the part of MySQL but it's what we've got to work with.

I just thought I'd add this, as I was trying to find an answer to my own problem.
I could not stop the duplicate warning and found it was because I had the column set to TINYINT, which only allows values up to 127 when signed. Changing it to SMALLINT/MEDIUMINT/BIGINT allows for many more.

I don't think this is a problem with MySQL 5.6. See this example.

ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(id)

This is less of a direct answer and more of a fix for the end result.
If you don't use your auto-increment column as an identifier within your application (and you really shouldn't; a UUID or something of that nature is better practice), and of course if you don't have billions of rows, you can reset your auto-increment field fairly easily.
SET SQL_SAFE_UPDATES = 0;
SET @num := 0;
UPDATE my_table SET id = @num := (@num+1) ORDER BY id;
ALTER TABLE my_table AUTO_INCREMENT = 1;
I kinda hate that this is a thing when doing an INSERT UPDATE in MySQL.
This is not my code. I got it somewhere on SO, but it was so long ago...
Additional note: this is not really an answer to this issue. It's more to help fix an out-of-control auto-increment field.

INSERT INTO table_blah ( material_item, ... hidden ) VALUES ( data, ... data ) ON DUPLICATE KEY UPDATE material_item = data, ... hidden = data
Yes, remove the id = id, as the update is automatically applied to the row whose key matches.

Make sure the same ID is not assigned in multiple tables

I want to keep IDs unique across multiple tables. Let's say I have 4 tables with products. Each table has an ID column, but I want to prevent an ID from being used if that number already exists in one of the other tables. For example, if ID 43 exists in one table, I want it to be "denied" in the other tables, because it has already been created in that one.
Thanks!

Have a 5th table for doling out ids. There, the id can be AUTO_INCREMENT (if that is suitable).
1. Get a new number from the 5th table.
2. INSERT into the appropriate one of the 4 tables.
The two steps do not need to be in the same 'transaction'.
A sample sequence generator:
-- Setup:
CREATE TABLE `seq` (
`id` INT UNSIGNED NOT NULL,
`code` char(1) CHARACTER SET ascii NOT NULL,
PRIMARY KEY (`code`)
) ENGINE=InnoDB;
INSERT INTO seq VALUES (0, 'a'); -- The only row for the table
-- Increment and get next value:
UPDATE seq
SET id = LAST_INSERT_ID(id + 1)
WHERE code = 'a';
SELECT LAST_INSERT_ID();
Note: the UPDATE and SELECT can (should) be done outside any transaction; be in autocommit=ON mode. The SELECT is specific to the connection, so there is no chance of mixing up numbers with another connection.
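Putting the generator to use might look like the following sketch. The products_a table and its id and name columns are hypothetical stand-ins for one of the 4 product tables:

```sql
-- Step 1: reserve the next number from the seq table
UPDATE seq
SET id = LAST_INSERT_ID(id + 1)
WHERE code = 'a';

-- Step 2: use it in whichever product table applies.
-- products_a is a hypothetical table whose id is NOT auto_increment.
INSERT INTO products_a (id, name)
VALUES (LAST_INSERT_ID(), 'Widget');
```

Because every id comes from the single seq row, two product tables cannot be handed the same number, as long as all inserts go through these two steps.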

When you INSERT a new row, you should LOCK all four tables (for the insert), start a transaction, read the max ID of each table and take the largest, add 1 to it and INSERT the new row, then finish the transaction and unlock the tables.
This is a basic solution; you can try to optimize it by storing the last maximum ID somewhere. The table locking ensures that the ID will be unique when many concurrent threads (e.g. from PHP) insert rows.
You may be able to do this using triggers (so not on the application side but on the DB side).

sql server trigger for insert new record

I have a table called 'tblDive' with columns:
create table tblDive (
    DiveNumber int,
    InstructorNumber int,
    ClubNumber int,
    InstructorSigniture date
)
and another table:
create table tblWorksAt (
    InstructorNumber int,
    ClubNumber int,
    StartWorkingDate date,
    EndWorkingDate date
)
The table 'tblWorksAt' has this record:
InstructorNumber | ClubNumber | StartWorkingDate | EndWorkingDate
1                | 2          | 1.1.2000         | 1.1.2005
I want to create a trigger that checks, when inserting a new dive, whether the instructor really worked at this club at the time of signing the dive.
So, for example, if I insert a new dive:
insert into tblDive (DiveNumber, InstructorNumber, ClubNumber, InstructorSigniture)
values (111, 1, 2, '2009-01-01')
I shouldn't be able to insert this record, because instructor number 1 stopped working at club number 2 on 1.1.2005.

An alternative to using a trigger is to use a check constraint and a user defined function.
A function that checks whether the instructor is employed at the right club at the right time:
CREATE FUNCTION CheckEmployment(@InstructorNumber int, @ClubNumber int, @checkdate date)
RETURNS int
AS
BEGIN
    DECLARE @retval int
    SELECT @retval = COUNT(*)
    FROM tblWorksAt
    WHERE InstructorNumber = @InstructorNumber
    AND ClubNumber = @ClubNumber
    -- employment must have started by the check date...
    AND StartWorkingDate <= @checkdate
    -- ...and be open-ended or still running on it
    AND (EndWorkingDate IS NULL OR EndWorkingDate > @checkdate)
    RETURN @retval
END;
GO
And a check constraint using it:
ALTER TABLE tblDive
ADD CONSTRAINT chkEmployed
CHECK (dbo.CheckEmployment(InstructorNumber, ClubNumber, InstructorSigniture) != 0);
This might not be the most efficient way, but it should get the job done. The logic in the function might need improvement too, I might have missed something.
Sample SQL Fiddle showing it in action.

What I'll do is give you some hints and not-so-obvious information about triggers that may help you write the trigger, but you need to write it.
The inserted table
In SQL Server triggers, there are 2 pseudo-tables that you can reference: inserted and deleted tables. The names are somewhat deceiving, particularly if you are doing an UPDATE. The thing to remember is that under the hood, an UPDATE is a delete plus insert.
So essentially, inserted is the new (or updated) rows and deleted is the deleted rows and/or previous rows from an UPDATE before the changes were applied.
For a straight INSERT statement, the deleted table ought to be empty.
Therefore, you want to look for rows in inserted that meet a certain set of criteria. There are two logical ways to approach this:
Check if all rows meet this condition
Look for any rows that do not meet the condition.
Join
If you join inserted to tblWorksAt, you now have all the data you need. Something like this to join the tables and find rows that pass your business rules:
select 1
from inserted i
inner join tblWorksAt wa
    on wa.InstructorNumber = i.InstructorNumber and i.ClubNumber = wa.ClubNumber
where i.InstructorSigniture between wa.StartWorkingDate and wa.EndWorkingDate
What to do with the query
Like I said before:
Check if all rows meet this condition
Look for any rows that do not meet the condition.
To check if all rows pass that criteria, you could:
Check if the count of rows matching that query equals the number of total rows in inserted, without a join.
Loop over each row in inserted which I will tell you right now, is almost never a good idea in a trigger.
To check if at least one row fails to pass the criteria, you could:
Change the between to not between and wrap this query in if (exists(...)).
Change the inner join to a left outer join, move the where clause condition into the join, and then add a new WHERE clause that says tblWorksAt.InstructorNumber is null, then wrap this query in if (exists(...)).
Throwing an error
Now you know how to find rows that pass or fail. Now you just need to throw an error to prevent the statement from completing and prevent the data from persisting. I will leave that as an exercise to you. It should be easy to research.
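For readers who want to compare against their own attempt, here is one possible shape for such a trigger, built from the left outer join variant described above. It is a sketch, not a tested solution: the trigger name and error text are made up, the column names (including the InstructorSigniture spelling) come from the question, and THROW assumes SQL Server 2012 or later.

```sql
-- Hypothetical trigger name; column names follow the question's tables
CREATE TRIGGER trgDiveCheckEmployment ON tblDive
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Any inserted row without a matching employment period is a violation.
    -- Note: a NULL EndWorkingDate never satisfies BETWEEN, so open-ended
    -- employment would need an extra condition.
    IF EXISTS (
        SELECT 1
        FROM inserted i
        LEFT OUTER JOIN tblWorksAt wa
            ON wa.InstructorNumber = i.InstructorNumber
           AND wa.ClubNumber = i.ClubNumber
           AND i.InstructorSigniture BETWEEN wa.StartWorkingDate AND wa.EndWorkingDate
        WHERE wa.InstructorNumber IS NULL
    )
    BEGIN
        -- Undo the insert, then report why
        ROLLBACK TRANSACTION;
        THROW 50000, 'Instructor did not work at this club on the signature date.', 1;
    END
END;
```

Because the check is set-based, a multi-row INSERT is rejected as a whole if any one of its rows fails, which matches the advice above against looping over inserted.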

How to swap values of two rows in MySQL without violating unique constraint?

I have a "tasks" table with a priority column, which has a unique constraint.
I'm trying to swap the priority value of two rows, but I keep violating the constraint. I saw this statement somewhere in a similar situation, but it wasn't with MySQL.
UPDATE tasks
SET priority =
CASE
WHEN priority=2 THEN 3
WHEN priority=3 THEN 2
END
WHERE priority IN (2,3);
This will lead to the error:
Error Code: 1062. Duplicate entry '3' for key 'priority_UNIQUE'
Is it possible to accomplish this in MySQL without using bogus values and multiple queries?
EDIT:
Here's the table structure:
CREATE TABLE `tasks` (
`id` int(11) NOT NULL,
`name` varchar(200) DEFAULT NULL,
`priority` varchar(45) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `priority_UNIQUE` (`priority`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Is it possible to accomplish this in MySQL without using bogus values and multiple queries?
No. (none that I can think of).
The problem is how MySQL processes updates. Unlike other DBMSs that implement UPDATE properly, MySQL processes updates in a broken manner. It enforces checking of UNIQUE (and other) constraints after every single row update and not, as it should, after the whole UPDATE statement completes. That's why you don't have this issue with (most) other DBMSs.
For some updates (like increasing all or some ids, id=id+1), this can be solved by using - another non-standard feature - an ORDER BY in the update.
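A small sketch of that ORDER BY trick, assuming the goal is to shift every id in the tasks table up by one:

```sql
-- Process the highest id first, so each new value is already free
-- by the time MySQL checks the key for that row
UPDATE tasks
SET id = id + 1
ORDER BY id DESC;
```

Without the ORDER BY, MySQL may try to move id 1 to 2 while the row with id 2 still exists, and fail with a duplicate-key error.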
For swapping the values from two rows, that trick can't help. You'll have to use NULL or a bogus value (that doesn't exist but is allowed in your column) and 2 or 3 statements.
You could also temporarily remove the unique constraint but I don't think that's a good idea really.
So, if the unique column is a signed integer and there are no negative values, you can use 2 statements wrapped up in a transaction:
START TRANSACTION ;
UPDATE tasks
SET priority =
CASE
WHEN priority = 2 THEN -3
WHEN priority = 3 THEN -2
END
WHERE priority IN (2,3) ;
UPDATE tasks
SET priority = - priority
WHERE priority IN (-2,-3) ;
COMMIT ;
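The NULL variant mentioned earlier takes 3 statements. A sketch, assuming priority is nullable (as in the posted table structure) and using the primary key to address the two rows, so that other rows whose priority is already NULL are left alone. The @id_a and @id_b variables are only illustrative:

```sql
-- Remember which rows currently hold the two priorities
SELECT id INTO @id_a FROM tasks WHERE priority = 2;
SELECT id INTO @id_b FROM tasks WHERE priority = 3;

START TRANSACTION;
UPDATE tasks SET priority = NULL WHERE id = @id_a;  -- frees priority 2
UPDATE tasks SET priority = 2    WHERE id = @id_b;  -- frees priority 3
UPDATE tasks SET priority = 3    WHERE id = @id_a;
COMMIT;
```

This works because a UNIQUE index in MySQL allows multiple NULLs, so the intermediate state never violates the constraint.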

I bumped into the same issue. Had tried every possible single-statement query using CASE WHEN and TRANSACTION - no luck whatsoever. I came up with three alternative solutions. You need to decide which one makes more sense for your situation.
In my case, I'm processing a reorganized collection (array) of small objects returned from the front-end, new order is unpredictable (this is not a swap-two-items deal), and, on top of everything, change of order (usually made in English version) must propagate to 15 other languages.
1st method: Completely DELETE existing records and repopulate entire collection using the new data. Obviously this can work only if you're receiving from the front-end everything that you need to restore what you just deleted.
2nd method: This solution is similar to using bogus values. In my situation, my reordered collection also includes the original item position before it moved. Also, I had to preserve the original index value in some way while the UPDATEs are running. The trick was to manipulate bit 15 of the index column, which is UNSIGNED SMALLINT in my case. If you have a (signed) INT/SMALLINT data type, you can just negate the value of the index instead of using bitwise operations.
The first UPDATE must run only once per call. This query sets bit 15 (value 32768) of the current index fields (I have unsigned smallint). The lower 15 bits still reflect the original index value, which is never going to come close to the 32K range.
UPDATE *table* SET `index`=(`index` | 32768) WHERE *condition*;
Then iterate your collection extracting original and new index values, and UPDATE each record individually.
foreach( ... ) {
UPDATE *table* SET `index`=$newIndex WHERE *same_condition* AND `index`=($originalIndex | 32768);
}
This last UPDATE must also run only once per call. This query clears bit 15 of the index fields, effectively restoring the original index value for records where it hasn't changed, if any.
UPDATE *table* SET `index`=(`index` & 32767) WHERE *same_condition* AND `index` > 32767;
Third method would be to move relevant records into temporary table that doesn't have a primary key, UPDATE all indexes, then move all records back to first table.

Bogus value option:
Okay, so my query is similar and I've found a way to update in "one" query. My id column is PRIMARY and position is part of a UNIQUE group. This is my original query that doesn't work for swapping:
INSERT INTO `table` (`id`, `position`)
VALUES (1, 2), (2, 1)
ON DUPLICATE KEY UPDATE `position` = VALUES(`position`);
.. but position is an unsigned integer and it's never 0, so I changed the query to the following:
INSERT INTO `table` (`id`, `position`)
VALUES (2, 0), (1, 2), (2, 1)
ON DUPLICATE KEY UPDATE `position` = VALUES(`position`);
.. and now it works! Apparently, MySQL processes the value groups in order.
Perhaps this would work for you (not tested, and I know almost nothing about MySQL):
UPDATE tasks
SET priority =
CASE
WHEN priority=3 THEN 0
WHEN priority=2 THEN 3
WHEN priority=0 THEN 2
END
WHERE priority IN (2,3,0);
Good luck.

Had a similar problem.
I wanted to swap 2 ids that were unique AND were an FK from another table.
The fastest solution for me to swap two unique entries was:
Create a ghost entry in my FK table.
Go back to my table where I want to switch the ids.
Turn off the FK check: SET FOREIGN_KEY_CHECKS=0;
Set my first (A) id to the ghost (X) FK (frees A)
Set my second (B) id to A (frees B)
Set A to B (frees X)
Delete the ghost record and turn checks back on: SET FOREIGN_KEY_CHECKS=1;

Not sure if this would violate the constraints, but I have been trying to do something similar and eventually came up with this query by combining a few of the answers I found:
UPDATE tasks as T1,tasks as T2 SET T1.priority=T2.priority,T2.priority=T1.priority WHERE (T1.task_id,T2.task_id)=($T1_id, $T2_id)
The column I was swapping did not use a unique, so I am unsure if this will help...

You can swap your values with the update statement you mentioned above if you make a slight change to your key indexes.
CREATE TABLE `tasks` ( `id` int(11) NOT NULL, `name` varchar(200) DEFAULT NULL, `priority` varchar(45) DEFAULT NULL, PRIMARY KEY (`id`,`priority`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
This gives the table a primary key index that is a combination of id and priority. You can then swap the values.
UPDATE tasks
SET priority =
CASE
WHEN priority=2 THEN 3
WHEN priority=3 THEN 2
END
WHERE priority IN (2,3);
I don't see any need for user variables or temp variables here.
Hope this solves your issue :)
