I have a VIEW (view1) that returns random values from another table (Table2) based on values inside Table1.
A trigger is configured to UPDATE a third table (Table3), when values inside Table1 change. The purpose of Table3 is to hold the random values so they don’t get updated all the time.
The problem is that each SELECT statement inside the trigger causes the whole row to get new random values for every column being updated. If several columns are updated, the whole row is re-randomized that many times. I have adequate hardware, but it’s still too slow. Is there a way to reduce this, so that it selects once per row regardless of how many columns are being updated? Perhaps by holding the values somewhere temporarily and updating from that?
Here is a sample fiddle. In my real data I have significantly more columns.
Fiddle Example:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=4fd8bf89135d8babe8c19fc15a565d50
Currently I don’t have indexes on any columns. I’ve read mixed reviews re indexes and updates.
Lastly, while browsing Stack I found a few links to this: https://jan.kneschke.de/projects/mysql/order-by-rand/, but I’m not sure there is a way I can apply it.
Related
I’m having problems with a MySQL TRIGGER.
I have an employees table with some basic information. Then I have a VIEW that we call our finance_system. The finance_system VIEW holds a lot of information from many sources. A few columns in finance_system relate to an internal sales draw and return random values from other tables each time the VIEW is queried. Lastly, I have a table called EP1. Its job is to hold some of the data from finance_system, particularly some of the columns with random values, so that they don’t change with every query. The EP1 table gets updated via a TRIGGER that fires AFTER UPDATE on certain columns of the employees table. The trigger works and the EP1 table gets updated, but it’s updating the entire table rather than only the rows associated with the employee_id that was updated.
I read Stack’s policy prior to posting and understand that a reproducible example is necessary, so I created a Fiddle, but I’ve obfuscated the data and only included a few columns. Hopefully it’s enough, as I’ve exhausted Google trying to figure out why it’s not working.
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=942dbabba180df86b7ac9317bbcdd64b
Edit:
Simpler version: When Table1.status is updated, the trigger should update Table3.rand1 with the values from Table2.rand1 where the IDs match. Each ID has two rows in Table2 and Table3. My current problem is that when the trigger fires it updates all the rows in Table3, and I want it to update only the rows associated with the ID that was updated in Table1. A join won’t work in this scenario, as Table2 is actually a VIEW using rand() in my real data.
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=329e2dd3bf2fff39afc5388721c53ddc
Aimee
Check the trigger's UPDATE query: it is missing a WHERE clause, hence all rows of Table3 get updated instead of the ones matching the desired id.
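A minimal sketch of that fix, under stated assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL so the example is runnable, Table2 is a plain table rather than a rand() VIEW, and each ID has one row instead of two. The trigger name and sample values are made up for illustration. In MySQL the trigger would be declared `AFTER UPDATE ON Table1` (MySQL has no `UPDATE OF column` form), but the `WHERE ... = NEW.id` filter is the same idea.

```python
import sqlite3

# SQLite stands in for MySQL here; table names follow the question,
# the trigger name and sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (id INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE Table2 (id INTEGER, rand1 INTEGER);
CREATE TABLE Table3 (id INTEGER, rand1 INTEGER);

INSERT INTO Table1 VALUES (1, 'a'), (2, 'a');
INSERT INTO Table2 VALUES (1, 11), (2, 22);
INSERT INTO Table3 VALUES (1, 0), (2, 0);

-- The WHERE clause limits the update to rows for the id that just changed.
CREATE TRIGGER table1_status_update
AFTER UPDATE OF status ON Table1
FOR EACH ROW
BEGIN
    UPDATE Table3
    SET rand1 = (SELECT rand1 FROM Table2 WHERE Table2.id = NEW.id LIMIT 1)
    WHERE Table3.id = NEW.id;
END;
""")

# Change one row in Table1; only the matching Table3 row should change.
conn.execute("UPDATE Table1 SET status = 'b' WHERE id = 1")
rows = conn.execute("SELECT id, rand1 FROM Table3 ORDER BY id").fetchall()
print(rows)
```

Without the final `WHERE Table3.id = NEW.id`, every row of Table3 would be overwritten each time the trigger fires, which is the behaviour described in the question.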
Row-level locking is not working for this.
I just want to know if I can select the row like
select * from table where folder like '%344443%'
then update the row with
update table set folder = '{"bin":"44456","venv":4366}' where id = 'i-instanceid'
You can't.
The problem is that the UPDATE must scan the entire table to find the row(s) you need to change. In doing so, it locks the entire table.
Don't bury things that you want to search on inside JSON strings. Have them as indexed columns on their own. This should let you lock a single row, and run the Update much faster.
Or look into indexing parts of JSON columns. That feature is still evolving. What version of MySQL are you using?
Furthermore, why select the id first, then do the update? Can't you simply do the update? If you are actually doing something else in the "transaction", say so. You may need the SELECT. At that point, it would need to be SELECT ... FOR UPDATE.
I have created a table ("texts" table) for storing ocr text from scanned documents. The table now has 100,000 + records. It stores a separate record for each page in the document. I set up the table originally so it stored the documents' title and its location against each record, which was obviously bad design as the info was duplicated for many records. I have subsequently created a separate table which now only stores one record for each document ("documents" table). The original table still contains a record for each page in the document, but the only columns now are the ocr text and the id of the document record in the documents table.
The documents table has a column "total_pages". I am trying to update this value using the following query:
UPDATE documents SET total_pages=(SELECT Count(*) from texts where texts.docs_id=documents.id)
This just seems to take forever to execute and I have had to crash out of it on a couple of occasions. There are over 8000 records in the documents table.
I have tested the query by limiting it to just one document
UPDATE documents SET total_pages=(SELECT Count(*) from texts where texts.docs_id=documents.id and documents.id=1)
This works eventually with just one record, but it takes a very long time to execute. I am guessing that my full query needs a bit of optimization! Any help greatly appreciated.
This is your query:
UPDATE documents
    SET total_pages = (SELECT COUNT(*)
                       FROM texts
                       WHERE texts.docs_id = documents.id)
For performance, you want an index on texts(docs_id). That will probably fix your performance problem. In fact, it might make it unnecessary to store this value in the master table.
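As a rough illustration of the index suggestion, here is a small sketch that uses SQLite through Python's sqlite3 module as a stand-in for MySQL. The index name `idx_texts_docs_id`, the `ocr_text` column name, and the sample rows are assumptions; the `CREATE INDEX` statement itself has the same form in MySQL.

```python
import sqlite3

# SQLite stands in for MySQL; index name and sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (id INTEGER PRIMARY KEY, total_pages INTEGER);
CREATE TABLE texts (id INTEGER PRIMARY KEY, docs_id INTEGER, ocr_text TEXT);

-- The index lets the correlated subquery look up pages by document id
-- instead of scanning the whole texts table for every document.
CREATE INDEX idx_texts_docs_id ON texts (docs_id);
""")
conn.executemany("INSERT INTO documents (id) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO texts (docs_id, ocr_text) VALUES (?, ?)",
    [(1, "p1"), (1, "p2"), (2, "p1")],
)

# Same UPDATE as in the question.
conn.execute("""
UPDATE documents
SET total_pages = (SELECT COUNT(*) FROM texts WHERE texts.docs_id = documents.id)
""")
page_counts = conn.execute(
    "SELECT id, total_pages FROM documents ORDER BY id"
).fetchall()

# Ask the planner how it would run the per-document count.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM texts WHERE docs_id = 1"
).fetchall()
uses_index = any("idx_texts_docs_id" in str(row) for row in plan)
print(page_counts, uses_index)
```

In MySQL, `EXPLAIN` on the original statement plays the role of `EXPLAIN QUERY PLAN` here and should show the index being used for the subquery.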
If you do decide to store the count, be sure that you keep the value up-to-date. That would typically require a trigger to handle inserts and deletes (and perhaps updates, if docs_id changes).
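One way such triggers could look is sketched below, again with SQLite via Python's sqlite3 standing in for MySQL. The trigger names, the `ocr_text` column, and the sample rows are hypothetical, and `total_pages` is assumed to start at 0. MySQL would need `FOR EACH ROW` spelled out and has no `UPDATE OF column` form, but the three-trigger structure carries over.

```python
import sqlite3

# SQLite stands in for MySQL; trigger names and sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (id INTEGER PRIMARY KEY,
                        total_pages INTEGER NOT NULL DEFAULT 0);
CREATE TABLE texts (id INTEGER PRIMARY KEY, docs_id INTEGER, ocr_text TEXT);

-- A new page increments the owning document's count.
CREATE TRIGGER texts_after_insert AFTER INSERT ON texts
BEGIN
    UPDATE documents SET total_pages = total_pages + 1 WHERE id = NEW.docs_id;
END;

-- A removed page decrements it.
CREATE TRIGGER texts_after_delete AFTER DELETE ON texts
BEGIN
    UPDATE documents SET total_pages = total_pages - 1 WHERE id = OLD.docs_id;
END;

-- A page moved to another document adjusts both counts.
CREATE TRIGGER texts_after_move AFTER UPDATE OF docs_id ON texts
BEGIN
    UPDATE documents SET total_pages = total_pages - 1 WHERE id = OLD.docs_id;
    UPDATE documents SET total_pages = total_pages + 1 WHERE id = NEW.docs_id;
END;
""")
conn.executemany("INSERT INTO documents (id) VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO texts (docs_id, ocr_text) VALUES (?, ?)",
    [(1, "page 1"), (1, "page 2"), (2, "page 1")],
)
conn.execute("DELETE FROM texts WHERE docs_id = 2")
conn.execute("UPDATE texts SET docs_id = 2 WHERE ocr_text = 'page 2'")
totals = conn.execute(
    "SELECT id, total_pages FROM documents ORDER BY id"
).fetchall()
print(totals)
```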
I would like to store random numbers in one MySQL table, randomly retrieve one and insert it into a column of another table each time a new record is created. I want to delete the retrieved number from the random number table once it has been used.
The random numbers are three digits long, and there are 900 of them.
I have read several posts here that describe the problems using unique random numbers and triggering their insertion. I want to use this method as it seems to be reliable while generating few problems.
Can anyone here give me an example of an SQL query that will accomplish the above? (If an SQL query is not the recommended way to do this, please feel free to recommend a better method.)
Thank you for any help you can give.
I put together the two suggestions here and tried this trigger and query:
CREATE TRIGGER rand_num before
INSERT ON uau3h_users FOR EACH ROW
insert into uau3h_users (member_number)
select random_number from uau3h_rand900
where random_number not in (select member_number from uau3h_users)
order by random_number
limit 1
But it seems that there is already a trigger attached to that table, so the new one caused a conflict, and things stopped working until I removed it. Any ideas about how to accomplish the same thing using another method?
You are only dealing with 900 records, so performance is not a major issue.
If you are doing a single insert into a table, you can do something like the following:
insert into t(rand)
select rand
from rand900
where rand not in (select rand from t)
order by rand()
limit 1
In other words, you don't have to continually delete from one table and move to the other. You can just choose to insert values that don't already exist. If performance is a concern, then indexes will help in this case.
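The "insert a value that is not already used" idea can be tried out with the sketch below. It is an illustration under assumptions: SQLite via Python's sqlite3 stands in for MySQL, so `random()` replaces MySQL's `rand()`, and the helper `assign_random_number` plus the `UNIQUE` constraint on `t.rand` are additions for the demo rather than part of the answer above.

```python
import sqlite3

# SQLite stands in for MySQL: random() here corresponds to rand() in MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rand900 (rand INTEGER PRIMARY KEY);
-- UNIQUE is an extra safety net (and gives the NOT IN lookup an index).
CREATE TABLE t (id INTEGER PRIMARY KEY, rand INTEGER UNIQUE);
""")
# The 900 three-digit numbers: 100..999.
conn.executemany(
    "INSERT INTO rand900 (rand) VALUES (?)", [(n,) for n in range(100, 1000)]
)


def assign_random_number(connection):
    """Insert one new row in t holding a random, not-yet-used number.

    Hypothetical helper: when every number is taken, the SELECT returns
    no rows and nothing is inserted.
    """
    connection.execute("""
        INSERT INTO t (rand)
        SELECT rand FROM rand900
        WHERE rand NOT IN (SELECT rand FROM t)
        ORDER BY random()
        LIMIT 1
    """)


# Try one more time than there are numbers available.
for _ in range(901):
    assign_random_number(conn)

total_rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
distinct_used = conn.execute("SELECT COUNT(DISTINCT rand) FROM t").fetchone()[0]
print(total_rows, distinct_used)
```

Nothing is ever deleted from rand900; the pool of free numbers is simply whatever is not yet present in t.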
More than likely you need to take a look at triggers. With them you can, for instance, run additional statements after a record is inserted into a table. Refer to this link for more details.
http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
We're trying to figure out what the relative costs are between a couple of approaches.
We have a web page where people choose to add/keep/remove rows from a table, by marking them with checkboxes. (People can add new entries to the page as well as see existing ones.)
When posted to the web server the page loops over the entries and calls a stored procedure, passing in the state of the checkbox as one of the parameters.
The stored procedure currently calls a delete statement for each entry, followed by an insert if the checkbox is marked. This has the virtue of simplicity.
We're thinking instead of putting some if exists logic in there, to test whether the row is already in the table.
If so and the checkbox is marked, we'd leave it alone. Otherwise we'd insert it. Conversely, if the row isn't in the table and the checkbox is unmarked, we'd skip the delete and insert statements. This minimizes the number of deletes and such but at a cost of more logic.
In terms of load on the database, is one approach generally preferred to the other?
Is there a cost to calling delete statements that don't, in fact, affect any rows, as would be the case when adding new records? Is this worse than an if exists check?
The table is indexed on all relevant columns. I assume for posting 600,000 entries there would be a big advantage to checking beforehand, but the page in question will have 100 entries at most.
The biggest problem you're going to have with performance here is that you are calling a stored procedure for every entry - it really doesn't matter if inside that stored procedure you use DELETE/INSERT or check first, you're still going to have the overhead of 600K procedure calls, some potentially large portion of 600K logged transactions, etc.
I strongly recommend you look at table-valued parameters. Your C# or whatever can pass a set of 600K entries to a single stored procedure, once, and then you can perform two set-based operations (pseudo-code):
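-- Update rows that already exist in the target table.
UPDATE src SET val = t.val
  FROM dbo.tvp AS t
  INNER JOIN dbo.source AS src
    ON t.key = src.key;
-- Insert rows that are not in the target table yet.
INSERT dbo.source SELECT x FROM dbo.tvp AS t
  WHERE NOT EXISTS (SELECT 1 FROM dbo.source AS src WHERE src.key = t.key);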
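To make the set-based shape concrete, here is a runnable sketch with loud caveats: SQLite (via Python's sqlite3) has no table-valued parameters, so a temporary staging table named `tvp`, filled in one `executemany` call, plays that role. The column name `item_key` and the sample rows are hypothetical. The point is only that the whole batch is handled by two statements instead of one procedure call per entry.

```python
import sqlite3

# SQLite stands in for SQL Server; a temp table substitutes for the
# table-valued parameter. Column names and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (item_key INTEGER PRIMARY KEY, val TEXT);
CREATE TEMP TABLE tvp (item_key INTEGER PRIMARY KEY, val TEXT);
INSERT INTO source VALUES (1, 'old'), (2, 'old');
""")

# The posted entries arrive as one batch rather than row by row.
posted = [(2, "new"), (3, "new")]
conn.executemany("INSERT INTO tvp (item_key, val) VALUES (?, ?)", posted)

# Set-based step 1: update rows that already exist in source.
conn.execute("""
UPDATE source
SET val = (SELECT t.val FROM tvp AS t WHERE t.item_key = source.item_key)
WHERE EXISTS (SELECT 1 FROM tvp AS t WHERE t.item_key = source.item_key)
""")

# Set-based step 2: insert rows that are not in source yet.
conn.execute("""
INSERT INTO source (item_key, val)
SELECT t.item_key, t.val FROM tvp AS t
WHERE NOT EXISTS (SELECT 1 FROM source AS s WHERE s.item_key = t.item_key)
""")

result = conn.execute(
    "SELECT item_key, val FROM source ORDER BY item_key"
).fetchall()
print(result)
```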