I'm using a third-party ETL application (Pentaho/Kettle/Spoon), so unfortunately I'm not sure of the exact SQL query, but I can try different manual queries.
I'm just wondering why ... MySQL seems to allow multiple processes at once to run "insert, but if found, update" queries.
MS SQL does not ... it locks the rows while one query is doing an insert/update, and throws an error if another query tries to insert/update over the same data.
I guess this makes sense ... but I'm just a bit annoyed that MySQL allows this, and MS SQL does not.
Is there any way to get around this?
I just want the fastest way possible to insert/update a list of 1000 records into a data table. In the past I just divided the 1000 records into 20 processes of 50 records each doing insert/updates ... this worked in parallel because none of the 1000 records duplicate each other ... only some of them duplicate rows already in the table ... so they can be inserted/updated in any order, so long as it happens.
Any thoughts? Thanks
Older versions of MySQL used the MyISAM storage engine by default, which does not support transactions. SQL Server supports transactions, as you've observed, though you can tweak the isolation levels to do risky things like READ UNCOMMITTED (very rarely a good idea).
If you want your MySQL tables to have transaction support, you need to explicitly create them with the option ENGINE=InnoDB. Older versions also support ENGINE=BDB, which is the Berkeley Database engine. See the MySQL docs for more details on InnoDB:
http://dev.mysql.com/doc/refman/5.7/en/innodb-storage-engine.html
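For example, a minimal sketch (the table and column names here are made up for illustration):

CREATE TABLE accounts (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    balance DECIMAL(10,2) NOT NULL DEFAULT 0.00
) ENGINE=InnoDB;

-- an existing MyISAM table can also be converted in place
ALTER TABLE accounts ENGINE=InnoDB;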
In our application, we are using SELECT FOR UPDATE statement to ensure locking for our entities from other threads. One of our original architects who implemented this logic put a comment in our wiki that MySQL has a limit of 200 for select for update statements. I could not find anything like this anywhere on the internet. Does anyone know if this is true and if so is there any way we can increase the limit?
The primary reason SELECT FOR UPDATE is used is concurrency control, for the case when two users are trying to access the same data at the same time. If both of them try to update that data, there can be a serious problem in the database.
In some database systems this can affect database integrity in a serious way. To help prevent concurrency problems, database management systems like SQL Server and MySQL use locking in most cases to prevent serious data integrity problems from occurring.
These locks delay the execution of a transaction if it conflicts with a transaction that is already running.
In SQL Server and MySQL, the locks taken by SELECT FOR UPDATE are held until the transaction is committed or rolled back.
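A rough sketch of the pattern (hypothetical table and id; the row lock is held until COMMIT or ROLLBACK):

START TRANSACTION;

SELECT balance FROM accounts WHERE id = 42 FOR UPDATE;  -- other sessions now block on this row

UPDATE accounts SET balance = balance - 100 WHERE id = 42;

COMMIT;  -- releases the lock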
In MySQL Cluster (NDB), however, transaction records are allocated to individual MySQL servers, and the cluster requires a minimum total number of transaction records.
MySQL works this out with the following formula:
TotalNoOfConcurrentTransactions = (maximum number of tables accessed in any single transaction + 1) * number of SQL nodes.
Each data node must be able to handle TotalNoOfConcurrentTransactions / number of data nodes; in the documentation's example the NDB Cluster has 4 data nodes.
The result of the above formula is therefore expressed as MaxNoOfConcurrentTransactions = TotalNoOfConcurrentTransactions / 4.
The MySQL documentation gives an example with 10 SQL nodes and transactions accessing up to 10 tables (11 transaction records each), which resulted in 275 as MaxNoOfConcurrentTransactions.
LIMIT in SELECT FOR UPDATE is possibly used to cap the number of rows affected during the update.
I am not sure, but your architects probably took the 200 figure from the numbers above in the MySQL documentation.
Please check the link below for more information.
https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster-ndbd-definition.html#ndbparam-ndbd-maxnoofconcurrentoperations
I have a quick question that I can't seem to find an answer to online; I'm not sure if I'm using the right wording or not.
Do MySQL databases automatically synchronize queries coming in at around the same time? For example, if I send a query to insert something into a database at the same time another connection sends a query to select something from the database, does MySQL automatically lock the database while the insert is happening, and then unlock it when it's done, allowing the select query to access it?
Thanks
Do MySQL databases automatically synchronize queries coming in at around the same time?
Yes.
Think of it this way: there's no such thing as simultaneous queries. MySQL always carries out one of them first, then the second one. (This isn't exactly true; the server is far more complex than that. But it robustly provides the illusion of sequential queries to us users.)
If, from one connection you issue a single INSERT query or a single UPDATE query, and from another connection you issue a SELECT, your SELECT will get consistent results. Those results will reflect the state of data either before or after the change, depending on which query went first.
You can even do stuff like this (read-modify-write operations) and maintain consistency.
UPDATE table
SET update_count = update_count + 1,
update_time = NOW()
WHERE id = something
If you must do several INSERT or UPDATE operations as if they were one, you'll need to use the InnoDB engine, and you'll need to use transactions. The transaction will block SELECT operations while it is in progress. Teaching you to use transactions is beyond the scope of a Stack Overflow answer.
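A bare-bones sketch of what that looks like (table names are made up here; this is not a transactions tutorial):

START TRANSACTION;

INSERT INTO orders (customer_id, total) VALUES (7, 99.95);
UPDATE customers SET order_count = order_count + 1 WHERE id = 7;

COMMIT;  -- or ROLLBACK; to discard both changes as if they never happened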
The key to understanding how a modern database engine like InnoDB works is Multi-Version Concurrency Control or MVCC. This is how simultaneous operations can run in parallel and then get reconciled into a consistent "view" of the database when fully committed.
If you've ever used Git you know how you can have several updates to the same base happening in parallel but so long as they can all cleanly merge together there's no conflict. The database works like that as well, where you can begin a transaction, apply a bunch of operations, and commit it. Should those apply without conflict the commit is successful. If there's trouble the transaction is rolled back as if it never happened.
This ability to juggle multiple operations simultaneously is what makes a transaction-capable database engine really powerful. It's an important component necessary to meet the ACID standard.
MyISAM, the original default engine from the MySQL 3.x era, doesn't have any of these features and locks the whole table on any INSERT operation to avoid conflict. It works like you thought it did.
When creating a database in MySQL you have your choice of engine, but using InnoDB should be your default. There's really no reason at all to use MyISAM as any of the interesting features of that engine (e.g. full-text indexes) have been ported over to InnoDB.
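If you're not sure what a given table uses, something like this shows the engine for every table in the current schema:

SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE();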
I am currently trying to figure out why the site I am working on (Laravel 4.2 framework) is really slow at times, and I think it has to do with my database setup. I am not a pro at all, so I would assume that's where the problem is.
My sessions table has roughly 2.2 million records in it, when I run show processlist;, all the queries that take the longest relate to that table.
Here is a picture for example: [screenshot of the SHOW PROCESSLIST output and the sessions table structure]
Surely I am doing something wrong, or it's not indexed properly? I'm not sure; I'm not fantastic with databases.
We don't see the complete SQL being executed, so we can't recommend appropriate indexes. But if the only predicate on the DELETE statements is on the last_activity column i.e.
DELETE FROM `sessions` WHERE last_activity <= 'somevalue' ;
Then performance of the DELETE statement will likely be improved by adding an index with last_activity as its leading column, e.g.
CREATE INDEX sessions_IX1 ON sessions (last_activity);
Also, if this table is using MyISAM storage engine, then DML statements cannot execute concurrently; DML statements will block while waiting to obtain exclusive lock on the table. The InnoDB storage engine uses row level locking, so some DML operations can be concurrent. (InnoDB doesn't eliminate lock contention, but locks will be on rows and index blocks, rather than on the entire table.)
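If the table does turn out to be MyISAM, converting it is a one-liner, though it rewrites the whole table, so on 2.2 million rows it's best run during a quiet period:

ALTER TABLE sessions ENGINE=InnoDB;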
Also consider using a different storage mechanism (other than MySQL database) for storing and retrieving info for web server "sessions".
Also, is it necessary (is there some requirement) to persist 2.2 million "sessions" rows? Are we sure that all of those rows are actually needed? If some of that data is historical, and isn't specifically needed to support the current web server sessions, we might consider moving the historical data to another table.
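A rough sketch of that kind of archival (the archive table name and cutoff value are made up; adjust them to whatever actually defines "historical" here):

CREATE TABLE sessions_archive LIKE sessions;

INSERT INTO sessions_archive
SELECT * FROM sessions WHERE last_activity <= '2015-01-01 00:00:00';

DELETE FROM sessions WHERE last_activity <= '2015-01-01 00:00:00';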
Our server database is MySQL 5.1.
We have 754 tables in our DB. We create a table for each project, hence the large number of tables.
For the past week I have noticed a very long delay in inserts and updates to any table. If I create a new table and insert into it, it takes one minute to insert around 300 records.
Our test database on the same server has 597 tables, and the same insertion is very fast in the test DB.
The default engine is MyISAM, but we have a few tables in InnoDB.
There were a few triggers running. After I deleted the triggers it became somewhat faster, but it is still not fast enough.
Use EXPLAIN (or its synonym DESCRIBE) to see your query execution plans.
Look more at http://dev.mysql.com/doc/refman/5.1/en/explain.html for its usage.
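For example, prefixing one of the slow statements with EXPLAIN (table and column names made up here) shows whether an index is being used at all:

EXPLAIN SELECT * FROM project_table WHERE project_id = 123;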
As @swapnesh mentions, the DESCRIBE command is very useful for performance debugging.
You can also check your installation for issues using:
https://raw.github.com/rackerhacker/MySQLTuner-perl/master/mysqltuner.pl
You use it like this:
wget https://raw.github.com/rackerhacker/MySQLTuner-perl/master/mysqltuner.pl
chmod +x mysqltuner.pl
./mysqltuner.pl
Of course, here I am assuming that you run some kind of a Unix based system.
You can use OPTIMIZE TABLE. According to the manual it does the following:
Reorganizes the physical storage of table data and associated index
data, to reduce storage space and improve I/O efficiency when
accessing the table. The exact changes made to each table depend on
the storage engine used by that table
The syntax is:
OPTIMIZE TABLE tablename
Inserts are typically faster when made in bulk rather than one by one. Try inserting 10, 30, or 100 records per statement.
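For example, a single multi-row INSERT (hypothetical table and columns) instead of three single-row statements:

INSERT INTO records (name, value)
VALUES ('a', 1),
       ('b', 2),
       ('c', 3);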
If you use jdbc you may be able to achieve the same effect with batching, without changing the SQL.
I have three large MySQL tables. They are approaching 2 million records. Two of the tables are InnoDB and are currently around 500 MB in size. The other table is MyISAM and is about 2.5 GB.
We run an import script from FileMaker to insert and update records in these tables but lately it has become very slow - only inserting a few hundred records per hour.
What can I do to increase performance to make inserts and updates happen faster?
For INSERT it could have to do with the indexes you have defined on the tables (they have to be updated after each INSERT). Could you post more information about them? And are there triggers set on the tables?
For UPDATE it is a different story: it could be that it is not the record update itself that is slow, but finding the record. Could you try to change the UPDATE into a SELECT and see if it is still slow? If yes, then you should investigate your indexes.
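For example, if the slow statement looks like the first line below (made-up names), time just the lookup part as a SELECT:

UPDATE big_table SET status = 'done' WHERE created_at < '2012-01-01';

SELECT COUNT(*) FROM big_table WHERE created_at < '2012-01-01';

If the SELECT is also slow, the WHERE clause probably needs an index.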
For the Innodb table, if it's an acceptable risk, I'd consider changing the innodb_flush_log_at_trx_commit level. Some more details in this blog post, along with some more Innodb tuning pointers.
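If that trade-off is acceptable (up to roughly a second of committed transactions can be lost if the host crashes), the setting can be changed at runtime:

SET GLOBAL innodb_flush_log_at_trx_commit = 2;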
For both engines, batching INSERTs together can speed things up to a point. See doc.
What version of MySQL are you running? There have been many improvements with the new InnoDB "Plugin" engine and concurrency of operations on servers with multiple processors.
Is the query slow when executed on MySQL from the command line?
If you're using the Execute SQL Script step from FileMaker, that connects and disconnects after every call, causing major slowdowns when executing large numbers of queries. We've had clients switch to our JDBC plugin (self-promotion disclaimer here) to avoid this, resulting in major speedups.
It turns out the reason for the slowness was from the FileMaker side of things. Exporting the FileMaker records to a CSV and running INSERT/UPDATE commands resulted in very fast execution.