Optimize MySQL update query performance

I have a table to which I need to add a column. The table has millions of records. For the existing records I have to update the column value, which will be different for each record. Running individual update queries will take a lot of time. Is there a way to achieve this with the minimum amount of locking time for the table?

In your case there are two ways:
If your additional column is a derived column (its values can be computed from existing columns), then a simple UPDATE query will be enough.
If it is not derived, write all the values to a file (probably with a script) and import that file as the new column in your table (or export the existing table, modify that file with the new column values, and then import it).
Of these two approaches, the import is faster and holds locks for less time.
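A rough sketch of both approaches, assuming a table named orders with primary key id (all names here are illustrative, not from the question):

```sql
-- Derived column: add the column, then fill it in a single UPDATE.
ALTER TABLE orders ADD COLUMN total_with_tax DECIMAL(10,2);
UPDATE orders SET total_with_tax = total * 1.2;

-- Non-derived column: load the per-row values into a staging table,
-- then fill the main table with one joined UPDATE.
ALTER TABLE orders ADD COLUMN external_code VARCHAR(64);

CREATE TABLE staging_codes (
    id            INT UNSIGNED NOT NULL PRIMARY KEY,
    external_code VARCHAR(64)  NOT NULL
);

LOAD DATA INFILE '/tmp/codes.csv'
INTO TABLE staging_codes
FIELDS TERMINATED BY ',';

UPDATE orders o
JOIN staging_codes s ON s.id = o.id
SET o.external_code = s.external_code;
```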
hope that helps :)

When the hash value cannot be generated in the database, the only thing you can do is run individual updates. You may get some performance improvements by batching several updates into one transaction.
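A minimal sketch of the batching idea, assuming the values are computed in application code and an illustrative table my_table(id, hash_col):

```sql
-- Group many single-row updates into one transaction so the commit
-- overhead is paid once per batch instead of once per row.
START TRANSACTION;
UPDATE my_table SET hash_col = 'a1b2c3' WHERE id = 1;
UPDATE my_table SET hash_col = 'd4e5f6' WHERE id = 2;
-- ... a few thousand rows per batch ...
COMMIT;

-- Alternatively, send a whole batch as one statement:
INSERT INTO my_table (id, hash_col) VALUES
    (1, 'a1b2c3'),
    (2, 'd4e5f6')
ON DUPLICATE KEY UPDATE hash_col = VALUES(hash_col);
```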

Related

Maintain data integrity and consistency when performing sql batch insert/update with unique columns

I have an Excel file that contains data from the database when downloaded. Each row is identified by an identifier called id_number. Users can add new rows to the file with a new unique id_number. When the file is uploaded, for each Excel row:
When the id_number exists in the database, an update is performed on the database row.
When the id_number does not exist in the database, an insert is performed.
Other than the Excel file, data can be added or updated individually through a page called report.php. Users use this page if they only want to add a single record for an employee, for example.
Ideally, I would like to do an INSERT ... ON DUPLICATE KEY UPDATE for maximum performance. I might also put all of them in a transaction. However, I believe this overall process has some flaws:
Before any adds/updates, validation checks have to be done on all Excel rows against their corresponding database rows. The reason is that there are many unique columns in the table, so I'll have to run some SELECT statements to ensure the data is valid before performing any add/update. Is this efficient on a table with 500 rows and 69 columns? I could probably fetch all the data, store it in a PHP array, and do the validation checks against the array, but what happens if someone adds a new row (with an id_number of 5) through report.php, and the Excel file I uploaded also contains a row with an id_number of 5? That could invalidate my checks, because I cannot be sure my data is up to date without performing a lot of SELECT statements.
Suppose the system is in the middle of a transaction adding/updating the data retrieved from the Excel file, and someone on report.php adds a row because all the validations have been satisfied (e.g. no duplicate id_numbers). Suppose at this point the next row to be added from the Excel file and the row that will be added by the user on report.php have the same id_number. What happens then? I don't have much knowledge of transactions; I think they at least prevent two queries from changing a row at the same time? Is that correct?
I don't really mind these kinds of situations that much. But some files have many rows and it might take a long time to process all of them.
One way I've thought of fixing this: while the Excel file upload is processing, prevent users of report.php from modifying the rows currently held by the Excel file. Is this fine?
What could be the best way to fix these problems? I am using MySQL.
If you really need to allow users to generate their own unique IDs, then you could lock the table in question while you're doing your validation and inserting.
If you acquire a write lock, then you can be certain the table isn't changed while you do your work of validation and inserting.
`mysql> LOCK TABLES tbl_name WRITE;`
don't forget to
`mysql> UNLOCK TABLES;`
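A rough sketch of the full sequence, using an illustrative employees table (the names are assumptions, not from the question). Note that once you issue LOCK TABLES, the session should only touch the tables it has locked:

```sql
LOCK TABLES employees WRITE;

-- Validate: check whether any of the incoming id_numbers already exist.
SELECT id_number FROM employees WHERE id_number IN (5, 6);

-- If validation passes, insert the new rows.
INSERT INTO employees (id_number, name) VALUES
    (5, 'Alice'),
    (6, 'Bob');

UNLOCK TABLES;
```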
The downside of locking is obvious: the table is locked. If it is a high-traffic table, then all your traffic is waiting, and that could lead to all kinds of pain (MySQL running out of connections would be a common one).
That said, I would suggest a different path altogether: let MySQL be the only one that generates the unique ID. That is, make sure the database table has an auto_increment unique id (primary key) and have new records in the spreadsheet entered without a unique id given. Then MySQL will ensure that the new records get a unique id, and you don't have to worry about locking and can validate and insert without fear of a collision.
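A sketch of that approach (the schema and column names are assumptions, not from the question):

```sql
-- Let MySQL hand out the identifier.
CREATE TABLE employees (
    id_number INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name      VARCHAR(100) NOT NULL,
    email     VARCHAR(100) NOT NULL,
    PRIMARY KEY (id_number),
    UNIQUE KEY uq_email (email)
);

-- Rows that already carry an id_number can be upserted...
INSERT INTO employees (id_number, name, email)
VALUES (12, 'Alice', 'alice@example.com')
ON DUPLICATE KEY UPDATE name = VALUES(name), email = VALUES(email);

-- ...while new spreadsheet rows are inserted without one,
-- and MySQL assigns the next id_number itself.
INSERT INTO employees (name, email)
VALUES ('Bob', 'bob@example.com');
```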
Regarding performance with a 500-row, 69-column table, I can only say that if the PHP server and the MySQL server are reasonably sized and the columns aren't large data types, then this amount of data should be readily handled in a fraction of a second. That said, performance can be sabotaged by one bad line of code, so if your code is slow, treat that as a separate optimisation problem.

Inserting 1 million records in MySQL

I have two tables, and in both tables I receive 1 million records. I use a cron job every night to insert the records. In the first table I truncate the table first and then insert the records; in the second table I update and insert records according to the primary key. I am using MySQL as my database. My problem is that I need to do this task every day, but I am unable to insert all the data in time. What could be a possible solution to this problem?
It is important to switch off all the extra actions and checks MySQL wants to perform when loading data, like autocommit, index maintenance, etc.
https://dev.mysql.com/doc/refman/5.7/en/optimizing-innodb-bulk-data-loading.html
If you do not do this, MySQL does a lot of work after every record is added, and that work adds up as the process proceeds, resulting in very slow processing and importing towards the end; the job may not complete within one day.
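Following that documentation page, a typical bulk-load session looks roughly like this (the settings are restored once the load has finished):

```sql
-- Pay the commit and check costs once per batch instead of once per row.
SET autocommit = 0;
SET unique_checks = 0;
SET foreign_key_checks = 0;

-- ... run the bulk INSERTs / LOAD DATA here ...

COMMIT;
SET foreign_key_checks = 1;
SET unique_checks = 1;
SET autocommit = 1;
```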
If you must use MySQL: for the first table, disable the indexes, do the inserts, then re-enable the indexes. This will work faster.
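For example (the table name is illustrative; ALTER TABLE ... DISABLE KEYS only skips maintenance of non-unique indexes and is effective for MyISAM tables, InnoDB ignores it):

```sql
TRUNCATE TABLE staging_table;

-- Stop maintaining non-unique indexes during the load.
ALTER TABLE staging_table DISABLE KEYS;

-- ... bulk INSERTs / LOAD DATA INFILE ...

-- Rebuild the indexes once, after all rows are in.
ALTER TABLE staging_table ENABLE KEYS;
```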
Alternatively, MongoDB may be faster for this kind of bulk load, and Redis is very fast.

Will MySQL table need re-indexing when new records are inserted?

Do I need to re-index the MySQL table if I insert new records into it? Table has existing indexes.
No, because indexes are updated automatically on any change.
MySQL won't need to re-index the entire table when you insert a new record. It will however need to create an index entry for your newly inserted record.
There is therefore a slight performance impact in creating indexes on tables, in that inserts will take slightly longer. In most practical situations this is acceptable given the big performance improvement you get on reading data from indexed tables.
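As a small illustration (names are made up): once an index exists, every insert maintains it as part of the statement itself; there is no separate re-index step. At most you might refresh the optimizer statistics after a very large batch:

```sql
CREATE TABLE users (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(100) NOT NULL,
    KEY idx_email (email)
);

-- The idx_email entry for this row is written as part of the INSERT.
INSERT INTO users (email) VALUES ('alice@example.com');

-- Optional: refresh index statistics after a large batch of inserts.
ANALYZE TABLE users;
```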
There is more information available at https://dev.mysql.com/doc/refman/5.0/en/insert-speed.html

MySQL Delete Performance

I'm deleting rows using a crontab entry set to run every hour. For performance and less fragmentation, what is the best way to do this?
Also, should I run optimize table after the delete has finished?
The answer will depend on your data and how many rows you're deleting at a time.
If possible, delete the rows with a single query (rather than one query per row). For example:
DELETE FROM my_table WHERE status = 'rejected';
If possible, use an indexed column in your WHERE clause. This will help it select the rows that need to be deleted without doing a full table scan.
If you want to delete all the data, use TRUNCATE TABLE.
If deleting the data with a single query is causing performance problems, you could try limiting how many rows it deletes (by adding a LIMIT clause) and running the delete process more frequently. This would spread the deletes out over time.
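A sketch of that batched approach (table, column, and batch size are illustrative): run the statement repeatedly, for example from the hourly cron job, until it affects zero rows:

```sql
-- Delete in small batches so each statement holds locks only briefly.
DELETE FROM my_table
WHERE status = 'rejected'
LIMIT 10000;
```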
Per the documentation, OPTIMIZE TABLE should be used if you have deleted a large part of a table or if you have made many changes to a table with variable-length rows (tables that have VARCHAR, VARBINARY, BLOB, or TEXT columns).
Optimizing the table can be very expensive. If you can, try deleting your data and optimizing the table once per day (at night). This will limit any impact to your users.
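If you do run it, the nightly maintenance is a single statement (note that for InnoDB tables, OPTIMIZE TABLE is mapped to ALTER TABLE ... FORCE, which rebuilds the table and can take a while on large tables):

```sql
OPTIMIZE TABLE my_table;
```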

MySQL - transfer a lot of records from one table to another

I have a large table (~50M records) and I want to move the records from this table to a different table that has the same structure (the new table has one extra index).
I'm using INSERT IGNORE INTO ... to copy the records.
What's the fastest way to do this? Is it by copying small chunks (let's say 1M records) or bigger chunks?
Is there any way I could speed up the process?
Before performing the insert, disable the indexes (DISABLE KEYS) on the destination table, if you can.
A reference can be found here.
Also, if you are not using transactions or relations, you might consider switching to the MyISAM engine.
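A sketch of a chunked transfer driven by the primary key (table names, id column, and chunk size are assumptions; adjust to your schema):

```sql
-- Optional: skip non-unique index maintenance on the destination
-- during the load (effective for MyISAM; InnoDB ignores DISABLE KEYS).
ALTER TABLE new_table DISABLE KEYS;

-- Copy rows in primary-key ranges so each statement stays small.
INSERT IGNORE INTO new_table
SELECT * FROM old_table WHERE id >= 1       AND id < 1000001;

INSERT IGNORE INTO new_table
SELECT * FROM old_table WHERE id >= 1000001 AND id < 2000001;

-- ... continue in 1M-row ranges until the whole table is copied ...

ALTER TABLE new_table ENABLE KEYS;
```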