Delete all data in a huge MySQL table

I have a large database and I need to empty one table that contains a lot of data, millions of rows. But whether I delete with a SQL statement, through the phpMyAdmin interface, or by dropping the table, I still see some data after refreshing. How can I completely clear all the data in the table?

The easiest way to ensure a table is completely wiped is the TRUNCATE TABLE table_name statement. If you still see data in the table after that, something is continually inserting new rows.
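A minimal sketch (the table name my_big_table is a placeholder):

-- removes every row and resets AUTO_INCREMENT; much faster than DELETE FROM on millions of rows
TRUNCATE TABLE my_big_table;

-- confirm the table is actually empty
SELECT COUNT(*) FROM my_big_table;

Note that TRUNCATE TABLE fails if other tables reference this one via foreign keys, and it cannot be rolled back. If rows reappear afterwards, look for an application or scheduled job that keeps inserting them.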


Why not delete the old row and insert an updated row?

I have a MySQL table in which some rows need to be updated when a user requests it.
I know the right way is to use a SQL UPDATE statement, and I am not asking 'Which is faster, delete and insert or just update?'. But since writing the update code for my table takes more effort (because of the table's relations), why shouldn't I just delete the old row and insert the updated one?
Yes, you can delete and insert, but what keeps the record in your database if the program crashes a moment before it can insert the new data?
UPDATE prevents that from happening. It keeps the row in your database and changes only the values that need to change. It may be more complicated to use with your schema, but you can be certain your record stays safe.
I finally found the answer.
In an RDBMS there are relations between records, and one record may have dependencies. In such situations you cannot delete and insert a new record, because a foreign key constraint would cause data loss: records that depend on the main record (e.g. a user's posts depending on the user record) would be deleted along with it.
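For illustration, a small sketch with hypothetical users and posts tables; the ON DELETE CASCADE constraint means that deleting the parent row also wipes its dependents:

CREATE TABLE users (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50)
);
CREATE TABLE posts (
  id INT AUTO_INCREMENT PRIMARY KEY,
  user_id INT NOT NULL,
  body TEXT,
  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);

-- "delete and re-insert" on the parent silently removes the user's posts...
DELETE FROM users WHERE id = 42;
-- ...while an UPDATE keeps the row and its dependents intact
UPDATE users SET name = 'new name' WHERE id = 42;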
If there are situations where you have no record dependencies (not as an exception, but by the nature of your data model, as in many NoSQL stores) and you have problems updating a record (e.g. file checking), you can use this delete-and-insert approach.

Insert data to table with existing data, then delete data that was not touched during session

I am running a bunch of INSERT ... ON DUPLICATE KEY UPDATE statements against a table that is already filled with data.
I need to fill the table with that data AND remove the data that I did not touch (that is, remove the rows that were not mentioned in my INSERTs).
What I tried, and what worked:
Create a new timestamp column in the table.
During the INSERTs, insert or update this column with CURRENT_TIMESTAMP, so that all the rows I touched have the newest timestamps.
Run a delete query that removes all rows older than the starting time of my script.
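A sketch of that approach (the table items, its columns, and the @script_start variable are placeholders):

ALTER TABLE items ADD COLUMN touched_at TIMESTAMP NULL;

SET @script_start = NOW();

-- every row written during this run gets a fresh timestamp
INSERT INTO items (id, value, touched_at)
VALUES (1, 'foo', CURRENT_TIMESTAMP)
ON DUPLICATE KEY UPDATE value = VALUES(value), touched_at = CURRENT_TIMESTAMP;

-- anything not touched during this run is stale and gets removed
DELETE FROM items WHERE touched_at IS NULL OR touched_at < @script_start;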
This idea works perfectly, but there is one problem: my replication binary log gets filled with unnecessary data in both modes (ROW and STATEMENT). I don't need those timestamps to be replicated at all.
I don't want to run TRUNCATE TABLE before the inserts, because my app has to deliver non-stop access to the data (old or new). If I use TRUNCATE TABLE, the table can sit without data for some time.
I could also save all the primary key values that I insert, in the script's memory or in a temporary table, and then delete the rows that are not in that set, but I hope there is a more optimized and clever way to do it.
Do you have any idea how I can achieve this goal, so that I can update the data, delete only the untouched rows, and replicate only the real changes (in ROW mode, I guess)?
I'm not very familiar with replication binary logs, so apologies in advance if this won't work; I assumed that logging can be configured differently per table.
I would do the following:
create a table for the new data, with the same primary key column as the old table
delete all rows from the old table that are not found in the new table
update rows in the old table according to the new table
This way there would be no unnecessary inserts in the log.
This assumes that you have the required space on the server, but it can work.
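A rough sketch of that sequence, assuming (hypothetically) that the old table is items with primary key id and that the fresh data set is loaded into items_new:

CREATE TABLE items_new LIKE items;
-- ... load the new data set into items_new here (plain INSERTs, LOAD DATA, etc.) ...

-- delete rows from the old table that are not present in the new data
DELETE o FROM items AS o
LEFT JOIN items_new AS n ON n.id = o.id
WHERE n.id IS NULL;

-- update existing rows (and add any new ones) from the new data
INSERT INTO items (id, value)
SELECT id, value FROM items_new
ON DUPLICATE KEY UPDATE value = VALUES(value);

DROP TABLE items_new;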

Create a view or new table for caching records

I'm experiencing a huge performance problem in a legacy application.
There is a search form where users can search for records containing a given value.
A result row contains 10 columns, and a stored procedure (SP) returns every row that contains the value in any of those columns.
The SP uses 8 tables, some of which have about a million records, and a new record arrives every minute. The SP also handles paging.
Executing this SP sometimes takes around 40 seconds.
What I did was create a new table and fill it with all the records, using the query from this SP but without the search conditions.
When there is an insert or update in one of the source tables, a trigger updates this new "cache" table.
Getting results from this new table now takes only 1-3 seconds.
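For illustration, such a trigger might look roughly like this in SQL Server (the names table1, cache_table, and the columns are placeholders):

CREATE TRIGGER trg_table1_cache ON table1
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- drop the affected rows from the cache, then re-copy their current state
    DELETE c
    FROM cache_table AS c
    JOIN inserted AS i ON i.PrimaryKeyTable1 = c.PrimaryKeyTable1;

    INSERT INTO cache_table (PrimaryKeyTable1, Field1, Field2)
    SELECT i.PrimaryKeyTable1, i.Field1, i.Field2
    FROM inserted AS i;
END

A matching AFTER DELETE trigger would remove the corresponding cache rows.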
Does anyone have experience with something like this?
One of my colleagues said I would be better off using a view, but then the JOINs would run on every query.
What do you think? Is there another way?
Often times temporary tables can help you resolve performance issues. One approach might be to collect only the records that you need to consider into temporary tables and then create your final select statement from the temporary tables joined to any other tables that you're not filtering.
As an example, let's say one of the fields you are searching for is field1 in table1. Start by inserting into table #table1 only records that have the value of field1 you are looking for:
select PrimaryKeyTable1, Field1, Field2, Field3, etc...
into #table1
from table1
where Field1 = 'Whatever you are looking for'
This should be pretty fast even for big tables, especially if you have an index on Field1. Do this for every table that has search fields, so that you collect all the records relevant to your search.
Then you also need to be sure to insert any records into your temporary tables that might have foreign key references to any of your other temporary tables. So let's say you also built a table #table2 with the above method that has a foreign key to table1 called PrimaryKeyTable1. You would insert those records like:
Insert into #table1
(PrimaryKeyTable1, Field1, Field2, Field3, etc...)
select table1.PrimaryKeyTable1, table1.Field1, table1.Field2, table1.Field3, etc...
from table1
join #table2
on table1.PrimaryKeyTable1 = #table2.PrimaryKeyTable1
where table1.PrimaryKeyTable1 not in
(Select PrimaryKeyTable1 from #table1)
Now #table1 will also contain any records that match a record in #table2 that met the search criteria. Do this for all your temporary tables that have relevant foreign keys. The order of the inserts matters: make sure a temporary table is not referenced until after the last insert statement that collects its foreign-key-referenced records.
Then you can simply do your final select statement, replacing the actual tables with the temporary tables you have built and eliminating all the filters that search your field data. Depending on the structure of your query there might be other optimizations, but that is the general idea.
If you've already explored all of your indexing options and this still doesn't help, MS SQL Server has a "Change Tracking" feature that may be of use to you in building your cache table. You enable change tracking on the database and configure which tables you wish to track. SQL Server then creates a change record for every update, insert, and delete on a table, and lets you query for the changes made to records since the last time you checked. This is very useful for syncing changes and is more efficient than using triggers. It's also easier to manage than building your own tracking tables. This has been a feature since SQL Server 2008.
How to: Use SQL Server Change Tracking
Change Tracking only captures the primary keys of the tables and lets you query which columns might have been modified. You then join the tables on those keys to get the current data. If you want it to capture the data as well, you can use Change Data Capture, but that requires more overhead and at least SQL Server 2008 Enterprise edition.
Change Data Capture
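For illustration, a minimal Change Tracking setup and sync query might look like this (the database, table, and column names are placeholders):

-- enable change tracking on the database and on one table
ALTER DATABASE MyDb SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.table1 ENABLE CHANGE_TRACKING
    WITH (TRACK_COLUMNS_UPDATED = ON);

-- later: fetch the rows changed since the last sync version and refresh the cache from them
DECLARE @last_sync bigint = 0;  -- persist this value between runs

SELECT t.PrimaryKeyTable1, t.Field1, ct.SYS_CHANGE_OPERATION
FROM CHANGETABLE(CHANGES dbo.table1, @last_sync) AS ct
LEFT JOIN dbo.table1 AS t ON t.PrimaryKeyTable1 = ct.PrimaryKeyTable1;

SELECT CHANGE_TRACKING_CURRENT_VERSION();  -- store as the next @last_sync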
Your solution is a robust way of doing what is called an "indexed view" in Microsoft SQL Server, or a "materialized view" in Oracle.
Basically you are correct - it's faster to navigate a single indexed table than a dozen tables that are updated constantly.
You should really try creating an indexed view (a good place to start is https://technet.microsoft.com/en-us/library/dd171921(v=sql.100).aspx) and it will probably solve all your performance issues.
You can create a view WITH SCHEMABINDING and then create a clustered index on it; this stores the view's data physically. Note, however, that while the schema-bound view exists you cannot alter the underlying tables in ways that affect the view definition.
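A minimal sketch of such an indexed view (the table and column names are placeholders):

CREATE VIEW dbo.SearchCache
WITH SCHEMABINDING
AS
SELECT t2.PrimaryKeyTable2, t1.PrimaryKeyTable1, t1.Field1, t2.Field2
FROM dbo.table1 AS t1
JOIN dbo.table2 AS t2 ON t2.PrimaryKeyTable1 = t1.PrimaryKeyTable1;
GO

-- the unique clustered index is what materializes the view on disk
CREATE UNIQUE CLUSTERED INDEX IX_SearchCache
ON dbo.SearchCache (PrimaryKeyTable2);

Depending on the SQL Server edition, you may need the NOEXPAND hint when querying the view for the optimizer to use the index.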

MySQL: setting a record as deleted or archived

Is there any way to omit some records from MySQL SELECT statements without actually deleting them?
We could easily add a column, for example deleted, set it to 1 for the deleted records and keep them, but the problem is that we would then have to add WHERE deleted = 0 to every query.
What is the best way to keep some records as an archive?
I don't know how many tables you have or how much data you want to store, but a solution could be the following:
Create a tblName_HIST table for each table (tblName) whose virtually deleted data you want to keep.
Optional: add a DELETED_DATE column to keep track of the date the record was deleted.
Add an AFTER DELETE trigger on each tblName table that inserts the deleted record into the tblName_HIST table.
This lets you keep the queries and DB tables you have built so far without modifying them much.
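A rough sketch of that setup for a hypothetical table tblName with columns id and name:

CREATE TABLE tblName_HIST LIKE tblName;
ALTER TABLE tblName_HIST ADD COLUMN DELETED_DATE DATETIME NULL;

DELIMITER $$
CREATE TRIGGER trg_tblName_archive
AFTER DELETE ON tblName
FOR EACH ROW
BEGIN
    -- copy the row being deleted into the history table
    INSERT INTO tblName_HIST (id, name, DELETED_DATE)
    VALUES (OLD.id, OLD.name, NOW());
END$$
DELIMITER ;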

How do I efficiently change a MySQL table structure on a table with millions of entries?

I have a MySQL database that is up to about 17 GB in size and has 38 million entries. At the moment, I need to both increase the size of one column (varchar 40 to varchar 80) and add more columns.
Many of the fields are indexed, including the one that I need to change; it is part of a unique pair that is necessary for the applications to work. When I attempted to just make the change yesterday, the query ran for almost four hours without finishing, at which point I decided to cut the outage short and bring the service back up.
What is the most efficient way to make changes to something of this size?
Many of these entries are also old, so if there is a good way to shard off older entries while still keeping them available, that might help with this problem by bringing the table down to a much more manageable size.
You have some choices.
In any case you should take a backup before you do this stuff.
One possibility is to take your service offline and do it in place, as you have tried. If you do that you should disable key checks and constraints.
ALTER TABLE bigtable DISABLE KEYS;
SET FOREIGN_KEY_CHECKS=0;
ALTER TABLE (whatever);
ALTER TABLE (whatever else);
...
SET FOREIGN_KEY_CHECKS=1;
ALTER TABLE bigtable ENABLE KEYS;
This will allow the ALTER TABLE operation to go faster. It will regenerate the indexes all at once when you do ENABLE KEYS.
Another possibility is to create a new table with the new schema you want, then disable the keys on the new table, then do as #Bader suggested and insert the contents of the old table.
After your new table is built you will re-enable the keys on it, then rename the old table to some name like "old_bigtable" then rename the new table to "bigtable".
It's possible that you can keep your service online while you're populating the new table. But that might work poorly.
A third possibility is to dump your giant table to a flat file and then load it into a new table with the new layout. That is pretty much like the second possibility, except that you get a table backup for free. You can make this go pretty fast with SELECT ... INTO OUTFILE and LOAD DATA INFILE. You'll need access to your server machine's file system to do this.
In all cases, disable, then re-enable, the constraints and keys to get things to go fast.
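A sketch of the dump-and-reload route (the file path, the table name bigtable, and the column names are placeholders):

-- dump the existing data to a flat file on the database server
SELECT * FROM bigtable INTO OUTFILE '/tmp/bigtable.dump';

-- build the new table with the desired layout
CREATE TABLE bigtable_new LIKE bigtable;
ALTER TABLE bigtable_new MODIFY some_col VARCHAR(80), ADD COLUMN new_col INT;

-- bulk load with keys and constraints off, then re-enable them
SET FOREIGN_KEY_CHECKS=0;
ALTER TABLE bigtable_new DISABLE KEYS;
-- if new columns were added, list the original columns explicitly after the table name
LOAD DATA INFILE '/tmp/bigtable.dump' INTO TABLE bigtable_new;
ALTER TABLE bigtable_new ENABLE KEYS;
SET FOREIGN_KEY_CHECKS=1;

-- atomic swap of old and new tables
RENAME TABLE bigtable TO bigtable_old, bigtable_new TO bigtable;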
Create a new table with the new structure you want, under a different name, for example NewTable.
Then insert data into this new table from the old table using the following query:
INSERT INTO NewTable (field1, field2, etc...) SELECT field1, field2, ... FROM OldTable
After this is done, you can drop the old table and rename the new table to the original name
DROP TABLE `OldTable`;
RENAME TABLE `NewTable` TO `OldTable` ;
I have tried this approach on a very large table and it's much much faster than altering the table.
With MySQL 5.1, and again with 5.5, certain ALTER statements were enhanced to modify just the structure without rewriting the entire table ( http://dev.mysql.com/doc/refman/5.5/en/alter-table.html - search for "in-place"). The availability of this varies with the type of change you are making and the storage engine in use; the most benefit comes from the InnoDB Plugin. For your specific changes, though, the entire table would still be rewritten.
When we encounter these issues, we typically try to leverage replica databases. As long as you are adding and not removing you can run your DDL against the replica first and then schedule a brief outage for promoting the replica to the master role. If you happen to be on RDS this is even one of their suggested uses for their replica instances http://aws.amazon.com/about-aws/whats-new/2012/10/11/amazon-rds-mysql-rr-promotion/.
Some other alternatives include:
Selecting out a subset of records into a new table with the desired structure (use INTO OUTFILE to avoid a table lock). Once complete you can schedule a maintenance window and REPLACE INTO or UPDATE any records that have changed in the origin table since the initial data copy. Once the update is complete a RENAME TABLE... of both tables wraps the changes up.
Using a tool like Percona's pt-online-schema-change: http://www.percona.com/doc/percona-toolkit/2.1/pt-online-schema-change.html. This tool works with triggers, so if you already have triggers on the tables you want to change, it may not fit your needs.
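For reference, an invocation for the kind of change described in the question might look roughly like this (the database, table, and column names are placeholders):

pt-online-schema-change \
  --alter "MODIFY some_col VARCHAR(80), ADD COLUMN new_col INT" \
  D=mydb,t=bigtable \
  --execute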