Truncate MediaWiki tables - MySQL

I'm working with the MediaWiki API (e.g. http://en.wikipedia.org/w/api.php) and I would like to 'truncate' the MySQL tables in order to reset the local installation while keeping some tables (users, ?...).
What would the SQL queries be?
I would say: truncate all the tables except ${PREFIX}_user, and UPDATE ${PREFIX}_user SET user_editcount=0?
Any other (safer) suggestion?

The correct answer was posted on the MediaWiki mailing list: see http://lists.wikimedia.org/pipermail/mediawiki-l/2009-October/032322.html
According to that post, it is probably ok to truncate user_newtalk, page, revision, text,
archive, pagelinks, templatelinks, imagelinks, categorylinks, category, externallinks, langlinks, hitcounter, watchlist, image, oldimage, filearchive, recentchanges, searchindex, interwiki, querycache, objectcache, log_search, trackbacks, job, querycache_info, redirect, querycachetwo, page_restrictions, protected_titles, page_props, change_tags, tag_summary, valid_tag, l10n_cache.
On more recent versions, add msg_resource and msg_resource_links to that list, to truncate the message-related caches.
Also: remember to delete the files in the images folder if you truncate the image table. Otherwise the two will be out of sync, and you might have trouble uploading some images.
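A hedged example of that cleanup, assuming a default layout where uploads live under images/ in the wiki root (the path is a placeholder; adjust it to your installation):
# Remove the uploaded files but keep the directory tree and any .htaccess
find /path/to/wiki/images -type f ! -name '.htaccess' -delete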

Get the list of tables from your database:
echo "show tables;" | mysql -u user_name -p db_name > tables
Determine which tables you want to truncate, then create an SQL script (e.g. truncate-all.sql):
TRUNCATE TABLE a;
TRUNCATE TABLE b;
UPDATE <prefix>user SET user_editcount=0;
Then run it through the client:
mysql -u user_name -p database_name < truncate-all.sql
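If the list of tables to reset is long, the script can also be generated rather than typed by hand. A rough sketch, assuming the question's approach of truncating everything except the user table and a table prefix of wiki_ (credentials, database name and prefix are placeholders; trim the generated list and review it before running):
DB=db_name
PREFIX=wiki_
# Build one TRUNCATE per table, skipping the user table, then reset edit counts
mysql -N -u user_name -p "$DB" -e "SHOW TABLES" \
  | grep -v "^${PREFIX}user$" \
  | sed 's/^/TRUNCATE TABLE /; s/$/;/' > truncate-all.sql
echo "UPDATE ${PREFIX}user SET user_editcount=0;" >> truncate-all.sql
# Review truncate-all.sql, then run it through the client as above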

Related

Batch for mysqldump - exclude some tables based on a SELECT

We have a hundred tables in our database; the data in twenty of them is generated by LOAD DATA INFILE.
(There is therefore no point in saving them with mysqldump, especially since they are the heaviest, around 80% of the size of the database.)
These LOAD DATA INFILE imports are managed through a PHP form, so the names of the tables are saved in the database like this:
Name of the table: import_table
Columns of the table: table_name, table_column, date_creation, etc.
So when I run this query:
SELECT table_name FROM import_table
I get this result:
list_of_customer
all_order
all_invoice
...
I would therefore like to use this table of tables to ignore (which can change at any time) to build my batch file for the mysqldump.
So I did this:
@ECHO OFF
"C:\wamp\bin\mysql\mysql8.0.18\bin\mysqldump.exe" mydatabase --result-file="C:\test1\test2\databases.sql" --user=**** --password=****
How can I integrate my SELECT table_name FROM import_table in order to use the --ignore-table option?
Dump the table_name values to a file.
Read that file into a variable.
Use that variable for --ignore-table.
@ECHO OFF
"C:\wamp\bin\mysql\mysql8.0.18\bin\mysql.exe" mydatabase -N -e "SELECT group_concat(table_name) FROM import_table" --user=**** --password=**** > queryresult.txt
set /p ExcludedTables=<queryresult.txt
"C:\wamp\bin\mysql\mysql8.0.18\bin\mysqldump.exe" mydatabase --result-file="C:\test1\test2\databases.sql" --user=**** --password=**** --ignore-table=%ExcludedTables%
I can't test this as I am on a Mac, but I hope you get the idea of how you can do it.
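One caveat: mysqldump expects --ignore-table to be given once per table, in db_name.table_name form, rather than as a single comma-separated list, so the list has to be expanded into repeated options. A rough, untested sketch of that expansion for a Unix-like shell (database name and credentials are placeholders):
DB=mydatabase
# Build one --ignore-table=db.table option per row returned by the SELECT
IGNORE=$(mysql -N -u user_name -p "$DB" -e "SELECT CONCAT('--ignore-table=$DB.', table_name) FROM import_table" | tr '\n' ' ')
mysqldump -u user_name -p $IGNORE --result-file=databases.sql "$DB"
In a Windows batch file, the same expansion can be done with a for /f loop over queryresult.txt instead of set /p.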

Delete a substring across the whole MySQL database in one query

I have a problem: some kind of malware has got into my site. I would like to delete all the malware code from the DB with one query. I believe it's possible.
I can't simply delete all rows; the malware has added a little code to each page/article/gallery/... title, and I would like to preserve the original title of each article. I hope that's possible.
For example:
<script src="...">...</script>About us
I need 'About us' to remain in the database.
How can I do that across the whole database at once?
You can use string functions to do this.
Here's a demo:
mysql> SET @t = 'See <script src="...">...</script>About us';
mysql> SELECT CONCAT(
    SUBSTRING(@t, 1, LOCATE('<script ', @t)-1),
    SUBSTRING(@t, LOCATE('</script>', @t)+LENGTH('</script>'))) AS newstring;
+--------------+
| newstring    |
+--------------+
| See About us |
+--------------+
This assumes the script tag only occurs once per string.
Then you'll have to use UPDATE to correct the data, one column and one table at a time (the WHERE clause keeps rows without a script tag from being mangled):
UPDATE MyTable
SET MyStringColumn = CONCAT(
    SUBSTRING(MyStringColumn, 1, LOCATE('<script ', MyStringColumn)-1),
    SUBSTRING(MyStringColumn, LOCATE('</script>', MyStringColumn)+LENGTH('</script>')))
WHERE MyStringColumn LIKE '%<script %</script>%';
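To avoid writing that UPDATE by hand for every table and column, the statements can also be generated from information_schema. A rough sketch, not part of the original answer (database name, credentials and the list of string types are assumptions; the +9 is LENGTH('</script>'); review the generated file before applying it, and back up first):
# Generate one UPDATE per string column; the quoted heredoc keeps the shell
# from touching the backticks and quotes.
mysql -N -u user_name -p mydatabase <<'SQL' > fix-malware.sql
SELECT CONCAT(
  'UPDATE `', TABLE_NAME, '` SET `', COLUMN_NAME, '` = CONCAT(',
  'SUBSTRING(`', COLUMN_NAME, '`, 1, LOCATE(''<script '', `', COLUMN_NAME, '`)-1), ',
  'SUBSTRING(`', COLUMN_NAME, '`, LOCATE(''</script>'', `', COLUMN_NAME, '`)+9)) ',
  'WHERE `', COLUMN_NAME, '` LIKE ''%<script %</script>%'';')
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'mydatabase'
  AND DATA_TYPE IN ('char', 'varchar', 'tinytext', 'text', 'mediumtext', 'longtext');
SQL
# Review fix-malware.sql, then apply it:
# mysql -u user_name -p mydatabase < fix-malware.sql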
Another solution if you want to do all tables and all columns at once is to dump your database to a text file, and use a text editor to do global search and replace.
$ mysqldump mydatabase > mydatabase.sql
$ vim mydatabase.sql
:%s/<script src=.*<\/script>//g
$ mysql mydatabase < mydatabase.sql
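If you'd rather script the edit step than open the dump in an editor, sed can apply the same substitution (a sketch using GNU sed; the pattern is just as greedy as the vim one, so review the result before restoring):
sed -i.bak 's/<script src=.*<\/script>//g' mydatabase.sql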
Of course any data that changed between the dump and the restore will be overwritten.
If you can't pause changes to your database, you'll have to use the UPDATE solution to change data in-place.

MySQL - How to run a long (>14 hour) job over an SSH connection?

I need to run a MySQL script that, according to my benchmarking, should take over 14 hours to run. The script is updating every row in a 332715-row table:
UPDATE gene_set SET attribute_fk = (
SELECT id FROM attribute WHERE
gene_set.name_from_dataset <=> attribute.name_from_dataset AND
gene_set.id_from_dataset <=> attribute.id_from_dataset AND
gene_set.description_from_dataset <=> attribute.description_from_dataset AND
gene_set.url_from_dataset <=> attribute.url_from_dataset AND
gene_set.name_from_naming_authority <=> attribute.name_from_naming_authority AND
gene_set.id_from_naming_authority <=> attribute.id_from_naming_authority AND
gene_set.description_from_naming_authority <=> attribute.description_from_naming_authority AND
gene_set.url_from_naming_authority <=> attribute.url_from_naming_authority AND
gene_set.attribute_type_fk <=> attribute.attribute_type_fk AND
gene_set.naming_authority_fk <=> attribute.naming_authority_fk
);
(The script is unfortunate; I need to transfer all the data from gene_set to attribute, but first I must correctly set a foreign key to point to attribute).
I haven't been able to successfully run it using this command:
nohup mysql -h [host] -u [user] -p [database] < my_script.sql
For example, last night it ran for over 10 hours, but then the SSH connection broke:
Write failed: Broken pipe
Is there any way to run this script in a way to better ensure that it finishes? I really don't care how long it takes (1 day vs 2 days doesn't really matter) so long as I know it will finish.
The quickest way might be to run it in a screen or tmux session.
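For example, a minimal tmux workflow might look like this (the session name is arbitrary):
# Inside your SSH session: start a named tmux session and run the script there
tmux new -s genejob
mysql -h [host] -u [user] -p [database] < my_script.sql
# Detach with Ctrl-b d; the query keeps running if the SSH connection drops.
# Later, reconnect over SSH and re-attach with:
tmux attach -t genejob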
Expanding on my comment: you're getting poor performance for a 350k-record update statement. This is because you're setting the value from the result of a correlated subquery rather than updating as a set, so the subquery runs once for each row. Update it as a join instead:
UPDATE gene_set g JOIN attribute a ON <all the conditions from the WHERE clause> SET g.attribute_fk = a.id;
This doesn't answer your question per se, but I'd be interested to know how much faster it runs.
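Spelled out with the columns from the question, that join-based version might look like the following (untested sketch; note that, unlike the subquery form, rows with no match in attribute are simply left untouched):
mysql -h [host] -u [user] -p [database] <<'SQL'
UPDATE gene_set g
JOIN attribute a
  ON  g.name_from_dataset                 <=> a.name_from_dataset
  AND g.id_from_dataset                   <=> a.id_from_dataset
  AND g.description_from_dataset          <=> a.description_from_dataset
  AND g.url_from_dataset                  <=> a.url_from_dataset
  AND g.name_from_naming_authority        <=> a.name_from_naming_authority
  AND g.id_from_naming_authority          <=> a.id_from_naming_authority
  AND g.description_from_naming_authority <=> a.description_from_naming_authority
  AND g.url_from_naming_authority         <=> a.url_from_naming_authority
  AND g.attribute_type_fk                 <=> a.attribute_type_fk
  AND g.naming_authority_fk               <=> a.naming_authority_fk
SET g.attribute_fk = a.id;
SQL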
Here is how I did it in the past, when I ran monolithic ALTER queries on a remote server that sometimes took ages:
mysql -h [host] -u [user] -p [database] < my_script.sql > result.log 2>&1 &
This way you don't need to wait for it, as it will finish in its own time. You could also add SELECT NOW() at the start and end of your my_script.sql to find out how long it took, if you're interested.
Things also to consider, if applicable:
Why does this query take this long? Can we improve it (e.g. disable key checks, prepare the data offline and update with a temp table, ...)?
Can we break the query up to run in batches?
What is the impact on the rest of the DB?
etc.
If you have ssh access to the server, you could copy the script over and run it there with the following lines:
#copy over to tmp dir
scp my_script.sql user@remoteHost:/tmp/
#execute script on remote host
ssh -t user@remoteHost "nohup mysql \
-h localhost -u [user] -p [database] < /tmp/my_script.sql &"
Maybe you can try to do the 300k updates with frequent commits instead of one single huge update. That way, if anything fails, you keep the changes that were already applied.
With some dynamic SQL you can generate all the lines in one go, then copy the file over to your server...

SQL query to search & replace (update) a string across the whole database in phpMyAdmin/MySQL - WordPress site

Currently the links look like http://example.com, http://example.com/images, http://example.com/posts/..., http://example.com/links/...; there are about 3000+ matches in the database that need to be updated to "https", like this:
https://example.com, https://example.com/images, https://example.com/posts/..., https://example.com/links/...
I have already updated the WP site URL from http://example.com to https://example.com, but these 3000+ occurrences are still in the DB.
Is there an easy way to do that, or something in SQL?
Thanks!
Try this; change table_name and field as per your requirement:
UPDATE table_name SET field = replace(field,'http://example.com','https://example.com');
UPDATE table_name SET field = replace(field,'[string-to-find]','[string-that-will-replace-it]');
NB: Before you do that, you should definitely take a database dump, or whatever you use for backups. More about MySQL REPLACE.
mysqldump -h hostname -u username -p databasename > my_sql_dump.sql
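For a WordPress database specifically, the same REPLACE can be applied to the usual content columns. A sketch assuming the default wp_ table prefix and standard schema (adjust prefix, tables and credentials to your installation):
mysql -h hostname -u username -p databasename <<'SQL'
UPDATE wp_posts    SET post_content = REPLACE(post_content, 'http://example.com', 'https://example.com');
UPDATE wp_posts    SET guid         = REPLACE(guid,         'http://example.com', 'https://example.com');
UPDATE wp_postmeta SET meta_value   = REPLACE(meta_value,   'http://example.com', 'https://example.com');
UPDATE wp_options  SET option_value = REPLACE(option_value, 'http://example.com', 'https://example.com');
SQL
Keep in mind that some meta and option values are stored as serialized PHP; a plain REPLACE that changes the string length (as http to https does) can corrupt those, so a serialization-aware tool such as wp-cli's search-replace is safer for them.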

Using mysqldump to format one insert per line?

This has been asked a few times but I cannot find a resolution to my problem. Basically when using mysqldump, which is the built in tool for the MySQL Workbench administration tool, when I dump a database using extended inserts, I get massive long lines of data. I understand why it does this, as it speeds inserts by inserting the data as one command (especially on InnoDB), but the formatting makes it REALLY difficult to actually look at the data in a dump file, or compare two files with a diff tool if you are storing them in version control etc. In my case I am storing them in version control as we use the dump files to keep track of our integration test database.
Now I know I can turn off extended inserts, so I will get one insert per line, which works, but any time you do a restore with the dump file it will be slower.
My core problem is that in the OLD tool we used to use (MySQL Administrator) when I dump a file, it does basically the same thing but it FORMATS that INSERT statement to put one insert per line, while still doing bulk inserts. So instead of this:
INSERT INTO `coupon_gv_customer` (`customer_id`,`amount`) VALUES (887,'0.0000'),(191607,'1.0300');
you get this:
INSERT INTO `coupon_gv_customer` (`customer_id`,`amount`) VALUES
(887,'0.0000'),
(191607,'1.0300');
No matter what options I try, there does not seem to be any way to get a dump like this, which is really the best of both worlds. Yes, it takes a little more space, but in situations where you need a human to read the files, it makes them MUCH more useful.
Am I missing something and there is a way to do this with MySQLDump, or have we all gone backwards and this feature in the old (now deprecated) MySQL Administrator tool is no longer available?
Try using the following option:
--skip-extended-insert
It worked for me.
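For example (placeholder credentials and database name):
mysqldump --skip-extended-insert -u user_name -p db_name > dump.sql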
With the default mysqldump format, each record dumped will generate an individual INSERT command in the dump file (i.e., the sql file), each on its own line. This is perfect for source control (e.g., svn, git, etc.) as it makes the diff and delta resolution much finer, and ultimately results in a more efficient source control process. However, for significantly sized tables, executing all those INSERT queries can potentially make restoration from the sql file prohibitively slow.
Using the --extended-insert option fixes the multiple INSERT problem by wrapping all the records into a single INSERT command on a single line in the dumped sql file. However, the source control process becomes very inefficient. The entire table contents is represented on a single line in the sql file, and if a single character changes anywhere in that table, source control will flag the entire line (i.e., the entire table) as the delta between versions. And, for large tables, this negates many of the benefits of using a formal source control system.
So ideally, for efficient database restoration, in the sql file, we want each table to be represented by a single INSERT. For an efficient source control process, in the sql file, we want each record in that INSERT command to reside on its own line.
My solution to this is the following back-up script:
#!/bin/bash
cd my_git_directory/
ARGS="--host=myhostname --user=myusername --password=mypassword --opt --skip-dump-date"
/usr/bin/mysqldump $ARGS --database mydatabase | sed 's$VALUES ($VALUES\n($g' | sed 's$),($),\n($g' > mydatabase.sql
git fetch origin master
git merge origin/master
git add mydatabase.sql
git commit -m "Daily backup."
git push origin master
The result is a sql file INSERT command format that looks like:
INSERT INTO `mytable` VALUES
(r1c1value, r1c2value, r1c3value),
(r2c1value, r2c2value, r2c3value),
(r3c1value, r3c2value, r3c3value);
Some notes:
password on the command line ... I know, not secure, different discussion.
--opt: Among other things, turns on the --extended-insert option (i.e., one INSERT per table).
--skip-dump-date: mysqldump normally puts a date/time stamp in the sql file when created. This can become annoying in source control when the only delta between versions is that date/time stamp. The OS and source control system will date/time stamp the file and version; it's not really needed in the sql file.
The git commands are not central to the fundamental question (formatting the sql file), but they show how I get my sql file back into source control; something similar can be done with svn. When combining this sql file format with your source control of choice, you will find that when your users update their working copies, they only need to move the deltas (i.e., changed records) across the internet, and they can take advantage of diff utilities to easily see what records in the database have changed.
If you're dumping a database that resides on a remote server, if possible, run this script on that server to avoid pushing the entire contents of the database across the network with each dump.
If possible, establish a working source control repository for your sql files on the same server you are running this script from; check them into the repository from there. This will also help prevent having to push the entire database across the network with every dump.
As others have said, using sed to replace "),(" is not safe, as this can appear as content in the database.
There is a way to do this however:
if your database name is my_database then run the following:
$ mysqldump -u my_db_user -p -h 127.0.0.1 --skip-extended-insert my_database > my_database.sql
$ sed ':a;N;$!ba;s/)\;\nINSERT INTO `[A-Za-z0-9$_]*` VALUES /),\n/g' my_database.sql > my_database2.sql
you can also use "sed -i" to replace in-line.
Here is what this code is doing:
--skip-extended-insert will create one INSERT INTO for every row you have.
Now we use sed to clean up the data. Note that regular search/replace with sed applies for single line so we cannot detect the "\n" character as sed works one line at a time. That is why we put ":a;N;$!ba;" which basically tells sed to search multi-line and buffer the next line.
Hope this helps
What about storing the dump into a CSV file with mysqldump, using the --tab option like this?
mysqldump --tab=/path/to/serverlocaldir --single-transaction <database> table_a
This produces two files:
table_a.sql that contains only the table create statement; and
table_a.txt that contains tab-separated data.
RESTORING
You can restore your table via LOAD DATA:
LOAD DATA INFILE '/path/to/serverlocaldir/table_a.txt'
INTO TABLE table_a FIELDS TERMINATED BY '\t' ...
LOAD DATA is usually 20 times faster than using INSERT statements.
If you have to restore your data into another table (e.g. for review or testing purposes) you can create a "mirror" table:
CREATE TABLE table_for_test LIKE table_a;
Then load the CSV into the new table:
LOAD DATA INFILE '/path/to/serverlocaldir/table_a.txt'
INTO TABLE table_for_test FIELDS TERMINATED BY '\t' ...
COMPARE
A CSV file is simplest for diffs or for looking inside, or for non-SQL technical users who can use common tools like Excel, Access or command line (diff, comm, etc...)
I'm afraid this won't be possible. In the old MySQL Administrator I wrote the code for dumping db objects, which was completely independent of the mysqldump tool and hence offered a number of additional options (like this formatting, or progress feedback). In MySQL Workbench it was decided to use the mysqldump tool instead, which, besides being a step backwards in some regards and producing version problems, has the advantage of always staying up-to-date with the server.
So the short answer is: formatting is currently not possible with mysqldump.
Try this:
mysqldump -c -t --add-drop-table=FALSE --skip-extended-insert -uroot -p<Password> databaseName tableName >c:\path\nameDumpFile.sql
I found this tool very helpful for dealing with extended inserts: http://blog.lavoie.sl/2014/06/split-mysqldump-extended-inserts.html
It parses the mysqldump output and inserts linebreaks after each record, but still using the faster extended inserts. Unlike a sed script, there shouldn't be any risk of breaking lines in the wrong place if the regex happens to match inside a string.
I liked Ace.Di's solution with sed, until I got this error:
sed: Couldn't re-allocate memory
Thus I had to write a small PHP script
mysqldump -u my_db_user -p -h 127.0.0.1 --skip-extended-insert my_database | php mysqlconcatinserts.php > db.sql
The PHP script also starts a new INSERT every 10,000 rows, again to avoid memory problems.
mysqlconcatinserts.php:
#!/usr/bin/php
<?php
/* Assumes a mysqldump made with --skip-extended-insert (one INSERT per row);
   this script merges consecutive inserts for the same table back into
   extended inserts of at most $maxinserts rows each. */
$last = '';
$comma = '';
$count = 0;
$maxinserts = 10000;
while ($l = fgets(STDIN)) {
    if (preg_match('/^(INSERT INTO .* VALUES) (.*);/', $l, $s)) {
        // Start a new INSERT when the table changes or the current chunk is full.
        if ($last != $s[1] || $count > $maxinserts) {
            if ($last != '') // close the previous INSERT statement
                echo ";\n";
            echo "$s[1] ";
            $comma = '';
            $last = $s[1];
            $count = 0;
        }
        echo "$comma$s[2]";
        $comma = ",\n";
        $count++;
    } elseif ($last != '') {
        // First non-INSERT line after a run of inserts: terminate the statement.
        $last = '';
        echo ";\n";
        echo $l;
    } else {
        // Pass everything else (CREATE TABLE, comments, ...) through unchanged.
        echo $l;
    }
}
if ($last != '') // terminate a trailing INSERT at end of input
    echo ";\n";
Add
set autocommit=0;
as the first line of your SQL script file and
commit;
as the last line, then import it with:
mysql -u<user> -p<password> --default-character-set=utf8 db_name < <path>\xxx.sql
It will be roughly 10x faster. Without the final commit;, the whole import would be rolled back when the session ends.
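If you'd rather not edit the dump file itself, the same wrapping can be done on the fly; a small sketch with placeholder names:
{ echo "SET autocommit=0;"; cat dump.sql; echo "COMMIT;"; } | mysql -u user_name -p --default-character-set=utf8 db_name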