MySQL checksum to all tables in a database - mysql

I am evaluating a PHP/MySQL based software.
I want to see which tables are affected when certain operations are triggered.
After some googling, I was told that CHECKSUM TABLE tbl_name can do the job. I just need to know how to run the checksum for all the tables in the db.
Checksumming the tables one by one manually is definitely not preferred, as the database contains hundreds of tables.

Checksumming all tables seems like a lot of expensive calculation work just to detect which tables changed.
I'd suggest getting this information from the sys.schema_table_statistics view instead.
mysql> select table_schema, table_name, rows_fetched, rows_inserted, rows_updated, rows_deleted
from sys.schema_table_statistics where table_schema='test';
+--------------+---------------------+--------------+---------------+--------------+--------------+
| table_schema | table_name | rows_fetched | rows_inserted | rows_updated | rows_deleted |
+--------------+---------------------+--------------+---------------+--------------+--------------+
| test | sysbench_results | 870 | 144 | 0 | 0 |
+--------------+---------------------+--------------+---------------+--------------+--------------+
You probably want to reset the counters between your tests. Use sys.ps_truncate_all_tables()
mysql> call sys.ps_truncate_all_tables(FALSE);
+---------------------+
| summary |
+---------------------+
| Truncated 31 tables |
+---------------------+
mysql> select table_schema, table_name, rows_fetched, rows_inserted, rows_updated, rows_deleted
from sys.schema_table_statistics where table_schema='test';
+--------------+---------------------+--------------+---------------+--------------+--------------+
| table_schema | table_name | rows_fetched | rows_inserted | rows_updated | rows_deleted |
+--------------+---------------------+--------------+---------------+--------------+--------------+
| test | sysbench_results | 0 | 0 | 0 | 0 |
+--------------+---------------------+--------------+---------------+--------------+--------------+
The sys schema comes pre-installed in MySQL 5.7.
If you use MySQL 5.6, you may need to install it yourself. It's just an SQL script that creates some views into the performance_schema. Very easy to install.
You can get the sys schema here: https://github.com/mysql/mysql-sys
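If resetting the counters is not an option (for example on a shared server), the same idea can be approximated by taking a snapshot of the counters before and after the operation and diffing them in application code. The following is only a minimal sketch: it assumes the rows have already been fetched from sys.schema_table_statistics as tuples in the column order used above, and the `users` table in the sample data is hypothetical.

```python
from typing import Dict, List, Tuple

# Each row mirrors the columns selected above:
# (table_schema, table_name, rows_fetched, rows_inserted, rows_updated, rows_deleted)
Row = Tuple[str, str, int, int, int, int]


def changed_tables(before: List[Row], after: List[Row]) -> Dict[str, Dict[str, int]]:
    """Return per-table counter deltas for tables whose write counters moved."""
    names = ("rows_fetched", "rows_inserted", "rows_updated", "rows_deleted")
    baseline = {(r[0], r[1]): r[2:] for r in before}
    result = {}
    for row in after:
        key = (row[0], row[1])
        # A table missing from the first snapshot is treated as starting at zero.
        old = baseline.get(key, (0, 0, 0, 0))
        delta = {n: new - prev for n, new, prev in zip(names, row[2:], old)}
        # Only writes count as "affected" here; reads alone are ignored.
        if delta["rows_inserted"] or delta["rows_updated"] or delta["rows_deleted"]:
            result[f"{row[0]}.{row[1]}"] = delta
    return result


# Sample snapshots (made-up numbers, hypothetical "users" table).
before = [("test", "sysbench_results", 870, 144, 0, 0), ("test", "users", 10, 0, 0, 0)]
after = [("test", "sysbench_results", 900, 150, 2, 0), ("test", "users", 25, 0, 0, 0)]
print(changed_tables(before, after))
```

Here only sysbench_results is reported, because the hypothetical users table was read but not written between the two snapshots.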

I want to look which tables affected when certain operations triggered.
What do you mean by this?
Do you know what operations have been triggered, and you're merely attempting to understand what effect they had on your database (e.g. to verify their correctness)? Or do you not know what operations have been triggered (e.g. during some interval) but you nevertheless want to understand how the database has changed, perhaps in an attempt to determine what those operations were?
There are very few situations where I would expect the best approach to be that which you are exploring (inspecting the database for changes). Instead, some form of logging—whether built-in to the RDBMS (such as MySQL's General Query Log or perhaps through triggers as suggested by Sumesh), or more likely at some higher level (e.g. within the accessing application)—would almost always be preferable. This leads me to lean toward thinking you have an XY Problem.
However, on the assumption that you really do want to identify the tables that have been modified since some last known good point in time, you can query the INFORMATION_SCHEMA.TABLES table, which contains a CHECKSUM column (a live checksum, populated only for storage engines that maintain one, such as MyISAM tables created with CHECKSUM=1, and NULL otherwise) as well as other potentially useful information like UPDATE_TIME (which can also be NULL, depending on the storage engine and server version). So, for example, to identify all tables changed in the last five minutes one could do:
SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE UPDATE_TIME > NOW() - INTERVAL 5 MINUTE;

You could generate the CHECKSUM statements for all tables:
SELECT CONCAT('CHECKSUM TABLE ', table_name, ';') AS statement
FROM information_schema.tables
WHERE table_schema = 'YourDBNameHere';
Then copy this output and paste it into Workbench or whatever tool you need to use. If you need to do this from within application (e.g. PHP) code, then you would probably have to use dynamic SQL (a prepared statement built from a string).
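From application code, the statement can also be assembled on the client side. The sketch below is an illustration only: it assumes the table names have already been read from information_schema.tables, and the helper names `quote_identifier` and `checksum_statement` are made up for this example. It relies on CHECKSUM TABLE accepting a comma-separated list of tables.

```python
from typing import Iterable


def quote_identifier(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def checksum_statement(schema: str, tables: Iterable[str]) -> str:
    """Build a single CHECKSUM TABLE statement covering every given table."""
    qualified = [f"{quote_identifier(schema)}.{quote_identifier(t)}" for t in tables]
    if not qualified:
        raise ValueError("no tables given")
    return "CHECKSUM TABLE " + ", ".join(qualified)


# Hypothetical table names; the second one shows why quoting matters.
print(checksum_statement("YourDBNameHere", ["orders", "order items"]))
```

Quoting every identifier keeps the statement valid for table names that contain spaces or reserved words, which the plain CONCAT version would not handle.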

For those who came here looking for a way to get the checksums of all the tables in one query (as I did):
SET group_concat_max_len = CAST(
(
SELECT SUM(LENGTH(TABLE_NAME)) + COUNT(*) * LENGTH(', ')
FROM information_schema.tables WHERE `TABLE_SCHEMA` = 'your_database_name'
) AS UNSIGNED
);
SET @sql_command := (
SELECT CONCAT(
'CHECKSUM TABLE ',
GROUP_CONCAT( TABLE_NAME ORDER BY `TABLE_NAME` SEPARATOR ', ' )
)
FROM information_schema.tables
WHERE `TABLE_SCHEMA` = 'your_database_name'
ORDER BY `TABLE_NAME`
);
PREPARE statement FROM @sql_command;
EXECUTE statement;
DEALLOCATE PREPARE statement;
The idea is simply to build a single CHECKSUM TABLE statement that includes all the table names. So yes, it is a slightly upgraded version of the answer given by Tim Biegeleisen.
First we set the maximum permitted result length for the GROUP_CONCAT() function (1024 bytes by default). It is calculated as the total length of all the table names plus the separators that will be placed between them:
SET group_concat_max_len = CAST(
(
SELECT SUM(LENGTH(TABLE_NAME)) + COUNT(*) * LENGTH(', ')
FROM information_schema.tables WHERE `TABLE_SCHEMA` = 'your_database_name'
) AS UNSIGNED
);
Then we put all the table names together in one CHECKSUM TABLE statement and store it in a string variable:
SET @sql_command := (
SELECT CONCAT(
'CHECKSUM TABLE ',
GROUP_CONCAT( TABLE_NAME ORDER BY `TABLE_NAME` SEPARATOR ', ' )
)
FROM information_schema.tables
WHERE `TABLE_SCHEMA` = 'your_database_name'
ORDER BY `TABLE_NAME`
);
Finally, we execute the statement to see the results:
PREPARE statement FROM @sql_command;
EXECUTE statement;
DEALLOCATE PREPARE statement;
Unfortunately, you can't manipulate the result set any further using MySQL statements alone (e.g. insert it into a table or join it with other result sets).
So if you need to do comparisons, you will eventually need additional code in your favorite programming language (or capable software) to accomplish the task.
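As an illustration of that "additional code", here is a minimal Python sketch that compares two CHECKSUM TABLE result sets taken before and after an operation. It assumes each result set has already been fetched as (Table, Checksum) pairs; the function name `diff_checksums` and the sample values are invented for this example.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# CHECKSUM TABLE returns rows of (Table, Checksum); Checksum is NULL (None)
# for a table that does not exist.
ChecksumRow = Tuple[str, Optional[int]]


def diff_checksums(before: Iterable[ChecksumRow],
                   after: Iterable[ChecksumRow]) -> Dict[str, List[str]]:
    """Classify tables as changed, added or removed between two runs."""
    old, new = dict(before), dict(after)
    return {
        "changed": sorted(t for t in old.keys() & new.keys() if old[t] != new[t]),
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
    }


# Made-up checksums for three hypothetical tables.
run_1 = [("shop.orders", 1001), ("shop.users", 2002), ("shop.tmp", 3003)]
run_2 = [("shop.orders", 1001), ("shop.users", 9999), ("shop.audit", 4004)]
print(diff_checksums(run_1, run_2))
```

Run the generated CHECKSUM TABLE statement once before and once after the operation under test, then feed both result sets to the function.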

The question does not state using a shell script to accomplish things isn't allowed, so I'll post one such approach here (PHP is able to invoke shell scripts - see http://php.net/manual/en/function.shell-exec.php - if safe mode is not enabled):
If your script has shell access at its disposal and a checksum tool - like md5sum - one can also do something like this to collect checksums for each table:
#!/bin/bash
DATABASEPATH="/var/lib/mysql/yourdatabase"
cd "$DATABASEPATH" &&
for TABLEFILE in $(ls -t *.ibd); do
    SUMANDTABLE=$(md5sum "$TABLEFILE")
    echo "${SUMANDTABLE//.ibd}"
done
And optionally, if you don't want a checksum calculated for all tables, you could also check if the modification date of the "$TABLEFILE" is within range. If not, you just exit the script (the ls -t orders by modification date, descending).
To access modification date use something like e.g. stat -c %Y "$TABLEFILE". This would give you the modification date in seconds since Epoch.
To access current date, also in seconds since Epoch use: date +%s.
One can then subtract the modification date from the current date to establish how many seconds ago a "$TABLEFILE" has changed.
Another related method, which in some cases could apply, would be to save the ls -t *.ibd listing (without even calculating checksums, just store filenames in order), then start an operation and at the end of that operation check for difference in file listing with another execution of ls -t *.ibd.
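The modification-date check described above can be sketched in Python as well. This is only an illustration under a few assumptions: each table lives in its own .ibd file, the script may read the data directory, and the file modification time is an approximation of when the table last changed (the server may write changes to disk with a delay). The function name `recently_modified` is made up for this example.

```python
import tempfile
import time
from pathlib import Path
from typing import List, Optional, Tuple


def recently_modified(directory: str, max_age_seconds: float,
                      now: Optional[float] = None) -> List[Tuple[str, float]]:
    """Return (table_name, age_in_seconds) for .ibd files changed within the window, newest first."""
    now = time.time() if now is None else now
    hits = []
    for path in Path(directory).glob("*.ibd"):
        # Age = current time minus modification time, both in seconds since the Epoch.
        age = now - path.stat().st_mtime
        if age <= max_age_seconds:
            hits.append((path.stem, age))
    return sorted(hits, key=lambda item: item[1])


# Demo against a throwaway directory; point it at your database directory instead.
with tempfile.TemporaryDirectory() as demo_dir:
    Path(demo_dir, "orders.ibd").touch()
    print(recently_modified(demo_dir, max_age_seconds=300))
```

The `now` parameter exists only so the window can be pinned to a fixed reference time, for example the moment the operation under test was started.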

Related

MySQL select statement with tablename derived from database query

I want to write a SELECT statement, where the tablename is based on the response to a different SELECT query. I can't use stacked queries, and I can only use MySQL.
As pseudo-code, this is what I'd like to do:
tablenamevariable = (SELECT 'tablename');
SELECT * FROM tablenamevariable;
Which should be equivalent to executing SELECT * FROM tablename (where the string tablename comes from the database).
What I have so far is the following, which executes successfully:
SELECT * FROM (SELECT 'tablename') AS x;
However, the result simply prints tablename (which isn't what I want).
The background is an SQL injection which upper-cases all input. So what I want to do is SELECT * FROM (SELECT CHAR([...] USING UTF8MB4)) to be able to select data from a table with lower-case characters in the name.
You can't use a string as an identifier in the same query.
A subquery or an expression can return a string, but not an identifier.
So your subquery like select ... from (select ...) as x doesn't work the way you think. It will not query from the table named by the string. It will query from a derived table which consists of the string value returned by the subquery.
mysql> select * from (select 'abc' as tablename) as x;
+-----------+
| tablename |
+-----------+
| abc |
+-----------+
The reason for this is that in SQL, all identifiers must be fixed at the time the query is parsed, before any expressions are evaluated. This is so the server can validate that the named tables exist and that you have the SQL privileges to read them.
Another reason is that if the subquery worked the way you expect, then there would be no way to simply query strings from a subquery without querying a hypothetical table named by those strings. Also what would you expect it to do if the subquery returned multiple columns or multiple rows?
You clarified in an edit that what you're trying to do is to query a table after your query is formatted with uppercase table names, regardless of how the table was defined.
Case-sensitivity of identifiers in MySQL is a bit complex, because MySQL has versions on different operating systems, some of which have case-sensitive filesystems and some have case-insensitive filesystems.
But the result is that in most cases, it doesn't matter that your table names are uppercase in your query. Table name comparisons are case-insensitive by default on an OS that uses a case-insensitive filesystem.
mysql> select * from mytable limit 1;
+----+-------+
| pk | name |
+----+-------+
| 3 | hello |
+----+-------+
mysql> select * from MYTABLE limit 1;
+----+-------+
| pk | name |
+----+-------+
| 3 | hello |
+----+-------+
mysql> select * from MyTable limit 1;
+----+-------+
| pk | name |
+----+-------+
| 3 | hello |
+----+-------+
(Test performed on MySQL 8.0.31 on MacOS)
On UNIX and Linux, the default is that table comparisons are case-sensitive. But there is an option to configure this if you want it to work in a case-insensitive manner on UNIX or Linux. You should read https://dev.mysql.com/doc/refman/8.0/en/identifier-case-sensitivity.html to understand how this works on different operating systems, and the option you can use to control it.
To do that you would need to use prepared statements:
SET @t = 'tablename';
SET @sql = CONCAT('SELECT * FROM ', @t);
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
It is highly unusual that you would need to do that though.

Finding out which tables are different in two versions of a database

I have 2 versions of a database (say db_dev and db_beta). I've made some changes in the db_dev database - added some tables, and changed a few columns in some existing tables. I need to find out the list of table names in which changes have been made.
I can easily find out the tables I've added by running the following query on the information_schema database:
SELECT table_name
FROM tables
WHERE table_schema = 'db_dev'
AND table_name NOT IN (SELECT table_name
FROM tables
WHERE table_schema = 'db_beta');
How do I get the table_names whose column_names do not match in the two database versions?
There are many ready-made tools available that can show you schema changes by comparing two databases. Here are some tools that can serve your purpose:
Red-Gate's MySQL Schema & Data Compare
Maatkit
MySQL Diff
SQL EDT
Red-Gate's MySQL Compare is the best tool for this purpose. It's paid, though they provide a 14-day free trial if you only need it temporarily.
Using information_schema, here is how it works.
First, the information_schema.COLUMNS table contains the column definitions. If a column has been changed, or a table does not exist, this is reflected in the information_schema.COLUMNS table.
The difficult part is that you have to compare all the columns of the COLUMNS table. So, you have to select TABLE_CATALOG, TABLE_NAME, COLUMN_NAME, ORDINAL_POSITION, COLUMN_DEFAULT, and so on (the exact list depends on your MySQL version).
The column list is the result of the following query:
SELECT GROUP_CONCAT(column_name)
FROM information_schema.COLUMNS
WHERE table_schema="information_schema"
AND table_name="COLUMNS" AND column_name!='TABLE_SCHEMA';
After that, we just have to SELECT TABLE_NAME, <column_list> and search for columns that appear only once (the column does not exist in the other database), or that have two different definitions (the column was altered). So the resulting query has two different counts to cover the two cases.
We therefore use a prepared statement to retrieve the list of columns we want and group the result.
The resulting query does the whole process for you:
SELECT CONCAT(
"SELECT DISTINCT TABLE_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA IN('db_dev', 'db_beta')
GROUP BY table_name, COLUMN_NAME
HAVING count(*)=1 OR
COUNT(DISTINCT CONCAT_WS(',', NULL, ",
GROUP_CONCAT(column_name)
,"))=2;")
FROM information_schema.COLUMNS
WHERE table_schema="information_schema"
AND table_name="COLUMNS" AND column_name!='TABLE_SCHEMA'
INTO @sql;
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
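The same comparison can be done on the client side, which may be easier to read and debug than the generated query. The sketch below is a simplified illustration, not an equivalent of the query above: it compares only the column name and column type, and it assumes the rows were fetched from information_schema.COLUMNS as (table_schema, table_name, column_name, column_type) tuples. The function name `differing_tables` and the sample tables are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

# (table_schema, table_name, column_name, column_type)
ColumnRow = Tuple[str, str, str, str]


def differing_tables(rows: Iterable[ColumnRow], left: str, right: str) -> List[str]:
    """Return tables whose column sets or column types differ between two schemas."""
    columns = {left: defaultdict(dict), right: defaultdict(dict)}
    for schema, table, column, column_type in rows:
        if schema in columns:
            columns[schema][table][column] = column_type
    all_tables = set(columns[left]) | set(columns[right])
    # A table missing from one schema yields None on that side and so counts as different.
    return sorted(t for t in all_tables if columns[left].get(t) != columns[right].get(t))


# Hypothetical column metadata for the two database versions.
rows = [
    ("db_dev", "users", "id", "int(11)"),
    ("db_dev", "users", "email", "varchar(255)"),
    ("db_beta", "users", "id", "int(11)"),
    ("db_beta", "users", "email", "varchar(100)"),
    ("db_dev", "orders", "id", "int(11)"),
    ("db_beta", "orders", "id", "int(11)"),
    ("db_dev", "audit", "id", "int(11)"),
]
print(differing_tables(rows, "db_dev", "db_beta"))
```

Further attributes such as COLUMN_DEFAULT or IS_NULLABLE could be added to the stored value as a tuple if they matter for the comparison.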
The following solution does not use an sql query like you tried and does not give you a real list of tables, but it shows you all the changes in both databases.
You can do an sql dump of both database structures :
mysqldump -u root -p --no-data dbname > schema.sql
Then you can compare both files, e.g. using the diff linux tool.

How would I delete a single column in all tables in MYSQL

I have a column 'seq' in every table of my database that I would like to delete easily.
I have to do this on occasion in MySQL and am hoping this can be automated.
There isn't a simple magical expression to just do this. You need to generate a list of SQL statements and then run them, somehow.
(Most database folks don't routinely drop columns from a database in production; it takes a lot of time during which the tables are inaccessible, and it's destructive. A fat-finger error could really mess you up.)
You might start by using the information_schema in MySQL to discover which of your tables have a seq column in them. This query will return that list of tables for the database you're currently using.
SELECT DISTINCT TABLE_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
AND COLUMN_NAME = 'seq'
You could then adapt that query to, for example, create a list of statements like this.
SELECT DISTINCT
CONCAT('UPDATE ',TABLE_NAME, ' SET seq = 0;') AS stmt
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
AND COLUMN_NAME = 'seq'
This will produce a result set like this:
UPDATE table_a SET seq = 0;
UPDATE table_b SET seq = 0;
UPDATE user SET seq = 0;
Then you could run these statements one by one. These statements will zero out your seq columns.
Edit
You can also do
CONCAT('ALTER TABLE ',TABLE_NAME, ' DROP COLUMN seq;') AS stmt
to get a drop column statement for each table.
But, you might consider creating views of your tables that don't contain the seq columns, and then exporting to PostgreSQL using those views. If your tables are significant in size, this will save you a lot of time.

When was the last time a mysql table was accessed?

Is there a way to tell the last access time of a mysql table? By access I mean any type of operation in that table including update, alter or even select or any other operation.
Thanks.
You can get the last update time of a table.
SELECT update_time FROM information_schema.tables WHERE table_name='tablename'
You can use the OS level stat command.
Locate the ibd file for that particular table and run the below command
stat file_location
If the table is being queried by SELECT, you can find the timestamp of the last access under the Access field.
I don't know how to get the exact time after-the-fact, but you can start dumping logs, do something, and then stop dumping logs. Whichever tables show up in the logs are the ones that were accessed during that time.
If you care to dig through the log, the queries are shown with timestamps.
Tell mysql where to put the log file
Add this line to my.cnf (on some systems it will be mysql.conf.d/mysqld.cnf).
general_log_file = /path/to/query.log
Enable the general log
mysql> SET global general_log = 1;
(don't forget to turn this off, it can grow very quickly)
Do the thing
All mysql queries will be added to /path/to/query.log
Disable the general log
mysql> SET global general_log = 0;
See which tables appeared
If it's short, you can just scroll through query.log. If not, then you can filter the log for known table names, like so:
query_words=$(cat /path/to/query.log | tr -s '[:space:]' '\n' | tr -c -d '[a-zA-Z0-9][:space:][_\-]' | grep -E -v '[0-9]' | sort | uniq)
table_names=$(mysql -uroot -ptest -Dmeta -e"show tables;" | sort | uniq)
comm -12 <(echo "$table_names") <(echo "$query_words")
From there, you can grep the log file for whichever table names showed up. There you will find timestamped queries.
See also, this utility, which I made.
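The filtering step can also be sketched in Python. This is a rough illustration only: it assumes the log text and the list of table names have already been loaded, it matches names as whole words and case-sensitively, and it will report false positives when a table name also appears as a column name or inside a string. The function name `tables_in_log` and the sample log lines are invented for this example.

```python
import re
from typing import Iterable, Set


def tables_in_log(log_text: str, table_names: Iterable[str]) -> Set[str]:
    """Return the known table names that appear as whole words in the log text."""
    words = set(re.findall(r"[A-Za-z0-9_$]+", log_text))
    return {name for name in table_names if name in words}


# Illustrative log excerpt; the exact layout of a real log may differ.
sample_log = """\
2024-01-01T12:00:00.000000Z 8 Query SELECT * FROM users WHERE id = 1
2024-01-01T12:00:01.000000Z 8 Query UPDATE orders SET status = 'paid' WHERE id = 7
"""
print(sorted(tables_in_log(sample_log, ["users", "orders", "user", "invoices"])))
```

Because whole words are compared, the hypothetical table `user` is not reported just because `users` appears in the log.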
For more detail (database name and table name) restricted to a period, try this query. Note that update_time records the last modification of the table, not reads:
select table_schema as DatabaseName,
table_name as TableName,
update_time as LastUpdateTime
from information_schema.tables
where update_time < 'yyyy-mm-dd'
order by update_time asc
Use the information_schema database to find when a table in a given database was last updated:
SELECT UPDATE_TIME
FROM information_schema.tables
WHERE TABLE_SCHEMA = 'dbname'
AND TABLE_NAME = 'tablename'
order by UPDATE_TIME DESC

selecting first column from all tables in a mysql database

I have a column named "name" which is present in all tables in a MySQL database.
I wanted to list all the names in all tables, so I used the following query:
select name from (SELECT table_name FROM information_schema.tables WHERE table_type='BASE TABLE') as abc
But it did not work for me; instead it returned the table_name column alone.
Then I used show tables and stored the output in another table called table_list, and executed the following query:
select name from (select table_name from table_list) as abc
This also returned the same result, i.e. all table names.
What am I doing wrong, and what is the right way to do it?
I am using MySQL 5.4 and i want to either write a subquery or a procedure or a function purely in mysql.
There is PREPARE and EXECUTE, which can run an SQL statement held in a user variable, so you could probably use something similar to this (untested!):
SET @a = "";
SELECT @a := CONCAT('(select name from ',GROUP_CONCAT(table_name SEPARATOR ') UNION (select name from '),')') FROM information_schema.tables WHERE table_type='BASE TABLE' AND table_schema = DATABASE();
PREPARE stmt FROM @a;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
What you need is a way to have variables in SQL (so you can say select * from $name where $name in (select ...)). SQL doesn't allow that.
My suggestion is to split the process up:
First run this query:
SELECT CONCAT('select distinct name from ', table_name, ' union')
FROM table_list;
That'll give you the selects to run. Put them into a small script, remove the last "union" and run that script.
[EDIT] If MySQL supports an "eval" operator in stored procedures (i.e. where you can build SQL from parts and then run it), you could do this. I checked the docs and it doesn't look like there is an eval. You could also write an extension in C (see chapter 22) that either implements the lookup or an "eval" function.
But my guess is that your database won't change all the time. So the most simple solution would be to write a small SQL script that creates the code for a view (that is a string; it doesn't actually create the view ). Every time the DB changes, you simply run the script to recreate the view and afterwards, you can run the query against the view to get a list of all names.