Can I hash / encrypt a database TEXT column? - mysql

Apologies in advance for what may be a silly question, but I am working on building a little "journal" website, where users can type in daily thoughts in a private way. I'm currently storing this information in a MEDIUMTEXT datatype in a MySQL database.
My question is: is there a way to store this so that individuals (like myself) who have access to the database are not able to read the field, to protect users' privacy, similar to how I might hash a password?
Thanks in advance

The better approach is to encrypt the data before it reaches the DBMS, in whatever programming language your website uses, store the ciphertext in MySQL/MariaDB, and decrypt it in the application after reading it back. That way your website software does the encryption and decryption, and MySQL/MariaDB only stores opaque data.
If you need to do it in SQL, here is a simple (not very secure!) way using the AES_ENCRYPT and AES_DECRYPT functions. First I will create a simple test table:
CREATE TABLE encrypt(
col MEDIUMBLOB
);
When I insert the data, I will use the encrypt function:
-- do this if you are unsure if the general log is enabled systemwide
SET sql_log_off = 'ON';
INSERT INTO encrypt(col) VALUES
(AES_ENCRYPT("blah","mysecretpassword"));
To read the data back, you must use AES_DECRYPT:
SELECT CONVERT(AES_DECRYPT(col, "mysecretpassword") USING utf8) AS col
FROM encrypt;
+------+
| col  |
+------+
| blah |
+------+
-- now you can turn the log back on if you need it
-- and it is not disabled globally in my.cnf
SET sql_log_off = 'OFF';
Supplying the wrong password will normally give a NULL value. Note that while MEDIUMTEXT will probably work flawlessly, it is not the best datatype: the encrypted data is binary, so a BLOB type is better (it does not care about collations, encodings, etc.).
Why is this not recommended? First, you have to be careful with log files (such as the general query log or the slow query log). If they are enabled, the DBMS may log your secret key in plaintext, and anyone with administrative access to the machine can easily recover it. So if you go this way, you must disable logging! In the example above I used the sql_log_off variable, which disables the plaintext general query log for the current session only (it does not disable it server-wide, so queries from other sessions are still logged).
But that is not the whole story about logging: there is also the binary log. If enabled, it records changes to data (INSERT, UPDATE and DELETE statements in particular). With the mysqlbinlog utility, people with admin rights on the DBMS machine can read it, and they may recover your secret key from the INSERT statement above as well. To keep the statement out of the binlog too, run this before the INSERT statement:
SET sql_log_bin = 'OFF';
And of course turn it back on afterwards. Note that sql_log_bin cannot be changed inside a transaction. Also keep in mind that changes made while the binary log is off are not replicated and cannot be replayed for point-in-time recovery. So the final conclusion here: it is definitely better to do the encryption and decryption in the application rather than in the DB. It will save you a lot of hassle with logging.
Second, the example above is not secure encryption because (by default) it uses the insecure ECB mode. Briefly: the data is split into blocks and each block is encrypted independently with the same key, so equal plaintext blocks produce equal ciphertext blocks, which can leak patterns. It is therefore better to use a block-chaining mode with an initialization vector, which is much stronger. Unfortunately, if you are using MariaDB you should stop here, because at the time of writing it does not support anything other than ECB (other modes are planned). If you are using MySQL, continue reading to improve the SQL solution...
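The pattern leak can be illustrated without a real cipher. The following is a minimal sketch, assuming a keyed HMAC as a stand-in for the block cipher (it is deterministic like AES under a fixed key, but it is not reversible and is not encryption); the function names are made up for this illustration.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes


def toy_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for a block cipher: deterministic, keyed, NOT reversible."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]


def split_blocks(data: bytes) -> list[bytes]:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_like(key: bytes, data: bytes) -> list[bytes]:
    # Every block is transformed on its own, as in ECB.
    return [toy_block(key, b) for b in split_blocks(data)]


def cbc_like(key: bytes, iv: bytes, data: bytes) -> list[bytes]:
    # Each block is XORed with the previous output (the IV first), as in CBC.
    out, prev = [], iv
    for b in split_blocks(data):
        mixed = bytes(x ^ y for x, y in zip(b, prev))
        prev = toy_block(key, mixed)
        out.append(prev)
    return out


key = b"mysecretpassword"
iv = bytes(range(16))           # fixed for repeatability; use random bytes in practice
data = b"same 16 bytes..." * 2  # two identical plaintext blocks

ecb = ecb_like(key, data)
cbc = cbc_like(key, iv, data)
print(ecb[0] == ecb[1])  # identical input blocks give identical output blocks
print(cbc[0] == cbc[1])  # chaining hides the repetition
```

With per-block processing, anyone looking at the stored bytes can tell that the two blocks were equal; with chaining and an IV, they cannot.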
With MySQL you should switch the default encryption mode and start using the third parameter of the encrypt/decrypt functions, the IV (initialization vector), which must be 16 bytes. It should be a random value; it is not a secret and can be stored directly in the DB. Here is how:
First change the block_encryption_mode system variable to something other than ECB (the MySQL documentation lists the supported values). You can use CBC, for example.
Then change the above queries like this:
CREATE TABLE encrypt(
col MEDIUMBLOB,
iv BINARY(16)
);
-- Run this each time you encrypt/decrypt
-- if you cannot guarantee that it is
-- set properly in my.cnf
SET block_encryption_mode="aes-256-cbc";
SET sql_log_off = 'ON';
INSERT INTO encrypt(iv, col)
VALUES(RANDOM_BYTES(16), AES_ENCRYPT("blah", "mysecretpassword", iv));
SELECT CONVERT(AES_DECRYPT(col, "mysecretpassword", iv) USING utf8) AS col
FROM encrypt;
SET sql_log_off = 'OFF';

Related

Importing and exporting TSVs with MySQL

I'm using a database with MySQL 5.7, and sometimes, data needs to be updated using a mixture of scripts and manual editing. Because people working with the database are usually not familiar with SQL, I'd like to export the data as a TSV, which then could be manipulated (for example with Python's pandas module) and then be imported back. I assume the standard way would be to directly connect to the database, but using TSVs has some upsides in this situation, I think. I've been reading the MySQL docs and some stackoverflow questions to find the best way to do this. I've found a couple of solutions, however, they all are somewhat inconvenient. I will list them below and explain my problems with them.
My question is: did I miss something, for example some helpful SQL commands or CLI options to help with this? Or are the solutions I found already the best when importing/exporting TSVs?
My example database looks like this:
Database: Export_test
Table: Sample
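+-----------+-----------+------+-----+
| Field     | Type      | Null | Key |
+-----------+-----------+------+-----+
| id        | int(11)   | NO   | PRI |
| text_data | text      | NO   |     |
| optional  | int(11)   | YES  |     |
| time      | timestamp | NO   |     |
+-----------+-----------+------+-----+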
Example data:
INSERT INTO `Sample` VALUES (1,'first line\\\nsecond line',NULL,'2022-02-16 20:17:38');
The data contains an escaped newline, which caused a lot of problems for me when exporting.
Table: Reference
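+-------------+---------+------+-----+
| Field       | Type    | Null | Key |
+-------------+---------+------+-----+
| id          | int(11) | NO   | PRI |
| foreign_key | int(11) | NO   | MUL |
+-------------+---------+------+-----+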
Example data:
INSERT INTO `Reference` VALUES (1,1);
foreign_key is referencing a Sample.id.
Note about encoding: a caveat for people trying to do the same thing. If you want to export/import data, make sure that character sets and collations are set up correctly for the connection. This caused me some headaches: although the data itself is utf8mb4, the client, server and connection character sets were latin1, which caused data loss in some instances.
Export
So, for exporting, I found basically three solutions, and they all behave somewhat differently:
A: SELECT stdout redirection
mysql Export_test -e "SELECT * FROM Sample;" > out.tsv
Output:
id text_data optional time
1 first line\\\nsecond line NULL 2022-02-16 21:26:13
Pros:
headers are added, which makes it easy to use with external programs
formatting works as intended
Cons:
NULL is used for null values; when importing, \N is required instead; as far as I know, this can't be configured for exports
Workaround: replace NULL values when editing the data
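The NULL workaround can be scripted. This is a minimal sketch, assuming the file came from the mysql client in batch mode (so tabs and newlines inside values are already escaped and each line is one row); the function name is hypothetical.

```python
def nulls_to_backslash_n(lines):
    """Rewrite tab-separated lines, replacing bare NULL fields with \\N."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        yield "\t".join(r"\N" if f == "NULL" else f for f in fields) + "\n"


exported = [
    "id\ttext_data\toptional\ttime\n",
    "1\tfirst line\\\\\\nsecond line\tNULL\t2022-02-16 21:26:13\n",
]
converted = list(nulls_to_backslash_n(exported))
print(converted[1], end="")
```

The same generator can be fed an open file object and written out with writelines(). One caveat: in this export format a text value that is literally the string NULL looks the same as a real NULL, so it would be converted too.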
B: SELECT INTO OUTFILE
mysql Export_test -e "SELECT * FROM Sample INTO OUTFILE '/tmp/out.tsv';"
Output:
1 first line\\\
second line \N 2022-02-16 21:26:13
Pros:
\N is used for null data
Cons:
escaped linebreaks are not handled correctly
headers are missing
file writing permission issues
Workaround: fix linebreaks manually; add headers by hand or supply them in the script; use /tmp/ as output directory
C: mysqldump with --tab (performs SELECT INTO OUTFILE behind the scenes)
mysqldump --tab='/tmp/' --skip-tz-utc Export_test Sample
Output, pros and cons: same as export variant B
Something that should be noted: the output is only the same as B, if --skip-tz-utc is used; otherwise, timestamps will be converted to UTC, and will be off after importing the data.
Import
Something I didn't realize at first is that it's impossible to merely update existing rows directly with LOAD DATA or mysqlimport, although that's something many GUI tools appear to do and other people have attempted. For me as a beginner, this wasn't immediately clear from the MySQL docs. A workaround appears to be to create an empty table, import the data there, and then update the actual table of interest via a join. I also thought one could update individual columns this way, which again is not possible. If there are other ways to achieve this, I would really like to know.
As far as I could tell, there are two options, which do pretty much the same thing:
LOAD DATA INFILE:
mysql Export_test -e "SET FOREIGN_KEY_CHECKS = 0; LOAD DATA INFILE '/tmp/Sample.tsv' REPLACE INTO TABLE Sample IGNORE 1 LINES; SET FOREIGN_KEY_CHECKS = 1;"
mysqlimport (performs LOAD DATA INFILE behind the scenes):
mysqlimport --replace Export_test /tmp/Sample.tsv
Notice: if there are foreign key constraints like in this example, SET FOREIGN_KEY_CHECKS = 0; needs to be performed (as far as I can tell, mysqlimport can't be directly used in these cases). Also, IGNORE 1 LINES or --ignore-lines can be used to skip the first line if the input TSV contains a header. For mysqlimport, the name of the input file without extension must be the name of the table. Again, file reading permissions can be an issue, and /tmp/ is used to avoid that.
Are there ways to make this process more convenient? Like, are there some options I can use to avoid the manual workarounds, or are there ways to use TSV importing to UPDATE entries without creating a temporary table?
What I ended up doing was using SELECT ... INTO OUTFILE for exporting, adding a header manually and fixing the malformed lines by hand. After manipulating the data, I used LOAD DATA INFILE to update it. In another case, I exported with SELECT and stdout redirection, manipulated the data, and then wrote a script that created a file with a bunch of UPDATE ... WHERE statements containing the corresponding data. Then I ran the resulting .sql file against my database. Is the latter maybe the best option in this case?
Exporting and importing is indeed sort of clunky in MySQL.
One problem is that it introduces a race condition. What if you export data to work on it, then someone modifies the data in the database, then you import your modified data, overwriting your friend's recent changes?
If you say, "no one is allowed to change data until you re-import the data," that could cause an unacceptably long time where clients are blocked, if the table is large.
The trend is that people want the database to minimize downtime, and ideally to have no downtime at all. Advancements in database tools are generally made with this priority in mind, not so much to accommodate your workflow of taking the data out of MySQL for transformations.
Also, what if the database is large enough that the exported data itself becomes a problem? Where do you store a 500GB TSV file? Does pandas even work on a file that large?
What most people do is modify data while it remains in the database. They use in-place UPDATE statements to modify data. If they can't do this in one pass (there's a practical limit of 4GB for a binary log event, for example), then they UPDATE more modest-size subsets of rows, looping until they have transformed the data on all rows of a given table.
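The batching loop could look roughly like this. It is a sketch only: sqlite3 stands in for the MySQL connection so the example runs anywhere, the UPPER() transformation is a placeholder for whatever change is needed, and the batch size is arbitrary.

```python
import sqlite3


def update_in_batches(conn, batch_size):
    """Apply one UPDATE per id range, commit after each, and return the batch count."""
    (max_id,) = conn.execute("SELECT COALESCE(MAX(id), 0) FROM Sample").fetchone()
    start, batches = 0, 0
    while start < max_id:
        conn.execute(
            "UPDATE Sample SET text_data = UPPER(text_data) "
            "WHERE id > ? AND id <= ?",
            (start, start + batch_size),
        )
        conn.commit()  # keep each transaction small
        start += batch_size
        batches += 1
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sample (id INTEGER PRIMARY KEY, text_data TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO Sample (id, text_data) VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 11)],
)
conn.commit()

batches = update_in_batches(conn, batch_size=4)
rows = conn.execute("SELECT text_data FROM Sample ORDER BY id").fetchall()
print(batches, rows[0][0])
```

Walking the primary key in ranges keeps each statement and transaction small, so other clients are blocked only briefly.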

Reading Encrypted data with Datastage Tool

I need your help with the DataStage 11.7 tool. I am reading an AES-encrypted column from my source; the column type is nvarchar. The job runs successfully and exactly the same data is moved to my target database with the same column type.
The problem occurs when I query the data to check whether the source and target values are the same: the query does not return any rows. Visually the source and target values look identical, but the SQL statement returns nothing. The database is Vertica.
The column values contain alphanumeric and special characters, like �D�&7��x��d$�Q
I'm not at all sure this is even properly possible via DataStage: treating encrypted data as a varchar. Some DBs have internal keys that go with the data, which requires decrypting before extracting. I'm assuming that decrypting, transporting, landing and then re-encrypting is not an option.
But if I had to take a stab in the dark:
The very first thing I'd check is that the character set and collation are the same on both databases at the table level. A difference can result in blank results on the target side.
Also check that the NLS map in DataStage (the map for stages and the collation locale) is set accordingly. What that setting should be, I don't know, but making it the same in DataStage and the DBs would be ideal. You need to tell us what is already set in the DBs, and run tests. I'm not sure the DataStage default of ISO-8859-1 will work.
Please post your solution if you find one.

Ensure MySQL table charset & collation

Situation: there's a table that is managed by application A. Application A inserts and updates data in the table throughout the day. Once per week it DROPs the table, recreates it, and inserts all data.
Problem: application A creates the table as utf8. Application B, which relies on this table, requires it to be ascii_bin. I did not design either application, nor am I able to change their requirements.
What's needed: a way to ensure that the table is in ascii_bin. I considered writing a script and running it via cron, which would check the current charset and set it if needed. Is there a better way of achieving this?
Since ALTER is one of the statements that causes an implicit COMMIT, I do not believe it is possible to do it as part of a trigger after INSERT or UPDATE.
You can set ascii as the default character set, with ascii_bin as the default collation, for your database schema. Then all newly created tables will use them, unless the CREATE statement explicitly specifies another charset.
Refer to the MySQL documentation on how to set the default charset: http://dev.mysql.com/doc/refman/5.0/en/charset-database.html
See also SET NAMES at http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html
MySQL proxy might be a solution here.
You can rewrite the create statement when it goes through the proxy.
Alternatively, maybe you could remove privileges from Application A so it can't drop the table.
An ALTER statement that makes no changes is basically ignored. So if the conversion to ascii_bin is run multiple times, it won't cost the server much. Putting it in cron, or in an existing stored procedure that Application B calls, or something else clever, isn't so bad.
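A cron-driven check might be sketched as follows. The assumptions: a DB-API cursor from a MySQL driver that uses %s placeholders, hypothetical schema and table names, and a small stand-in cursor so the logic can be dry-run without a server.

```python
def ensure_ascii_bin(cursor, schema, table):
    """Convert the table to ascii_bin if needed; return True when an ALTER was issued."""
    cursor.execute(
        "SELECT TABLE_COLLATION FROM information_schema.TABLES "
        "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s",
        (schema, table),
    )
    row = cursor.fetchone()
    if row is None or row[0] == "ascii_bin":
        return False  # table missing (mid-recreate) or already correct
    # Identifiers cannot be bound as parameters; schema/table must come from trusted config.
    cursor.execute(
        f"ALTER TABLE `{schema}`.`{table}` "
        "CONVERT TO CHARACTER SET ascii COLLATE ascii_bin"
    )
    return True


class DryRunCursor:
    """Stand-in for a real DB-API cursor: reports a fixed collation and records statements."""

    def __init__(self, collation):
        self.collation = collation
        self.statements = []

    def execute(self, sql, params=None):
        self.statements.append(sql)

    def fetchone(self):
        return (self.collation,)


cur = DryRunCursor("utf8_general_ci")
changed = ensure_ascii_bin(cur, "mydb", "mytable")  # hypothetical names
print(changed, cur.statements[-1])
```

Checking the collation first means the ALTER only runs after application A has recreated the table. Note that CONVERT TO also converts the text columns, so it is only appropriate if the data really is plain ASCII.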

What is this SQL injection doing?

Long story short, through an old ASP site I run, someone found an unfiltered URL parameter and was able to run this query. I'm trying to figure out what it DOES, though...
The query should read:
select * from reserve where id = 345
The one that was run was:
select * from reserve where id = 345 and ascii(substring((select concat(user,0x3a,password,0x3a,host) from mysql.user limit 0,1),17,1))=53
I'm really not sure what this obtains. Any Input?
It might be probing whether or not the web application is accessing the database as root. Removing the ascii(substring()) portions returns the following when run as root:
mysql> select concat(user,0x3a,password,0x3a,host) from mysql.user limit 0,1;
+--------------------------------------+
| concat(user,0x3a,password,0x3a,host) |
+--------------------------------------+
| root:<rootpw-hash>:localhost         |
+--------------------------------------+
Following a successful probe, they may then attempt to retrieve the contents of mysql.user from which they can start cracking passwords against rainbow tables.
The second part of the WHERE condition is really strange: it looks up MySQL credentials and processes them as follows:
concat(user,0x3a,password,0x3a,host) produces something like 'someUser:hisPass:localhost' (0x3a is a colon)
substring(..., 17, 1) cuts a single character, the 17th, out of that string
ascii() converts that character to its ASCII code (you might know it from other languages as ord())
the resulting code is compared to the integer 53, which is the character '5'
I suppose that the first part of the WHERE clause (id = 345) matches as usual, while the second one is very specific, so the entire query will probably return an empty result most of the time.
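To make the steps concrete, here is a small sketch that mimics the ascii(substring(...)) part of the condition. The row value is made up so that its 17th character happens to be '5'; a real row would hold an actual user name and password hash.

```python
def mysql_ascii_substring(s: str, pos: int) -> int:
    """Mimic ascii(substring(s, pos, 1)): 1-based position, 0 when out of range."""
    ch = s[pos - 1:pos]
    return ord(ch) if ch else 0


# Hypothetical row; a real one would hold an actual user name and password hash.
row = "root:*" + "0" * 10 + "5" + "F" * 30 + ":localhost"

code = mysql_ascii_substring(row, 17)
print(code, chr(code), code == 53)
```

The injected condition is true only when the character at that position has code 53, so whether the page shows the normal result tells the attacker if the guess was right.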
The query is seemingly one of a set:
by changing the character code and the substring start position, you can find out all usernames and the corresponding password hashes (when the page renders as expected, you have a character match)
it also reveals that the current user has access to the mysql schema.
An SQL injection exploit does not necessarily output the query result to the attacker's screen. Often the result is only an error or no error, or the injection causes a delay that the attacker can measure. In that way the attacker can obtain 1 bit of information per request.
By sending lots of requests, iterating over string positions, and doing a binary search on the characters (or, as in this case, a linear search, which may indicate that the attacker does not really understand what he is doing, but he will get there eventually), he will be able to find all the characters in the MySQL root user's password hash (which can possibly be brute-forced offline).
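The difference between the two search strategies can be shown with a local simulation. This sketch involves no database or network: a hypothetical secret string plays the role of the value being probed, and a counter stands in for the number of HTTP requests.

```python
def make_oracle(secret: str):
    """Simulate the yes/no signal an attacker reads off the page, counting requests."""
    stats = {"requests": 0}

    def page_renders(pos: int, op: str, value: int) -> bool:
        stats["requests"] += 1
        code = ord(secret[pos - 1]) if 1 <= pos <= len(secret) else 0
        return code == value if op == "=" else code > value

    return page_renders, stats


def linear_search(oracle, pos):
    # One equality test per candidate code, like the "=53" probe in the question.
    for value in range(128):
        if oracle(pos, "=", value):
            return value
    return None


def binary_search(oracle, pos):
    # Halve the candidate range 0..127 with "greater than" tests.
    lo, hi = 0, 127
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(pos, ">", mid):
            lo = mid + 1
        else:
            hi = mid
    return lo


secret = "root:*" + "0" * 10 + "5"  # hypothetical value, 17th character is '5'

oracle, linear_stats = make_oracle(secret)
linear_result = linear_search(oracle, 17)

oracle, binary_stats = make_oracle(secret)
binary_result = binary_search(oracle, 17)

print(linear_result, linear_stats["requests"])
print(binary_result, binary_stats["requests"])
```

Both strategies recover the same character; the binary search needs 7 yes/no answers per 7-bit character, while the linear probe needs one request per candidate code up to the match.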
The SQL is trying to read user data from the MySQL user table, which typically contains the list of users and hosts that are allowed to access a given MySQL server.
It looks to me like the perp is trying to trick MySQL into dumping the contents of the user table so they can record the password hashes and crack them offline to find valid logins.
If your web application uses a login that allows access to the mysql.user table, then this is a serious security flaw. If it uses a login that is only granted permission to the tables required for the app, then no information will be obtainable this way.
Security tip: when setting up ANY kind of database, it's vitally important that the application using it does so with a login/access role that grants it ONLY what it needs.
If your application only ever needs to read data and never modify it, then it should never have any permissions other than to read. You always need to double-check this, because most database systems will by default create user roles for a given database with full read, create and modify privileges.
Always create a specific user just for that database or collection of tables, and always give that user the absolute minimum that's required. If your app does then get hacked through an injection like this one, the most they're going to get access to is that one specific database.

MySQL Injection - Use SELECT query to UPDATE/DELETE

I've got one easy question: say there is a site with a query like:
SELECT id, name, message FROM messages WHERE id = $_GET['q'].
Is there any way to get something updated/deleted in the database (MySQL)? Until now I've never seen an injection that was able to delete/update using a SELECT query, so, is it even possible?
Before directly answering the question, it's worth noting that even if all an attacker can do is read data that he shouldn't be able to, that's usually still really bad. Consider that by using JOINs and SELECTing from system tables (like mysql.innodb_table_stats), an attacker who starts with a SELECT injection and no other knowledge of your database can map your schema and then exfiltrate the entirety of the data that you have in MySQL. For the vast majority of databases and applications, that already represents a catastrophic security hole.
But to answer the question directly: there are a few ways that I know of by which injection into a MySQL SELECT can be used to modify data. Fortunately, they all require reasonably unusual circumstances to be possible. All example injections below are given relative to the example injectable query from the question:
SELECT id, name, message FROM messages WHERE id = $_GET['q']
1. "Stacked" or "batched" queries.
The classic injection technique of just putting an entire other statement after the one being injected into. As suggested in another answer here, you could set $_GET['q'] to 1; DELETE FROM users; -- so that the query forms two statements which get executed consecutively, the second of which deletes everything in the users table.
In mitigation
Most MySQL connectors - notably including PHP's (deprecated) mysql_* and (non-deprecated) mysqli_* functions - don't support stacked or batched queries at all, so this kind of attack just plain doesn't work. However, some do - notably including PHP's PDO connector (although the support can be disabled to increase security).
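The difference between single-statement and multi-statement APIs can be seen with Python's sqlite3 module, used here only as an analogy for how MySQL connectors differ; the tables mirror the ones named in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE messages (id INTEGER, name TEXT, message TEXT);"
    "CREATE TABLE users (id INTEGER);"
    "INSERT INTO messages VALUES (1, 'a', 'hello');"
    "INSERT INTO users VALUES (1);"
)

payload = "1; DELETE FROM users; --"
query = "SELECT id, name, message FROM messages WHERE id = " + payload


def count_users():
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


# Single-statement API: the stacked payload is rejected before anything runs.
try:
    conn.execute(query)
    rejected = False
except (sqlite3.Warning, sqlite3.ProgrammingError):
    rejected = True
users_after_execute = count_users()

# Multi-statement API: both statements run and the users table is emptied.
conn.executescript(query)
users_after_script = count_users()

print(rejected, users_after_execute, users_after_script)
```

The same concatenated string is harmless through one call and destructive through the other, which is why the connector in use decides whether this attack works.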
2. Exploiting user-defined functions
Functions can be called from a SELECT, and can alter data. If a data-altering function has been created in the database, you could make the SELECT call it, for instance by passing 0 OR SOME_FUNCTION_NAME() as the value of $_GET['q'].
In mitigation
Most databases don't contain any user-defined functions - let alone data-altering ones - and so offer no opportunity at all to perform this sort of exploit.
3. Writing to files
As described in Muhaimin Dzulfakar's (somewhat presumptuously named) paper Advanced MySQL Exploitation, you can use INTO OUTFILE or INTO DUMPFILE clauses on a MySQL select to dump the result into a file. Since, by using a UNION, any arbitrary result can be SELECTed, this allows writing new files with arbitrary content at any location that the user running mysqld can access. Conceivably this can be exploited not merely to modify data in the MySQL database, but to get shell access to the server on which it is running - for instance, by writing a PHP script to the webroot and then making a request to it, if the MySQL server is co-hosted with a PHP server.
In mitigation
Lots of factors reduce the practical exploitability of this otherwise impressive-sounding attack:
MySQL will never let you use INTO OUTFILE or INTO DUMPFILE to overwrite an existing file, nor write to a folder that doesn't exist. This prevents attacks like creating a .ssh folder with a private key in the mysql user's home directory and then SSHing in, or overwriting the mysqld binary itself with a malicious version and waiting for a server restart.
Any halfway decent installation package will set up a special user (typically named mysql) to run mysqld, and give that user only very limited permissions. As such, it shouldn't be able to write to most locations on the file system - and certainly shouldn't ordinarily be able to do things like write to a web application's webroot.
Modern installations of MySQL come with --secure-file-priv set by default, preventing MySQL from writing to anywhere other than a designated data import/export directory and thereby rendering this attack almost completely impotent... unless the owner of the server has deliberately disabled it. Fortunately, nobody would ever just completely disable a security feature like that since that would obviously be - oh wait never mind.
4. Calling the sys_exec() function from lib_mysqludf_sys to run arbitrary shell commands
There's a MySQL extension called lib_mysqludf_sys that - judging from its stars on GitHub and a quick Stack Overflow search - has at least a few hundred users. It adds a function called sys_exec that runs shell commands. As noted in #2, functions can be called from within a SELECT; the implications are hopefully obvious. To quote from the source, this function "can be a security hazard".
In mitigation
Most systems don't have this extension installed.
If you use mysql_query, which doesn't support multiple queries, you cannot directly add a DELETE/UPDATE/INSERT, but it's still possible to modify data under some circumstances. For example, let's say you have the following function:
DELIMITER //
CREATE DEFINER=`root`@`localhost` FUNCTION `testP`()
RETURNS int(11)
LANGUAGE SQL
NOT DETERMINISTIC
MODIFIES SQL DATA
SQL SECURITY DEFINER
COMMENT ''
BEGIN
    -- side effect: empties test2 every time the function is called
    DELETE FROM test2;
    RETURN 1;
END //
DELIMITER ;
Now you can call this function in SELECT :
SELECT id, name, message FROM messages WHERE id = NULL OR testP()
(id = NULL always evaluates to NULL, which is treated as false, so testP() always gets executed.)
It depends on the DBMS connector you are using. Most of the time your scenario should not be possible, but under certain circumstances it could work. For further details, take a look at chapters 4 and 5 of the Black Hat paper Advanced MySQL Exploitation.
Yes it's possible.
$_GET['q'] would hold 1; DELETE FROM users; --
SELECT id, name, message FROM messages WHERE id = 1; DELETE FROM users; -- whatever here');