How to insert a binary stream into MySQL without ASCII conversion? - mysql

I have an application written in C on a Debian distro (using libmysqlclient). My application inserts a huge number of rows into the database (around 30,000 rows/sec, row length ~150 B). Insertion takes a lot of CPU, because the client (my app) must convert the binary stream (integers, blobs) into an ASCII representation inside a valid SQL INSERT statement, and MySQL must then convert these values back into a binary representation and store them in the file.
My question: is there any way to run an SQL insert without this conversion? I had been using a classical sprintf (implemented in libc). Now I'm using an optimized version without calling sprintf (too many function calls). I thought I would use a MySQL prepared statement, but it seems a prepared statement also converts the '?' variables to ASCII.
Jan

See mysql_real_query().
22.8.7.56 mysql_real_query()
int mysql_real_query(MYSQL *mysql, const char *stmt_str, unsigned long length)
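Apparently, something like the below should work. mysql_real_query() takes an explicit length, so the statement may contain NUL bytes; quotes and backslashes inside the literal still have to be escaped for SQL (mysql_real_escape_string() does that for arbitrary buffers).
/* \000 is a raw NUL byte; the doubled backslashes make the quotes reach the server escaped */
#define DATA "insert into atable (acolumn) values('zero\000quotes\\\"\\\'etc ...')"
if (mysql_real_query(db, DATA, sizeof DATA - 1)) { /* handle the error */ }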

Related

How to store a row index for very large (csv) files?

I am working with large CSV files (>> 10^6 lines) and need a row index for some operations. I need to do comparisons between two versions of the files to identify deletions, so I thought it would be easiest to include a row index. I guess that the number of lines would quickly render traditional integers inefficient. I'm averse to the idea of having a column containing, let's say, 634567775577 in plain text as the row index (followed by the actual data row). Are there any best-practice suggestions for this scenario?
The resulting files have to remain plain text, so serialisation / sqlite is not an option.
At the moment, I'm considering an index based on the actual row data (for example concatenating the row data, converting to base64 or the like), but would that be more reasonable than a plain integer? There should be no duplicate rows within each file, so I guess this could be one way.
Cheers, Sacha
Ps: I heavily modified the initial question for clarification
you can use regular numbers.
Python is not afraid of large numbers :) (well, to the order of magnitude you described...)
just open a python shell and type 10**999 and see that it doesn't overflow or anything.
In Python, there's no practical bit limit for integers. Python 2 technically has two types: an int is a fixed-size machine integer (at least 32 bits), and a long has arbitrary precision. But the promotion from int to long happens implicitly when a value gets too large.
Python 3 just has one type, and it only cares about memory space.
So there's no real reason why you can't use an integer if you really want to add an index.
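As a minimal sketch of the plain-integer approach (the function name find_deleted_rows and the sample rows are made up for illustration), the following numbers the rows of the old file with enumerate and reports those missing from the new one. It assumes, as the question states, that no row is duplicated within a file.

```python
def find_deleted_rows(old_lines, new_lines):
    """Return (row_index, line) pairs for rows in old_lines that are missing from new_lines."""
    surviving = set(new_lines)  # rows are unique per file, so a set loses nothing
    return [(index, line)
            for index, line in enumerate(old_lines, start=1)
            if line not in surviving]


old_version = ["a,1", "b,2", "c,3", "d,4"]
new_version = ["a,1", "c,3"]
print(find_deleted_rows(old_version, new_version))

# A row index far beyond 64 bits is still an ordinary int.
huge_index = 634567775577 * 10**20
print(huge_index + 1)
```

The set lookup keeps the comparison linear in the number of rows; the index itself is only needed for reporting where a deleted row used to be.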
Python's standard library includes SQLite, a self-contained, one-file-fits-everything DBMS which, contrary to common perception, can be quite performant. If the records are to be consulted by a single application with no concurrency, it compares well to specialized DBMSs that require a separate daemon.
So, essentially, you can dump your CSV to a SQLite database and create the indices you need - even on all four columns if need be.
Here is a template script you could customize to create such a DB.
I guessed the "1000" for the number of inserts at a time, but it may
not be optimal - try tweaking it if inserting is too slow.
import csv
import sqlite3
from itertools import islice

inserts_at_time = 1000

def create_and_populate_db(dbfilename, csvfilename):
    db = sqlite3.connect(dbfilename)
    db.execute("CREATE TABLE data (col1, col2, col3, col4)")
    for col_name in "col1 col2 col3 col4".split():
        db.execute(f"CREATE INDEX idx_{col_name} ON data ({col_name})")
    with open(csvfilename, newline="") as in_file:
        reader = csv.reader(in_file)
        next(reader)  # skips header row
        total = 0
        while True:
            # take up to inserts_at_time rows from the reader per batch
            lines = list(islice(reader, inserts_at_time))
            if not lines:
                break
            db.executemany("INSERT INTO data VALUES (?,?,?,?)", lines)
            total += len(lines)
            print(f"\rInserted {len(lines)} lines - total {total}", end="")
    db.commit()  # without this the inserted rows are lost when the connection closes
    db.close()

SQL string literal hexadecimal key to binary and back

After an extensive search I am resorting to Stack Overflow's wisdom to help me.
Problem:
I have a database table that should effectively store values of the format (UserKey, data0, data1, ..) where the UserKey is to be handled as primary key but at least as an index. The UserKey itself (externally defined) is a string of 32 characters representing a checksum, which happens to be (a very big) hexadecimal number, i.e. it looks like this UserKey = "000000003abc4f6e000000003abc4f6e".
Now I can certainly store this UserKey in a char(32) field, but I feel this is mighty inefficient, as I store a series of in principle arbitrary characters, i.e. I reserve space for more information per character than the 4 bits I need to store the hexadecimal characters (0..9, A-F).
So my thought was to convert this string literal into the hex number it really represents, and store that. But this number (32 * 4 bits = 16 bytes) is much too big to store or handle, as SQL only handles BIGINTs of 8 bytes.
My second thought was to convert this into a BINARY(16) representation, which should be compact and efficient concerning memory. However, I do not know how to efficiently convert between these two formats, as SQL also internally only handles numbers up to a maximum of 8 bytes.
Maybe there is a way to convert this string to binary block by block and stitch the binary together somehow, in the way of:
UserKey == concat( stringblock1, stringblock2, ..)
UserKey_binary = concat( toBinary( stringblock1 ), toBinary( stringblock2 ), ..)
So my question is: is there any such mechanism foreseen in SQL that would solve this for me? What would a custom solution look like? (I find it hard to believe that I should be the first to encounter such a problem, as it has become quite common to use ridiculously long hash keys in many applications.)
Also, the UserKey_binary should then act as the relational key for the table, so I hope for a bit of speed from this more compact representation, as it needs to determine the difference on a minimal number of bits. Additionally, I would like to do any conversion on the server side if possible, so that user scripts do not have to be altered (the user side should, if possible, still transmit a string literal, not [partially] converted values, in the insert statement).
Contrary to my previous statement, it seems that MySQL's UNHEX() function converts a string block by block and then concatenates, much like I described above, so the method also works for hex literal values that are bigger than BIGINT's 8-byte limit. Here is an example table that illustrates this:
CREATE TABLE `testdb`.`tab` (
`hexcol_binary` BINARY(16) GENERATED ALWAYS AS (UNHEX(charcol)) STORED,
`charcol` CHAR(32) NOT NULL,
PRIMARY KEY (`hexcol_binary`));
The primary key is a generated column, so that updates to charcol are the designated way of interacting with the table with string literals from the outside:
REPLACE into tab (charcol) VALUES ('1010202030304040A0A0B0B0C0C0D0D0');
SELECT HEX(hexcol_binary) as HEXstring, tab.* FROM tab;
As seen, building keys and indexes on hexcol_binary works as intended.
To verify the speedup, take
ALTER TABLE `testdb`.`tab`
ADD INDEX `charkey` (`charcol` ASC);
EXPLAIN SELECT * FROM tab WHERE hexcol_binary = UNHEX('1010202030304040A0A0B0B0C0C0D0D0'); # key_len 16
EXPLAIN SELECT * FROM tab WHERE charcol = '1010202030304040A0A0B0B0C0C0D0D0'; # key_len 97
The lookup on the hexcol_binary column performs much better, especially if it is additionally made unique.
Note: the hex conversion does not care whether the hex characters A through F are capitalized or not; the charcol, however, will be very sensitive to this.
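For comparison, here is the same round trip on the client side as a small Python sketch. The helper names are invented for this example; bytes.fromhex and bytes.hex are the standard-library counterparts of UNHEX() and HEX(), and like UNHEX() they do not care about the case of A through F.

```python
def hex_key_to_bytes(user_key):
    """Convert a 32-character hex string to its 16-byte form (like UNHEX)."""
    if len(user_key) != 32:
        raise ValueError("expected 32 hexadecimal characters")
    return bytes.fromhex(user_key)  # raises ValueError on non-hex input


def bytes_to_hex_key(raw):
    """Convert raw bytes back to an upper-case hex string (like HEX)."""
    return raw.hex().upper()


key = "000000003abc4f6e000000003abc4f6e"
raw = hex_key_to_bytes(key)
print(len(raw))
print(bytes_to_hex_key(raw))
```

The 16-byte value is what would go into a BINARY(16) column if the conversion were done by the client instead of by a generated column.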

How can I decode base32 to string in mysql

I'm looking for a way to decode a string encrypted in base32 back to the original string in MySQL.
I know there is a SP to do this with base64 but cannot find anything for base32.
Is it possible? is there a stored procedure I can use somewhere?
What are ways to implement this?
Thanks!
Base64 and Base32 are not encryption; they are just encodings. MySQL does not have a native function to encode or decode Base32 strings as it has for Base64 (FROM_BASE64 and TO_BASE64).
As an alternative you can try the CONV mathematical function (depending on the content stored as Base32). Let's say you have UUID numbers stored as DECIMAL and need to show them in base 32 or vice versa:
SELECT uuid, conv(uuid, 10, 32) uuid_b32, conv(conv(uuid, 10, 32), 32, 10)
FROM database.table;
The answer above is for number conversions between distinct bases. If that is not the case, as when you have a binary file stored in a blob column, then you'll probably need to do the encoding/decoding outside MySQL. You may use MIME::Base32 or the equivalent module in your preferred language. Either way, you'll need to know whether the field holds text or binary data encoded in Base32.
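A minimal sketch of the "outside MySQL" route in Python, assuming the column holds UTF-8 text encoded with the RFC 4648 Base32 alphabet (the function name is made up for the example). The last line shows the number-conversion case, which uses a different digit set.

```python
import base64


def decode_base32_text(encoded, charset="utf-8"):
    """Decode an RFC 4648 Base32 string into text."""
    return base64.b32decode(encoded).decode(charset)


print(decode_base32_text("NBSWY3DP"))  # the Base32 encoding of "hello"

# CONV-style base-32 numbers use the digits 0-9 and A-V instead,
# which is a different alphabet from RFC 4648 Base32 (A-Z, 2-7).
print(int("V", 32))
```

If the field holds binary data rather than text, drop the .decode(charset) step and keep the bytes.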

Convert Tektronix's RIBinary data in TCL

I am pulling data from a Tektronix oscilloscope in Tektronix' RIBinary format using a TCL script, and then within the script I need to convert that to a decimal value.
I have done very little with binary conversions in the first place, but to add to my frustration the documentation on this binary format is also very vague in my opinion. Anyway, here's my current code:
proc ::Scope::CaptureWaveform {VisaAlias Channel} {
    # Apply scope settings
    ::VISA::Write $VisaAlias "*WAI"
    ::VISA::Write $VisaAlias "DATa:STARt 1"
    ::VISA::Write $VisaAlias "DATa:STOP 4000"
    ::VISA::Write $VisaAlias "DATa:ENCdg RIBinary"
    ::VISA::Write $VisaAlias "DATa:SOUrce $Channel"
    # Download waveform
    set RIBinaryWaveform [::VISA::Query $VisaAlias "CURVe?"]
    # Parse out leading label from scope output
    set RIBinaryWaveform [string range $RIBinaryWaveform 11 end]
    # Convert binary data to a binary string usable by TCL
    binary scan $RIBinaryWaveform "I*" TCLBinaryWaveform
    set TCLBinaryWaveform
    # Convert binary data to list
}
Now, this code pulls the following data from the machine:
-1064723993 -486674282 50109321 -6337556 70678 8459972 143470359 1046714383 1082560884 1042711231 1074910212 1057300801 1061457453 1079313832 1066305613 1059935120 1068139252 1066053580 1065228329 1062213553
And this is what the machine pulls when I just take regular ASCII data (i.e. what the above data should look like after the conversion):
-1064723968 -486674272 50109320 -6337556 70678 8459972 143470352 1046714368 1082560896 1042711232 1074910208 1057300800 1061457472 1079313792 1066305600 1059935104 1068139264 1066053568 1065228352 1062213568
Finally, here is a reference to the RIBinary specification from Tektronix since I don't think it is a standard data type:
http://www.tek.com/support/faqs/how-binary-data-represented-tektronix-oscilloscopes
I've been looking for a while now on the Tektronix website for more information on converting the data and the above URL is all I've been able to find, but I'll comment or edit this post if I find any more information that might be useful.
Updates
Answers don't necessarily have to be in TCL. If anyone can help me logically work through this on a high level I can hash out the TCL details (this I think would be more helpful to others as well)
The reason I need to transfer the data in binary and then convert it afterwards is for the purpose of optimization. Due to this I can't have the device perform the conversion before the transfer as it will slow down the process.
I updated my code some and now my results are maddeningly close to the actual results. I assume it may have something to do with the commas that are in the data originally.
Below are now examples of the raw data sent from the device without any of my parsing.
On a suggestion from @kostix, I made a second script with code he gave me, modified to fit my data set. It can be seen below; however, the results are exactly the same as with my code above.
ASCIi:
:CURVE -1064723968,-486674272,50109320,-6337556,70678,8459972,143470352,1046714368,1082560896,1042711232,1074910208,1057300800,1061457472,1079313792,1066305600,1059935104,1068139264,1066053568,1065228352,1062213568
RIBinary:
:CURVE #280ÀçâýðüÿKì
Note on RIBinary - ":CURVE #280" is all part of the header that I need to parse out, but the #280 part of it can vary depending on the data I'm collecting. Here's some more info from Tektronix on what the #280 means:
block is the waveform data in binary format. The waveform is formatted
as: #<x><yyy><data> where <x> is the number of y bytes. For
example, if <yyy> = 500, then <x> = 3. <yyy> is the number of bytes to
transfer including checksum.
So, for my current data set x = 2 and yyy = 80. I am just really unfamiliar with converting binary data, so I'm not sure what to do programmatically to deal with the block format.
On a suggestion from @kostix I made a second script with code he gave me, which I modified to fit my data set:
set RIBinaryWaveform [::VISA::Query ${VisaAlias} "CURVe?"]
# first 8 bytes are the ":CURVE #" label, the 9th is the digit count
binary scan $RIBinaryWaveform a8a curv nbytes
encoding convertfrom ascii ${curv}
scan $nbytes %u n
set n
set headerlen [expr {$n + 9}]
# @9 moves the scan cursor to byte offset 9
binary scan $RIBinaryWaveform @9a$n nbytes
scan $nbytes %u n
set n
set numints [expr {$n / 4}]
binary scan $RIBinaryWaveform @${headerlen}I${numints} data
set data
The output of this code is the same as the code I provided above.
According to the documentation you link to, RIBinary is signed big-endian. Thus, you convert the binary data to integers with binary scan $data "I*" someVar (I* means “as many big-endian 4-byte integers as you can”). You use the same conversion with RPBinary (if you've got that) but you then need to chop each value to the positive 32-bit integer range by doing & 0xFFFFFFFF (assuming at least Tcl 8.5). For FPBinary, use R* (requires 8.5). SRIBinary, SRPBinary and SFPBinary are the little-endian versions, for which you use lower-case format characters.
Getting conversions correct can take some experimentation.
I have no experience with this stuff, but I like googling. Here are my findings.
This document, in the section titled "Formatted I/O Operations" tells that the viQueryf() standard C API function combines viPrintf() (writing to a device) with viScanf() (reading from a device), and examples include calls like viQueryf (io, ":CURV?\n", "%#b", &totalPoints, rdBuffer); (see the section «IEEE-488.2 Binary Data—"%b"»), where the third argument to the function specifies the desired format.
The VISA::Query procedure from your Tcl library pretty much resembles that viQueryf() in my eyes, so I'd expect it to accept the third (optional) argument which specifies the format you want the data to be in.
If there's nothing like it, let's look at your ASCII data. Your FAQ entry and the document I found both specify that the opaque data might come in the form of a series of integers of different size and endianness. The "RIBinary" format states it should be big-endian signed integers.
The binary scan Tcl command is able to scan 16-bit and 32-bit big-endian integers from a byte stream — use the S* and I* formats, correspondingly.
Your ASCII data clearly looks like 32-bit integers, so I'd try scanning using I*.
Also see this doc — it appears to have much in common with the PDF guide I linked above, but might be handy anyway.
TL;DR
Try studying your API to find a way to explicitly tell the device the data format you want. This might produce a more robust solution in the case the device might be somehow reconfigured externally to change its default data format effectively pulling the rug under the feet of your code which relies on certain (guessed) default.
Try interpreting the data as outlined above and see if the interpretation looks sensible.
P.S.
This might mean nothing at all, but I failed to find any example which has "e" between the "CURV" and the "?" in the calls to viQueryf().
Update (2013-01-17, in light of the new discoveries about the data format): to binary scan data of varying types, you might employ two techniques:
binary scan accepts as many specifiers in a row as you like; they are processed from left to right as binary scan reads the supplied data.
You can do multiple runs of binary scan over a chunk of your binary data, either by cutting pieces out of this chunk (string manipulation Tcl commands understand they're operating on a byte array and behave accordingly) or by using the @offset term in the binary scan format string to make it start scanning from the specified offset.
Another technique worth employing here is that you'd better first train yourself on a toy example. This is best done in an interactive Tcl shell — tkcon is a best bet but plain tclsh is also OK, especially if called via rlwrap (POSIX systems only).
For instance, you could create fake data for yourself like this:
% set b [encoding convertto ascii ":CURVE #224"]
:CURVE #224
% append b [binary format S* [list 0 1 2 -3 4 -5 6 7 -8 9 10 -11]]
:CURVE #224............
Here we first created a byte array containing the header, then created another byte array containing twelve 16-bit integers packed MSB first, and appended it to the first array, essentially creating a data block like the one our device is supposed to return (well, with fewer integers than the device returns). encoding convertto takes the name of a character encoding and a string and produces a byte array of that string converted to the specified encoding. binary format is told to consume a list of arbitrary size (* in the format string) and interpret it as a list of 16-bit integers to be packed in big-endian format (the S format character).
Now we can scan it back like this:
% binary scan $b a8a curv nbytes
2
% encoding convertfrom ascii $curv
:CURVE #
% scan $nbytes %u n
1
% set n
2
% set headerlen [expr {$n + 9}]
11
% binary scan $b @9a$n nbytes
1
% scan $nbytes %u n
1
% set n
24
% set numints [expr {$n / 2}]
12
% binary scan $b @${headerlen}S${numints} data
1
% set data
0 1 2 -3 4 -5 6 7 -8 9 10 -11
Here we proceeded like this:
Interpret the header:
Read the first eight bytes of the data as ASCII characters (a8) — this should read our :CURVE # prefix. We convert the header prefix from the packed ASCII form to the Tcl's internal string encoding using encoding convertfrom.
Read the next byte (a) which is then interpreted as the length, in bytes, of the next field, using the scan command.
We then calculate the length of the header read so far to use it later. This value is saved to the "headerlen" variable. The length of the header amounts to the 9 fixed bytes plus a variable number of bytes (2 in our case) specifying the length of the following data.
Read the next field which will be interpreted as the "number of data bytes" value.
To do this, we offset the scanner by 9 (the length of ":CURVE #2") and read as many ASCII bytes as obtained in the previous step, so we use @9a$n for the format: $n just substitutes the value of the variable named "n", which is 2 in our case. Then we scan the obtained value and finally get the number of raw data bytes that follow.
Since we will read 16-bit integers, not bytes, we divide this number by 2 and store the result in the "numints" variable.
Read the data. To do this, we have to offset the scanner by the length of the header. We use @${headerlen}S${numints} for the format string. Tcl expands those ${varname} before passing the string to binary scan, so the actual string in our case will be @11S12, which means "move to byte offset 11, then scan 12 16-bit big-endian integers".
binary scan puts a list of integers to the variable which name is passed, so no additional decoding of those integers is needed.
Note that in the real program you should probably do certain sanity checks:
* After the first step check that the static part of the header is really ":CURVE #".
* Check the return value of binary scan and scan after each invocation and verify that it equals the number of variables passed to the command (which means the command was able to parse the data).
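Since the question says answers need not be in Tcl, here are the same header-parsing steps sketched in Python. The function name parse_curve_block and its sample_size parameter are invented for this example; it assumes the "#<x><yyy><data>" layout quoted from the manual and ignores any checksum.

```python
import struct


def parse_curve_block(raw, sample_size=4):
    """Decode a '#<x><yyy><data>' block of big-endian signed integers."""
    marker = raw.index(b"#")  # skips a ":CURVE " label if present
    ndigits = int(raw[marker + 1:marker + 2].decode("ascii"))
    data_start = marker + 2 + ndigits
    nbytes = int(raw[marker + 2:data_start].decode("ascii"))
    payload = raw[data_start:data_start + nbytes]
    if len(payload) != nbytes:
        raise ValueError("block is shorter than its header claims")
    code = {2: "h", 4: "i"}[sample_size]  # 16-bit or 32-bit samples
    count = nbytes // sample_size
    return list(struct.unpack(f">{count}{code}", payload[:count * sample_size]))


# Rebuild the toy block from the Tcl session: header plus twelve 16-bit values.
values = [0, 1, 2, -3, 4, -5, 6, 7, -8, 9, 10, -11]
toy = b":CURVE #224" + struct.pack(">12h", *values)
print(parse_curve_block(toy, sample_size=2))
```

The ">" prefix in the struct format plays the role of the upper-case S and I specifiers in binary scan: it selects big-endian byte order.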
One more insight. The manual you cited says:
<yyy> is the number of bytes to transfer including checksum.
so it's quite possible that not all of those data bytes represent measures, but some of them represent the checksum. I don't know what format (and hence length) and algorithm and position of this checksum is. But if the data does indeed include a checksum, you can't interpret it all using S*. Instead, you will probably take another approach:
Extract the measurement data using string range and save it to a variable.
binary scan the checksum field.
Calculate the checksum on the data obtained on the first step, verify it.
Use binary scan on the extracted data to get back your measurements.
Checksumming procedures are available in tcllib.
# Download waveform
set RIBinaryWaveform [::VISA::Query ${VisaAlias} "CURVe?"]
# Extract block format data: character 1 is the number of digits in the byte count
set ResultCount [expr {[string range ${RIBinaryWaveform} 2 [expr {[string index ${RIBinaryWaveform} 1] + 1}]] / 4}]
# Parse out leading label from Tektronix block format
set RIBinaryWaveform [string range ${RIBinaryWaveform} [expr {[string index ${RIBinaryWaveform} 1] + 2}] end]
# Convert binary data to integer values
binary scan ${RIBinaryWaveform} "I${ResultCount}" Waveform
set Waveform
Okay, the code above does the magic trick. This is very similar to all the things discussed on this page, but I figured I needed to clear up the confusion about the numbers from the binary conversion being different from the numbers received in ASCII.
After troubleshooting with a Tektronix application specialist we discovered that the data I had been receiving after the binary conversion (the numbers that were off by a few digits) were actually the true values captured by the scope.
The ASCII values are wrong because of the binary-to-ASCII conversion done by the instrument; the scope then passes those incorrect values to TCL.
So, we had it right a few days ago. The instrument was just throwing me for a loop.
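The thread does not say what exactly goes wrong inside the instrument's ASCII conversion. One explanation that is consistent with the numbers posted in the question, and that is my inference rather than anything Tektronix confirmed here, is that the values pass through a single-precision float on the way to text. The Python sketch below (through_float32 is a made-up helper) rounds the binary readings that way and compares them with the ASCII readings.

```python
import struct


def through_float32(value):
    """Round an integer to the nearest single-precision float and back."""
    packed = struct.pack(">f", float(value))
    return int(struct.unpack(">f", packed)[0])


binary_values = [-1064723993, -486674282, 50109321, -6337556, 70678]
ascii_values = [-1064723968, -486674272, 50109320, -6337556, 70678]
print([through_float32(v) for v in binary_values] == ascii_values)
```

Values below 2^24 in magnitude fit a float's 24-bit significand exactly, which would explain why readings such as -6337556 and 70678 are identical in both formats.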

AES Encryption in Oracle and MySQL are giving different results

I need to compare data between an Oracle database and a MySQL database.
In Oracle, the data is first encrypted with the AES-128 algorithm, and then hashed. Which means it is not possible to recover the data and decrypt it.
The same data is available in MySQL, and in plain text. So to compare the data, I tried encrypting and then hashing the MySQL data while following the same steps done in Oracle.
After lots of tries, I finally found out that the aes_encrypt in MySQL returns different results than the one in Oracle.
-- ORACLE:
-- First the key is hashed with md5 to make it a 128bit key:
raw_key := DBMS_CRYPTO.Hash (UTL_I18N.STRING_TO_RAW ('test_key', 'AL32UTF8'), DBMS_CRYPTO.HASH_MD5);
-- Initialize the encrypted result
encryption_type:= DBMS_CRYPTO.ENCRYPT_AES128 + DBMS_CRYPTO.CHAIN_CBC + DBMS_CRYPTO.PAD_PKCS5;
-- Then the data is being encrypted with AES:
encrypted_result := DBMS_CRYPTO.ENCRYPT(UTL_I18N.STRING_TO_RAW('test-data', 'AL32UTF8'), encryption_type, raw_key);
The result for the oracle code will be: 8FCA326C25C8908446D28884394F2E22
-- MySQL
-- While doing the same with MySQL, I have tried the following:
SELECT hex(aes_encrypt('test-data', MD5('test_key')));
The result for the MySQL code will be: DC7ACAC07F04BBE0ECEC6B6934CF79FE
Am I missing something? Or are the encryption methods between different languages not the same?
UPDATE:
According to the comments below, I believe I should mention the fact that the result of DBMS_CRYPTO.Hash in Oracle is the same as the result returned by the MD5 function in MySQL.
Also, using CBC or ECB in Oracle gives the same result, since the IV isn't being passed to the function; the default IV, which is NULL, is used.
BOUNTY:
Whoever can verify my last comment, and confirm that using the same padding on both sides yields the same results, gets the bounty:
@rossum The default padding in MySQL is PKCS7, mmm... Oh.. In Oracle
it's using PKCS5, can't believe I didn't notice that. Thanks. (Btw
Oracle doesn't have the PAD_PKCS7 option, not in 11g at least)
MySQL's MD5 function returns a string of 32 hexadecimal characters. It's marked as a binary string but it isn't the 16 byte binary data one would expect.
So to fix it, this string must be converted back to the binary data:
SELECT hex(aes_encrypt('test-data', unhex(MD5('test_key'))));
The result is:
8FCA326C25C8908446D28884394F2E22
It's again a string of 32 hexadecimal characters. But otherwise it's the same result as with Oracle.
And BTW:
MySQL uses PKCS7 padding.
PKCS5 padding and PKCS7 padding are one and the same. So the Oracle padding option is correct.
MySQL uses ECB block cipher mode. So you'll have to adapt the code accordingly. (It doesn't make any difference for the first 16 bytes.)
MySQL uses no initialization vector (the same as your Oracle code).
MySQL uses a non-standard folding of keys. So to achieve the same result in MySQL and Oracle (or .NET or Java), only use keys that are 16 bytes long.
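To make the key-length point concrete, here is a standard-library Python sketch (no AES involved). It shows that the hex text returned by MD5 is 32 characters while the digest itself is 16 bytes. The fold_key function is only my illustration of a cyclic XOR fold, which is how the "XOR stuff" is usually described; treat it as an assumption about MySQL's behaviour, not its actual code.

```python
import hashlib


def fold_key(key, size=16):
    """XOR the key bytes cyclically into a zeroed buffer of `size` bytes."""
    folded = bytearray(size)
    for position, byte in enumerate(key):
        folded[position % size] ^= byte
    return bytes(folded)


hex_key = hashlib.md5(b"test_key").hexdigest()  # like MD5('test_key'): 32 characters
raw_key = bytes.fromhex(hex_key)                # like UNHEX(MD5('test_key')): 16 bytes

print(len(hex_key), len(raw_key))
# A 16-byte key passes through the fold unchanged; a 32-byte key comes out as 16 bytes.
print(fold_key(raw_key) == raw_key)
print(len(fold_key(hex_key.encode("ascii"))))
```

Under that assumption, passing the 32-character text as the key means the server encrypts with a folded key that no standard AES implementation would derive, which is why the UNHEX() call matters.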
I would just like to give the complete solution for dummies, based on @Codo's very didactic answer.
EDIT:
For being exact in general cases, I found this:
- "PKCS#5 padding is a subset of PKCS#7 padding for 8 byte block sizes".
So strictly PKCS5 can't be applied to AES; they mean PKCS7 but use their
names interchangeably.
About PKCS5 and PKCS7
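A short Python sketch of the padding rule itself may help here (pkcs7_pad is a name chosen for this example): the scheme appends N bytes, each with the value N, and the only difference between the two names is the block size they were defined for.

```python
def pkcs7_pad(data, block_size=16):
    """Append N bytes of value N so the length becomes a multiple of block_size."""
    if not 1 <= block_size <= 255:
        raise ValueError("block size must be between 1 and 255")
    missing = block_size - len(data) % block_size  # always 1..block_size
    return data + bytes([missing]) * missing


print(pkcs7_pad(b"test-data"))                # AES: 16-byte blocks
print(pkcs7_pad(b"test-data", block_size=8))  # the 8-byte case that PKCS#5 covers
```

Input that is already a whole number of blocks still gets a full extra block of padding, so the padding can always be removed unambiguously.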
/* MySQL uses a non-standard folding of the key.
* So to achieve the same result in MySQL and Oracle (or .NET or Java),
only use keys that are 16 bytes long (32 hexadecimal symbols) = 128 bits,
the default for MySQL's AES_ENCRYPT.
*
* MySQL accepts any key length between 16 and 32 bytes
for 128-bit AES encryption, but standard AES-128 does not allow
a key that is not 16 bytes long. Avoid such keys: you won't be able
to use a standard AES decrypt on another platform for keys longer
than 16 bytes, and would be obliged to reimplement MySQL's folding of
the key on that platform, with the XOR stuff, etc.
(Implementations are already out there, but why do weird non-standard things that
may change whenever MySQL decides, etc.?)
Moreover, I think the algorithm MySQL chose for those
cases is said to be a really bad choice from a security standpoint...
*/
-- ### ORACLE:
-- First the key is hashed with md5 to make it a 128 bit key (16 bytes, 32 hex symbols):
raw_key := DBMS_CRYPTO.Hash (UTL_I18N.STRING_TO_RAW ('test_key', 'AL32UTF8'), DBMS_CRYPTO.HASH_MD5);
-- MySQL uses AL32UTF8, at least by default
-- Configure the encryption parameters:
encryption_type:= DBMS_CRYPTO.ENCRYPT_AES128 + DBMS_CRYPTO.CHAIN_ECB + DBMS_CRYPTO.PAD_PKCS5;
-- Strictly speaking, it's really PKCS7.
/* I chose ECB because it is faster where it applies and
@Codo said it's the correct one, but as standard (Oracle) AES128 will only accept
16-byte keys, CBC also works, as I believe they are not applied to a 16-byte key.
Could someone confirm this? */
-- Then the data is encrypted with AES:
encrypted_result := DBMS_CRYPTO.ENCRYPT(UTL_I18N.STRING_TO_RAW('test-data', 'AL32UTF8'), encryption_type, raw_key);
-- The result is binary (varbinary, blob).
-- One can use RAWTOHEX() to represent it in hex characters.
In case you use directly the 16 bytes hashed passphrase in hex characters representation or 32 hex random chars:
raw_key := HEXTORAW(32_hex_key)
encryption_type := 6 + 768 + 4096 -- (same as above in numbers; see Oracle Docum.)
raw_data := UTL_I18N.STRING_TO_RAW('test-data', 'AL32UTF8')
encrypted_result := DBMS_CRYPTO.ENCRYPT( raw_data, encryption_type, raw_key )
-- ORACLE Decryption:
decrypted_result := UTL_I18N.RAW_TO_CHAR( DBMS_CRYPTO.DECRYPT( encrypted_result, encryption_type, raw_key ), 'AL32UTF8' )
-- In SQL:
SELECT
UTL_I18N.RAW_TO_CHAR(
DBMS_CRYPTO.DECRYPT(
HEXTORAW('8FCA326C25C8908446D28884394F2E22'), -- the ciphertext to decrypt (hex result from above)
6 + 768 + 4096,
HEXTORAW(32_hex_key)
) , 'AL32UTF8') as "decrypted"
FROM DUAL;
-- ### MySQL encryption:
-- MySQL's MD5 function returns a string of 32 hexadecimal characters (representing 16 bytes = 128 bits).
-- It's marked as a binary string but it isn't the 16 bytes of binary data one would expect.
-- NOTE: The return type of the MD5, SHA1, etc. functions changed in some versions since 5.5.x. See the MySQL 5.7 manual.
-- So to fix it, this string must be converted back from hex to binary data with UNHEX():
SELECT hex(aes_encrypt('test-data', unhex(MD5('test_key'))));
P.S.:
I recommend reading the improved explanation in the MySQL 5.7 manual, which moreover now allows a lot more configuration.
MySQL AES_ENCRYPT improved explanation from v5.7 manual
Could be CBC vs ECB. Comment at the bottom of this page: http://dev.mysql.com/doc/refman/5.5/en/encryption-functions.html says mysql function uses ECB