XXHash as a MySQL User Defined Function

We store representations of millions of chemical compounds as BLOBs in a MySQL database. We also keep hashes of these BLOBs and compare those hashes when we need to query among the compounds.
Since we found that the standard hash functions provided by MySQL (such as CRC) collide frequently for our use case, we used a custom hash function specific to our data, wrapped it as a MySQL plugin, and created a User Defined Function from this plugin as below:
CREATE FUNCTION customhash RETURNS INTEGER SONAME 'customhash.so'
Unfortunately, we need to move our MySQL installation to another managed data centre, and for security reasons and because of data centre policy we are not allowed to customize MySQL by adding plugins.
We recently heard about the XXHash library. We ran a few tests on it and found that it performs very well and does not generate collisions on our data. It also turns out that the standard MySQL distribution already uses it internally.
I wonder whether it is possible to configure the MySQL server so that we can call the XXH64_digest function in our MySQL routines without compiling it as a plugin.
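As a rough illustration of why a 32-bit checksum such as CRC32 tends to collide at this scale while a 64-bit hash does not, here is a minimal Python sketch of the birthday-bound estimate. It assumes the hash distributes values uniformly, and the figure of one million items is only an example, not a number from the question.

```python
def expected_colliding_pairs(n_items: int, hash_bits: int) -> float:
    """Birthday-bound estimate: expected number of colliding pairs when
    n_items values are hashed uniformly into 2**hash_bits buckets."""
    pairs = n_items * (n_items - 1) / 2  # number of distinct pairs of items
    return pairs / 2 ** hash_bits        # each pair collides with probability 2**-bits


if __name__ == "__main__":
    # Example sizes only: one million items, 32-bit versus 64-bit hash.
    for bits in (32, 64):
        print(bits, expected_colliding_pairs(1_000_000, bits))
```

Under these assumptions a 32-bit hash is expected to produce on the order of a hundred colliding pairs per million items, while a 64-bit hash is expected to produce essentially none.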

I've checked the MySQL source code and built-in functions and could not find any way to run XXHash in MySQL routines. It seems XXHash is used by MySQL internally and is not user-visible.
In order to run XXHash in MySQL routines, I have developed a plugin, in case anyone needs to use the XXHash algorithm in MySQL server.
This plugin can be found here: GitHub repository for xxhash_mysql_plugin.
After installing the plugin, you can call the xxhash function in your SELECT statements.

Related

how to define an architecture for a database deployment

Hi.
I'm building software that I deploy to 3 different MySQL databases (hosted in 3 different places). My question is: is there a way to automatically propagate the changes I make in one of the databases to the others?
Is there any tool I can use so that I do not have to change all the databases manually?

If you want automated job execution, you must write the scripting tools yourself and run them as needed. But what I think you need is some form of database versioning.
It's hard to understand at first, but easy enough once you dig into it. The versioning process consists of:
baseline: your entire set of database tables and predefined records in one script. Usually one is recorded per major version, such as 1.0.0.sql or 2.0.0.sql. The baseline is executed only once.
updates: the "patches" for your tables, each in a separate SQL script.
views, functions and procedures: each view, function and procedure in a separate SQL file.
schema_change_log: a table that records the baseline version and the update patch versions that have been executed in the database.
The publishing steps:
check the latest version in schema_change_log
get the updates with a version greater than the latest version
execute the update scripts
drop all views, functions and procedures
re-apply all views, functions and procedures
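The first three publishing steps can be sketched roughly as follows. This is a minimal Python illustration, not a finished tool: it uses the standard-library sqlite3 module as a stand-in so that it runs without a MySQL server, the helper names `parse_version` and `publish` and the single-column layout of schema_change_log are assumptions, and dropping and re-applying views, functions and procedures is left out.

```python
import sqlite3


def parse_version(name: str) -> tuple:
    """Turn '1.2.10' or '1.2.10.sql' into (1, 2, 10) so versions sort numerically."""
    stem = name[:-4] if name.endswith(".sql") else name
    return tuple(int(part) for part in stem.split("."))


def publish(conn, updates: dict) -> list:
    """Apply every update newer than the latest version in schema_change_log.

    `updates` maps a version string to the SQL text of that patch (hypothetical
    layout). Returns the versions that were applied, in order.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_change_log (version TEXT PRIMARY KEY)"
    )
    # Step 1: check the latest version recorded in schema_change_log.
    rows = conn.execute("SELECT version FROM schema_change_log").fetchall()
    latest = max((parse_version(v) for (v,) in rows), default=(0,))
    # Step 2: select the updates with a version greater than the latest one.
    pending = sorted(
        (v for v in updates if parse_version(v) > latest), key=parse_version
    )
    # Step 3: execute each update script and record it in the log.
    applied = []
    for version in pending:
        conn.executescript(updates[version])
        conn.execute(
            "INSERT INTO schema_change_log (version) VALUES (?)", (version,)
        )
        conn.commit()
        applied.append(version)
    return applied
```

Comparing versions as tuples of integers rather than as strings matters here: as plain text, "1.0.10" would sort before "1.0.2".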

For this I recommend MySQL Workbench: you can make changes to the model and then synchronize them with your databases through the Database menu.

How to Save an Append or Delete Query in MySQL

So I'm moving from MS Access to MySQL:
In MS Access you can store certain INSERT, DELETE, and UPDATE queries as objects alongside your tables. Thus anyone who doesn't understand computers that well can click on the objects and automatically run the queries to alter the master table for various business functions.
In MySQL, where and how do you store these queries? I seem to be able only to make tables. When I write a piece of code in the SQL editor, I can only save it to a location outside the database (such as my local desktop), not in the MySQL database where it would be accessible to my coworkers.
If you can't save it on the server, how would I write a piece of code and execute it within the database so that it is easily usable by others?
Thanks

The answer to this question is going to depend on your environment, your users, and your bandwidth to support any given solution. You are gaining a lot by making the switch from Access to MySQL; however, you are losing some of the WYSIWYG features (e.g., Access forms that can bind directly to your data source).
There are many approaches:
If your users are more advanced, simply having access to the database using MySQL Workbench may suffice. From there they would have access to run views, stored procedures, or to create their own custom queries.
Another option would be to script your objects using Python and provide a simple GUI using Tkinter. Python is generally thought of as an easy-to-use language; MySQL support is available through connector libraries, and Tkinter is its "default" GUI toolkit.
The LAMP stack is another very popular option that uses MySQL as the backend database.
There is also nothing stopping you from using Access to link to your MySQL db using MySQL as an external data source.
I hope this provides enough info to help you begin whittling down your options.

How does the phpMyAdmin export feature work?

If I were to want to create a PHP function that does the same thing as the Export tab in phpMyAdmin, how could I do it? I don't know if there is a MySQL function that does this or if phpMyAdmin just builds the export file (in SQL that is) manually. Without shell access. Just using PHP.
I tried the documentation for mysqldump, but that seemed to require using the shell. I'm not quite sure what that even is -- maybe my question is: how do you use shell?
My silly idea is to allow non-technical users to build a site on one server (say a localhost) using MySQL then export the site, database and all, to another server (eg. a remote server).
I think I'm pretty clear on the Import process.

You can check the phpMyAdmin source code (an advantage of open-source software). Check the export.php script and the supporting functions in the libraries/export/sql.php script file.
In summary, what phpMyAdmin does is:
get a list of the tables in the given database (SHOW TABLES FROM...),
get the create query for each table (SHOW CREATE TABLE...),
parse it and extract column definitions from it,
get all data (SELECT * FROM...)
build a query according to column data.
I've written similar code for my own apps (for backup purposes, when the GPL license of phpMyAdmin doesn't allow me to use it); however, I use DESCRIBE to get the column definitions. I think they parse the SHOW CREATE TABLE output instead because it contains more information than the DESCRIBE output.
Generating SQL statements this way requires a bit of care with escaping, but it allows for some flexibility, as you can convert types, filter or sanitize data, etc. It is also a lot slower than using a tool like mysqldump, and you should take care not to consume all available memory (write soon, write often, don't keep everything in memory).
If you are implementing a migration process (from server to server), it may be easier to do it with some shell scripting and call mysqldump directly, unless you have to do everything in PHP.
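The "build a query according to column data" step, together with the advice to write rows out as you go, might look roughly like the sketch below. It is written in Python rather than PHP purely for illustration; `sql_literal` and `write_inserts` are hypothetical helper names, and the escaping is deliberately simplified, so real code should rely on the database driver's own escaping routines.

```python
import io


def sql_literal(value) -> str:
    """Render a Python value as a MySQL literal (simplified escaping)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, bytes):
        return "X'" + value.hex() + "'"  # hexadecimal literal, e.g. for BLOBs
    # Escape the backslash first, then the quote and control characters.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    text = text.replace("\n", "\\n").replace("\r", "\\r").replace("\0", "\\0")
    return "'" + text + "'"


def write_inserts(table, columns, rows, out) -> None:
    """Write one INSERT per row to `out` as soon as the row is seen,
    so the whole dump never has to be held in memory."""
    column_list = ", ".join("`" + c + "`" for c in columns)
    for row in rows:
        values = ", ".join(sql_literal(v) for v in row)
        out.write(f"INSERT INTO `{table}` ({column_list}) VALUES ({values});\n")


if __name__ == "__main__":
    buffer = io.StringIO()  # stands in for an open export file
    write_inserts("users", ["id", "name"], [(1, "O'Brien"), (2, None)], buffer)
    print(buffer.getvalue(), end="")
```

Because `rows` can be any iterable, it can be fed from a cursor that fetches rows lazily, which keeps memory use flat for large tables.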

What's the appropriate way to test code that uses MySQL-specific queries internally

I am collecting data and storing it in a MySQL database using Java. Additionally, I use Maven for building the project, TestNG as a test framework, and Spring-Jdbc for accessing the database. I've implemented a DAO layer that encapsulates the access to the database. Besides adding data using the DAO classes, I want to execute some queries which aggregate the data and store the results in some other tables (like materialized views).
Now, I would like to write some test cases which check whether the DAO classes are working as they should. Therefore, I thought of using an in-memory database which will be populated with some test data. Since I am also using MySQL-specific SQL queries for aggregating data, I ran into some trouble:
Firstly, I've thought of simply using the embedded-database functionality provided by Spring-Jdbc to instantiate an embedded database. I've decided to use the H2 implementation. There I ran into trouble because of the aggregation queries, which are using MySQL-specific content (e.g. time-manipulation functions like DATE()). Another disadvantage of this approach is that I need to maintain two ddl files - the actual ddl file defining the tables in MySQL (here I define the encoding and add comments to tables and columns, both features are MySQL-specific); and the test ddl file that defines the same tables but without comments etc. since H2 does not support comments.
I've found a description of using MySQL as an embedded database within test cases (http://literatitech.blogspot.de/2011/04/embedded-mysql-server-for-junit-testing.html). That sounded really promising to me. Unfortunately, it didn't work: a MissingResourceException occurred, "Resource '5-0-21/Linux-amd64/mysqld' not found". It seems that the driver is not able to find the database daemon on my local machine, but I don't know what to look for to solve that issue.
Now I am a little stuck, and I am wondering whether I should have designed the architecture differently. Does anyone have tips on how I should set up an appropriate system? I have two other options in mind:
Instead of using an embedded database, I go with a native MySQL instance and set up a database that is used only for the test cases. This option sounds slow. I might want to set up a CI server later on, and I thought that an embedded database would be more appropriate since the tests run faster.
I strip all the MySQL-specific parts out of the SQL queries and use H2 as an embedded database for testing. If this option is the right choice, I would need to find another way to test the SQL queries that aggregate the data into materialized views.
Or is there a 3rd option which I don't have in mind?
I would appreciate any hints.
Thanks,
XComp

I've created a Maven plugin exactly for this purpose: jcabi-mysql-maven-plugin. It starts a local MySQL server in the pre-integration-test phase and shuts it down in the post-integration-test phase.

If it is not possible to get the in-memory MySQL database to work, I suggest using the H2 database for the "simple" tests and a dedicated MySQL instance to test the MySQL-specific queries.
Additionally, the tests for the real MySQL database can be configured as integration tests in a separate Maven profile so that they are not part of the regular Maven build. On the CI server you can create an additional job that runs the MySQL tests periodically, e.g. daily or every few hours. With such a setup you can keep and test your product-specific queries without slowing down your regular build. You can also run a normal build even if the test database is not available.
There is a nice Maven plugin for integration tests called maven-failsafe-plugin. It runs the tests in the integration-test phase, and the surrounding pre-integration-test and post-integration-test phases can be used to set up the test data before the tests and to clean up the database after the tests.

composing multiple mysql scripts

Is it possible to INCLUDE other mysql scripts in a composite script? Ideally I do not want to create a stored procedure for the included scripts... For larger projects I would like to maintain several smaller scripts hierarchically and then compose them as required... But for now, I'd be happy to just learn how to include other scripts...

source is a builtin command you can use in the MySQL client tool (which is what you're using to execute the SQL script):
mysql> source otherfile.sql
If you're executing SQL in a stored procedure or with an API, you should know that MySQL client builtins work only in the MySQL client.

MySQL scripts are just a list of commands, to be run in order against the database server. SQL is not a scripting language by any means, so it doesn't behave like one. The only way to "include" other scripts is to concatenate them together when you kick off the script load command.
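If the concatenation is done ahead of time rather than on the command line, it could be scripted along these lines. This is a minimal Python sketch; the function name `compose_scripts` and the file names in the usage example are hypothetical.

```python
from pathlib import Path


def compose_scripts(paths, output_path) -> None:
    """Concatenate SQL script files, in the given order, into one file."""
    with open(output_path, "w", encoding="utf-8") as out:
        for path in paths:
            # A '-- ' comment marks where each included script starts.
            out.write(f"-- begin {path}\n")
            text = Path(path).read_text(encoding="utf-8")
            out.write(text)
            if not text.endswith("\n"):
                out.write("\n")  # keep statements from running together


if __name__ == "__main__":
    # Hypothetical file names; pass your own scripts in the order they must run.
    compose_scripts(["tables.sql", "views.sql", "data.sql"], "all.sql")
```

The composed file can then be loaded in one go with the MySQL client, for example by redirecting it into `mysql` or by using `source` on it.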
