MySQL search with typo - mysql

In my MySQL database I have a user table. I need to perform search as you type with typo over the user name field. There are few very old question on this topic. I tested the builtin full text search of mysql but it didn't work as expected (it does not handle typo) [I knew but I tried anyway].
What's my best option? I thought there should be an easy solution nowadays. I'm thinking about replicating the user table on elasticsearch and do the instant search from there, but I'd really like to avoid the syncronization nightmare that this will cause.
Thanks!!

You could use SOUNDEX for mysql. We have tried that but I can say that it does not work that well and it also makes the search a bit slow.
We Had a similar issue and switched to ES.
What we did is as follows:
Created a trigger for the table that will be synced to ES. The
trigger will write to a new table. The columns of such a table would
be:
IdToUpdate Operation DateTime IsSynced
The Operation would be create, update, delete. IsSynced will tell
whether the update is pushed to ES.
Then add a corn job that would query this table for all rows that will have issynced set to say '0', Add those ID's and operation to a Queue like RabbitMQ. And set the ISSynced to 1 for those ID's
The reason to use RabbitMQ is that it will make sure that the update is forwarded to ES. In case of failure we can always re-queue the the object.
Write a consumer to get the objects from the queue and update ES.
Apart from this you will also have to create a utility that will create an ES index from the database for first time use.
And you can also look at Fuzzy Search of ES that will handle typo's
Also Completion suggester which also supports fuzzy search.

Related

Save MySql 'Show' result in db

So I'm kind of stumped.
I have a MySql project that involves a database table that is being manipulated and altered by scripts on a regular basis. This isn't so unusual, but I need to automate a script to run (after hours, when changes aren't happening) that would save the result of the following:
SHOW CREATE TABLE [table-name];
This command generates the ready-to-run script that would create the (empty) table in it's current state.
In SqlWorkbench and Navicat it displays the result of this SHOW command in a field in a result set, as if it was the result of a SELECT statement.
Ideally, I want to take into a variable in a procedure, and change the table name; adding a '-mm-dd-yyyy' to end of it, so I could show the day-to-day changes in the table schema on an active server.
However, I can't seem to be able to do that. Unlike a Select result set, I can't use it like that. I can't get it in a variable, or save it to a temporary, or physical table or anything. I even tried to return this as a value in a function, from which I got the error that a function cannot return a result set - which explains why it's displayed like one in the db clients.
I suspect that this is a security thing in MySql? If so, I can totally understand why and see the dangers exposed to a hacker, but this isn't a public-facing box at all, and I have full root/admin access to it. Hopefully somebody has already tackled this problem before.
This is on MySql 8, btw.
[Edit] After my first initial comments, I need to add; I'm not concerned about the data with this question whatsoever, but rather just these schema changes.
What I'd really -like- to do is this:
SELECT `Create Table` FROM ( SHOW CREATE TABLE carts )
But this seems to be mixing apples and oranges, as SHOW and SELECT aren't created equal, although they both seem to return the same sort of object
You cannot do it in the MySQL stored procedure language.
https://dev.mysql.com/doc/refman/8.0/en/show.html says:
Many MySQL APIs (such as PHP) enable you to treat the result returned from a SHOW statement as you would a result set from a SELECT; see Chapter 29, Connectors and APIs, or your API documentation for more information. In addition, you can work in SQL with results from queries on tables in the INFORMATION_SCHEMA database, which you cannot easily do with results from SHOW statements. See Chapter 26, INFORMATION_SCHEMA Tables.
What is absent from this paragraph is any mention of treating the results of SHOW commands like the results of SELECT queries in other contexts. There is no support for setting a variable to the result of a SHOW command, or using INTO, or running SHOW in a subquery.
So you can capture the result returned by a SHOW command in a client programming language (Java, Python, PHP, etc.), and I suggest you do this.
In theory, all the information used by SHOW CREATE TABLE is accessible in the INFORMATION_SCHEMA tables (mostly TABLES and COLUMNS), but formatting a complete CREATE TABLE statement is a non-trivial exercise, and I wouldn't attempt it. For one thing, there are new features in every release of MySQL, e.g. new data types and table options, etc. So even if you could come up with the right query to produce this output, in a couple of years it would be out of date and it would be a thankless code maintenance chore to update it.
The closest solution I can think of, in pure MySQL, is to regularly clone the table structure (no data), like so:
CREATE TABLE backup_20220618 LIKE my_table;
As far as I know, to get your hands on the full explicit CREATE TABLE statement, as a string, would require the use of an external tool like mysqldump which was designed specifically for that purpose.

Mysql Query match to check if query has been updated

I am trying to match two MySQL Queries (for now, the target is "Create VIEW") to analyze if the result of execution would result in the same effect to Database.
The source of the queries is not the same, making the syntax across the queries inconsistent.
To further simplify the question, let me add more details:
Let's say there is an already existing View in the database.
This View was created using a Create VIEW ... SQL statement.
There is a possibility that the Create VIEW ... statement get's updated, hence to reflect the changes in the database currently this statement is executed at the time of migration.
But, I want to avoid this situation, if the statement Create VIEW ... will result in the same structure as of the existing View in the database, I want to avoid executing it.
To generate the CREATE VIEW from database I am using SHOW CREATE VIEW... (comparing this with the query originally used to create the VIEW).
The primary restriction is I need to make this decision only at the time of migration and cannot presume any conclusions (say, using git diff or commit history...).
I have already done some search to look for a solution for this:
Found no direct solution for this problem (like a SQL engine to which I can feed both queries and know if the result would be the same).
Decided to Parse the queries and to achieve that ended up looking into ANTLR (also used by MYSQL WorkBench)
ANTLR's approach looks promising but, this will require an extensive rule-based parsing and creating a query match program from scratch.
I realized that just parsing queries is not enough, I have to create my own POJOs to store the atomic lexers from queries and then compare the queries based on some rules.
Even if I could find predefined POJOs, that would allow to quickly create a solution for this problem.

Making sure that a table is constructed correctly

I have a schema of a database and a web application. I want to have the web application be able to select, insert and remove rows to a table, but the table may not exist, maybe in a testing environment, and the table may be missing columns, most likely because the web application has updated.
I want to be able to make sure that the table is ready to accept the data that the web application sends to it during the time the application is alive.
The idea I had is the application (written in Java) will have a table structure embedded into it, and when the application starts, just copy all of the data in the table (if it exists) to a temporary table, delete the old table and make a new one with the temporary table's data, and then drop the temporary table. As you can tell, it's nowhere near innovative.
Another idea I had is use the SHOW COLUMNS command to correct any missing columns parallel with the SHOW TABLES LIKE to check if it exists, but I feel like Stack Overflow would've had a better solution. Is that all I can do?
There are many ways to solve the problem of consistency of the database version and the version of the application.
However, in the production database, this situation is unacceptable.
I think that the simplest ways are the best.
To ensure such compliance, it is enough to execute a script that updates the database before performing the testing.
START TRANSACTION;
DROP TABLE ... IF EXISTS;
CREATE TABLE ...
COMMIT;
Remember about IF EXISTS and having DROP grant!
Such a script can be easily managed by placing it in RCS and controlling the version number needed in the application.
You can also save this version number in some table in the database itself and check when the application starts, whether the number is compatible with the assumed one and if you do not call the database update script.
Have a look at JPA an Hibernate. There is hbm2ddl.auto property. Looks like "update" option does what you want.
For more details
What are the possible values of the Hibernate hbm2ddl.auto configuration and what do they do

Solr: continuous migration from MySQL

This may sound like an opinion question, but it's actually a technical one: Is there a standard process for maintaining a simple data set?
What I mean is this: let's say all I have is a list of something (we'll say books). The primary storage engine is MySQL. I see that Solr has a data import handler. I understand that I can use this to pull in book records on a first run - is it possible to use this for continuous migration? If so, would it work as well for updating books that have already been pulled into Solr as it would for pulling in new book records?
Otherwise, if the data import handler isn't the standard way to do it, what other ways are there? Thoughts?
Thank you very much for the help!
If you want to update documents from within Solr, I believe you'll need to use the UpdateRequestHandler as opposed to the DataImportHandler. I've never had need to do this where I work, so I don't know all that much about it. You may find this link of interest: Uploading Data With Index Handlers.
If you want to update Solr with records that have newly been added to your MySQL database, you would use the DataImportHandler for a delta-import. Basically, how it works is you have some kind of field in MySQL that shows the new record is, well, new. If the record is new, Solr will import it. For example, where I work, we have an "updated" field that Solr uses to determine whether or not it should import that record. Here's a good link to visit: DataImportHandler
The question looks similar to the one which we are doing, but not with SQL. Its with HBase(hadoop stack DB). However there we have Hbase indexer, which after mapping DB with Solr, listens to the events in hbase(DB) for new rows, and then executes code to fetch those values from DB and add in Solr. Not sure if there is such for SQL. However the concept looks similar. IN SQL I know about triggers which can listen to inserts and updates. At that even, you can trigger something to execute the steps of adding them in continuosly manner.

How to version control data stored in mysql

I'm trying to use a simple mysql database but tweak it so that every field is backed up up to an indefinite number of versions. The best way I can illustrate this is by replacing each and every field of every table with a stack of all the values this field has ever had (each of these values should be timestamped). I guess it's kind of like having customized version control for all my data..
Any ideas on how to do this?
The usual method for "tracking any changes" to a table is to add insert/update/delete trigger procedures on the table and have those records saved in a history table.
For example, if your main data table is "ItemInfo" then you would also have an ItemInfo_History table that got a copy of the new record every time anything changed (via the triggers).
This keeps the performance of your primary table consistent, yet gives you access to the history of any changes if you need it.
Here are some examples, they are for SQL Server but they demonstrate the logic:
My Repository table
My Repository History table
My Repository Insert trigger procedure
My Repository Update trigger procedure
Hmm, what you're talking about sounds similar to Slowly Changing Dimension.
Be aware that version control on arbitrary database structures is officially a rather Hard Problem. :-)
A simple solution would be to add a version/revision field to the tables, and whenever a record is updated, instead of updating it in place, insert a copy with the changes applied and the version number incremented. Then when selecting, always choose the record with the latest version. That's roughly how most such schemes are implemented (e.g. Wikimedia does it pretty much this exact way).
Maybe a tool can help you to do that for you. Have a look at nextep designer :
https://github.com/christophefondacci/nextep-designer
With this IDE you will be able to take snapshots of your database structure and data and put it under version control. After this you can compute the differences between any 2 versions and generate the appropriate SQL that can insert / update / delete your data.
Maybe this is an alternative way to achieve what you wanted.