Punctuation insensitive search in mySQL - mysql

I have a database of phrases that users will search for from their own input. I want them to find the phrase regardless of what punctuation they use. For example if the phrase, "Hey, how are you?" is in the row, I want all of the following searches to return it:
"Hey! How are you?!"
"Hey how are you?"
"Hey :) How are you?"
Right now, I have the columns 'phrase' and 'phrase_search'. Phrase search is a stripped down version of phrase so our example would be 'hey-how-are-you'.
Is there anyway to achieve this without storing the phrase twice?
Thank you!
-Nicky

What you've done is probably the most time-efficient way of doing it. Yes, it requires double the space, but is that an issue?
If it is an issue, a possible solution would be to convert your search string to use wildcards (eg. %Hey%how%are%you%) and then filter the SQL results in your code by applying the same stripping function to the database input and the search string and comparing them. The rationale behind this is that there should be relatively few matches with non-punctuation characters in-between the words, so you're still getting MySQL to do the "heavy lifting" while your PHP/Perl/Python/whatever code can do a more fine-grained check on a relatively small number of rows.
(This assumes that you have some code calling this, rather than a user typing the SQL query from the command line, of course.)

Related

How to find words by partially strings

I have been trying to solve this problem for hours, but I dont know how to approach it, so I would need a push to a right direction.
I want to create a page where users can find the appropriate word, by providing word length and characters.
For example, user wants to find all the 5 letter words, where the second letter is R and fourth V, like this:
_R_V_
I have a table with column WORDS with words "letter", "moon", "drive", "mrive" and the query should return: "drive" and "mrive".
Is it possible to do it in MySQL?
While I was looking for the direction I found that I should create a trie structure. I dont know how to do that, but I will learn it if there is no easier way.
Yes, you can use LIKE :
SELECT * FROM YourTable t
WHERE t.word_col LIKE '_R_V_'
_ Wildcard stands for any single character. This will also force the string to be 5 characters in length, since % wildcard is not used.
You can find a great explanation about LIKE wildcards in the link above.

MYSQL search for right words | fixing spelling errors

I have a table dictionary which contains a list of words Like:
ID|word
---------
1|hello
2|google
3|similar
...
so i want if somebody writes a text like
"helo iam looking for simlar engines for gogle".
Now I want to check every word if it exists in the database, if not it should
get me the similar word for the word. For example: helo = hello, simlar = similar, gogle = google.
Well, i want to fix the spelling errors. In my database i have a full dictionary of all english words. I coudn't find any mysql function which helps me. LIKE isn't helpfull in my situation.
you can use soundex() function for comparing phonetically
your query should be something like:
select * from table where soundex(word) like soundex('helo');
and this will return you the hello row
There is a function that does roughly want you want, but it's intensive and will slow queries down. You might be able to use in your circumstances, I have used it before. It's called Levenshtein. You can get it here How to add levenshtein function in mysql?
What you want to do is called a fuzzy search. You could use the SOUNDEX function in MySQL, documented here:
http://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_soundex
You query would look like:
SELECT * FROM dictionary where SOUNDEX(word) = SOUNDEX(:yourSearchTerm)
... where your search term is bound to the :yourSearchTerm parameter value.
A next step would be to try implementing and making use of a Levenshtein function in MySQL. One is described here:
http://www.artfulsoftware.com/infotree/qrytip.php?id=552
The Levenshtein distance between two strings is the minimum number of
operations needed to transform one string into the other, where an
operation may be insertion, deletion or substitution of one character.
You might also consider looking into databases that are aimed at full text searching, such as Elastic Search, which provides this natively:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-fuzzy-query.html

Ideas for Find and Replace character

I need to search address fields and change one character to upper case if there is an apartment number. So '521 Main St. #3b' would change to '521 Main St. #3B'.
The way I know to do this would be to write a program that loops through the recordset, looks at the address field for the last character to see if it's an alpha, then if the character before it is a numeric, change the case of the last char and update the record.
Is this something that would be quicker/simpler with regular expressions (haven't ever used)?
If so, is this best done from within a programming environmnet or using a text editor such as Textmate or vi ? The data is in MySQL and Excel, but I can export it to a text file.
Thanks.
I solved this using TextMate which, once I began to understand a little regex, was simple. (details here Regex Syntax for making the last character Uppercase in TextMate)
Still, I wonder if something like sed or awk, (which I started to try out) might be a better tool. And the SQL solution that Olexa provided works. I just don't know how to have it apply to the entire recordset.
If the data is stored in MySQL, then it is better to process it there:
UPDATE addresses
SET address = CONCAT(LEFT(address, CHAR_LENGTH(address) - 1), UPPER(RIGHT(address, 1)))
WHERE address REGEXP BINARY '#[[:digit:]]+[[:lower:]]{1}$'
;
I've added BINARY because otherwise REGEXP is not case-sensitive, but BINARY may need to be omitted to support multi-byte strings. In this case, surplus updates will be made, but the result would be correct anyway.
P. S. An example on SQL Fiddle showing which values are affected, and how they are affected: http://sqlfiddle.com/#!2/b29326/1

How can I write a query with unwated spaces in my text?

I am currently improving the search functionality of my cms so that users can search for entries by copying and pasting text from a web page and finding it in the database.
The query is simple. It takes the search term and does a LIKE '% text here %' query.
The problem is, I'm not getting many results and have figured out why.
In the CMS itself, a lot of the text that has been entered from MS Word seems to be double spaced. Such as
"Hello my name is James"
However on the front end website it renders properly, with single spaces, like:
"Hello my name is James"
This means my query is never picking up the database entry based on what is shown on the web page.
Any suggestions? Do I tackle the double spaces in the CMS (seems risky to me with so much HTML in there!), or can I adjust my query to cope with it?
if it is only double spaces that are creating the issue, then just
replace(columnToSearch,' ',' ')
when searching, or as #ManMohan suggests, before inserting the data into your table in the first place
Take a look at MySQL REPLACE
REPLACE(str,from_str,to_str)
Returns the string str with all occurrences of the string from_str replaced by the string to_str.
REPLACE() performs a case-sensitive match when searching for from_str.

Mysql full text search exact phrase problem

Using full text search in mysql I'd like to have exact phrase:
"romantic dinner" to be found.
But I also would like each of the words could have synonyms like:
"romantic dinners" to be found for example (our language has great problem where every word has 8 endings like)...
I tried:
+"romantic (dinner dinners)" and
"romantic +(dinner dinners)"
but nothing seems to get results... Is it possible to make some logical OR inside exact search?
UPDATE: TO make it one sentance question: Is there a way to put some logical operators in exact match ("") in full text search?
It will only partially your problem, but there is a soundex() function in mysql which transform the given string to a soundex representation. Similars string should have the same soundex representation so it's maybe a start.
Hope this helps.
If '"roman* dinn*" doesn't work for you, it might be time to look into something like Solr: http://lucene.apache.org/solr/ which will allow for more sophisticated searches, and might already have a stemmer for your language.
SELECT field_name FROM table_name WHERE MATCH(field_name) AGAINST('romantic* dinner*' IN BOOLEAN MODE)...
definitely you can use + and - to further refining your results
reference: http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html