Fulltext search that isn't exact match - mysql

I have a MySQL-table called "customers1" running engine MyISAM. I've created a full text index on the columns name,adress and zip. Now one of the customers in that table is me. I spell my name "Gildebrand". Now i can't expect that the users can spell my name correctly, many might write "Glidebrant", but still want to find my. How could i do that search in SQL?
If i run the following query right now
SELECT * FROM customers1 WHERE MATCH(name,adress,zip) AGAINST('Gildebrand')
It finds me, of course. But if i misspell, "Glidebrand", it doesn't find me. What would be the best approach to this?

I would say the closest you can get from such a result if by using SOUNDEX() http://www.w3resource.com/mysql/string-functions/mysql-soundex-function.php

I have a generic search similar to yours. Here's basically what I do:
SELECT * FROM customers1 WHERE MATCH(name,adress,zip) AGAINST('?')
UNION
SELECT * FROM customers1 WHERE name LIKE ('?%')
This allows the user to just enter a prefix also.
If the user realizes they can't spell your name, but they're sure it starts with Gil, then they can just type that.
You can add additional UNION clauses if you want to support prefixes on other columns too.

Related

How to make a good sql search with like

For years when I want my user to search some field in my database where he can type anything he wants I use an algorithm to break the words and search each word separetely... a mess.
For example, if the user types in the search box "aaa bbb ccc" I dont like using:
SELECT id
FROM table
WHERE description LIKE '%aaa bbb ccc%'
Cause sometimes the user types things out of order and the query above wouldng find. What I usually do is breaking the string and concatenating it with PHP so the result becomes:
SELECT id
FROM table
WHERE description LIKE '%aaa%'
AND description LIKE '%bbb%'
AND description LIKE '%ccc%'
But today after talk to a friend I was wondering if there is some native way to do this faster using MY SQL?
What you want to do is called full text search and most relational databases support it nowadays, including mysql.
I think you can use REGEXP. For example:
Select * from table where description REGEXP 'aaa|bbb|ccc'
FULLTEXT Searches are really fast.
INSTR or locate works better than REGEXP. But it depends on various factors.
More comparison here
SELECT * from table where INSTR(description, 'aaa') >0
SELECT * from table where LOCATE(description, 'aaa') >0

Count the number of times keywords (in a table) appear in a field in another table

I will simplify my problem in order to explain it.
I have a table which contains text messages posted by users and another table which contains keywords.
I want to display, for each user, the number of times keywords are found in text messages.
I don't want the result to display a keyword if it's not found in text messages.
I wan't it to be case INSENSITIVE. All keywords are lowered but in messages, you can find lower & upper chars.
Because I'm not sure that my explanation is clear enough, here comes the SQLFiddle.
http://sqlfiddle.com/#!2/c402a
Hope anyone can help me.
I found what I was looking for. It wasn't easy for me but here is my query :
SELECT t_msg.msg_usr,
t_list.list_word,
count(t_list.list_word),
t_msg.msg_text
FROM t_msg
INNER JOIN t_list
ON LOWER(t_msg.msg_text) LIKE CONCAT("%", t_list.list_word, "%")
GROUP BY t_msg.msg_usr, t_list.list_word;
The SQLFiddle is there : http://sqlfiddle.com/#!2/ba052/8
The recommendation would be to not try solving this with a query. It's possible to write a query that will do it, such query will scan the messages table for each keyword separately, and produce a count (or a row that you can group by), but this won't scale, or be reliable in sense of language search.
Here is what you might want to do:
Create a table to map (user_id, keyword_id) to a count of this keyword in messages of this user. Let's call it t_keyword_count.
Each time you receive a message, before you save the message into the database, search it for all the keywords you care about (using whatever good text search libraries that account for misspellings, etc.). You should know the (user_id) for this message.
You will, at that point, be ready to add the message to the database, and will have an array of (keyword_id) with keywords that this message will have.
In a transaction, insert the message into the t_msg table, and run update/insert for (user_id,keyword_id) to have value=value+1 (or +n, if you need to count the same keyword more than once in the same message) for the t_keyword_count table.
If you are trying to solve the problem of having to do the above on existing data, you can do this manually, just to build up that t_keyword_count table first (depends on how many keywords you have in total, but even if there are a lot, this can be scripted). But you should change (or mirror) the t_msg.msg_text field to be a field suitable for text search, and use SQL text search functionality to find the keywords.

how to search several columns in a sql query using concat and UPPER

So I have a DB and a web page where I want to display the result of my db search
I have try a couple ways both work but they are incomplete for what I want to accomplish
I want to be able to search my table columns id,Id_Nombre,Pais,Estado,Ciudad,website all of them with a word or several words from my search text.
this code works but I have to type exactly the word it's case sensitive:
$query = "SELECT * FROM Medios_table WHERE concat(id,Id_Nombre,Pais,Estado,Ciudad,website) LIKE '%$Busqueda%'";
so as a result i I type a word like "People" in my search box and in my data base that word is type in any of those columns as "people" it wont find it.
second code I use works but only using one column
$query = "SELECT * FROM Medios_table WHERE UPPER(Id_Nombre) LIKE UPPER('%$Busqueda%')";
the result for this code is great since it will find it no matter the case used, but I need to extend this type of search to all the other columns, but so far everithing I use does not work.
I have tried:
$query = "SELECT * FROM Medios_table WHERE UPPER(id,Id_Nombre,Pais,Estado,Ciudad,website) LIKE UPPER('%$Busqueda%')";
$query = "SELECT * FROM Medios_table WHERE UPPER(concat(id,Id_Nombre,Pais,Estado,Ciudad,website)) LIKE UPPER('%$Busqueda%')";
$query = "SELECT * FROM Medios_table WHERE concat(UPPER(id,Id_Nombre,Pais,Estado,Ciudad,website)) LIKE UPPER('%$Busqueda%')";
etc.
any help is greatly appreciated, thanks.
Did you try UPPER around each column name? Like this:
$query = "SELECT * FROM Medios_table WHERE concat(UPPER(id),UPPER(Id_Nombre),UPPER(Pais),UPPER(Estado),UPPER(Ciudad),UPPER(website)) LIKE UPPER('%$Busqueda%')";
Your upper(concat(...)) version should work. If you had a problem with that, it's probably just a typo, like you left out a parenthesis or something.
You can't say upper(x,y,z) because upper takes only one parameter. You must do the concat first, then do the upper, i.e. upper(concat(x,y,z)).
That said, this query will be very slow on a big database, because the db engine has to read every record in the table, and then for each one search it character by character. If the table is small or this is done infrequently, that might be acceptable. If the table is big and/or this query will be executed often, you really need a totally different approach.
Update
If you really need to search against any text in a field, where you cannot make any assumptions about the text being searched in or the text being searched for in advance, you should investigate fulltext searches. http://dev.mysql.com/doc/refman/5.0/en/fulltext-natural-language.html
If you will be doing these sort of searches all the time, you might want to build a dictionary, that is, build a list of all the word with all the records that that word occurs in. I think that's basically what fulltext does, so this may or may not gain you anything.
But if you have some foreknowledge, that is, if it's not really that you want to search for arbitrary text occurring anywhere in arbitrary text, than break out the things you want to search for into separate fields.
To take a simple example, suppose you have a field that contains customer full name, like "Fred Smith", "Mary Jones", etc. You want to search for someone by last name. You could search for
where full_name like '%Smith%'
But this would require reading every record in the table. If, instead, you broke the field into first name and last name, then you could search for
where last_name='Smith'
If you have an index on last_name this would be a very fast search.
If you're just trying to look for an entered value in any of several fields, it is much, much faster to do
where estado='Toledo' or ciudad='Toledo' or pais='Toledo'
(assuming that you have indexes on estado, ciudad, and pais), then
where concat(estado, ciudad, pais) like '%Toledo%'
(where no index will do you any good).
If you want to do case-insensitive searches, create an index on upper(estado) instead of on estado, etc. Then "where upper(estado)='TOLEDO'" can use the index.
Also, I'm not sure about MySql, but some database engines can use an index if the LIKE does not begin with a wildcard. That is, "somefield like '%x%'" must read every record in the table. But on some db's, "somefield like 'x%'" can use can index to get to the records where the field starts with "x" and then just process those.
You need full-text search
This link may be useful Mysql full-text search

MySQL: Selecting most similar value?

So I have a table with peoples names.
I need to code a query that selects the person with the name most similar to the given one.
For example:
SELECT * FROM people WHERE name='joen'
'joen' doesnt exist in the table, so it will return John, which exists in the table.
What's the MySQL command for this?
You may be looking for SOUNDEX and SOUNDS_LIKE
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html
The Levenshtein algorithm computes the "distance" between words: Levenshtein MySQL function
You can use it like this if you add that function:
SELECT * FROM people WHERE levenshtein('joen', `name`) BETWEEN 0 AND 4;
I'm not an expert on this, but this is no trivial matter. By "similar" do you mean it might be one or 2 letters off, or do you mean "Jon" should match "Jonathan" and "Bill" should match "William"?
If the latter, you might want to find or build a table that maps names/nicknames to eachother, and do a search on that.
If it's misspellings, Levenshtein might be of assistance, but I don't know how you'd integrate that in an SQL query.
The LIKE Keyword might help you
SELECT * FROM people WHERE name LIKE 'jo%'
Selects users where name starts with "jo"
If you're programatticaly changing the query, you
can check for the "result" and query the database again
with the new Query by reducing the characters in the
"name" specified.
SELECT * FROM people WHERE name LIKE 'jones%'
SELECT * FROM people WHERE name LIKE 'jone%'
SELECT * FROM people WHERE name LIKE 'jon%'
SELECT * FROM people WHERE name LIKE 'jo%'

MySQL MATCH for more than one field

SELECT * FROM portfolio
INNER JOIN translation
ON portfolio.description = translation.key
WHERE
MATCH(it_translation.*) AGAINST('test')
Why this code doesn't work?
If I do like this MATCH(it_translation.field) AGAINST('test') everything is ok, but I wanna search FULLTEXT via more than one field, and I don't know how many fields in table.
IIRC for FULLTEXT to work you need a FULLTEXT index that covers every field you want to use it for, so if you "don't know how many fields in table" you won't be able to MATCH it like that.