Searching table efficiently for specific phrase - mysql

I want to search an entire column for a specific phrase. I know I can use the SQL statement of SELECT description FROM questions WHERE description LIKE '%what%' AND description LIKE '%if%' to search for the phrase "What if" in the description column. My problem is that if I have a million entries, then searching through the column might take a while.
Is there a way to search through an entire column efficiently to check if a specific phrase exists?

Full text search provides exactly what you want. By creating a text index for your search column, mysql will look for your phrase in that column efficiently, and will return to you a score based match for your phrase, which you can effectively use to get your result.

Related

Find and show partial text in mysql query

I have one mysql table that has a column that is "description".
In this column, some descriptions contain a word "peso" (this word can be "Peso", "peso", "PESO" but i need all the results).
What i need is one query that gives me all the rows that contains that word in the description text and show in another column on the result the partial text of the description starting in this word.
How can i do it? I don't have fulltext search.
You would do:
select substring(description, instr(lower(description), 'peso'))) as rest_of_string
from table
where lower(description) like '%peso%';
I'm adding the lower because you might be using a case-sensitive collation.

Count the number of times keywords (in a table) appear in a field in another table

I will simplify my problem in order to explain it.
I have a table which contains text messages posted by users and another table which contains keywords.
I want to display, for each user, the number of times keywords are found in text messages.
I don't want the result to display a keyword if it's not found in text messages.
I wan't it to be case INSENSITIVE. All keywords are lowered but in messages, you can find lower & upper chars.
Because I'm not sure that my explanation is clear enough, here comes the SQLFiddle.
http://sqlfiddle.com/#!2/c402a
Hope anyone can help me.
I found what I was looking for. It wasn't easy for me but here is my query :
SELECT t_msg.msg_usr,
t_list.list_word,
count(t_list.list_word),
t_msg.msg_text
FROM t_msg
INNER JOIN t_list
ON LOWER(t_msg.msg_text) LIKE CONCAT("%", t_list.list_word, "%")
GROUP BY t_msg.msg_usr, t_list.list_word;
The SQLFiddle is there : http://sqlfiddle.com/#!2/ba052/8
The recommendation would be to not try solving this with a query. It's possible to write a query that will do it, such query will scan the messages table for each keyword separately, and produce a count (or a row that you can group by), but this won't scale, or be reliable in sense of language search.
Here is what you might want to do:
Create a table to map (user_id, keyword_id) to a count of this keyword in messages of this user. Let's call it t_keyword_count.
Each time you receive a message, before you save the message into the database, search it for all the keywords you care about (using whatever good text search libraries that account for misspellings, etc.). You should know the (user_id) for this message.
You will, at that point, be ready to add the message to the database, and will have an array of (keyword_id) with keywords that this message will have.
In a transaction, insert the message into the t_msg table, and run update/insert for (user_id,keyword_id) to have value=value+1 (or +n, if you need to count the same keyword more than once in the same message) for the t_keyword_count table.
If you are trying to solve the problem of having to do the above on existing data, you can do this manually, just to build up that t_keyword_count table first (depends on how many keywords you have in total, but even if there are a lot, this can be scripted). But you should change (or mirror) the t_msg.msg_text field to be a field suitable for text search, and use SQL text search functionality to find the keywords.

how to search several columns in a sql query using concat and UPPER

So I have a DB and a web page where I want to display the result of my db search
I have try a couple ways both work but they are incomplete for what I want to accomplish
I want to be able to search my table columns id,Id_Nombre,Pais,Estado,Ciudad,website all of them with a word or several words from my search text.
this code works but I have to type exactly the word it's case sensitive:
$query = "SELECT * FROM Medios_table WHERE concat(id,Id_Nombre,Pais,Estado,Ciudad,website) LIKE '%$Busqueda%'";
so as a result i I type a word like "People" in my search box and in my data base that word is type in any of those columns as "people" it wont find it.
second code I use works but only using one column
$query = "SELECT * FROM Medios_table WHERE UPPER(Id_Nombre) LIKE UPPER('%$Busqueda%')";
the result for this code is great since it will find it no matter the case used, but I need to extend this type of search to all the other columns, but so far everithing I use does not work.
I have tried:
$query = "SELECT * FROM Medios_table WHERE UPPER(id,Id_Nombre,Pais,Estado,Ciudad,website) LIKE UPPER('%$Busqueda%')";
$query = "SELECT * FROM Medios_table WHERE UPPER(concat(id,Id_Nombre,Pais,Estado,Ciudad,website)) LIKE UPPER('%$Busqueda%')";
$query = "SELECT * FROM Medios_table WHERE concat(UPPER(id,Id_Nombre,Pais,Estado,Ciudad,website)) LIKE UPPER('%$Busqueda%')";
etc.
any help is greatly appreciated, thanks.
Did you try UPPER around each column name? Like this:
$query = "SELECT * FROM Medios_table WHERE concat(UPPER(id),UPPER(Id_Nombre),UPPER(Pais),UPPER(Estado),UPPER(Ciudad),UPPER(website)) LIKE UPPER('%$Busqueda%')";
Your upper(concat(...)) version should work. If you had a problem with that, it's probably just a typo, like you left out a parenthesis or something.
You can't say upper(x,y,z) because upper takes only one parameter. You must do the concat first, then do the upper, i.e. upper(concat(x,y,z)).
That said, this query will be very slow on a big database, because the db engine has to read every record in the table, and then for each one search it character by character. If the table is small or this is done infrequently, that might be acceptable. If the table is big and/or this query will be executed often, you really need a totally different approach.
Update
If you really need to search against any text in a field, where you cannot make any assumptions about the text being searched in or the text being searched for in advance, you should investigate fulltext searches. http://dev.mysql.com/doc/refman/5.0/en/fulltext-natural-language.html
If you will be doing these sort of searches all the time, you might want to build a dictionary, that is, build a list of all the word with all the records that that word occurs in. I think that's basically what fulltext does, so this may or may not gain you anything.
But if you have some foreknowledge, that is, if it's not really that you want to search for arbitrary text occurring anywhere in arbitrary text, than break out the things you want to search for into separate fields.
To take a simple example, suppose you have a field that contains customer full name, like "Fred Smith", "Mary Jones", etc. You want to search for someone by last name. You could search for
where full_name like '%Smith%'
But this would require reading every record in the table. If, instead, you broke the field into first name and last name, then you could search for
where last_name='Smith'
If you have an index on last_name this would be a very fast search.
If you're just trying to look for an entered value in any of several fields, it is much, much faster to do
where estado='Toledo' or ciudad='Toledo' or pais='Toledo'
(assuming that you have indexes on estado, ciudad, and pais), then
where concat(estado, ciudad, pais) like '%Toledo%'
(where no index will do you any good).
If you want to do case-insensitive searches, create an index on upper(estado) instead of on estado, etc. Then "where upper(estado)='TOLEDO'" can use the index.
Also, I'm not sure about MySql, but some database engines can use an index if the LIKE does not begin with a wildcard. That is, "somefield like '%x%'" must read every record in the table. But on some db's, "somefield like 'x%'" can use can index to get to the records where the field starts with "x" and then just process those.
You need full-text search
This link may be useful Mysql full-text search

How to search a word in MySQL table? MATCH AGAINST or LIKE?

I have a table for some companies that may have many branches in different countries. These countries are inserted in countries field.
Now, I have to make a searching system that allows users to find companies that have any branch in a specific country.
My question is: Which one do I have to use ? MATCH AGAINST or LIKE ? The query must search all records to find complete matched items.
attention: Records may have different country name separated with a comma.
MATCH AGAINST clause is used in Full Text Search.
for this you need to create a full text index on search column countries.
full text index search is much faster than LIKE '%country%' serach.
I would change the implementation: having a field that contains multiple values is a bad idea, for example, it's difficult to maintain - how will you implement remove a country from a company ?.
I believe that a better approach would be to have a separate table companies_countries which will have two columns: company_id and country_id, and could have multiple lines per company.
You should use LIKE . Because as #Omesh mentioned MATCH AGAINST clause is used for Full Text Search.. And Full Text Search need entire column for search.

mysql - extract specific words from text field using full text search

My question is a little simillar to Extract specific words from text field in mysql, but now the same.
I have a text field with words inside. In my language word can have many different endings. I need to find this endings.
I use fulltext search of mysql, but I would need to have access to the index database where all the field is "cut" to words and words are counted. I could then search for "test*" and I could quickly find "test", "tested", "testing". I need the list of all endigns that exist in my database, that is my primary goal.
As it is I can get the records with specific "test*" words in it, but I need not only to locate the occurence in the field, but to group somehow so I get the list of all the words that for example start with "test". I don't need location in which record they are, just a list, grouped so that "testing" is not written 10 times but only once (maybe a counter of how many times it is found but not necessary).
Is there a way to extract this info from fulltextsearch field or should I explode all this fields to words and make a index table full of words and just do a "like "word%" and group by the different results? I am not sure how to do that either in practice, but just to point me to the right direction please.
So to summarize: I have a text fied and I need to find out which words are inside that start with "test", like "tested", "test", "testing" etc... It doesn't make sense in English but in my language it does as we have same word on different endigns and there are so many of them, somethimes 20, I need to find out which ones are there so I can make a synonims table ;-)
UPDATE:
Database has columns ID (int), ingredients (text) and recipe (text).
Data in ingredients are cooking ingredients with different endings like:
1 egg
2 eggs
etc.
You can dump all words that are present in an index. And that would also show frequency of each word. E.g. test is used 200 times and testing is used 300 times.
Manual for that: http://dev.mysql.com/doc/refman/5.0/en/myisam-ftdump.html