Can any one simplify the where condition of this mysql select - mysql

Hi. Can anyone simplify the WHERE condition of this MySQL SELECT statement? It takes a long time to return the result, or it asks for SET SQL_BIG_SELECTS=1.
In the query below:
The postcode contains values like BH12 or SW10,
The *req_area* contains data like Kensington and Chelsea, SW10,
The region has values like Kensington and Chelsea,
The *town_area* has values like West Brompton, Chelsea.
select `a`.`user_id` AS `user_id`,`a`.`req_area` AS `req_area`,`a`.`req_area2` AS `req_area2`,`a`.`req_area3` AS `req_area3`,
`a`.`req_property_type` AS `req_property_type`,`a`.`req_bedrooms` AS `req_bedrooms`,`b`.`latitude` AS `latitude`,
`b`.`longitude` AS `longitude`,`b`.`postcode` AS `postcode`
from (`cff_user_property_req_view` `a` join `cff_uk_short_postcodes` `b`)
where
(`b`.`postcode` regexp concat("'",TRIM(`a`.`req_area`),'|',TRIM(`a`.`req_area2`),'|',TRIM(`a`.`req_area3`),"'")>=1 or
`b`.`region` regexp concat("'",TRIM(`a`.`req_area`),'|',TRIM(`a`.`req_area2`),'|',TRIM(`a`.`req_area3`),"'")>=1 or
`b`.`town_area` regexp concat("'",concat('[[:<:]]',`a`.`req_area`,'[[:>:]]'),'|',concat('[[:<:]]',`a`.`req_area2`,'[[:>:]]'),'|',concat('[[:<:]]',`a`.`req_area3`,'[[:>:]]'),"'")>=1)
order by `a`.`user_id`;
Thanks in advance.

This is slow because your query has to evaluate three regular expressions for every pair of rows in the cross product of the two tables. Regular expressions are slow, and anything that has to go through the whole table to find matching rows is rather slow as well. There is little you can do while preserving the exact semantics of the query you have given.
So instead of asking for ways to improve that query, you might be better off describing what it is you're trying to achieve, and then finding a way to model that in a better way. Fulltext search indices might help. Splitting columns into words and storing those words in an extra table might help. I'm not sure whether it would be better to edit your question, or to leave this question as it now stands and ask a completely new question for that.
You probably should also give an example of what req_area should look like in cases where you expect a match. As the req_area fields are always included in a regular expression, your example won't yield a match, as this long req_area of “Kensington and Chelsea, SW10” is not included in its entirety in any of the other values from your example. Providing some actual examples using sqlfiddle would make it easier for others to experiment with possible queries, thus increasing both the quality of the answers you receive (as the queries have actually been checked) and the chances of receiving any answers at all (because people can go ahead and develop their answers through experiments).
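To illustrate the "extra table of words" idea, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The table names (postcodes, postcode_terms), the BH12 sample row and the comma-splitting rule are all assumptions made up for the example, and it deliberately replaces the regular expressions with exact, indexed term matches, so it does not preserve the semantics of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE postcodes (id INTEGER PRIMARY KEY, postcode TEXT,
                        region TEXT, town_area TEXT);
-- One row per searchable term; the index turns each lookup into a seek.
CREATE TABLE postcode_terms (term TEXT NOT NULL, postcode_id INTEGER NOT NULL);
CREATE INDEX idx_postcode_terms_term ON postcode_terms (term);
""")


def terms_for(postcode, region, town_area):
    """Lower-cased terms for one row (hypothetical splitting rule)."""
    terms = {postcode.strip().lower(), region.strip().lower()}
    terms.update(part.strip().lower() for part in town_area.split(","))
    terms.discard("")
    return terms


# The first row uses the values from the question; the second is invented.
sample = [
    (1, "SW10", "Kensington and Chelsea", "West Brompton, Chelsea"),
    (2, "BH12", "Poole", "Branksome, Parkstone"),
]
for row in sample:
    conn.execute("INSERT INTO postcodes VALUES (?, ?, ?, ?)", row)
    conn.executemany(
        "INSERT INTO postcode_terms VALUES (?, ?)",
        [(term, row[0]) for term in terms_for(*row[1:])],
    )


def find_postcodes(requested_areas):
    """Postcodes matching any of req_area, req_area2, req_area3."""
    wanted = [a.strip().lower() for a in requested_areas if a and a.strip()]
    if not wanted:
        return []
    placeholders = ",".join("?" * len(wanted))
    sql = (
        "SELECT DISTINCT p.postcode FROM postcode_terms AS t "
        "JOIN postcodes AS p ON p.id = t.postcode_id "
        f"WHERE t.term IN ({placeholders}) ORDER BY p.postcode"
    )
    return [r[0] for r in conn.execute(sql, wanted)]
```

With this layout a request for "Chelsea" finds SW10 through an equality lookup on the indexed term column instead of a regular expression over every row pair.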

Related

SQL Index on Strings Helpful?

So I have used MySQL a lot in small projects, for school; however, I'm now taking over an enterprise-ish scale project, and now speed matters, not just getting the right information back. I have Googled around a lot trying to learn how indexes might make my website faster, and I am hoping to further understand how they work, not just when to use them.
So, I find myself doing a lot of SELECT DISTINCT queries in order to get all the distinct values, so I can populate my dropdowns. I have heard that this would be faster if this column was indexed; however, I don't completely understand why. If the values in this column were ints, I would totally understand; basically a data structure like a BST would be created, and search times could be O(log n); however, if my column is strings, how can it put a string in a BST? This doesn't seem possible, since there is no metric to compare a string against another string (like there is with numbers). It seems like an index would just create a list of all the possible values for that column, but it seems as if the search would still require the database to go through every single row, making this search linear, just as if the database scanned a regular table.
My second question is what does the database do once it finds the right value in the index data structure. For example, let's say I'm doing a where age = 42. So, the database goes through the data structure until it finds 42, but how does it map that lookup to the whole row? Does the index have some sort of row number associated with it?
Lastly, if I am doing these frequent SELECT DISTINCT statements, is adding an index going to help? I feel like this must be a common task for websites, as many sites have dropdowns where you can filter results, I'm just trying to figure out if I'm approaching it the right way.
Thanks in advance.
Your logic is good; however, your assumption that there is no metric to compare strings to other strings is incorrect. Strings can simply be compared in alphabetical order, giving them a perfectly usable comparison metric that can be used to build the index.
It takes a tiny bit longer to compare strings than it does ints; however, having an index still speeds things up, regardless of the comparison cost.
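As a rough illustration of the idea (not of how MySQL actually stores its indexes, which are B-trees ordered by the column's collation), the following Python sketch keeps a sorted list of (value, row position) pairs. The rows, the city column and the function names are made up for the example. It shows that strings sort just fine, that a lookup is a binary search, and that each index entry carries a pointer back to its row.

```python
import bisect

# Hypothetical table; a row's position in the list stands in for a row pointer.
rows = [
    {"id": 1, "city": "London"},
    {"id": 2, "city": "Bristol"},
    {"id": 3, "city": "York"},
    {"id": 4, "city": "Bristol"},
    {"id": 5, "city": "Leeds"},
]

# The "index": (value, row position) pairs kept in sorted order.
index = sorted((row["city"], pos) for pos, row in enumerate(rows))


def lookup(value):
    """Binary-search the index, then follow the stored positions to the rows."""
    start = bisect.bisect_left(index, (value,))
    matches = []
    for key, pos in index[start:]:
        if key != value:
            break
        matches.append(rows[pos])
    return matches


def distinct_values():
    """Walk the sorted index once; equal values sit next to each other."""
    values = []
    for key, _ in index:
        if not values or values[-1] != key:
            values.append(key)
    return values
```

Here lookup("Bristol") returns the rows with ids 2 and 4 without scanning the table, and distinct_values() only has to read the index, which is why an index can help a SELECT DISTINCT on that column.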
I would like to mention however that if you are using SELECT DISTINCT as much as you say, there are probably problems with your database schema.
You should learn about normalizing your database. I recommend starting with this link: http://databases.about.com/od/specificproducts/a/normalization.htm
Normalization will provide you with a querying mechanism whose benefits can vastly outweigh those received from indexing.
If your strings are something small like categories, then an index will help. If you have large chunks of random text, then you will likely want a full text index. If you are having to use SELECT DISTINCT a lot, your database may not be properly normalized for what you are doing. You could also put the distinct values in a separate table (that only has the distinct values), but this only helps if the content does not change a lot. Indexing strategies are particular to your application's access patterns, the data itself, and how the tables are normalized (or not).
HTH

how to perform MySQL smart text search in a column?

I am trying to search for a shop name in one of my MySQL tables; the table has a field called fullName. As of now I am using the SOUNDS LIKE method of MySQL; however, here's an example that failed:
Say I have the string Banana's Shop. Then using SOUNDS LIKE with a query of 'nana' or 'bananas' won't give me the result. Here's my current query:
SELECT `fullName` FROM `shop` WHERE `fullName` SOUNDS LIKE 'nana';
Is there a better way to do a simple search like this in MySQL that is smarter, so that typos would also still match?
The ancient and slightly honorable SOUNDEX algorithm used by SOUNDS LIKE doesn't handle suffix sounds. That is, nana doesn't, and can't, match banana. banani will match banana, however.
Two utterances don't necessarily sound alike unless they have the same number of syllables. It's good for matching stuff like surnames: Smith, Schmitt, and Schmidt all have the same SOUNDEX value.
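To make the behaviour concrete, here is a small Python sketch of the common four-character Soundex variant. It is an illustration only: MySQL's own SOUNDEX() differs in details (for example, it can return strings longer than four characters), so the exact codes need not be identical to what the server produces.

```python
# Letter groups of the common Soundex variant; vowels, H, W and Y get no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Four-character Soundex code: first letter plus up to three digits."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (first + "".join(digits) + "000")[:4]
```

Because the code always starts with the word's first letter, soundex("nana") and soundex("banana") can never be equal, while "banani" and "banana" collapse to the same code, as do Smith, Schmitt and Schmidt.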
Calling SOUNDEX 'smart text search' is an exaggeration. http://en.wikipedia.org/wiki/Soundex
You might consider MySQL FULLTEXT search, which you can look up. This does a certain amount of phrase matching. That is, if you had "banana shop" and "banana slug" in your column, the word "banana" would have a shot at matching both those values.
Be careful with FULLTEXT. It works counterintuitively when you have fewer than about a couple of hundred rows in the table you're searching.
But that's not a typo-friendly word matcher. What you're asking isn't really easy.
You could consider the Levenshtein algorithm (which you can look up). But it's a hairball to get working properly.
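For reference, this is roughly what the Levenshtein (edit distance) approach looks like when done in application code rather than inside MySQL. It is a sketch under assumptions: the closest() helper, the sample shop names and the max_distance threshold of 2 are invented for illustration, and comparing the query against every name is only practical for small tables.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of single-character inserts,
    deletes and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def closest(query, names, max_distance=2):
    """Names within max_distance edits of the query, nearest first."""
    query = query.lower()
    scored = [(levenshtein(query, name.lower()), name) for name in names]
    return [name for dist, name in sorted(scored) if dist <= max_distance]


shops = ["Banana's Shop", "Apple Store"]  # made-up sample data
```

With this, closest("Bananas Shop", shops) finds "Banana's Shop" because the two differ by one character. Note that edit distance tolerates typos but not partial input: 'nana' is still far from "Banana's Shop", so a substring search would have to be handled separately.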

Searching on a table using mysql table?

I was asked this question in an interview:
A form has the following fields: "price", "tool", "range_max", "range_min", "age_of_tool". Any field can be NULL.
How would you implement the search on the table as a query?
My answer was:
I will use "AND" while searching...
What do you think is the right query I should give them?
Personally, I think they're looking for intelligent search ideas based on the available criteria, not the logical join condition you're looking to use.
Contrary to what Ajay says, an AND search is more appropriate. Think of it from your own perspective when searching for stuff on the Internet. If you look for 'red' things and 'small' things, you'd expect the search results to only contain small red things.
Otherwise, take a look at those fields. They pretty much tell you what you need to search. For example, anything that uses "range_min" to discriminate is almost always going to be :-
WHERE [SomeValue] > range_min
Think logically about how each field might be employed in a search, and you'll probably arrive at a better understanding than any of the ready-rolled answers you receive on this question.
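One possible reading of that, sketched in Python with the built-in sqlite3 module: only the fields the user actually filled in become conditions, and they are combined with AND. The tools table, its sample rows and the meaning given to each field (range_min and range_max bounding the price, age_of_tool as a maximum age) are assumptions, since the interview question does not define them.

```python
import sqlite3

# Hypothetical mapping from form field to a parameterised condition.
FILTERS = {
    "tool": "tool = ?",
    "price": "price = ?",
    "range_min": "price >= ?",
    "range_max": "price <= ?",
    "age_of_tool": "age_of_tool <= ?",
}


def build_query(form):
    """Build SQL and parameters, skipping every field left as None (NULL)."""
    clauses, params = [], []
    for field, clause in FILTERS.items():
        value = form.get(field)
        if value is not None:
            clauses.append(clause)
            params.append(value)
    sql = "SELECT tool, price, age_of_tool FROM tools"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY tool, price", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tools (tool TEXT, price INTEGER, age_of_tool INTEGER)")
conn.executemany(
    "INSERT INTO tools VALUES (?, ?, ?)",
    [("drill", 50, 2), ("saw", 20, 5), ("drill", 120, 1)],  # made-up rows
)


def search(form):
    sql, params = build_query(form)
    return conn.execute(sql, params).fetchall()
```

An empty form returns every row, and each extra field narrows the result, which matches the "small red things" intuition above.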
No, you should use OR in place of AND if you want to search in every column.
When we use AND, the value must be present in all columns, but with OR, if it is present in just one column, you will still get the records matching the search criteria you specified.

MySQL Fulltext search but using LIKE

I've recently been doing some string searches on a table with about 50k strings in it, fairly large I'd say but not that big. I was doing some nested queries for a 'search within results' kind of thing. I was using the LIKE statement to match a searched keyword.
I came across MySQL's Full-Text search, which I tried, so I added a fulltext index to my str column. I'm aware that Full-text searches don't work on virtually created tables or even with Views, so queries with sub-selects will not fit. I mentioned I was doing nested queries; an example is:
SELECT s2.id, s2.str
FROM
(
SELECT s1.id, s1.str
FROM
(
SELECT id, str
FROM strings
WHERE str LIKE '%term%'
) AS s1
WHERE s1.str LIKE '%another_term%'
) AS s2
WHERE s2.str LIKE '%a_much_deeper_term%';
This is actually not applied to any code yet; I was just doing some tests. Also, searching strings like this can be easily achieved by using Sphinx (performance-wise), but let's consider Sphinx not being available, and I want to know how well this will work in a pure SQL query. Running this query on a table without Full-text added takes about 2.97 secs (depending on the search term). However, running this query on a table with Full-text added to the str column finished in like 104ms, which is fast (I think?).
My question is simple, is it valid to use LIKE or is it a good practice to use it at all in a table with Full-text added when normally we would use MATCH and AGAINST statements?
Thanks!
In this case you do not necessarily need subselects. You can simply use:
SELECT id, str
FROM item_strings
WHERE str LIKE '%term%'
AND str LIKE '%another_term%'
AND str LIKE '%a_much_deeper_term%'
... but this also raises a good question: the order in which you are excluding the rows. I guess MySQL is smart enough to assume that the longest term will be the most restrictive, so starting with a_much_deeper_term it will eliminate most of the records and then perform additional comparisons only on a few rows. Contrary to this, if you start with term you will probably end up with many possible records, which you then have to compare against the rest of the terms.
The interesting part is that you can force the order in which the comparison is made by using your original subselect example. This gives you the opportunity to decide which term is the most restrictive based upon more than just the length, for example:
the ratio of consonants to vowels
the longest chain of consonants of the word
the most used vowel in the word
...etc. You can also apply some heuristics based on the type of textual information you are handling.
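As a quick sanity check that the nested subselects and the single WHERE with three AND-ed LIKE conditions select the same rows, here is a small sketch using Python's built-in sqlite3 module. The table contents and the three search terms are made up, and SQLite stands in for MySQL, so it says nothing about which form MySQL executes faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE strings (id INTEGER PRIMARY KEY, str TEXT)")
conn.executemany(
    "INSERT INTO strings VALUES (?, ?)",
    [
        (1, "a term, another phrase and a deeper one"),
        (2, "just a term"),
        (3, "another term here"),
        (4, "deeper another term"),
    ],
)

# Patterns for the three hypothetical search terms.
terms = ["%term%", "%another%", "%deeper%"]

# Nested form, shaped like the query in the question.
nested = """
SELECT s2.id FROM (
    SELECT s1.id, s1.str FROM (
        SELECT id, str FROM strings WHERE str LIKE ?
    ) AS s1
    WHERE s1.str LIKE ?
) AS s2
WHERE s2.str LIKE ?
ORDER BY s2.id
"""

# Flat form: one pass over the table with all three conditions.
flat = """
SELECT id FROM strings
WHERE str LIKE ? AND str LIKE ? AND str LIKE ?
ORDER BY id
"""

nested_ids = [r[0] for r in conn.execute(nested, terms)]
flat_ids = [r[0] for r in conn.execute(flat, terms)]
```

Both lists contain the ids of rows 1 and 4, so the choice between the two forms is about evaluation order and performance, not about the result.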
Edit:
This is just a hunch, but it could be possible to apply the LIKE to the words in the fulltext index itself, then match the rows against the index as if you had searched for full words.
I'm not sure if this is actually done, but it would be a smart thing to pull off by the MySQL people. Also note that this theory can only be used if all possible occurrences are in fact in the fulltext index. For this you need that:
Your search pattern must be at least the size of the minimal word length. (If you are searching for example %id%, then it can be a part of a 3-letter word too, which is excluded by default from the FULLTEXT index.)
Your search pattern must not be a substring of any listed excluded word for example: and, of etc.
Your pattern must not contain any special characters.

MySQL search in comma list [duplicate]

I have a MySQL field with a reference to another table where ids are saved as a comma-separated list, e.g.:
12,13,14,16
which stand for values in another table. I know this is very bad and wrong, but this comes from above and I can't do anything about that. The problem now is that I want to search in that field with a query like this:
SELECT ... WHERE field LIKE '%1%'
The problem now is obviously that almost all entries can be found with this example query, because the most common IDs are in the range 10-20. My idea is to search for %,1,% instead, but this does not work for the first and last id in the field. Is there something like an internal replace, or how do I fix this the best way?
You need the FIND_IN_SET function:
SELECT ... WHERE FIND_IN_SET('1', field)
Be aware that plain FIND_IN_SET is case-insensitive,
i.e. FIND_IN_SET('b3','a1,a2,B3,b3') and FIND_IN_SET('B3','a1,a2,B3,b3') both return 3.
To be case sensitive, add 'binary' modifier to the 1st argument, e.g. FIND_IN_SET (binary 'b3', 'a1,a2,B3,b3') returns 4.
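To show why the LIKE '%1%' approach over-matches and what an exact element match does differently, here is a small Python emulation of the behaviour described above. It is not MySQL code: the find_in_set function and its binary flag are stand-ins written for this example, and in MySQL itself the case sensitivity follows from the collation of the arguments.

```python
def find_in_set(needle, csv, binary=False):
    """Emulate the described FIND_IN_SET behaviour: return the 1-based
    position of needle among the comma-separated items, or 0 if absent."""
    if not csv:
        return 0
    items = csv.split(",")
    if not binary:
        # Case-insensitive comparison, as in the plain call described above.
        needle = needle.lower()
        items = [item.lower() for item in items]
    for position, item in enumerate(items, start=1):
        if item == needle:
            return position
    return 0


field = "12,13,14,16"          # the example value from the question
substring_hit = "1" in field   # what LIKE '%1%' effectively tests
```

Here substring_hit is True even though id 1 is not in the list, whereas find_in_set("1", field) is 0 and find_in_set("13", field) is 2. The case examples from above behave the same way: position 3 without the binary flag, position 4 with it.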
As others have said, Find_In_Set will let you write the query, but you really need to look at your database design (and I know you know this...)
The trouble with including Foreign Keys in a delimited list like this is that the whole point of a foreign key is to enable you to locate the information in the other table quickly, using Indexes. By implementing a database as it sounds you have, you have all sorts of issues to resolve:
How do I prevent duplicates (which would waste space)?
How do I remove a given value? (Requires a custom function, leading to the possibility of errors.)
How do I respond to performance issues as the size of my tables increases?
There's only one truly acceptable way to address this - which is not to face the problem in the first place.
Have a sit down chat with those on high, and explain the problems with their solution - then explain the advantages of doing the job properly.
If they won't even discuss the point, look for a job with a decent employer who values your contributions.
Martin.
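For comparison, here is a sketch of what "doing the job properly" could look like: one row per reference in a separate link table. It uses Python's built-in sqlite3 module instead of MySQL, and the item_refs table, its column names and the sample ids are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per (item, referenced id) instead of a comma-separated string.
CREATE TABLE item_refs (
    item_id INTEGER NOT NULL,
    ref_id  INTEGER NOT NULL,
    PRIMARY KEY (item_id, ref_id)   -- duplicates are impossible
);
CREATE INDEX idx_item_refs_ref ON item_refs (ref_id);  -- fast lookup by ref
""")


def add_refs(item_id, csv):
    """Migrate one legacy comma-separated value into link rows."""
    pairs = [(item_id, int(part)) for part in csv.split(",") if part.strip()]
    conn.executemany("INSERT OR IGNORE INTO item_refs VALUES (?, ?)", pairs)


def items_with(ref_id):
    """Items referencing ref_id: an indexed equality test, no string search."""
    sql = "SELECT item_id FROM item_refs WHERE ref_id = ? ORDER BY item_id"
    return [r[0] for r in conn.execute(sql, (ref_id,))]


def remove_ref(item_id, ref_id):
    """Removing a value is a plain DELETE, no string surgery needed."""
    conn.execute(
        "DELETE FROM item_refs WHERE item_id = ? AND ref_id = ?",
        (item_id, ref_id),
    )


add_refs(1, "12,13,14,16")
add_refs(2, "1, 12,12")  # stray space and duplicate are absorbed on import
```

Each of the three issues listed above maps to something the database already does: the primary key rejects duplicates, remove_ref is a single DELETE, and the index on ref_id keeps lookups fast as the table grows.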
FIND_IN_SET is your best bet
SELECT ... WHERE FIND_IN_SET(1,field_name)
After reading this question I'd like to add that if your comma-delimited list has spaces, i.e. (1, 2,3 ,4), you will need to remove the leading/trailing spaces or use WHERE FIND_IN_SET(X,field) OR FIND_IN_SET(' X',field) OR FIND_IN_SET('X ',field)... Just thought I'd share that since I came across that problem. You just have to create those databases right the first time or they will give you all kinds of trouble.