I have a situation where I'm assembling a query based on user provided criteria and want to figure out what the most efficient way is to do this.
If I have a table that looks like this:
int id | varchar phone | varchar email | varchar RFID
and the user will pass in an array which defines the order (and items) with which they'd like to look up a user, which could look like this:
["id","email","phone"]
or it could look like this:
["email"]
or it could look like this:
["phone","rfid"]
or any other possible combination of those 4 fields.
Based on what I receive I need to look the user up in the order in which these fields arrived and if I find a match, I don't want to keep looking.
In other words if the input is
["email","rfid","phone"]
and I look into the db and find a user with the provided email, I don't want to keep looking to see if their rfid also matches, I just want to return said user.
However, if I don't find such an email, then I want to move on to the rfid.
So, in the various tests I've done (mostly playing with a CASE statement in the WHERE clause), my results have been really terrible: queries frequently take almost a second to return a value, as opposed to <50ms when I simplify the WHERE to search on an individual field.
I should note that all these fields are indexed.
So... my question is, should I just bite the bullet and make as many sql calls as there are items in the incoming array, or is there some really efficient way to structure a single query that will not bog down the system as my various attempts have.
I recognize that this may be too abstract a question, but am hoping that there's some mechanism for just such a use that I'm simply overlooking.
I don't think there's any good way to do a short-circuit in SQL. You can construct a WHERE clause that uses OR to combine the criteria, but doing this generally prevents the query from using the indexes. You can use a UNION like this:
SELECT * FROM
  (SELECT 1 precedence, table.*
   FROM table
   WHERE field1 = 'value'
   UNION ALL
   SELECT 2 precedence, table.*
   FROM table
   WHERE field2 = 'value'
   ...
  ) x
ORDER BY precedence
LIMIT 1
where you replace field1, field2, etc. with the field names from the input array. This will produce the desired results in one query, but it will have to perform all the sub-queries, it won't short-circuit.
The best solution is probably to solve it in the application code. Loop through the fields in the input, and perform a query for just that field. When you get a result, break out of the loop and return it.
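The loop described above could be sketched roughly as follows. This is only an illustration, assuming Python with an SQLite connection; the table name users, the ALLOWED_FIELDS whitelist, and the find_user and demo helpers are hypothetical names, not something from the question. The field name is checked against a whitelist because it is interpolated into the SQL text, while the value is passed as a bound parameter.

```python
import sqlite3

# Columns the caller may search on (hypothetical whitelist). Anything else is
# rejected so that user-supplied field names never reach the SQL text unchecked.
ALLOWED_FIELDS = ("id", "phone", "email", "rfid")


def find_user(conn, criteria):
    """Return the first matching row, trying fields in the order given.

    criteria is an ordered list of (field, value) pairs.
    """
    for field, value in criteria:
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"unsupported lookup field: {field}")
        row = conn.execute(
            f"SELECT id, phone, email, rfid FROM users WHERE {field} = ? LIMIT 1",
            (value,),
        ).fetchone()
        if row is not None:
            return row  # short-circuit: the remaining fields are never queried
    return None


def demo():
    """Build a small in-memory table with one index per searchable column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, phone TEXT, email TEXT, rfid TEXT)"
    )
    for col in ("phone", "email", "rfid"):
        conn.execute(f"CREATE INDEX ix_users_{col} ON users ({col})")
    conn.executemany(
        "INSERT INTO users (id, phone, email, rfid) VALUES (?, ?, ?, ?)",
        [(1, "555-0100", "a@example.com", "R1"), (2, "555-0101", "b@example.com", "R2")],
    )
    return conn
```

Each iteration is a simple single-column lookup that can use that column's index, and in the worst case the number of queries equals the number of fields in the input array.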
I'm trying to get zip codes from the zip_ids that are stored internally in the companies service table. The screenshots below should give you a clear idea.
I have written this query:
companies service table
Any suggestions would be appreciated. Thanks in advance.
As already mentioned, your database schema is not very well designed; it violates even first normal form. You'd need another table where you'd store serv_area_id and zip_code (with possibly multiple rows for a single serv_area_id), search within this table, and then join your original table.
Nevertheless, to get the result you describe you cannot use the IN operator, because it compares a single value against a set of values given as a table (either explicitly via a nested SELECT or as an enumeration literal (val1, ..., valN)). I would try some string matching as illustrated below. However, consider it an ugly hack rather than a correct solution(!)
SELECT zip FROM cities_extended WHERE (
SELECT GROUP_CONCAT(',', serv_are_zipcodes)
FROM company_service_areas WHERE ...
) LIKE concat('%(', id, ')%')
I'm building a simple search system. I have a simple form and I'm doing a query like this:
SELECT * FROM table WHERE column_a LIKE '%term%' OR column_b LIKE '%term%' OR column_c LIKE '%term%';
Is it possible to determine which column the string %term% matched (without using a bunch of if statements)? I'm actually using CakePHP, but at this point I don't mind if I need to build the query manually.
No, the identity of the matched column is lost in the evaluation of your or clauses. You'll need to do some post-processing (i.e., the "bunch of if statements" you were trying to avoid) to identify exactly which column in your result set matched.
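The post-processing step could look something like this minimal sketch. It assumes Python, that each result row is available as a dict keyed by column name, and that the match should be case-insensitive (as LIKE commonly is under MySQL's default collations); the helper name matched_columns is hypothetical.

```python
def matched_columns(row, term, columns=("column_a", "column_b", "column_c")):
    """Return the names of the columns in `row` whose value contains `term`.

    `row` is a dict mapping column names to values; NULLs (None) never match.
    """
    needle = term.lower()
    return [
        name
        for name in columns
        if row.get(name) is not None and needle in str(row[name]).lower()
    ]
```

Calling it once per fetched row replaces the chain of if statements with a single loop over the column names.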
The scenario is this: in a table A, I have one column "tags", which is varchar(255).
In this column I store numbers, separated by commas, like this:
2,14,31,33,56
etc. There can be none, one, or several.
I need to make a SELECT query that will return rows that have a certain number in this field. Right now I'm using this method (don't be alarmed, I know it's a poor way... that's why I'm asking for help!). For example, let's assume the number I want to check is 33. The query is:
SELECT * FROM table_a WHERE
tags LIKE "%,33,%" OR tags LIKE "33,%" OR tags LIKE "%,33" OR tags LIKE "33"
I'm no expert but I know this can't be the method. The first question that comes to mind is: is there a command similar to IN() but that works the other way around?
I mean, can I tell it "find rows where 'tags' contains value 33" ?
While asking this question, I realize that there may be a field type other than varchar(255) better suited to this kind of data (an array of numbers, after all).
Is there a GOOD and efficient way of doing this? My method works for small tables, yes, but if the table grows (say, 10k rows, 50k, 300k...) this is obviously a problem.
The function that you want is find_in_set():
SELECT *
FROM table_a
WHERE find_in_set(33, tags) > 0;
Alternatively, you can simplify your LIKE conditions to:
SELECT *
FROM table_a
WHERE concat(',', tags, ',') LIKE '%,33,%';
Neither of these can make use of an index. Having a separate table with one row per entity and per tag is the right way to go (but I think you know this already).
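As a rough sketch of the separate-table approach, assuming Python with SQLite for the sake of a runnable example: the table table_a_tags, its columns, and the helper names build_demo and rows_with_tag are hypothetical. Each (row, tag) pair becomes its own row, and an index on tag lets the lookup avoid scanning every string.

```python
import sqlite3


def build_demo():
    """Create table_a plus a hypothetical one-row-per-tag table and fill them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE table_a_tags (
            a_id INTEGER NOT NULL REFERENCES table_a (id),
            tag  INTEGER NOT NULL,
            PRIMARY KEY (a_id, tag)
        );
        CREATE INDEX ix_table_a_tags_tag ON table_a_tags (tag);
        """
    )
    # What used to be the strings '2,14,31,33,56', '3,133' and '' in `tags`.
    rows = {1: [2, 14, 31, 33, 56], 2: [3, 133], 3: []}
    for a_id, tags in rows.items():
        conn.execute("INSERT INTO table_a (id, name) VALUES (?, ?)", (a_id, f"row {a_id}"))
        conn.executemany(
            "INSERT INTO table_a_tags (a_id, tag) VALUES (?, ?)",
            [(a_id, tag) for tag in tags],
        )
    return conn


def rows_with_tag(conn, tag):
    """Return the ids of table_a rows carrying `tag`, via an indexed equality join."""
    cur = conn.execute(
        "SELECT a.id FROM table_a a "
        "JOIN table_a_tags t ON t.a_id = a.id "
        "WHERE t.tag = ? ORDER BY a.id",
        (tag,),
    )
    return [r[0] for r in cur]
```

Because the comparison is a plain equality on an integer column, tag 33 can no longer be confused with 133 or 330, which is the problem the four LIKE patterns were working around.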
I have the following problem:
I have a feed capturer that captures news from different sources every half an hour.
I only insert entries that don't have their URLs already in the database (URL is used to see if the record is already in database).
Even with that, I get some repeated entries, because some sites report the same news (usually from a news source like Reuters). I could look for these repeated entries during insertion, but I think this would slow the insertion time even more.
So, I can later find these repeated entries by the title. But I think this search is slow. Then, my idea is to generate a numeric field from the title and then search by this number for repeated titles.
What kind of encoding could I use (I thought of something like the reverse of base64) to encode the titles?
I'm supposing that searching for repeated numbers is a lot faster than searching for repeated words. Is that true or not?
Do you suggest a better solution for this problem?
Well, I don't mind having the repeated entries in the database; I just don't want to show them to the user. Like Google, which filters the repeated results but shows them if you want.
I hope I explained It well. Thanks in advance.
Fill the MD5 hash of the URL and title and build a UNIQUE index on it:
CREATE UNIQUE INDEX ux_mytable_title_url ON mytable (title_hash, url_hash)
INSERT
INTO mytable (url, title, url_hash, title_hash)
VALUES ('url', 'title', MD5('url'), MD5('title'))
To select like Google (one result per title), use this query:
SELECT *
FROM (
SELECT DISTINCT title_hash
FROM mytable
) md
JOIN mytable mo
ON mo.title_hash = md.title_hash
AND mo.url_hash =
(
SELECT url_hash
FROM mytable mi
WHERE mi.title_hash = md.title_hash
ORDER BY
mi.title_hash, mi.url_hash
LIMIT 1
)
So you can use a new table containing only the hashed keys based on title and URL; you then have to add a key on it to speed up the search. But I don't think that you can use an efficient algorithm to transform strings to numbers.
For the hashing, use
SELECT MD5(CONCAT('title', 'url'));
and before every insertion you test whether the hash of the concatenated title and URL already exists in this table.
@Quassnoi can explain better than I can, but I think there is no visible difference in performance between using a VARCHAR/CHAR or an INT in an index that is later used for GROUPing or another method of finding the duplicates. That way you could use the solution he proposed, but with a normal INDEX instead of a UNIQUE index, keeping the duplicates in the database and filtering them out only when showing results to users.
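A minimal sketch of that variant, assuming Python with SQLite so it can be run as-is: hashes are computed in the application with hashlib because SQLite has no built-in MD5 function, the index is a plain (non-unique) one, and duplicates are filtered only when reading. The helper names md5_hex, build_demo and one_per_title and the sample URLs are hypothetical.

```python
import hashlib
import sqlite3


def md5_hex(text):
    """Hex MD5 digest of a string, used here only as a compact lookup key."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_demo():
    """Create mytable with a normal (non-unique) index on the two hash columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE mytable (url TEXT, title TEXT, url_hash TEXT, title_hash TEXT);
        CREATE INDEX ix_mytable_title_url ON mytable (title_hash, url_hash);
        """
    )
    entries = [
        ("http://a.example/1", "Markets rally"),
        ("http://b.example/9", "Markets rally"),  # same story from another site
        ("http://a.example/2", "Rain expected"),
    ]
    conn.executemany(
        "INSERT INTO mytable (url, title, url_hash, title_hash) VALUES (?, ?, ?, ?)",
        [(url, title, md5_hex(url), md5_hex(title)) for url, title in entries],
    )
    return conn


def one_per_title(conn):
    """Return one (url, title) pair per distinct title; duplicates stay stored."""
    cur = conn.execute(
        """
        SELECT mo.url, mo.title
        FROM mytable mo
        WHERE mo.url_hash = (
            SELECT MIN(mi.url_hash)
            FROM mytable mi
            WHERE mi.title_hash = mo.title_hash
        )
        ORDER BY mo.title
        """
    )
    return cur.fetchall()
```

Which of the duplicate URLs is shown is arbitrary here (the one with the smallest hash); a real feed reader would more likely prefer the earliest or the most trusted source.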
Is there a way that I can do a select as such
select * from attributes where product_id = 500
would return
id  name    description
1   wheel   round and black
2   horn    makes loud noise
3   window  solid object you can see through
and the query
select * from attributes where product_id = 234
would return the same results as would any query to this table.
Now obviously I could just remove the WHERE clause and go about my day. But this involves editing code that I don't really want to modify, so I'm trying to fix this at the database level.
So is there a "magical" way to ignore what is in the where clause and return whatever I want using a view or something ?
Even if it was possible, I doubt it would work. Both of those WHERE clauses expect one thing to be returned, therefore the code would probably just use the first row returned, not all of them.
It would also give the database a behaviour that would make future developers pull their hair out trying to understand.
Do it properly and fix the code.
Or you could pass "product_id" instead of an integer, if there's no code checking for that, so the query would become:
select * from attributes where product_id = product_id;
this would give you every row in the table.
If you can't edit the query, maybe you can append to it? You could stick
OR 1=1
on the end.
You may be able to use result set metadata to get what you want, but a result set won't have descriptions of fields. The specific API to get result set metadata from a prepared query varies by programming language, and you haven't said what language you're using.
You can query the INFORMATION_SCHEMA for the products table.
SELECT ordinal_position, column_name, column_comment
FROM INFORMATION_SCHEMA.columns
WHERE table_name = 'products' AND table_schema = 'mydatabase';
You can restructure the database into an Entity-Attribute-Value design, but that's a much more ambitious change than fixing your code.
Or you can abandon SQL databases altogether, and use a semantic data store like RDF, which allows you to query metadata of an entity in the same way you query data.
As far out as this idea seems I'm always interested in crazy ways to do things.
I think the best solution I could come up with is to use a view that uses the products table to get all the products then the attributes table to get the attributes, so every possible product is accounted for and all will get the same result