Afternoon, here is my problem:
A user can make multiple selections for a search, upon which we need to select from a single column in a table where we have matching user ids to the criteria (you'll see example query below). These items can be in an assortment of AND/OR. For example:
User ID can have items:
1 and 3, or 4
or:
1 and 3 and 4
The user will enter the logic they want to apply in their search so we don't know for sure what the criteria will be in the select (AND/OR). I need to find an efficient solution for running a query/process to return the results.
I have tried a PHP process that selects all of the criteria and then sorts through the results based on the logic the user chooses (AND/OR), but this is a massive performance overhead on some large tables. One search was still running after a day.
The following query works very effectively, and unless someone has any other ideas I may have to write some code to generate a SQL statement like the one below and then run the query to get the results. It gets complicated with the mix of ANDs and ORs though. This query is simple in that the logic is all AND:
SELECT DISTINCT va1.userid FROM validation_abilities AS va1
INNER JOIN validation_abilities AS va2 ON va1.userid = va2.userid
INNER JOIN validation_abilities AS va3 ON va1.userid = va3.userid
INNER JOIN validation_abilities AS va4 ON va1.userid = va4.userid
INNER JOIN validation_abilities AS va5 ON va1.userid = va5.userid
WHERE va1.ability_id = '179'
AND va2.ability_id = '178'
AND va3.ability_id = '289'
AND va4.ability_id = '287'
AND va5.ability_id = '328'
I just need to see if there is a more efficient design to this, what do you think?
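One alternative to generating a self-join per criterion is conditional aggregation: group by userid and express the user's AND/OR logic in the HAVING clause. The sketch below is only an illustration under assumptions, not a tested replacement for the query above: it uses Python's sqlite3 module with an in-memory table, made-up ability ids, and two hypothetical helpers (has, users_matching). The generated SQL itself is plain enough that it should carry over to MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE validation_abilities (userid INTEGER, ability_id INTEGER)")
conn.executemany(
    "INSERT INTO validation_abilities VALUES (?, ?)",
    [(10, 1), (10, 3), (20, 4), (30, 1), (40, 1), (40, 3), (40, 4)],
)

def has(ability_id):
    # True for a user group that contains at least one row with this ability.
    # int() keeps the interpolated value numeric.
    return f"MAX(CASE WHEN ability_id = {int(ability_id)} THEN 1 ELSE 0 END) = 1"

def users_matching(having_sql, ability_ids):
    # The WHERE ... IN only narrows the scan to the abilities mentioned;
    # the actual AND/OR logic lives in HAVING.
    placeholders = ",".join("?" for _ in ability_ids)
    sql = (
        "SELECT userid FROM validation_abilities "
        f"WHERE ability_id IN ({placeholders}) "
        f"GROUP BY userid HAVING {having_sql} ORDER BY userid"
    )
    return [row[0] for row in conn.execute(sql, ability_ids)]

# (1 AND 3) OR 4
either = users_matching(f"({has(1)} AND {has(3)}) OR {has(4)}", [1, 3, 4])
# 1 AND 3 AND 4
all_three = users_matching(f"{has(1)} AND {has(3)} AND {has(4)}", [1, 3, 4])
print(either, all_three)
```

Each criterion becomes one has(...) term, so the user's expression maps onto the HAVING clause one-to-one and the table is read once however many criteria there are. An index on (ability_id, userid) would probably help, but that is a guess without seeing the real data.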
I have these two tables:
Achievement:
Achieves:
Question:
I want to retrieve rows from table Achievement. But, I do not want all the rows, I want the rows that a specific Steam ID has acquired. Let's take STEAM_0:0:46481449 for example, I want to check first the list of IDs that STEAM_0:0:46481449 has acquired (4th column in Achieves table states whether achievement is acquired or not) and then read only those achievements.
I hope that made sense, if not let me know so I can explain a little better.
I know how to do this with two MySQL statements, but can this be done with a single MySQL statement? That would be awesome if so please tell me :D
EDIT: I will add the two queries below
SELECT * FROM Achieves WHERE Achieves.SteamID = 'STEAM_0:0:46481449' AND Achieves.Acquired = 1;
Then after that I do the following query
SELECT * FROM Achievement;
And then through PHP I would check the IDs that I should take and output those. That's why I wanted to get the same result in 1 query since it's more readable and easier.
In a SQL LEFT JOIN, conditions on the second table that are placed in the WHERE clause filter the result regardless of the join condition, so only the matching rows remain:
Select * from achievement
left join achieves on (achievement.id=achieves.id)
where achieves.acquired=1 and achieves.SteamID = 'STEAM_0:0:46481449'
Besides, I suggest not using ID in the Achieves table as the shared key between the two tables. Name it something else.
I don't think a left join makes sense here. There is no case where you don't want to see the Achievement table.
Something like this
SELECT *
FROM Achieves A
JOIN Achievement B on A.ID = B.ID
WHERE A.SteamID = 'STEAM_0:0:46481449'
AND A.Acquired = 1;
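To try the single-query version without a MySQL server, here is a minimal sketch using Python's sqlite3 module. The column layout (Name, Progress) and the sample rows are assumptions, since the original table definitions are not shown; only ID, SteamID and Acquired come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Achievement (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Achieves (ID INTEGER, SteamID TEXT, Progress INTEGER, Acquired INTEGER);
INSERT INTO Achievement VALUES (1, 'First Blood'), (2, 'Marathon'), (3, 'Collector');
INSERT INTO Achieves VALUES
    (1, 'STEAM_0:0:46481449', 10, 1),
    (2, 'STEAM_0:0:46481449', 3, 0),
    (3, 'STEAM_0:0:46481449', 7, 1),
    (2, 'STEAM_0:0:1', 9, 1);
""")

# One query: join the player's acquired rows to the achievement definitions.
acquired = conn.execute(
    """
    SELECT B.ID, B.Name
    FROM Achieves A
    JOIN Achievement B ON A.ID = B.ID
    WHERE A.SteamID = ? AND A.Acquired = 1
    ORDER BY B.ID
    """,
    ("STEAM_0:0:46481449",),
).fetchall()
print(acquired)
```

Only achievements 1 and 3 come back: achievement 2 is not acquired by this Steam ID, and the row belonging to the other player is filtered out by the WHERE clause.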
Below is my query
SELECT COUNT(t.prid)
FROM (
SELECT pr.prid
FROM jcp
INNER JOIN pr ON pr.prid = jcp.prid
WHERE jcp.custid = 123
UNION
SELECT pr.prid
FROM jcl
INNER JOIN pr ON pr.prid = jcl.prid
WHERE jcl.custid = 123
) AS t
Is there any way to make it more efficient? This query is inside a function that is executed thousands of times, which makes it slow.
First of all, your query appears to be combining two very different types of data in your 'union' - the first part being the count of an ID, and the second being the literal ID - so I would question whether this is really doing what you intend it to do as written. However, just taking it at face value, you could eliminate the subquery in the first part as follows:
SELECT COUNT(pr.prid)
FROM jcp
INNER JOIN pr
ON pr.prid = jcp.prid
WHERE jcp.custid = 123
I can't say how much that would help your performance without knowing the context of your data, but it certainly wouldn't hurt.
Given the difference in the two data sets, it doesn't appear possible to avoid the union if you want to force these two different bits of data into the same column. If you were to put them into different columns, you could probably avoid the union.
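For reference, a small sketch of what the query computes when read exactly as posted, with COUNT wrapping the whole UNION subquery: the number of distinct prid values that customer 123 has in either jcp or jcl. The table layouts and rows below are assumptions made up for the illustration, run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pr  (prid INTEGER PRIMARY KEY);
CREATE TABLE jcp (prid INTEGER, custid INTEGER);
CREATE TABLE jcl (prid INTEGER, custid INTEGER);
INSERT INTO pr  VALUES (1), (2), (3);
INSERT INTO jcp VALUES (1, 123), (2, 123), (3, 999);
INSERT INTO jcl VALUES (2, 123), (3, 123);
""")

def count_for(customer, set_operator):
    # set_operator is "UNION" (removes duplicates) or "UNION ALL" (keeps them).
    sql = f"""
        SELECT COUNT(t.prid) FROM (
            SELECT pr.prid FROM jcp
            INNER JOIN pr ON pr.prid = jcp.prid
            WHERE jcp.custid = ?
            {set_operator}
            SELECT pr.prid FROM jcl
            INNER JOIN pr ON pr.prid = jcl.prid
            WHERE jcl.custid = ?
        ) AS t
    """
    return conn.execute(sql, (customer, customer)).fetchone()[0]

distinct_count = count_for(123, "UNION")
raw_count = count_for(123, "UNION ALL")
print(distinct_count, raw_count)
```

prid 2 appears in both tables for this customer, so UNION counts it once and UNION ALL counts it twice. Whether the duplicate removal is actually wanted depends on what the surrounding function needs.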
This is really a two-part question, but in order not to mix things up, I'll divide into two actual questions. This one is about creating the correct SQL statement for selecting a row based on values in a many-to-many related table:
Now, the question is: what is the absolute simplest way of getting all resources where e.g. metadata.category = 'subject' AND where that category's corresponding metadata.value = 'introduction'?
I'm sure this could be done in a lot of different ways, but I'm a novice in SQL, so please provide the simplest way possible... (If you could describe briefly what the statement means in plain English that would be great too. I have looked at introductions to SQL, but none of those I have found (for beginners) go into these many-to-many selections.)
The easiest way is to use the EXISTS clause. I'm more familiar with MSSQL but this should be close
SELECT *
FROM resources r
WHERE EXISTS (
SELECT *
FROM metadata_resources mr
INNER JOIN metadata m ON (mr.metadata_id = m.id)
WHERE mr.resource_id = r.id AND m.category = 'subject' AND m.value = 'introduction'
)
Translated into English, it's 'return me all records where this subquery returns one or more rows, without returning the data for those rows'. The subquery is correlated to the outer query by the predicate mr.resource_id = r.id, which uses the current outer row as the predicate value.
I'm sure you can google around for more examples of the EXISTS clause.
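A quick way to see the EXISTS query in action is an in-memory SQLite database driven from Python. This is only a sketch: the column names follow the answer's query, while the title column and the sample rows are assumptions, since the original table definitions are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE resources (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE metadata (id INTEGER PRIMARY KEY, category TEXT, value TEXT);
CREATE TABLE metadata_resources (resource_id INTEGER, metadata_id INTEGER);
INSERT INTO resources VALUES (1, 'SQL basics'), (2, 'Query tuning'), (3, 'Author notes');
INSERT INTO metadata VALUES
    (1, 'subject', 'introduction'),
    (2, 'subject', 'advanced'),
    (3, 'author', 'introduction');
INSERT INTO metadata_resources VALUES (1, 1), (1, 3), (2, 2), (3, 3);
""")

# Keep a resource if at least one linked metadata row has BOTH
# category = 'subject' and value = 'introduction'.
matches = conn.execute(
    """
    SELECT r.id, r.title
    FROM resources r
    WHERE EXISTS (
        SELECT 1
        FROM metadata_resources mr
        INNER JOIN metadata m ON mr.metadata_id = m.id
        WHERE mr.resource_id = r.id
          AND m.category = ?
          AND m.value = ?
    )
    ORDER BY r.id
    """,
    ("subject", "introduction"),
).fetchall()
print(matches)
```

Resource 3 is linked to a row whose value is 'introduction' but whose category is 'author', so it is not returned: both conditions have to hold on the same metadata row.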
Ok, here we go. There's this messy SELECT crossing other tables and ordering to get the one desired row. Basically I do the "math" inside the ORDER BY.
1 base table.
7 JOINs pointing to local tables.
WHERE with 2 clauses and a NOT IN crossing another table.
You'll see in the code the ORDER BY is pretty damn big/ugly, it sums the result of 5 different calculations. I need that result to order by those calculations in order to get the worst row-case.
The problem is that once I execute the stored procedure it takes up to 8 seconds to run. That's not acceptable, so I'm starting to check indexes.
So, I'm looking for advice on how to make this query run faster.
I'm indexing the columns in the WHERE clauses and the field LINEA. Should I index something else, like the columns I'm using in the JOINs? Or should I approach the query differently?
Query:
SET @LINEA = (
SELECT TOP 1
BOA.LIN
FROM
BAND_BA BOA
LEFT JOIN
TEL PAR
ON REPLACE(BOA.Lin,'-','') = SUBSTRING(PAR.Te,2,10)
LEFT JOIN
TELP CLP
ON REPLACE(BOA.Lin,'-','') = SUBSTRING(CLP.Numtel,2,10)
LEFT JOIN
CA C
ON REPLACE(BOA.Lin,'-','') = C.An
LEFT JOIN
RE R
ON REPLACE(BOA.Lin,'-','') = R.Lin
LEFT JOIN
PRODUCTOS2 P2
ON BOA.PRODUCTO = P2.codigo
LEFT JOIN
EN
ON REPLACE(BOA.Lin,'-','') = EN.G
LEFT JOIN
TIP ID
ON TIPID = ID.ID
WHERE
BOA.EST = 'C' AND
ID.SE = 'boA' AND
BOA.LIN NOT IN (
SELECT
LIN
FROM
BAN
)
ORDER BY (EN.VALUE + ANT.VALUE + REIT.VAL + C.VALUE + TEL.VALUE
) DESC,
I'll be frank, this is some pretty terrible SQL. Without seeing all your table structures, advice here will be incomplete. That being said, please don't post all your table structures because you are already very close to "hire a consultant" territory with this.
All the REPLACE logic should be done away with. If you need to JOIN on these fields, then add comparable fields to the tables so you don't need to manipulate the data. Every single JOIN that uses a REPLACE or SUBSTRING is a table or index scan - those are non-SARGable and a definite anti-pattern.
The ORDER BY is probably the most convoluted ORDER BY I have ever seen. Some major issues there:
Subqueries should all be eliminated and materialized either in the outer query or as variables
String manipulation should be eliminated (see item 1 above)
The entire query is basically a code smell. If you need to write code like this to meet business requirements then you either have a terribly inappropriate design or some other much larger issue in the organization or data.
One thing that can kill performance is using a lot of LEFT JOINs. To improve performance of LEFT JOIN, you might want to make sure that the column(s) to which you join have an index - that can have a huge impact on performance.
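The two suggestions above (join on a comparable stored column instead of REPLACE, and index the join columns) can be sketched as follows. This is a minimal illustration in SQLite via Python, not the original SQL Server schema: LinDigits and both index names are hypothetical, and only BAND_BA.Lin and CA.An are taken from the posted query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BAND_BA (Lin TEXT, LinDigits TEXT);  -- LinDigits is a hypothetical added column
CREATE TABLE CA (An TEXT);
INSERT INTO BAND_BA (Lin) VALUES ('555-123-4567'), ('555-999-0000');
INSERT INTO CA (An) VALUES ('5551234567');

-- One-off backfill: strip the dashes once, at write time, not on every join.
UPDATE BAND_BA SET LinDigits = REPLACE(Lin, '-', '');

-- Index both sides of the join (index names are made up).
CREATE INDEX ix_band_ba_lindigits ON BAND_BA (LinDigits);
CREATE INDEX ix_ca_an ON CA (An);
""")

# The join is now a plain column-to-column equality the engine can seek on.
rows = conn.execute(
    """
    SELECT BOA.Lin, C.An
    FROM BAND_BA BOA
    LEFT JOIN CA C ON BOA.LinDigits = C.An
    ORDER BY BOA.Lin
    """
).fetchall()
print(rows)
```

In SQL Server the same idea is often implemented with a persisted computed column so the stored value cannot drift from Lin. Whether that fits here depends on the real table definitions, which are not shown.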
SELECT cf.FK_collection, c.collectionName,
uf.FK_userMe, uf.FK_userYou,
u.userId, u.username
FROM userFollows as uf
INNER JOIN collectionFollows as cf ON uf.FK_userMe = cf.FK_user
INNER JOIN collections as c ON cf.FK_collection = c.collectionId
INNER JOIN users as u ON uf.FK_userYou = u.userId
WHERE uf.FK_userMe = 2
Hey guys.
I'm trying to make this query, and of course it won't do what I want: it returns multiple rows, which is in some way what I want, and yet it's not. Let me try to explain:
I'm trying to get both collectionFollows and userFollows, to show a user's activity on the site. But when doing this, I get multiple rows from userFollows even though the user only follows one other user. This occurs because I'm following multiple collections.
So when I show my result it will return like this:
John is following 'webdesign'
John is following 'Lisa'
John is following 'programming'
John is following 'Lisa'
I would like to know if I have to make multiple queries or use a subquery. What would be best practice? And how would I write the query then?
You are actually combining two quite unrelated queries. I would keep them as separate queries, especially since you report them like that too. You could, if you like, use UNION ALL to combine those queries. This way, you have just a list of names of items you follow, regardless of the type of item it is. If you want, you can specify that too.
SELECT
cf.FK_user,
cf.FK_collection as followItem,
c.collectionName as followName,
'collection' as followType
FROM collectionFollows as cf
INNER JOIN collections as c ON cf.FK_collection = c.collectionId
WHERE cf.FK_user = 2
UNION ALL
SELECT
uf.FK_userMe,
u.userId,
u.username,
'user' as followType
FROM userFollows as uf
INNER JOIN users as u ON uf.FK_userYou = u.userId
WHERE uf.FK_userMe = 2
An alternative would be to filter unique values in PHP, but even then your query will fail. Because of the inner joins, you will not get any results if a user only follows other users or only follows collections. You need at least one of both to get any results.
You could change INNER JOIN to LEFT JOIN, but then you would still have to post-process the result to filter out duplicates and NULL values.
UNION ALL is fast. It just sticks two query results together without further processing. This is different from UNION, which also filters out duplicates (like DISTINCT). In this case that is not needed, because I assume a user can only follow a given collection or other user once, so these queries will never return duplicate records. If that is indeed the case, UNION ALL will do just fine and will be faster than UNION.
Apart from UNION ALL, two separate queries is fine too.
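Here is a minimal sketch of the UNION ALL approach, run against an in-memory SQLite database from Python. The table and column names follow the question's query; the sample rows (John following two collections and one user) are assumptions chosen to mirror the output described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userId INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE collections (collectionId INTEGER PRIMARY KEY, collectionName TEXT);
CREATE TABLE collectionFollows (FK_user INTEGER, FK_collection INTEGER);
CREATE TABLE userFollows (FK_userMe INTEGER, FK_userYou INTEGER);
INSERT INTO users VALUES (2, 'John'), (3, 'Lisa');
INSERT INTO collections VALUES (1, 'webdesign'), (2, 'programming');
INSERT INTO collectionFollows VALUES (2, 1), (2, 2);
INSERT INTO userFollows VALUES (2, 3);
""")

# Two independent queries glued together; each follow appears exactly once.
follows = conn.execute(
    """
    SELECT c.collectionName AS followName, 'collection' AS followType
    FROM collectionFollows AS cf
    INNER JOIN collections AS c ON cf.FK_collection = c.collectionId
    WHERE cf.FK_user = ?
    UNION ALL
    SELECT u.username, 'user'
    FROM userFollows AS uf
    INNER JOIN users AS u ON uf.FK_userYou = u.userId
    WHERE uf.FK_userMe = ?
    ORDER BY followType, followName
    """,
    (2, 2),
).fetchall()
print(follows)
```

With this data the combined join from the question would return 'Lisa' twice (once per followed collection), while the UNION ALL version returns her once.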