Can't address parent field in query with multiple subqueries - mysql

EDIT: Better explanation
I have a page with a job. The job has an ID, three skills (skill_ids), and a requirement for each skill (a user must have at least this skill value to be qualified).
I click on the job to find candidates, so I have the job_id, the three skill_ids, and the skill requirements. With joins, as the first answer proposed, I can find all users who have the three skills. The skills are saved in skill_ratings. That works as long as I only match on the skill_ids.
But now I also want the value, and this is where my code comes in that computes the final value (called rating). The rating takes all given values into account, but it isn't a simple average or sum. That's why I need the long, horrible code. Usually I insert a single user's ID into that code. Here, however, I need it for every user_id that has the skills mentioned above, just to calculate whether they are qualified. This is dynamic.
I have a table in which I want to find people who are qualified for a position under some requirements. I work with one table called skill_ratings, but (as far as I can see) I need to add some subqueries, and that is where the problem lies. There are many subqueries, and I've tried to reference a field of the parent query. That only seems to work in a subquery directly below the parent query (a first-level subquery).
Here's my structure:
SELECT * FROM table t
WHERE EXISTS (SELECT * FROM table d WHERE x > 1
AND b=t.id
AND y <= (SELECT a FROM (MAIN SUBQUERY WITH CALCULATIONS)))
GROUP BY xyx
But the error I get is: #1054 - Unknown column 'skra.usr_id_get' in 'where clause'. skra is the parent table in this case.
I want to get the following (pseudo-sql):
SELECT all FROM table t AS x
WHERE EXISTS (
SELECT all FROM table t AS y
WHERE y.skill_id = 1
AND y.usr_id_get = t.usr_id_get
AND y.value <= (my algorithm)
)
The main subquery is important because I want to get a computed number. Elsewhere the code works because I was able to work with predefined PHP variables for a user's ID. But I can't do that here, as I need to find the users within the boundaries of the WHERE clauses.
How can I solve this? Referencing a parent field in a subquery seems to be limited to a first-level subquery.
EDIT: Code
Code removed due to project status.
Error: #1054 - Unknown column 'c.usr_id_get' in 'where clause'

We want users that have certain skills of certain levels. For example all users that have skill 1 with at least level 20 and skill 2 with at least level 70.
Here is an algorithm:
First of all we must get the skill levels. A user has several skill ratings and the average rating per skill is the level.
Then we want a table of criteria (skill 1 / level 20, skill 2 / level 70 in our example).
We collect all user skill levels that match the criteria (EXISTS clause) and then
keep the users that match all skill levels (count(*) = <desired number of skills>).
The query:
select
sr.usr_id_get
from
(
select usr_id_get, skill_id, avg(value) as level
from skill_ratings
group by usr_id_get, skill_id
) sr
where exists
(
select *
from
(
select 1 as skill_id, 20 as level
union all
select 2 as skill_id, 70 as level
) criteria
where sr.skill_id = criteria.skill_id
and sr.level >= criteria.level
)
group by usr_id_get
having count(*) = 2;
You can also make criteria a real (temporary) table. Then your query stays the same, no matter how many skills are requested. You'd have
where exists
(
select *
from criteria
where sr.skill_id = criteria.skill_id
and sr.level >= criteria.level
)
group by usr_id_get
having count(*) = (select count(*) from criteria);
then.
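A minimal sketch of the criteria-table variant, run through Python's sqlite3 module so it is self-contained. SQLite stands in for MySQL here, and the sample ratings are made up for illustration; only the table and column names skill_ratings, usr_id_get, skill_id and value come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE skill_ratings (usr_id_get INTEGER, skill_id INTEGER, value INTEGER);
    CREATE TABLE criteria (skill_id INTEGER PRIMARY KEY, level INTEGER);
    INSERT INTO criteria VALUES (1, 20), (2, 70);
""")

# Hypothetical sample data: (user, skill, single rating value).
conn.executemany(
    "INSERT INTO skill_ratings VALUES (?, ?, ?)",
    [
        (1, 1, 20), (1, 1, 40),  # user 1, skill 1: average 30
        (1, 2, 80),              # user 1, skill 2: average 80
        (2, 1, 50),              # user 2, skill 1: average 50
        (2, 2, 60),              # user 2, skill 2: average 60, below 70
        (3, 1, 25),              # user 3 has no rating for skill 2
    ],
)

# Keep the users whose per-skill average meets every row in criteria.
qualified = [row[0] for row in conn.execute("""
    SELECT sr.usr_id_get
    FROM (
        SELECT usr_id_get, skill_id, AVG(value) AS level
        FROM skill_ratings
        GROUP BY usr_id_get, skill_id
    ) sr
    WHERE EXISTS (
        SELECT 1
        FROM criteria c
        WHERE c.skill_id = sr.skill_id
          AND sr.level >= c.level
    )
    GROUP BY sr.usr_id_get
    HAVING COUNT(*) = (SELECT COUNT(*) FROM criteria)
    ORDER BY sr.usr_id_get
""")]
print(qualified)
```

With this data only user 1 meets both requirements. Adding a third skill only means adding a row to criteria; the query text does not change.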

This looks like it could be done with a simple JOIN:
SELECT T.*
FROM your_table T
JOIN other_table Y ON (
T.usr_id_get = Y.usr_id_get
AND T.skill_id = 1
AND Y.value <= [...]
)
If you need to perform some sort of calculations before the join, then you could join with a subquery:
SELECT T.*
FROM your_table T
JOIN (
SELECT *
FROM other_table Y
WHERE Y.skill_id = 1
AND Y.value = [...]
) Y USING(usr_id_get)

If I understand correctly, you have a user, say user 123, and a skill, say skill 99. Now you want to get the average rating for user 123 and skill 99 and then find all users with an equal or better average rating on that skill.
This is how to get the average ratings for skill 99 per user:
select usr_id_get, avg(value)
from skill_ratings
where skill_id = 99
group by usr_id_get;
This is how to get all users with an equal or better average rating for skill 99 than user 123:
select usr_id_get
from skill_ratings
where skill_id = 99
group by usr_id_get
having avg(value) >=
(select avg(value) from skill_ratings where skill_id = 99 and usr_id_get = 123);
Add to this whatever other criteria you need.
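For illustration, here is the same comparison with the skill and the reference user passed as parameters. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than on MySQL, the function name at_least_as_good is hypothetical, and the ratings are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE skill_ratings (usr_id_get INTEGER, skill_id INTEGER, value INTEGER)"
)
# Hypothetical ratings; user 123 averages 50 on skill 99.
conn.executemany(
    "INSERT INTO skill_ratings VALUES (?, ?, ?)",
    [
        (123, 99, 40), (123, 99, 60),
        (1, 99, 50),
        (2, 99, 30), (2, 7, 100),   # the skill 7 rating must not count
        (3, 99, 70), (3, 99, 90),
    ],
)

def at_least_as_good(conn, skill_id, reference_user):
    """Return users whose average on skill_id is >= the reference user's average."""
    sql = """
        SELECT usr_id_get
        FROM skill_ratings
        WHERE skill_id = :skill
        GROUP BY usr_id_get
        HAVING AVG(value) >= (
            SELECT AVG(value)
            FROM skill_ratings
            WHERE skill_id = :skill AND usr_id_get = :ref
        )
        ORDER BY usr_id_get
    """
    params = {"skill": skill_id, "ref": reference_user}
    return [row[0] for row in conn.execute(sql, params)]

peers = at_least_as_good(conn, 99, 123)
print(peers)
```

The reference user is always part of the result, because their own average is equal to itself.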

Related

JOIN, ORDER BY and GROUP BY in same statement

I have two tables - results and contestants. The results table contains
result_id (PK)
result_contestant (who scored this)
value (how much he scored)
result_discipline (where he scored this)
The contestants table contains
contestant_id (PK)
contestant_name
contestant_category
What I want to do is to select results for all contestants, but I only want to show one result per contestant - the highest (lowest) one.
So far I managed to do this:
SELECT * FROM contestants
JOIN results ON result_contestant = contestant_id
WHERE result_discipline = 1 AND contestant_category = 1
ORDER BY value DESC
GROUP BY contestant_id;
However, this gives me a syntax error. If I delete the GROUP BY line, I get the results ordered from the highest, but if any of the contestants scored in this discipline more than once, I get all of his scores.
If I delete the ORDER BY line, I get only one result per contestant, but it is the first record in the database, not the highest one.
How can I fix this command so it is valid? Also, there are some less_is_better disciplines where I want the lowest score. If I could use ORDER BY on the final query, that should be achievable by replacing DESC with ASC.
Thanks.
Don't use group by. Using select * with group by just doesn't make sense. Instead, use a filter to get the row you want:
SELECT *
FROM contestants c JOIN
results r
ON r.result_contestant = c.contestant_id
WHERE r.result_discipline = 1 AND c.contestant_category = 1 AND
r.value = (select min(r2.value)
from results r2
where r2.result_contestant = r.result_contestant and
r2.result_discipline = r.result_discipline
)
ORDER BY value DESC;
Note: I'm not sure if you want min() or max() in the subquery.
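A small runnable sketch of this filter, using max() for a discipline where the highest score wins. It runs on SQLite via Python's sqlite3 module as a stand-in for MySQL, and the contestants and scores are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contestants (
        contestant_id INTEGER PRIMARY KEY,
        contestant_name TEXT,
        contestant_category INTEGER
    );
    CREATE TABLE results (
        result_id INTEGER PRIMARY KEY,
        result_contestant INTEGER,
        value INTEGER,
        result_discipline INTEGER
    );
    INSERT INTO contestants VALUES (1, 'Ann', 1), (2, 'Bob', 1);
    INSERT INTO results VALUES
        (1, 1, 10, 1),
        (2, 1, 30, 1),
        (3, 2, 20, 1),
        (4, 2, 99, 2);
""")

# One row per contestant: the row whose value equals that contestant's
# best value within the same discipline.
best = conn.execute("""
    SELECT c.contestant_name, r.value
    FROM contestants c
    JOIN results r ON r.result_contestant = c.contestant_id
    WHERE r.result_discipline = 1
      AND c.contestant_category = 1
      AND r.value = (
          SELECT MAX(r2.value)
          FROM results r2
          WHERE r2.result_contestant = r.result_contestant
            AND r2.result_discipline = r.result_discipline
      )
    ORDER BY r.value DESC
""").fetchall()
print(best)
```

For a less_is_better discipline, swap MAX for MIN and DESC for ASC. If a contestant reaches the same best value twice, both rows are returned.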

SQL query - select max where a count greater than value

I've got two tables with the following structure:
Question table
id int,
question text,
answer text,
level int
Progress table
qid int,
attempts int,
completed boolean (qid means question id)
Now my question is how to construct a query that selects the max level where the count of correct questions is greater than, let's say, 30.
I created this query, but it doesn't work and I don't know why.
SELECT MAX(Questions.level)
FROM Questions, Progress
WHERE Questions.id = Progress.qid AND Progress.completed = 1
GROUP BY Questions.id, Questions.level
Having COUNT(*) >= 30
I would like to have it in one query as I suspect this is possible and probably the most 'optimized' way to query for it. Thanks for the help!
This sort of construct will work. You can figure out the details.
select max(something) maxvalue
from SomeTables
join (select id, count(*) records
from ATable
group by id) temp on SomeTables.id = temp.id
where records >= 30
Do it step by step rather than joining the two tables. In an inner select find the questions (i.e. the question ids) that were answered 30 times correctly. In an outer select find the corresponding levels and get the maximum value:
select max(level)
from questions
where id in
(
select qid
from progress
where completed = 1
group by qid
having count(*) >= 30
);

SQL Comment Grouping

I have two tables in MySQL
Table 1: List of ID's
--Just a single column list of ID's
Table 2: Groups
--Group Titles
--Members **
Now the member field is basically a comments field where all the ID's that are part of that group are listed. So for instance one whole field of members looks like this:
"ID003|ID004|ID005|ID006|ID007|ID008|... Etc."
There they can be up to 500+ listed in the field.
What I would like to do is to run a query and find out which ID's appear in only three or fewer groups.
I've been taking cracks at it, but honestly I'm totally lost. Any ideas?
Edit; I misunderstood the question the first time, so I'm changing my answer.
SELECT l.id
FROM List_of_ids AS l
JOIN Groups AS g ON CONCAT('|', g.members, '|') LIKE CONCAT('%|', l.id, '|%')
GROUP BY l.id
HAVING COUNT(*) <= 3
This is bound to perform very poorly, because it forces a table-scan of both tables. If you have 500 id's and 500 groups, it must run 250000 comparisons.
You should really consider if storing a symbol-separated list is the right way to do this. See my answer to Is storing a delimited list in a database column really that bad?
The proper way to design such a relationship is to create a third table that maps id's to groups:
CREATE TABLE GroupsIds (
memberid INT,
groupid INT,
PRIMARY KEY (memberid, groupid)
);
With this table, it would be much more efficient by using an index for the join:
SELECT l.id
FROM List_of_ids AS l
JOIN GroupsIds AS gi ON gi.memberid = l.id
GROUP BY l.id
HAVING COUNT(*) <= 3
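A quick sketch of the mapping-table version, run on SQLite through Python's sqlite3 module instead of MySQL, with invented memberships. One thing it shows: the inner join silently drops ids that belong to no group at all. If those should count as "three or fewer", a LEFT JOIN that counts gi.groupid is one way to keep them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE List_of_ids (id INTEGER PRIMARY KEY);
    CREATE TABLE GroupsIds (
        memberid INT,
        groupid INT,
        PRIMARY KEY (memberid, groupid)
    );
    -- hypothetical data: id 3 is in four groups, id 4 in two, id 5 in none
    INSERT INTO List_of_ids VALUES (3), (4), (5);
    INSERT INTO GroupsIds VALUES (3, 1), (3, 2), (3, 3), (3, 4), (4, 1), (4, 2);
""")

# Inner join: only ids that appear in at least one group are considered.
in_some_groups = [row[0] for row in conn.execute("""
    SELECT l.id
    FROM List_of_ids AS l
    JOIN GroupsIds AS gi ON gi.memberid = l.id
    GROUP BY l.id
    HAVING COUNT(*) <= 3
    ORDER BY l.id
""")]

# Left join: ids with zero memberships are kept, because COUNT(gi.groupid)
# ignores the NULL produced for unmatched rows.
including_zero = [row[0] for row in conn.execute("""
    SELECT l.id
    FROM List_of_ids AS l
    LEFT JOIN GroupsIds AS gi ON gi.memberid = l.id
    GROUP BY l.id
    HAVING COUNT(gi.groupid) <= 3
    ORDER BY l.id
""")]
print(in_some_groups, including_zero)
```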
select * from
(
select ID,
(
select count(*)
From Groups
where LOCATE(concat('ID', a.id, '|'), concat(Members, '|'))>0
) as groupcount
from ListIDTable as a
) as q
where groupcount <= 3

MySQL - 3 tables, is this complex join even possible?

I have three tables: users, groups and relation.
Table users with fields: usrID, usrName, usrPass, usrPts
Table groups with fields: grpID, grpName, grpMinPts
Table relation with fields: uID, gID
A user can be placed in a group in two ways:
if he collects the group's minimal number of points (users.usrPts > groups.grpMinPts ORDER BY groups.grpMinPts DESC LIMIT 1)
if his relation to the group is manually added in the relation table (user ID provided as uID, and group ID provided as gID, in the table named relation)
Can I create one single query, to determine for every user (or one specific), which group he belongs, but, manual relation (using relation table) should have higher priority than usrPts compared to grpMinPts? Also, I do not want to have one user shown twice (to show his real group by points, but related group also)...
Thanks in advance! :) I tried:
SELECT * FROM users LEFT JOIN (relation LEFT JOIN groups ON relation.gID = groups.grpID) ON users.usrID = relation.uID
Using this I managed to extract specified relations (from relation table), but, I have no idea how to include user points, respecting above mentioned priority (specified first). I know how to do this in a few separated queries in php, that is simple, but I am curious, can it be done using one single query?
EDIT TO ADD:
Thanks to the really educational technique using coalesce that @GordonLinoff provided, I managed to make this query work as I expected. So, here it goes:
SELECT o.usrID, o.usrName, o.usrPass, o.usrPts, t.grpID, t.grpName
FROM (
SELECT u.*, COALESCE(relationgroupid,groupid) AS thegroupid
FROM (
SELECT u.*, (
SELECT grpID
FROM groups g
WHERE u.usrPts > g.grpMinPts
ORDER BY g.grpMinPts DESC
LIMIT 1
) AS groupid, (
SELECT grpUID
FROM relation r
WHERE r.userUID = u.usrID
) AS relationgroupid
FROM users u
)u
)o
JOIN groups t ON t.grpID = o.thegroupid
Also, if you are wondering, like I did, is this approach faster or slower than doing three queries and processing in php, the answer is that this is slightly faster way. Average time of this query execution and showing results on a webpage is 14 ms. Three simple queries, processing in php and showing results on a webpage took 21 ms. Average is based on 10 cases, average execution time was, really, a constant time.
Here is an approach that uses correlated subqueries to get each of the values. It then chooses the appropriate one using the precedence rule that if the relations exist use that one, otherwise use the one from the groups table:
select u.*,
coalesce(relationgroupid, groupid) as thegroupid
from (select u.*,
(select grpid from groups g where u.usrPts > g.grpMinPts order by g.grpMinPts desc limit 1
) as groupid,
(select gid from relations r where r.userId = u.userId
) as relationgroupid
from users u
) u
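To see the precedence rule at work, here is a minimal sketch on SQLite via Python's sqlite3 module (not MySQL). The column names follow the question (usrID, usrPts, grpID, grpMinPts, uID, gID), the rows are invented, and the groups table is called user_groups here, a hypothetical name chosen to avoid any clash with the GROUPS keyword in newer SQL dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (usrID INTEGER PRIMARY KEY, usrName TEXT, usrPts INTEGER);
    CREATE TABLE user_groups (grpID INTEGER PRIMARY KEY, grpName TEXT, grpMinPts INTEGER);
    CREATE TABLE relation (uID INTEGER, gID INTEGER);
    INSERT INTO user_groups VALUES (1, 'Bronze', 0), (2, 'Silver', 100), (3, 'Gold', 500);
    INSERT INTO users VALUES (1, 'ann', 150), (2, 'bob', 600), (3, 'cy', 50);
    -- cy only has 50 points but was placed in Gold by hand
    INSERT INTO relation VALUES (3, 3);
""")

# COALESCE picks the manual relation if one exists,
# otherwise the highest group whose minimum the user exceeds.
membership = conn.execute("""
    SELECT u.usrName, g.grpName
    FROM (
        SELECT usr.*,
               COALESCE(
                   (SELECT r.gID
                    FROM relation r
                    WHERE r.uID = usr.usrID
                    LIMIT 1),
                   (SELECT pg.grpID
                    FROM user_groups pg
                    WHERE usr.usrPts > pg.grpMinPts
                    ORDER BY pg.grpMinPts DESC
                    LIMIT 1)
               ) AS thegroupid
        FROM users usr
    ) u
    JOIN user_groups g ON g.grpID = u.thegroupid
    ORDER BY u.usrID
""").fetchall()
print(membership)
```

Each user appears exactly once: ann and bob get their group from points, cy from the manual relation. The LIMIT 1 on the relation lookup is an added safeguard in case a user has more than one manual row.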
Try something like this
select user.name, group.name
from group
join relation on relation.gid = group.gid
join user on user.uid = relation.uid
union
select user.name, g1.name
from group g1
join group g2 on g2.minpts > g1.minpts
join user on user.pts between g1.minpts and g2.minpts

SQL query - Get rows only if value falls within range of "last n records" (record-specific)

I have two tables, a Countries table and a Weather table. I would like to retrieve all of the names of countries where it has not rained within the last 15 days.
The weather table has a column called "DayNum", which goes from 1 -> infinity and increases by 1 on each day, it is unique. This table also has a column called "Rain" which is just a bit boolean value of 0 or 1.
Also, not all Countries were added on the same day, so the max DayNum will be different for each country.
Examples of tables below (data is snipped for readability):
Countries:
ID Name
1 USA
2 Canada
3 Brazil
Weather
ID Country_id DayNum Rain
1 1 1 0
2 1 2 0
3 1 3 1
Here is my current attempt at a query (been working on this for days):
SELECT countries.name, weather.daynum
FROM countries INNER JOIN weather ON countries.id = weather.country_id
GROUP BY countries.name
HAVING weather.daynum > (MAX(weather.day_num) - 15) AND SUM(weather.rain) = 0;
I think this should work, but I'm having serious performance issues. The actual query I need to write deals with different data (same exact concept) and millions of rows. This query seems to get slower at an exponential rate.
Can anyone offer any advice?
Another idea I had was to somehow limit the JOIN to only grab the top 15 records (whilst ORDERing BY weather.day_num), but I haven't found a way to do this within a JOIN (if it's even possible).
You're not interested in the amount of rain, just whether it exists, so...
select * from countries
left join
(
select weather.country_id
from weather
inner join
(select country_id, MAX(daynum) as maxdaynum from weather group by country_id) maxday
on weather.country_id = maxday.country_id
and weather.daynum>maxday.maxdaynum-3
where rain=1
) rainy
on countries.id = rainy.country_id
where country_id is null
I presume you've already indexed your tables appropriately
You didn't include any information about the indices on your tables, but I'm betting the performance issues you are experiencing are related to the group by on the countries name field. It would certainly explain your performance issues if that column isn't indexed.
Having said that, this is a situation that probably calls for a subquery rather than an inner join. I would be tempted to write the query this way:
SELECT countries.id, countries.name
FROM countries
INNER JOIN
(
SELECT country_id
FROM weather
GROUP BY country_id
HAVING weather.daynum > (MAX(weather.day_num) - 15) AND SUM(weather.rain) = 0
) AS weather
ON weather.country_id = countries.id;
I have two tables, a Countries table and a Weather table. I would like to retrieve all of the names of countries where it has not rained within the last 15 days.
Here you go:
SELECT * FROM Country
WHERE
NOT EXISTS (
SELECT * FROM Weather
WHERE
Rain = 1
AND DayNum >= 2
AND Country_id = Country.ID
);
In plain English: for each country, check if there are any rainy days newer than the given day number. If there are, eliminate the country from the result.
Replace 2 with the day number 15 days ago. Index on {Country_id, DayNum, Rain} for decent performance. Unfortunately, MySQL is unlikely to execute this query optimally, but there are only so many countries so nested loops shouldn't be too bad since DBMS should be able to execute the inner query as a single index seek.
Alternatively, consider rewriting it as JOIN, for example:
SELECT Country.*
FROM Country LEFT JOIN Weather
ON Country_id = Country.ID
AND Rain = 1
AND DayNum >= 2
GROUP BY Country.ID, Country.Name
HAVING MAX(Rain) IS NULL OR MAX(Rain) = 0;
A working SQL Fiddle example is here.
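Since the question says the newest DayNum differs per country, a fixed day number may not be enough. The following sketch is one possible variation, not the query from the fiddle: it computes each country's own cutoff inside the NOT EXISTS. It runs on SQLite through Python's sqlite3 module as a stand-in for MySQL, uses the countries and weather names from the question, and the weather history is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE weather (country_id INTEGER, daynum INTEGER, rain INTEGER);
    CREATE INDEX weather_idx ON weather (country_id, daynum, rain);
    INSERT INTO countries VALUES (1, 'USA'), (2, 'Canada'), (3, 'Brazil');
""")

# Hypothetical history: country id -> (number of recorded days, rainy day numbers).
history = {1: (20, {3}), 2: (30, {28}), 3: (10, {3})}
for country_id, (days, rainy_days) in history.items():
    conn.executemany(
        "INSERT INTO weather VALUES (?, ?, ?)",
        [(country_id, day, int(day in rainy_days)) for day in range(1, days + 1)],
    )

# A country is dry if no rainy row falls within its own last 15 day numbers.
dry = [row[0] for row in conn.execute("""
    SELECT c.name
    FROM countries c
    WHERE NOT EXISTS (
        SELECT 1
        FROM weather w
        WHERE w.country_id = c.id
          AND w.rain = 1
          AND w.daynum > (
              SELECT MAX(w2.daynum)
              FROM weather w2
              WHERE w2.country_id = w.country_id
          ) - 15
    )
    ORDER BY c.name
""")]
print(dry)
```

Here the USA's only rainy day (day 3) is older than its last 15 days, so it is reported as dry, while Canada (rain on day 28 of 30) and Brazil (rain on day 3 of 10) are not. A country with no weather rows at all would also be reported as dry.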
Perhaps you can use a simple variable to store the minimum daynum required? I am not a MySQL developer, but I think something like this will do the trick:
SELECT @minDaynum := (MAX(daynum)-15) FROM weather;
SELECT DISTINCT countries.name
FROM weather
INNER JOIN countries ON weather.country_id = countries.id
WHERE
weather.daynum >= @minDaynum AND
weather.rain = 1;
EDIT >> If just one variable doesn't work for your case, maybe try using a temporary table to speed things up (not sure if performances of temporary tables in mysql are really good though...) :
CREATE TEMPORARY TABLE min_daynums (country_id int, country_name varchar(255), min_daynum int);
INSERT INTO min_daynums
SELECT countries.id, countries.name, MAX(weather.daynum)-15
FROM weather
INNER JOIN countries ON countries.id = weather.country_id
GROUP BY countries.id, countries.name;
SELECT min_daynums.country_name
FROM min_daynums
WHERE
EXISTS(
SELECT 1
FROM weather
WHERE
weather.country_id = min_daynums.country_id
and weather.daynum >= min_daynums.min_daynum
and weather.rain = 1
)
Here I just store the min daynum for each country in a temp table. Hope it helps...