I am stuck on a fairly complicated problem. Let me first explain what I am doing right now:
I have a table named feedback in which I store grades against a course id. The table looks like this:
+----+-----+-------+---------+----------+----------+
| id | cid | grade | g_point | workload | easiness |
+----+-----+-------+---------+----------+----------+
| 1  | 10  | A+    | 1       | 5        | 4        |
| 2  | 10  | A+    | 1       | 2        | 4        |
| 3  | 10  | B     | 3       | 3        | 3        |
| 4  | 11  | B+    | 2       | 2        | 3        |
| 5  | 11  | A+    | 1       | 5        | 4        |
| 6  | 12  | B     | 3       | 3        | 3        |
| 7  | 11  | B+    | 2       | 7        | 8        |
| 8  | 11  | A+    | 1       | 1        | 2        |
+----+-----+-------+---------+----------+----------+
g_point holds a fixed numeric value for each grade, so I can use it to show the user courses sorted by grade.
My first task is to print the grade of each course. A course's grade is the one that occurs most often for that course. For example, in this table the result for cid = 10 is A+, because it appears twice. This part is simple; I have already implemented the query, which I include at the end.
The main problem is a course like cid = 11, where two different grades occur equally often. In that situation the client wants me to average the workload and easiness for each of the tied grades and show whichever grade has the greater average. The average is computed like this:
(average of all workload values for the grade in the course
+ average of all easiness values for the grade in the course)
/ 2
In this example cid = 11 has four entries, with two grades that occur equally often:
B+ grade average
avg workload: (2 + 7)/2 = 4.5 = x
avg easiness: (3 + 8)/2 = 5.5 = y
answer: (x + y)/2 = 5
A+ grade average
avg workload: (5 + 1)/2 = 3 = x
avg easiness: (4 + 2)/2 = 3 = y
answer: (x + y)/2 = 3
so the grade should be B+.
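The tie-break arithmetic above can be checked with a short sketch. It assumes the intended formula is (average workload + average easiness) / 2 with the parentheses applied; the helper name grade_score is only illustrative and does not come from the original post.

```python
from statistics import mean

# Rows for cid = 11 from the table above: (grade, workload, easiness).
rows = [("B+", 2, 3), ("A+", 5, 4), ("B+", 7, 8), ("A+", 1, 2)]


def grade_score(rows, grade):
    """Return (avg workload + avg easiness) / 2 for one grade (hypothetical helper)."""
    workloads = [w for g, w, _ in rows if g == grade]
    easiness = [e for g, _, e in rows if g == grade]
    return (mean(workloads) + mean(easiness)) / 2


# Score each tied grade, then keep the grade with the larger score.
scores = {g: grade_score(rows, g) for g in ("B+", "A+")}
winner = max(scores, key=scores.get)
```

With the sample rows, B+ scores higher than A+, so B+ is the grade that should be shown for cid = 11.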
This is the query I am running to get the most frequent grade:
SELECT
f3.coursecodeID cid,
f3.grade_point p,
f3.grade g
FROM (
SELECT
coursecodeID,
MAX(mode_qty) mode_qty
FROM (
SELECT
coursecodeID,
COUNT(grade_point) mode_qty
FROM feedback
GROUP BY
coursecodeID, grade_point
) f1
GROUP BY coursecodeID
) f2
INNER JOIN (
SELECT
coursecodeID,
grade_point,
grade,
COUNT(grade_point) mode_qty
FROM feedback
GROUP BY
coursecodeID, grade_point
) f3
ON
f2.coursecodeID = f3.coursecodeID AND
f2.mode_qty = f3.mode_qty
GROUP BY f3.coursecodeID
ORDER BY f3.grade_point
Here is SQL Fiddle.
I added a table Courses with the list of all course IDs, to make the main idea of the query easier to see. Most likely you have one in the real database. If not, you can generate it on the fly from feedback by grouping by cid.
For each cid we need to find the grade. Group feedback by cid, grade to get a list of all grades for the cid. We need to pick only one grade per cid, so we use LIMIT 1. To determine which grade to pick, we order them: first by occurrence (a simple COUNT); second by the average score; finally, if several grades have the same occurrence and the same average score, pick the grade with the smallest g_point. You can adjust the rules by tweaking the ORDER BY clause.
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
FROM courses
ORDER BY courses.cid
result set
cid CourseGrade
10 A+
11 B+
12 B
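To try the ordering rules without a MySQL server, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. Two things are assumptions on my part rather than part of the answer: the list of courses is derived with SELECT DISTINCT instead of a separate courses table, and MIN(g_point) is used in the ORDER BY so the query does not rely on a non-grouped column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback (id INTEGER, cid INTEGER, grade TEXT, "
    "g_point INTEGER, workload INTEGER, easiness INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "A+", 1, 5, 4), (2, 10, "A+", 1, 2, 4), (3, 10, "B", 3, 3, 3),
        (4, 11, "B+", 2, 2, 3), (5, 11, "A+", 1, 5, 4), (6, 12, "B", 3, 3, 3),
        (7, 11, "B+", 2, 7, 8), (8, 11, "A+", 1, 1, 2),
    ],
)

# For each course, pick one grade: most frequent first, then highest
# average score, then smallest g_point.
query = """
SELECT c.cid,
       (SELECT f.grade
        FROM feedback f
        WHERE f.cid = c.cid
        GROUP BY f.grade
        ORDER BY COUNT(*) DESC,
                 (AVG(f.workload) + AVG(f.easiness)) / 2 DESC,
                 MIN(f.g_point)
        LIMIT 1) AS course_grade
FROM (SELECT DISTINCT cid FROM feedback) c
ORDER BY c.cid
"""
result = conn.execute(query).fetchall()
```

On the sample data this returns A+ for course 10, B+ for course 11 and B for course 12, the same result set as shown above.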
UPDATE
MySQL versions before 8.0.14 don't have lateral joins, so one possible way to get the second column g_point is to repeat the correlated sub-query. SQL Fiddle
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
,(
SELECT feedback.g_point
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGPoint
FROM courses
ORDER BY CourseGPoint
result set
cid CourseGrade CourseGPoint
10 A+ 1
11 B+ 2
12 B 3
Update 2 Added average score into ORDER BY SQL Fiddle
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
,(
SELECT feedback.g_point
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGPoint
,(
SELECT (AVG(workload) + AVG(easiness))/2
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS AvgScore
FROM courses
ORDER BY CourseGPoint, AvgScore DESC
result
cid CourseGrade CourseGPoint AvgScore
10 A+ 1 3.75
11 B+ 2 5
12 B 3 3
If I understood correctly, you need an inner select to compute the average, and a second, outer select to find the maximum of those averages:
select cid, grade, max(average)/2 from (
select cid, grade, avg(workload + easiness) as average
from feedback
group by cid, grade
) x group by cid, grade
This solution has been tested on your data using SQL Fiddle at this link.
If you change the previous query to
select cid, max(average)/2 from (
select cid, grade, avg(workload + easiness) as average
from feedback
group by cid, grade
) x group by cid
You will find the max average for each cid.
As mentioned in the comments, you have to choose which strategy to use if more than one grade meets the max average. For example, if you have
+-------+-------+-------+-------+-----------+--------------
| id | cid | grade |g_point| workload | easiness
+-------+-------+-------+-------+-----------+--------------
| 1 | 10 | A+ | 1 | 5 | 4
| 2 | 10 | A+ | 1 | 2 | 4
| 3 | 10 | B | 3 | 3 | 3
| 4 | 11 | B+ | 2 | 2 | 3
| 5 | 11 | A+ | 1 | 5 | 4
| 9 | 11 | C | 1 | 3 | 6
you will have grades A+ and C both satisfying the maximum average of 4.5 for cid 11.
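The tie can be made visible by joining the per-grade averages back to the per-course maximum, so that every grade reaching the maximum is listed. This is only a sketch run on SQLite through Python; the CTE name avgs and the column alias score are my own choices, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback (id INTEGER, cid INTEGER, grade TEXT, "
    "g_point INTEGER, workload INTEGER, easiness INTEGER)"
)
# The six-row example table from this answer.
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "A+", 1, 5, 4), (2, 10, "A+", 1, 2, 4), (3, 10, "B", 3, 3, 3),
        (4, 11, "B+", 2, 2, 3), (5, 11, "A+", 1, 5, 4), (9, 11, "C", 1, 3, 6),
    ],
)

# Inner step: average per (cid, grade). Outer step: keep every grade whose
# average equals the maximum average of its course, so ties stay visible.
query = """
WITH avgs AS (
    SELECT cid, grade, AVG(workload + easiness) / 2 AS score
    FROM feedback
    GROUP BY cid, grade
)
SELECT a.cid, a.grade, a.score
FROM avgs a
JOIN (SELECT cid, MAX(score) AS best FROM avgs GROUP BY cid) m
  ON m.cid = a.cid AND m.best = a.score
ORDER BY a.cid, a.grade
"""
ties = conn.execute(query).fetchall()
```

For cid 11 both A+ and C come back, which is exactly the situation where an extra tie-break rule has to be chosen.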
Related
My task is to find all those subjects, by their id, that have (at least one, but) the fewest lowest passing grades in the database (the grade being the grade 6). I've managed to write the solution with three queries, however my task is to write it as a single query in MySQL. Thank you in advance.
-- 1. single query "solution"
SELECT subject_id FROM (SELECT subject_id, COUNT(*) AS six_count
FROM exams WHERE grade = 6
GROUP BY subject_id) AS sixes
WHERE six_count = (SELECT MIN(six_count) FROM sixes);
-- 2. multiple queries solution
CREATE TABLE sixes AS (SELECT subject_id, COUNT(*) AS six_count
FROM exams WHERE grade = 6
GROUP BY subject_id);
SELECT subject_id FROM sixes
WHERE six_count = (SELECT MIN(six_count) FROM sixes);
DROP TABLE sixes;
EDIT:
Exams table example:
| subject_id | student_id | exam_year | exam_mark | grade | exam_date |
| 1 | 20100022| 2011 | 'apr' | 10 | 2011-04-11 |
| 2 | 20100055| 2011 | 'oct' | 6 | 2011-10-04 |
| 3 | 20110030| 2011 | 'jan1' | 7 | 2011-01-26 |
| 5 | 20110055| 2011 | 'jan2' | 6 | 2011-02-13 |
| 5 | 20110001| 2011 | 'jun1' | 8 | 2011-06-23 |
This should do the trick. The subquery selects the lowest number of sixes. The main query selects all subjects with that number. The trick is in ORDER BY count(*) LIMIT 1, which makes the subquery return the record with the lowest count.
SELECT
subject_id,
count(*) as six_count
FROM exams
WHERE grade = 6
GROUP BY subject_id
HAVING count(*) =
( SELECT count(*)
FROM exams
WHERE grade = 6
GROUP BY subject_id
ORDER BY count(*)
LIMIT 1
)
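A quick way to check the HAVING pattern is to run it on a small in-memory SQLite table from Python. The exams table below keeps only the two columns the query needs, and the last two rows are hypothetical additions of mine so that the counts differ between subjects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exams (subject_id INTEGER, grade INTEGER)")
conn.executemany(
    "INSERT INTO exams VALUES (?, ?)",
    [
        (1, 10), (2, 6), (3, 7), (5, 6), (5, 8),  # rows from the question
        (2, 6), (4, 6),                           # hypothetical extra rows
    ],
)

# The subquery returns the smallest per-subject count of sixes;
# the outer query keeps every subject that has exactly that count.
query = """
SELECT subject_id, COUNT(*) AS six_count
FROM exams
WHERE grade = 6
GROUP BY subject_id
HAVING COUNT(*) = (SELECT COUNT(*)
                   FROM exams
                   WHERE grade = 6
                   GROUP BY subject_id
                   ORDER BY COUNT(*)
                   LIMIT 1)
ORDER BY subject_id
"""
fewest_sixes = conn.execute(query).fetchall()
```

Subject 2 has two sixes here, while subjects 4 and 5 have one each, so both 4 and 5 are returned. Subjects without any six never appear, which matches the "at least one" requirement.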
This pattern should do the trick. Generalized names.
SELECT subjectID
FROM TEST_DATA
WHERE grade = 6
GROUP
BY SubjectID
HAVING COUNT(1) =
( SELECT count(1) AS minCount
FROM TEST_DATA
WHERE grade = 6
GROUP
BY subjectID
ORDER
BY minCount
LIMIT 1
);
Please see the picture for ERROR SCREENSHOT
Table: Candidate
+-----+---------+
| id | Name |
+-----+---------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | D |
| 5 | E |
+-----+---------+
Table: Vote
+-----+--------------+
| id | CandidateId |
+-----+--------------+
| 1 | 2 |
| 2 | 4 |
| 3 | 3 |
| 4 | 2 |
| 5 | 5 |
+-----+--------------+
id is the auto-increment primary key; CandidateId is an id that appears in the Candidate table.
Write a SQL query to find the name of the winning candidate. For the example above, it should return the winner B.
+------+
| Name |
+------+
| B |
+------+
Notes:
You may assume there is no tie, in other words there will be at most one winning candidate.
Why doesn't this code work? I am trying to do it without LIMIT.
SELECT c.Name AS Name
FROM Candidate AS c
JOIN
(SELECT r.CandidateId AS can, MAX(r.Total_vote) AS big
FROM (SELECT CandidateId, COUNT(id) AS Total_vote
FROM Vote
GROUP BY CandidateId) AS r) AS v
ON c.id = v.can;
In your query, here: SELECT r.CandidateId AS can, MAX(r.Total_vote) AS big
you use the MAX aggregate function next to a non-aggregated column without GROUP BY, which is not valid standard SQL.
Try:
SELECT Candidate.* FROM Candidate
JOIN (
SELECT CandidateId, COUNT(id) AS Total_vote
FROM Vote
GROUP BY CandidateId
ORDER BY COUNT(id) DESC LIMIT 1
) v
ON Candidate.id = v.CandidateId
This is a join/group by query with order by:
select c.name
from candidate c join
vote v
on v.candidateid = c.id
group by c.id, c.name
order by count(*) desc
limit 1;
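The join/group by/order by query can be tried on the sample tables with SQLite from Python. This is just a sketch of the same statement; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Candidate (id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE Vote (id INTEGER PRIMARY KEY, CandidateId INTEGER)")
conn.executemany(
    "INSERT INTO Candidate VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E")],
)
conn.executemany(
    "INSERT INTO Vote VALUES (?, ?)",
    [(1, 2), (2, 4), (3, 3), (4, 2), (5, 5)],
)

# One output row per candidate with votes, sorted by vote count;
# LIMIT 1 keeps only the candidate with the most votes.
query = """
SELECT c.Name
FROM Candidate c
JOIN Vote v ON v.CandidateId = c.id
GROUP BY c.id, c.Name
ORDER BY COUNT(*) DESC
LIMIT 1
"""
winner = conn.execute(query).fetchone()[0]
```

Because the problem statement rules out ties, LIMIT 1 is enough; with ties it would pick one of the tied candidates arbitrarily.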
SELECT c.Name AS Name
FROM Candidate AS c JOIN (SELECT r.CandidateId AS can
FROM
(SELECT CandidateId, COUNT(id) AS Total_vote
FROM Vote
GROUP BY CandidateId) AS r
WHERE r.Total_vote = (SELECT MAX(r.Total_vote) FROM (SELECT
CandidateId, COUNT(id) AS Total_vote
FROM Vote
GROUP BY CandidateId) r)) AS v
ON c.id = v.can;
This is the updated code.
My code had two errors. The first is that using an aggregate like MAX requires a GROUP BY clause if there are any non-aggregated columns in the select list. I am not sure why my previous code still ran without showing an error; maybe the system adds the grouping automatically when it runs.
The second is that MAX can't be combined with GROUP BY in this form.
I have a table with the following structure.
Most of the records in the table have gender = 1.
I'm looking for a way to get a result set whose top part is mixed, with around 60% of records having gender = 1 and around 40% having gender = 2, ordered by popularity desc.
The number of members with gender = 2 is much smaller, which means that after the mixed part the result set should contain only gender = 1 records.
Member table
id | nickname | gender | popularity
1 | jake | 1 | 80
2 | mike | 1 | 88
3 | dave | 1 | 75
4 | jenny | 2 | 85
5 | peter | 1 | 83
6 | nina | 2 | 88
7 | mister | 1 | 77
8 | drake | 1 | 80
The result should be something like the following. It does not have to match the weighting exactly; the goal is to see mixed results of both genders.
id | nickname | gender | popularity
2 | mike | 1 | 88
5 | peter | 1 | 83
6 | nina | 2 | 88
1 | jake | 1 | 80
8 | drake | 1 | 80
4 | jenny | 2 | 85
7 | mister | 1 | 77
3 | dave | 1 | 75
My best result so far (it doesn't take care of the 40:60 split):
SET @rank=0;
SET @rank2=0;
SELECT * FROM (
SELECT @rank:=@rank+1 AS rank, q.* FROM (SELECT * FROM test WHERE gender = 1 ORDER BY popularity DESC) AS q
UNION
SELECT @rank2:=@rank2+1 AS rank, q.* FROM (SELECT * FROM test WHERE gender = 2 ORDER BY popularity DESC) AS q
) AS r ORDER BY rank;
Please try...
SET @gender1Count = (SELECT COUNT( * )
FROM tblMember
WHERE gender = 1);
SET @gender2Count = (SELECT COUNT( * )
FROM tblMember
WHERE gender = 2);
SET @totalCount = (SELECT COUNT( * )
FROM tblMember);
-- LIMIT accepts only integer constants or prepared-statement placeholders,
-- so the two row counts are computed first and passed in with EXECUTE.
SET @limit1 = CAST(CASE
WHEN @gender1Count > @totalCount * 3 / 5 THEN
ROUND( @gender2Count * 3 / 2 )
ELSE
@gender1Count
END AS UNSIGNED);
SET @limit2 = CAST(CASE
WHEN @gender1Count > @totalCount * 3 / 5 THEN
@gender2Count
ELSE
ROUND( @gender1Count * 2 / 3 )
END AS UNSIGNED);
PREPARE stmt FROM '
SELECT m.id AS id,
m.nickname AS nickname,
m.gender AS gender,
m.popularity AS popularity
FROM tblMember m
JOIN ( ( SELECT id FROM tblMember WHERE gender = 1
ORDER BY popularity DESC LIMIT ? )
UNION
( SELECT id FROM tblMember WHERE gender = 2
ORDER BY popularity DESC LIMIT ? )
) nominees ON m.id = nominees.id
ORDER BY m.popularity DESC';
EXECUTE stmt USING @limit1, @limit2;
DEALLOCATE PREPARE stmt;
The above will give you a list where 60% of entries are gender = 1 and 40% are gender = 2. Please note that this is not the same as 60% or more of the total list as gender = 1 with the balance gender = 2 (or 40% or more of the total list as gender = 2 and the balance gender = 1).
It does this by forming a list of those whose gender equals 1 and sorting it into descending order of popularity. It then determines how many of the top entries it will grab from this list using LIMIT by checking if the count of gender = 1 members exceeds 60% (3/5ths) of the list. If it does then we will need to reduce the number of gender = 1 records to be retrieved to 3/2 times the count of gender = 2 members. The id's of the chosen records are then returned.
(A quickish explanation for those who aren't great at fractions: 40% is the same as 2/5 (two fifths). If gender = 2 has two fifths of the final list then gender = 1 must have the other three fifths (3/5). To find the size of 3/5ths of the list we start with the known 2/5ths (the count of gender = 2) and divide that by 2 so that we know the size of 1/5th of the list. We can then multiply this 1/5 by 3 to determine how many records will make up 3/5ths (60%) of our list.)
Similar logic is used to form the list of gender = 2 members to be included in the final list.
(Please note that the records at the end of each list will likely have popularity values equal to those of the most popular excluded members whose gender corresponds to each list. In the absence of any subsorting in the formation of the two lists the selection of those that are or are not chosen will be arbitrary (and essentially semirandom).)
The two lists are then joined using the UNION operator in what is a simple type of vertical join. (Note : The more familiar INNER JOIN, LEFT JOIN, etc., are all types of horizontal joins).
An inner JOIN is then performed upon our list of amalgamated id's with our original table, giving us our 60% / 40% list. Finally, this list is sorted into descending order of popularity.
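The limit arithmetic described above can be sketched in plain Python on the sample Member rows. Everything here is illustrative: the function names are hypothetical, only two genders are assumed to exist, rounding is done half-up with integer arithmetic, and ties in popularity are broken by id so the outcome is deterministic.

```python
# Sample rows from the question: (id, nickname, gender, popularity).
members = [
    (1, "jake", 1, 80), (2, "mike", 1, 88), (3, "dave", 1, 75), (4, "jenny", 2, 85),
    (5, "peter", 1, 83), (6, "nina", 2, 88), (7, "mister", 1, 77), (8, "drake", 1, 80),
]


def split_limits(gender1_count, gender2_count):
    """Return how many rows to take per gender for a 60/40 list."""
    total = gender1_count + gender2_count
    if gender1_count * 5 > total * 3:  # gender 1 is more than 60% of the table
        # Keep all of gender 2 and 3/2 times as many of gender 1 (rounded half up).
        return (gender2_count * 3 + 1) // 2, gender2_count
    # Otherwise keep all of gender 1 and 2/3 as many of gender 2 (rounded half up).
    return gender1_count, (gender1_count * 4 + 3) // 6


def by_popularity(rows):
    """Sort by popularity descending, then by id to make ties deterministic."""
    return sorted(rows, key=lambda m: (-m[3], m[0]))


def mixed_list(rows):
    """Take the top rows of each gender, then merge them by popularity."""
    g1 = by_popularity([m for m in rows if m[2] == 1])
    g2 = by_popularity([m for m in rows if m[2] == 2])
    take1, take2 = split_limits(len(g1), len(g2))
    return by_popularity(g1[:take1] + g2[:take2])


limits = split_limits(6, 2)
top_ids = [m[0] for m in mixed_list(members)]
```

With six gender = 1 members and two gender = 2 members, three and two rows are taken respectively, giving a five-row list that is 60% gender 1 and 40% gender 2.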
If you have any questions or comments, then please feel free to post a Comment accordingly.
Hello, I am eager for help: I have been stuck for two days on a complex query. Can anybody help me solve it?
Order Table
id | region_id | created_at | sale
=============|=============|=========================
1 | 1 | 2011-09-21 | $250
2 | 2 | 2012-03-12 | $320
3 | 1 | 2010-09-15 | $300
4 | 2 | 2011-08-18 | $180
5 | 1 | 2012-04-13 | $130
6 | 3 | 2010-06-22 | $360
7 | 2 | 2011-09-25 | $330
Regions Table
id | region_name
=============|=============
1 | Region 1
2 | Region 2
3 | Region 3
Expected Output
What I have tried to achieve
select distinct `regions`.`region_name`, sum(orders.sale) as sum,
CASE WHEN MONTH(orders.created_at)>=4 THEN
concat(YEAR(orders.created_at), '-',YEAR(orders.created_at)+1)
ELSE concat(YEAR(orders.created_at)-1,'-', YEAR(orders.created_at))
END AS financial_year from `orders` inner join `regions` on `orders`.`region_id` = `regions`.`id` group by YEAR(orders.created_at), `regions`.`region_name` order by `orders`.`region_id` asc, YEAR(orders.created_at) asc
My Queries Output
Where is the logical problem in my query? Note that the data should be grouped by financial year, not by calendar year.
Thanks
http://sqlfiddle.com/#!9/16fdfb/9
To fix your query: you should not use GROUP BY YEAR, since your financial year does not match the calendar year, and you want the different financial years in columns rather than in rows. You can transform your query to:
SELECT regions.region_name,
o.salePrev as `2010-11`,
o.saleCurrent as `2011-12`
FROM (SELECT
region_id,
SUM(IF(MONTH(orders.created_at)<4,sale,0)) salePrev,
SUM(IF(MONTH(orders.created_at)>=4,sale,0)) saleCurrent
FROM orders
GROUP BY region_id
) o
INNER JOIN regions
ON o.region_id = regions.id;
But as I mentioned in my comment, your condition MONTH(orders.created_at)<4 does not depend on the year, so I would transform it into something like:
SELECT regions.region_name,
o.salePrev as `2010-11`,
o.saleCurrent as `2011-12`
FROM (SELECT
region_id,
SUM(IF(
(MONTH(orders.created_at) < 4 AND YEAR(orders.created_at) = 2012)
OR YEAR(orders.created_at) < 2012
,sale,0)) salePrev,
SUM(IF(MONTH(orders.created_at) >= 4 AND YEAR(orders.created_at) = 2012,sale,0)) saleCurrent
FROM orders
GROUP BY region_id
) o
INNER JOIN regions
ON o.region_id = regions.id;
Note that this does not group by year; it only separates the current financial year (2012-04 onward) from all earlier sales (before 2012-04).
If you need all years...
UPDATE http://sqlfiddle.com/#!9/16fdfb/17
SELECT r.region_name,
SUM(IF(o.f_year=2010,o.y_sale,0)) as `2010-11`,
SUM(IF(o.f_year=2011,o.y_sale,0)) as `2011-12`,
SUM(IF(o.f_year=2012,o.y_sale,0)) as `2012-13`
FROM (SELECT
region_id,
IF(MONTH(orders.created_at)<4,YEAR(created_at)-1,YEAR(created_at)) f_year,
SUM(sale) y_sale
FROM orders
GROUP BY region_id, f_year
) o
INNER JOIN regions r
ON o.region_id = r.id
GROUP BY r.id
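The financial-year pivot can be checked on the sample orders with SQLite from Python. This is a sketch rather than the MySQL query itself: SQLite has no MONTH() or YEAR(), so strftime is used instead, the sale values are stored as plain integers without the dollar sign, and the column aliases fy_2010_11 and so on are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE regions (id INTEGER, region_name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER, region_id INTEGER, created_at TEXT, sale INTEGER)"
)
conn.executemany(
    "INSERT INTO regions VALUES (?, ?)",
    [(1, "Region 1"), (2, "Region 2"), (3, "Region 3")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2011-09-21", 250), (2, 2, "2012-03-12", 320), (3, 1, "2010-09-15", 300),
        (4, 2, "2011-08-18", 180), (5, 1, "2012-04-13", 130), (6, 3, "2010-06-22", 360),
        (7, 2, "2011-09-25", 330),
    ],
)

# Step 1 (inner query): map each order to the year its financial year starts in;
# January to March belong to the financial year that started the previous April.
# Step 2 (outer query): pivot those years into one column per financial year.
query = """
SELECT r.region_name,
       SUM(CASE WHEN o.f_year = 2010 THEN o.sale ELSE 0 END) AS fy_2010_11,
       SUM(CASE WHEN o.f_year = 2011 THEN o.sale ELSE 0 END) AS fy_2011_12,
       SUM(CASE WHEN o.f_year = 2012 THEN o.sale ELSE 0 END) AS fy_2012_13
FROM (
    SELECT region_id, sale,
           CASE WHEN CAST(strftime('%m', created_at) AS INTEGER) < 4
                THEN CAST(strftime('%Y', created_at) AS INTEGER) - 1
                ELSE CAST(strftime('%Y', created_at) AS INTEGER)
           END AS f_year
    FROM orders
) o
JOIN regions r ON r.id = o.region_id
GROUP BY r.id, r.region_name
ORDER BY r.id
"""
pivot = conn.execute(query).fetchall()
```

The order dated 2012-03-12 lands in 2011-12 and the one dated 2012-04-13 in 2012-13, which is the behaviour a calendar-year GROUP BY cannot give.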
How to filter query with order by and limit when using left join
store_profile
id + store_name
1 | Accessorize.me
2 | Active IT
3 | Edushop
4 | Gift2Kids
5 | Heavyarm
6 | Bamboo
store_fee
id + store_id + date_end
1 | 1 | 27-6-2013
2 | 2 | 29-8-2013
3 | 3 | 02-6-2013
4 | 4 | 20-4-2013
5 | 4 | 01-7-2013
6 | 4 | 28-9-2013
7 | 5 | 03-9-2013
8 | 6 | 01-9-2013
my previous query
$order_by_for_sort_column = "order by $column"; //sorting column
$query = "SELECT * FROM store_profile sp LEFT JOIN store_fee sf ON (sf.store_id = sp.id) $order_by_for_sort_column";
What I want is ORDER BY id DESC and LIMIT 1 applied to the store_fee table only, not to the entire query, so that I can grab the latest date_end for each store.
As you can see, store_id 4 (in store_fee) has 3 different dates and I just want the latest one.
The result should look something like this:
1 | Accessorize.me 27-6-2013
2 | Active IT 29-8-2013
3 | Edushop 02-6-2013
4 | Gift2Kids 28-9-2013
5 | Heavyarm 03-9-2013
6 | Bamboo 01-9-2013
SELECT a.id, a.store_name, MAX(b.date_End) date_end
FROM store_profile a
LEFT JOIN store_fee b
ON a.ID = b.store_ID
GROUP BY a.id, a.store_name
SQLFiddle Demo
However, if the date_end column is a varchar, the above query won't work, because MAX compares the values character by character and can return the wrong result: as a string, 18-1-2013 is greater than 01-6-2013.
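The difference between comparing the values as text and as dates is easy to reproduce in a few lines of Python. The helper names below are hypothetical, and the day-month-year format is inferred from the sample data. In MySQL itself the analogous step would presumably be to convert the column with STR_TO_DATE before taking MAX, which I have not tested against this schema.

```python
from datetime import datetime


def parse_dmy(text):
    """Parse a day-month-year string such as '27-6-2013' into a date."""
    return datetime.strptime(text, "%d-%m-%Y").date()


def latest(dates):
    """Return the string whose parsed date is the most recent."""
    return max(dates, key=parse_dmy)


sample = ["18-1-2013", "01-6-2013"]
string_max = max(sample)   # character-by-character comparison
date_max = latest(sample)  # comparison on the parsed dates

# The three store_fee rows for store_id 4 from the question.
store4_latest = latest(["20-4-2013", "01-7-2013", "28-9-2013"])
```

Here the plain string maximum picks 18-1-2013, while the date-aware maximum correctly picks 01-6-2013.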
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins
SELECT *
FROM store_profile AS sp
LEFT JOIN (
SELECT store_id, MAX(date_end) AS date_end
FROM store_fee
GROUP BY store_id
) AS sf
ON sp.id=sf.store_id;