mysql - need two limits? - mysql

Given a table containing awards earned over several years by members of an organization that consists of multiple geographic districts, what mysql query would show the top award earner in each district? I can easily get the top ten award earners across all districts with this query:
SELECT Membername, count(Award) AS Number FROM awards
GROUP BY Membername
ORDER BY Number desc
LIMIT 10
But I need a list with the top earner for each district (there are about 90 of them), and I haven't gotten it right yet.
I tried this:
SELECT Membername, District, count(Award) AS Number FROM awards
GROUP BY Membername, District
ORDER BY Number desc, District
LIMIT 90
It has accurate counts for the members, but isn't showing one per district, so some districts show up more than once. How do I get it to list the top earner per district, with each district showing up just once?

You'll have to do it by applying a "rank" per district, then keeping only the rows where the rank = 1. The @LastDistrict variable in the join is initialized to zero, in case the district is an ID. If the district is character based, initialize it to "" instead to match the data type.
To clarify what is happening: the "AwardCounts" pre-query counts awards per district and member, then orders the result by district and award count (descending), which puts the highest award count first within each district.
That is joined to a throwaway alias "SQLVars", which just creates inline variables for the query called @RankSeq and @LastDistrict. The first time in, "DistRankSeq" becomes 1 for the first district, and @LastDistrict is primed with that district's value. The next entry for the same district (since the rows arrive in sequence order) is assigned rank 2, then 3, and so on. When the district on the row being tested differs from the last district, the rank is set back to 1 and starts over again. So you could have one district with 100 members, another with 5, another with 17...
The result has every member with their rank within their district. Now apply HAVING to keep only district rank = 1. You could also adjust the HAVING to get the top 3 members per district, for example.
select
      AwardCounts.District,
      AwardCounts.MemberName,
      AwardCounts.memberAwards,
      -- restart the rank at 1 whenever the district changes
      @RankSeq := if( @LastDistrict = AwardCounts.District, @RankSeq + 1, 1 ) DistRankSeq,
      @LastDistrict := AwardCounts.District as ignoreIt
   from
      ( select
              a.district,
              a.membername,
              count(*) as memberAwards
           from
              awards a
           group by
              a.district,
              a.membername
           order by
              a.district,
              memberAwards desc ) AwardCounts
      JOIN (select @RankSeq := 0, @LastDistrict := 0 ) SQLVars
   HAVING
      DistRankSeq = 1
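The ranking logic that the two session variables carry out can be sketched outside SQL. This is only an illustration in Python, assuming made-up district and member rows that are already sorted the way the inner query sorts them; the names rank_seq and last_district are stand-ins for the rank counter and the last-district tracker.

```python
# Hypothetical (district, member, award_count) rows, already sorted by
# district and then by award count descending, as the inner query produces.
award_counts = [
    ("East", "Ann", 7),
    ("East", "Bob", 4),
    ("East", "Cy", 1),
    ("West", "Dee", 9),
    ("West", "Eli", 2),
]

rank_seq = 0          # plays the role of the rank counter variable
last_district = None  # plays the role of the last-district variable

ranked = []
for district, member, awards in award_counts:
    # Same district as the previous row: bump the rank; otherwise restart at 1.
    rank_seq = rank_seq + 1 if district == last_district else 1
    last_district = district
    ranked.append((district, member, awards, rank_seq))

# Equivalent of the final HAVING filter on rank = 1.
top_per_district = [row[:3] for row in ranked if row[3] == 1]
print(top_per_district)
```

Changing the final filter to `row[3] <= 3` would keep the top three per district, which mirrors adjusting the HAVING clause.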
EDIT PER FEEDBACK
If it's the aggregation that's taking the time, then I would do the following. Create a new table with nothing but the aggregations per district, name and initial rank for the district. As each new record is added to the awards table, a trigger adds one to that person's count in the aggregate table, then checks where that person is within their district and updates their rank position. You could take it a step further and have another table holding just the "TOP" member per district, one row per district with the person's name. When a new person hits the top position, their name is put in the table, overwriting whoever was there last.
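A minimal sketch of the count-maintaining part of that idea follows. It uses SQLite through Python so it runs without a server, which means the trigger syntax is SQLite's, not MySQL's (in MySQL the body would more likely use INSERT ... ON DUPLICATE KEY UPDATE). The table award_totals, the trigger name and the sample rows are all assumptions made for illustration, and the rank and "top member" bookkeeping is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE awards (membername TEXT, district TEXT, award TEXT);

-- Hypothetical summary table: one row per (district, member).
CREATE TABLE award_totals (
    district   TEXT NOT NULL,
    membername TEXT NOT NULL,
    total      INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (district, membername)
);

-- Keep the summary in step with every insert into awards.
CREATE TRIGGER awards_after_insert AFTER INSERT ON awards
BEGIN
    -- Create the member's row on first sight, then increment it.
    INSERT OR IGNORE INTO award_totals (district, membername, total)
    VALUES (NEW.district, NEW.membername, 0);
    UPDATE award_totals
       SET total = total + 1
     WHERE district = NEW.district AND membername = NEW.membername;
END;
""")

conn.executemany(
    "INSERT INTO awards VALUES (?, ?, ?)",
    [("Ann", "East", "Gold"), ("Ann", "East", "Silver"),
     ("Bob", "East", "Gold"), ("Dee", "West", "Gold")],
)

# Reading the top earners is now a cheap lookup instead of a full aggregation.
totals = conn.execute(
    "SELECT district, membername, total FROM award_totals "
    "ORDER BY district, total DESC"
).fetchall()
print(totals)
```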

There's a fairly common way to do this, using self-joins. The trick is to replace a search for 'largest' with a search for 'those items with nothing bigger'. As you have already found out,
SELECT Membername, District, count(Award) as Number FROM awards
GROUP BY Membername, District
returns a nice result of award counts. To save a bit of space, let's write ... as shorthand for that query.
Now consider
SELECT a.Membername, a.District, a.Number FROM (...) a LEFT JOIN (...) b
ON a.District=b.District
AND a.Number<b.Number
WHERE b.Membername IS NULL
where the ... is that stuff written above. It's basically saying, for every entry in the award counts (a), find me all the entries (b) in the same district with more awards, and only return (a) if there aren't any (b)'s... in other words, a is the champ.
You will need to finesse this a bit if there's more than one member in the same district with the same winning count... this query will return all the tied members. You'll have to decide how to handle that. And watch out for those districts that don't have any awards at all... they won't even appear in your table.
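Here is a small runnable sketch of the "nothing bigger" self-join, including the tie behaviour just described. It uses SQLite through Python purely so that it runs without a MySQL server; the sample rows are invented, and the shorthand ... is spelled out as the counts string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE awards (Membername TEXT, District TEXT, Award TEXT)")
conn.executemany(
    "INSERT INTO awards VALUES (?, ?, ?)",
    [("Ann", "East", "a1"), ("Ann", "East", "a2"),
     ("Bob", "East", "a3"),
     ("Dee", "West", "a4"), ("Eli", "West", "a5")],  # Dee and Eli tie
)

# The award-count query that the text abbreviates as "...".
counts = """
    SELECT Membername, District, COUNT(Award) AS Number
    FROM awards
    GROUP BY Membername, District
"""

# Keep each row of (a) only when no row of (b) in the same district beats it.
query = f"""
    SELECT a.Membername, a.District, a.Number
    FROM ({counts}) a
    LEFT JOIN ({counts}) b
           ON a.District = b.District AND a.Number < b.Number
    WHERE b.Membername IS NULL
    ORDER BY a.District, a.Membername
"""
champions = conn.execute(query).fetchall()
print(champions)
```

Both West members come back because neither has anyone above them, so a tie-break rule (for example, preferring the alphabetically first name) would have to be added to the join condition.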

There's a page specifically dedicated to the problem - and if you look at the older manuals you'll see the max-concat trick - which is often still more efficient.

Related

How do I get distinct value in one row and their respective average values as output in mySQL

I have a MySQL table with 100 rows and 6 columns, namely: full_name, name, score, city, gender, rating. I want the output as one column containing the distinct city values (there are only 5 distinct cities initially, namely Delhi, Mumbai, Patna, Chennai and Pune, and the city a user enters will be added) and a second column holding their respective average score.
The database is linked to the Python code I am working on, and the user's input is stored in the above 6 columns. Then, according to the user's request, the analysis is shown as graphs using matplotlib. But I am really stuck at the part where I need to show a graph with the city names as X values and the respective average scores as Y values. For that, I need to know the MySQL query that returns those 2 columns.
How do I do it?
SELECT city AS X,AVG(score) AS Y
FROM yourtable GROUP BY city
Is this what you meant? Or if you want the result as one row, you add GROUP_CONCAT:
SELECT GROUP_CONCAT(X) AS gX,GROUP_CONCAT(Y) AS gY FROM
(SELECT city AS X,AVG(score) AS Y
FROM yourtable GROUP BY city) g
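Since the question mentions plotting from Python, here is a rough sketch of turning the grouped result into X and Y lists. It assumes a table named data with made-up rows, and it uses SQLite in place of MySQL so the snippet runs on its own; with a real MySQL connection only the connection setup would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (name TEXT, city TEXT, score REAL)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [("a", "Delhi", 80), ("b", "Delhi", 60),
     ("c", "Pune", 90), ("d", "Mumbai", 50)],
)

# One row per city with its average score.
rows = conn.execute(
    "SELECT city, AVG(score) FROM data GROUP BY city ORDER BY city"
).fetchall()

cities = [city for city, _ in rows]       # X values
avg_scores = [avg for _, avg in rows]     # Y values
print(cities, avg_scores)

# With matplotlib installed, these lists can be plotted directly, e.g.:
#   import matplotlib.pyplot as plt
#   plt.bar(cities, avg_scores); plt.show()
```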
ok, redbull helps...
correct syntax :
select city, avg(score) from data group by city;
wrong syntax (what I was trying to do earlier) :
select data.distinct(city), data.avg(score) , check.check_city from data, (select name, distinct(city) as check_city from data) check where data.name = check.name and data.city in check.check_city;
Thanks Anyway !

SQL Count and Group By issue

I have a table Mbr that contains 3 fields: GroupType, LeaderID, and MemberID. Basically, all the members in an organization are divided up into these groups, identified by their leader's unique ID (LeaderID). Each member record also has its own MemberID, and the leaders themselves have a unique MemberID as well. The GroupType just designates whether the group a member is in is considered a Large, Small, or Individual group.
I need to find out how many groups of each GroupType contain a certain number of members.
For example:
How many Large groups contain 6 members, 7 members, 8 members, 9 and so on.
How many Small Groups contain 2 members, 3 members, 4 members and 5 members
How many Individual groups there are.
Is it possible to make a query to get a count of the unique MemberIDs for each group, and then get a COUNT of how many LeaderIDs have a certain number of members associated with them?
Note: Since you are not specifying which DBMS you are using, I tried to write a basic query. In SQL Server or Oracle this can be much more elegant.
I'm assuming that a given member can be the leader of only one group. If that is correct:
Question #1:
SELECT GroupType, NumberOfMembers, COUNT(LeaderID) AS NumberOfGroups
FROM (
SELECT GroupType, LeaderID, COUNT(*) AS NumberOfMembers
FROM MyTable
GROUP BY GroupType, LeaderID
) AS InnerGrouping
GROUP BY GroupType, NumberOfMembers
ORDER BY GroupType, NumberOfMembers
Question #2:
SELECT UniqueMemberIDPerGroup, COUNT(LeaderID) AS NumberOfLeaderID
FROM (
SELECT LeaderID, COUNT(DISTINCT MemberID) AS UniqueMemberIDPerGroup
FROM MyTable
GROUP BY LeaderID
) AS InnerGrouping
GROUP BY UniqueMemberIDPerGroup
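To see the two-level grouping from Question #1 at work, here is a small sketch against invented data. It runs on SQLite through Python only for convenience, and it uses the asker's table name Mbr where the answer wrote MyTable; the rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mbr (GroupType TEXT, LeaderID INTEGER, MemberID INTEGER)")
conn.executemany(
    "INSERT INTO Mbr VALUES (?, ?, ?)",
    [("Small", 1, 1), ("Small", 1, 2),                   # leader 1: 2 members
     ("Small", 3, 3), ("Small", 3, 4),                   # leader 3: 2 members
     ("Small", 5, 5), ("Small", 5, 6), ("Small", 5, 7),  # leader 5: 3 members
     ("Individual", 8, 8),                               # leader 8: 1 member
     ("Individual", 9, 9)],                              # leader 9: 1 member
)

# Inner query: size of each group. Outer query: how many groups have each size.
group_sizes = conn.execute("""
    SELECT GroupType, NumberOfMembers, COUNT(LeaderID) AS NumberOfGroups
    FROM (
        SELECT GroupType, LeaderID, COUNT(*) AS NumberOfMembers
        FROM Mbr
        GROUP BY GroupType, LeaderID
    ) AS InnerGrouping
    GROUP BY GroupType, NumberOfMembers
    ORDER BY GroupType, NumberOfMembers
""").fetchall()
print(group_sizes)
```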
I'm sure you can write a complex query with several subqueries to get what you want, but I personally prefer more straightforward methods. In this case, that would mean using temp tables to store intermediate values. I would first group by the columns you are going to use as criteria, with the count as the value. I would then store those results in a temp table and finally write a query against the temp table to produce the results you are looking for.

Query using two tables with DISTINCT

I have two tables: clients and group.
I need to get county and zip from clients and group-assigned from group.
When I search, I cannot get distinct results. That is, the output shows 100 clients with zipcode 12345 in Jones county in the Main St group, when I need each zip and county listed once per group. I have googled and attempted many ways, but it is just beyond me.
Can anyone steer me to the correct way?
Adding GROUP BY group, county, zip to the end of your query should get you what you need. It will only return unique combinations of the three.
Presumably you have something like:
select g.*, c.county, c.zip
from clients c join groups g on <some join condition>
You want one result per group. So, add a group by clause such as:
group by g.id -- assuming id uniquely identifies each group
This will give an arbitrary value for the other fields, which may be sufficient for what you are doing. (This uses a MySQL feature called Hidden Columns.)
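As a rough illustration of grouping on all three columns, the sketch below lists each group, county and zip combination once. The join condition is not given in the question, so the group_id column, the table name client_groups and the sample rows are all assumptions; SQLite via Python is used only so the example runs without a MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: clients point at their group through group_id.
CREATE TABLE client_groups (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE clients (name TEXT, county TEXT, zip TEXT, group_id INTEGER);
INSERT INTO client_groups VALUES (1, 'Main St'), (2, 'Oak Ave');
INSERT INTO clients VALUES
    ('c1', 'Jones', '12345', 1),
    ('c2', 'Jones', '12345', 1),
    ('c3', 'Jones', '12399', 1),
    ('c4', 'Smith', '54321', 2);
""")

# Grouping by every selected column collapses duplicate combinations.
combos = conn.execute("""
    SELECT g.name, c.county, c.zip
    FROM clients c
    JOIN client_groups g ON g.id = c.group_id
    GROUP BY g.name, c.county, c.zip
    ORDER BY g.name, c.county, c.zip
""").fetchall()
print(combos)
```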

Query MySQL for rows that share a value, and returning them as columns?

This is for a homework assignment. I haven't copy-pasted the question below; I made a simpler version of it that focuses on the specific area where I'm stuck.
Let's say I have a table of two values: a person's name, and the place he had lunch yesterday. Assume everyone has lunch in pairs. How can I query the database to return all the pairs of people that had lunch together yesterday? Each pair must be only listed once.
I'm actually not even sure what the professor means by return them as pairs. I've sent him an email, but no reply yet. It seems like he wants me to write a query that returns a table with column 1 as person 1 and column 2 as person 2.
Any suggestions on how to go about this? Does it seem right to assume he wants them as separate columns?
So far, I basically have:
SELECT name, restaurant FROM lunches GROUP BY restaurant, name
which essentially just reorganizes the table so that the people who had lunch together are one after the other.
We have to assume there can be only one pair eating lunch in a given restaurant.
You can get a list of pairs either using self-join:
SELECT l1.name, l2.name FROM lunches l1
JOIN lunches l2
ON l1.restaurant = l2.restaurant AND l1.name < l2.name
or using GROUP BY:
SELECT GROUP_CONCAT(name) FROM lunches
GROUP BY restaurant
The first query will return pairs in two different columns, while the second in one column, using comma as separator (default for GROUP_CONCAT, you can change it to whatever you wish).
Also note that for the first query names in pairs will come in alphabetical order as we use < instead of <> to avoid listing each pair twice.
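The self-join version can be tried on a toy table like this. The names and restaurants are invented, and SQLite through Python stands in for MySQL so the snippet runs as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lunches (name TEXT, restaurant TEXT)")
conn.executemany(
    "INSERT INTO lunches VALUES (?, ?)",
    [("Bob", "Cafe"), ("Ann", "Cafe"), ("Dee", "Diner"), ("Cy", "Diner")],
)

# l1.name < l2.name keeps (Ann, Bob) and drops both (Bob, Ann) and self-pairs.
pairs = conn.execute("""
    SELECT l1.name, l2.name
    FROM lunches l1
    JOIN lunches l2
      ON l1.restaurant = l2.restaurant AND l1.name < l2.name
    ORDER BY l1.name
""").fetchall()
print(pairs)
```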

mysql group by and filtering the values in each grouped record

I have a table of users that I'm grouping by age, but each user also has a nationality. If any user in a group has the nationality US, I want that to be the value in the grouped record. Currently it seems to take the first nationality it finds. How can I write this query?
One way to do it would be:
SELECT *, IF(INSTR(GROUP_CONCAT('--', nationality, '--'), '--US--'),
'US', nationality)
FROM table GROUP BY age;
What this does is that GROUP_CONCAT combines all the nationalities of one age and if it finds the string 'US' among them, it returns 'US' and otherwise it returns the nationality as it would normally do. It also adds '--' in the beginning and end of a nationality to make 'US' become '--US--'. If you didn't do that, the query would also think that any other nationality which contains the consecutive characters 'US' would mean US. But those '--' characters are only used internally and are not shown in the final result.
Edit: Another (cleaner but longer) way came into my mind:
SELECT * FROM (SELECT * FROM table WHERE nationality='US'
UNION
SELECT * FROM table WHERE nationality!='US') AS tmp
GROUP BY age;
So, first select persons whose nationality is US, then select persons whose nationality is not US and combine those two sets of persons so you get a table of persons in an order where there are first persons who are from US and then others. Then perform the GROUP BY operation to that table and you'll always get the nationality to be US if there's at least one person from US in that age, because it will always come first.
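A different technique from the two above, shown here only as a possible alternative, is conditional aggregation: test inside the aggregate whether any row in the group is 'US', so the result does not depend on row order. The users table, its rows, and the choice of MIN(nationality) as the fallback are assumptions; the snippet uses SQLite through Python so it runs without a server, and the same expression should also be valid in MySQL, where comparisons likewise evaluate to 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER, nationality TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("a", 30, "FR"), ("b", 30, "US"), ("c", 40, "DE"), ("d", 40, "FR")],
)

# SUM(nationality = 'US') counts the US rows in each age group.
by_age = conn.execute("""
    SELECT age,
           CASE WHEN SUM(nationality = 'US') > 0 THEN 'US'
                ELSE MIN(nationality)   -- fallback: alphabetically first
           END AS nationality
    FROM users
    GROUP BY age
    ORDER BY age
""").fetchall()
print(by_age)
```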