Values in same row of groupwise maximum - mysql

I've got a table with the most common colors in images. It looks something like this:
file | color | count
---------------------
1 | ffefad | 166
1 | 443834 | 84
2 | 74758a | 3874
2 | abcdef | 228
2 | 876543 | 498
3 | 543432 | 3382
3 | abcdef | 483
I'm trying to get the most common color for each image. So I'd like my result to be:
file | color | count
---------------------
1 | ffefad | 166
2 | 74758a | 3874
3 | 543432 | 3382
So my problem seems to be that I need to GROUP BY the file column, but MAX() the count column. But simply
SELECT h.file, h.color, MAX(h.count) FROM histogram h GROUP BY h.file
isn't working because it's indeterminate, so the color result won't match the row from the count result.
SELECT h.file, h.color, MAX(h.count) FROM histogram h GROUP BY h.file, h.color
fixes the determinacy, but now every row is "unique" and all rows are returned.
I can't figure out a way to do a subquery or join, since the only "correct" values I can figure to get, file and count, are not distinct by themselves.
Perhaps I need a saner schema? It's "my" table so I can change that if need be.

SELECT tbl.file, tbl.color, tbl.count
FROM tbl
LEFT JOIN tbl as lesser
ON lesser.file = tbl.file
AND tbl.count < lesser.count
WHERE lesser.file IS NULL
order by tbl.file
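To see the self-join above in action, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the same query. SQLite is only a stand-in for MySQL here, and the function name top_color_per_file is made up for illustration. The LEFT JOIN looks for a row in the same file with a higher count; rows where no such row exists (lesser.file IS NULL) are the per-file maxima.

```python
import sqlite3

# Sample rows copied from the question: (file, color, count).
ROWS = [
    (1, "ffefad", 166), (1, "443834", 84),
    (2, "74758a", 3874), (2, "abcdef", 228), (2, "876543", 498),
    (3, "543432", 3382), (3, "abcdef", 483),
]


def top_color_per_file(rows):
    """Return the highest-count row for each file using a self anti-join."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE tbl (file INTEGER, color TEXT, count INTEGER)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    query = """
        SELECT tbl.file, tbl.color, tbl.count
        FROM tbl
        LEFT JOIN tbl AS lesser
               ON lesser.file = tbl.file
              AND tbl.count < lesser.count
        WHERE lesser.file IS NULL
        ORDER BY tbl.file
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(top_color_per_file(ROWS))
# [(1, 'ffefad', 166), (2, '74758a', 3874), (3, '543432', 3382)]
```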

select file , max(count)
FROM histogram
GROUP BY file
This will give the max(count) by file. Turn it into a subquery and inner join so it acts as a filter.
select h.file, h.color, h.count
from histogram h inner join
(select file, max(count) as maxcount
FROM histogram
GROUP BY file) a
on a.file = h.file and a.maxcount = h.count
This will return more than one row for a file if several colors tie for that file's maximum count.
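The tie behaviour of the subquery-and-join approach is easy to check with a small sketch. It assumes an in-memory SQLite database in place of MySQL, and the second row for file 1 is hypothetical data added to force a tie; the helper name max_count_rows is also invented for this example.

```python
import sqlite3

# (file, color, count); the tie in file 1 is made-up data for illustration.
TIED_ROWS = [
    (1, "ffefad", 166), (1, "443834", 166),
    (2, "74758a", 3874), (2, "abcdef", 228),
]


def max_count_rows(rows):
    """Join each row against its file's MAX(count); tied rows all survive."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE histogram (file INTEGER, color TEXT, count INTEGER)")
    conn.executemany("INSERT INTO histogram VALUES (?, ?, ?)", rows)
    query = """
        SELECT h.file, h.color, h.count
        FROM histogram AS h
        INNER JOIN (
            SELECT file, MAX(count) AS maxcount
            FROM histogram
            GROUP BY file
        ) AS a ON a.file = h.file AND a.maxcount = h.count
        ORDER BY h.file, h.color
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(max_count_rows(TIED_ROWS))
# [(1, '443834', 166), (1, 'ffefad', 166), (2, '74758a', 3874)]
```

If exactly one row per file is required, a tie-breaker has to be chosen explicitly (for example, the smallest color value), since nothing in the data decides it.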

Related

Find multiple totals by adding values from mysql table

I need to compute a total by adding up all the values I can find in the db related to a specific customer.
Example:
| Cust. | Value |
| 1 | 3 |
| 2 | 1 |
| 1 | 1 |
| 2 | 1 |
| 3 | 5 |
The result I want is: Customer #1 = 4, Customer #2 = 2, Customer #3 = 5.
Is there a way to do that directly in the MySQL query?
Try the query below.
Select CONCAT('Customer #' , cust) as customer , sum(Value)
FROM customer_table
Group By cust
You want to SUM the values with a specific GROUP BY clause. Think of the GROUP BY as dividing rows into buckets and the SUM as aggregating the contents of those buckets into something useful.
Something like:
SELECT Cust, SUM(Value) FROM customer_table GROUP BY Cust
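As a quick check of the bucket idea, this sketch runs the grouped SUM against the sample rows from the question in an in-memory SQLite database (a stand-in for MySQL). It uses SQLite's || operator where the MySQL answer uses CONCAT, and the function name totals_per_customer is invented for the example.

```python
import sqlite3

# (cust, value) rows copied from the question.
CUSTOMER_ROWS = [(1, 3), (2, 1), (1, 1), (2, 1), (3, 5)]


def totals_per_customer(rows):
    """Sum the values in each customer's bucket."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE customer_table (cust INTEGER, value INTEGER)")
    conn.executemany("INSERT INTO customer_table VALUES (?, ?)", rows)
    query = """
        SELECT 'Customer #' || cust AS customer, SUM(value) AS total
        FROM customer_table
        GROUP BY cust
        ORDER BY cust
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(totals_per_customer(CUSTOMER_ROWS))
# [('Customer #1', 4), ('Customer #2', 2), ('Customer #3', 5)]
```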

Retrieve distinct values without reducing number of results

I'm writing a MySQL request for retrieving data from a list of questions.
The table looks like this :
-----------------------------------------------------
| id | answer_name | rating | question_id | answers |
-----------------------------------------------------
Where several rows can have the same answer_name value, since several questions can be asked about the same answer.
Now, for retrieving the data I use a LIMIT clause which is calculated from ratings and the total number of rows.
For example, if I wanna get the data between 80% and 100% of rating, and there are 100 rows, I would use ORDER BY rating LIMIT 80, 20.
My problem is the following: I need to retrieve rows with distinct values in the answer_name column, but a GROUP BY clause reduces the number of result rows through aggregation. The LIMIT offset for the top percentages then points past the end of the result, so nothing is returned.
Does anyone know if there is a way to keep the number of results the same and still to retrieve distinct results for the answer_name column ?
EDIT :
Here are some sample rows and expected output :
game_data table :
-----------------------------------------------------
| id | answer_name | rating | question_id | answers |
|----|-------------|--------|-------------|---------|
| 1 | A. Merkel | 40 | 1 | [1,2,3] |
| 2 | A. Merkel | 45 | 2 | [2,3,4] |
| 3 | B. Clinton | 55 | 1 | [2,5,8] |
| 4 | B. Clinton | 50 | 2 | [3,5,8] |
| 5 | L. Messi | 17 | 4 | [7,8,9] |
| 6 | L. Messi | 18 | 5 | [7,8,9] |
| 7 | L. Messi | 25 | 6 | [7,8,9] |
| 8 | D. Beckham | 21 | 4 | [6,7,8] |
| 9 | D. Beckham | 52 | 5 | [6,7,8] |
| 10 | D. Beckham | 41 | 6 | [6,7,8] |
-----------------------------------------------------
Where answers is an array of ids referring to another table.
Let's say I wanna retrieve the 50% to 80% of the table, ordered by rating.
SELECT id FROM game_data GROUP BY answer_name ORDER BY rating LIMIT 5, 3
The problem here is that GROUP BY answer_name reduces the number of rows, so instead of returning 3 results the query returns an empty set.
Also, I want the row selected within each GROUP BY group to be chosen randomly.
Using group by like this goes against pretty much every instinct, but you said you want random values, so it's good enough.
select * from (
-- the alias is rnk because RANK is a reserved word in MySQL 8.0
select q.*, @rank := @rank + 1 as rnk
from (
select * from game_data
group by answer_name
order by rating desc
) q, (select @rank := 0) qq
) qqq
where rnk between (@rank * .5) and (@rank * .8)
How does it work? First (in the innermost query) we group by your answer_name, to get your distinct results, and we order it by the rating as required.
Then in the query wrapping around that one, we give those results a ranking from 1 to however many rows are in the result. Once this level of the query completes, we know our best answer is answer 1, and our 'worst' answer is the last value of our @rank variable.
Then we get to the outermost query. We can use that @rank variable to determine our percentages, which we use to filter the where clause.
In all likelihood this will give you the same results each time you run the same query, but the values chosen are indeterminate - so it could change. If you want truly random (ie changes with each execution) that's a different kettle of fish altogether.
(note, this bit: , (select @rank := 0) qq is purely to initialise the variable)
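The same three steps (one row per answer_name, rank by rating, keep a percentage band of the ranks) can be sketched outside SQL. This is plain Python rather than the MySQL user-variable query, and it picks the row per name with a random choice, which is what the question asked for but not what the GROUP BY query does. The function name pick_band and its parameters are assumptions made for this example.

```python
import random

# (id, answer_name, rating) taken from the sample table in the question.
GAME_DATA = [
    (1, "A. Merkel", 40), (2, "A. Merkel", 45),
    (3, "B. Clinton", 55), (4, "B. Clinton", 50),
    (5, "L. Messi", 17), (6, "L. Messi", 18), (7, "L. Messi", 25),
    (8, "D. Beckham", 21), (9, "D. Beckham", 52), (10, "D. Beckham", 41),
]


def pick_band(rows, low=0.5, high=0.8, rng=None):
    """Pick one random row per answer_name, rank by rating, keep a rank band."""
    rng = rng or random.Random()
    by_name = {}
    for row in rows:
        by_name.setdefault(row[1], []).append(row)
    # Step 1: one randomly chosen row per distinct answer_name.
    chosen = [rng.choice(group) for group in by_name.values()]
    # Step 2: rank 1 is the highest rating, like ORDER BY rating DESC.
    chosen.sort(key=lambda r: r[2], reverse=True)
    total = len(chosen)
    # Step 3: keep ranks between total * low and total * high (inclusive).
    return [row for rank, row in enumerate(chosen, start=1)
            if total * low <= rank <= total * high]


band = pick_band(GAME_DATA, rng=random.Random(0))
print(band)  # two rows: ranks 2 and 3 of the 4 distinct names
```

Because the band is computed from the number of distinct names (4 here) rather than the raw row count (10), the slice can never point past the end of the result.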
It's simple.
Use GROUP BY id, not answer_name, because grouping by id does not collapse rows that share the same answer_name.
SELECT * FROM game_data GROUP BY id ORDER BY rating

Join two tables using multiple rows in the join

I have two tables
Table: color_document
+----------+---------------------+
| color_id | document_id |
+----------+---------------------+
| 180907 | 4270851 |
| 180954 | 4270851 |
+----------+---------------------+
Table: color_group
+----------------+-----------+
| color_group_id | color_id |
+----------------+-----------+
| 3 | 180954 |
| 4 | 180907 |
| 11 | 180907 |
| 11 | 180984 |
| 12 | 180907 |
| 12 | 180954 |
+----------------+-----------+
Is it possible for a query to get a result that looks something like this using multiple color id's to join the two tables?
Result
+----------------+--------------+
| color_group_id | document_id |
+----------------+--------------+
| 12 | 4270851 |
+----------------+--------------+
Since Color Group 12 is the only group that has the exact same set of Colors that Document 4270851 has.
I've got some bad data that I'm being forced to work with, so I've had to manufacture the color groups by finding each unique set of color_id's associated with document_id's. I'm trying to then create a new relationship directly between my manufactured color groups and documents.
I know I could probably do something with a GROUP_CONCAT to make a pseudo key of concatenated color ids, but I'm trying to find a solution that would also work in, say, Oracle. Am I barking up the completely wrong tree with this logic?
My ultimate goal is to be able to have a single row in a table that would represent any number of Colors that are associated with a Document to be exported to a completely different system than the one I'm working with.
Any thoughts/comments/suggestions are greatly appreciated.
Thank you in advance for looking at my question.
Do a normal join of the two tables, and count the number of rows in each pairing. Then test whether this is the same as the number of times each of the items appears in the original tables. If all are the same, then all color IDs must match.
SELECT a.color_group_id, a.document_id
FROM (
SELECT color_group_id, document_id, COUNT(*) ct
FROM color_document d
JOIN color_group g ON d.color_id = g.color_id
GROUP BY color_group_id, document_id) a
JOIN (
SELECT color_group_id, COUNT(*) ct
FROM color_group
GROUP BY color_group_id) b
ON a.color_group_id = b.color_group_id and a.ct = b.ct
JOIN (
SELECT document_id, COUNT(*) ct
FROM color_document
GROUP BY document_id) c
ON a.document_id = c.document_id and a.ct = c.ct
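Here is a small sketch of the count-comparison query run against the sample tables from the question, using an in-memory SQLite database as a stand-in for MySQL or Oracle. The function name exact_group_matches is invented for the example. A group matches a document only when the number of shared colors equals both the group's size and the document's size.

```python
import sqlite3

# Sample data copied from the question.
COLOR_DOCUMENT = [(180907, 4270851), (180954, 4270851)]
COLOR_GROUP = [(3, 180954), (4, 180907), (11, 180907),
               (11, 180984), (12, 180907), (12, 180954)]


def exact_group_matches(color_document, color_group):
    """Return (color_group_id, document_id) pairs whose color sets are identical."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE color_document (color_id INTEGER, document_id INTEGER)")
    conn.execute("CREATE TABLE color_group (color_group_id INTEGER, color_id INTEGER)")
    conn.executemany("INSERT INTO color_document VALUES (?, ?)", color_document)
    conn.executemany("INSERT INTO color_group VALUES (?, ?)", color_group)
    query = """
        SELECT a.color_group_id, a.document_id
        FROM (
            SELECT g.color_group_id, d.document_id, COUNT(*) AS ct
            FROM color_document d
            JOIN color_group g ON d.color_id = g.color_id
            GROUP BY g.color_group_id, d.document_id
        ) a
        JOIN (
            SELECT color_group_id, COUNT(*) AS ct
            FROM color_group
            GROUP BY color_group_id
        ) b ON a.color_group_id = b.color_group_id AND a.ct = b.ct
        JOIN (
            SELECT document_id, COUNT(*) AS ct
            FROM color_document
            GROUP BY document_id
        ) c ON a.document_id = c.document_id AND a.ct = c.ct
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(exact_group_matches(COLOR_DOCUMENT, COLOR_GROUP))
# [(12, 4270851)]
```

Note that the counts only work as a set comparison if neither table contains duplicate (id, color_id) pairs.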
If I understand your question correctly, you just have to join the two tables and then group the results by color_group_id and document_id.
select color_group_id, document_id
from
color_document cd join
color_group cg
on cd.color_id = cg.color_id
group by color_group_id, document_id
That query will give you this result set:
COLOR_GROUP_ID DOCUMENT_ID
3 4270851
4 4270851
11 4270851
12 4270851
Is that what you want?

Show all grouped results and sort

I have a table like this one:
| B | 1 |
| C | 2 |
| B | 2 |
| A | 2 |
| C | 3 |
| A | 2 |
I would like to fetch it, but sorted and grouped. That is, I would like it grouped by the letter, but sorted by the highest sum of the group. Also, I want to show all entries within the group:
| C | 3 |
| C | 2 |
| A | 2 |
| A | 2 |
| B | 2 |
| B | 1 |
The order is that way because C has 3 and 2. 3+2=5, which is higher than 2+2=4 for A which in turn is higher than 2+1=3 for B.
I need to show all "grouped" letters because there are other columns that are distinct all of which I need shown.
EDIT:
Thanks for the quick reply. I have the audacity, however, to inquire further.
I have this query:
SELECT * FROM `ip_log` WHERE `IP` IN
(SELECT `IP` FROM `ip_log` GROUP BY `IP` HAVING COUNT(DISTINCT `uid`) > 1)
GROUP BY `uid` ORDER BY `IP`
The letters in the upper description are ip (I need it grouped by the IP addresses) and the numbers are timestamp (I need it sorted by the sum (or just used as the sorting parameter)). Should I create a temporary table and then use the solution below?
select t.Letter, t.Value
from MyTable t
inner join (
select Letter, sum(Value) as ValueSum
from MyTable
group by Letter
) ts on t.Letter = ts.Letter
order by ts.ValueSum desc, t.Letter, t.Value desc
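To confirm the ordering this join produces, here is a sketch that runs it on the letter/number rows from the question in an in-memory SQLite database (standing in for MySQL). The table and column names follow the answer's MyTable, Letter and Value; the function name rows_by_group_sum is made up for the example.

```python
import sqlite3

# (Letter, Value) rows copied from the question.
LETTER_ROWS = [("B", 1), ("C", 2), ("B", 2), ("A", 2), ("C", 3), ("A", 2)]


def rows_by_group_sum(rows):
    """Return every row, with groups ordered by their summed Value."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE MyTable (Letter TEXT, Value INTEGER)")
    conn.executemany("INSERT INTO MyTable VALUES (?, ?)", rows)
    query = """
        SELECT t.Letter, t.Value
        FROM MyTable t
        INNER JOIN (
            SELECT Letter, SUM(Value) AS ValueSum
            FROM MyTable
            GROUP BY Letter
        ) ts ON t.Letter = ts.Letter
        ORDER BY ts.ValueSum DESC, t.Letter, t.Value DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(rows_by_group_sum(LETTER_ROWS))
# [('C', 3), ('C', 2), ('A', 2), ('A', 2), ('B', 2), ('B', 1)]
```

Including t.Letter in the ORDER BY keeps each group's rows together even when two groups have the same sum.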
If your table's columns are letter and number, the way I would go about this is the following:
SELECT
letter,
GROUP_CONCAT(number ORDER BY number DESC),
SUM(number) AS total
FROM `table`
GROUP BY letter
ORDER BY total desc
What you will get, based on your example is the following:
| C | 3,2 | 5
| A | 2,2 | 4
| B | 2,1 | 3
You can then process that data to get the actual information you want/need.
If you still want the data in the format you requested originally, it is not possible without a subquery. The reason is that you can't sort by an aggregate (the SUM of the number column) that isn't calculated in the same query. So you need a subquery to calculate it and join it back to the original table (disclaimer: untested query):
SELECT
letter,
number
FROM `table` AS t
JOIN (SELECT letter AS ltr, SUM(number) AS total FROM `table` GROUP BY letter) AS totals
ON t.letter = totals.ltr
ORDER BY totals.total desc, letter desc, number desc

Distinct Help

Alright so I have a table, in this table are two columns with ID's. I want to make one of the columns distinct, and once it is distinct to select all of those from the second column of a certain ID.
Originally I tried:
select distinct inp_kll_id from kb3_inv_plt where inp_plt_id = 581;
However this does the where clause first, and then returns distinct values.
Alternatively:
select * from (select distinct(inp_kll_id) from kb3_inv_plt) as inp_kll_id where inp_plt_id = 581;
However this cannot find the column inp_plt_id because distinct only returns the column, not the whole table.
Any suggestions?
Edit:
Each kll_id may have one or more plt_id. I would like unique kll_id's for a certain kb3_inv_plt id.
| inp_kll_id | inp_plt_id |
| 1941 | 41383 |
| 1942 | 41276 |
| 1942 | 38005 |
| 1942 | 39052 |
| 1942 | 40611 |
| 1943 | 5868 |
| 1943 | 4914 |
| 1943 | 39511 |
| 1944 | 39511 |
| 1944 | 41276 |
| 1944 | 40593 |
| 1944 | 26555 |
If you do mean, by "make distinct", "pick only inp_kll_ids that happen just once" (not the SQL semantics for Distinct), this should work:
select inp_kll_id
from kb3_inv_plt
group by inp_kll_id
having count(*) = 1 and max(inp_plt_id) = 581;
Get all the distinct first (alias 'a' in my following example) and then join it back to the table with the specified criteria (alias 'b' in my following example).
SELECT *
FROM (
SELECT
DISTINCT inp_kll_id
FROM kb3_inv_plt
) a
LEFT JOIN kb3_inv_plt b
ON a.inp_kll_id = b.inp_kll_id
WHERE b.inp_plt_id = 581
"in this table are two columns with ID's. I want to make one of the columns distinct, and once it is distinct to select all of those from the second column of a certain ID."
SELECT distinct tableX.ID2
FROM tableX
WHERE tableX.ID1 = 581
I think your understanding of distinct may be different from how it works. This will indeed apply the where clause first, and then get a distinct list of unique entries of tableX.ID2, which is exactly what you ask for in the first part of your question.
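A short sketch makes the order of operations visible. It loads the sample rows from the question into an in-memory SQLite database (standing in for MySQL), plus one hypothetical duplicate row so that DISTINCT has something to collapse. It filters on 39511 because 581 does not appear in the sample data; the function name kll_ids is invented for the example.

```python
import sqlite3

# (inp_kll_id, inp_plt_id) rows from the question.
KB3_ROWS = [
    (1941, 41383), (1942, 41276), (1942, 38005), (1942, 39052),
    (1942, 40611), (1943, 5868), (1943, 4914), (1943, 39511),
    (1944, 39511), (1944, 41276), (1944, 40593), (1944, 26555),
    (1943, 39511),  # hypothetical duplicate, added only for this demo
]


def kll_ids(rows, plt_id, distinct=True):
    """Filter by inp_plt_id first, then optionally de-duplicate the result."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE kb3_inv_plt (inp_kll_id INTEGER, inp_plt_id INTEGER)")
    conn.executemany("INSERT INTO kb3_inv_plt VALUES (?, ?)", rows)
    keyword = "DISTINCT" if distinct else ""
    query = f"""
        SELECT {keyword} inp_kll_id
        FROM kb3_inv_plt
        WHERE inp_plt_id = ?
        ORDER BY inp_kll_id
    """
    result = [row[0] for row in conn.execute(query, (plt_id,))]
    conn.close()
    return result


print(kll_ids(KB3_ROWS, 39511, distinct=False))  # [1943, 1943, 1944]
print(kll_ids(KB3_ROWS, 39511))                  # [1943, 1944]
```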
By making a row distinct, you're ensuring no other rows are exactly the same. You aren't making a column distinct. Let's say your table has this data:
ID1 ID2
10 4
10 3
10 7
4 6
When you select distinct ID1,ID2 - you get the same as select * because the rows are already distinct.
Can you add information to clear up what you are trying to do?