How would I do this in MySQL? - mysql

Let's say I have a database of widgets. I am showing a list of the top ten groupings of widgets, separated by category.
So let's say I want to show the groupings of widgets in category A, sorted by the number of widgets in each grouping, and only show the top 10 groupings.
So, my list might look something like this.
Top groupings in Category A
100 Widgets made by company 1 in 1990.
90 Widgets made by company 1 in 1993.
70 Widgets made by company 3 in 1993.
etc...(for 10 groupings)
This part is easy, but now let's say I want a certain grouping to ALWAYS show up in the listings even if it doesn't actually make the top ten.
Let's say I ALWAYS want to show the number of widgets made by company 1 in 2009, but I want this grouping to be shown somewhere in my list at random (not first or last).
So the end list should look something like
Top groupings in Category A
100 Widgets made by company 1 in 1990.
90 Widgets made by company 1 in 1993.
30 Widgets made by company 1 in 2009.
70 Widgets made by company 3 in 1993.
How would I accomplish this in MySQL?
thanks
Edit:
Currently, my query looks like this
SELECT
year,
manufacturer,
MAX(price) AS price,
image_url,
COUNT(id) AS total
FROM
widgets
WHERE
category_id = A
AND
year <> ''
AND
manufacturer <> ''
GROUP BY
category_id,
manufacturer,
year
ORDER BY
total DESC,
price ASC
LIMIT
10;
That's without the mandatory grouping in there.
The placement doesn't necessarily have to be random; it just shouldn't be at either extreme end. And the list should be 10 groupings including the mandatory listing. So 9 + 1.

I would use a UNION query: your current query, UNIONed with a query for the 2009 grouping. Then handle the sorting in the presentation layer.

You can write two separate queries (one for all companies and another just for company 1) and then use UNION to join them together. Finally, add ORDER BY RAND().
It will look like this (each SELECT is wrapped in parentheses so that its LIMIT applies to that SELECT rather than to the whole UNION):
SELECT * FROM
(
(SELECT company_id, company_name, year, count(*) as num_widgets
....
LIMIT 10)
UNION DISTINCT
(SELECT company_id, company_name, year, count(*) as num_widgets
...
WHERE company_id =1
...
LIMIT 10)
)x
ORDER BY RAND();

You could add a boolean field (called included here) that you make true for company 1 in 2009, and include it in the WHERE clause. Something like
select * from companies where `group` = 'some group' or included = true order by included desc, widgets_made desc limit 10
(group is a reserved word in MySQL, so it needs backticks; sorting by included desc keeps the mandatory row from being cut off by the limit.)
For the random part, you would use that as a subquery and add a column that holds a random number from 1 to 10 when included is true and the row's position in the list otherwise, then sort by that column.
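A rough sketch of how the UNION idea and the "place it in the presentation layer" idea could fit together. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it can be run as-is. The table layout, the sample data, and the names top_groupings and demo_connection are all made up for illustration, and the :name parameter style is sqlite3's (MySQL drivers usually use a different placeholder syntax).

```python
import sqlite3

# One UNION ALL query: the largest *other* groupings plus the pinned grouping.
QUERY = """
SELECT manufacturer, year, total, pinned FROM (
    SELECT manufacturer, year, COUNT(*) AS total, 0 AS pinned
    FROM widgets
    WHERE category_id = :cat
      AND NOT (manufacturer = :maker AND year = :year)
    GROUP BY manufacturer, year
    ORDER BY total DESC
    LIMIT :others
) AS top_others
UNION ALL
SELECT manufacturer, year, COUNT(*) AS total, 1 AS pinned
FROM widgets
WHERE category_id = :cat AND manufacturer = :maker AND year = :year
GROUP BY manufacturer, year
"""


def top_groupings(conn, cat, maker, year, size=10):
    """Return up to `size` (manufacturer, year, total) rows, pinned row in the middle."""
    params = {"cat": cat, "maker": maker, "year": year, "others": size - 1}
    rows = conn.execute(QUERY, params).fetchall()
    pinned = [r[:3] for r in rows if r[3] == 1]
    others = sorted((r[:3] for r in rows if r[3] == 0), key=lambda r: -r[2])
    # With two or more other rows, the middle index is neither first nor last.
    middle = len(others) // 2
    return others[:middle] + pinned + others[middle:]


def demo_connection():
    """Build a small in-memory table with made-up data (assumption, not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE widgets (id INTEGER PRIMARY KEY, category_id TEXT, "
        "manufacturer TEXT, year INTEGER)"
    )
    counts = [("company 1", 1990, 5), ("company 1", 1993, 4),
              ("company 3", 1993, 3), ("company 2", 2000, 2),
              ("company 1", 2009, 1)]
    for maker, year, n in counts:
        conn.executemany(
            "INSERT INTO widgets (category_id, manufacturer, year) VALUES ('A', ?, ?)",
            [(maker, year)] * n,
        )
    return conn


if __name__ == "__main__":
    for row in top_groupings(demo_connection(), "A", "company 1", 2009, size=4):
        print(row)
```

The pinned grouping is excluded from the first SELECT so that it can never appear twice, and the first SELECT is limited to size - 1 rows so the total stays at 9 + 1. If a random position is preferred over the middle, a random index between 1 and len(others) - 1 could be used instead.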

Related

MySQL ORDER BY Column = value AND distinct?

I'm getting grey hair by now...
I have a table like this.
ID - Place - Person
1 - London - Anna
2 - Stockholm - Johan
3 - Gothenburg - Anna
4 - London - Nils
And I want to get the result where all the different persons are included, but I want to choose which Place to order by.
For example. I want to get a list where they are ordered by LONDON and the rest will follow, but distinct on PERSON.
Output like this:
ID - Place - Person
1 - London - Anna
4 - London - Nils
2 - Stockholm - Johan
Tried this:
SELECT ID, Person
FROM users
ORDER BY FIELD(Place,'London'), Person ASC
But it gives me:
ID - Place - Person
1 - London - Anna
4 - London - Nils
3 - Gothenburg - Anna
2 - Stockholm - Johan
And I really don't want Anna, or any person, to be in the result more than once.
This is one way to get the specified output, but this uses MySQL specific behavior which is not guaranteed:
SELECT q.ID
, q.Place
, q.Person
FROM ( SELECT IF(p.Person<=>@prev_person,0,1) AS r
, @prev_person := p.Person AS person
, p.Place
, p.ID
FROM users p
CROSS
JOIN (SELECT @prev_person := NULL) i
ORDER BY p.Person, !(p.Place<=>'London'), p.ID
) q
WHERE q.r = 1
ORDER BY !(q.Place<=>'London'), q.Person
This query uses an inline view to return all the rows in a particular order, by Person, so that all of the 'Anna' rows are together, followed by all the 'Johan' rows, etc. The set of rows for each person is ordered with Place='London' first, then by ID.
The "trick" is to use a MySQL user variable to compare the values from the current row with values from the previous row. In this example, we're checking if the 'Person' on the current row is the same as the 'Person' on the previous row. Based on that check, we return a 1 if this is the "first" row we're processing for a person, otherwise we return a 0.
The outermost query processes the rows from the inline view, and excludes all but the "first" row for each Person (the 0 or 1 we returned from the inline view.)
(This isn't the only way to get the resultset. But this is one way of emulating analytic functions which are available in other RDBMS.)
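For comparison, in databases that support window functions (MySQL added them in version 8.0), we could use SQL something like this. The row number has to be computed in an inline view, because a column alias cannot be referenced in the WHERE clause of the same SELECT:
SELECT s.ID
, s.Place
, s.Person
FROM ( SELECT ROW_NUMBER() OVER (PARTITION BY t.Person ORDER BY
CASE WHEN t.Place='London' THEN 0 ELSE 1 END, t.ID) AS rn
, t.ID
, t.Place
, t.Person
FROM users t
) s
WHERE s.rn=1
ORDER BY CASE WHEN s.Place='London' THEN 0 ELSE 1 END, s.Person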
Followup
At the beginning of the answer, I referred to MySQL behavior that was not guaranteed. I was referring to the usage of MySQL User-Defined variables within a SQL statement.
Excerpts from MySQL 5.5 Reference Manual http://dev.mysql.com/doc/refman/5.5/en/user-variables.html
"As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement."
"For other statements, such as SELECT, you might get the results you expect, but this is not guaranteed."
"the order of evaluation for expressions involving user variables is undefined."
Try this:
SELECT ID, Place, Person
FROM users
GROUP BY Person
ORDER BY FIELD(Place,'London') DESC, Person ASC;
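You want to use GROUP BY instead of DISTINCT:
SELECT MIN(ID) AS ID, Person
FROM users
GROUP BY Person
ORDER BY MAX(FIELD(Place, 'London')) DESC, Person ASC;
Grouping by Person alone returns one row per person, much as SELECT DISTINCT Person would (grouping by ID as well would return every row, because each ID is unique). But you are allowed to use aggregates over the other fields in clauses such as HAVING and ORDER BY. Here MAX(FIELD(Place, 'London')) is 1 for anyone who has a London row, so DESC lists those people first. Note that MIN(ID) is the person's lowest ID, which is not necessarily the ID of their London row.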
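One more way to get a single row per person with the preferred place first, without user variables or window functions, is an anti-join: keep a row only if no "better" row exists for the same person. Below is a small sketch of that idea, run through Python's sqlite3 module as a stand-in for MySQL. The function name distinct_people and the demo setup are my own, and "better" is assumed to mean "is in the preferred place, then lowest ID".

```python
import sqlite3

# Keep row u only if the same person has no row o that ranks ahead of it.
QUERY = """
SELECT u.ID, u.Place, u.Person
FROM users u
WHERE NOT EXISTS (
    SELECT 1
    FROM users o
    WHERE o.Person = u.Person
      AND (
            -- o is in the preferred place and u is not
            (CASE WHEN o.Place = :place THEN 0 ELSE 1 END)
              < (CASE WHEN u.Place = :place THEN 0 ELSE 1 END)
            -- or both rank the same and o has the lower ID
         OR ((CASE WHEN o.Place = :place THEN 0 ELSE 1 END)
              = (CASE WHEN u.Place = :place THEN 0 ELSE 1 END)
             AND o.ID < u.ID)
      )
)
ORDER BY CASE WHEN u.Place = :place THEN 0 ELSE 1 END, u.Person
"""


def distinct_people(conn, place):
    """One row per person, rows in `place` listed first."""
    return conn.execute(QUERY, {"place": place}).fetchall()


def demo_connection():
    """The four sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (ID INTEGER PRIMARY KEY, Place TEXT, Person TEXT)")
    conn.executemany(
        "INSERT INTO users (ID, Place, Person) VALUES (?, ?, ?)",
        [(1, "London", "Anna"), (2, "Stockholm", "Johan"),
         (3, "Gothenburg", "Anna"), (4, "London", "Nils")],
    )
    return conn


if __name__ == "__main__":
    for row in distinct_people(demo_connection(), "London"):
        print(row)
```

With the sample data this returns rows 1, 4 and 2 in that order, which is the output the question asks for. On a large table the correlated subquery may be slow without an index on Person.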

sum in mysql than group by

I am making a small lottery game for fun and to improve myself.
On the database, I have
table(id, package, value, price, purchase_code,round)
Here is an example. There are two packages, package1 and package2.
package1 has a value of 3 and package2 has a value of 4. This means that if I buy package1, I get 3 tickets playing in the current round, which gives me a bigger chance to win, so it inserts 3 records into the table containing that information. In this case, I have the following records in my table:
id package_id value price purchase_code round
1 1 3 10 w3hjkrw 1
2 1 3 10 w3hjkrw 1
3 1 3 10 w3hjkrw 1
I would like to see how much money the users spent overall, and for this I used sum(price).
OK, but as you can see, the three records were one purchase, so sum(price) would give me 30. I tried to group by purchase_code, but it is not doing what I want.
Here is the code:
$income_query = mysql_query("SELECT SUM(price) FROM lottery WHERE round = '$current_round' GROUP BY purchase_code") or die(mysql_error());
while($result = mysql_fetch_array($income_query)) {
$round_money = $result['SUM(price)']." $";
}
I think you will need a subquery to get the price of each package within a purchase. You could possibly use DISTINCT, although I would just use a normal aggregate function (MAX will do the job here).
Something like this:
SELECT purchase_code, SUM(package_id_price)
FROM(
SELECT purchase_code, package_id, MAX(price) AS package_id_price
FROM lottery
WHERE round = '$current_round'
GROUP BY purchase_code, package_id
) Sub1
GROUP BY purchase_code
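To get a single total for the whole round rather than one row per purchase, the outer GROUP BY can be dropped. Here is a small runnable sketch of that variant, using Python's sqlite3 as a stand-in for MySQL. It assumes that each purchase_code identifies exactly one purchase of one package; the second and third purchase codes in the demo data are invented, and the round number is passed as a bound parameter instead of being interpolated into the SQL string.

```python
import sqlite3


def round_income(conn, round_no):
    """Total money spent in a round, counting each purchase once."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(purchase_price), 0)
        FROM (
            -- every ticket of a purchase repeats the price, so take it once
            SELECT purchase_code, MAX(price) AS purchase_price
            FROM lottery
            WHERE round = ?
            GROUP BY purchase_code
        ) AS per_purchase
        """,
        (round_no,),
    ).fetchone()
    return row[0]


def demo_connection():
    """Sample rows: the purchase from the question plus two made-up ones."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lottery (id INTEGER PRIMARY KEY, package_id INTEGER, "
        "value INTEGER, price INTEGER, purchase_code TEXT, round INTEGER)"
    )
    rows = (
        [(1, 3, 10, "w3hjkrw", 1)] * 3      # package1: 3 tickets, one purchase
        + [(2, 4, 15, "k9xq2zt", 1)] * 4    # package2: 4 tickets (made-up code)
        + [(1, 3, 10, "p5mme8a", 2)] * 3    # a purchase in round 2 (made-up code)
    )
    conn.executemany(
        "INSERT INTO lottery (package_id, value, price, purchase_code, round) "
        "VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(round_income(conn, 1), round_income(conn, 2))
```

COALESCE is only there so that a round with no purchases gives 0 instead of NULL.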

MySQL query for items where average price is less than X?

I'm stumped with how to do the following purely in MySQL, and I've resorted to taking my result set and manipulating it in ruby afterwards, which doesn't seem ideal.
Here's the question. With a dataset of 'items' like:
id state_id price issue_date listed
1 5 450 2011 1
1 5 455 2011 1
1 5 490 2011 1
1 5 510 2012 0
1 5 525 2012 1
...
I'm trying to get something like:
SELECT * FROM items
WHERE ([some conditions], e.g. issue_date >= 2011 and listed=1)
AND state_id = 5
GROUP BY id
HAVING AVG(price) <= 500
ORDER BY price DESC
LIMIT 25
Essentially I want to grab a "group" of items whose average price falls under a certain threshold. I know that the "group by" and "having" in my example above are not correct, since they just give the AVG(price) of that one item, which doesn't really make sense. I'm just trying to illustrate my desired result.
The important thing here is I want all of the individual items in my result set, I don't just want to see one row with the average price, total, etc.
Currently I'm just doing the above query without the HAVING AVG(price) and adding up the individual items one-by-one (in ruby) until I reach the desired average. It would be really great if I could figure out how to do this in SQL. Using subqueries or something clever like joining the table onto itself are certainly acceptable solutions if they work well! Thanks!
UPDATE: In response to Tudor's answer below, here are some clarifications. There is always going to be a target quantity in addition to the target average. And we would always sort the results by price low to high, and by date.
So if we did have 10 items that were all priced at $5 and we wanted to find 5 items with an average < $6, we'd simply return the first 5 items. We wouldn't return the first one only, and we wouldn't return the first 3 grouped with the last 2. That's essentially how my code in ruby is working right now.
I would do almost the inverse of what Jasper provided. Start your query with your criteria to explicitly limit it to the few items that MAY qualify, instead of getting all items and running a sub-select on each entry, which could be a larger performance hit. I could be wrong, but here's my offering:
select
i2.*
from
( SELECT i.id
FROM items i
WHERE
i.issue_date > 2011
AND i.listed = 1
AND i.state_id = 5
GROUP BY
i.id
HAVING
AVG( i.price) <= 500 ) PreQualify
JOIN items i2
on PreQualify.id = i2.id
AND i2.issue_date > 2011
AND i2.listed = 1
AND i2.state_id = 5
order by
i2.price desc
limit
25
Not sure of the order by, especially if you wanted grouping by item... In addition, I would ensure an index on (state_id, Listed, id, issue_date)
CLARIFICATION per comments
I think I AM correct on it. Don't confuse the HAVING clause with WHERE. WHERE decides whether each individual row is included, before any grouping happens. HAVING is checked after the WHERE conditions and the grouping are done: at that point each group is only a candidate for the result set, and it is kept only if it also satisfies the HAVING condition; otherwise it is thrown out. Try the following INNER query alone. Run it once WITHOUT the HAVING clause, then again WITH the HAVING clause:
SELECT i.id, avg( i.price )
FROM items i
WHERE i.issue_date > 2011
AND i.listed = 1
AND i.state_id = 5
GROUP BY
i.id
HAVING
AVG( i.price) <= 500
As you get more into writing queries, try the parts individually to see what you are getting vs what you are thinking... You'll find how / why certain things work. In addition, you are now talking in your updated question about getting multiple IDs and prices at apparent low and high range... yet you are also applying a limit. If you had 20 items, and each had 10 qualifying records, your limit of 25 would show all of the first item and 5 into the second... which is NOT what I think you want... you may want 25 of each qualified "id". That would wrap this query into yet another level...
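Here is a small runnable sketch of that experiment, using Python's sqlite3 module as a stand-in for MySQL. The rows for id 1 are the ones from the question; the rows for id 2 are made up so that one group fails the HAVING test. The filter uses issue_date >= 2011 as in the question, which is an assumption about what was intended.

```python
import sqlite3

FILTER = "issue_date >= 2011 AND listed = 1 AND state_id = 5"

# The inner query without HAVING: every group that survives the WHERE.
ALL_GROUPS = f"SELECT id, AVG(price) FROM items WHERE {FILTER} GROUP BY id ORDER BY id"

# The same query with HAVING: groups are dropped after aggregation.
KEPT_GROUPS = (
    f"SELECT id, AVG(price) FROM items WHERE {FILTER} "
    "GROUP BY id HAVING AVG(price) <= 500 ORDER BY id"
)

# The full query: join the qualifying ids back to their individual rows.
DETAIL_ROWS = f"""
SELECT i2.id, i2.price
FROM (SELECT id FROM items WHERE {FILTER}
      GROUP BY id HAVING AVG(price) <= 500) AS pre
JOIN items AS i2 ON i2.id = pre.id
WHERE i2.issue_date >= 2011 AND i2.listed = 1 AND i2.state_id = 5
ORDER BY i2.price DESC
"""


def demo_connection():
    """Rows for id 1 come from the question; id 2 is invented."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER, state_id INTEGER, price INTEGER, "
        "issue_date INTEGER, listed INTEGER)"
    )
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
        [(1, 5, 450, 2011, 1), (1, 5, 455, 2011, 1), (1, 5, 490, 2011, 1),
         (1, 5, 510, 2012, 0), (1, 5, 525, 2012, 1),
         (2, 5, 600, 2011, 1), (2, 5, 700, 2012, 1)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(conn.execute(ALL_GROUPS).fetchall())
    print(conn.execute(KEPT_GROUPS).fetchall())
    print(conn.execute(DETAIL_ROWS).fetchall())
```

The unlisted 510 row is removed by WHERE before the average is computed, so id 1 averages 480 over its four listed rows and is kept, while id 2 averages 650 and is discarded by HAVING.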
What MySQL does makes perfect sense. What you want to do does not make sense:
if you have, let's say, 4 items, each with a price of 5, and you put HAVING AVG(price) <= 7, what you are saying is that the query should return ALL the combinations, like:
{1} - since item with id 1, can be a group by itself
{1,2}
{1,3}
{1,4}
{1,2,3}
{1,2,4}
...
and so on?
Your algorithm for computing the average in ruby is also not valid. If you have items with values 5, 1, 7, 10 and seek an average value of less than 7, the element with value 10 can only be returned in a group that also contains the element with value 1. But by your algorithm (if I understood correctly), the element with value 1 is returned in the first group.
Update
What you want is something like the Knapsack problem and your approach is using some kind of Greedy Algorithm to solve it. I don't think there are straight, easy and correct ways to implement that in SQL.
After a google search, I found this article which tries to solve the knapsack problem with AI written in SQL.
By considering your item price as a weight, having the number of items and the desired average, you could compute the maximum value that can be entered in the 'knapsack' by multiplying desired_cost with number_of_items
I'm not entirely sure from your question, but I think this is a solution to your problem:
SELECT * FROM items
WHERE (some "conditions", e.g. issue_date > 2011 and listed=1)
AND state_id = 5
AND id IN (SELECT id
FROM items
GROUP BY id
HAVING AVG(price) <= 500)
ORDER BY price DESC
LIMIT 25
note: This is off the top of my head and I haven't done complex SQL in a while, so it might be wrong. I think this or something like it should work, though.

Selecting most recent as part of group by (or other solution ...)

I've got a table where the columns that matter look like this:
username
source
description
My goal is to get the 10 most recent records where a user/source combination is unique. From the following data:
1 katie facebook loved it!
2 katie facebook it could have been better.
3 tom twitter less then 140
4 katie twitter Wowzers!
The query should return records 2,3 and 4 (assume higher IDs are more recent - the actual table uses a timestamp column).
My current solution 'works' but requires 1 select to generate the 10 records, then 1 select to get the proper description per row (so 11 selects to generate 10 records) ... I have to imagine there's a better way to go. That solution is:
SELECT max(id) as MAX_ID, username, source, topic
FROM events
GROUP BY source, username
ORDER BY MAX_ID desc;
It returns the proper ids, but the wrong descriptions so I can then select the proper descriptions by the record ID.
Untested, but you should be able to handle this with a join:
SELECT
fullEvent.id,
fullEvent.username,
fullEvent.source,
fullEvent.topic
FROM
events fullEvent JOIN
(
SELECT max(id) as MAX_ID, username, source
FROM events
GROUP BY source, username
) maxEvent ON maxEvent.MAX_ID = fullEvent.id
ORDER BY fullEvent.id desc;
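For what it's worth, here is the same join run against the four sample rows from the question, using Python's sqlite3 module as a stand-in for MySQL. The column is called description here, as in the question's column list (the queries above call it topic), and a LIMIT 10 is added for the "10 most recent" requirement. Both are my assumptions.

```python
import sqlite3

QUERY = """
SELECT e.id, e.username, e.source, e.description
FROM events AS e
JOIN (
    -- newest id for each user/source combination
    SELECT MAX(id) AS max_id
    FROM events
    GROUP BY source, username
) AS latest ON latest.max_id = e.id
ORDER BY e.id DESC
LIMIT 10
"""


def latest_events(conn):
    """Most recent row per (username, source), newest first."""
    return conn.execute(QUERY).fetchall()


def demo_connection():
    """The four sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, username TEXT, "
        "source TEXT, description TEXT)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?, ?)",
        [(1, "katie", "facebook", "loved it!"),
         (2, "katie", "facebook", "it could have been better."),
         (3, "tom", "twitter", "less then 140"),
         (4, "katie", "twitter", "Wowzers!")],
    )
    return conn


if __name__ == "__main__":
    for row in latest_events(demo_connection()):
        print(row)
```

Because the description comes from the joined row rather than from the GROUP BY query, it always belongs to the record with the highest id. With a timestamp column instead of ids, the subquery would select MAX(timestamp) and the join would need to match on username, source and that timestamp.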

Help with MySQL query... Need help ordering a group of rows

I can tell it best by explaining the query I have, and what I need.
I need to be able to get a group of items from the database, grouped by category, manufacturer, and year made. The groupings need to be sorted based on total amount of items within the group. This part is done with the query below.
Secondly, I need to be able to show an image of the most expensive item in the group, which is why I use MAX(items.current_price). I thought MAX() gets the ENTIRE row corresponding to the largest column value. I was wrong, as MAX only gets the numeric value of the largest price. So the query doesn't work well for that.
SELECT
items.id,
items.year,
items.manufacturer,
COUNT(items.id) AS total,
MAX(items.current_price) AS price,
items.gallery_url
FROM
ebay AS items
WHERE
items.primary_category_id = 213
AND
items.year <> ''
AND
items.manufacturer <> ''
AND
items.bad_item <> 1
GROUP BY
items.primary_category_id,
items.manufacturer,
items.year
ORDER BY
total DESC,
price ASC
LIMIT
10
If that doesn't explain it well, the results should be something like this
id 10548
year 1989
manufacturer bowman
total 451
price 8500.00 (The price of the most expensive item in the table/ not the price of item 10548)
gallery_url http://ebay.xxxxx (The image of item 10548)
A little help please. Thanks
I've had this same problem, and I'm fairly certain you have to do two queries (or a subquery, that's a matter of taste).
The first query is like what you have (except id isn't helping you).
The second query uses the GROUP BY fields and one (one!) MAX field to get the id and any other meta-data you need.
I believe this is the implementation, although it's hard to test:
SELECT
items.id,
items.year,
items.manufacturer,
items.gallery_url
FROM
ebay as items
NATURAL JOIN
(
SELECT
COUNT(items.id) AS total,
MAX(items.current_price) AS current_price,
items.primary_category_id,
items.manufacturer,
items.year
FROM
ebay AS items
WHERE
items.primary_category_id = 213
AND
items.year <> ''
AND
items.manufacturer <> ''
AND
items.bad_item <> 1
GROUP BY
items.primary_category_id,
items.manufacturer,
items.year
ORDER BY
total DESC,
current_price ASC
LIMIT
10
) as bigones
ORDER BY
bigones.total DESC,
bigones.current_price ASC
This documentation may help you understand what's going on:
http://dev.mysql.com/doc/refman/5.1/en/group-by-hidden-columns.html
... all rows in each group should have the same values for the columns that are omitted from the GROUP BY part. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.
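A sketch of the same "join back on the maximum" idea with an explicit join condition instead of NATURAL JOIN, so that it is clear which columns are matched. It runs on Python's sqlite3 module as a stand-in for MySQL. The sample rows and image paths are invented, and the year and manufacturer filters from the original query are left out to keep it short.

```python
import sqlite3

QUERY = """
SELECT i.id, i.year, i.manufacturer, g.total, g.max_price, i.gallery_url
FROM ebay AS i
JOIN (
    -- the ten biggest groups, each with its highest price
    SELECT manufacturer, year,
           COUNT(*) AS total,
           MAX(current_price) AS max_price
    FROM ebay
    WHERE primary_category_id = :cat
    GROUP BY manufacturer, year
    ORDER BY total DESC, max_price ASC
    LIMIT 10
) AS g
  ON i.manufacturer = g.manufacturer
 AND i.year = g.year
 AND i.current_price = g.max_price   -- the row that holds the maximum
WHERE i.primary_category_id = :cat
ORDER BY g.total DESC, g.max_price ASC
"""


def top_groups_with_image(conn, category_id):
    """One row per group: the most expensive item's id and image, plus totals."""
    return conn.execute(QUERY, {"cat": category_id}).fetchall()


def demo_connection():
    """Made-up rows: two groups in category 213 and one row elsewhere."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ebay (id INTEGER PRIMARY KEY, primary_category_id INTEGER, "
        "manufacturer TEXT, year INTEGER, current_price REAL, gallery_url TEXT)"
    )
    conn.executemany(
        "INSERT INTO ebay VALUES (?, ?, ?, ?, ?, ?)",
        [(1, 213, "bowman", 1989, 5.0, "img/1.jpg"),
         (2, 213, "bowman", 1989, 8500.0, "img/2.jpg"),
         (3, 213, "bowman", 1989, 20.0, "img/3.jpg"),
         (4, 213, "topps", 1990, 10.0, "img/4.jpg"),
         (5, 213, "topps", 1990, 15.0, "img/5.jpg"),
         (6, 999, "bowman", 1989, 9999.0, "img/6.jpg")],
    )
    return conn


if __name__ == "__main__":
    for row in top_groups_with_image(demo_connection(), 213):
        print(row)
```

If two items in a group share the highest price, this returns both of them, so a tie-breaker (for example the lowest id) would be needed to guarantee exactly one row per group.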