MySQL - Limit by bottom (X) percent of rows - mysql

Here's my table structure:
+-------+--------+----------+
| item  | price  | quantity |
+-------+--------+----------+
| 22452 | 579150 |        4 |
| 34664 | 334425 |        7 |
| 32249 | 204750 |        3 |
| 39970 |  97500 |        5 |
| 36907 | 116415 |        6 |
|  4338 | 207451 |       17 |
| 23425 | 388050 |        4 |
| 23427 | 532350 |       14 |
| 76080 | 180000 |        6 |
| 76076 | 400000 |        4 |
+-------+--------+----------+
Item is not unique and there could be anywhere from 1 to a few thousand rows for each item, so I'm grouping by item for the results. My current query is the following:
SELECT item AS id,
COUNT(item) as total,
ROUND(AVG(price/quantity)) AS mean,
ROUND(MIN(price/quantity)) AS cheapest
FROM `data`
GROUP BY item;
In addition to these 4 results, I would like to calculate the average price of the bottom 15% of rows of the (price/quantity) value (not < 0.15*MAX(price/quantity) but 0.15*total, ordered by (price/quantity) ASC). The solution I thought of involved temporary tables using the count of that item as the limiter, but I'd highly prefer it to be a single query if possible. I'm sure I'll need a subquery in here, but I'm unsure of how to go about getting the count for that particular item and then limiting by 15% of that result.
UPDATE WITH ANSWER FROM BELOW
Using @GordonLinoff's answer below got me basically all the way there. I did run into two issues, however. The biggest one was that the @rn variable wasn't resetting, which caused it to keep incrementing, so only the first row of the items was getting included. The second was that NULL was being returned for any item where 15% of the number of times it appears in the table is < 1. The corrections were minor, and I've included the final query I used below:
SELECT item AS id,
COUNT(item) as total,
ROUND(AVG(price/quantity)) AS mean,
ROUND(MIN(price/quantity)) AS cheapest,
ROUND(avg(case when rn <= IF(cnt * 0.15 < 1, cnt, cnt * 0.15) then price/quantity end)) as Cheapest15Percent
FROM
(SELECT d.*, cnt, IF(@item = d.item, @rn := @rn + 1, if(@item := d.item, @rn := 1, 1)) as rn
FROM `data` d LEFT JOIN
(SELECT item, COUNT(*) cnt FROM `data` GROUP BY item) di
ON d.item = di.item CROSS JOIN
(SELECT @rn := 0, @item := -1) vars
ORDER BY d.item, d.price/d.quantity) d
GROUP BY d.item;

This assumes that you want the average of the cheapest 15% for each item.
The following query enumerates the rows for each item and gets the total rows:
select d.*, cnt,
if(@item = item, @rn := @rn + 1, if(@item := item, 1, 1)) as rn
from `data` d left join
(select item, count(*) cnt
from data
group by item
) di
on d.item = di.item cross join
(select @rn := 0, @item := -1) vars
order by item, price/quantity;
You can then basically plug this into your query and do conditional aggregation:
SELECT item AS id,
COUNT(item) as total,
ROUND(AVG(price/quantity)) AS mean,
ROUND(MIN(price/quantity)) AS cheapest,
avg(case when rn <= cnt * 0.15 then price/quantity end) as Cheapest15Percent
FROM (select d.*, cnt,
if(@item = item, @rn := @rn + 1, if(@item := item, 1, 1)) as rn
from `data` d left join
(select item, count(*) cnt
from data
group by item
) di
on d.item = di.item cross join
(select @rn := 0, @item := -1) vars
order by item, price/quantity
) d
GROUP BY item;
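For comparison, here is a minimal sketch of the same "cheapest 15% per item" average written with window functions instead of user variables. It assumes an engine that supports ROW_NUMBER() and COUNT(*) OVER (MySQL 8.0 and later does), and it runs on SQLite through Python's sqlite3 module only so that the example is self-contained. The helper name cheapest_15_percent and the alias unit_price are made up for illustration. The rule for items with fewer than 7 rows (fall back to all rows) follows the asker's correction above.

```python
import sqlite3

# Window-function sketch of the per-item "cheapest 15%" average.
# ROW_NUMBER() restarts for every item, so no manual reset of a counter is needed.
CHEAPEST_QUERY = """
SELECT item AS id,
       COUNT(*) AS total,
       AVG(CASE WHEN rn <= CASE WHEN cnt * 0.15 < 1 THEN cnt ELSE cnt * 0.15 END
                THEN unit_price END) AS cheapest15percent
FROM (
    SELECT item,
           price * 1.0 / quantity AS unit_price,  -- * 1.0 avoids SQLite integer division
           ROW_NUMBER() OVER (PARTITION BY item ORDER BY price * 1.0 / quantity) AS rn,
           COUNT(*)     OVER (PARTITION BY item) AS cnt
    FROM data
) ranked
GROUP BY item
ORDER BY item
"""

def cheapest_15_percent(rows):
    """rows: (item, price, quantity) tuples. Returns (id, total, cheapest15percent) rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE data (item INTEGER, price INTEGER, quantity INTEGER)")
        conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)
        return conn.execute(CHEAPEST_QUERY).fetchall()
    finally:
        conn.close()

# Usage with made-up rows: item 7 has only 3 rows, so all of them are averaged.
print(cheapest_15_percent([(7, 100, 2), (7, 90, 3), (7, 400, 4)]))
```

In MySQL itself the `* 1.0` is unnecessary, because `/` already returns a decimal result there.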

Related

Convert Row to Column mySQL with ID (not pivoting)

I have values stored in a database like this:
ID | Date | Value
----------------------------------------------
1 | 11/20 | 1
1 | 11/21 | 2
2 | 11/20 | 10
2 | 11/21 | 20
However, I need it to be like this:
Date | Value ID 1 | Value ID 2
----------------------------------------------
11/20| 1 | 10
11/21| 2 | 20
So that the new columns can be plotted as a trend (column 1 = date, column 2 = value #1, column 3 = value #2, column 4 = value #3, etc.).
Here is the query for a single tag:
SELECT *
FROM (
SELECT ID, _date, ESYNC_TAGSHISTORY.Val, @curRow := @curRow + 1 AS row_number
FROM ESYNC_TAGSHISTORY
JOIN (SELECT @curRow:=0) i
INNER JOIN ESYNC_TAGS ON ESYNC_TAGSHISTORY.TAGID=ESYNC_TAGS.ID
WHERE ESYNC_TAGS.NAME='I_TT_21052' AND ESYNC_TAGS.STATIONID=1 AND (_date BETWEEN now()-INTERVAL 45 MINUTE AND now()) ) s
WHERE row_number mod 60 = 0;
And the results:
ID | Date | Value ID 1 | Row
----------------------------------------------
1 | 11/20| 1 | 1
1 | 11/21| 2 | 2
EDIT:
With some modification, my query looks like this:
SELECT *
FROM (
SELECT ID, _date, ESYNC_TAGSHISTORY.Val, @curRow := @curRow + 1 AS row_number,
if (ESYNC_TAGS.NAME='I_TT_21052', ESYNC_TAGSHISTORY.Val, NULL) as 'I_TT_21052',
if (ESYNC_TAGS.NAME='I_TT_91214', ESYNC_TAGSHISTORY.Val, NULL) as 'I_TT_40011'
FROM ESYNC_TAGSHISTORY
JOIN (SELECT @curRow:=0) i
INNER JOIN ESYNC_TAGS ON ESYNC_TAGSHISTORY.TAGID=ESYNC_TAGS.ID
WHERE ESYNC_TAGS.STATIONID=1 AND (_date BETWEEN now()-INTERVAL 5 MINUTE AND now()) ) s
WHERE row_number mod 1 = 0
ORDER BY ID ,_date;
The result looks like this:
[image: SQL result]
My problem now is to move the data from the last column into the same rows as the others (so that the values line up with the date).
EDIT #2: Finally, for future reference, the query looks like this:
SELECT _date, I_TT_21052, I_TT_40011, row_number
From(
SELECT max(_date) as _date, max(I_TT_21052) as I_TT_21052, max(I_TT_40011) as I_TT_40011, @curRow := @curRow + 1 AS row_number
FROM (
SELECT ID, _date, ESYNC_TAGSHISTORY.Val,
if (ESYNC_TAGS.NAME='I_TT_21052', ESYNC_TAGSHISTORY.Val, NULL) as 'I_TT_21052',
if (ESYNC_TAGS.NAME='I_TT_91214', ESYNC_TAGSHISTORY.Val, NULL) as 'I_TT_40011'
FROM ESYNC_TAGSHISTORY
JOIN (SELECT @curRow:=0) i
INNER JOIN ESYNC_TAGS ON ESYNC_TAGSHISTORY.TAGID=ESYNC_TAGS.ID
WHERE ESYNC_TAGS.STATIONID=1 AND (_date BETWEEN now()-INTERVAL 24 HOUR AND now()) ) s
GROUP BY _date)v
WHERE row_number mod 150 = 0;
select
count(case when id=1 then 1 end) as Value1,
count(case when id=2 then 1 end) as Value2
from ESYNC_TAGSHISTORY
This is not exact, but try this kind of query and you will get the result.
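The row-to-column step discussed in this thread is usually done with conditional aggregation: one MAX(CASE ...) per target column, grouped by the date. The sketch below applies it to the four sample rows from the question. The table name readings and the column aliases are assumptions made for illustration, and SQLite (via Python's sqlite3) stands in for MySQL so the example runs on its own; the SELECT itself uses only syntax that MySQL also accepts.

```python
import sqlite3

# One output column per ID; MAX() collapses the NULLs produced by the CASE
# for rows that belong to the other ID.
PIVOT_QUERY = """
SELECT _date,
       MAX(CASE WHEN id = 1 THEN val END) AS value_id_1,
       MAX(CASE WHEN id = 2 THEN val END) AS value_id_2
FROM readings
GROUP BY _date
ORDER BY _date
"""

def pivot_by_date(rows):
    """rows: (id, _date, val) tuples. Returns one (date, value_id_1, value_id_2) row per date."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE readings (id INTEGER, _date TEXT, val INTEGER)")
        conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
        return conn.execute(PIVOT_QUERY).fetchall()
    finally:
        conn.close()

# Sample data copied from the question.
SAMPLE_READINGS = [(1, "11/20", 1), (1, "11/21", 2), (2, "11/20", 10), (2, "11/21", 20)]
print(pivot_by_date(SAMPLE_READINGS))
```

A date that has no row for one of the IDs simply yields NULL (None in Python) in that column.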

SQL filter rows without join

I'm always irked by unnecessary joins. But in this case, I wonder if it's possible to avoid a join.
This is an example of the table I have:
id | team | score
1 | 1 | 300
2 | 1 | 257
3 | 2 | 127
4 | 2 | 533
5 | 3 | 459
This is what I want:
team | score | id
1 | 300 | 1
2 | 533 | 4
3 | 459 | 5
I tried a query like this (basically: who's the best player of each team?):
SELECT team, MAX(score) AS score, id
FROM my_table
GROUP BY team
But I get something like this:
team | score | id
1 | 300 | 1
2 | 533 | 3
3 | 459 | 5
But it's not the third player that got 533 points, so the result is inconsistent.
Is it possible to get trustworthy results without joining the table with itself? How can I achieve that?
You can do it without joins by using a subquery like this:
SELECT id, team, score
FROM table1 a
WHERE score = (SELECT MAX(score) FROM table1 b WHERE a.team = b.team);
However, on big tables this can be very slow, as the whole subquery has to run for every row in your table.
There's nothing wrong, though, with using a join to filter results like this:
SELECT a.id, a.team, a.score FROM table1 a
INNER JOIN (
SELECT MAX(score) score, team
FROM table1
GROUP BY team
) b ON a.score = b.score AND a.team = b.team
Although a join is itself fairly expensive, this way you only have to run two actual queries regardless of how many rows are in your tables. So on big tables this method can still be hundreds, if not thousands, of times faster than the first method with the subquery.
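As a quick check that the two approaches above return the same rows, the sketch below runs a correlated-subquery version and a join version against the question's five sample rows. SQLite (through Python's sqlite3) is used as a stand-in for MySQL purely to keep the example self-contained, and the helper name run_on_sample is made up. It says nothing about relative speed, which depends on the engine, the indexes, and the data.

```python
import sqlite3

# Correlated subquery: keep a row if its score equals its team's maximum.
SUBQUERY_VERSION = """
SELECT id, team, score
FROM my_table a
WHERE score = (SELECT MAX(score) FROM my_table b WHERE a.team = b.team)
ORDER BY team
"""

# Join against a derived table holding one (team, max score) row per team.
# Columns are qualified with a. because both sides expose team and score.
JOIN_VERSION = """
SELECT a.id, a.team, a.score
FROM my_table a
INNER JOIN (SELECT team, MAX(score) AS score FROM my_table GROUP BY team) b
        ON a.team = b.team AND a.score = b.score
ORDER BY a.team
"""

# Sample data copied from the question: (id, team, score).
SAMPLE_SCORES = [(1, 1, 300), (2, 1, 257), (3, 2, 127), (4, 2, 533), (5, 3, 459)]

def run_on_sample(query, rows=SAMPLE_SCORES):
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE my_table (id INTEGER, team INTEGER, score INTEGER)")
        conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()

print(run_on_sample(SUBQUERY_VERSION))
print(run_on_sample(JOIN_VERSION))
```

Both versions return every tied row when two players on a team share the top score.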
You can use variables:
SELECT id, team, score
FROM (
SELECT id, team, score,
@seq := IF(@t = team, @seq,
IF(@t := team, @seq + 1, @seq + 1)) AS seq,
@grp := IF(@t2 = team, @grp + 1,
IF(@t2 := team, 1, 1)) AS grp
FROM mytable
CROSS JOIN (SELECT @seq := 0, @t := 0, @grp := 0, @t2 := 0) AS vars
ORDER BY score DESC) AS t
WHERE seq <= 3 AND grp = 1
Variable @seq is incremented each time a new team is met as the records are being processed in descending score order. Variable @grp is used to enumerate records within each team partition. Records with @grp = 1 are the ones having the greatest score value within the team slice.
Unfortunately, MySQL (before version 8.0) doesn't support window functions like ROW_NUMBER(), which could have solved this easily.
There are several ways of doing that:
NOT EXISTS() :
SELECT * FROM YourTable t
WHERE NOT EXISTS(SELECT 1 FROM YourTable s
WHERE t.team = s.team AND s.score > t.score)
IN() :
SELECT * FROM YourTable t
WHERE (t.team,t.score) IN(SELECT s.team,MAX(s.score)
FROM YourTable s
GROUP BY s.team)
A correlated query:
SELECT distinct t.id,t.team,
(SELECT s.score FROM YourTable s
WHERE s.team = t.team
ORDER BY s.score DESC
LIMIT 1)
FROM YourTable t
Or a join which I understand you already have.
EDIT: I take my words back; you can do it with a variable, as in @GiorgosBetsos's solution.
You could do something like this:
SELECT team, score, id
FROM (SELECT *
,RANK() OVER
(PARTITION BY team ORDER BY score DESC) AS Rank
FROM my_table) ranked_result
WHERE Rank = 1;
Some info on Rank functionality: Clicketyclickclick

first N row of each id in MySQL

I have a table for my users scores like this:
id | kills
----------
2 | 1
1 | 1
1 | 5
1 | 3
2 | 4
2 | 5
3 | 5
I want to get the first 2 rows of each player which have more than 2 kills. So the result should look like this
id | kills
----------
1 | 5
1 | 3
2 | 4
2 | 5
3 | 5
I tried this but it doesn't work:
SELECT *
FROM user_stats us
WHERE
(
SELECT COUNT(*)
FROM user_stats f
WHERE f.id=us.id AND f.kills > 2
) <= 2;
I suspect that you just want the two largest values for users that have kills > 2. If so, use variables:
select us.*
from (select us.*,
(@rn := if(@i = id, @rn + 1,
if(@i := id, 1, 1)
)
) as seqnum
from user_stats us cross join
(select @rn := 0, @i := -1) params
where us.kills > 2
order by us.id, kills desc
) us
where seqnum <= 2;
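For comparison, the same "top 2 rows per player with more than 2 kills" selection can be sketched with ROW_NUMBER() instead of user variables. This assumes an engine with window functions (MySQL 8.0 and later qualifies) and runs on SQLite through Python's sqlite3 only to keep the example self-contained. The helper name top_n_per_id is hypothetical.

```python
import sqlite3

# Number the qualifying rows of each id from the highest kills downwards,
# then keep only the first N of each id.
TOP_N_QUERY = """
SELECT id, kills
FROM (
    SELECT id, kills,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY kills DESC) AS seqnum
    FROM user_stats
    WHERE kills > 2
) ranked
WHERE seqnum <= ?
ORDER BY id, kills DESC
"""

# Sample data copied from the question: (id, kills).
SAMPLE_STATS = [(2, 1), (1, 1), (1, 5), (1, 3), (2, 4), (2, 5), (3, 5)]

def top_n_per_id(rows, n):
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE user_stats (id INTEGER, kills INTEGER)")
        conn.executemany("INSERT INTO user_stats VALUES (?, ?)", rows)
        return conn.execute(TOP_N_QUERY, (n,)).fetchall()
    finally:
        conn.close()

print(top_n_per_id(SAMPLE_STATS, 2))
```

The WHERE kills > 2 filter sits inside the derived table, so rows that fail it never receive a row number.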
select * from user_stats
where (id,kills) in (select id, max(kills) from user_stats where kills > 2 group by id
union
select id, min(kills) from user_stats where kills > 2 group by id)
Try this. I am coming from Oracle, where rownum is a count of rows selected. This should have the same effect.
select @rownum:=@rownum+1, us.*
from user_stats us , (select @rownum := 0) r
where id in (
select id from user_stats f
group by id
having count(*) > 2
)
and @rownum < 3;
Based on vkp's response: take the min and max when an id has more than 1 kill
select id, max(kills)
from user_stats
group by id
having count(kills) > 2
union
select id, min(kills)
from user_stats
group by id
having count(kills) > 2
order by id

MySQL find user rank for each category

Let's say I have the following table:
user_id | category_id | points
-------------------------------
1 | 1 | 4
2 | 1 | 2
2 | 1 | 5
1 | 2 | 3
2 | 2 | 2
1 | 3 | 1
2 | 3 | 4
1 | 3 | 8
Could someone please help me to construct a query to return user's rank per category - something like this:
user_id | category_id | total_points | rank
-------------------------------------------
1 | 1 | 4 | 2
1 | 2 | 3 | 1
1 | 3 | 9 | 1
2 | 1 | 7 | 1
2 | 2 | 2 | 2
2 | 3 | 4 | 2
First, you need to get the total points per category. Then you need to enumerate them. In MySQL this is most easily done with variables:
SELECT user_id, category_id, points,
(@rn := if(@cat = category_id, @rn + 1,
if(@cat := category_id, 1, 1)
)
) as rank
FROM (SELECT u.user_id, u.category_id, SUM(u.points) as points
FROM users u
GROUP BY u.user_id, u.category_id
) g cross join
(SELECT @user := -1, @cat := -1, @rn := 0) vars
ORDER BY category_id, points desc;
You want to get the SUM of points for each user within each category_id:
SELECT u.user_id, u.category_id, SUM(u.points)
FROM users AS u
GROUP BY u.user_id, u.category_id
MySQL doesn't have analytic functions like other databases (Oracle, SQL Server) which would be very convenient for returning a result like this.
The first three columns are straightforward, just GROUP BY user_id, category_id and a SUM(points).
Getting the rank column returned is a bit more of a problem. Aside from doing that on the client, if you need to do that in the SQL statement, you could make use of MySQL user-defined variables.
SELECT @rank := IF(@prev_category = r.category_id, @rank+1, 1) AS rank
, @prev_category := r.category_id AS category_id
, r.user_id
, r.total_points
FROM (SELECT @prev_category := NULL, @rank := 1) i
CROSS
JOIN ( SELECT s.category_id, s.user_id, SUM(s.points) AS total_points
FROM users s
GROUP BY s.category_id, s.user_id
ORDER BY s.category_id, total_points DESC
) r
ORDER BY r.category_id, r.total_points DESC, r.user_id DESC
The purpose of the inline view aliased as i is to initialize user defined variables. The inline view aliased as r returns the total_points for each (user_id, category_id).
The "trick" is to compare the category_id value of the previous row with the value of the current row; if they match, we increment the rank by 1. If it's a "new" category, we reset the rank to 1. Note this only works if the rows are ordered by category, and then by total_points descending, so we need the ORDER BY clause. Also note that the order of the expressions in the SELECT list is important; we need to do the comparison of the previous value BEFORE it's overwritten with the current value, so the assignment to @prev_category must follow the conditional test.
Also note that if two users have the same total_points in a category, they will get distinct values for rank... the query above doesn't give the same rank for a tie. (The query could be modified to do that as well, but we'd also need to preserve total_points from the previous row, so we can compare to the current row.)
Also note that this syntax is specific to MySQL, and that this behavior is not guaranteed.
If you need the columns in the particular sequence and/or the rows in a particular order (to get the exact resultset specified), we'd need to wrap the query above as an inline view.
SELECT t.user_id
, t.category_id
, t.total_points
, t.rank
FROM (
SELECT @rank := IF(@prev_category = r.category_id, @rank+1, 1) AS rank
, @prev_category := r.category_id AS category_id
, r.user_id
, r.total_points
FROM (SELECT @prev_category := NULL, @rank := 1) i
CROSS
JOIN ( SELECT s.category_id, s.user_id, SUM(s.points) AS total_points
FROM users s
GROUP BY s.category_id, s.user_id
ORDER BY s.category_id, total_points DESC
) r
ORDER BY r.category_id, r.total_points DESC, r.user_id DESC
) t
ORDER BY t.user_id, t.category_id
NOTE: I've not setup a SQL Fiddle demonstration. I've given an example query which has only been desk checked.
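The tie caveat above goes away if the engine offers RANK(), which assigns equal ranks to equal totals. The sketch below is one way to write it, assuming window-function support (MySQL 8.0 and later has it) and using SQLite through Python's sqlite3 as a self-contained stand-in. The alias rnk is chosen instead of rank because RANK is a reserved word in MySQL 8.0, and the helper name rank_per_category is hypothetical.

```python
import sqlite3

# Sum points per (user, category), then rank the totals inside each category.
RANK_QUERY = """
SELECT user_id, category_id, total_points,
       RANK() OVER (PARTITION BY category_id ORDER BY total_points DESC) AS rnk
FROM (
    SELECT user_id, category_id, SUM(points) AS total_points
    FROM users
    GROUP BY user_id, category_id
) totals
ORDER BY user_id, category_id
"""

# Sample data copied from the question: (user_id, category_id, points).
SAMPLE_POINTS = [
    (1, 1, 4), (2, 1, 2), (2, 1, 5), (1, 2, 3),
    (2, 2, 2), (1, 3, 1), (2, 3, 4), (1, 3, 8),
]

def rank_per_category(rows):
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (user_id INTEGER, category_id INTEGER, points INTEGER)")
        conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
        return conn.execute(RANK_QUERY).fetchall()
    finally:
        conn.close()

for row in rank_per_category(SAMPLE_POINTS):
    print(row)
```

With RANK(), two users tied for first place both get rank 1 and the next user gets rank 3; DENSE_RANK() would give that user rank 2 instead.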

Selecting the next row in a mysql subselect

I've got a table of data that's ordered by a non-primary key, e.g.
id | likes
4 | 6
2 | 5
5 | 2
3 | 2
1 | 2
I need a query to find the row after id #5 which would be id #3.
I've tried using row numbers and written this but it seems really inefficient
select * from (
SELECT l.id,
l.likes,
@curRow := @curRow + 1 AS row_number
FROM sk_posters l
JOIN (SELECT @curRow := 0) r
WHERE active = 'yes'
order by likes desc, id desc)
as mycount where row_number =
(select row_number from (
SELECT l.id,
l.likes,
@curRow := @curRow + 1 AS row_number
FROM sk_posters l
JOIN (SELECT @curRow := 0) r
WHERE active = 'yes'
order by likes desc, id desc)
as mycount
where id=5)+1 limit 1
Is there a better, more efficient way to do this?
You can use limit at the end of query like
SELECT * FROM YOUR_TABLE LIMIT 2,1
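A fixed LIMIT offset only helps when the position of the current row is already known. One alternative, sketched below, is a "seek" query: look up the current row's sort keys, then take the first row that sorts after it in the same (likes DESC, id DESC) order. This is a swapped-in technique, not something proposed in the thread. The helper name next_row is hypothetical, and SQLite (via Python's sqlite3) stands in for MySQL so the example is self-contained.

```python
import sqlite3

# "cur" is the current row; "p" ranges over the rows that come after it when
# sorting by likes DESC, id DESC. LIMIT 1 keeps the immediate successor.
NEXT_ROW_QUERY = """
SELECT p.id, p.likes
FROM sk_posters p
JOIN sk_posters cur ON cur.id = ?
WHERE p.active = 'yes'
  AND (p.likes < cur.likes OR (p.likes = cur.likes AND p.id < cur.id))
ORDER BY p.likes DESC, p.id DESC
LIMIT 1
"""

# Sample data copied from the question: (id, likes); every row is treated as active.
SAMPLE_POSTERS = [(4, 6), (2, 5), (5, 2), (3, 2), (1, 2)]

def next_row(rows, current_id):
    """Returns the (id, likes) row that follows current_id, or None at the end."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sk_posters (id INTEGER PRIMARY KEY, likes INTEGER, active TEXT)")
        conn.executemany("INSERT INTO sk_posters VALUES (?, ?, 'yes')", rows)
        return conn.execute(NEXT_ROW_QUERY, (current_id,)).fetchone()
    finally:
        conn.close()

print(next_row(SAMPLE_POSTERS, 5))
```

Because id breaks ties between equal likes values, the ordering is total and the successor is unambiguous; an index on (likes, id) would likely help on a large table.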