Not able to order column in mysql - mysql

I am using this table from the Northwind dataset (can be generated from query below)
+-----------+-----------+
| NumOrders | CustCount |
+-----------+-----------+
| 1 | 1 |
| 2 | 2 |
| 3 | 7 |
| 4 | 6 |
| 5 | 10 |
| 6 | 8 |
| 7 | 7 |
| 8 | 4 |
| 9 | 5 |
| 10 | 11 |
| 11 | 4 |
| 12 | 3 |
| 13 | 3 |
| 14 | 6 |
| 15 | 3 |
| 17 | 1 |
| 18 | 3 |
| 19 | 2 |
| 28 | 1 |
| 30 | 1 |
| 31 | 1 |
+-----------+-----------+
I want to write a query that produces a histogram of how many customers (x) placed a given number of orders (y), grouped into buckets:
select
    case
        when NumOrders > 0 and NumOrders <= 5 then '0 - 5'
        when NumOrders > 5 and NumOrders <= 10 then '6 - 10'
        else '10+'
    end as Bucket,
    sum(CustCount) as CustomerCount
from (
    select
        NumOrders,
        count(*) as CustCount
    from (
        select *
        from (
            select
                CustomerID,
                count(*) as NumOrders
            from orders
            group by CustomerID
        ) c
    ) b
    group by NumOrders
) a
group by
    case
        when NumOrders > 0 and NumOrders <= 5 then '0 - 5'
        when NumOrders > 5 and NumOrders <= 10 then '6 - 10'
        else '10+'
    end
From the query above I am getting this output, which is ordered incorrectly.
+--------+---------------+
| Bucket | CustomerCount |
+--------+---------------+
| 0 - 5 | 26 |
| 10+ | 28 |
| 6 - 10 | 35 |
+--------+---------------+
I would like it to be ordered as
+--------+---------------+
| Bucket | CustomerCount |
+--------+---------------+
| 0 - 5 | 26 |
| 6 - 10 | 35 |
| 10+ | 28 |
+--------+---------------+
Can someone suggest how to order it correctly?

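You just need
ORDER BY MIN(NumOrders)
at the very end of your query. A plain ORDER BY NumOrders is rejected when ONLY_FULL_GROUP_BY is enabled, because NumOrders is neither aggregated nor part of the GROUP BY.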

I can't see what part of the problem this fails to solve...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(NumOrders SERIAL PRIMARY KEY
,CustCount INT NOT NULL
);
INSERT INTO my_table VALUES
(1 ,1),
(2 ,2),
(3 ,7),
(4 ,6),
(5 ,0),
(6 ,8),
(7 ,7),
(8 ,4),
(9 ,5),
(10,1),
(11,4),
(12,3),
(13,3),
(14,6),
(15,3),
(17,1),
(18,3),
(19,2),
(28,1),
(30,1),
(31,1);
SELECT CASE WHEN numorders BETWEEN 0 AND 5 THEN '0-5'
            WHEN numorders BETWEEN 6 AND 10 THEN '6-10'
            ELSE '+10' END bucket
     , COUNT(*) total
  FROM my_table
 GROUP
    BY bucket
 ORDER
    BY MIN(numorders); -- aggregated, so it is also valid under ONLY_FULL_GROUP_BY
+--------+-------+
| bucket | total |
+--------+-------+
| 0-5 | 5 |
| 6-10 | 5 |
| +10 | 11 |
+--------+-------+
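For comparison, here is a minimal sketch of the same ordering idea run directly against the question's source data. It assumes the Northwind orders table with a CustomerID column, collapses the nested subqueries into one derived table, and sorts the buckets by the smallest order count each one contains. It has not been run against the asker's database.

```sql
-- Assumption: `orders` has one row per order and a CustomerID column.
SELECT
    CASE
        WHEN NumOrders <= 5  THEN '0 - 5'
        WHEN NumOrders <= 10 THEN '6 - 10'
        ELSE '10+'
    END AS Bucket,
    COUNT(*) AS CustomerCount          -- the derived table has one row per customer
FROM (
    SELECT CustomerID, COUNT(*) AS NumOrders
    FROM orders
    GROUP BY CustomerID
) c
GROUP BY Bucket
ORDER BY MIN(NumOrders);               -- numeric sort key instead of the text label
```

Sorting on MIN(NumOrders) avoids the string comparison that puts '10+' before '6 - 10'.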

Related

Count occurrences in MySQL

Let's say that num_table has a column (C_1) in which only numbers from 1 to 35 are stored.
The code to count the numbers in the last 25 rows is:
select num, count(*)
from (select C_1 as num from num_table order by id desc limit 25) n
group by num
order by num asc;
Result:
| num | count(*) |
|------|----------|
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 5 | 2 |
| 10 | 1 |
| 11 | 1 |
| 12 | 1 |
| 15 | 1 |
| 16 | 2 |
| 17 | 1 |
| 20 | 1 |
| 21 | 1 |
| 22 | 1 |
| 23 | 1 |
| 25 | 1 |
| 28 | 2 |
| 29 | 2 |
| 30 | 1 |
| 32 | 2 |
|------|----------|
How can I get a result in which the numbers from 1 to 35 that occurred 0 times within the last 25 rows are also displayed?
Example of desired result:
| num | count(*) |
|------|----------|
| 1 | 0 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 5 | 2 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 0 |
| 10 | 1 |
| ... | ... |
| 35 | 0 |
Maybe the quickest way is to turn your existing query into a subquery and LEFT JOIN your num_table with it, like this:
SELECT A.C_1, IFNULL(cnt,0) total_count
FROM num_table A
LEFT JOIN
(SELECT num, COUNT(*) cnt
FROM (SELECT C_1 AS num FROM num_table ORDER BY id DESC LIMIT 25) n
GROUP BY num) B
ON A.C_1=B.num
GROUP BY A.C_1, cnt
ORDER BY A.C_1 ASC;
Here's a fiddle for reference:
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=3ced94d698fd8a55a8ad07a9d3b42f3d
By the way, the counts in the result you show add up to only 24 even though the first subquery uses LIMIT 25, so the result in my example fiddle is slightly different.
Here is another way to solve your problem.
This solution first builds a table of the numbers 1 to 35 that exists only for the duration of the query, and then left joins the last 25 rows of your existing num_table to it (a left join keeps the numbers whose count is 0). It uses a recursive common table expression, which requires MySQL 8.0 or later.
You can do it like this:
WITH RECURSIVE numbers(id) AS (
    SELECT 1 AS id
    UNION ALL
    SELECT id+1 FROM numbers WHERE id < 35
)
SELECT numbers.id AS num, count(nt.C_1) AS total -- nt only exposes C_1, so count that column
FROM numbers
LEFT JOIN (SELECT C_1 FROM num_table ORDER BY id DESC LIMIT 25) nt ON (nt.C_1 = numbers.id)
GROUP BY numbers.id
ORDER BY numbers.id;
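On servers older than MySQL 8.0, where recursive CTEs are not available, the list of numbers can be built from a small derived table instead. The following is only a sketch under that assumption: the aliases ones, tens and n are made up for illustration, and it relies on the question's num_table having the columns id and C_1.

```sql
SELECT n.num, COUNT(nt.C_1) AS total
FROM (
    -- 0..39 generated as ones + 10 * tens; trimmed to 1..35 in the WHERE clause
    SELECT ones.d + 10 * tens.d AS num
    FROM (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
          UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
          UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) ones
    CROSS JOIN
         (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3) tens
) n
LEFT JOIN (SELECT C_1 FROM num_table ORDER BY id DESC LIMIT 25) nt
       ON nt.C_1 = n.num
WHERE n.num BETWEEN 1 AND 35
GROUP BY n.num
ORDER BY n.num;
```

COUNT(nt.C_1) counts only matched rows, so numbers that do not occur in the last 25 rows come out as 0.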

Adding a moving average column to a table using values from previous 2 entries

I currently have the following simplified tables in my database. The points table contains rows of points awarded to each user for every bid form they have voted in.
I would like to add a column to this table that for each row, it shows the AVERAGE of the previous TWO points awarded to THAT user.
Users
+----+----------------------+
| id | name |
+----+----------------------+
| 1 | Flossie Schamberger |
| 2 | Lawson Graham |
| 3 | Hadley Reilly |
+----+----------------------+
Bid Forms
+----+-----------------+
| id | name |
+----+-----------------+
| 1 | Summer 2017 |
| 2 | Winter 2017 |
| 3 | Summer 2018 |
| 4 | Winter 2019 |
| 5 | Summer 2019 |
+----+-----------------+
Points
+-----+---------+--------------------+------------+------------+
| id | user_id | leave_bid_forms_id | bid_points | date |
+-----+---------+--------------------+------------+------------+
| 1 | 1 | 1 | 6 | 2016-06-19 |
| 2 | 2 | 1 | 8 | 2016-06-19 |
| 3 | 3 | 1 | 10 | 2016-06-19 |
| 4 | 1 | 2 | 4 | 2016-12-18 |
| 5 | 2 | 2 | 8 | 2016-12-18 |
| 6 | 3 | 2 | 4 | 2016-12-18 |
| 7 | 1 | 3 | 10 | 2017-06-18 |
| 8 | 2 | 3 | 12 | 2017-06-18 |
| 9 | 3 | 3 | 4 | 2017-06-18 |
| 10 | 1 | 4 | 4 | 2017-12-17 |
| 11 | 2 | 4 | 4 | 2017-12-17 |
| 12 | 3 | 4 | 2 | 2017-12-17 |
| 13 | 1 | 5 | 16 | 2018-06-17 |
| 14 | 2 | 5 | 12 | 2018-06-17 |
| 15 | 3 | 5 | 10 | 2018-06-17 |
+-----+---------+--------------------+------------+------------+
For each row in the points table I would like an averaged_points column to be calculated as follows.
The averaged_points column is the average of that user's PREVIOUS 2 points. So for the first entry in the table for each user, the average is 0 because no points had been awarded to them before.
The previous 2 points for each user should be determined using the date column.
The table below is what I would like to have as the final output.
For clarity, to the side of the table, I have added the calculation and numbers used to arrive at the value in the averaged_points column.
+-----+---------+--------------------+------------+-----------------+
| id | user_id | leave_bid_forms_id | date | averaged_points |
+-----+---------+--------------------+------------+-----------------+
| 1 | 1 | 1 | 2016-06-19 | 0 | ( 0 + 0 ) / 2
| 2 | 2 | 1 | 2016-06-19 | 0 | ( 0 + 0 ) / 2
| 3 | 3 | 1 | 2016-06-19 | 0 | ( 0 + 0 ) / 2
| 4 | 1 | 2 | 2016-12-18 | 3 | ( 6 + 0 ) / 2
| 5 | 2 | 2 | 2016-12-18 | 4 | ( 8 + 0 ) / 2
| 6 | 3 | 2 | 2016-12-18 | 5 | ( 10 + 0) / 2
| 7 | 1 | 3 | 2017-06-18 | 5 | ( 4 + 6 ) / 2
| 8 | 2 | 3 | 2017-06-18 | 8 | ( 8 + 8 ) / 2
| 9 | 3 | 3 | 2017-06-18 | 7 | ( 4 + 10) / 2
| 10 | 1 | 4 | 2017-12-17 | 7 | ( 10 + 4) / 2
| 11 | 2 | 4 | 2017-12-17 | 10 | ( 12 + 8) / 2
| 12 | 3 | 4 | 2017-12-17 | 4 | ( 4 + 4 ) / 2
| 13 | 1 | 5 | 2018-06-17 | 7 | ( 4 + 10) / 2
| 14 | 2 | 5 | 2018-06-17 | 8 | ( 4 + 12) / 2
| 15 | 3 | 5 | 2018-06-17 | 3 | ( 2 + 4 ) / 2
+-----+---------+--------------------+------------+-----------------+
I've been trying to use subqueries to solve this issue as AVG doesn't seem to be affected by any LIMIT clause I have.
So far I have come up with
select id, user_id, leave_bid_forms_id, `date`,
(
SELECT
AVG(bid_points)
FROM (
Select `bid_points`
FROM points as p2
ORDER BY p2.date DESC
Limit 2
) as thing
) AS average_points
from points as p1
This is in this sqlfiddle but to be honest I'm out of my depth here.
Am I on the right path? Wondering if someone would be able to show me where I need to tweak things please!
Thanks.
EDIT
Using the answer below as a basis, I was able to tweak the SQL to work with the tables provided in the original sqlfiddle.
I have added that to this sqlfiddle to show it working
The corrected sql to match the code above is
select p.*,
IFNULL(( (coalesce(points_1, 0) + coalesce(points_2, 0)) /
( (points_1 is not null) + (points_2 is not null) )
),0) as prev_2_avg
from (select p.*,
(select p2.bid_points
from points p2
where p2.user_id = p.user_id and
p2.date < p.date
order by p2.date desc
limit 1
) as points_1,
(select p2.bid_points
from points p2
where p2.user_id = p.user_id and
p2.date < p.date
order by p2.date desc
limit 1, 1
) as points_2
from points as p
) p;
I am about to ask another question, though, about the best way to make this dynamic in the number of previous points that need to be averaged.
You can use window functions, which were introduced in MySQL 8.
select p.*,
       avg(bid_points) over (partition by user_id
                             order by date
                             rows between 2 preceding and 1 preceding
                            ) as prev_2_avg
from points p;
In earlier versions, this is a real pain, because MySQL does not support nested correlation clauses. One method is to use a separate subquery column for each previous value:
select p.*,
       ( (coalesce(points_1, 0) + coalesce(points_2, 0)) /
         ( (points_1 is not null) + (points_2 is not null) )
       ) as prev_2_avg
from (select p.*,
             (select p2.bid_points
              from points p2
              where p2.user_id = p.user_id and
                    p2.date < p.date
              order by p2.date desc
              limit 1
             ) as points_1,
             (select p2.bid_points
              from points p2
              where p2.user_id = p.user_id and
                    p2.date < p.date
              order by p2.date desc
              limit 1, 1
             ) as points_2
      from points p
     ) p;
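Note that averaging only the previous rows that actually exist gives a user's second row the value 6, whereas the question's expected table shows 3, that is (6 + 0) / 2. If the intent is to always divide by 2 and treat missing rows as 0, a sketch along these lines may be closer. It assumes MySQL 8.0 or later and the question's points table, and has not been tested against the asker's fiddle.

```sql
SELECT p.id, p.user_id, p.leave_bid_forms_id, p.`date`,
       -- SUM over an empty frame is NULL, so the first row per user becomes 0
       COALESCE(SUM(p.bid_points) OVER w, 0) / 2 AS averaged_points
FROM points AS p
WINDOW w AS (PARTITION BY p.user_id
             ORDER BY p.`date`
             ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING)
ORDER BY p.id;
```

To average over the previous N rows instead, change 2 PRECEDING to N PRECEDING and divide by N.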

MySQL comparing rows and getting changes

Kind of lost on anything more than selects and joins and need help with this. I have a table that maintains attributes of products that are created. There are currently 110k rows in that table. I'm looking for a way to query that data and return data related to changes in the attributes of each product.
+-----------+---------+--------+--------+--------+
| attrib_id | prod_id | height | weight | length |
+-----------+---------+--------+--------+--------+
| 1 | 120 | 20 | 3 | 5 |
| 2 | 101 | 5 | 10 | 20 |
| 3 | 101 | 5 | 10 | 20 |
| 4 | 101 | 5 | 10 | 20 |
| 5 | 120 | 20 | 3 | 5 |
| 6 | 101 | 8 | 10 | 20 |
| 7 | 120 | 20 | 3 | 5 |
| 8 | 101 | 8 | 15 | 30 |
| 9 | 101 | 16 | 15 | 20 |
| 10 | 120 | 20 | 10 | 3 |
+-----------+---------+--------+--------+--------+
I would like to see something like this as an output when ever a product attributes change:
+-----------+---------+-------------+------------+------------+-------------+------------+------------+-------------+------------+------------+
| attrib_id | prod_id | orig_height | new_height | chg_height | orig_weight | new_weight | chg_weight | orig_length | new_length | chg_length |
+-----------+---------+-------------+------------+------------+-------------+------------+------------+-------------+------------+------------+
| 6 | 101 | 5 | 8 | 3 | 10 | | | 20 | | |
| 10 | 120 | 20 | | | 3 | 10 | 7 | 5 | 3 | -2 |
+-----------+---------+-------------+------------+------------+-------------+------------+------------+-------------+------------+------------+
Your expected output is a bit incorrect.
You want to find the min and max attrib_id for each product and then use conditional aggregation to pick out the original and new values:
select attrib_id,
prod_id,
original_height,
case when original_height = new_height then null else new_height end new_height,
nullif(new_height - original_height, 0) chg_height,
original_weight,
case when original_weight = new_weight then null else new_weight end new_weight,
nullif(new_weight - original_weight, 0) chg_weight,
original_length,
case when original_length = new_length then null else new_length end new_length,
nullif(new_length - original_length, 0) chg_length
from (
select t2.max_id attrib_id,
t.prod_id,
max(case when t.attrib_id = t2.min_id then t.height end) original_height,
max(case when t.attrib_id = t2.max_id then t.height end) new_height,
max(case when t.attrib_id = t2.min_id then t.weight end) original_weight,
max(case when t.attrib_id = t2.max_id then t.weight end) new_weight,
max(case when t.attrib_id = t2.min_id then t.length end) original_length,
max(case when t.attrib_id = t2.max_id then t.length end) new_length
from t
join (
select prod_id,
min(attrib_id) min_id,
max(attrib_id) max_id
from t
group by prod_id
) t2 on t.prod_id = t2.prod_id
and t.attrib_id in (t2.min_id, t2.max_id)
group by t.prod_id
) t;
OK, I will suggest a completely different approach. I thought of it when reading the words “when ever a product attributes change” in your question. The other answers recompute all joins and aggregates every time, while your table t is essentially a history log that is bound to keep growing, so your query will get slower and slower. My approach is to create a report table and keep it in sync by means of a trigger. You have to start with two empty tables:
DROP TABLE IF EXISTS t;
CREATE TABLE t (attrib_id INT, prod_id INT, height INT, weight INT, length INT);
DROP TABLE IF EXISTS report;
CREATE TABLE report (
attrib_id INT, prod_id INT,
orig_height INT, new_height INT, chg_height INT,
orig_weight INT, new_weight INT, chg_weight INT,
orig_length INT, new_length INT, chg_length INT
);
and then define the trigger:
DROP TRIGGER IF EXISTS trig;
DELIMITER $$
CREATE TRIGGER trig AFTER INSERT ON t FOR EACH ROW BEGIN
DECLARE old_prod_id, old_height, old_weight, old_length INT;
SELECT prod_id, new_height, new_weight, new_length
INTO old_prod_id, old_height, old_weight, old_length
FROM report
WHERE prod_id = NEW.prod_id;
IF ISNULL(old_prod_id) THEN
INSERT INTO report(attrib_id, prod_id, orig_height, orig_weight, orig_length)
VALUES (NEW.attrib_id, NEW.prod_id, NEW.height, NEW.weight, NEW.length);
ELSEIF old_height != NEW.height OR old_weight != NEW.weight OR old_length != NEW.length
OR ISNULL(old_height) -- First change: I suppose checking one field is enough
THEN
UPDATE report SET
attrib_id = NEW.attrib_id,
new_height = NEW.height, chg_height = NEW.height - orig_height,
new_weight = NEW.weight, chg_weight = NEW.weight - orig_weight,
new_length = NEW.length, chg_length = NEW.length - orig_length
WHERE prod_id = NEW.prod_id;
END IF;
END$$
DELIMITER ;
When you fill t with the values you’ve given us, you get:
> SELECT * FROM report;
+-----------+---------+-------------+------------+------------+-------------+------------+------------+-------------+------------+------------+
| attrib_id | prod_id | orig_height | new_height | chg_height | orig_weight | new_weight | chg_weight | orig_length | new_length | chg_length |
+-----------+---------+-------------+------------+------------+-------------+------------+------------+-------------+------------+------------+
| 10 | 120 | 20 | 20 | 0 | 3 | 10 | 7 | 5 | 3 | -2 |
| 9 | 101 | 5 | 16 | 11 | 10 | 15 | 5 | 20 | 20 | 0 |
+-----------+---------+-------------+------------+------------+-------------+------------+------------+-------------+------------+------------+
and you have a much more flexible situation, where you can easily fine-tune your SELECT ... FROM report query as you like it.
I'm not sure if this is what you are looking for, but it tracks all changes:
select attrib_id, prod_id,
height as ori_height, nheight as new_height, (nheight - height) as chg_height,
weight as ori_weight, nweight as new_weight, (nweight - weight) as chg_weight,
length as ori_length, nlength as new_length, (nlength - length) as chg_length
from (
select attr1.attrib_id, attr1.prod_id, attr1.height, attr1.weight, attr1.length
,(select attr2.height from attr attr2
where attr2.prod_id = attr1.prod_id and attr2.attrib_id > attr1.attrib_id
order by attr2.attrib_id limit 1) nheight -- order by makes "next row" deterministic
,(select attr2.weight from attr attr2
where attr2.prod_id = attr1.prod_id and attr2.attrib_id > attr1.attrib_id
order by attr2.attrib_id limit 1) nweight
,(select attr2.length from attr attr2
where attr2.prod_id = attr1.prod_id and attr2.attrib_id > attr1.attrib_id
order by attr2.attrib_id limit 1) nlength
from attr attr1
order by attr1.prod_id, attr1.attrib_id
) calc
;
It returns all changes, row by row:
+-----------+---------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
| attrib_id | prod_id | ori_height | new_height | chg_height | ori_weight | new_weight | chg_weight | ori_length | new_length | chg_length |
+-----------+---------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
| 2 | 101 | 5 | 5 | 0 | 10 | 10 | 0 | 20 | 20 | 0 |
| 3 | 101 | 5 | 5 | 0 | 10 | 10 | 0 | 20 | 20 | 0 |
| 4 | 101 | 5 | 8 | 3 | 10 | 10 | 0 | 20 | 20 | 0 |
| 6 | 101 | 8 | 8 | 0 | 10 | 15 | 5 | 20 | 30 | 10 |
| 8 | 101 | 8 | 16 | 8 | 15 | 15 | 0 | 30 | 20 | -10 |
| 9 | 101 | 16 | NULL | NULL | 15 | NULL | NULL | 20 | NULL | NULL |
+-----------+---------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
| 1 | 120 | 20 | 20 | 0 | 3 | 3 | 0 | 5 | 5 | 0 |
| 5 | 120 | 20 | 20 | 0 | 3 | 3 | 0 | 5 | 5 | 0 |
| 7 | 120 | 20 | 20 | 0 | 3 | 10 | 7 | 5 | 3 | -2 |
| 10 | 120 | 20 | NULL | NULL | 10 | NULL | NULL | 3 | NULL | NULL |
+-----------+---------+------------+------------+------------+------------+------------+------------+------------+------------+------------+
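On MySQL 8.0 or later, the row-by-row comparison can also be written with the LAG window function, keeping only the rows where something differs from the previous row of the same product. This is a sketch, not a tested solution: it assumes the table is named t (as in the first answer) and that attrib_id reflects insertion order.

```sql
SELECT *
FROM (
    SELECT attrib_id,
           prod_id,
           LAG(height) OVER w AS orig_height, height AS new_height,
           LAG(weight) OVER w AS orig_weight, weight AS new_weight,
           LAG(length) OVER w AS orig_length, length AS new_length
    FROM t
    WINDOW w AS (PARTITION BY prod_id ORDER BY attrib_id)
) AS changes
-- The first row of each product has NULL "orig" values, so every
-- comparison is NULL and that row is filtered out.
WHERE orig_height <> new_height
   OR orig_weight <> new_weight
   OR orig_length <> new_length
ORDER BY attrib_id;
```

With the sample data in the question, this keeps the rows with attrib_id 6, 8, 9 and 10.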

SUM from the results of a subquery of N results as max for each user

Let's suppose this schema:
CREATE TABLE test
(
test_Id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
user_Id INT NOT NULL,
date DATE,
result VARCHAR(255) NOT NULL
) engine=innodb;
My goal is to pick up at most the last 5 results for each user_Id, ordered from newest to oldest. In addition, depending on the result column, I want to calculate a ratio over those last results so that I can pick the 3 users with the best ratio.
So let's take this data as example:
test_Id | user_Id | date | result
1 | 1 |2016-09-05 | A
2 | 3 |2016-09-13 | A
3 | 3 |2016-09-30 | A
4 | 4 |2016-09-22 | A
5 | 4 |2016-09-11 | C
6 | 7 |2016-09-18 | D
7 | 4 |2016-09-08 | B
8 | 6 |2016-09-20 | E
9 | 7 |2016-09-16 | A
10 | 7 |2016-09-29 | E
11 | 7 |2016-09-23 | A
12 | 7 |2016-09-16 | B
13 | 4 |2016-09-15 | B
14 | 7 |2016-09-07 | C
15 | 7 |2016-09-09 | A
16 | 3 |2016-09-26 | A
17 | 4 |2016-09-11 | C
18 | 4 |2016-09-30 | E
What I have been able to achieve is this query:
SELECT p.user_Id, p.RowNumber, p.date, p.result,
       SUM(CASE WHEN p.result='A' OR p.result='B'
                THEN 1 ELSE 0 END) as avg
FROM (
    SELECT @row_num := IF(@prev_value=user_Id,@row_num+1,1)
           AS RowNumber, test_Id, user_Id, date, result,
           @prev_value := user_Id
    FROM test,
         (SELECT @row_num := 1) x,
         (SELECT @prev_value := '') y
    WHERE @prev_value < 5
    ORDER BY user_Id, YEAR(date) DESC, MONTH(date) DESC,
             DAY(date) DESC
) p
WHERE p.RowNumber <=10
GROUP BY p.user_Id, p.test_Id
ORDER BY p.user_Id, p.RowNumber;
This query provides me this kind of output:
RowNumber |test_Id | user_Id | date | result | avg
1 | 1 | 1 |2016-09-05 | A | 1
1 | 3 | 3 |2016-09-30 | A | 1
2 | 16 | 3 |2016-09-26 | A | 1
3 | 2 | 3 |2016-09-13 | A | 1
1 | 18 | 4 |2016-09-30 | E | 0
2 | 4 | 4 |2016-09-22 | A | 1
3 | 13 | 4 |2016-09-15 | B | 1
4 | 5 | 4 |2016-09-11 | C | 0
5 | 17 | 4 |2016-09-11 | C | 0
1 | 8 | 6 |2016-09-20 | E | 0
1 | 10 | 7 |2016-09-29 | E | 0
2 | 11 | 7 |2016-09-23 | A | 1
3 | 6 | 7 |2016-09-18 | D | 0
4 | 9 | 7 |2016-09-16 | A | 1
5 | 12 | 7 |2016-09-16 | B | 1
What I was expecting is that the avg column would hold, for each user, the total number of results that match the condition (A or B value), so that I can calculate a ratio from the 5 results for each user_id (0, 0.2, 0.4, 0.6, 0.8, 1).
Something like this:
RowNumber |test_Id | user_Id | date | result | avg
1 | 1 | 1 |2016-09-05 | A | 1
1 | 3 | 3 |2016-09-30 | A | 3
2 | 16 | 3 |2016-09-26 | A | 3
3 | 2 | 3 |2016-09-13 | A | 3
1 | 18 | 4 |2016-09-30 | E | 2
2 | 4 | 4 |2016-09-22 | A | 2
3 | 13 | 4 |2016-09-15 | B | 2
4 | 5 | 4 |2016-09-11 | C | 2
5 | 17 | 4 |2016-09-11 | C | 2
1 | 8 | 6 |2016-09-20 | E | 0
1 | 10 | 7 |2016-09-29 | E | 3
2 | 11 | 7 |2016-09-23 | A | 3
3 | 6 | 7 |2016-09-18 | D | 3
4 | 9 | 7 |2016-09-16 | A | 3
5 | 12 | 7 |2016-09-16 | B | 3
Am I being restricted by the GROUP BY p.user_Id, p.test_Id clause when doing the SUM? I tried the query with only user_Id as GROUP BY clause and only test_Id too as GROUP BY clause, without getting the expected results.
I think you need to calculate the avg and then join
select a.rn,a.test_id,a.user_id,a.date,a.result,u.avg from
(
    select t1.*
    , if (t1.user_id <> @p, @rn:=1,@rn:=@rn+1) rn
    , @p:=t1.user_id p
    from (select @rn:=0, @p:='') rn,test t1
    order by t1.user_id, t1.date desc
) a
join
(
    select s.user_id
    , sum(case when s.result = 'A' or s.result = 'B' then 1 else 0 end) as avg
    from
    (
        select t1.*
        , if (t1.user_id <> @p, @rn:=1,@rn:=@rn+1) rn
        , @p:=t1.user_id p
        from (select @rn:=0, @p:='') rn,test t1
        order by t1.user_id, t1.date desc
    ) s
    where s.rn <= 5
    group by s.user_id
) u on u.user_id = a.user_id
where a.rn <= 5
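If MySQL 8.0 or later is available, the user-variable numbering can be replaced with ROW_NUMBER(), which makes the "last 5 per user" step explicit. The sketch below goes straight to the top 3 users by ratio. Two things in it are assumptions rather than part of the question: ties on date are broken by test_Id, and the ratio divides by the number of rows actually considered (up to 5) rather than always by 5.

```sql
SELECT user_Id,
       SUM(result IN ('A', 'B'))            AS good_results,  -- a comparison evaluates to 0 or 1
       SUM(result IN ('A', 'B')) / COUNT(*) AS ratio
FROM (
    SELECT user_Id, result,
           ROW_NUMBER() OVER (PARTITION BY user_Id
                              ORDER BY `date` DESC, test_Id DESC) AS rn
    FROM test
) ranked
WHERE rn <= 5                 -- keep at most the 5 newest results per user
GROUP BY user_Id
ORDER BY ratio DESC
LIMIT 3;
```

Replace COUNT(*) with 5 if users with fewer than 5 results should be penalised for the missing ones.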

Mysql optimization and explode

I have the following query that displays the top 10 most drawn pairs of numbers from the whole table:
select
p, count(p) as frequency
from
(SELECT
id,
CASE power1 <= power2 WHEN TRUE THEN CONCAT(power1,"-",power2) ELSE CONCAT(power2,"-",power1)
END p
FROM power
UNION
SELECT
id,
CASE power1<=power3 WHEN TRUE THEN CONCAT(power1,"-",power3) ELSE CONCAT(power3,"-",power1) END p
FROM power
UNION
SELECT
id,
CASE power1<=power4 WHEN TRUE THEN CONCAT(power1,"-",power4) ELSE CONCAT(power4,"-",power1) END p
FROM power
UNION
...............................................
SELECT
id,
CASE power19<=power20 WHEN TRUE THEN CONCAT(power19,"-",power20) ELSE CONCAT(power20,"-",power19)
END p
FROM power) as b
group by
p
order by
frequency desc, p asc
limit
0, 10
How can I restrict it to just the latest 100 rows, in descending order by id? The restriction would look like this:
ORDER BY id DESC LIMIT 0,100
but I haven't been able to adapt it to the query above.
Could the code be optimized further?
power1, power2, ... are column values from the table. Would it work if I stored a string like 3,4,5,6 instead, exploded it on "," and then let power1 become 3, power2 become 4, etc.?
I mean the table format to look something like this :
table2
LATER EDIT :
I have table like this :
Table: data
+----+----+-----+
| id | nr | set |
+----+----+-----+
| 1 | 52 | 1 |
| 2 | 47 | 1 |
| 3 | 4 | 1 |
| 4 | 3 | 1 |
| 5 | 77 | 1 |
| 6 | 71 | 1 |
| 7 | 6 | 1 |
| 8 | 41 | 1 |
| 9 | 15 | 1 |
| 10 | 79 | 1 |
| 11 | 35 | 2 |
| 12 | 50 | 2 |
| 13 | 16 | 2 |
| 14 | 1 | 2 |
| 15 | 32 | 2 |
| 16 | 77 | 2 |
| 17 | 30 | 2 |
| 18 | 7 | 2 |
| 19 | 20 | 2 |
| 20 | 28 | 2 |
| .. | .. | ... |
+----+----+-----+
The table has about 34360 ids.
I use the following query:
SELECT
`n1`.`nr` AS `num_1`,
`n2`.`nr` AS `num_2`,
COUNT(1) AS `total`
FROM (select * from data ORDER BY id DESC limit 0,1000) AS `n1`
JOIN `data` AS `n2`
ON `n1`.`set` = `n2`.`set` AND `n1`.`nr` < `n2`.`nr`
GROUP BY `n1`.`nr`, `n2`.`nr`
ORDER BY `total` DESC
LIMIT 20
It is working fine!
I would like to know how I can find the pairs of numbers that have not been drawn together for the longest time. For example:
1,42 (together, as a pair) has not been drawn for 24 draws
32,45 (as a pair as well) has not been drawn for 22 draws
etc
Consider the following:
Un-normalised:
id power1 power2 power3 power4
1 4 9 10 16
2 6 12 15 19
3 2 4 6 7
4 3 8 15 17
5 2 10 11 14
6 4 10 12 19
7 1 4 9 11
Normalised:
id power value
1 1 4
1 2 9
1 3 10
1 4 16
2 1 6
2 2 12
2 3 15
2 4 19
3 1 2
3 2 4
3 3 6
3 4 7
4 1 3
4 2 8
4 3 15
4 4 17
5 1 2
5 2 10
5 3 11
5 4 14
6 1 4
6 2 10
6 3 12
6 4 19
7 1 1
7 2 4
7 3 9
7 4 11
So...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL
,power INT NOT NULL
,value INT NOT NULL
,PRIMARY KEY(id,power)
);
INSERT INTO my_table VALUES
(1,1,4),(1,2,9),(1,3,10),(1,4,16),
(2,1,6),(2,2,12),(2,3,15),(2,4,19),
(3,1,2),(3,2,4),(3,3,6),(3,4,7),
(4,1,3),(4,2,8),(4,3,15),(4,4,17),
(5,1,2),(5,2,10),(5,3,11),(5,4,14),
(6,1,4),(6,2,10),(6,3,12),(6,4,19),
(7,1,1),(7,2,4),(7,3,9),(7,4,11);
SELECT LEAST(x.value,y.value)a -- LEAST/GREATEST is only necessary in the event that
, GREATEST(x.value,y.value) b -- power1 value may be greater than powerN value
, COUNT(*) freq
FROM my_table x
JOIN my_table y
ON y.id = x.id
AND y.power < x.power
GROUP
BY LEAST(x.value, y.value) -- again only necessary if using LEAST/GREATEST above
, GREATEST(x.value,y.value)
ORDER
BY freq DESC
, a
, b;
+----+----+------+
| a | b | freq |
+----+----+------+
| 4 | 9 | 2 |
| 4 | 10 | 2 |
| 12 | 19 | 2 |
| 1 | 4 | 1 |
| 1 | 9 | 1 |
| 1 | 11 | 1 |
| 2 | 4 | 1 |
| 2 | 6 | 1 |
| 2 | 7 | 1 |
| 2 | 10 | 1 |
| 2 | 11 | 1 |
| 2 | 14 | 1 |
| 3 | 8 | 1 |
| 3 | 15 | 1 |
| 3 | 17 | 1 |
| 4 | 6 | 1 |
| 4 | 7 | 1 |
| 4 | 11 | 1 |
| 4 | 12 | 1 |
| 4 | 16 | 1 |
| 4 | 19 | 1 |
| 6 | 7 | 1 |
| 6 | 12 | 1 |
| 6 | 15 | 1 |
| 6 | 19 | 1 |
| 8 | 15 | 1 |
| 8 | 17 | 1 |
| 9 | 10 | 1 |
| 9 | 11 | 1 |
| 9 | 16 | 1 |
| 10 | 11 | 1 |
| 10 | 12 | 1 |
| 10 | 14 | 1 |
| 10 | 16 | 1 |
| 10 | 19 | 1 |
| 11 | 14 | 1 |
| 12 | 15 | 1 |
| 15 | 17 | 1 |
| 15 | 19 | 1 |
+----+----+------+
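With the normalised table, restricting the count to the most recent draws (the "first 100 lines in descending order by ID" part of the question) only needs one more join. The following is a sketch against the my_table definition above. The alias recent is made up for illustration, and it assumes a higher id means a later draw.

```sql
SELECT LEAST(x.value, y.value)    AS a
     , GREATEST(x.value, y.value) AS b
     , COUNT(*)                   AS freq
  FROM my_table x
  JOIN my_table y
    ON y.id = x.id
   AND y.power < x.power          -- each pair within a draw is produced once
  JOIN ( SELECT DISTINCT id       -- the 100 most recent draws
           FROM my_table
          ORDER BY id DESC
          LIMIT 100
       ) recent
    ON recent.id = x.id
 GROUP BY a, b
 ORDER BY freq DESC, a, b
 LIMIT 10;
```

Because the limit is applied to distinct draw ids rather than to individual rows, every selected draw keeps all of its numbers.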
While I fully agree with @Strawberry about normalising your data, the following is an example of how to possibly do it with your current data structure (not tested).
SELECT CASE a.power_val <= b.power_val WHEN TRUE THEN CONCAT(a.power_val,"-",b.power_val) ELSE CONCAT(b.power_val,"-",a.power_val) END p,
COUNT(a.id) as frequency
FROM
(
SELECT id,1 AS power_col, power1 AS power_val FROM power UNION
SELECT id,2, power2 FROM power UNION
SELECT id,3, power3 FROM power UNION
SELECT id,4, power4 FROM power UNION
SELECT id,5, power5 FROM power UNION
SELECT id,6, power6 FROM power UNION
SELECT id,7, power7 FROM power UNION
SELECT id,8, power8 FROM power UNION
SELECT id,9, power9 FROM power UNION
SELECT id,10, power10 FROM power UNION
SELECT id,11, power11 FROM power UNION
SELECT id,12, power12 FROM power UNION
SELECT id,13, power13 FROM power UNION
SELECT id,14, power14 FROM power UNION
SELECT id,15, power15 FROM power UNION
SELECT id,16, power16 FROM power UNION
SELECT id,17, power17 FROM power UNION
SELECT id,18, power18 FROM power UNION
SELECT id,19, power19 FROM power UNION
SELECT id,20, power20 FROM power
ORDER BY id DESC
LIMIT 2000
) a
INNER JOIN
(
SELECT id, 1 AS power_col, power1 AS power_val FROM power UNION
SELECT id, 2, power2 FROM power UNION
SELECT id,3, power3 FROM power UNION
SELECT id,4, power4 FROM power UNION
SELECT id,5, power5 FROM power UNION
SELECT id,6, power6 FROM power UNION
SELECT id,7, power7 FROM power UNION
SELECT id,8, power8 FROM power UNION
SELECT id,9, power9 FROM power UNION
SELECT id,10, power10 FROM power UNION
SELECT id,11, power11 FROM power UNION
SELECT id,12, power12 FROM power UNION
SELECT id,13, power13 FROM power UNION
SELECT id,14, power14 FROM power UNION
SELECT id,15, power15 FROM power UNION
SELECT id,16, power16 FROM power UNION
SELECT id,17, power17 FROM power UNION
SELECT id,18, power18 FROM power UNION
SELECT id,19, power19 FROM power UNION
SELECT id,20, power20 FROM power
ORDER BY id DESC
LIMIT 2000
) b
ON a.id = b.id
AND a.power_col < b.power_col -- '<' rather than '!=' so that each pair is counted once, not twice
GROUP BY p
ORDER BY frequency DESC, p ASC
LIMIT 0,10
Note using normalised data structures would likely be far quicker.
EDIT
Think something like the following might give you what you need.
The big subquery gets every possible combination (the idea is to also cope with pairs that have never been drawn), with the first number smaller than the second for consistency. This is then joined against the data table to get the matching numbers and their respective id fields. MIN is then used to get the smallest id:
SELECT all_combo.num_1,
all_combo.num_2,
MIN(d1.id)
FROM
(
SELECT sub0.nr AS num_1,
sub1.nr AS num_2
FROM
(
SELECT DISTINCT nr
FROM data
) sub0
INNER JOIN
(
SELECT DISTINCT nr
FROM data
) sub1
WHERE sub0.nr < sub1.nr
) all_combo
LEFT OUTER JOIN data d1 ON all_combo.num_1 = d1.nr
LEFT OUTER JOIN data d2 ON all_combo.num_2 = d2.nr AND d1.set = d2.set
GROUP BY all_combo.num_1,
all_combo.num_2
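For the "not drawn together for the longest time" part, one possible reading is to find the latest draw in which each pair appeared and subtract it from the latest draw overall. The sketch below takes that reading and rests on assumptions the question does not confirm: the `set` column is a consecutive draw number, and only pairs that have appeared together at least once are of interest. The alias draws_since_last_seen is made up for illustration.

```sql
SELECT n1.nr AS num_1,
       n2.nr AS num_2,
       -- latest draw overall minus the latest draw containing this pair
       (SELECT MAX(`set`) FROM data) - MAX(n1.`set`) AS draws_since_last_seen
FROM data AS n1
JOIN data AS n2
  ON n1.`set` = n2.`set`      -- same draw
 AND n1.nr < n2.nr            -- each pair once, smaller number first
GROUP BY n1.nr, n2.nr
ORDER BY draws_since_last_seen DESC
LIMIT 20;
```

Pairs that have never been drawn together do not show up here. Covering them would need the all-combinations subquery from the answer above with a LEFT JOIN.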