Sort to have non-consecutive column values - mysql

Is this possible in a MySQL query? The current data table is:
id - fruit - name
1 - Apple - George
2 - Banana - George
3 - Orange - Jake
4 - Berries - Angela
I would like to sort the result of my SELECT query so that the same name never appears on two consecutive rows in the name column.
My desired output would be the following, with no consecutive George in the name column:
id - fruit - name
1 - Apple - George
3 - Orange - Jake
2 - Banana - George
4 - Berries - Angela
Thanks in advance.

In MySQL 8+, you can do:
order by row_number() over (partition by name order by id)
In earlier versions, you can do this using variables.
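To see what that ORDER BY does, here is a minimal Python sketch of the same idea: each row gets its occurrence number within its name, and the rows are sorted by that number. The function name interleave_by_name and the tie-break on id are assumptions made for illustration; the SQL above leaves ties unspecified.

```python
from collections import defaultdict

# Sample rows from the question: (id, fruit, name)
rows = [
    (1, "Apple", "George"),
    (2, "Banana", "George"),
    (3, "Orange", "Jake"),
    (4, "Berries", "Angela"),
]

def interleave_by_name(rows):
    """Sort rows by their per-name occurrence number, then by id (assumed tie-break)."""
    seen = defaultdict(int)
    keyed = []
    for row in sorted(rows, key=lambda r: r[0]):  # walk the rows in id order
        name = row[2]
        seen[name] += 1  # stands in for ROW_NUMBER() within the name partition
        keyed.append((seen[name], row[0], row))
    keyed.sort(key=lambda k: (k[0], k[1]))
    return [row for _, _, row in keyed]

result = interleave_by_name(rows)
for row in result:
    print(row)
```

With the sample data the ids come out as 1, 3, 4, 2, so the two George rows are separated. This approach can still leave the same name adjacent when one name is much more frequent than all the others.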

Another idea...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id SERIAL PRIMARY KEY
,name VARCHAR(12) NOT NULL
);
INSERT INTO my_table VALUES
(1,'George'),
(2,'George'),
(3,'Jake'),
(4,'Angela');
SELECT x.*
FROM my_table x
JOIN my_table y
ON y.name = x.name
AND y.id <= x.id
GROUP
BY x.id
ORDER
BY COUNT(*)
, id;
+----+--------+
| id | name   |
+----+--------+
|  1 | George |
|  3 | Jake   |
|  4 | Angela |
|  2 | George |
+----+--------+
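The self-join works because COUNT(*) for a row x is the number of rows with the same name and an id no greater than x.id, which is that row's position within its name. A small Python sketch of that counting, using the same four rows (the function name order_by_predecessor_count is made up for this illustration):

```python
# Same data as my_table above: (id, name)
rows = [(1, "George"), (2, "George"), (3, "Jake"), (4, "Angela")]

def order_by_predecessor_count(rows):
    """Mimic the self-join: sort by the count of same-name rows with id <= own id, then by id."""
    keyed = []
    for x_id, x_name in rows:
        # COUNT(*) of the join: rows y with y.name = x.name AND y.id <= x.id
        count = sum(1 for y_id, y_name in rows if y_name == x_name and y_id <= x_id)
        keyed.append((count, x_id, x_name))
    keyed.sort()  # ORDER BY COUNT(*), id
    return [(row_id, name) for _, row_id, name in keyed]

ordered = order_by_predecessor_count(rows)
print(ordered)
```

The nested loop mirrors the join and is quadratic in the number of rows, which is fine for a sketch but worth keeping in mind for large tables.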

The following solution works on all MySQL versions, and is mainly useful for versions below 8.0.
In a derived table, first sort your actual table by name and id.
Then, determine the row number of each row among all the rows that have the same name value.
Now, take this result set and sort it by the row number values. All the rows with row number = 1 come first (one for each distinct name), then all the rows with row number = 2, and so on. Hence, rows with the same name won't appear consecutively.
You can try the following using user-defined session variables:
SELECT dt2.id,
       dt2.fruit,
       dt2.name
FROM   (SELECT @row_no := IF(@name_var = dt1.name, @row_no + 1, 1) AS row_num,
               dt1.id,
               dt1.fruit,
               @name_var := dt1.name AS name
        FROM   (SELECT id,
                       fruit,
                       name
                FROM   your_table_name
                ORDER  BY name,
                          id) AS dt1
               CROSS JOIN (SELECT @row_no := 0,
                                  @name_var := '') AS user_init_vars) AS dt2
ORDER  BY dt2.row_num,
          dt2.id

Here is my algorithm:
1. count each name's frequency
2. order by frequency descending and name
3. cut into partitions as large as the maximum frequency
4. number rows within each partition
5. order by row number and partition number
An example: Names A, B, C, D, E
step 1 and 2
------------
AAAAABBCCDDEE
step 3 and 4
------------
12345
AAAAA
BBCCD
DEE
step 5
------
ABDABEACEACAD
The query:
with counted as
(
select id, fruit, name, count(*) over (partition by name) as cnt
from mytable
)
select id, fruit, name
from counted
order by
(row_number() over (order by cnt desc, name) - 1) % max(cnt) over (),
row_number() over (order by cnt desc, name);
Common table expressions (WITH clauses) and window functions (aggregation OVER) are available as of MySQL 8 or MariaDB 10.2. In earlier versions you can fall back on subqueries, although that makes the same query quite long and hard to read. I suppose you could also use variables instead.
DB fiddle demo: https://www.db-fiddle.com/f/8amYX6iRu8AsnYXJYz15DF/1
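The five steps can also be traced outside the database. The Python sketch below is only an illustration of the ordering logic, not a replacement for the query; the function name spread_names is an assumption, and the zero-based position plays the role of row_number() - 1.

```python
from collections import Counter

def spread_names(names):
    """Order names so that equal names are spread max-frequency positions apart."""
    if not names:
        return []
    counts = Counter(names)                # step 1: frequency of each name
    max_count = max(counts.values())
    # step 2: order by frequency descending, then by name
    ordered = sorted(names, key=lambda n: (-counts[n], n))
    # steps 3-5: position % max_count is the row number inside a partition,
    # and the position itself keeps the partitions in order
    indexed = list(enumerate(ordered))
    indexed.sort(key=lambda p: (p[0] % max_count, p[0]))
    return [name for _, name in indexed]

result = "".join(spread_names("AAAAABBCCDDEE"))
print(result)
```

For the input AAAAABBCCDDEE this prints ABDABEACEACAD, the same sequence as step 5 of the example.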

Related

Create custom aggregate function [duplicate]

I'm trying to create an aggregate function MEDIAN() in MySQL, like MIN(), MAX() and AVG(), that takes as input either a column name or a string of concatenated values from the desired column.
I'm having trouble understanding the limitations of MySQL custom functions, and it would be really helpful if someone could show me how this is done.
Example:
MySQL table has 2 columns (ID, num)
+----+-----+
| id | num |
+----+-----+
| 1 | 5 |
| 1 | 6 |
| 1 | 7 |
| 2 | 1 |
| 2 | 3 |
| 2 | 5 |
+----+-----+
SELECT id, MEDIAN(num) as median
FROM table
GROUP BY id;
OR
SELECT id, MEDIAN(GROUP_CONCAT(num SEPARATOR ',')) as median
FROM table
GROUP BY id;
Expected Output is
+----+--------+
| id | median |
+----+--------+
| 1 | 6 |
| 2 | 3 |
+----+--------+
User-defined aggregate stored functions were added in MariaDB 10.3.3.
MySQL supports custom aggregate functions, but they cannot be written in SQL. They need a UDF (a shared library implementation).
EDIT: I am aware that this answer does not directly address the question, since the question is "how to create an aggregate median function in MySQL" and my answer specifically says how to do it without a UDF.
However, the accepted answer says that it is not possible in MySQL, so I gave a solution that provides an aggregate median without having to use a UDF, in case someone wants to calculate aggregate medians anyway.
It is possible to do without a UDF, and I know of two ways to do it. The first uses two selects and a join, the first select to get the values and rankings, and the second select to get the counts, then joins them. The second uses json functions to get everything in one select. They are both a little lengthy, but they work and are reasonably fast.
SOLUTION #1 (two selects and a join, one to get counts, one to get rankings)
SELECT x.group_field,
       avg(
         if(
           x.rnk - y.vol/2 BETWEEN 0 AND 1,
           value_field,
           null
         )
       ) as median
FROM (
  -- the alias is rnk rather than rank, because RANK is a reserved word in MySQL 8.0.2+
  SELECT group_field, value_field,
         @r := IF(@current = group_field, @r+1, 1) as rnk,
         @current := group_field
  FROM (
    SELECT group_field, value_field
    FROM table_name
    ORDER BY group_field, value_field
  ) z, (SELECT @r := 0, @current := '') v
) x, (
  SELECT group_field, count(*) as vol
  FROM table_name
  GROUP BY group_field
) y WHERE x.group_field = y.group_field
GROUP BY x.group_field;
SOLUTION #2 (uses a json object to store the counts and avoids the join)
SELECT group_field,
       avg(
         if(
           rnk - json_extract(@vols, path)/2 BETWEEN 0 AND 1,
           value_field,
           null
         )
       ) as median
FROM (
  -- the alias is rnk rather than rank, because RANK is a reserved word in MySQL 8.0.2+
  SELECT group_field, value_field, path,
         @rnk := if(@curr = group_field, @rnk+1, 1) as rnk,
         @vols := json_set(
           @vols,
           path,
           coalesce(json_extract(@vols, path), 0) + 1
         ) as vols,
         @curr := group_field
  FROM (
    -- table_name is aliased as p so that the p.* references resolve
    SELECT p.group_field, p.value_field, concat('$.', p.group_field) as path
    FROM table_name p
    JOIN (SELECT @curr := '', @rnk := 1, @vols := json_object()) v
    ORDER BY group_field, value_field DESC
  ) z
) y GROUP BY group_field;
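Both solutions rely on the same test: a row contributes to the median when its 1-based rank minus half the group size lies between 0 and 1. The Python sketch below checks that test on small lists. The function name median_by_rank is an assumption for illustration, and the input is assumed to be non-empty.

```python
def median_by_rank(values):
    """Average the values whose 1-based rank satisfies 0 <= rank - count/2 <= 1."""
    ordered = sorted(values)
    vol = len(ordered)  # group size, the role of vol in the SQL
    picked = [
        value
        for rank, value in enumerate(ordered, start=1)
        if 0 <= rank - vol / 2 <= 1  # the BETWEEN 0 AND 1 test
    ]
    return sum(picked) / len(picked)

print(median_by_rank([5, 6, 7]))     # group id = 1 from the question
print(median_by_rank([1, 3, 5]))     # group id = 2 from the question
print(median_by_rank([1, 2, 3, 4]))  # even count: the two middle values are averaged
```

For an odd count only the middle rank passes the test (for three rows, 2 - 1.5 = 0.5). For an even count the two middle ranks pass (for four rows, ranks 2 and 3 give 0 and 1), so AVG returns their mean.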

In MYSQL - Select up to a certain value in a column

I have been searching for a solution for a few weeks but could not solve the problem: is it possible to select only the rows up to a certain value, when that value may repeat further down my table?
I think an example will be useful:
Type | OBID | RECID
5 | T-000032 | 5637637
1 | T-123456 | 5637636
1 | T-789123 | 5637635
2 | T-123456 | 5637634
2 | T-789123 | 5637633
1 | T-221133 | 5637628
2 | T-221133 | 5637612
To explain the example:
This section of my table always starts with Type 5, followed by a couple of rows with Type 1. I only need this particular group of Type 1 rows, up to the point where the first row with Type 2 appears.
I am not interested in any other row with Type 1, only these:
1 | T-123456 | 5637636
1 | T-789123 | 5637635
In other words, only the rows with Type 1 that lie between
the first row with Type 5 and
the first row with Type 2.
I hope you can help me.
Thank you very much.
Chrissy
This feels like a gaps and islands problem, but in this case you just want a single island. One approach is to use subqueries to find:
The highest RECID value where Type=1. This represents the inclusive upper bound of the island.
The highest RECID value where Type!=1, and where the RECID value is also less than the above RECID value. This serves as the exclusive lower bound of the island.
Here is a working query:
SELECT *
FROM yourTable
WHERE Type = 1 AND RECID > (SELECT MAX(RECID) FROM yourTable
WHERE Type <> 1 AND RECID < (SELECT MAX(RECID) FROM yourTable
WHERE Type = 1)) AND
RECID <= (SELECT MAX(RECID) FROM yourTable WHERE Type = 1)
ORDER BY
RECID DESC;
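The two bounds can be checked by hand against the sample data. The Python sketch below mirrors the subqueries; the function name latest_type1_island is an assumption for illustration. It returns an empty list when either bound is missing, which matches the query returning no rows when a MAX() is NULL.

```python
# Sample rows from the question: (Type, OBID, RECID)
rows = [
    (5, "T-000032", 5637637),
    (1, "T-123456", 5637636),
    (1, "T-789123", 5637635),
    (2, "T-123456", 5637634),
    (2, "T-789123", 5637633),
    (1, "T-221133", 5637628),
    (2, "T-221133", 5637612),
]

def latest_type1_island(rows):
    """Return the Type 1 rows between the two bounds, highest RECID first."""
    # Inclusive upper bound: the highest RECID with Type = 1
    upper = max((recid for t, _, recid in rows if t == 1), default=None)
    if upper is None:
        return []
    # Exclusive lower bound: the highest RECID below it with Type <> 1
    lower = max((recid for t, _, recid in rows if t != 1 and recid < upper), default=None)
    if lower is None:
        return []
    island = [r for r in rows if r[0] == 1 and lower < r[2] <= upper]
    return sorted(island, key=lambda r: r[2], reverse=True)

island = latest_type1_island(rows)
for row in island:
    print(row)
```

Here the upper bound is 5637636 and the lower bound is 5637634, so only T-123456 and T-789123 are returned.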
You can try the following on MySQL versions below 8.0:
select * from
(SELECT
    -- compare against the Type column itself: an alias such as PType
    -- cannot be referenced inside the same select list
    @row_number := CASE
        WHEN @Type = Type THEN @row_number + 1
        ELSE 1
    END AS num,
    @Type := Type as PType,
    Type,
    OBID, RECID
 FROM
    tablename order by type, RECID desc
) X where num in (1,2)
Or, on MySQL 8.0+, you can use row_number():
select * from
(
select *, row_number() over(partition by type order by recid desc) as rn
from tablename
)X where rn in (1,2)

Query to Segment Results Based on Equal Sets of Column Value

I'd like to construct a single query (or as few as possible) to group a data set. So given a number of buckets, I'd like to return results based on a specific column.
So given a column called score which is a double which contains:
90.00
91.00
94.00
96.00
98.00
99.00
I'd like to be able to use a GROUP BY clause with a function like:
SELECT MIN(score), MAX(score), SUM(score) FROM table GROUP BY BUCKETS(score, 3)
Ideally this would return 3 rows (grouping the results into 3 buckets with as close to equal count in each group as is possible):
90.00, 91.00, 181.00
94.00, 96.00, 190.00
98.00, 99.00, 197.00
Is there some function that would do this? I'd like to avoid returning all the rows and figuring out the bucket segments myself.
Dave
create table test (
id int not null auto_increment primary key,
val decimal(4,2)
) engine = myisam;
insert into test (val) values
(90.00),
(91.00),
(94.00),
(96.00),
(98.00),
(99.00);
select min(val) as lower, max(val) as higher, sum(val) as total from (
  -- the counter is aliased rn, because ROW is a reserved word in MySQL 8.0.2+
  select id, val, @row := @row+1 as rn
  from test, (select @row := 0) as r order by id
) as t
group by ceil(rn/2)
+-------+--------+--------+
| lower | higher | total |
+-------+--------+--------+
| 90.00 | 91.00 | 181.00 |
| 94.00 | 96.00 | 190.00 |
| 98.00 | 99.00 | 197.00 |
+-------+--------+--------+
3 rows in set (0.00 sec)
Unfortunately, MySQL before 8.0 doesn't have an analytic function like row_number(), so you have to use a variable to emulate it. Once you have a row counter, you can simply use the ceil() function to group the rows in chunks of whatever size you like. I hope this helps.
To derive the chunk size from the number of buckets, count the rows first:
set @r = (select count(*) from test);
select min(val) as lower, max(val) as higher, sum(val) as total from (
  select id, val, @row := @row+1 as rn
  from test, (select @row := 0) as r order by id
) as t
group by ceil(rn/ceil(@r/3))
or, with a single query
select min(val) as lower, max(val) as higher, sum(val) as total from (
  select id, val, @row := @row+1 as rn, tot
  from test, (select count(*) as tot from test) as t2, (select @row := 0) as r order by id
) as t
group by ceil(rn/ceil(tot/3))
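The grouping expression is easier to follow with the arithmetic written out. In the Python sketch below, the function name bucket_stats is an assumption for illustration, and the list order stands in for ORDER BY id. Each row number is divided by the chunk size ceil(total / buckets) and rounded up to get its bucket.

```python
import math

def bucket_stats(scores, buckets):
    """Group rows into chunks of ceil(total / buckets) rows; return (min, max, sum) per chunk."""
    size = math.ceil(len(scores) / buckets)  # rows per bucket
    groups = {}
    for row, score in enumerate(scores, start=1):  # row is the emulated row number
        groups.setdefault(math.ceil(row / size), []).append(score)
    return [(min(g), max(g), sum(g)) for _, g in sorted(groups.items())]

stats = bucket_stats([90.0, 91.0, 94.0, 96.0, 98.0, 99.0], 3)
for low, high, total in stats:
    print(low, high, total)
```

With six rows and three buckets the chunk size is 2, which reproduces the three rows of the result table. When the row count does not divide evenly, rounding the chunk size up can yield fewer groups than requested: four rows with three buckets gives a chunk size of 2 and therefore only two groups.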