Sorting by frequency in MySQL/SQL

Sorting by frequency in MySQL/SQL - mysql

Is there way of sorting by frequency that a value occurs? If a value appears in multiple rows, would we just use the WHERE clause? Is it just about making the query more specific?
As a simple example:
CREATE TABLE mytable
( id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY
, val VARCHAR(15) NOT NULL
);
INSERT INTO mytable (id, val) VALUES
(1,'one')
,(2,'prime')
,(3,'prime')
,(4,'square')
,(5,'prime')
,(6,'six')
,(7,'prime')
,(8,'cube')
,(9,'square')
;
We can write a simple query to return the rows
SELECT t.val
, t.id
FROM mytable t
ORDER BY t.val
But what query do we use to get the most frequently occurring values listed first? To return a result like this:
freq val id
---- ------ --
4 prime 2
4 prime 3
4 prime 5
4 prime 7
2 square 4
2 square 9
1 cube 8
1 one 1
1 six 6
where freq is the frequency (the count of the number of rows) that a value appears in the val column. The value 'prime' appears in four rows, so freq has a value of 4.
What MySQL SELECT query would I use to return a result like this?

Try this:
SELECT A.Freq , A.val , A.id
FROM ( SELECT COUNT(*) AS Freq , val , id
FROM mytable
GROUP BY val , id ) A
ORDER BY Freq DESC ;
EDIT:
As suggested by spencer7593, the id is defined as auto-increment in the table and hence the GROUP BY should not include it. Still, if that would be the case, it is not clear how the result could be as shown. I'm adding here an alternative SELECT that, supposedly, should yield the shown output:
SELECT B.Freq , A.val , A.id
FROM mytable A
INNER JOINT ( SELECT val , COUNT(*) AS Freq
FROM mytable
GROUP BY val) B
ON A.val = B.val
ORDER BY B.Freq DESC ;
[NOTE: This was NOT tested!!!!]

Related

How can I select the second-to-last rows in a mysql table, grouped by column?

Structure is:
CREATE TABLE current
(
id BIGINT NOT NULL AUTO_INCREMENT,
PRIMARY KEY(id),
symbol VARCHAR(5),
UNIQUE (id), INDEX (symbol)
) ENGINE MyISAM;
id
symbol
1
A
2
B
3
C
4
C
5
B
6
A
7
C
8
C
9
A
10
B
I am using the following
SELECT *
FROM current
WHERE id
IN
(
SELECT MAX(id)
FROM current
GROUP BY symbol
)
to return the last records in a table.
id
symbol
8
C
9
A
10
B
How can I return the next-to-last results in a similar fashion?
I know that I need
ORDER BY id DESC LIMIT 1,1
somewhere, but my foo is weak.
I would want to return
id
symbol
5
B
6
A
7
C

For versions of MySql prior to 8.0, use a subquery in the WHERE clause to filter out the max id of each symbol and then aggregate:
SELECT MAX(id) id, symbol
FROM current
WHERE id NOT IN (SELECT MAX(id) FROM current GROUP BY symbol)
GROUP BY symbol
ORDER BY id;
See the demo.

SELECT *
FROM current
WHERE id IN (
SELECT DISTINCT T.id FROM current AS T
WHERE id=(
SELECT id FROM current
WHERE symbol=T.symbol
ORDER BY id DESC LIMIT 1,1
)
)

Easy if your MySql can use ROW_NUMBER. (MySql 8)
Just make it sort descending, then take the 2nd.
WITH CTE AS (
SELECT *
, ROW_NUMBER() OVER (PARTITION BY symbol ORDER BY id DESC) AS symbol_rn
FROM current
)
SELECT id, symbol
FROM CTE
WHERE symbol_rn = 2
ORDER BY id;
In MySql 7.5 you can simply self-join on the symbol, and group by.
Then the 2nd last will have 1 higher id.
SELECT c1.id, c1.symbol
FROM current c1
LEFT JOIN current c2
ON c2.symbol = c1.symbol
AND c2.id >= c1.id
GROUP BY c1.id, c1.symbol
HAVING COUNT(c2.id) = 2
ORDER BY c1.id;
id
symbol
5
B
6
A
7
C
db<>fiddle here
The performance will really benefit from an index on symbol.

You can try this;
SELECT *
FROM current
WHERE id
IN (SELECT MAX(id)
FROM current
GROUP BY symbol)
ORDER BY id DESC LIMIT 1,3
limit 1,3 says; get the last 3 results excluding the last result. You can change the numbers.

Exclude top and bottom n rows in SQL

I'm trying to query a database but excluding the first and last rows from the table. Here's a sample table:
id | val
--------
1 1
2 9
3 3
4 1
5 2
6 6
7 4
In the above example, I'd first like to order it by val and then exclude the first and last rows for the query.
id | val
--------
4 1
5 2
3 3
7 4
6 6
This is the resulting set I would like. Note row 1 and 2 were excluded as they had the lowest and highest val respectively.
I've considered LIMIT, TOP, and a couple of other things but can't get my desired result. If there's a method to do it (even better with first/last % rather than first/last n), I can't figure it out.

You can try this mate:
SELECT * FROM numbers
WHERE id NOT IN (
SELECT id FROM numbers
WHERE val IN (
SELECT MAX(val) FROM numbers
) OR val IN (
SELECT MIN(val) FROM numbers
)
);

You can try this:
Select *
from table
where
val!=(select val from table order by val asc LIMIT 1)
and
val!=(select val from table order by val desc LIMIT 1)
order by val asc;
You can also use UNION and avoid the 2 val!=(query)

;WITH cte (id, val, rnum, qty) AS (
SELECT id
, val
, ROW_NUMBER() OVER(ORDER BY val, id)
, COUNT(*) OVER ()
FROM t
)
SELECT id
, val
FROM cte
WHERE rnum BETWEEN 2 AND qty - 1

What if you use UNION and exclude the val you don't want. Something like below
select * from your_table
where val not in (
select top 1 val from your_table order by val
union
select top 1 val from your_table order by val desc)

select min value of range [0,44) not in a column

I have a table with an int valued column, which has values between 0 and 43 (both included).
I would like a query that returns the min value of the range [0,44) which is not in the table.
For example:
if the table contains: 3,5, 14. The query should return 0
if the table contains: 0,1, 14. The query should return 2
if the table contains: 0,3, 14. The query should return 1
If the table contains all values, the query should return empty.
How can I achieve that?

Since the value you want is either 0 or 1 greater than a value that exists in the table, you can just do;
SELECT MIN(value)
FROM (SELECT 0 value UNION SELECT value+1 FROM MyTable) a
WHERE value < 44 AND value NOT IN (SELECT value FROM MyTable)
An SQLfiddle to test with.

One way would be to create another table that contains the integers in [0,43] and then left join that and look for NULLs, the NULLs will tell you what values are missing.
Suppose you have:
create table numbers (n int not null);
and this table contains the integers from 0 to 43 (inclusive). If your table is t and has a column n which holds the numbers of interest, then:
select n.n
from numbers n left join t on n.n = t.n
where t.n is null
order by n.n
limit 1
should give you the result you're after.
This is a fairly common SQL technique when you're working with a sequence. The most common use is probably calendar tables.

One approach is to generate a set of 44 rows with integer values, and then perform an anti-join against the distinct set of values from the table, and the grab the mininum value.
SELECT MIN(r.val) AS min_val
FROM ( SELECT 0 AS val UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
-- ...
SELECT 44
) r
LEFT
JOIN ( SELECT t.int_valued_col
FROM mytable t
WHERE t.int_valued_col >= 0
AND t.int_valued_col <= 43
GROUP BY t.int_valued_col
) v
ON v.int_valued_col = r.col
WHERE v.int_valued_col IS NULL

A little bit hacky and MySQL-specific:
SELECT NULLIF(MAX(IF(val=#min, #min:=(val+1), #min)), #max) as min_empty
FROM (
SELECT DISTINCT val
FROM table1
-- WHERE val BETWEEN 0 AND 43
ORDER BY val) as vals, (SELECT #min:=0, #max:=44) as init;

GROUP BY - do not group NULL

I'm trying to figure out a way to return results by using the group by function.
GROUP BY is working as expected, but my question is: Is it possible to have a group by ignoring the NULL field. So that it does not group NULLs together because I still need all the rows where the specified field is NULL.
SELECT `table1`.*,
GROUP_CONCAT(id SEPARATOR ',') AS `children_ids`
FROM `table1`
WHERE (enabled = 1)
GROUP BY `ancestor`
So now let's say I have 5 rows and the ancestor field is NULL, it returns me 1 row....but I want all 5.

Perhaps you should add something to the null columns to make them unique and group on that? I was looking for some sort of sequence to use instead of UUID() but this might work just as well.
SELECT `table1`.*,
IFNULL(ancestor,UUID()) as unq_ancestor
GROUP_CONCAT(id SEPARATOR ',') AS `children_ids`
FROM `table1`
WHERE (enabled = 1)
GROUP BY unq_ancestor

When grouping by column Y, all rows for which the value in Y is NULL are grouped together.
This behaviour is defined by the SQL-2003 standard, though it's slightly surprising because NULL is not equal to NULL.
You can work around it by grouping on a different value, some function (mathematically speaking) of the data in your grouping column.
If you have a unique column X then this is easy.
Input
X Y
-------------
1 a
2 a
3 b
4 b
5 c
6 (NULL)
7 (NULL)
8 d
Without fix
SELECT GROUP_CONCAT(`X`)
FROM `tbl`
GROUP BY `Y`;
Result:
GROUP_CONCAT(`foo`)
-------------------
6,7
1,2
3,4
5
8
With fix
SELECT GROUP_CONCAT(`X`)
FROM `tbl`
GROUP BY IFNULL(`Y`, `X`);
Result:
GROUP_CONCAT(`foo`)
-------------------
6
7
1,2
3,4
5
8
Let's take a closer look at how this is working
SELECT GROUP_CONCAT(`X`), IFNULL(`Y`, `X`) AS `grp`
FROM `tbl`
GROUP BY `grp`;
Result:
GROUP_CONCAT(`foo`) `grp`
-----------------------------
6 6
7 7
1,2 a
3,4 b
5 c
8 d
If you don't have a unique column that you can use, you can try to generate a unique placeholder value instead. I'll leave this as an exercise to the reader.

GROUP BY IFNULL(required_field, id)

SELECT table1.*,
GROUP_CONCAT(id SEPARATOR ',') AS children_ids
FROM table1
WHERE (enabled = 1)
GROUP BY ancestor
, CASE WHEN ancestor IS NULL
THEN table1.id
ELSE 0
END

Maybe faster version of previous solution in case you have unique identifier in table1 (let suppose it is table1.id) :
SELECT `table1`.*,
GROUP_CONCAT(id SEPARATOR ',') AS `children_ids`,
IF(ISNULL(ancestor),table1.id,NULL) as `do_not_group_on_null_ancestor`
FROM `table1`
WHERE (enabled = 1)
GROUP BY `ancestor`, `do_not_group_on_null_ancestor`

To union multiple tables and group_concat different column and a sum of the column for the (unique primary or foreign key) column to display a value in the same row
select column1,column2,column3,GROUP_CONCAT(if(column4='', null, column4)) as
column4,sum(column5) as column5
from (
select column1,group_concat(column2) as column2,sum(column3 ) as column3,'' as
column4,'' as column5
from table1
group by column1
union all
select column1,'' as column2,'' as column3,group_concat(column4) as
column4,sum(column5) as column5
from table 2
group by column1
) as t
group by column1

How to find the mode of a set of data before joining with another table?

These are the two tables I am looking at:
k3_alert_types
Type Description
0 No Show
1 Stop Arrival
2 ...
3 ...
4 ...
5 ...
k3_alert
Type
1
22
33
2
4
5
65
33
1
The tables are just examples, as the actual data sets are much larger. What I would like to do is find the mode of types in the k3_alert table, which I have done with the following:
SELECT TYPE , number_of_alerts
FROM
(
SELECT id, TYPE, COUNT(TYPE) AS number_of_alerts FROM k3_alert
GROUP BY TYPE
)t1
WHERE number_of_alerts IN
(
SELECT MAX( count_type ) FROM
(
SELECT id, TYPE , COUNT(TYPE ) AS count_type FROM k3_alert
GROUP BY TYPE
)t
)
I know how to join both tables:
SELECT k3_alert_types.description, k3_alert_types.type as type
FROM k3_alert_types
INNER JOIN k3_alert ON k3_alert_types.type = k3_alert.type
ORDER BY type
But I don't know how to do both at once.
I want to see this as the outcome of the whole process (just an example):
Description Type number_of_alerts
No Show 1 350
Any suggestions?
edit: Server type: MariaDB,
PHP extension: mysql

This should work:
SELECT at.description, at.type, COUNT(*) as number_of_alerts
FROM k3_alert_types at
INNER JOIN k3_alert a ON at.type = a.type
GROUP BY at.description, at.type
ORDER BY number_of_alerts DESC
LIMIT 1

So what I did was I used a CTE to store the value of mode and then selected top 1. If you wanted more flexibility or have a huge dataset you can use a temp table instead of a CTE.
Below is the code:
DECLARE #AlertType TABLE
(Type1 INT,
Descr varchar(20))
INSERT INTO #AlertType
(
Type1,
Descr
)
VALUES
( 1, 'Stop Arrival'),( 0,'No Show')
DECLARE #Alert TABLE
(Type1 INT)
INSERT INTO #Alert
(
Type1
)
VALUES (1),(0),(1),(23),(1),(5),(1)
;WITH CTE AS
(SELECT Type1, COUNT(*) AS number_of_alerts
FROM #Alert
GROUP BY Type1
)
SELECT TOP 1 AT.Descr, t1.Type1, t1.number_of_alerts
FROM CTE AS t1
JOIN #AlertType AS AT
ON AT.Type1 = t1.Type1
ORDER BY t1.number_of_alerts DESC

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

Sorting by frequency in MySQL/SQL - mysql

Related

How can I select the second-to-last rows in a mysql table, grouped by column?

Exclude top and bottom n rows in SQL

select min value of range [0,44) not in a column

GROUP BY - do not group NULL

How to find the mode of a set of data before joining with another table?

Categories

Resources