I'm struggling to work out how to get the most commonly occurring value from a table in MySQL.
Example:
CREATE TABLE words(`letter1` char(1), `letter2` char(1));
INSERT INTO words(`letter1`, `letter2`)VALUES
('A', 'A'), ('B', 'A'), ('C', 'A'), ('D', 'A'), ('D', 'B'),
('B', 'B'), ('D', 'B'), ('A', 'C'), ('B', 'D'), ('D', 'A');
So for letter1 I want to pick out the value 'D' and for letter2 I want to pick out 'A'.
For a tie I'm not too bothered which of the tied values it picks.
Thanks for any help. It looks like it ought to be easy, but I can't figure it out. For one letter it would be easy, but I don't know how to do it for several columns at once.
SELECT ( SELECT letter1
FROM words
GROUP BY 1
ORDER BY COUNT(*) DESC LIMIT 1 ) letter1,
( SELECT letter2
FROM words
GROUP BY 1
ORDER BY COUNT(*) DESC LIMIT 1 ) letter2;
If two or more letters share the maximal number of occurrences, an arbitrary one of them is returned (in most cases the lexicographically smallest).
Will it work for you?
Query for letter1
SELECT letter1 FROM words GROUP BY letter1 ORDER BY COUNT(1) DESC LIMIT 1;
Query for letter2
SELECT letter2 FROM words GROUP BY letter2 ORDER BY COUNT(1) DESC LIMIT 1;
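To check these queries against the sample data, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (the GROUP BY / ORDER BY COUNT(*) pattern is the same). The extra sort key after COUNT(*) is my own addition, only there to make ties deterministic.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (letter1 TEXT, letter2 TEXT)")
conn.executemany(
    "INSERT INTO words (letter1, letter2) VALUES (?, ?)",
    [("A", "A"), ("B", "A"), ("C", "A"), ("D", "A"), ("D", "B"),
     ("B", "B"), ("D", "B"), ("A", "C"), ("B", "D"), ("D", "A")],
)

# One scalar subquery per column; the second sort key breaks ties
# alphabetically (an assumption, the question accepts any tied value).
most_common = conn.execute("""
    SELECT (SELECT letter1 FROM words GROUP BY letter1
            ORDER BY COUNT(*) DESC, letter1 LIMIT 1) AS letter1,
           (SELECT letter2 FROM words GROUP BY letter2
            ORDER BY COUNT(*) DESC, letter2 LIMIT 1) AS letter2
""").fetchone()

print(most_common)  # ('D', 'A')
```

With the sample rows, 'D' appears four times in letter1 and 'A' five times in letter2, so there is no tie at the top for either column.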
For calculations based on individual columns, Akina already provided an answer. Just in case you'd like to find the most frequently used pair, you can try this if you don't care which pair is picked in case of a tie:
select count(*) ct,letter1,letter2 from words group by letter1,letter2 order by ct desc limit 1;
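In the sample data the pairs ('D', 'A') and ('D', 'B') both occur twice, so the pair query really does hit a tie. A small sketch, again with sqlite3 standing in for MySQL, that adds letter1 and letter2 as extra sort keys (my assumption about how one might want to break the tie):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (letter1 TEXT, letter2 TEXT)")
conn.executemany(
    "INSERT INTO words (letter1, letter2) VALUES (?, ?)",
    [("A", "A"), ("B", "A"), ("C", "A"), ("D", "A"), ("D", "B"),
     ("B", "B"), ("D", "B"), ("A", "C"), ("B", "D"), ("D", "A")],
)

# Count each (letter1, letter2) pair; ties on the count are resolved
# alphabetically so the result no longer depends on the engine.
top_pair = conn.execute("""
    SELECT COUNT(*) AS ct, letter1, letter2
    FROM words
    GROUP BY letter1, letter2
    ORDER BY ct DESC, letter1, letter2
    LIMIT 1
""").fetchone()

print(top_pair)  # (2, 'D', 'A')
```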
I have a column with complex user id. I want to replace the text within my select query.
This creates a new column as updated_by for every single value. I want them to be replaced in a single column. How can I achieve this?
select replace(updated_by, '5eaf5d368141560012161636', 'A'),
replace(updated_by, '5e79d03e9abae00012ffdbb3', 'B'),
replace(updated_by, '5e7b501e9abae00012ffdbd6', 'C'),
replace(updated_by, '5e7b5b199abae00012ffdbde', 'D'),
replace(updated_by, '5e7c817c9ca5540012ea6cba', 'E'),
updated_by
from my_table
GROUP BY updated_by;
In Postgres I would use a VALUES expression to form a derived table:
To just select:
SELECT *
FROM my_table m
JOIN (
VALUES
('5eaf5d368141560012161636', 'A')
, ('5e79d03e9abae00012ffdbb3', 'B')
, ('5e7b501e9abae00012ffdbd6', 'C')
, ('5e7b5b199abae00012ffdbde', 'D')
, ('5e7c817c9ca5540012ea6cba', 'E')
) u(updated_by, new_value) USING (updated_by);
Or LEFT JOIN to include rows without replacement.
You may need explicit type casts with non-default data types. See:
Casting NULL type when updating multiple rows
For repeated use, create a persisted translation table.
CREATE TABLE updated_by_translation (updated_by text PRIMARY KEY, new_value text);
INSERT INTO updated_by_translation
VALUES
('5eaf5d368141560012161636', 'A')
, ('5e79d03e9abae00012ffdbb3', 'B')
, ('5e7b501e9abae00012ffdbd6', 'C')
, ('5e7b5b199abae00012ffdbde', 'D')
, ('5e7c817c9ca5540012ea6cba', 'E')
;
Adapt data types and constraints to your actual use case.
SELECT *
FROM my_table m
LEFT JOIN updated_by_translation u USING (updated_by);
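A rough sketch of the translation-table approach, run through Python's sqlite3 module rather than Postgres or MySQL. The id column on my_table, the 'unmapped-user' row, and the COALESCE fallback are assumptions of mine to show what the LEFT JOIN does for rows without a replacement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- my_table.id is a hypothetical column, added only for a stable order.
    CREATE TABLE my_table (id INTEGER PRIMARY KEY, updated_by TEXT);
    CREATE TABLE updated_by_translation (
        updated_by TEXT PRIMARY KEY,
        new_value  TEXT
    );
    INSERT INTO my_table (id, updated_by) VALUES
        (1, '5eaf5d368141560012161636'),
        (2, '5e79d03e9abae00012ffdbb3'),
        (3, 'unmapped-user');
    INSERT INTO updated_by_translation (updated_by, new_value) VALUES
        ('5eaf5d368141560012161636', 'A'),
        ('5e79d03e9abae00012ffdbb3', 'B');
""")

# LEFT JOIN keeps rows without a translation; COALESCE then falls back
# to the original id instead of returning NULL.
translated = conn.execute("""
    SELECT m.id, COALESCE(u.new_value, m.updated_by) AS updated_by
    FROM my_table m
    LEFT JOIN updated_by_translation u ON u.updated_by = m.updated_by
    ORDER BY m.id
""").fetchall()

print(translated)  # [(1, 'A'), (2, 'B'), (3, 'unmapped-user')]
```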
MySQL recently added a VALUES statement, too. The manual:
VALUES is a DML statement introduced in MySQL 8.0.19
But it requires the keyword ROW for every row. So:
...
VALUES
ROW('5eaf5d368141560012161636', 'A')
, ROW('5e79d03e9abae00012ffdbb3', 'B')
, ROW('5e7b501e9abae00012ffdbd6', 'C')
, ROW('5e7b5b199abae00012ffdbde', 'D')
, ROW('5e7c817c9ca5540012ea6cba', 'E')
...
Use a CASE expression:
select case updated_by
when '5eaf5d368141560012161636' then 'A'
when '5e79d03e9abae00012ffdbb3' then 'B'
when '5e7b501e9abae00012ffdbd6' then 'C'
when '5e7b5b199abae00012ffdbde' then 'D'
when '5e7c817c9ca5540012ea6cba' then 'E'
end as updated_by
from my_table
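One thing to be aware of with a CASE expression like this: without an ELSE branch, any value that matches none of the WHEN clauses comes back as NULL. A small sketch with sqlite3 (the id column and the 'someone-else' row are made up for illustration) comparing the two forms:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- id is a hypothetical column, used only to get a stable row order.
    CREATE TABLE my_table (id INTEGER PRIMARY KEY, updated_by TEXT);
    INSERT INTO my_table (id, updated_by) VALUES
        (1, '5eaf5d368141560012161636'),
        (2, '5e79d03e9abae00012ffdbb3'),
        (3, 'someone-else');
""")

# No ELSE: unmatched values become NULL (None in Python).
without_else = conn.execute("""
    SELECT CASE updated_by
             WHEN '5eaf5d368141560012161636' THEN 'A'
             WHEN '5e79d03e9abae00012ffdbb3' THEN 'B'
           END AS updated_by
    FROM my_table ORDER BY id
""").fetchall()

# ELSE updated_by: unmatched values are passed through unchanged.
with_else = conn.execute("""
    SELECT CASE updated_by
             WHEN '5eaf5d368141560012161636' THEN 'A'
             WHEN '5e79d03e9abae00012ffdbb3' THEN 'B'
             ELSE updated_by
           END AS updated_by
    FROM my_table ORDER BY id
""").fetchall()

print(without_else)  # [('A',), ('B',), (None,)]
print(with_else)     # [('A',), ('B',), ('someone-else',)]
```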
The REPLACE calls have to be nested like this:
SELECT
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(updated_by,
'5e7c817c9ca5540012ea6cba',
'E'),
'5e7b5b199abae00012ffdbde',
'D'),
'5e7b501e9abae00012ffdbd6',
'C'),
'5e79d03e9abae00012ffdbb3',
'B'),
'5eaf5d368141560012161636',
'A'),
updated_by
FROM
my_table
GROUP BY updated_by
This replaces every occurrence of each pattern; if a pattern is not found, the value is left unchanged.
You can use a recursive CTE if you need to handle multiple values within a single row:
with recursive replacements as (
select '5eaf5d368141560012161636' as oldval, 'A' as newval union all
select '5e79d03e9abae00012ffdbb3' as oldval, 'B' union all
select '5e7b501e9abae00012ffdbd6' as oldval, 'C' union all
select '5e7b5b199abae00012ffdbde' as oldval, 'D' union all
select '5e7c817c9ca5540012ea6cba' as oldval, 'E'
),
r as (
select r.*, row_number() over (order by oldval) as seqnum
from replacements r
),
cte as (
select r.seqnum, replace(t.updated_by, r.oldval, r.newval) as updated_by
from my_table t join
r
on r.seqnum = 1
union all
select r.seqnum, replace(cte.updated_by, r.oldval, r.newval) as updated_by
from cte join
r
on r.seqnum = cte.seqnum + 1
)
select cte.*
from cte
where seqnum = (select count(*) from replacements);
Here is a toy example:
CREATE TABLE TEST
(
ID INT,
AGG NVARCHAR(20),
GRP NVARCHAR(20)
);
INSERT INTO TEST VALUES
(1, 'AB', 'X'), (2, 'BC', 'X'), (3, 'AC', 'X'),
(4, 'EF', 'Y'), (5, 'FG', 'Y'), (6, 'DC', 'Y'),
(7, 'JI', 'Z'), (8, 'IJ', 'Z'), (9, 'JK', 'Z');
Now, I would like to do this (this is valid code in MySQL, but not in MemSQL):
SELECT
COUNT(*),
SUM(ID),
GROUP_CONCAT(AGG ORDER BY AGG),
GRP
FROM TEST
GROUP BY GRP
So that the output looks like this (Required Output):
3 6 AB,AC,BC X
3 15 DC,EF,FG Y
3 24 IJ,JI,JK Z
Note that the values in the third column are sorted for each row. My output looks like this (Current Wrong Output):
3 6 BC,AB,AC X
3 15 DC,EF,FG Y
3 24 IJ,JI,JK Z
Compare the third column row by row: every list in the required output is sorted, but the first list in my current output is not.
However, since the above query is not valid in MEMSQL, I have to remove the ORDER BY AGG part in GROUP_CONCAT which causes the third column to not be sorted.
As per the documentation of GROUP_CONCAT, the expression can also be a function, however, there is no built in function to sort. I have tried many combinations of SELECT ... ORDER BY statements in GROUP_CONCAT without success. Is this impossible to do, or am I missing something?
I think this works for my case.
SELECT
COUNT(*),
SUM(T.ID),
GROUP_CONCAT(T.AGG),
T.GRP
FROM (
SELECT
*,
RANK() OVER(PARTITION BY GRP ORDER BY AGG) AS R
FROM TEST
) T
GROUP BY T.GRP
ORDER BY T.R
It is rather convoluted, so I hope someone can suggest an improvement.
Try this:
SELECT
COUNT(*),
SUM(ID),
GROUP_CONCAT(AGG),
GRP
FROM TEST
GROUP BY GRP
ORDER BY GROUP_CONCAT(AGG)
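If the engine will not sort inside GROUP_CONCAT, one fallback is to sort each concatenated list on the client after fetching. This is only a sketch of that idea, not a MemSQL-specific fix: it uses Python's sqlite3 module as a stand-in engine and assumes the AGG values never contain a comma themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEST (ID INT, AGG TEXT, GRP TEXT)")
conn.executemany(
    "INSERT INTO TEST VALUES (?, ?, ?)",
    [(1, "AB", "X"), (2, "BC", "X"), (3, "AC", "X"),
     (4, "EF", "Y"), (5, "FG", "Y"), (6, "DC", "Y"),
     (7, "JI", "Z"), (8, "IJ", "Z"), (9, "JK", "Z")],
)

# GROUP_CONCAT without ORDER BY: the order inside each list is not guaranteed.
rows = conn.execute("""
    SELECT COUNT(*), SUM(ID), GROUP_CONCAT(AGG), GRP
    FROM TEST
    GROUP BY GRP
    ORDER BY GRP
""").fetchall()

# Split each list, sort it in Python, and join it back together.
result = [
    (n, total, ",".join(sorted(agg.split(","))), grp)
    for n, total, agg, grp in rows
]

for row in result:
    print(row)
# (3, 6, 'AB,AC,BC', 'X')
# (3, 15, 'DC,EF,FG', 'Y')
# (3, 24, 'IJ,JI,JK', 'Z')
```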
I have a simple table
CREATE TABLE `example` (
`id` int(12) NOT NULL,
`food` varchar(250) NOT NULL
);
With the following data
INSERT INTO `example` (`id`, `food`) VALUES
(1, 'apple'),
(2, 'apple'),
(3, 'apple'),
(4, 'apple'),
(5, 'apple'),
(6, 'apple'),
(7, 'apple'),
(8, 'banana'),
(9, 'banana'),
(10, 'potato'),
(11, 'potato'),
(12, 'potato'),
(13, 'banana'),
(14, 'banana'),
(15, 'banana');
I want to get the oldest 10 rows
SELECT *
FROM example
ORDER BY id ASC
LIMIT 10
But I don't want to get more than 5 rows where food has the same value.
My current query receives 7 apple (more than I want), 2 banana, and 1 potato. In the data provided, I'd want to receive 5 apple, 2 banana, and 3 potato.
How can I accomplish this?
Update:
SQL Group BY, Top N Items for each Group is not a duplicate because it involves a different database. In particular, GROUP BY works differently in SQL Server than it does in MySQL.
You can add a count (in reverse) for each food . . . using variables or a correlated subquery. This will use the latter:
select t.*
from (select t.*,
(select count(*) from example t2 where t2.food = t.food and t2.id >= t.id) as seqnum
from example t
) t
where seqnum <= 5
order by id desc
limit 10;
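The query above counts in reverse and sorts by id descending, so it returns the newest rows. For the oldest rows the question asks about, the same correlated-subquery idea can be run forwards (t2.id <= t.id, ORDER BY id ASC). A sketch of that variant on the sample data, with Python's sqlite3 standing in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (id INTEGER NOT NULL, food TEXT NOT NULL)")
foods = ["apple"] * 7 + ["banana"] * 2 + ["potato"] * 3 + ["banana"] * 3
conn.executemany(
    "INSERT INTO example (id, food) VALUES (?, ?)",
    list(enumerate(foods, start=1)),  # ids 1..15, same rows as the question
)

# seqnum = position of the row within its food, counted from the oldest id.
oldest = conn.execute("""
    SELECT t.id, t.food
    FROM (SELECT e.*,
                 (SELECT COUNT(*) FROM example e2
                  WHERE e2.food = e.food AND e2.id <= e.id) AS seqnum
          FROM example e) t
    WHERE t.seqnum <= 5
    ORDER BY t.id ASC
    LIMIT 10
""").fetchall()

print([row[0] for row in oldest])  # [1, 2, 3, 4, 5, 8, 9, 10, 11, 12]
```

That is 5 apple, 2 banana and 3 potato, which matches the counts asked for in the question.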
I didn't create the table and test this, but it should give you what you want. Just a different approach than the one above.
Select *
From (Select ID, Food
, Count(Food) Over(Partition By Food Order by ID) as Appearances
From Your_Table) as a
Where a.Appearances <= 5
Order By ID Asc
You can obviously put the limit if you want.
I have a schema that looks like this:
CREATE TABLE users
(
id int auto_increment primary key,
name varchar(20),
point int(255)
);
INSERT INTO users
(name, point)
VALUES
('Jack', 1),
('Rick', 5),
('Danny', 11),
('Anthony', 24),
('Barla', 3),
('James', 15),
('Melvin', 12),
('Orthon', 5),
('Kenny', 2),
('Smith', 30),
('Steven', 27),
('Darly', 45),
('Peter', 44),
('Parker', 66),
('Lola', 78),
('Jennifer', 94),
('Smart', 87),
('Jin', 64),
('David', 31),
('Jill', 78),
('Ken', 48),
('Martin', 19),
('Adrian', 20),
('Oliver', 16),
('Ben', 100);
and my SQL is:
select id, name, point from users Order by point desc, rand() LIMIT 5
The problem is that my query does not select 5 rows randomly and then order them by point. Any idea how to solve it? Here is the sqlfiddle:
http://sqlfiddle.com/#!2/18f15/1
select id,name,point from
(select id, name, point from users Order by rand()
LIMIT 5) abc
order by point desc;
problem is, my query does not select 5 row randomly and order them by point.
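That is because your query sorts by point desc first; rand() is only used to break ties between rows with equal points, so LIMIT 5 always returns the five highest-scoring rows.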
select id, name, point from users Order by point desc, rand() LIMIT 5
Try removing point desc from the ORDER BY clause:
select id, name, point from users Order by rand() LIMIT 5
Edit
select id,name,point from
(select id, name, point from users Order by rand() LIMIT 5)temp
order by point desc
Note: There should be no table named temp in your DB (since you are using that name as the alias).
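For anyone who wants to try the subquery approach locally, here is a minimal sketch with Python's sqlite3 module. SQLite spells the function random() instead of MySQL's rand(), and the alias picked is just a name I chose; otherwise the shape of the query is the same: pick 5 random rows in the inner query, then sort those 5 by point in the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "name TEXT, point INTEGER)"
)
conn.executemany(
    "INSERT INTO users (name, point) VALUES (?, ?)",
    [("Jack", 1), ("Rick", 5), ("Danny", 11), ("Anthony", 24), ("Barla", 3),
     ("James", 15), ("Melvin", 12), ("Orthon", 5), ("Kenny", 2), ("Smith", 30),
     ("Steven", 27), ("Darly", 45), ("Peter", 44), ("Parker", 66), ("Lola", 78),
     ("Jennifer", 94), ("Smart", 87), ("Jin", 64), ("David", 31), ("Jill", 78),
     ("Ken", 48), ("Martin", 19), ("Adrian", 20), ("Oliver", 16), ("Ben", 100)],
)

# Inner query: 5 rows in random order. Outer query: sort only those 5.
picked = conn.execute("""
    SELECT id, name, point
    FROM (SELECT id, name, point FROM users ORDER BY random() LIMIT 5) AS picked
    ORDER BY point DESC
""").fetchall()

for row in picked:
    print(row)  # five different users each run, highest point first
```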
What is wrong with this query?
SELECT *, (SELECT COUNT(*)
FROM
(
SELECT NULL
FROM words
WHERE project=projects.id
GROUP BY word
HAVING COUNT(*) > 1
) T1) FROM projects
MySQL returns 1054 Unknown column 'projects.id' in 'where clause'
Thanks
Does this work?
SELECT *, (SELECT COUNT(*)
FROM words
WHERE words.project=projects.id) as pCount
FROM projects
Your inner subquery knows nothing about the outer query, so the projects table is not available.
It looks like you are trying to count for each project the number of words which occur more than once.
You can run your subquery for all projects and then use a JOIN to get the rest of the data from the projects table:
SELECT projects.*, COUNT(word) AS cnt
FROM projects
LEFT JOIN (
SELECT project, word
FROM words
GROUP BY project, word
HAVING COUNT(*) > 1
) T1
ON T1.project = projects.id
GROUP BY projects.id
Result:
id cnt
1 0
2 1
3 2
Test data:
CREATE TABLE projects (id INT NOT NULL);
INSERT INTO projects (id) VALUES (1), (2), (3);
CREATE TABLE words (project INT NOT NULL, word VARCHAR(100) NOT NULL);
INSERT INTO words (project, word) VALUES
(1, 'a'),
(2, 'a'),
(2, 'b'),
(2, 'b'),
(3, 'b'),
(3, 'b'),
(3, 'c'),
(3, 'c');
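The JOIN query and the test data above can be checked end to end with a short script. This sketch uses Python's sqlite3 module in place of MySQL and selects only projects.id and the count so the output lines up with the result table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (id INT NOT NULL);
    INSERT INTO projects (id) VALUES (1), (2), (3);
    CREATE TABLE words (project INT NOT NULL, word VARCHAR(100) NOT NULL);
    INSERT INTO words (project, word) VALUES
        (1, 'a'),
        (2, 'a'), (2, 'b'), (2, 'b'),
        (3, 'b'), (3, 'b'), (3, 'c'), (3, 'c');
""")

# The derived table lists every (project, word) that occurs more than once;
# the LEFT JOIN keeps projects that have no duplicated words (cnt = 0).
counts = conn.execute("""
    SELECT projects.id, COUNT(T1.word) AS cnt
    FROM projects
    LEFT JOIN (
        SELECT project, word
        FROM words
        GROUP BY project, word
        HAVING COUNT(*) > 1
    ) T1 ON T1.project = projects.id
    GROUP BY projects.id
    ORDER BY projects.id
""").fetchall()

print(counts)  # [(1, 0), (2, 1), (3, 2)]
```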