counting comma separated values mysql-postgre - mysql

I have a column called "feedback", and have 1 field called "emotions". In those emotions field, we can see the random values and random length like
emotions
sad, happy
happy, angry, boring
boring
sad, happy, boring, laugh
etc with different values and different length.
so, the question is, what's query to serve the mysql or postgre data:
emotion
count
happy
3
angry
1
sad
2
boring
3
laugh
1
based on SQL: Count of items in comma-separated column in a table we could try using
SELECT value as [Holiday], COUNT(*) AS [Count]
FROM OhLog
CROSS APPLY STRING_SPLIT([Holidays], ',')
GROUP BY value
but it wont help because that is for sql server, not mysql or postgre. or anyone have idea to translation those sqlserver query to mysql?
thank you so much.. I really appreciate it

Using Postgres:
create table emotions(id integer, emotions varchar);
insert into emotions values (1, 'sad, happy');
insert into emotions values (2, 'happy, angry, boring');
insert into emotions values (3, 'boring');
insert into emotions values (4, 'sad, happy, boring, laugh');
select
emotion, count(*)
from
(select
trim(regexp_split_to_table(emotions, ',')) as emotion
from emotions) as t
group by
emotion;
emotion | count
---------+-------
happy | 3
sad | 2
boring | 3
laugh | 1
angry | 1
From String functions regexp_split_to_table will split the string on ',' and return the individual elements as rows. Since there are spaces between the ',' and the word use trim to get rid of the spaces. This then generates a 'table' that is used as a sub-query. In the outer query group by the emotion field and count them.

Try the following using MySQL 8.0:
WITH recursive numbers AS
(
select 1 as n
union all
select n + 1 from numbers where n < 100
)
,
Counts as (
select trim(substring_index(substring_index(emotions, ',', n),',',-1)) as emotions
from Emotions
join numbers
on char_length(emotions) - char_length(replace(emotions, ',', '')) >= n - 1
)
select emotions,count(emotions) as counts from Counts
group by emotions
order by emotions
See a demo from db-fiddle.
The recursive query is to generate numbers from 1 to 100, supposing that the maximum number of sub-strings is 100, you may change this number accordingly.

I've used MySQL 8.0, the query has no string limits. (Thanks to Ahmed for the intuition on recursive clause)
WITH RECURSIVE cte AS (
SELECT ( LENGTH(REGEXP_REPLACE(emotions, ' ?[A-z]+ ?', ''))+1) AS n, emotions AS subs
FROM feedback
UNION ALL
SELECT n-1 AS n, ( SUBSTRING_INDEX(subs, ', ', n-1) ) AS subs
FROM cte
HAVING n>0
)
SELECT SUBSTRING_INDEX(subs, ', ', -1) AS emotions, COUNT(subs) AS cnt
FROM cte
GROUP BY emotions

Related

MySQL - How can I group music together when the names are similar?

I would like to be able to return a single line when the name of some musics are the same or similar, as for example this case:
music with similar names
You can see that the names are the same with an extension like " - JP Ver." or something like that, I would like to be able to group them in one row with the first column incrementing the whole.
My current request to return these lines is as follows:
select count(id) number, name, sec_to_time(floor(sum(duration) / 1000)) time
from track
where user_id = 'value'
group by name, duration
order by number desc, time desc;
I would like to get a result like this
Thank you for reading and responding! I wish you all a good day!
Try:
SELECT COUNT(name) no,
TRIM(SUBSTRING_INDEX(name, '-', 1)) namee
FROM track
GROUP BY namee
Example: https://onecompiler.com/mysql/3xt3bfev6
Use GROUP_CONCAT
Here is a proof of concept script. You can add your other columns. I have grouped by the first 4 letters. You will probably want to use more.
CREATE TABLE track (
idd INT,
nam CHAR(50),
tim INT
);
INSERT INTO track VALUES (1,'Abba 1',5);
INSERT INTO track VALUES (2,'Abba 2',6);
INSERT INTO track VALUES (3,'Beta 1',12);
INSERT INTO track VALUES (4,'Beta 4',8);
SELECT
LEFT(nam,4) AS 'Group',
COUNT(idd) AS 'Number',
GROUP_CONCAT(DISTINCT idd ORDER BY idd ASC SEPARATOR ' & ') AS IDs,
GROUP_CONCAT(DISTINCT nam ORDER BY nam ASC SEPARATOR ', ') AS 'track names',
SUM(tim) AS 'total time'
FROM track
GROUP BY LEFT(nam,4);
DROP TABLE track;
Output
Group Number IDs track names total time
Abba 2 1 & 2 Abba 1, Abba 2 11
Beta 2 3 & 4 Beta 1, Beta 4 20

Is there a way to use aggregate COUNT() values within CASE?

I need to retrieve unique yet truncated part numbers, with their description values being conditionally determined.
DATA:
Here's some simplified sample data:
(the real table has half a million rows)
create table inventory(
partnumber VARCHAR(10),
description VARCHAR(10)
);
INSERT INTO inventory (partnumber,description) VALUES
('12345','ABCDE'),
('123456','ABCDEF'),
('1234567','ABCDEFG'),
('98765','ZYXWV'),
('987654','ZYXWVU'),
('9876543','ZYXWVUT'),
('abcde',''),
('abcdef','123'),
('abcdefg','321'),
('zyxwv',NULL),
('zyxwvu','987'),
('zyxwvut','789');
TRIED:
I've tried too many things to list here.
I've finally found a way to get past all the 'unknown field' errors and at least get SOME results, but:
it's SUPER kludgy!
my results are not limited to unique prods.
Here's my current query:
SELECT
LEFT(i.partnumber, 6) AS prod,
CASE
WHEN agg.cnt > 1
OR i.description IS NULL
OR i.description = ''
THEN LEFT(i.partnumber, 6)
ELSE i.description
END AS `descrip`
FROM inventory i
INNER JOIN (SELECT LEFT(ii.partnumber, 6) t, COUNT(*) cnt
FROM inventory ii GROUP BY ii.partnumber) AS agg
ON LEFT(i.partnumber, 6) = agg.t;
GOAL:
My goal is to retrieve:
prod
descrip
12345
ABCDE
123456
123456
98765
ZYXWV
987654
987654
abcde
abcde
abcdef
abcdef
zyxwv
zyxwv
zyxwvu
zyxwvu
QUESTION:
What are some cleaner ways to use the COUNT() aggregate data with a CASE type conditional?
How can I limit my results so that all prods are UNIQUE?
You can check if a left(partnumber, 6) is not unique in the result by checking if count(*) > 1. In such a case let descrip be left(partnumber, 6). Otherwise you can use max(description) (or min(description)) to get the single description but satisfy the needs to use an aggregation function on columns not in the GROUP BY. To replace empty or NULL descriptions, nullif() and coalesce() can be used.
That would lead to the following using just one level of aggregation and no joins:
SELECT left(partnumber, 6) AS prod,
CASE
WHEN count(*) > 1 THEN
left(partnumber, 6)
ELSE
coalesce(nullif(max(description), ''), left(partnumber, 6))
END AS descrip
FROM inventory
GROUP BY left(partnumber, 6)
ORDER BY left(partnumber, 6);
But there seems to be a bug in MySQL and this query fails. The engine doesn't "see" that, in the list after SELECT partnumber is only used in the expression left(partnumber, 6), which is also in the GROUP BY. Instead the engine falsely complains about partnumber not being in the GROUP BY and not subject to an aggregation function.
As a workaround, we can use a derived table, that does the shortening of partnumber to its first six characters. We then use use that column of the derived table instead of left(partnumber, 6).
SELECT l6pn AS prod,
CASE
WHEN count(*) > 1 THEN
l6pn
ELSE
coalesce(nullif(max(description), ''), l6pn)
END AS descrip
FROM (SELECT left(partnumber, 6) AS l6pn,
description
FROM inventory) AS x
GROUP BY l6pn
ORDER BY l6pn;
Or we slap some actually pointless max()es around the left(partnumber, 6) other than the first, to work around the bug.
SELECT left(partnumber, 6) AS prod,
CASE
WHEN count(*) > 1 THEN
max(left(partnumber, 6))
ELSE
coalesce(nullif(max(description), ''), max(left(partnumber, 6)))
END AS descrip
FROM inventory
GROUP BY left(partnumber, 6)
ORDER BY left(partnumber, 6);
db<>fiddle (Change the DBMS to some other like Postgres or MariaDB to see that they also accept the first query.)

MySQL group by comma seperated list unique [duplicate]

This question already has answers here:
Is storing a delimited list in a database column really that bad?
(10 answers)
Closed 2 years ago.
The column textfield has comma-seperated list values
ID | textfield
1 | english,russian,german
2 | german,french
3 | english
4 | null
I'm attempting to count the amount of languages in textfield. The default language is "English", so if null then "English". The correct amount of languages is 4(english,russian,german,french).
Here is my query to attempt doing this:
SELECT SUM((length(`textfield`) - length(replace(`textfield`, ',', '')) + 1)) as my
FROM yourtable;
The result i get is 6, i don't know how to group the languages.
Here is fiddle
http://sqlfiddle.com/#!9/0e532/1
The desired result is 4. How do i solve?
Identifying the source of error
What your query is doing is counting how many languages in each row, and adding them all together. Your query does not take into account duplicates. Since English shows up twice in the table, it is counted twice (and German, too), hence in your example six. Also, another issue is that your current code considers null as what null truly means, the absence of a value.
For example, if your database was
ID | textfield
---|----------
1 | null
you would also be arriving at incorrect results (more on this below).
Solution
This gets you a comma separated result of the languages, no duplicates.
SELECT
GROUP_CONCAT(DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(textfield, ',', n.digit+1), ',', -1)) textfield
FROM
yourtable
INNER JOIN
(SELECT 0 digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6) n
ON LENGTH(REPLACE(textfield, ',', '')) <= LENGTH(textfield)-n.digit;
This query can serve as a subquery for what you were attempting to do in the question prompt. In other words, instead of the length('textfield') ... you would provide the resulting column name from this query
Null as in English
This logic should not be implemented at the database level, IMHO. If you want to go ahead and consider null entries as English, that is fine. The downside is the example I provided for you before. When you have a query that solves for the total languages in the database, if English wasn't an explicitly stated language and instead just a null value, then the query wouldn't 'count' English (it's null). But you can't just add 1 every time you find the amount of languages because English might already be explicit.
Recommendations:
Avoid comma separated lists in databases by normalizing your data
No value makes sense for a null field
For version 5.6 (like in the fiddle)
SELECT COUNT(DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(languages.textfield, ',', numbers.num), ',', -1)) languages_count
FROM (SELECT COALESCE(textfield, 'english') textfield
FROM yourtable) languages
JOIN (SELECT 1 num UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) numbers
ON numbers.num <= LENGTH(languages.textfield) - LENGTH(REPLACE(languages.textfield, ',', '')) + 1;
fiddle
For version 8.x (as claimed in a comment)
SELECT COUNT(DISTINCT jsontable.value) languages_count
FROM yourtable
CROSS JOIN JSON_TABLE( CONCAT('["', REPLACE(COALESCE(textfield, 'english'), ',', '","'), '"]'),
"$[*]" COLUMNS( value VARCHAR(254) PATH "$" )
) AS jsontable;
fiddle

Find all numbers that are present more then 3 times in a column of CSVs

I basically want this: if certain number is present >= 3 times then do some action ...
My table's column is this:
As you can see here that number 38 is present >= 3 times in absent_sids column, so I want to have some actions on him like ban or something else. But I don't know what sql query should I write because;
1. I am quite new to php/mysql
2. The column has comma separated numbers, and its quite difficult for me to search in this column through mysql query and bring the absent_sid that is >= 3 times in a given period of time/date.
Plz help
This is quite long but working. Steps: 1) convert the array into rows using CHAR_LENGTH and REPLACE function. 2) Use GROUP BY and HAVING COUNT to search for numbers that exists 3 or more times
See demo here: http://sqlfiddle.com/#!9/39afc0/2
SELECT absent_sids
from (
SELECT
tablename.aid,
SUBSTRING_INDEX(
SUBSTRING_INDEX(
tablename.absent_sids, ',', numbers.n), ',', -1) as absent_sids
FROM
(select ORDINAL_POSITION as n
from INFORMATION_SCHEMA.COLUMNS
where table_name='COLUMNS'
and ORDINAL_POSITION <= (
select round(max(length(absent_sids))/2)
from tablename)) numbers
INNER JOIN tablename
ON CHAR_LENGTH(tablename.absent_sids)
-CHAR_LENGTH(REPLACE(tablename.absent_sids, ',', ''))
>= numbers.n-1) tab
GROUP BY absent_sids
HAVING COUNT(*) >= 3

Adding another value below the row

Hey guys i have did some coding in mysql to add a new line value to a row..
SELECT
babe
FROM
(SELECT
concat_ws(' ', 'assword \n') AS babe,
) test;
When i did like this i get an output like
BABE
assword name
What i need is an output like
BABE
assword
name(this would be below assword)
Is there any mysql functions to do this ??...or can i UPDATE the row ??..
I am a newbie in mysql. Hope you guys can help me out..Thanks in advance..
The statement includes a newline character in the babe column. You can confirm this by using the HEX() function to view the character encodings.
For example:
SELECT HEX(t.babe)
FROM ( SELECT CONCAT_WS(' ', 'assword \n') AS babe ) t
On my system, that Will output:
617373776F7264200A
It's easy enough to understand what was returned
a s s w o r d \n
61 73 73 77 6F 72 64 20 0A
(In the original query, there's an extraneous comma that will prevent the statement from running. Perhaps there was another expression in the SELECT list of the inline view, and that was returning the 'name' value that's shown in the example output. But we don't see any reference to that in the outer query.
It's not clear why you need the newline character. If you want to return:
BABE
-----------
asssword
name
That looks like two separate rows to me. But it's valid (but peculiar) to do this:
SELECT t.babe
FROM ( SELECT CONCAT_WS(' ', 'assword \nname') AS babe ) t
FOLLOWUP
Q: i just wanted to know how to add a new row below the assword ..if u know please edit the answer
It's not clear what result you are trying to achieve. The specification, divorced from the context of a use-case, is just bizarre.
A: If I had a need to return two rows: one row with the literal 'assword' and another row "below" it with the literal 'name', I could do this:
( SELECT 'assword' AS some_string )
UNION ALL
( SELECT 'name' AS some_string )
ORDER BY some_string
In this particular case, we can get the ordering we need by a simple reference to the column in the ORDER BY clause.
In the more general case, when there isn't a convenient expression for the ORDER BY clause, I would add an additional column, and perform a SELECT on the resultset from the UNION ALL operation. In this example, that "extra" column is named seq:
SELECT t.some_string
FROM ( SELECT 'assword' AS some_string, 1 AS seq
UNION ALL SELECT 'name', 2
)
ORDER BY t.seq
As another example:
( SELECT 'do' AS tone, 1 AS seq )
UNION ALL ( SELECT 're', 2 )
UNION ALL ( SELECT 'mi', 3 )
UNION ALL ( SELECT 'fa', 4 )
ORDER BY seq
I'd only need to add an outer SELECT if I needed a projection operation (for example, to remove the seq column from the returned resultset.
SELECT t.tone
FROM ( SELECT 'do' AS tone, 1 AS seq
UNION ALL SELECT 're', 2
UNION ALL SELECT 'mi', 3
UNION ALL SELECT 'fa', 4
)
ORDER BY t.seq