Using Count and Group By on table field with "-" - mysql

I am running a query that counts species from an animals table, but I am not getting the desired result with the query below. Currently COUNT counts each distinct value of specie, which is composed of two parts, type and breed (e.g. dog-pitbull), so the query returns 1 for every entry. How can I group and count by type (dog, cat, bird, etc.), disregarding breed? SQLFiddle
Query
SELECT specie, COUNT(*) as Total FROM animals GROUP BY specie;
Schema
CREATE TABLE animals
(`id` int, `name` varchar(20), `specie` varchar(55))
;
INSERT INTO animals
(`id`, `name`, `specie`)
VALUES
(1, 'dougie', 'dog-poodle'),
(2, 'bonzo', 'dog-pitbull'),
(3, 'cadi', 'cat-persian'),
(4, 'mr.turtle', 'turtle-snapping'),
(5, 'spotty', 'turtle-spotted'),
(6, 'tweety', 'bird-canary')
;

This query gives you the animal type and the count, i.e. 2 dog, 1 cat, 2 turtle, 1 bird.
It looks at the value of specie and returns the part before the first - along with the count.
SELECT
SUBSTRING_INDEX(specie,'-',1) AS specie
, COUNT(*) AS total
FROM animals
GROUP BY SUBSTRING_INDEX(specie,'-',1);
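To try the grouping without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no SUBSTRING_INDEX, so substr() and instr() stand in for it here; that substitution and the animal_type alias are assumptions of this sketch, not part of the answer above. The data is the schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE animals (id INTEGER, name TEXT, specie TEXT)")
conn.executemany(
    "INSERT INTO animals VALUES (?, ?, ?)",
    [
        (1, "dougie", "dog-poodle"),
        (2, "bonzo", "dog-pitbull"),
        (3, "cadi", "cat-persian"),
        (4, "mr.turtle", "turtle-snapping"),
        (5, "spotty", "turtle-spotted"),
        (6, "tweety", "bird-canary"),
    ],
)

# substr(specie, 1, instr(specie, '-') - 1) keeps the text before the first '-',
# which matches SUBSTRING_INDEX(specie, '-', 1) for values that contain a '-'.
rows = conn.execute(
    """
    SELECT substr(specie, 1, instr(specie, '-') - 1) AS animal_type,
           COUNT(*) AS total
    FROM animals
    GROUP BY animal_type
    ORDER BY animal_type
    """
).fetchall()

for animal_type, total in rows:
    print(animal_type, total)
```

Note that a value without any '-' would come back empty from this substr()/instr() stand-in, whereas SUBSTRING_INDEX returns the whole string in that case.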

Use SUBSTRING_INDEX:
SELECT SUBSTRING_INDEX(specie,'-',1), specie, COUNT(*) AS Total
FROM animals GROUP BY SUBSTRING_INDEX(specie,'-',1);
RESULT:
bird bird-canary 1
cat cat-persian 1
dog dog-poodle 2
turtle turtle-snapping 2
Documentation: SUBSTRING_INDEX

Related

counting comma separated values mysql-postgre

I have a table called "feedback" with one field called "emotions". That field holds a comma-separated list of values, with varying values and varying length, like:
emotions
sad, happy
happy, angry, boring
boring
sad, happy, boring, laugh
and so on, with different values and different lengths.
So the question is: what query, in MySQL or PostgreSQL, produces the following result?
emotion count
happy 3
angry 1
sad 2
boring 3
laugh 1
Based on "SQL: Count of items in comma-separated column in a table", we could try using
SELECT value as [Holiday], COUNT(*) AS [Count]
FROM OhLog
CROSS APPLY STRING_SPLIT([Holidays], ',')
GROUP BY value
but that won't help because it is for SQL Server, not MySQL or PostgreSQL. Does anyone have an idea how to translate this SQL Server query to MySQL?
Thank you so much, I really appreciate it.
Using Postgres:
create table emotions(id integer, emotions varchar);
insert into emotions values (1, 'sad, happy');
insert into emotions values (2, 'happy, angry, boring');
insert into emotions values (3, 'boring');
insert into emotions values (4, 'sad, happy, boring, laugh');
select
emotion, count(*)
from
(select
trim(regexp_split_to_table(emotions, ',')) as emotion
from emotions) as t
group by
emotion;
emotion | count
---------+-------
happy | 3
sad | 2
boring | 3
laugh | 1
angry | 1
From the String functions documentation: regexp_split_to_table splits the string on ',' and returns the individual elements as rows. Since there are spaces between the ',' and the following word, trim is used to remove them. This generates a 'table' that is used as a sub-query. The outer query then groups by the emotion field and counts the rows.
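The same split, trim, group and count steps can be sketched outside the database. This is only an illustration in plain Python of what the query does to the sample rows from the question; the variable names are made up for the sketch.

```python
from collections import Counter

# Sample values of the emotions column, as given in the question.
feedback = [
    "sad, happy",
    "happy, angry, boring",
    "boring",
    "sad, happy, boring, laugh",
]

# Split each row on ',', trim the surrounding spaces, then count each item.
counts = Counter(
    item.strip()
    for row in feedback
    for item in row.split(",")
)

for emotion, count in sorted(counts.items()):
    print(emotion, count)
```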
Try the following using MySQL 8.0:
WITH recursive numbers AS
(
select 1 as n
union all
select n + 1 from numbers where n < 100
)
,
Counts as (
select trim(substring_index(substring_index(emotions, ',', n),',',-1)) as emotions
from Emotions
join numbers
on char_length(emotions) - char_length(replace(emotions, ',', '')) >= n - 1
)
select emotions,count(emotions) as counts from Counts
group by emotions
order by emotions
See a demo from db-fiddle.
The recursive query generates the numbers from 1 to 100, on the assumption that no row holds more than 100 sub-strings; change this limit as needed.
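The nested substring_index call is the least obvious part of that query, so here is a rough Python emulation of it. The substring_index helper below is a hypothetical stand-in written for this sketch (it only covers a non-empty delimiter); it is not a library function.

```python
def substring_index(text: str, delim: str, count: int) -> str:
    """Rough emulation of MySQL's SUBSTRING_INDEX for a non-empty delimiter."""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the count-th delimiter from the end.
        return delim.join(parts[count:])
    return ""


def nth_item(text: str, n: int) -> str:
    # Keep the first n items, then take the last one of those, then trim.
    return substring_index(substring_index(text, ",", n), ",", -1).strip()


row = "happy, angry, boring"
# Number of items = number of commas + 1, the same test as the join condition.
n_items = len(row) - len(row.replace(",", "")) + 1
items = [nth_item(row, n) for n in range(1, n_items + 1)]
print(items)
```

Each n from the numbers table plays the role of the loop variable here: a row with k commas joins to n = 1 .. k + 1 and so yields one output row per item.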
I've used MySQL 8.0; this query has no limit on the number of items per string. (Thanks to Ahmed for the idea of using a recursive clause.)
WITH RECURSIVE cte AS (
SELECT ( LENGTH(REGEXP_REPLACE(emotions, ' ?[A-z]+ ?', ''))+1) AS n, emotions AS subs
FROM feedback
UNION ALL
SELECT n-1 AS n, ( SUBSTRING_INDEX(subs, ', ', n-1) ) AS subs
FROM cte
HAVING n>0
)
SELECT SUBSTRING_INDEX(subs, ', ', -1) AS emotions, COUNT(subs) AS cnt
FROM cte
GROUP BY emotions

How to select all records based on non-duplication of one column

I just cannot seem to find an answer for this deceptively simple question. Most every solution either deletes all the duplicates, selects all the duplicates, or selects all the records except the duplicates. How can I select all rows such that, in this example, the "name" column values are unique, while selecting the first record of any duplicate set and ignoring the remaining duplicates of that same name? I do need all the values from all the columns in all the records in the selected record set.
Given the set of records:
pk fk name secs note
1 100 cat 90 gray
2 111 dog 123 mix
3 233 fish 75 gold
4 334 dog 932 black
5 238 cow 90 stray
6 285 cat 90 stray
The returned set should be:
pk fk name secs note
1 100 cat 90 gray
2 111 dog 123 mix
3 233 fish 75 gold
5 238 cow 90 stray
-- SQL
drop table if exists foo;
create table foo (
pk int unsigned,
fk int unsigned,
name varchar(10),
secs int,
note varchar(10),
primary key (pk)
) engine=innodb default charset=utf8;
insert into foo
(pk, fk, name, secs, note)
values
(1, 100, 'cat', 90, 'gray'),
(2, 111, 'dog', 123, 'mix'),
(3, 233, 'fish', 75, 'gold'),
(4, 334, 'dog', 932, 'black'),
(5, 238, 'cow', 90, 'stray'),
(6, 285, 'cat', 90, 'stray');
Here is a query that returns what you're looking for:
SELECT
F.*
FROM
foo F
INNER JOIN
( SELECT
MIN(F2.pk) AS `pk`
,F2.name
FROM foo F2
GROUP BY F2.name
) T
ON T.pk = F.pk;
Hope this will help you.
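As a quick check of the MIN(pk) join against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is used only so the example runs without a MySQL server; the engine and charset options from the question's DDL are dropped for that reason.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo (pk INTEGER PRIMARY KEY, fk INTEGER, "
    "name TEXT, secs INTEGER, note TEXT)"
)
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?, ?, ?)",
    [
        (1, 100, "cat", 90, "gray"),
        (2, 111, "dog", 123, "mix"),
        (3, 233, "fish", 75, "gold"),
        (4, 334, "dog", 932, "black"),
        (5, 238, "cow", 90, "stray"),
        (6, 285, "cat", 90, "stray"),
    ],
)

# The derived table picks the lowest pk per name; joining back on pk
# returns the full row for that first occurrence only.
rows = conn.execute(
    """
    SELECT f.*
    FROM foo AS f
    INNER JOIN (SELECT MIN(pk) AS pk, name FROM foo GROUP BY name) AS t
        ON t.pk = f.pk
    ORDER BY f.pk
    """
).fetchall()

for row in rows:
    print(row)
```

The ORDER BY is added here only to make the output order predictable.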
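I think you are looking for this (it runs in MySQL only when ONLY_FULL_GROUP_BY is disabled, and which row is returned for each name is not guaranteed):
select * from foo group by name;
In PostgreSQL the same idea can be written with DISTINCT ON:
select distinct on (name) * from foo order by name, pk;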
With a window function (MySQL 8.0+, PostgreSQL, SQL Server) it will be
select pk, fk, name, secs, note
from (select row_number()
over(partition by name order by pk) as rn, foo.*
from foo) as tbl
where rn=1

group by - with distinct results incl. count

Short question about the statement "group by" in mysql:
My current db structure looks like:
CREATE TABLE TableName
(
ID int primary key,
name varchar(255),
number varchar(255)
);
INSERT INTO TableName
(ID, name, number)
VALUES
(1, "Test 1", "100000"),
(2, "Apple", "200000"),
(3, "Test 1 beta", "100000"),
(4, "BLA", "300000"),
(5, "ABU", "400000"),
(6, "CBA", "700000"),
(7, "ABC", "600000"),
(8, "Orange - Test", "400000"),
(9, "ABC", "");
My current statement looks like:
SELECT name, number, count(*) as Anzahl
FROM TableName
group by name,number
with this statement the result looks like:
NAME NUMBER ANZAHL
ABC 1
Test 1 100000 2
Apple 200000 1
BLA 300000 1
ABU 400000 2
ABC 600000 1
CBA 700000 1
But the rows for "ABC" are not merged.
The result should look like:
NAME NUMBER ANZAHL
Test 1 100000 2
Apple 200000 1
BLA 300000 1
ABU 400000 2
ABC 600000 2
CBA 700000 1
Any ideas how this could work?
SQLFiddle:
http://sqlfiddle.com/#!2/dcbee/1
The solution must perform well on something like 1,000,000+ rows.
First of all, IMHO it's bad design to store numbers in a character column; working with integers is faster than working with characters. That said, I assume all values in the number column will be numbers. Here is a query that avoids multiple ABC rows:
SELECT name,
SUM(convert(number, SIGNED INTEGER)) as number,
count(*) as Anzahl
FROM TableName
GROUP BY name ;
This is what I suggest (SQL Fiddle link: http://sqlfiddle.com/#!2/c6f83b/5/0).
As @Parag said, I strongly urge you to change the table definition.
Then the SQL is easy:
SELECT name, number, COUNT(*) AS anzahl
FROM tablename
WHERE number IS NOT NULL
GROUP BY name, number
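Neither query above reproduces the expected table from the question exactly, so here is a hedged sketch of one possible reading of it: rows are grouped by number, and a row with an empty number borrows the number of another row with the same name. That interpretation, the filled_number alias, and the use of SQLite (via Python's built-in sqlite3) instead of MySQL are all assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (ID INTEGER PRIMARY KEY, name TEXT, number TEXT)")
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?)",
    [
        (1, "Test 1", "100000"),
        (2, "Apple", "200000"),
        (3, "Test 1 beta", "100000"),
        (4, "BLA", "300000"),
        (5, "ABU", "400000"),
        (6, "CBA", "700000"),
        (7, "ABC", "600000"),
        (8, "Orange - Test", "400000"),
        (9, "ABC", ""),
    ],
)

# Inner query: replace an empty number with a non-empty number found on
# another row of the same name. Outer query: group by the filled-in number.
rows = conn.execute(
    """
    SELECT MIN(name) AS name, filled_number AS number, COUNT(*) AS anzahl
    FROM (
        SELECT t.name,
               COALESCE(
                   NULLIF(t.number, ''),
                   (SELECT MAX(x.number)
                    FROM TableName AS x
                    WHERE x.name = t.name AND x.number <> '')
               ) AS filled_number
        FROM TableName AS t
    ) AS filled
    GROUP BY filled_number
    ORDER BY filled_number
    """
).fetchall()

for row in rows:
    print(row)
```

On a table with a million rows the correlated subquery would want an index on name, and the performance of this approach has not been measured here.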

SQL Query for exact match in many to many relation

I have the following tables(only listing the required attributes)
medicine (id, name),
generic (id, name),
med_gen (med_id references medicine(id),gen_id references generic(id), potency)
Sample Data
medicine
(1, 'Crocin')
(2, 'Stamlo')
(3, 'NT Kuf')
generic
(1, 'Hexachlorodine')
(2, 'Methyl Benzoate')
med_gen
(1, 1, '100mg')
(1, 2, '50ml')
(2, 1, '100mg')
(2, 2, '60ml')
(3, 1, '100mg')
(3, 2, '50ml')
I want all the medicines which are equivalent to a given medicine. Two medicines are equivalent if they have the same generics and the same potency for each generic. In the sample data above, all three have the same generics, but only 1 and 3 also have the same potency for the corresponding generics, so 1 and 3 are equivalent medicines.
I want to find out equivalent medicines given a medicine id.
NOTE : One medicine may have any number of generics. Medicine table has around 102000 records, generic table around 2200 and potency table around 200000 records. So performance is a key point.
NOTE 2 : The database used is MySQL.
One way to do it in MySQL is to leverage the GROUP_CONCAT() function. Both lists are ordered by gen_id so that each potency stays aligned with its generic:
SELECT g.med_id
FROM
(
SELECT med_id, GROUP_CONCAT(gen_id ORDER BY gen_id) gen_id, GROUP_CONCAT(potency ORDER BY gen_id) potency
FROM med_gen
WHERE med_id = 1 -- here 1 is med_id for which you're trying to find analogs
GROUP BY med_id
) o JOIN
(
SELECT med_id, GROUP_CONCAT(gen_id ORDER BY gen_id) gen_id, GROUP_CONCAT(potency ORDER BY gen_id) potency
FROM med_gen
WHERE med_id <> 1 -- here 1 is med_id for which you're trying to find analogs
GROUP BY med_id
) g
ON o.gen_id = g.gen_id
AND o.potency = g.potency
Output:
| MED_ID |
|--------|
| 3 |
Here is SQLFiddle demo
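The idea behind the comparison is that two medicines are equivalent when their sets of (generic, potency) pairs are identical. A minimal plain-Python sketch of that rule on the sample med_gen rows follows; the equivalents function is a hypothetical helper for illustration, not a translation of the SQL.

```python
from collections import defaultdict

# (med_id, gen_id, potency) rows from the question's sample data.
med_gen = [
    (1, 1, "100mg"), (1, 2, "50ml"),
    (2, 1, "100mg"), (2, 2, "60ml"),
    (3, 1, "100mg"), (3, 2, "50ml"),
]


def equivalents(med_id, rows):
    """Return ids of medicines whose (gen_id, potency) set equals med_id's."""
    composition = defaultdict(set)
    for m, g, p in rows:
        composition[m].add((g, p))
    target = composition.get(med_id)
    if not target:
        return []
    return sorted(
        m for m, pairs in composition.items()
        if m != med_id and pairs == target
    )


print(equivalents(1, med_gen))
```

Comparing whole sets is what the two concatenated strings approximate in the query: equal strings mean the same generics with the same potencies.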

Creating a frequency table in Access VBA

I have a table where different participants are given multiple boxes of medicines on multiple days. I am trying to create a frequency table showing how many participants received each total number of boxes.
The result I'm looking for is -
2 boxes = 1 (since only Lynda got a total of 2 boxes), 4 boxes = 2 (since Ryan and Rinky both got a total of 4 boxes after adding up the medicine boxes)
Please let me know what approach would be the best in this case.
Thanks for your help.
-Nams
I think you want:
SELECT t.SumOf, Count(t.[PARTICIPANT ID]) AS CountOf
FROM (SELECT Table1.[PARTICIPANT ID], Sum(Table1.MEDICINE_BOX) AS SumOf
FROM Table1
GROUP BY Table1.[PARTICIPANT ID]) AS t
GROUP BY t.SumOf;
Where table1 is the name of your table.
If your table is like this:
medicine_dispense
participantID date amount_boxes
ABC 8/29/12 1
ABC 8/30/12 2
XYZ 8/29/12 1
XYZ 8/30/12 1
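then a query like this:
select
total_boxes, count(participantID)
from
(select participantID, sum(amount_boxes) as total_boxes
from medicine_dispense
group by participantID) as t
group by total_boxes
should work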
I'll use generic SQL. You can paste SQL into Access queries in SQL view. (You might have to delete the CHECK() constraint.)
create table participant_meds (
participant varchar(10) not null,
distribution_date date not null default current_date,
num_boxes integer not null check (num_boxes > 0),
primary key (participant, distribution_date)
);
insert into participant_meds values ('Ryan', '2012-02-03', 1);
insert into participant_meds values ('Ryan', '2012-06-07', 3);
insert into participant_meds values ('Rinky', '2012-02-28', 4);
insert into participant_meds values ('Lynda', '2012-03-04', 2);
insert into participant_meds values ('Russ', '2012-04-05', 2);
insert into participant_meds values ('Russ', '2012-05-08', 2);
insert into participant_meds values ('Russ', '2012-06-12', 2);
Resulting data, sorted, for copy/paste.
participant distribution_date num_boxes
Lynda 2012-03-04 2
Rinky 2012-02-28 4
Russ 2012-04-05 2
Russ 2012-05-08 2
Russ 2012-06-12 2
Ryan 2012-02-03 1
Ryan 2012-06-07 3
This query gives you the total boxes per participant.
select sum(num_boxes) boxes, participant
from participant_meds
group by participant;
6;"Russ"
2;"Lynda"
4;"Ryan"
4;"Rinky"
Use that query in the FROM clause as if it were a table. (I'd consider storing that query as a view, because I suspect that the total number of boxes per participant might be useful. Also, Access has historically been good at optimizing queries that use views.)
select boxes num_boxes, count(participant) num_participants
from (select sum(num_boxes) boxes, participant
from participant_meds
group by participant) total_boxes
group by num_boxes
order by num_boxes;
num_boxes num_participants
--
2 1
4 2
6 1
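The two-level aggregation can be checked end to end with Python's built-in sqlite3 module. This is a sketch under the assumption that SQLite is an acceptable stand-in for Access here; the CHECK constraint and date default from the DDL above are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE participant_meds ("
    "participant TEXT NOT NULL, distribution_date TEXT NOT NULL, "
    "num_boxes INTEGER NOT NULL, "
    "PRIMARY KEY (participant, distribution_date))"
)
conn.executemany(
    "INSERT INTO participant_meds VALUES (?, ?, ?)",
    [
        ("Ryan", "2012-02-03", 1),
        ("Ryan", "2012-06-07", 3),
        ("Rinky", "2012-02-28", 4),
        ("Lynda", "2012-03-04", 2),
        ("Russ", "2012-04-05", 2),
        ("Russ", "2012-05-08", 2),
        ("Russ", "2012-06-12", 2),
    ],
)

# Inner query: total boxes per participant.
# Outer query: how many participants share each total.
rows = conn.execute(
    """
    SELECT boxes AS num_boxes, COUNT(participant) AS num_participants
    FROM (SELECT participant, SUM(num_boxes) AS boxes
          FROM participant_meds
          GROUP BY participant) AS total_boxes
    GROUP BY boxes
    ORDER BY boxes
    """
).fetchall()

for num_boxes, num_participants in rows:
    print(num_boxes, num_participants)
```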