mysql count distinct value - mysql

I am having trouble working out how to count distinct values using IF in the select column.
I have an SQL Fiddle here:
http://sqlfiddle.com/#!2/6bfb9/3
The records look like this:
create table team_record (
id tinyint,
project_id int,
position varchar(45)
);
insert into team_record values
(1,1, 'Junior1'),
(2,1, 'Junior1'),
(3,1, 'Junior2'),
(4,1, 'Junior3'),
(5,1, 'Senior1'),
(6,1, 'Senior1'),
(8,1, 'Senior2'),
(9,1, 'Senior2'),
(10,1,'Senior3'),
(11,1, 'Senior3'),
(12,1, 'Senior3')
I need to count the distinct values in the position column, separately for the Junior and the Senior positions.
Identical values should count as 1.
I need a result something like this:
PROJECT_ID SENIOR_TOTAL JUNIOR_TOTAL
1 3 3
This is my current MySQL query, but it does not produce the result above:
SELECT
`team_record`.`project_id`,
`position`,
SUM(IF(position LIKE 'Senior%',
1,
0)) AS `Senior_Total`,
SUM(IF(position LIKE 'Junior%',
1,
0)) AS `Junior_Total`
FROM
(`team_record`)
WHERE
project_id = '1'
GROUP BY `team_record`.`project_id`
Maybe you could help me fix the query above to get the result I need.
Thanks.

I think you want this:
SELECT
project_id,
COUNT(DISTINCT CASE when position LIKE 'Senior%' THEN position END) Senior_Total,
COUNT(DISTINCT CASE when position LIKE 'Junior%' THEN position END) Junior_Total
FROM team_record
WHERE project_id = 1
GROUP BY project_id
The CASE returns NULL when the WHEN condition is false (ELSE NULL is the default, which I omitted for brevity), and COUNT(DISTINCT ...) does not count NULLs.
I also removed the unnecessary backticks, brackets and table qualification.
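To see the difference between counting rows and counting distinct values, here is a minimal sketch that runs the same idea against the sample data. It assumes an in-memory SQLite database as a stand-in for MySQL (so it uses CASE rather than IF), and the senior_rows column is added only for contrast with the original SUM-based query.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; the query uses only portable SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE team_record (id INTEGER, project_id INTEGER, position TEXT)")
conn.executemany(
    "INSERT INTO team_record VALUES (?, ?, ?)",
    [
        (1, 1, "Junior1"), (2, 1, "Junior1"), (3, 1, "Junior2"), (4, 1, "Junior3"),
        (5, 1, "Senior1"), (6, 1, "Senior1"), (8, 1, "Senior2"), (9, 1, "Senior2"),
        (10, 1, "Senior3"), (11, 1, "Senior3"), (12, 1, "Senior3"),
    ],
)

query = """
SELECT project_id,
       -- CASE yields NULL for non-matching rows, and COUNT(DISTINCT ...) skips NULLs
       COUNT(DISTINCT CASE WHEN position LIKE 'Senior%' THEN position END) AS senior_total,
       COUNT(DISTINCT CASE WHEN position LIKE 'Junior%' THEN position END) AS junior_total,
       -- for contrast: summing 1 per row counts rows, not distinct values
       SUM(CASE WHEN position LIKE 'Senior%' THEN 1 ELSE 0 END) AS senior_rows
FROM team_record
WHERE project_id = 1
GROUP BY project_id
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, 3, 3, 7)]
```

The last column shows why the SUM version in the question overcounts: there are seven Senior rows but only three distinct Senior positions.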

Related

Is there a way to use aggregate COUNT() values within CASE?

I need to retrieve unique yet truncated part numbers, with their description values being conditionally determined.
DATA:
Here's some simplified sample data:
(the real table has half a million rows)
create table inventory(
partnumber VARCHAR(10),
description VARCHAR(10)
);
INSERT INTO inventory (partnumber,description) VALUES
('12345','ABCDE'),
('123456','ABCDEF'),
('1234567','ABCDEFG'),
('98765','ZYXWV'),
('987654','ZYXWVU'),
('9876543','ZYXWVUT'),
('abcde',''),
('abcdef','123'),
('abcdefg','321'),
('zyxwv',NULL),
('zyxwvu','987'),
('zyxwvut','789');
TRIED:
I've tried too many things to list here.
I've finally found a way to get past all the 'unknown field' errors and at least get SOME results, but:
it's SUPER kludgy!
my results are not limited to unique prods.
Here's my current query:
SELECT
LEFT(i.partnumber, 6) AS prod,
CASE
WHEN agg.cnt > 1
OR i.description IS NULL
OR i.description = ''
THEN LEFT(i.partnumber, 6)
ELSE i.description
END AS `descrip`
FROM inventory i
INNER JOIN (SELECT LEFT(ii.partnumber, 6) t, COUNT(*) cnt
FROM inventory ii GROUP BY ii.partnumber) AS agg
ON LEFT(i.partnumber, 6) = agg.t;
GOAL:
My goal is to retrieve:
| prod | descrip |
|--------|---------|
| 12345 | ABCDE |
| 123456 | 123456 |
| 98765 | ZYXWV |
| 987654 | 987654 |
| abcde | abcde |
| abcdef | abcdef |
| zyxwv | zyxwv |
| zyxwvu | zyxwvu |
QUESTION:
What are some cleaner ways to use the COUNT() aggregate data with a CASE type conditional?
How can I limit my results so that all prods are UNIQUE?
You can check whether a left(partnumber, 6) value is not unique in the result by checking if count(*) > 1. In that case, let descrip be left(partnumber, 6). Otherwise you can use max(description) (or min(description)) to get the single description while satisfying the requirement that columns not in the GROUP BY go through an aggregation function. To replace empty or NULL descriptions, nullif() and coalesce() can be used.
That would lead to the following using just one level of aggregation and no joins:
SELECT left(partnumber, 6) AS prod,
CASE
WHEN count(*) > 1 THEN
left(partnumber, 6)
ELSE
coalesce(nullif(max(description), ''), left(partnumber, 6))
END AS descrip
FROM inventory
GROUP BY left(partnumber, 6)
ORDER BY left(partnumber, 6);
But there seems to be a bug in MySQL, and this query fails. The engine doesn't "see" that, in the list after SELECT, partnumber is only used in the expression left(partnumber, 6), which is also in the GROUP BY. Instead the engine falsely complains about partnumber not being in the GROUP BY and not being subject to an aggregation function.
As a workaround, we can use a derived table that shortens partnumber to its first six characters. We then use that column of the derived table instead of left(partnumber, 6).
SELECT l6pn AS prod,
CASE
WHEN count(*) > 1 THEN
l6pn
ELSE
coalesce(nullif(max(description), ''), l6pn)
END AS descrip
FROM (SELECT left(partnumber, 6) AS l6pn,
description
FROM inventory) AS x
GROUP BY l6pn
ORDER BY l6pn;
Or we slap some actually pointless max()es around every left(partnumber, 6) other than the first, to work around the bug.
SELECT left(partnumber, 6) AS prod,
CASE
WHEN count(*) > 1 THEN
max(left(partnumber, 6))
ELSE
coalesce(nullif(max(description), ''), max(left(partnumber, 6)))
END AS descrip
FROM inventory
GROUP BY left(partnumber, 6)
ORDER BY left(partnumber, 6);
db<>fiddle (Change the DBMS to some other like Postgres or MariaDB to see that they also accept the first query.)
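As a quick check of the derived-table variant against the sample data, here is a minimal sketch. It assumes an in-memory SQLite database instead of MySQL, so left(partnumber, 6) is written as substr(partnumber, 1, 6); everything else follows the query above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; substr(x, 1, 6) replaces left(x, 6).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (partnumber TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [
        ("12345", "ABCDE"), ("123456", "ABCDEF"), ("1234567", "ABCDEFG"),
        ("98765", "ZYXWV"), ("987654", "ZYXWVU"), ("9876543", "ZYXWVUT"),
        ("abcde", ""), ("abcdef", "123"), ("abcdefg", "321"),
        ("zyxwv", None), ("zyxwvu", "987"), ("zyxwvut", "789"),
    ],
)

query = """
SELECT l6pn AS prod,
       CASE
         WHEN COUNT(*) > 1 THEN l6pn                      -- several parts share the prefix
         ELSE COALESCE(NULLIF(MAX(description), ''), l6pn)  -- empty or NULL falls back to prefix
       END AS descrip
FROM (SELECT substr(partnumber, 1, 6) AS l6pn, description FROM inventory) AS x
GROUP BY l6pn
ORDER BY l6pn
"""
pairs = conn.execute(query).fetchall()
for prod, descrip in pairs:
    print(prod, descrip)
```

Each prefix appears exactly once, and the rows match the goal table from the question.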

count comma-separated values from a column - sql

I want to count the number of values in a comma-separated column.
I have used this
(LENGTH(Col2) - LENGTH(REPLACE(Col2,",","")) + 1)
in my select query.
Demo:
id | mycolumn
1 2,5,8,60
2 4,5,1
3 5,Null,Null
The query result for the first two rows is correct (row 1 = 4, row 2 = 3), but for the 3rd row it also counts the Null values.
Here is what I believe the actual state of your data is:
id | mycolumn
1 2,5,8,60
2 4,5,1
3 NULL
In other words, the entire value for mycolumn in your third record is NULL, likely from doing an operation involving a NULL value. If you actually had the text NULL your current query should still work.
The way to get around this would be to use COALESCE(val, "") when handling the NULL values in your strings.
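To illustrate what happens when the whole column is NULL, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL and a hypothetical table name some_table; treating a NULL or empty column as zero items is also an assumption about the desired result.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; some_table is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER, mycolumn TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [(1, "2,5,8,60"), (2, "4,5,1"), (3, None)],
)

query = """
SELECT id,
       -- the original expression: any NULL input makes the whole result NULL
       LENGTH(mycolumn) - LENGTH(REPLACE(mycolumn, ',', '')) + 1 AS raw_count,
       -- guarded version: a NULL or empty column counts as zero items
       CASE
         WHEN COALESCE(mycolumn, '') = '' THEN 0
         ELSE LENGTH(mycolumn) - LENGTH(REPLACE(mycolumn, ',', '')) + 1
       END AS item_count
FROM some_table
ORDER BY id
"""
counts = conn.execute(query).fetchall()
print(counts)  # [(1, 4, 4), (2, 3, 3), (3, None, 0)]
```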
A crude way of doing it is to replace the occurrences of ',Null' with nothing first:
SELECT a.id, (LENGTH(REPLACE(mycolumn, ',Null', '')) - LENGTH(REPLACE(REPLACE(mycolumn, ',Null', ''),",","")) + 1)
FROM some_table a
If the values refer to the ids of rows in another table, then you can join against that table using FIND_IN_SET and count the matches (assuming that the string 'Null' is not an id in that other table):
SELECT a.id, COUNT(b.id)
FROM some_table a
INNER JOIN id_list_table b
ON FIND_IN_SET(b.id, a.mycolumn)
GROUP BY a.id

SELECT CASE, COUNT(*)

I want to select the number of users that have marked some content as favorite and also return whether the current user has "voted" or not. My table looks like this:
CREATE TABLE IF NOT EXISTS `favorites` (
`user` int(11) NOT NULL DEFAULT '0',
`content` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`user`,`content`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ;
Say I have 3 rows containing
INSERT INTO `favorites` (`user`, `content`) VALUES
(11, 26977),
(22, 26977),
(33, 26977);
Using this
SELECT COUNT(*), CASE
WHEN user='22'
THEN 1
ELSE 0
END as has_voted
FROM favorites WHERE content = '26977'
I expect to get has_voted=1 and COUNT(*)=3 but
I get has_voted=0 and COUNT(*)=3. Why is that? How to fix it?
This is because you mixed aggregated and non-aggregated expressions in a single SELECT. Aggregated expressions work on many rows; non-aggregated expressions work on a single row. An aggregated expression (e.g. COUNT(*)) and a non-aggregated expression (e.g. CASE) should appear in the same SELECT only when you have a GROUP BY, which does not make sense in your situation.
You can fix your query by aggregating the second expression - i.e. adding a SUM around it, like this:
SELECT
COUNT(*) AS FavoriteCount
, SUM(CASE WHEN user=22 THEN 1 ELSE 0 END) as has_voted
FROM favorites
WHERE content = 26977
Now both expressions are aggregated, so you should get the expected results.
Try this with SUM() and without CASE
SELECT
COUNT(*),
SUM(USER = '22') AS has_voted
FROM
favorites
WHERE content = '26977'
See Fiddle Demo
Try this:
SELECT COUNT(*), MAX(USER=22) AS has_voted
FROM favorites
WHERE content = 26977;
Check the SQL FIDDLE DEMO
OUTPUT
| COUNT(*) | HAS_VOTED |
|----------|-----------|
| 3 | 1 |
You need sum of votes.
SELECT COUNT(*), SUM(CASE
WHEN user='22'
THEN 1
ELSE 0
END) as has_voted
FROM favorites WHERE content = '26977'
You are inadvertently using a MySQL feature here: You aggregate your results to get only one result record showing the number of matches (aggregate function COUNT). But you also show the user (or rather an expression built on it) in your result line (without any aggregate function). So the question is: Which user? Another DBMS would have given you an error, asking you to either state the user in a GROUP BY or aggregate users. MySQL instead picks an arbitrary user.
What you want to do here is aggregate users (or rather have your expression aggregated). Use SUM to sum all votes the user has given on the requested content:
SELECT
COUNT(*),
SUM(CASE WHEN user='22' THEN 1 ELSE 0 END) as sum_votes
FROM favorites
WHERE content = '26977';
You forgot to wrap the CASE statement inside an aggregate function. In this case has_voted will contain unexpected results since you are actually doing a "partial group by". Here is what you need to do:
SELECT COUNT(*), SUM(CASE WHEN USER = 22 THEN 1 ELSE 0 END) AS has_voted
FROM favorites
WHERE content = 26977
Or:
SELECT COUNT(*), COUNT(CASE WHEN USER = 22 THEN 1 ELSE NULL END) AS has_voted
FROM favorites
WHERE content = 26977
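The SUM, MAX and COUNT variants suggested above agree whenever the content has at least one favorite, but they differ when no row matches. Here is a minimal sketch comparing them, assuming an in-memory SQLite database as a stand-in for MySQL; the helper name vote_summary and the column aliases are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; "user" is quoted to avoid keyword clashes.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE favorites ("user" INTEGER NOT NULL, content INTEGER NOT NULL, '
    'PRIMARY KEY ("user", content))'
)
conn.executemany(
    "INSERT INTO favorites VALUES (?, ?)",
    [(11, 26977), (22, 26977), (33, 26977)],
)

QUERY = """
SELECT COUNT(*)                                       AS favorite_count,
       SUM(CASE WHEN "user" = :uid THEN 1 ELSE 0 END) AS via_sum,
       MAX("user" = :uid)                             AS via_max,
       COUNT(CASE WHEN "user" = :uid THEN 1 END)      AS via_count
FROM favorites
WHERE content = :content
"""

def vote_summary(user_id, content_id):
    """Hypothetical helper: returns one row with the count and all three flags."""
    return conn.execute(QUERY, {"uid": user_id, "content": content_id}).fetchone()

voted = vote_summary(22, 26977)      # user 22 has a favorite on this content
not_voted = vote_summary(44, 26977)  # user 44 has none
no_rows = vote_summary(22, 99999)    # nobody has marked this content
print(voted, not_voted, no_rows)
```

With no matching rows, SUM and MAX return NULL while COUNT returns 0, so the COUNT form (or a COALESCE around the others) may be the safer choice if the caller expects a number.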

Get the earliest time if rows are more than 1

I need help with some MySQL pain in the ****... Anyway, I have the following SQL:
SELECT id,count(*),
CASE
WHEN count(*) > 1 THEN
// I need the minimal `taskdate_time` column from the selected rows
// where a certain boolean is active
ELSE taskdate_time
END
FROM timehistory th
WHERE `send`=true
GROUP BY date_format(taskdate_time, "%Y-%m-%d"), user_id
As described in the comments, I need to get the earliest time of the two rows where a column called removed is not FALSE.
How do I achieve this?
My columns are :
`id` - int
`taskdateuser_id` int
`user_id` int
`changed_by` int
`batch_id` int
`taskdate_time` timestamp
`send` tinyint
`isread` tinyint
`update` tinyint
`removed` tinyint
Many thanks in advance!!!
EDIT:
I should explain a bit more. Say I have the following table rows:
The red-marked rows are captured by the CASE count(*) > 1, because the GROUP BY returns 2 rows. From those 2 captured rows I then need to SELECT min(taskdate_time) where removed=false. So if 4 rows are returned for that group, and 2 of the rows have removed=false and the others have removed=true, then I need a subselect for the minimum taskdate_time of the 2 rows where removed=false.
SELECT id,
count(*),
CASE WHEN count(*) > 1
THEN (SELECT MAX(taskdate_time) FROM timehistory f WHERE f.id = th.id AND removed = 0)
ELSE taskdate_time
END
FROM timehistory th
WHERE `send` = true
GROUP BY date_format(taskdate_time, "%Y-%m-%d"), user_id
You could try something like this:
SELECT TH.user_id, COUNT(*),
CASE WHEN COUNT(*) > 1
THEN MIN(IF(TH.removed, TH.taskdate_time, NULL))
ELSE TH.taskdate_time
END
FROM TimeHistory TH
...
Sample Fiddle Demo
However, if COUNT > 1 AND there aren't any records where TH.removed is true, then this will return NULL for that value. What should it return in those cases?
--EDIT--
In response to the comments, wrapping it with COALESCE should work:
COALESCE(
CASE
WHEN COUNT(*) > 1
THEN MIN(IF(TH.removed, TH.taskdate_time, NULL))
ELSE TH.taskdate_time
END, MIN(TH.taskdate_time))
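The conditional MIN with a fallback can be sketched as follows. This assumes an in-memory SQLite database as a stand-in for MySQL (so IF becomes CASE and date_format becomes date()), invented sample rows, and the answer's reading that the rows of interest have removed set; flip the condition to removed = 0 for the other reading. Because a single-row group has the same value for taskdate_time and MIN(taskdate_time), the sketch drops the outer CASE and keeps only the COALESCE.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE timehistory (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "taskdate_time TEXT, send INTEGER, removed INTEGER)"
)
conn.executemany(
    "INSERT INTO timehistory VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "2013-01-01 08:00:00", 1, 0),
        (2, 1, "2013-01-01 09:00:00", 1, 1),
        (3, 1, "2013-01-01 10:00:00", 1, 1),
        (4, 2, "2013-01-01 07:00:00", 1, 0),
        (5, 2, "2013-01-01 11:00:00", 1, 0),
        (6, 3, "2013-01-01 12:00:00", 1, 0),
        (7, 3, "2013-01-01 06:00:00", 0, 1),  # send = 0, so the WHERE clause drops it
    ],
)

query = """
SELECT user_id,
       COUNT(*) AS n,
       COALESCE(
         MIN(CASE WHEN removed = 1 THEN taskdate_time END),  -- earliest flagged row
         MIN(taskdate_time)                                  -- fallback: earliest row overall
       ) AS chosen_time
FROM timehistory
WHERE send = 1
GROUP BY date(taskdate_time), user_id
ORDER BY user_id
"""
chosen = conn.execute(query).fetchall()
for row in chosen:
    print(row)
```

User 1 gets the earliest flagged time (09:00), user 2 has no flagged rows and falls back to the earliest time (07:00), and user 3 has a single row that is returned as is.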

disperse relational structure of MySQL table

I have a table that consists of
id (auto_increment)
number int (can contain values from 10 to 12)
myvalue (varchar)
What I want to do is disperse the relational structure of this table for reporting purposes, i.e. I'd like to have something like:
id (auto_increment)
number10 (containing myvalue WHERE number=10)
number11 (containing myvalue WHERE number=11)
number12 (containing myvalue WHERE number=12)
I know that I can get the respective results by
SELECT myvalue FROM mytable WHERE number = 10;
but I haven't figured out how to combine these three SELECT statements into one single table or view.
Thanks in advance for any help!
Something like this maybe?:
SELECT
id,
IF(number=10, myvalue, NULL) AS number10,
IF(number=11, myvalue, NULL) AS number11,
IF(number=12, myvalue, NULL) AS number12
FROM mytable
This might do what you need. You've not explained it very well though so it might not!
SELECT user,
MIN(CASE WHEN number = 10 then myvalue end) AS number10,
MIN(CASE WHEN number = 11 then myvalue end) AS number11,
MIN(CASE WHEN number = 12 then myvalue end) AS number12
FROM mytable
WHERE number IN (10,11,12)
GROUP BY user
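To show what this pivot produces, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL and, like the query above, a grouping column named user that the question's table does not mention; the sample rows are invented.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL; the "user" column and all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE mytable (id INTEGER PRIMARY KEY AUTOINCREMENT, '
    '"user" TEXT, number INTEGER, myvalue TEXT)'
)
conn.executemany(
    'INSERT INTO mytable ("user", number, myvalue) VALUES (?, ?, ?)',
    [("u1", 10, "a"), ("u1", 11, "b"), ("u1", 12, "c"), ("u2", 10, "d"), ("u2", 12, "e")],
)

query = """
SELECT "user",
       -- each CASE keeps myvalue only for its number; MIN collapses the group to one value
       MIN(CASE WHEN number = 10 THEN myvalue END) AS number10,
       MIN(CASE WHEN number = 11 THEN myvalue END) AS number11,
       MIN(CASE WHEN number = 12 THEN myvalue END) AS number12
FROM mytable
WHERE number IN (10, 11, 12)
GROUP BY "user"
ORDER BY "user"
"""
pivoted = conn.execute(query).fetchall()
print(pivoted)  # [('u1', 'a', 'b', 'c'), ('u2', 'd', None, 'e')]
```

A user with no row for a given number gets NULL in that column, as u2 does for number11.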
I don't get the "id number10 number11 number12" stuff, but if you want to select the rows with the number field matching a set of values, you can just do:
SELECT * FROM mytable WHERE number IN (10, 11, 12);
Or, alternatively, you can select a number range:
SELECT * FROM mytable WHERE number >= 10 AND number <= 12;
Edit 2:
Vin-G's got it. I was way off.