Is there a way to use aggregate COUNT() values within CASE? - mysql

I need to retrieve unique yet truncated part numbers, with their description values being conditionally determined.
DATA:
Here's some simplified sample data:
(the real table has half a million rows)
create table inventory(
partnumber VARCHAR(10),
description VARCHAR(10)
);
INSERT INTO inventory (partnumber,description) VALUES
('12345','ABCDE'),
('123456','ABCDEF'),
('1234567','ABCDEFG'),
('98765','ZYXWV'),
('987654','ZYXWVU'),
('9876543','ZYXWVUT'),
('abcde',''),
('abcdef','123'),
('abcdefg','321'),
('zyxwv',NULL),
('zyxwvu','987'),
('zyxwvut','789');
TRIED:
I've tried too many things to list here.
I've finally found a way to get past all the 'unknown field' errors and at least get SOME results, but:
it's SUPER kludgy!
my results are not limited to unique prods.
Here's my current query:
SELECT
LEFT(i.partnumber, 6) AS prod,
CASE
WHEN agg.cnt > 1
OR i.description IS NULL
OR i.description = ''
THEN LEFT(i.partnumber, 6)
ELSE i.description
END AS `descrip`
FROM inventory i
INNER JOIN (SELECT LEFT(ii.partnumber, 6) t, COUNT(*) cnt
FROM inventory ii GROUP BY ii.partnumber) AS agg
ON LEFT(i.partnumber, 6) = agg.t;
GOAL:
My goal is to retrieve:
prod    descrip
12345   ABCDE
123456  123456
98765   ZYXWV
987654  987654
abcde   abcde
abcdef  abcdef
zyxwv   zyxwv
zyxwvu  zyxwvu
QUESTION:
What are some cleaner ways to use the COUNT() aggregate data with a CASE type conditional?
How can I limit my results so that all prods are UNIQUE?

You can check whether a left(partnumber, 6) is not unique in the result by checking if count(*) > 1. In such a case let descrip be left(partnumber, 6). Otherwise you can use max(description) (or min(description)) to get the single description while satisfying the need to use an aggregation function on columns not in the GROUP BY. To replace empty or NULL descriptions, nullif() and coalesce() can be used.
That would lead to the following using just one level of aggregation and no joins:
SELECT left(partnumber, 6) AS prod,
CASE
WHEN count(*) > 1 THEN
left(partnumber, 6)
ELSE
coalesce(nullif(max(description), ''), left(partnumber, 6))
END AS descrip
FROM inventory
GROUP BY left(partnumber, 6)
ORDER BY left(partnumber, 6);
But there seems to be a bug in MySQL and this query fails. The engine doesn't "see" that, in the list after SELECT, partnumber is only used in the expression left(partnumber, 6), which is also in the GROUP BY. Instead the engine falsely complains about partnumber not being in the GROUP BY and not subject to an aggregation function.
As a workaround, we can use a derived table that does the shortening of partnumber to its first six characters. We then use that column of the derived table instead of left(partnumber, 6).
SELECT l6pn AS prod,
CASE
WHEN count(*) > 1 THEN
l6pn
ELSE
coalesce(nullif(max(description), ''), l6pn)
END AS descrip
FROM (SELECT left(partnumber, 6) AS l6pn,
description
FROM inventory) AS x
GROUP BY l6pn
ORDER BY l6pn;
Or we slap some actually pointless max()es around the left(partnumber, 6) other than the first, to work around the bug.
SELECT left(partnumber, 6) AS prod,
CASE
WHEN count(*) > 1 THEN
max(left(partnumber, 6))
ELSE
coalesce(nullif(max(description), ''), max(left(partnumber, 6)))
END AS descrip
FROM inventory
GROUP BY left(partnumber, 6)
ORDER BY left(partnumber, 6);
db<>fiddle (Change the DBMS to some other like Postgres or MariaDB to see that they also accept the first query.)

mysql - query to extract report from book register

I have the query below in MySQL. When I run it, it gives me the complete report; the WHERE clause does not work.
SELECT oo.dateaccessioned AS 'Date',
oo.barcode AS 'Acc. No.',
ooo.title AS 'Title',
ooo.author AS 'Author/Editor',
concat_ws(' , ', o.editionstatement, oo.enumchron) AS 'Ed./Vol.',
concat_ws(' ', o.place, o.publishercode) AS 'Place & Publisher',
ooo.copyrightdate AS 'Year', o.pages AS 'Page(s)',
ooooooo.name AS 'Source',
oo.itemcallnumber AS 'Class No./Book No.',
concat_ws(', ₹', concat(' ', ooooo.symbol, oooo.listprice), oooo.rrp_tax_included) AS 'Cost',
concat_ws(' , ', oooooo.invoicenumber, oooooo.shipmentdate) AS 'Bill No. & Date',
'' AS 'Withdrawn Date',
'' AS 'Remarks'
FROM biblioitems o
LEFT JOIN items oo ON oo.biblioitemnumber=o.biblioitemnumber
LEFT JOIN biblio ooo ON ooo.biblionumber=o.biblionumber
LEFT JOIN aqorders oooo ON oooo.biblionumber=o.biblionumber
LEFT JOIN currency ooooo ON ooooo.currency=oooo.currency
LEFT JOIN aqinvoices oooooo ON oooooo.booksellerid=oo.booksellerid
LEFT JOIN aqbooksellers ooooooo ON ooooooo.id=oo.booksellerid
WHERE cast(oo.barcode AS UNSIGNED) BETWEEN <<Accession Number>> AND <<To Accession Number>>
GROUP BY oo.barcode
ORDER BY oo.barcode ASC
Can you please help me generate a report based on the above query, filtered on oo.barcode (a VARCHAR column)? I am a library team member rather than a database administrator. My oo.barcode values begin with HYD followed by digits. I know that if oo.barcode were a number-only field, the above query would work without any issue.
I searched for how CAST works but could not understand it, as I am not into database administration.
If the barcode column is VARCHAR and begins with "HYD", CAST AS UNSIGNED will cause a value of HYD123 to result in 0.
The non-numeric characters of the string would need to be removed prior to casting the value as an integer.
This can be achieved by trimming the leading text "HYD" from the barcode.
CAST(TRIM(LEADING 'HYD' FROM barcode) AS UNSIGNED)
Otherwise, if the prefix is always 3 characters, the substring position of barcode can be used.
CAST(SUBSTR(barcode, 4) AS UNSIGNED)
If any other non-numeric characters are contained within the string, such as HYD-123-456-789, HYD123-456-789PT, HYD123-456.789, etc., they will also need to be removed, as the type conversion will treat them in unexpected ways.
In addition, any leading 0's of the resulting numeric string value will be truncated from the resulting integer, causing 0123 to become 123.
For more details on how CAST functions see: 12.3 Type Conversion in Expression Evaluation
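As a rough sketch of removing other non-numeric characters before casting, the query below strips everything that is not a digit. It assumes MySQL 8.0+ (or a MariaDB version that provides REGEXP_REPLACE()); the alias barcode_num and the inline sample rows are only for illustration.

```sql
-- Assumption: REGEXP_REPLACE() is available (MySQL 8.0+ / MariaDB).
-- Remove every character that is not 0-9, then cast what is left.
SELECT barcode,
       CAST(REGEXP_REPLACE(barcode, '[^0-9]', '') AS UNSIGNED) AS barcode_num
FROM (
    SELECT 'HYD-123-456-789' AS barcode
    UNION ALL SELECT 'HYD123-456-789PT'
    UNION ALL SELECT 'HYD123-456.789'
) AS samples;
-- All three sample barcodes end up as 123456789.
```

Note that different barcodes can collapse to the same number this way, so whether stripping is appropriate depends on how the barcodes are actually structured.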
Examples db<>fiddle
CREATE TABLE tester (
barcode varchar(255)
);
INSERT INTO tester(barcode)
VALUES ('HYD123'), ('HYD0123'), ('HYD4231');
Results
SELECT cast(barcode AS UNSIGNED)
FROM tester;
cast(barcode AS UNSIGNED)
0
0
0
SELECT CAST(TRIM(LEADING 'HYD' FROM barcode) AS UNSIGNED)
FROM tester;
CAST(TRIM(LEADING 'HYD' FROM barcode) AS UNSIGNED)
123
123
4231
SELECT barcode
FROM tester
WHERE CAST(TRIM(LEADING 'HYD' FROM barcode) AS UNSIGNED) BETWEEN 120 AND 4232;
barcode
HYD123
HYD0123
HYD4231
SELECT CAST(SUBSTR(barcode, 4) AS UNSIGNED)
FROM tester;
CAST(SUBSTR(barcode, 4) AS UNSIGNED)
123
123
4231
SELECT barcode
FROM tester
WHERE CAST(SUBSTR(barcode, 4) AS UNSIGNED) BETWEEN 120 AND 4232;
barcode
HYD123
HYD0123
HYD4231
JOIN optimization
To obtain the expected results, you most likely want an INNER JOIN of the items table, with ON criteria matching the desired barcode range. An INNER JOIN is equivalent to adding WHERE oo.barcode IS NOT NULL, and your current WHERE criteria already exclude the NULL rows that the LEFT JOIN to the items table would otherwise produce.
INNER JOIN items AS oo
ON oo.biblioitemnumber = o.biblioitemnumber
AND CAST(SUBSTR(oo.barcode, 4) AS UNSIGNED) BETWEEN ? AND ?
Full-Table Scanning
It is important to understand that transforming the column value to suit a criterion will cause a full-table scan that cannot benefit from an index on the column, which will run very slowly on large tables.
Instead, it is best to store the integer-only version of the value in the database so that it can be indexed.
This can be accomplished in many ways, such as generated columns.
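A minimal sketch of the generated-column approach, using the tester table from the examples above. It assumes MySQL 5.7+ and that every barcode is 'HYD' followed only by digits; the column name barcode_num and the index name are hypothetical.

```sql
-- Assumption: every barcode is 'HYD' + digits. Under strict SQL mode,
-- values that do not cast cleanly may cause the ALTER to fail.
ALTER TABLE tester
    ADD COLUMN barcode_num BIGINT UNSIGNED
        GENERATED ALWAYS AS (CAST(SUBSTR(barcode, 4) AS UNSIGNED)) STORED;

-- Index the stored integer so that range conditions can use it.
CREATE INDEX idx_tester_barcode_num ON tester (barcode_num);

-- The range filter now targets the indexed column directly.
SELECT barcode
FROM tester
WHERE barcode_num BETWEEN 120 AND 4232;
```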
GROUP BY without an aggregate
Lastly, you should avoid using GROUP BY without an aggregate function. You most likely are expecting DISTINCT or similar form of limiting the record set. Please see MySQL select one column DISTINCT, with corresponding other columns on ways to accomplish this.
To ensure MySQL is not selecting "any value from each group" at random (leading to differing results between query executions), limit the subset data to the distinct biblioitemnumber column values from the available barcode matches. One approach to accomplish the limited subset is as follows.
/* ... */
FROM biblioitems o
INNER JOIN (
SELECT biblioitemnumber, barcode, booksellerid, enumchron, itemcallnumber
FROM items WHERE biblioitemnumber IN(
SELECT MIN(biblioitemnumber)
FROM items
WHERE CAST(SUBSTR(barcode, 4) AS UNSIGNED) BETWEEN ? AND ?
GROUP BY barcode
)
) AS oo
ON oo.biblioitemnumber = o.biblioitemnumber
LEFT JOIN biblio ooo ON ooo.biblionumber=o.biblionumber
LEFT JOIN aqorders oooo ON oooo.biblionumber=o.biblionumber
LEFT JOIN currency ooooo ON ooooo.currency=oooo.currency
LEFT JOIN aqinvoices oooooo ON oooooo.booksellerid=oo.booksellerid
LEFT JOIN aqbooksellers ooooooo ON ooooooo.id=oo.booksellerid
ORDER BY oo.barcode ASC
Try this :
...
WHERE cast(SUBSTRING_INDEX(oo.barcode,'HYD',-1) AS UNSIGNED INTEGER) BETWEEN <<Accession Number>> AND <<To Accession Number>>
...
SUBSTRING_INDEX(oo.barcode,'HYD',-1) will transform HYD132453741 to 132453741

MySQL - Match certain IDs, but only those IDs

I have a table like so:
id_type id_option
"1" "1"
"1" "5"
"2" "1"
"2" "5"
"2" "8"
I am trying to write a query that given a list of option IDs finds the "type" that matches the list, but only those ID's
For example, if given 1 and 5 as options, it should return the type 1 but only the type 1 as the 8 required to match type 2 is not present.
I have tried the following:
SELECT *
FROM my_table
WHERE id_option IN (1, 5)
GROUP BY id_type
HAVING COUNT(DISTINCT id_option) = 2
This returns both "types" - I had hoped that the COUNT restriction of 2 would have helped but I now understand why it doesn't, but I can't think of a clever way to limit this.
I could just pull the first record as typically the types with less options are saved first but I don't think I can rely on this 100%
Thank you for your time
Here's a solution:
SELECT *
FROM my_table
GROUP BY id_type
HAVING SUM(id_option IN (1,5)) = COUNT(*)
It relies on a trick specific to MySQL: boolean true is literally the integer 1. So you can use SUM() to count the rows where a condition is true, by putting a boolean expression inside SUM().
For folks reading this who use other databases besides MySQL, you'd have to use an expression to convert the boolean condition to the integer 1:
HAVING SUM(CASE WHEN id_option IN (1,5) THEN 1 ELSE 0 END) = COUNT(*)
In this case, let all rows become part of the groups. That is, do not use a WHERE clause to restrict the query to rows where the id_option is 1 or 5. Then count the total rows in the group, and "count" (i.e. use the SUM() trick) the rows where the id_option is 1 or 5. These two counts will be equal only if there are no id_option values besides 1 or 5.
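To make the comparison concrete, here is a small sketch that shows both counts per group for the sample data from the question. The INT column types are an assumption, since the question shows the values quoted.

```sql
-- Assumption: both columns are integers.
CREATE TABLE my_table (id_type INT, id_option INT);
INSERT INTO my_table (id_type, id_option)
VALUES (1, 1), (1, 5), (2, 1), (2, 5), (2, 8);

SELECT id_type,
       SUM(id_option IN (1, 5)) AS matching_rows,  -- rows whose option is 1 or 5
       COUNT(*)                 AS total_rows      -- all rows in the group
FROM my_table
GROUP BY id_type;
-- id_type 1: matching_rows = 2, total_rows = 2  (counts equal, kept by HAVING)
-- id_type 2: matching_rows = 2, total_rows = 3  (option 8 breaks equality)
```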
If you also want to make sure that both 1 and 5 are found, you need another condition:
SELECT *
FROM my_table
GROUP BY id_type
HAVING SUM(id_option IN (1,5)) = COUNT(*)
AND COUNT(DISTINCT CASE WHEN id_option IN (1,5) THEN id_option END) = 2
The CASE expression will return 1 or 5, or if there are any other values, those are converted to NULL. The COUNT() function ignores NULLs.
If you can pass the options as a sorted comma separated list string, then use GROUP_CONCAT():
SELECT id_type
FROM my_table
GROUP BY id_type
HAVING GROUP_CONCAT(id_option ORDER BY id_option) = '1,5'
If there are duplicate options for each type, use DISTINCT:
HAVING GROUP_CONCAT(DISTINCT id_option ORDER BY id_option) = '1,5'
While I can't comment yet, here's a tiny adjustment to Bill Karwin's last example (in the accepted solution):
SELECT *
FROM my_table
GROUP BY id_type
HAVING SUM(id_option IN (1,5)) = COUNT(*)
AND COUNT(DISTINCT id_option) = 2

Creating new data table on existing one

Hello, I have a question: how (if it is possible) can I create a new table with nearly the same rows, but with a row split in two when its column value contains "/"? For example:
ID      column_param  column_sym  column_value  column_val2
First   param_test1   ABC         11/12         test
Second  param_test2   CDE         22/11         test
Third   param_test3   EFG         44            teste
4'th    param_test4   HIJ         33/22         test
Here, for param_test1 and param_test4, if column_value contains "/", I want to create two rows in place of the original one. If I do not select param_test2, its row should stay as it is. Everything should go into a new table. Is there any way to do this?
Thank you in advance.
Expected result:
As per Gordon's answer, I'm not sure what should be done with your ID column.
I've replaced these with row numbers.
Depending on your version of MySQL/MariaDB, the ROW_NUMBER() window function may not be available. Depending on whether IDs in the output are necessary you may be able to simply omit this.
I've assumed the existence of a table called myNumbers which contains a single field num and is populated with positive integers from 1 to whatever you're likely to need.
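A minimal sketch of such a helper table follows. The table and column names come from the description above; the upper bound of 10 is an arbitrary assumption and only needs to cover the largest number of "/"-separated parts you expect.

```sql
-- Helper table of consecutive positive integers.
CREATE TABLE myNumbers (
    num INT UNSIGNED NOT NULL PRIMARY KEY
);

-- Assumption: no value contains more than 10 parts.
INSERT INTO myNumbers (num)
VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
```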
I've included more in the output than you asked for, which will hopefully help you understand what's going on.
SELECT
ROW_NUMBER() OVER (ORDER BY d.ID, n.num) as NewID,
d.ID as OriginalID,
n.num,
d.column_param,
d.column_sym,
d.column_value as orig_value,
CASE WHEN column_param = 'param_test2' THEN d.column_value
ELSE substring_index(substring_index(d.column_value,'/',n.num),'/',-1) END as split_value,
d.column_val2
FROM
myData d
JOIN myNumbers n on char_length(d.column_value)-char_length(replace(d.column_value,'/','')) >= n.num-1
WHERE
n.num = 1 OR d.column_param <> 'param_test2'
ORDER BY
d.ID,
n.num
See this DB Fiddle (the columns output in a different order than I've specified, but I think that's a DB Fiddle quirk).
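To see what the two string expressions in that query do on their own, here is a small sketch against the literal value '33/22'. The inline two-row derived table stands in for myNumbers and is only for illustration.

```sql
SELECT n.num,
       -- Number of '/' characters: length with slashes minus length without.
       CHAR_LENGTH('33/22') - CHAR_LENGTH(REPLACE('33/22', '/', '')) AS slash_count,
       -- Keep the first n.num parts, then take the last of those parts.
       SUBSTRING_INDEX(SUBSTRING_INDEX('33/22', '/', n.num), '/', -1) AS split_value
FROM (SELECT 1 AS num UNION ALL SELECT 2) AS n;
-- num = 1: slash_count = 1, split_value = '33'
-- num = 2: slash_count = 1, split_value = '22'
```

The join condition slash_count >= num - 1 is what limits each row to as many numbers as it has parts.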
If you only want to "split", say, the param_test1 and param_test4 rows, the code above could be amended as follows:
SELECT
ROW_NUMBER() OVER (ORDER BY d.ID, n.num) as NewID,
d.ID as OriginalID,
d.column_param,
d.column_sym,
n.num,
d.column_value as orig_value,
CASE WHEN column_param NOT IN ('param_test1','param_test4') THEN d.column_value
ELSE substring_index(substring_index(d.column_value,'/',n.num),'/',-1) END as split_value,
d.column_val2
FROM
myData d
JOIN myNumbers n on char_length(d.column_value)-char_length(replace(d.column_value,'/','')) >= n.num-1
WHERE
n.num = 1 OR d.column_param IN ('param_test1','param_test4')
ORDER BY
d.ID,
n.num
I don't know how the id is being set, but you can do what you want using union all:
select column_param, column_sym,
substring_index(column_value, '/', 1) as column_value,
column_val2
from t
union all
select column_param, column_sym,
substring_index(column_value, '/', -1) as column_value,
column_val2
from t
where column_value like '%/%';

UNION in MySQL 5.7.2

I'm using MySQL 5.7.
I am getting bad results by a UNION of COUNT(*).
SELECT
COUNT(*) AS Piezas
, ''Motor
from parque
where `parque`.`CausasParalizacion` = 2
UNION
SELECT
''Piezas
, COUNT(*) AS Motor
from parque
where `parque`.`CausasParalizacion` = 3
The result should be 30 and 12, and I am getting 3330 and 3132.
Can anyone help?
I don't think MySQL is returning a "bad" result. The results returned by MySQL are per the specification.
Given no GROUP BY each of the SELECT statements will return one row. We can verify by running each SELECT statement separately. We'd expect the UNION result of the two SELECT to be something like
Piezas Motor
------ -----
mmm
ppp
You say the results should be '30' and '12'
My guess is that MySQL is returning the characters '30' and '12'.
But we should be very suspicious, and note the hex representation of the ASCII encoding of those characters
x'30' -> '0'
x'31' -> '1'
x'32' -> '2'
x'33' -> '3'
As a demonstration
SELECT HEX('30'), HEX('12')
returns
HEX('30') HEX('12')
--------- ---------
3330 3132
I don't think MySQL is returning "bad" results. I suspect that the column metadata for the columns is confusing the client. (We do note that each of the columns is a mix of two different datatypes being UNION'd. On one row, the datatype is string/varchar (an empty string), and on the other row it is integer/numeric (the result of the COUNT() aggregate).)
And I'm not sure what the resultset metadata for the columns ends up as.
I suspect that the issue is with the client's interpretation of the resultset metadata when determining the datatype of the columns, and that the client is deciding that the most appropriate way to display the values is as a hex representation of the raw bytes.
Personally, I would avoid returning a UNION result of different/incompatible datatypes. I'd prefer the datatypes be consistent.
If I had to do the UNION of incompatible datatypes, I would include an explicit conversion into compatible/appropriate datatypes.
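As a sketch of what such an explicit conversion could look like, both branches below return character data in both columns, so the client no longer has to reconcile a string with an integer. Using UNION ALL instead of UNION is my own choice here, since the two rows can never be duplicates of each other.

```sql
-- Both columns are character data in both branches of the UNION.
SELECT CAST(COUNT(*) AS CHAR) AS Piezas
     , ''                     AS Motor
  FROM parque
 WHERE `parque`.`CausasParalizacion` = 2
UNION ALL
SELECT ''
     , CAST(COUNT(*) AS CHAR)
  FROM parque
 WHERE `parque`.`CausasParalizacion` = 3
```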
But once I am at that point, I have to question why I need any of that rigmarole with the mismatched datatypes, why we need to return two separate rows, when we could just return a single row (probably more efficiently to boot)
SELECT SUM( p.`CausasParalizacion` = 2 ) AS Piezas
, SUM( p.`CausasParalizacion` = 3 ) AS Motor
FROM parque p
WHERE p.`CausasParalizacion` IN (2,3)
To avoid the aggregate functions returning NULL,
we can wrap the aggregate expressions in an IFNULL (or ANSI-standard COALESCE) function.
SELECT IFNULL(SUM( p.`CausasParalizacion` = 2 ),0) AS Piezas
, IFNULL(SUM( p.`CausasParalizacion` = 3 ),0) AS Motor
FROM parque p
WHERE p.`CausasParalizacion` IN (2,3)
-or-
we could use a COUNT() of an expression that is either NULL or non-NULL
SELECT COUNT(IF( p.`CausasParalizacion` = 2 ,1,NULL)) AS Piezas
, COUNT(IF( p.`CausasParalizacion` = 3 ,1,NULL)) AS Motor
FROM parque p
WHERE p.`CausasParalizacion` IN (2,3)
If, for some reason it turns out it is faster to run two separate SELECT statements, we could still combine the results into a single row. For example:
SELECT s.Piezas
, t.Motor
FROM ( SELECT COUNT(*) AS Piezas
FROM parque p
WHERE p.`CausasParalizacion` = 2
) s
CROSS
JOIN ( SELECT COUNT(*) AS Motor
FROM parque q
WHERE q.`CausasParalizacion` = 3
) t
Spencer, I think the problem was about encoding. E.g., when I run the query in the console, the result is as expected, but in phpMyAdmin it is not.
However, I must say that your first solution works perfectly, Thanks a lot bro.

mysql count distinct value

I am having trouble working out how to count distinct values using IF in the select column.
I have SQLFIDDLE here
http://sqlfiddle.com/#!2/6bfb9/3
Records shows:
create table team_record (
id tinyint,
project_id int,
position varchar(45)
);
insert into team_record values
(1,1, 'Junior1'),
(2,1, 'Junior1'),
(3,1, 'Junior2'),
(4,1, 'Junior3'),
(5,1, 'Senior1'),
(6,1, 'Senior1'),
(8,1, 'Senior2'),
(9,1, 'Senior2'),
(10,1,'Senior3'),
(11,1, 'Senior3'),
(12,1, 'Senior3')
I need to count all the distinct values among the Junior and Senior positions.
Identical values should count as 1.
I need to see a result something like this.
PROJECT_ID SENIOR_TOTAL JUNIOR_TOTAL
1 3 3
My MySQL query is this, but it does not produce the result above.
SELECT
`team_record`.`project_id`,
`position`,
SUM(IF(position LIKE 'Senior%',
1,
0)) AS `Senior_Total`,
SUM(IF(position LIKE 'Junior%',
1,
0)) AS `Junior_Total`
FROM
(`team_record`)
WHERE
project_id = '1'
GROUP BY `team_record`.`project_id`
maybe you could help me fix my query above to get the result I need.
thanks
I think you want this:
SELECT
project_id,
COUNT(DISTINCT CASE when position LIKE 'Senior%' THEN position END) Senior_Total,
COUNT(DISTINCT CASE when position LIKE 'Junior%' THEN position END) Junior_Total
FROM team_record
WHERE project_id = 1
GROUP BY project_id
The CASE will return a null if the WHEN is false (ie ELSE NULL is the default, which I omitted for brevity), and nulls aren't counted in DISTINCT.
Also, unnecessary back ticks, brackets and qualification removed.
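The way COUNT(DISTINCT ...) skips the NULLs produced by CASE can be checked in isolation with a few inline rows. The derived table and the column name pos below are made up for illustration.

```sql
SELECT COUNT(DISTINCT CASE WHEN pos LIKE 'Senior%' THEN pos END) AS senior_total
FROM (
    SELECT 'Senior1' AS pos
    UNION ALL SELECT 'Senior1'   -- duplicate, counted once
    UNION ALL SELECT 'Senior2'
    UNION ALL SELECT 'Junior1'   -- CASE yields NULL here, so it is ignored
) AS sample_rows;
-- senior_total = 2
```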