How to get counts using the `IN` operator - mysql

I am trying to use the IN operator to get the count of certain fields in the table.
This is my query:
SELECT order_id, COUNT(*)
FROM remake_error_type
WHERE order_id IN (1, 2, 100)
GROUP BY order_id;
My current output:
| order_id | COUNT(*) |
+----------+----------+
| 1 | 8 |
| 2 | 8 |
My expected output:
| order_id | COUNT(*) |
+----------+----------+
| 1 | 8 |
| 2 | 8 |
| 100 | 0 |

You can write your query this way:
SELECT t.id, COUNT(remake_error_type.order_id)
FROM
(SELECT 1 AS id UNION ALL SELECT 2 UNION ALL SELECT 100) as t
LEFT JOIN remake_error_type
ON t.id = remake_error_type.order_id
GROUP BY
t.id
A LEFT JOIN returns all rows from the subquery on the left, and COUNT(remake_error_type.order_id) counts only the rows where the join succeeds, because COUNT(column) ignores NULLs.
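The difference between COUNT(*) and COUNT(column) is what makes the zero appear. Here is a minimal sketch, assuming a throwaway table named demo_errors (a hypothetical stand-in for remake_error_type):

```sql
-- Hypothetical stand-in for remake_error_type; only order_id matters here.
CREATE TEMPORARY TABLE demo_errors (order_id INT);
INSERT INTO demo_errors (order_id) VALUES (1), (1), (2);

SELECT t.id,
       COUNT(*)          AS count_star,   -- counts joined rows, including the NULL-padded one
       COUNT(e.order_id) AS count_column  -- skips NULLs, so unmatched ids give 0
FROM (SELECT 1 AS id UNION ALL SELECT 2 UNION ALL SELECT 100) AS t
LEFT JOIN demo_errors AS e ON e.order_id = t.id
GROUP BY t.id;
```

For id 100 the left join still produces one row (with NULLs on the right side), so count_star is 1 while count_column is 0.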

You can create a temporary table, insert as many order_ids as required, and left join it to remake_error_type. For a small number of orders the other answers are sufficient, but for a large number of orders, UNION ALL and subqueries are inefficient, both to type and to execute on the server.
Additionally, this approach is very flexible, because you can easily control the values in the temporary table by modifying the INSERT statement.
However, it only works if the database user has sufficient privileges: at least SELECT, CREATE TEMPORARY TABLES and DROP.
DROP TABLE IF EXISTS myTempOrders;
CREATE TEMPORARY TABLE myTempOrders (order_id INTEGER, PRIMARY KEY(order_id));
INSERT INTO myTempOrders (order_id) VALUES (1), (2), (100);
SELECT temp.order_id, COUNT(remake_error_type.order_id) -- count the joined column so unmatched ids give 0
FROM myTempOrders temp
LEFT JOIN remake_error_type ON temp.order_id = remake_error_type.order_id
GROUP BY temp.order_id;
If the order_id values exist as keys in some other table, you can get the desired result without creating a temporary table and inserting values into it.
To qualify, the table must:
have an auto-increment primary key whose number of rows is at least the maximum sought order_id value
have a starting increment value no greater than the minimum sought order_id value
have no missing values in the primary key (i.e. no records have been deleted)
If a qualified table exists, you can run the following query, replacing my_qualified_table with the qualified table's name and surrogate_id with its auto-incrementing primary key:
SELECT surrogate.surrogate_id, COUNT(remake_error_type.order_id) -- count the joined column so unmatched ids give 0
FROM my_qualified_table surrogate
LEFT JOIN remake_error_type ON surrogate.surrogate_id = remake_error_type.order_id
WHERE surrogate.surrogate_id IN (1, 2, 100)
GROUP BY surrogate.surrogate_id;

You could use a union for this. No, this does not use the IN operator, but it is an alternative that will give you your expected results. One option is to hardcode the order_id and use conditional aggregation to get the SUM() of rows with that id:
SELECT 1 AS order_id, SUM(order_id = 1) AS numOrders FROM myTable
UNION ALL
SELECT 2 AS order_id, SUM(order_id = 2) AS numOrders FROM myTable
UNION ALL
SELECT 100 AS order_id, SUM(order_id = 100) AS numOrders FROM myTable;
Here is an SQL Fiddle example.
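In MySQL a comparison evaluates to 1 or 0, which is why SUM(order_id = 100) acts as a conditional count. The same idea can be written as a single pass that returns one column per id instead of one row per id. This is only a sketch: demo_orders is a hypothetical table, and the COALESCE() is there because SUM() over zero rows yields NULL rather than 0.

```sql
-- Hypothetical stand-in for myTable.
CREATE TEMPORARY TABLE demo_orders (order_id INT);
INSERT INTO demo_orders (order_id) VALUES (1), (1), (2);

-- One scan of the table; each SUM() adds 1 for every matching row.
SELECT COALESCE(SUM(order_id = 1), 0)   AS orders_1,
       COALESCE(SUM(order_id = 2), 0)   AS orders_2,
       COALESCE(SUM(order_id = 100), 0) AS orders_100
FROM demo_orders;
```

With the three sample rows this should return 2, 1 and 0.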

Related

count of individual column with group by on multiple columns

I have two columns, account_number and customer_id. A single customer can have multiple accounts, but a single account can't have multiple customers.
I have loaded a file containing each account_num and its corresponding customer_id into the database with the LOAD DATA INFILE command. Now I am trying to check with a query whether any account that appears multiple times in the file has the same customer_id or different customer_ids across its rows.
REQUIREMENT: I want to return those accounts that appear multiple times with different customer ids.
I tried GROUP BY, but didn't get the desired result.
This is my query, which does not give the desired result:
SELECT ACCOUNT_NUM,UNIQUE_CUSTOMER_ID,COUNT(UNIQUE_CUSTOMER_ID)
FROM LINKAGE_FILE
GROUP BY ACCOUNT_NUM, UNIQUE_CUSTOMER_ID
HAVING COUNT(ACCOUNT_NUM) > 1 AND COUNT(UNIQUE_CUSTOMER_ID) = 1;
Hope I am clear.
You can get the number of distinct customer ids for every ACCOUNT_NUM using COUNT(DISTINCT ...) and keep only the accounts where that count is greater than 1, inside the HAVING clause:
SELECT
ACCOUNT_NUM,
COUNT(DISTINCT UNIQUE_CUSTOMER_ID) AS unique_customer_count
FROM LINKAGE_FILE
GROUP BY ACCOUNT_NUM
HAVING unique_customer_count > 1;
Drop the customer check into a join query like so:
DROP TABLE if exists t;
create table t(accountid int,cid int);
insert into t values
(1,1),(1,2),(1,1),(2,3),(3,4),(3,4);
select distinct t.accountid,t.cid
from t
join
(
select accountid,count(distinct cid) cids
from t
group by accountid having cids > 1
) s on s.accountid = t.accountid;
+-----------+------+
| accountid | cid |
+-----------+------+
| 1 | 1 |
| 1 | 2 |
+-----------+------+
2 rows in set (0.00 sec)
You can use EXISTS :
SELECT lf.*
FROM LINKAGE_FILE lf
WHERE EXISTS (SELECT 1 FROM LINKAGE_FILE lf1 WHERE lf1.ACCOUNT_NUM = lf.ACCOUNT_NUM AND lf1.UNIQUE_CUSTOMER_ID <> lf.UNIQUE_CUSTOMER_ID);
However, you can also use aggregation with your query:
SELECT ACCOUNT_NUM, COUNT(DISTINCT UNIQUE_CUSTOMER_ID)
FROM LINKAGE_FILE
GROUP BY ACCOUNT_NUM
HAVING COUNT(DISTINCT UNIQUE_CUSTOMER_ID) > 1;
This way, you get only the ACCOUNT_NUMs that have two or more distinct customer ids.
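If it helps to see which customer ids collide, the aggregate query can also list them with GROUP_CONCAT(). This is a sketch on a hypothetical demo_linkage table with the same two columns, not part of the answers above:

```sql
-- Hypothetical stand-in for LINKAGE_FILE with a few sample rows.
CREATE TEMPORARY TABLE demo_linkage (ACCOUNT_NUM INT, UNIQUE_CUSTOMER_ID INT);
INSERT INTO demo_linkage VALUES (1,1),(1,2),(1,1),(2,3),(3,4),(3,4);

SELECT ACCOUNT_NUM,
       COUNT(DISTINCT UNIQUE_CUSTOMER_ID) AS customer_count,
       -- comma-separated list of the distinct customer ids seen for this account
       GROUP_CONCAT(DISTINCT UNIQUE_CUSTOMER_ID ORDER BY UNIQUE_CUSTOMER_ID) AS customer_ids
FROM demo_linkage
GROUP BY ACCOUNT_NUM
HAVING COUNT(DISTINCT UNIQUE_CUSTOMER_ID) > 1;
```

With the sample rows only account 1 qualifies, with customer ids 1 and 2.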

Count group by enum including possible enum values that have 0 count

I have a table of items. One of the fields is a category (represented by an enum). Some categories have zero items.
So I did this:
select category, count(*) as total from items group by category;
+------------+-------+
| category | total |
+------------+-------+
| one | 6675 |
+------------+-------+
I want to generate a table like this (where two is the other possible enum value):
+------------+-------+
| category | total |
+------------+-------+
| one | 6675 |
+------------+-------+
| two | 0 |
+------------+-------+
How do I do this with a MySQL query?
The Enum datatype is generally preferred when the possible options (values) are few (preferably <= 10) and you are not going to add new options in the future (at least not very frequently). So, a good use case for Enum is gender: (m, f, n). In your case, it would generally be better to have a master table of all possible categories instead of using an Enum for them. Then it is easier to do a LEFT JOIN from the master table.
However, as asked by you:
"A solution uses the enum type to generate the table, and includes 0 entries"
Works for all MySQL/MariaDB versions:
We will need to get the list of all possible Enum values from INFORMATION_SCHEMA.COLUMNS:
SELECT
SUBSTRING(COLUMN_TYPE, 6, CHAR_LENGTH(COLUMN_TYPE) - 6) AS enum_values
FROM
information_schema.COLUMNS
WHERE
TABLE_NAME = 'items' -- your table name
AND
COLUMN_NAME = 'category' -- name of the column
AND
TABLE_SCHEMA = 'your_db' -- name of the database (schema)
But then, this query will give you all the enum values in comma-separated string, like below:
'one','two','three','four'
Now, we need to convert this string into multiple rows. To achieve that, we can use a sequence (number series) table. You can define a permanent table in your database storing integers from 1 to 100; you may find this table helpful in many other cases as well. Another approach is to use a derived table; see https://stackoverflow.com/a/58052199/2469308 for the idea.
CREATE TABLE seq (n tinyint(3) UNSIGNED NOT NULL, PRIMARY KEY(n));
INSERT INTO seq (n) VALUES (1), (2), ...... , (99), (100);
Now, we will do a JOIN between "enum values string" and seq table, based on the position of comma, to extract enum values into different rows. Note that instead of just using , (comma) to extract enum values, we would use ',' (to avoid cases when there might be a comma inside the value string). String operations utilizing Substring_Index(), Trim(), Char_Length() etc functions can be used to extract enum values. You can check this answer to get a general idea about this technique:
Schema (View on DB Fiddle)
CREATE TABLE items
(id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
category ENUM('one','two','three','four'),
item_id INT UNSIGNED) ENGINE=InnoDB;
INSERT INTO items (category, item_id)
VALUES ('one', 1),
('two', 2),
('one', 2),
('one', 3);
CREATE TABLE seq (n tinyint(3) UNSIGNED NOT NULL,
PRIMARY KEY(n));
INSERT INTO seq (n) VALUES (1),(2),(3),(4),(5);
Query #1
SELECT Trim(BOTH '\'' FROM Substring_index(Substring_index(e.enum_values,
'\',\'',
seq.n),
'\',\'', -1)) AS cat
FROM (SELECT Substring(column_type, 6, Char_length(column_type) - 6) AS
enum_values
FROM information_schema.columns
WHERE table_name = 'items'
AND column_name = 'category'
AND table_schema = 'test') AS e
JOIN seq
ON ( Char_length(e.enum_values) - Char_length(REPLACE(e.enum_values,
'\',\'',
''))
) / 3 >= seq.n - 1
| cat |
| ----- |
| one |
| two |
| three |
| four |
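To see how the nested Substring_index() calls in Query #1 pick out a single value, it can help to run them on a literal string first. The sketch below assumes the same string shape as the extracted enum_values; the user variables @enum_values and @n are only for illustration.

```sql
SET @enum_values = '\'one\',\'two\',\'three\',\'four\'';
SET @n = 1;  -- position of the wanted value (plays the role of seq.n)

SELECT
  -- everything before the n-th occurrence of the ',' delimiter
  SUBSTRING_INDEX(@enum_values, '\',\'', @n) AS up_to_nth,
  -- of that, everything after the last delimiter: the n-th value, possibly with a stray quote
  SUBSTRING_INDEX(SUBSTRING_INDEX(@enum_values, '\',\'', @n), '\',\'', -1) AS nth_raw,
  -- strip the leftover single quote at either end
  TRIM(BOTH '\'' FROM
       SUBSTRING_INDEX(SUBSTRING_INDEX(@enum_values, '\',\'', @n), '\',\'', -1)) AS nth_value;
```

For @n = 1 the raw value still carries the leading quote ('one), which is why the outer Trim() is needed; changing @n to 2, 3 or 4 walks through the remaining values.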
Now, the hard part is done. All we need to do is do a LEFT JOIN from this subquery (having all category enum values) to your items table, to get Count per category.
The final query follows (View on DB Fiddle):
SELECT all_cat.cat AS category,
Count(i.item_id) AS total
FROM (SELECT Trim(BOTH '\'' FROM Substring_index(
Substring_index(e.enum_values,
'\',\'',
seq.n),
'\',\'', -1)) AS cat
FROM (SELECT Substring(column_type, 6, Char_length(column_type) - 6)
AS
enum_values
FROM information_schema.columns
WHERE table_name = 'items'
AND column_name = 'category'
AND table_schema = 'test') AS e
JOIN seq
ON ( Char_length(e.enum_values) - Char_length(
REPLACE(e.enum_values,
'\',\'',
''))
) / 3 >= seq.n - 1) AS all_cat
LEFT JOIN items AS i
ON i.category = all_cat.cat
GROUP BY all_cat.cat
ORDER BY total DESC;
Result
| category | total |
| -------- | ----- |
| one | 3 |
| two | 1 |
| three | 0 |
| four | 0 |
Here is some fun with MySQL 8.0 and JSON_TABLE():
select c.category, count(i.category) as total
from information_schema.COLUMNS s
join json_table(
replace(replace(replace(trim('enum' from s.COLUMN_TYPE),'(','['),')',']'),'''','"'),
'$[*]' columns (category varchar(50) path '$')
) c
left join items i on i.category = c.category
where s.TABLE_SCHEMA = 'test' -- replace with your db/schema name
and s.TABLE_NAME = 'items'
and s.COLUMN_NAME = 'category'
group by c.category
It converts the ENUM type definition from information_schema to a JSON array, which is then converted by JSON_TABLE() to a table, which you then can use for a LEFT JOIN.
See demo on db-fiddle
Note: The categories should not contain any characters from ()[]'".
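The conversion step can be checked in isolation on a literal type definition. A small sketch, assuming MySQL 8.0 or later and a made-up @column_type value in the same form that information_schema returns:

```sql
-- Requires MySQL 8.0+ for JSON_TABLE().
SET @column_type = 'enum(\'one\',\'two\',\'three\')';

-- Step 1: turn the ENUM definition into JSON array text.
SELECT REPLACE(REPLACE(REPLACE(TRIM('enum' FROM @column_type), '(', '['), ')', ']'), '\'', '"') AS as_json;

-- Step 2: expand that JSON array into one row per value.
SELECT jt.category
FROM JSON_TABLE(
       REPLACE(REPLACE(REPLACE(TRIM('enum' FROM @column_type), '(', '['), ')', ']'), '\'', '"'),
       '$[*]' COLUMNS (category VARCHAR(50) PATH '$')
     ) AS jt;
```

The first statement should show ["one","two","three"], and the second should return the three values as separate rows.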
But seriously – Just create the categories table. There are more reasons to do that. For example you might want to render a drop-down menu with all possible categories. That would be simple with
select category from categories
I would say that it's basically bad practice to encode your enumerations into the script. Therefore, create a table with the enumerations present (and their relative keys), then it's a simple case of grouping a left joined query...
SELECT
cat.enum_name,
COUNT(data.id) AS total
FROM
category_table cat
LEFT JOIN
data_table data
ON cat.cate_id = data.cat_id
GROUP BY
cat.enum_name
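For completeness, here is one way the two tables behind that query could look. The column types, the foreign key and the sample rows are assumptions made for illustration; only the table and column names come from the query above.

```sql
-- Lookup table holding every possible category.
CREATE TABLE category_table (
  cate_id   INT UNSIGNED NOT NULL PRIMARY KEY,
  enum_name VARCHAR(50)  NOT NULL UNIQUE
);

-- Data table referencing the lookup table instead of using an ENUM column.
CREATE TABLE data_table (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  cat_id INT UNSIGNED NOT NULL,
  FOREIGN KEY (cat_id) REFERENCES category_table (cate_id)
);

INSERT INTO category_table (cate_id, enum_name) VALUES (1, 'one'), (2, 'two');
INSERT INTO data_table (cat_id) VALUES (1), (1), (1);  -- no rows for category 'two'
```

Run against this data, the grouped LEFT JOIN should report 3 for 'one' and 0 for 'two'.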
Using an in-select (correlated) subquery:
select cat.categoryname,
(
select count(*) -- count total
from items as i
where i.category = cat.category -- connect
) as totalcount
from cat
order by cat.categoryname
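You can build a made-up dataset of the different categories and left join your original table to it, as shown below. Count a column from items (not COUNT(*)) and group by the derived table's column, so that categories without items show 0 instead of 1 and do not collapse into a single NULL group.
SELECT A.Category, COUNT(B.Category) AS total FROM
(SELECT 'one' AS Category
UNION ALL
SELECT 'two' AS Category) A
LEFT JOIN items B
ON A.Category = B.Category
GROUP BY A.Category;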
If you would prefer to get the list of all categories dynamically, save them in another table (say All_category_table) and join to it as shown below:
SELECT A.Category, COUNT(B.Category) AS total FROM
(SELECT Category FROM All_category_table) A
LEFT JOIN items B
ON A.Category = B.Category
GROUP BY A.Category;
This answer is applicable for when you do not have another table holding the possible category values.
Let's say you have a table called real_table with a category column that is NOT NULL and constrained to a fixed set of values. In this column you can, in theory, encounter 5 different values: 'CATEGORY_0', 'CATEGORY_1', 'CATEGORY_2', 'CATEGORY_3', 'CATEGORY_4':
CREATE TABLE real_table
(
id VARCHAR(255) NOT NULL
PRIMARY KEY,
category VARCHAR(255) NOT NULL
CONSTRAINT category_in CHECK (
category in ('CATEGORY_0',
'CATEGORY_1',
'CATEGORY_2',
'CATEGORY_3',
'CATEGORY_4')
)
);
But your actual data set in the table does not include any row with value 'CATEGORY_0'. So when you run a query such as:
SELECT real_table.category AS category, COUNT(*) AS cnt
FROM real_table
GROUP BY real_table.category;
you will see that you get a result like this:
| category   | cnt |
| ---------- | --- |
| CATEGORY_1 | 150 |
| CATEGORY_2 | 20  |
| CATEGORY_3 | 12  |
| CATEGORY_4 | 1   |
Hmm, the 'CATEGORY_0' is omitted. Not good.
Since your categories are not backed by another table, then you must create an artificial dataset of the possible categories that looks as below:
SELECT 'CATEGORY_0' AS category_entry
UNION ALL
SELECT 'CATEGORY_1' AS category_entry
UNION ALL
SELECT 'CATEGORY_2' AS category_entry
UNION ALL
SELECT 'CATEGORY_3' AS category_entry
UNION ALL
SELECT 'CATEGORY_4' AS category_entry;
You can use this in your original query as a table to do a right join on:
SELECT all_categories.category_entry AS category,
COUNT(real_table.id) AS cnt -- important to count some non-null value, such as PK of the real_table
FROM real_table
RIGHT JOIN
(SELECT 'CATEGORY_0' AS category_entry -- not present in any row of 'real_table'
UNION ALL
SELECT 'CATEGORY_1' AS category_entry
UNION ALL
SELECT 'CATEGORY_2' AS category_entry
UNION ALL
SELECT 'CATEGORY_3' AS category_entry
UNION ALL
SELECT 'CATEGORY_4' AS category_entry) all_categories
ON real_table.category = all_categories.category_entry
GROUP BY all_categories.category_entry;
Now when you run the query, you should get the desired output:
| category   | cnt |
| ---------- | --- |
| CATEGORY_0 | 0   |
| CATEGORY_1 | 150 |
| CATEGORY_2 | 20  |
| CATEGORY_3 | 12  |
| CATEGORY_4 | 1   |
The 'CATEGORY_0' is now included with zero cnt. Nice.
Now let's say that the category column is nullable and unconstrained, so it can also contain unexpected category values (e.g. 'CATEGORY_66'):
CREATE TABLE real_table
(
id VARCHAR(255) NOT NULL
PRIMARY KEY,
category VARCHAR(255) -- nullable and no constraint for valid values
);
We would like to include these null and unexpected category counts in the result set as well.
Then we must prepare the artificial dataset of the possible categories differently:
SELECT DISTINCT all_categories.category_entry
FROM (SELECT 'CATEGORY_0' AS category_entry -- not present in any row of 'real_table'
UNION ALL
SELECT 'CATEGORY_1' AS category_entry
UNION ALL
SELECT 'CATEGORY_2' AS category_entry
UNION ALL
SELECT 'CATEGORY_3' AS category_entry
UNION ALL
SELECT 'CATEGORY_4' AS category_entry
UNION ALL
SELECT DISTINCT category AS category_entry
FROM real_table) all_categories;
and use it as before:
SELECT distinct_categories.category_entry AS category,
COUNT(real_table.id) AS cnt -- important to count some non-null value, such as PK of the real_table
FROM real_table
RIGHT JOIN
(SELECT DISTINCT all_categories.category_entry
FROM (SELECT 'CATEGORY_0' AS category_entry -- not present in any row of 'real_table'
UNION ALL
SELECT 'CATEGORY_1' AS category_entry
UNION ALL
SELECT 'CATEGORY_2' AS category_entry
UNION ALL
SELECT 'CATEGORY_3' AS category_entry
UNION ALL
SELECT 'CATEGORY_4' AS category_entry
UNION ALL
SELECT DISTINCT category AS category_entry
FROM real_table) all_categories) distinct_categories
ON real_table.category <=> distinct_categories.category_entry -- null-safe equality, so rows with a NULL category are matched too
GROUP BY distinct_categories.category_entry;
Now when you run the query, the output should also include counts for additional categories and null categories:
| category    | cnt |
| ----------- | --- |
| CATEGORY_0  | 0   |
| CATEGORY_1  | 150 |
| CATEGORY_2  | 20  |
| CATEGORY_3  | 12  |
| CATEGORY_4  | 1   |
| CATEGORY_66 | 13  |
| NULL        | 10  |
Both the unexpected 'CATEGORY_66' (with 13 entries) and the null category (with 10 entries) are now included in the result set.
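On MySQL specifically, whether the rows with a NULL category are counted depends on how the join compares the two sides. A plain = never treats NULL as equal to NULL, while the null-safe operator <=> does. A quick way to check this, with no tables involved:

```sql
SELECT NULL = NULL   AS plain_equals,      -- NULL: the comparison is unknown, so a join on = skips the pair
       NULL <=> NULL AS null_safe_equals,  -- 1: treated as a match
       'a'  <=> NULL AS mixed;             -- 0: a value never matches NULL
```

So a join condition written with = would leave the NULL category with a count of 0, whereas <=> lets its rows be counted.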
I cannot vouch for the performance of the provided queries - somebody more experienced might weigh in on that?

Find which values are not in table

Simple question, but I'm drawing a blank. Any help is appreciated.
I have a table of ids:
-------
| ids |
-------
| 1 |
| 5 |
| 7 |
-------
Except the actual table is thousands of entries long.
I have a list (x), not a table, of other ids, say 2, 6, 7. I need to see which ids from x are not in the ids table.
I need to get back (2,6).
I tried something like this:
SELECT id FROM ids WHERE id IN (2,6,7) GROUP BY id HAVING COUNT(*) = 0;
However, COUNT(*) returns count of retrieved rows only, it doesn't return 0.
Any suggestions?
Create a temporary table, insert the IDs that you need into it, and run a join, like this:
CREATE TEMPORARY TABLE temp_wanted (id BIGINT);
INSERT INTO temp_wanted(id) VALUES (2),(6),(7);
SELECT t.id
FROM temp_wanted t
LEFT OUTER JOIN ids i ON i.id=t.id
WHERE i.id IS NULL;
Try something with "NOT IN" clause:
select * from
(SELECT 2 as id
UNION ALL
SELECT 6 as id
UNION ALL
SELECT 7 as id) mytable
WHERE ID not in (SELECT id FROM ids)
See fiddle here
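One caveat with NOT IN is worth knowing: if the subquery can return a NULL, NOT IN yields no rows at all. A NOT EXISTS version does not have that problem. The sketch below uses a hypothetical demo_ids table that deliberately contains a NULL:

```sql
-- Hypothetical stand-in for the ids table, with one NULL to show the pitfall.
CREATE TEMPORARY TABLE demo_ids (id INT NULL);
INSERT INTO demo_ids (id) VALUES (1), (5), (7), (NULL);

-- NOT IN: the NULL makes every comparison unknown, so nothing is returned.
SELECT w.id
FROM (SELECT 2 AS id UNION ALL SELECT 6 UNION ALL SELECT 7) AS w
WHERE w.id NOT IN (SELECT id FROM demo_ids);

-- NOT EXISTS: unaffected by the NULL, returns the ids missing from demo_ids.
SELECT w.id
FROM (SELECT 2 AS id UNION ALL SELECT 6 UNION ALL SELECT 7) AS w
WHERE NOT EXISTS (SELECT 1 FROM demo_ids AS i WHERE i.id = w.id);
```

The second query should return 2 and 6. If the id column is declared NOT NULL, both forms behave the same.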

MySQL: how to increase speed of a select query with 2 joins and 1 subquery

In a table 'ttraces' I have many records for different tasks (whose value is held in 'taskid' column and is a foreign key of a column 'id' in a table 'ttasks'). Each task inserts a record to 'ttraces' every 8-10 seconds, so caching data to increase performance is not a good idea. What I need is to select only the newest records for each task from 'ttraces', that means the records with the maximum value of the column 'time'. At the moment, I have over 500000 records in the table. The very simplified structure of these two tables looks as follows:
-----------------------
| ttasks |
-----------------------
| id | name | blocked |
-----------------------
---------------------
| ttraces |
---------------------
| id | taskid | time |
---------------------
And my query is shown below:
SELECT t.name,tr.time
FROM
ttraces tr
JOIN
ttasks t ON tr.taskid = t.id
JOIN (
SELECT taskid, MAX(time) AS max_time
FROM ttraces
GROUP BY taskid
) x ON tr.taskid = x.taskid AND tr.time = x.max_time
WHERE t.blocked
All columns used in the WHERE and JOIN clauses are indexed. At the moment the query runs for ~1.5 seconds. It's crucial to increase its speed. Thanks for all suggestions. BTW: the database is running on a hosted, shared server and I can't move it anywhere else for the moment.
[EDIT]
EXPLAIN SELECT... results are:
| id | select_type | table      | type   | possible_keys | key     | key_len | ref        | rows   | Extra       |
| -- | ----------- | ---------- | ------ | ------------- | ------- | ------- | ---------- | ------ | ----------- |
| 1  | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL       | 74     |             |
| 1  | PRIMARY     | t          | eq_ref | PRIMARY       | PRIMARY | 4       | x.taskid   | 1      | Using where |
| 1  | PRIMARY     | tr         | ref    | taskid,time   | time    | 9       | x.max_time | 1      | Using where |
| 2  | DERIVED     | ttraces    | index  | NULL          | itask   | 5       | NULL       | 570853 |             |
The engine is InnoDB.
I may be having a bit of a moment, but is this query not logically the same, and (almost certainly) faster?
SELECT t.id, t.name,max(tr.time)
FROM
ttraces tr
JOIN
ttasks t ON tr.taskid = t.id
where BLOCKED
group by t.id, t.name
Here's my idea... You need one composite index on ttraces covering the taskid and time columns (in that order). Then, use this query:
SELECT t.name,
trm.mtime
FROM ttasks AS t
JOIN (SELECT taskid,
Max(time) AS mtime
FROM ttraces
GROUP BY taskid) AS trm
ON t.id = trm.taskid
WHERE t.blocked
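Creating such a composite index could look like the sketch below. It uses a hypothetical demo_ttraces table so it can be run on its own; the index name and column types are made up.

```sql
-- demo_ttraces is a hypothetical stand-in for ttraces.
CREATE TABLE demo_ttraces (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  taskid INT UNSIGNED NOT NULL,
  `time` DATETIME     NOT NULL
) ENGINE=InnoDB;

-- Column order matters: taskid first (the GROUP BY column), then time (the MAX() column).
CREATE INDEX idx_taskid_time ON demo_ttraces (taskid, `time`);

-- Check the plan of the grouped subquery on its own.
EXPLAIN
SELECT taskid, MAX(`time`) AS mtime
FROM demo_ttraces
GROUP BY taskid;
```

With an index in this order MySQL can often answer the MAX() per group from the index alone; on a populated table the Extra column of EXPLAIN may then show "Using index for group-by", though the optimizer's choice depends on the data.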
Does this code return the correct result? If so, how fast is it?
SELECT t.name, max_time
FROM ttasks t JOIN (
SELECT taskid, MAX(time) AS max_time
FROM ttraces
GROUP BY taskid
) x ON t.id = x.taskid
If there are many traces for each task then you can keep a table with only the newest traces. Whenever you insert into ttraces you also upsert into ttraces_newest:
insert into ttraces_newest (id, taskid, time) values
(3, 1, '2012-01-01 08:02:01')
on duplicate key update
id = values(id),
`time` = values(`time`)
The primary key of ttraces_newest would be taskid alone, so that each task keeps exactly one row and a new trace for the same task triggers the update instead of adding a row. Querying ttraces_newest would be cheaper. How much cheaper depends on how many traces there are for each task. Now the query is:
SELECT t.name,tr.time
FROM
ttraces_newest tr
JOIN
ttasks t ON tr.taskid = t.id
WHERE t.blocked
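A runnable sketch of the "newest trace per task" table, assuming it is keyed on taskid alone so that every task has at most one row. The table name demo_ttraces_newest and the column types are assumptions for illustration.

```sql
-- One row per task: taskid is the primary key.
CREATE TABLE demo_ttraces_newest (
  taskid INT      NOT NULL PRIMARY KEY,
  id     INT      NOT NULL,
  `time` DATETIME NOT NULL
);

-- First trace for task 1: plain insert.
INSERT INTO demo_ttraces_newest (taskid, id, `time`)
VALUES (1, 3, '2012-01-01 08:02:01')
ON DUPLICATE KEY UPDATE id = VALUES(id), `time` = VALUES(`time`);

-- Later trace for the same task: the key collides, so the existing row is overwritten.
INSERT INTO demo_ttraces_newest (taskid, id, `time`)
VALUES (1, 4, '2012-01-01 08:02:10')
ON DUPLICATE KEY UPDATE id = VALUES(id), `time` = VALUES(`time`);

SELECT taskid, id, `time` FROM demo_ttraces_newest;
```

After both statements the table should hold a single row for task 1 with id 4 and the later timestamp. Note that VALUES() inside ON DUPLICATE KEY UPDATE is deprecated in recent MySQL 8.0 releases in favour of a row alias, but it still works.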

return null row in mysql if record not found for the given id

Hi, I am using the MySQL query below:
SELECT *
FROM particulars pp
WHERE (pp.SnoFK IN (108,999999)
AND pp.curMonth = STR_TO_DATE('01/02/2012', '%d/%m/%Y'))
In my table I have a record only for 108, so it returns only one row, for 108.
Is there any option in MySQL to return two rows, including one for the id that is not in the table, like this:
1.108 | *
2.999999 | null values
I have no better idea:
http://sqlfiddle.com/#!2/82cc5/2
SELECT
ids.id,
particulars.*
FROM ( SELECT 108 AS id
UNION SELECT 1122 AS id
UNION SELECT 999999 AS id
) AS ids -- create a "table" with the required numbers
LEFT JOIN particulars ON particulars.SnoFK = ids.id
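If the curMonth filter from the question is still needed, it has to go into the ON clause rather than a WHERE clause; otherwise the NULL-padded row for the missing id is filtered out again. A sketch using a hypothetical demo_particulars table (the amount column is invented for illustration):

```sql
-- Hypothetical stand-in for particulars.
CREATE TEMPORARY TABLE demo_particulars (SnoFK INT, curMonth DATE, amount DECIMAL(10,2));
INSERT INTO demo_particulars VALUES (108, '2012-02-01', 50.00);

SELECT ids.id, pp.*
FROM (SELECT 108 AS id UNION ALL SELECT 999999) AS ids
LEFT JOIN demo_particulars AS pp
       ON pp.SnoFK = ids.id
      AND pp.curMonth = STR_TO_DATE('01/02/2012', '%d/%m/%Y');  -- filter in ON, not WHERE
```

This should return the full row for 108 and a row of NULLs for 999999. Moving the curMonth condition to a WHERE clause would drop the 999999 row, because pp.curMonth is NULL there.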