MySQL GROUP BY column ignoring specific value - mysql

I have a query on a table of phone calls that uses GROUP BY to show only unique caller ids. The problem is that if a caller has caller id blocking, their caller id shows up as "Unknown", and the client doesn't want all Unknowns to be summed up together. So basically, instead of just GROUP BY caller_id, I need to do something like GROUP BY caller_id IF caller_id != 'Unknown'.
Is this even possible? I'd like to avoid doing all the group processing in PHP if at all possible.

You can do something like:
SELECT caller_id FROM phone_calls WHERE caller_id != 'Unknown' GROUP BY caller_id;
Or consider DISTINCT if all you need is to "show only unique caller ids". With an index on caller_id the performance is usually the same as GROUP BY; without one, DISTINCT is often faster. If you are doing aggregation or something similar you won't be able to use it, but just in case:
SELECT DISTINCT caller_id FROM phone_calls WHERE caller_id != 'Unknown';
-- EDIT AFTER discussion in comments
SELECT * FROM callers;
+----+-----------+-----------+
| id | caller_id | call_time |
+----+-----------+-----------+
| 1 | abc | 24 |
| 2 | abc | 16 |
| 3 | xyz | 10 |
| 4 | xyz | 10 |
| 5 | Unknown | 11 |
| 6 | Unknown | 12 |
| 7 | Unknown | 13 |
| 8 | xyz | 1 |
| 9 | abc | 10 |
+----+-----------+-----------+
SELECT caller_id, SUM(call_time) FROM callers
WHERE caller_id != 'Unknown'
GROUP BY caller_id;
+-----------+----------------+
| caller_id | SUM(call_time) |
+-----------+----------------+
| abc | 50 |
| xyz | 21 |
+-----------+----------------+
SELECT caller_id, SUM(call_time) FROM callers
GROUP BY caller_id;
+-----------+----------------+
| caller_id | SUM(call_time) |
+-----------+----------------+
| abc | 50 |
| Unknown | 36 |
| xyz | 21 |
+-----------+----------------+
-- UNION ALL, because a plain UNION would merge Unknown calls that share the same call_time
SELECT caller_id, SUM(call_time) as total_time FROM callers
WHERE caller_id != 'Unknown'
GROUP BY caller_id
UNION ALL
SELECT caller_id, call_time FROM callers
WHERE caller_id = 'Unknown';
+-----------+------------+
| caller_id | total_time |
+-----------+------------+
| abc | 50 |
| xyz | 21 |
| Unknown | 11 |
| Unknown | 12 |
| Unknown | 13 |
+-----------+------------+
SELECT caller_id, SUM(call_time) as total_time FROM callers
GROUP BY caller_id,
(case when caller_id = 'Unknown'
AND id is not null
then id end
);
+-----------+------------+
| caller_id | total_time |
+-----------+------------+
| abc | 50 |
| Unknown | 11 |
| Unknown | 12 |
| Unknown | 13 |
| xyz | 21 |
+-----------+------------+
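The CASE-based grouping can be tried end to end with a small script. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the function name totals_keeping_unknown_separate is hypothetical, and the rows are the sample data from the callers table above. The AND id IS NOT NULL test is left out here because id is declared as the primary key.

```python
import sqlite3

# Sample rows copied from the callers table above: (id, caller_id, call_time).
CALLS = [
    (1, "abc", 24), (2, "abc", 16), (3, "xyz", 10),
    (4, "xyz", 10), (5, "Unknown", 11), (6, "Unknown", 12),
    (7, "Unknown", 13), (8, "xyz", 1), (9, "abc", 10),
]


def totals_keeping_unknown_separate(calls):
    """Sum call_time per caller_id, leaving every 'Unknown' call in its own group."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute(
        "CREATE TABLE callers (id INTEGER PRIMARY KEY, caller_id TEXT, call_time INTEGER)"
    )
    conn.executemany("INSERT INTO callers VALUES (?, ?, ?)", calls)
    rows = conn.execute(
        """
        SELECT caller_id, SUM(call_time) AS total_time
        FROM callers
        -- The CASE is NULL for known callers (one group per caller_id) and
        -- equal to the row id for 'Unknown' (one group per call).
        GROUP BY caller_id, CASE WHEN caller_id = 'Unknown' THEN id END
        ORDER BY caller_id, total_time
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in totals_keeping_unknown_separate(CALLS):
        print(row)
```

Because the second grouping key is the row id, two Unknown calls with identical call_time values still come back as two rows.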

Related

How to select the first match of a group of conditions in MySQL

I have a table like this:
MyTable
-------------------------------
| ID | from | to |
-------------------------------
| 1 | U_002 | C_005 |
| 2 | U_015 | C_004 |
| 3 | C_005 | U_011 |
| 4 | U_008 | C_001 |
| 5 | U_007 | C_005 |
| 6 | U_001 | C_005 |
| 7 | C_004 | U_015 |
| 8 | U_002 | C_002 |
| 9 | U_001 | C_009 |
| 10 | U_010 | C_005 |
| 11 | C_005 | U_001 |
| 12 | U_004 | C_003 |
| 13 | U_005 | C_005 |
| 14 | U_010 | C_001 |
| 15 | C_005 | U_001 |
-------------------------------
ID is the unique incremental key of the table.
The goal is:
Given a value (for example: C_005, U_001, C_010, etc.), obtain the first match of these two conditions: ((from == value) || (to == value)), starting from the highest ID.
This means that data can be "duplicated", but I only want the first result of each group.
For example, C_004 and U_015 have TWO entries (C_004 -> U_015 and U_015 -> C_004). This should return only ONE.
Since we want to start from the highest ID, it would return only 7 | C_004 | U_015.
Let's put an example:
Value = C_005
The expected output is:
15 | C_005 | U_001
13 | U_005 | C_005
10 | U_010 | C_005
5 | U_007 | C_005
3 | C_005 | U_011
1 | U_002 | C_005
The idea is to get the "last" match (because we are starting from the highest ID) for each pair of values.
As I have said, two values can have multiple matches, but I only want to get the "last" one (highest ID).
Use MAX():
select max(id) id,`from`,`to`
from table_name
group by `from`,`to`
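Grouping on (`from`, `to`) as written treats C_005 -> U_001 and U_001 -> C_005 as different groups and does not filter on the given value. One way to handle both is to group on the unordered pair. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the columns are renamed src and dst to avoid the reserved words from and to, and the function name last_match_per_pair is hypothetical. SQLite's two-argument min() and max() play the role of MySQL's LEAST() and GREATEST().

```python
import sqlite3

# Rows copied from MyTable above: (ID, from, to).
ROWS = [
    (1, "U_002", "C_005"), (2, "U_015", "C_004"), (3, "C_005", "U_011"),
    (4, "U_008", "C_001"), (5, "U_007", "C_005"), (6, "U_001", "C_005"),
    (7, "C_004", "U_015"), (8, "U_002", "C_002"), (9, "U_001", "C_009"),
    (10, "U_010", "C_005"), (11, "C_005", "U_001"), (12, "U_004", "C_003"),
    (13, "U_005", "C_005"), (14, "U_010", "C_001"), (15, "C_005", "U_001"),
]


def last_match_per_pair(rows, value):
    """Return the highest-id row for each unordered pair that involves `value`."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT m.id, m.src, m.dst
        FROM my_table AS m
        JOIN (
            SELECT MAX(id) AS id
            FROM my_table
            WHERE src = :v OR dst = :v
            -- min/max put each pair in a canonical order, so A -> B and
            -- B -> A fall into the same group (LEAST/GREATEST in MySQL).
            GROUP BY min(src, dst), max(src, dst)
        ) AS newest ON newest.id = m.id
        ORDER BY m.id DESC
        """,
        {"v": value},
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in last_match_per_pair(ROWS, "C_005"):
        print(row)
```

The inner query only picks the winning id per pair; joining back to the table is what returns that row's own direction (which side the value was on).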
Your data is messed up because you have duplicates and potentially cycles too. Arrggh. You should fix the data.
But you can still do what you want with a recursive CTE (the table is called t here, with columns f and t standing in for from and to):
with recursive cte as (
select id, f, t, 1 as lev, cast(t as char(1000)) as visited
from t
where f in ('C_005') /*, 'U_001', 'C_010') */
union all
select t.id, t.f, t.t, lev + 1, concat_ws(',', cte.visited, t.f)
from cte join
t
on cte.f = t.t
where cte.visited not like concat('%', t.f, '%')
)
select distinct id, f, t
from cte
order by id desc;
Here is a db<>fiddle.

Select multiple rows into one row from one table

I have a table with related data across multiple rows that I need to query as one row.
string_value | def_id | location | model | asset_num | exp_date |
-------------+--------+----------+-------+-----------+------------+
null | 16 | A | CR35 | 1 | 2015-02-01 |
SWIT: C | 25 | A | CR35 | 1 | null |
null | 16 | B | CR85 | 2 | 2015-07-28 |
SWIT: D | 25 | B | CR85 | 2 | null |
What I am looking to end up with is a query that gives me results:
string_value | location | model | asset_num | exp_date |
-------------+----------+-------+-----------+------------+
SWIT: C | A | CR35 | 1 | 2015-02-01 |
SWIT: D | B | CR85 | 2 | 2015-07-28 |
Using the aggregate function MAX() with GROUP BY returns your expected result:
SELECT MAX(string_value) AS string_value ,
location,
MAX(model) AS model,
MAX(asset_num) AS asset_num,
MAX(exp_date) AS exp_date
FROM TableName
GROUP BY location
You can try the query below, which uses aggregation and GROUP BY:
select location, model, asset_num,max(string_value),max(exp_date)
from tablename
group by location, model, asset_num
I am guessing that the triple location, model, asset_num defines the row in the result set. If so, use aggregation:
select location, model, asset_num,
max(string_value) as string_value,
max(exp_date) as exp_date
from t
group by location, model, asset_num;
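These answers all rely on MAX() skipping NULLs, so within each group the single non-NULL value of a column is what comes back. This is a minimal sketch of the last query, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the table name assets and the function name collapse_rows are hypothetical, and the rows are the sample data from the question.

```python
import sqlite3

# Rows copied from the question:
# (string_value, def_id, location, model, asset_num, exp_date).
ASSET_ROWS = [
    (None, 16, "A", "CR35", 1, "2015-02-01"),
    ("SWIT: C", 25, "A", "CR35", 1, None),
    (None, 16, "B", "CR85", 2, "2015-07-28"),
    ("SWIT: D", 25, "B", "CR85", 2, None),
]


def collapse_rows(rows):
    """Merge the rows of each (location, model, asset_num) into a single row."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute(
        "CREATE TABLE assets (string_value TEXT, def_id INTEGER, location TEXT,"
        " model TEXT, asset_num INTEGER, exp_date TEXT)"
    )
    conn.executemany("INSERT INTO assets VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT MAX(string_value) AS string_value,
               location, model, asset_num,
               MAX(exp_date) AS exp_date   -- MAX() ignores the NULL rows
        FROM assets
        GROUP BY location, model, asset_num
        ORDER BY asset_num
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in collapse_rows(ASSET_ROWS):
        print(row)
```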

MySQL key-value store ordering with specific condition

I have the following structure:
+----+-------+--------+------------+
| id | gr_id | name   | value      |
+----+-------+--------+------------+
| 1  | 11    | name   | Burro      |
| 2  | 11    | submit | 2019/05/10 |
| 3  | 11    | date   | 2019/05/17 |
| 4  | 12    | name   | Ajax       |
| 5  | 12    | submit | 2019/05/10 |
| 6  | 12    | date   | 2019/05/18 |
+----+-------+--------+------------+
I have to order it by the date (the value of the row whose name is date), from the highest to the lowest date. It also has to keep the groups (gr_id) together without mixing their elements.
The desired result would look like this:
+----+-------+--------+------------+
| id | gr_id | name   | value      |
+----+-------+--------+------------+
| 4  | 12    | name   | Ajax       |
| 5  | 12    | submit | 2019/05/10 |
| 6  | 12    | date   | 2019/05/18 |
| 1  | 11    | name   | Burro      |
| 2  | 11    | submit | 2019/05/10 |
| 3  | 11    | date   | 2019/05/17 |
+----+-------+--------+------------+
How can I implement this?
You'll have to associate the group ordering criteria with all the elements of the group. You can do it through a subquery, or a join.
Subquery version:
SELECT t.*
FROM (SELECT gr_id, value as `date` FROM t WHERE `name` = 'date') AS grpOrder
INNER JOIN t ON grpOrder.gr_id = t.gr_id
-- DESC puts the latest date first; gr_id keeps groups with equal dates apart
ORDER BY grpOrder.`date` DESC
, t.gr_id
, CASE t.`name`
WHEN 'name' THEN 1
WHEN 'submit' THEN 2
WHEN 'date' THEN 3
ELSE 4
END
Join version:
SELECT t1.*
FROM t AS t1
INNER JOIN t AS t2 ON t1.gr_id = t2.gr_id AND t2.`name` = 'date'
-- DESC puts the latest date first; gr_id keeps groups with equal dates apart
ORDER BY t2.value DESC
, t1.gr_id
, CASE t1.`name`
WHEN 'name' THEN 1
WHEN 'submit' THEN 2
WHEN 'date' THEN 3
ELSE 4
END
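The join version can be checked against the sample data from the question. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the table name kv and the function name order_groups_by_date are hypothetical. It sorts descending, since the question asks for the highest date first, and uses gr_id as a tie-breaker so that groups sharing a date are not interleaved.

```python
import sqlite3

# Rows copied from the question: (id, gr_id, name, value).
KV_ROWS = [
    (1, 11, "name", "Burro"), (2, 11, "submit", "2019/05/10"),
    (3, 11, "date", "2019/05/17"), (4, 12, "name", "Ajax"),
    (5, 12, "submit", "2019/05/10"), (6, 12, "date", "2019/05/18"),
]


def order_groups_by_date(rows):
    """Order whole groups by their 'date' row, newest first, keeping groups intact."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE kv (id INTEGER PRIMARY KEY, gr_id INTEGER, name TEXT, value TEXT)")
    conn.executemany("INSERT INTO kv VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t1.id, t1.gr_id, t1.name, t1.value
        FROM kv AS t1
        -- t2 attaches each group's 'date' value to every row of that group
        JOIN kv AS t2 ON t1.gr_id = t2.gr_id AND t2.name = 'date'
        ORDER BY t2.value DESC,
                 t1.gr_id,
                 CASE t1.name
                     WHEN 'name' THEN 1
                     WHEN 'submit' THEN 2
                     WHEN 'date' THEN 3
                     ELSE 4
                 END
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in order_groups_by_date(KV_ROWS):
        print(row)
```

The yyyy/mm/dd strings sort correctly as plain text; a different date format would need converting before it could be used as a sort key.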

How can I treat NULL as the minimum value?

I have a table like this:
// notifications
+----+-----------+-------+---------+---------+------+
| id | score | type | post_id | user_id | seen |
+----+-----------+-------+---------+---------+------+
| 1 | 15 | 1 | 2342 | 342 | 1 |
| 2 | 5 | 1 | 2342 | 342 | 1 |
| 3 | NULL | 2 | 5342 | 342 | 1 |
| 4 | -10 | 1 | 2342 | 342 | NULL |
| 5 | 5 | 1 | 2342 | 342 | NULL |
| 6 | NULL | 2 | 8342 | 342 | NULL |
| 7 | -2 | 1 | 2342 | 342 | NULL |
+----+-----------+-------+---------+---------+------+
-- type: 1 means "it is a vote", 2 means "it is a comment (without score)"
Here is my query:
SELECT SUM(score), type, post_id, seen
FROM notifications
WHERE user_id = 342
GROUP BY type, post_id
ORDER BY (seen IS NULL) desc
As you can see, the query uses SUM(), and both the type and post_id columns are in the GROUP BY clause. My question is about the seen column. I don't want to add it to the GROUP BY clause, so I have to use either MAX() or MIN() on it, right?
Actually, I need the seen column (in the query above) to be NULL if the group contains even one row with seen = NULL. My current query returns 1 as seen's value, even when I use MIN(seen). So why is 1 the minimum when there is a NULL?
I also want to order the rows so that all those with seen = NULL are at the top of the list. How can I do that?
Expected result:
// notifications
+-----------+-------+---------+------+
| score | type | post_id | seen |
+-----------+-------+---------+------+
| 13 | 1 | 2342 | NULL |
| NULL | 2 | 8342 | NULL |
| NULL | 2 | 5342 | 1 |
+-----------+-------+---------+------+
MIN() and MAX() ignore NULL values, so you have to test for NULL explicitly. You could do this in the SELECT list:
case when sum(seen is null) > 0
then null
else min(seen)
end
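You could use the following query. It reports a group containing an unseen row as 0 rather than NULL, so sort ascending to put those groups first:
SELECT SUM(score), type, post_id, min(IFNULL(seen, 0)) as seen
FROM notifications
WHERE user_id = 342
GROUP BY type, post_id
ORDER BY seen ASC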
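The CASE expression from the first answer can be dropped into the original query to get a real NULL back. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the function name grouped_notifications is hypothetical, and the rows are the sample data from the question. The grouped result is wrapped in a subquery so that the outer ORDER BY unambiguously sorts on the computed seen value.

```python
import sqlite3

# Rows copied from the question: (id, score, type, post_id, user_id, seen).
NOTIFICATIONS = [
    (1, 15, 1, 2342, 342, 1), (2, 5, 1, 2342, 342, 1),
    (3, None, 2, 5342, 342, 1), (4, -10, 1, 2342, 342, None),
    (5, 5, 1, 2342, 342, None), (6, None, 2, 8342, 342, None),
    (7, -2, 1, 2342, 342, None),
]


def grouped_notifications(rows, user_id):
    """Group by (type, post_id); seen is NULL if any row in the group is unseen."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute(
        "CREATE TABLE notifications (id INTEGER PRIMARY KEY, score INTEGER,"
        " type INTEGER, post_id INTEGER, user_id INTEGER, seen INTEGER)"
    )
    conn.executemany("INSERT INTO notifications VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT g.score, g.type, g.post_id, g.seen
        FROM (
            SELECT SUM(n.score) AS score, n.type, n.post_id,
                   -- (n.seen IS NULL) is 1 for unseen rows, so the sum counts them
                   CASE WHEN SUM(n.seen IS NULL) > 0 THEN NULL
                        ELSE MIN(n.seen)
                   END AS seen
            FROM notifications AS n
            WHERE n.user_id = ?
            GROUP BY n.type, n.post_id
        ) AS g
        ORDER BY (g.seen IS NULL) DESC, g.type, g.post_id
        """,
        (user_id,),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in grouped_notifications(NOTIFICATIONS, 342):
        print(row)
```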

Top 'n' results for each keyword

I have a query to get the top 'n' users who commented on a specific keyword,
SELECT `user` , COUNT( * ) AS magnitude
FROM `results`
WHERE `keyword` = "economy"
GROUP BY `user`
ORDER BY magnitude DESC
LIMIT 5
I have approx 6000 keywords, and would like to run this query to get me the top 'n' users for each and every keyword we have data for. Assistance appreciated.
Since you haven't given the schema for results, I'll assume it's this or very similar (maybe extra columns):
create table results (
id int primary key,
user int,
foreign key (user) references <some_other_table>(id),
keyword varchar(<30>)
);
Step 1: aggregate by keyword/user as in your example query, but for all keywords:
create view user_keyword as (
select
keyword,
user,
count(*) as magnitude
from results
group by keyword, user
);
Step 2: rank each user within each keyword group (note the use of the subquery to rank the rows; on MySQL 8.0 and later, rank is a reserved word and has to be quoted with backticks):
create view keyword_user_ranked as (
select
keyword,
user,
magnitude,
(select count(*)
from user_keyword
where l.keyword = keyword and magnitude >= l.magnitude
) as rank
from
user_keyword l
);
Step 3: select only the rows where the rank is less than or equal to some number:
select *
from keyword_user_ranked
where rank <= 3;
Example:
Base data used:
mysql> select * from results;
+----+------+---------+
| id | user | keyword |
+----+------+---------+
| 1 | 1 | mysql |
| 2 | 1 | mysql |
| 3 | 2 | mysql |
| 4 | 1 | query |
| 5 | 2 | query |
| 6 | 2 | query |
| 7 | 2 | query |
| 8 | 1 | table |
| 9 | 2 | table |
| 10 | 1 | table |
| 11 | 3 | table |
| 12 | 3 | mysql |
| 13 | 3 | query |
| 14 | 2 | mysql |
| 15 | 1 | mysql |
| 16 | 1 | mysql |
| 17 | 3 | query |
| 18 | 4 | mysql |
| 19 | 4 | mysql |
| 20 | 5 | mysql |
+----+------+---------+
Grouped by keyword and user:
mysql> select * from user_keyword order by keyword, magnitude desc;
+---------+------+-----------+
| keyword | user | magnitude |
+---------+------+-----------+
| mysql | 1 | 4 |
| mysql | 2 | 2 |
| mysql | 4 | 2 |
| mysql | 3 | 1 |
| mysql | 5 | 1 |
| query | 2 | 3 |
| query | 3 | 2 |
| query | 1 | 1 |
| table | 1 | 2 |
| table | 2 | 1 |
| table | 3 | 1 |
+---------+------+-----------+
Users ranked within keywords:
mysql> select * from keyword_user_ranked order by keyword, rank asc;
+---------+------+-----------+------+
| keyword | user | magnitude | rank |
+---------+------+-----------+------+
| mysql | 1 | 4 | 1 |
| mysql | 2 | 2 | 3 |
| mysql | 4 | 2 | 3 |
| mysql | 3 | 1 | 5 |
| mysql | 5 | 1 | 5 |
| query | 2 | 3 | 1 |
| query | 3 | 2 | 2 |
| query | 1 | 1 | 3 |
| table | 1 | 2 | 1 |
| table | 3 | 1 | 3 |
| table | 2 | 1 | 3 |
+---------+------+-----------+------+
Only top 2 from each keyword:
mysql> select * from keyword_user_ranked where rank <= 2 order by keyword, rank asc;
+---------+------+-----------+------+
| keyword | user | magnitude | rank |
+---------+------+-----------+------+
| mysql | 1 | 4 | 1 |
| query | 2 | 3 | 1 |
| query | 3 | 2 | 2 |
| table | 1 | 2 | 1 |
+---------+------+-----------+------+
Note that when there are ties -- see users 2 and 4 for keyword "mysql" in the examples -- all parties in the tie get the "last" rank, i.e. if the 2nd and 3rd are tied, both are assigned rank 3.
Performance: adding an index to the keyword and user columns will help. I have a table being queried in a similar way with 4000 and 1300 distinct values for the two columns (in a 600000-row table). You can add the index like this:
alter table results add index keyword_user (keyword, user);
In my case, query time dropped from about 6 seconds to about 2 seconds.
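The three steps can also be run as a single statement. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; it uses a common table expression in place of the two views, names the rank column rnk to stay clear of reserved words, and the function name top_users_per_keyword is hypothetical. The rows are the base data from the example above.

```python
import sqlite3

# Base data copied from the example above: (id, user, keyword).
RESULTS = [
    (1, 1, "mysql"), (2, 1, "mysql"), (3, 2, "mysql"), (4, 1, "query"),
    (5, 2, "query"), (6, 2, "query"), (7, 2, "query"), (8, 1, "table"),
    (9, 2, "table"), (10, 1, "table"), (11, 3, "table"), (12, 3, "mysql"),
    (13, 3, "query"), (14, 2, "mysql"), (15, 1, "mysql"), (16, 1, "mysql"),
    (17, 3, "query"), (18, 4, "mysql"), (19, 4, "mysql"), (20, 5, "mysql"),
]


def top_users_per_keyword(rows, n):
    """Return (keyword, user, magnitude, rnk) for users ranked n or better."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, user INTEGER, keyword TEXT)")
    conn.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH user_keyword AS (              -- step 1: count per keyword/user
            SELECT keyword, user, COUNT(*) AS magnitude
            FROM results
            GROUP BY keyword, user
        )
        SELECT keyword, user, magnitude, rnk
        FROM (
            SELECT l.keyword, l.user, l.magnitude,
                   (SELECT COUNT(*)         -- step 2: rank within the keyword
                    FROM user_keyword AS r
                    WHERE r.keyword = l.keyword
                      AND r.magnitude >= l.magnitude) AS rnk
            FROM user_keyword AS l
        ) AS ranked
        WHERE rnk <= ?                      -- step 3: keep the top n
        ORDER BY keyword, rnk
        """,
        (n,),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_users_per_keyword(RESULTS, 2):
        print(row)
```

As in the view-based version, tied users share the last rank of the tie, so fewer than n rows can come back for a keyword.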
You can use a pattern like this (from Within-group quotas (Top N per group)):
SELECT tmp.ID, tmp.entrydate
FROM (
SELECT
ID, entrydate,
IF( @prev <> ID, @rownum := 1, @rownum := @rownum+1 ) AS rank,
@prev := ID
FROM test t
JOIN (SELECT @rownum := NULL, @prev := 0) AS r
ORDER BY t.ID
) AS tmp
WHERE tmp.rank <= 2
ORDER BY ID, entrydate;
+------+------------+
| ID | entrydate |
+------+------------+
| 1 | 2007-05-01 |
| 1 | 2007-05-02 |
| 2 | 2007-06-03 |
| 2 | 2007-06-04 |
| 3 | 2007-07-01 |
| 3 | 2007-07-02 |
+------+------------+