Group by values ignoring commas - mysql

I have a table in which most rows have the name of a single country assigned. Unfortunately, at some point the “country” field held multiple values separated by commas. Most rows now have a single value, but residual commas are left in some of the fields. For instance, some rows that pertain to Afghanistan have “Afghanistan” and some have “,Afghanistan”. My current SELECT query treats those values as two separate groups. I am not allowed to fiddle with the database to get rid of the commas.
How can I make my SELECT query disregard the commas and group the country values together? As an added complication, a few rows have multiple country values, which, again, I can’t edit. Ideally I would like to exclude those entirely from the SELECT query (as well as rows that have a negative value in another field).
Example data of what my current query gives me:
,Afghanistan 66
,Albania 1
,Angola 25
,Bangladesh 2225
,Bolivia 824
,Bosnia 1
,Bosnia And Herzegovina 291
,Bosnia and Herzogovina 181
,France, Germany 1
Afghanistan 32
Albania 3
Bangladesh 132
Bolivia 295
Bosnia and Herzegovina 79
Botswana 2
Here is my query:
/* Group by country and count instances, selecting only resources that have a positive ref ID */
SELECT field3 "Country", COUNT(field3) FROM `resource` WHERE ref > 0 GROUP BY field3;

SELECT REPLACE(field3, ",", "") AS Country, COUNT(field3)
FROM `resource`
WHERE ref > 0
GROUP BY Country;
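One caveat with REPLACE is that it also turns “,France, Germany” into “France Germany”, which then forms a group of its own. Below is a minimal sketch of one way to trim only the leading and trailing commas and to skip rows that still contain a comma afterwards. It runs against an in-memory SQLite database from Python purely so that it can be executed anywhere; the sample rows and the helper name count_by_country are made up for illustration. In MySQL, the equivalent of SQLite's TRIM(field3, ',') should be TRIM(BOTH ',' FROM field3).

```python
import sqlite3


def count_by_country(rows):
    """rows: iterable of (field3, ref) tuples. Returns {country: count}."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE resource (field3 TEXT, ref INTEGER)")
    conn.executemany("INSERT INTO resource VALUES (?, ?)", rows)
    query = """
        SELECT TRIM(field3, ',') AS Country, COUNT(*) AS n
        FROM resource
        WHERE ref > 0
          -- a comma left after trimming means the row holds several countries
          AND TRIM(field3, ',') NOT LIKE '%,%'
        GROUP BY TRIM(field3, ',')
    """
    result = dict(conn.execute(query).fetchall())
    conn.close()
    return result


# Hypothetical sample rows: (field3, ref)
sample = [
    (",Afghanistan", 1),
    ("Afghanistan", 2),
    ("Afghanistan", -3),       # excluded: ref is not positive
    (",France, Germany", 4),   # excluded: multiple countries
    ("Albania", 5),
]
counts = count_by_country(sample)
print(counts)
```

Spelling variants such as “Bosnia And Herzegovina” and “Bosnia and Herzogovina” would still be counted separately; this sketch only deals with the commas.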

Related

MySQL Query to replace string with value

I have a requirement as described below.
I need a MySQL query that replaces a value when it matches the condition below.
I have a table containing the Product ID:
Product_ID
1
2
3
4
5
15
25
I want to replace the 5 with the value 1.111. My requirement is that it should replace only the value 5, not the value 15.
For example, 5 should become 1.111, but 15 should not be changed.
You can use IF() or CASE to select a different value when the value meets a condition.
SELECT IF(product_id = '5', '1.111', product_id)
FROM yourTable
or
SELECT CASE product_id
WHEN '5' THEN '1.111'
ELSE product_id
END
FROM yourTable
CASE generalizes more easily to other values that you want to replace, since you can have multiple WHEN clauses.
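To illustrate why 15 and 25 are left alone, here is a small sketch that runs the CASE form against the sample values. It uses an in-memory SQLite database from Python only so that it is runnable without a MySQL server, and it assumes product_id is stored as text; both are assumptions for the demo, not part of the question. The comparison is an exact equality test on the whole value, not a substring match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: product_id is a text column, matching the quoted '5' in the answer.
conn.execute("CREATE TABLE yourTable (product_id TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?)",
    [(v,) for v in ["1", "2", "3", "4", "5", "15", "25"]],
)

query = """
    SELECT CASE product_id
               WHEN '5' THEN '1.111'   -- matches only the exact value 5
               ELSE product_id
           END AS product_id
    FROM yourTable
    ORDER BY rowid
"""
replaced = [value for (value,) in conn.execute(query)]
conn.close()
print(replaced)
```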

Selecting a value based on how another field was generated

I'm selecting some data;
select c.*,
coalesce(s.column1, ...),
coalesce(s.column2, ...),
FROM
(SELECT ...)
Basically, if s.column1 or s.column2 is null then I am putting in some logic to take the average of that column and use it instead.
I want to have another field so I can know whether or not that value was computed using the average, perhaps a boolean? Let's say the average for column1 was 120; the table would look like:
column1 column2 avg
54 10 0
200 40 0
120 180 1
499 160 0
This allows me to see that the third row was generated using the avg of all rows as it was initially null.
How could the logic for the avg column work?
Your question seems fairly moot to me because:
The AVG function ignores NULL values by default, so filling the NULL slots with the overall average gives the same average as leaving those slots out entirely, and
If you just want to mark the rows which had a NULL value, you can use a CASE expression
So, to get what you want, just use this:
SELECT
column1,
column2,
CASE WHEN column1 IS NULL THEN 1 ELSE 0 END AS avg
FROM yourTable;
And know that SELECT AVG(column1) FROM yourTable would return the same value whether NULL rows were omitted, or the overall average were used.
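Both points can be checked with a small sketch. It uses an in-memory SQLite database from Python so that it runs anywhere; the sample values and the column aliases filled and used_avg are invented for the demo. The query fills NULLs with the column average via COALESCE and flags those rows with a CASE expression, then compares the two averages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (column1 INTEGER)")
# Hypothetical data: the third row is NULL.
conn.executemany(
    "INSERT INTO yourTable VALUES (?)", [(100,), (200,), (None,), (300,)]
)

query = """
    SELECT COALESCE(column1, (SELECT AVG(column1) FROM yourTable)) AS filled,
           CASE WHEN column1 IS NULL THEN 1 ELSE 0 END AS used_avg
    FROM yourTable
    ORDER BY rowid
"""
filled_rows = conn.execute(query).fetchall()

# AVG skips NULLs, so this averages only the three non-NULL values.
avg_ignoring_nulls = conn.execute(
    "SELECT AVG(column1) FROM yourTable"
).fetchone()[0]
# Average over all four rows after the NULL was replaced by the average.
avg_after_filling = sum(value for value, _ in filled_rows) / len(filled_rows)
conn.close()

print(filled_rows, avg_ignoring_nulls, avg_after_filling)
```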

Finding count of unique value before a character

I have some entries in database table rows as follows.
101 - 1
101 - 2
101 - 3
102 - 1
102 - 2
102 - 3
103
I need the SELECT query to return a count of 3, the number of distinct leading values (101, 102 and 103).
So is there any way to count the unique values that come before a character in a table column?
EDIT : I have entries even without the - .
If your entries always have the format you have shown, you just have to find the position of the '-' character, take the characters before it, and count the distinct values.
This works for SQL Server; otherwise, tell us which DBMS you are using, or replace the functions with your DBMS's equivalents.
SELECT COUNT(DISTINCT SUBSTRING(val,0,CHARINDEX('-', val))) from YourTable
create table T1
(
id int primary key identity,
col1 varchar(20)
)
insert into T1 values('101 - 1'),('101 - 2'),('101 - 3'),('102 - 1'),('102 - 2'),('102 - 3')
select SUBSTRING(col1,0,CHARINDEX(' ',col1)) as 'Value',count(*) as 'Count' from T1 group by SUBSTRING(col1,0,CHARINDEX(' ',col1))
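Neither query above says what should happen to entries such as 103 that have no '-'. The sketch below shows one way to handle them: take the text before the '-' when there is one, otherwise take the whole value, trim spaces, and count the distinct results. It runs on an in-memory SQLite database from Python so that it is executable as-is; INSTR and SUBSTR are the SQLite functions used here. In MySQL, SUBSTRING_INDEX(val, '-', 1) should give the same prefix, since it returns the whole string when the delimiter is absent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (val TEXT)")
entries = ["101 - 1", "101 - 2", "101 - 3", "102 - 1", "102 - 2", "102 - 3", "103"]
conn.executemany("INSERT INTO YourTable VALUES (?)", [(v,) for v in entries])

query = """
    SELECT COUNT(DISTINCT TRIM(
               CASE WHEN INSTR(val, '-') > 0
                    THEN SUBSTR(val, 1, INSTR(val, '-') - 1)  -- text before '-'
                    ELSE val                                  -- no '-' present
               END))
    FROM YourTable
"""
distinct_prefixes = conn.execute(query).fetchone()[0]
conn.close()
print(distinct_prefixes)
```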

MYSQL grouping by field that is not null

I have a table where a field is populated if the record is a duplicate. The code is already running, and properly checks for duplicates and is working.
The table looks like this:
id | dupe_ids | id_subscription
1 NULL 5343
2 3, 4 5343
3 2, 4 5343
4 2, 3 5343
5 NULL 5343
6 7 5343
7 6 5343
The query should return a count of the number of entries, but it needs to group the duplicated ids: records that are duplicates of one another should count once. In the example above, the count for subscription 5343 would be 4. Record 2 would count as one, with 3 and 4 being skipped or grouped, and record 6 would count as one, with record 7 being grouped or skipped.
The query now looks like this:
SELECT app.id_subscription, app.id_site, app.id_customer, COUNT(*) AS app_count, site.url
FROM web_manager.app, web_manager.site
WHERE app.id_customer = :wm_id
AND (app.received_at BETWEEN :sdate AND :edate)
AND app.id_site = site.id
AND app.dupe_ids IS NULL
GROUP BY app.id_subscription
ORDER BY app_count DESC
If the values in dupe_ids are a list of numeric id values, and the list is always "in order" with the lowest value first, then as a dirty solution...
The query in my original answer (below) can be modified to replace the constant 0 with an expression like this: LEAST(a.id,SUBSTRING_INDEX(a.dupe_ids,',',1)+0).
That expression says: take the first value from the dupe_ids list, evaluate it in a numeric context, compare the numeric value to the id value from the row, and return the lower of the two.
SELECT COUNT(DISTINCT IF(a.dupe_ids IS NULL,a.id,LEAST(a.id,SUBSTRING_INDEX(a.dupe_ids,',',1)+0))) AS my_funky_cnt
, a.id_subscription
FROM web_manager.app a
JOIN web_manager.site s
ON s.id = a.id_site
WHERE ...
GROUP BY a.id_subscription
ORDER BY my_funky_cnt DESC
Again, removing the GROUP BY and the aggregate, to see what is actually being returned by the expression...
SELECT a.id
, a.dupe_ids
, a.id_subscription
, IF(a.dupe_ids IS NULL,a.id,LEAST(a.id,SUBSTRING_INDEX(a.dupe_ids,',',1)+0)) AS expr
FROM web_manager.app a
JOIN web_manager.site s
ON s.id = a.id_site
WHERE ...
ORDER BY a.id_subscription, a.dupe_ids IS NULL, a.id
we'd expect that to return:
id | dupe_ids | id_subscription | expr
2 3, 4 5343 2 -- id=2 is less than fv=3
3 2, 4 5343 2 -- fv=2 is less than id=3
4 2, 3 5343 2 -- fv=2 is less than id=4
6 7 5343 6 -- id=6 is less than fv=7
7 6 5343 6 -- fv=6 is less than id=7
1 NULL 5343 1
5 NULL 5343 5
So a GROUP BY id_subscription and COUNT(DISTINCT expr) would return a count of 4.
(This is not tested.)
This approach depends on dupe_ids having the lowest id value listed first (first value in the list), evaluating that first value in a numeric context, and comparing that to the id value from the row.
If dupe_ids is an empty string, or starts with a comma, or the first non-blank characters can't be interpreted as a numeric value, then expr is going to return a 0.
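As a rough check of the idea, the sketch below runs the same logic on the sample rows from the question. It is a translation to SQLite (run in memory from Python), not the MySQL query itself: the two-argument MIN stands in for LEAST, a SUBSTR/INSTR/CAST combination stands in for SUBSTRING_INDEX(...)+0, and CASE stands in for IF. The single-table schema is an assumption made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app (id INTEGER PRIMARY KEY, dupe_ids TEXT, id_subscription INTEGER)"
)
conn.executemany(
    "INSERT INTO app VALUES (?, ?, ?)",
    [
        (1, None, 5343),
        (2, "3, 4", 5343),
        (3, "2, 4", 5343),
        (4, "2, 3", 5343),
        (5, None, 5343),
        (6, "7", 5343),
        (7, "6", 5343),
    ],
)

query = """
    SELECT id_subscription,
           COUNT(DISTINCT CASE
               WHEN dupe_ids IS NULL THEN id
               -- lower of the row id and the first value in the list
               ELSE MIN(id, CAST(SUBSTR(dupe_ids, 1,
                                        INSTR(dupe_ids || ',', ',') - 1) AS INTEGER))
           END) AS my_funky_cnt
    FROM app
    GROUP BY id_subscription
"""
counts = conn.execute(query).fetchall()
conn.close()
print(counts)
```

With these rows the distinct values of the expression are 1, 2, 5 and 6, which gives the count of 4 described above.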
EDIT
The original answer (below) was based on collapsing all of the rows with non-NULL values for a given id_subscription... returning a count of 3. The question has been updated, adding more example rows with non-NULL values which should not be collapsed together. Desired return for "count" is now 4. The query in the original answer would return a count of 3.
Getting a count of rows with a NULL value of dupe_ids is straightforward.
The sticky wicket is the bizarre contents of the dupe_ids column, the comma separated list of id values...
id dupe_ids
---- --------
2 '3,4'
3 '2,4'
4 '2,3'
6 '7'
7 '6'
This would be easier if we weren't dealing with a "comma separated list" of values. If we instead had foreign key references to the rows, in a separate table. Or, if we had some criteria other than the dupe_ids columns to identify rows that are "duplicates".
But, this wasn't the question asked. The question didn't ask if it would be better to avoid storing a comma separated list; whether there was a better approach.
The question leaves us dealing with a comma separated list. (It serves as an example of why we strongly recommend avoiding comma separated lists in the first place).
If we had an expression that has the values in dupe_ids along with the id value, together, so that we had identical values on the rows...
id dupe_ids expr
---- -------- ------
2 '3,4' '2,3,4'
3 '2,4' '2,3,4'
4 '2,3' '2,3,4'
6 '7' '6,7'
7 '6' '6,7'
Then we could use a COUNT(DISTINCT expr) to get us the return we're after. The ugly part is getting that value of expr. It would be easy to prepend or append id onto dupe_ids, but the resulting string values wouldn't be identical. The lists would be in a different order.
There's no simple built-in function in MySQL to return the values shown for expr based on the contents of id and dupe_ids.
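If the rows can be fetched into application code, building that canonical value is straightforward. The following is only a sketch in Python of that alternative, not something the question asked for; the function name group_key is hypothetical. It merges id with the values in dupe_ids, sorts them, and joins them back into a string, so that every member of a duplicate group produces the same key.

```python
def group_key(row_id, dupe_ids):
    """Return a canonical key shared by all rows in the same duplicate group."""
    if dupe_ids is None:
        # Not a duplicate: the row's own id is already unique.
        return str(row_id)
    ids = {row_id}
    ids.update(int(part) for part in dupe_ids.split(",") if part.strip())
    return ",".join(str(i) for i in sorted(ids))


# (id, dupe_ids) pairs from the question's example table
rows = [
    (1, None),
    (2, "3, 4"),
    (3, "2, 4"),
    (4, "2, 3"),
    (5, None),
    (6, "7"),
    (7, "6"),
]
keys = [group_key(row_id, dupe_ids) for row_id, dupe_ids in rows]
distinct_groups = len(set(keys))
print(keys, distinct_groups)
```

Unlike the LEAST-based expression, this does not rely on the lowest id being listed first in dupe_ids.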
ORIGINAL ANSWER
The approach I would take is to use an expression, and count distinct values of that.
If dupe_ids is null, then return a unique value. If id is unique in the table, I would just use the value of that column. If dupe_ids is not null, then substitute a constant that is not a valid id value. Assuming id values are positive integers, I would use 0 or a negative value.
As an example:
SELECT COUNT(DISTINCT IF(a.dupe_ids IS NULL,a.id,0)) AS my_funky_cnt
, a.id_subscription
FROM web_manager.app a
JOIN web_manager.site s
ON s.id = a.id_site
WHERE ...
GROUP BY a.id_subscription
ORDER BY my_funky_cnt DESC
I'd verify the expression is "working" by first doing a query without the GROUP BY and aggregate...
SELECT a.id
, a.dupe_ids
, a.id_subscription
, IF(a.dupe_ids IS NULL,a.id,0) AS derived_col
FROM web_manager.app a
JOIN web_manager.site s
ON s.id = a.id_site
WHERE ...
ORDER BY a.id_subscription, a.dupe_ids IS NULL, a.id
We'd expect that to return:
id | dupe_ids | id_subscription | derived_col
1 NULL 5343 1
2 3, 4 5343 0
3 2, 4 5343 0
4 2, 3 5343 0
5 NULL 5343 5
So all of the rows with non-null dupe_ids have the same value, and the rows with NULL dupe_ids have a unique value.
And a COUNT(DISTINCT ...) of that expression will return 3.

Ignore columns in MySQL query result with null values

I have a MySql table as,
Name Month Salary
=======================================
A Salary_Month_Sept 15000
A Salary_Month_Oct 0
B Salary_Month_Sept 12000
B Salary_Month_Oct 0
C Salary_Month_Sept 13000
C Salary_Month_Oct 0
and I am querying that table as
select Name,
max(IF(Month = 'Salary_Month_Sept', Salary, 0)) AS 'Salary_Month_Sept',
max(IF(Month = 'Salary_Month_Oct', Salary, 0)) AS 'Salary_Month_Oct'
from myTable
group by Name
Which returns the query result as
Name Salary_Month_Sept Salary_Month_Oct
=============================================
A 15000 0
B 12000 0
C 13000 0
How can I ignore the columns containing only zero or null values in the above query result?
Don't use *. Name the columns you want. The query is not a crystal ball: it doesn't know in advance whether there will be data for a column. To do something like that you need two queries, assuming the salaries are only positive:
Select sum(salary_sept), sum(salary_oct), ... for the condition you need.
Create a second select with only the columns whose sum is greater than zero.
SQL has no time machine, sorry. You have to do the work yourself.
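The two-query approach can be sketched as follows. This is an illustration under assumptions, not a definitive implementation: it uses an in-memory SQLite database from Python so that it runs without a MySQL server, CASE replaces MySQL's IF, and the first query is written as a GROUP BY ... HAVING over the Month column instead of one SUM per pivoted column. The second query is then built in application code from the months that survived.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (Name TEXT, Month TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        ("A", "Salary_Month_Sept", 15000), ("A", "Salary_Month_Oct", 0),
        ("B", "Salary_Month_Sept", 12000), ("B", "Salary_Month_Oct", 0),
        ("C", "Salary_Month_Sept", 13000), ("C", "Salary_Month_Oct", 0),
    ],
)

# Query 1: find the months whose total salary is greater than zero.
months = [
    month
    for (month,) in conn.execute(
        "SELECT Month FROM myTable GROUP BY Month HAVING SUM(Salary) > 0 ORDER BY Month"
    )
]

# Query 2: pivot only those months. Month values are bound as parameters;
# the column alias is a quoted identifier with embedded quotes doubled.
select_parts = ["Name"] + [
    'MAX(CASE WHEN Month = ? THEN Salary ELSE 0 END) AS "{}"'.format(
        month.replace('"', '""')
    )
    for month in months
]
pivot_sql = "SELECT {} FROM myTable GROUP BY Name ORDER BY Name".format(
    ", ".join(select_parts)
)
cursor = conn.execute(pivot_sql, months)
columns = [description[0] for description in cursor.description]
pivot_rows = cursor.fetchall()
conn.close()

print(columns)
print(pivot_rows)
```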