Count flags for a variable (big) number of columns - mysql

I have a table which looks like this: http://i.stack.imgur.com/EyKt3.png
And I want a result like this:
Condition COL
ted1 4
ted2 1
ted3 2
I.e., the count of the number of '1' only in this case.
I want to know the total number of 1s only (check the table), ignoring the 0s. In other words, if the condition is true (1), then count +1.
Also consider: what if there are many columns? I want to avoid typing expressions for every single one, like in this case ted1 to ted80.

In SAS, proc means is the most efficient method:
proc means data=have noprint;
var ted:; *captures anything that starts with Ted;
output out=want sum =;
run;
proc print data=want;
run;

Try this
select
sum(case when ted1=1 then 1 else 0 end) as ted1,
sum(case when ted2=1 then 1 else 0 end) as ted2,
sum(case when ted3=1 then 1 else 0 end) as ted3
from your_table;
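To avoid typing one expression per column (ted1 to ted80), the query text itself can be generated. The following is a minimal sketch, not part of the original answer: the table name flags, the helper build_count_query, and the sample rows are all hypothetical (the real table is only available as an image), and SQLite stands in for MySQL purely so the example runs on its own.

```python
import sqlite3

def build_count_query(table, columns):
    """Build one SUM(CASE ...) expression per flag column (hypothetical helper)."""
    parts = [
        f"SUM(CASE WHEN {col} = 1 THEN 1 ELSE 0 END) AS {col}"
        for col in columns
    ]
    return f"SELECT {', '.join(parts)} FROM {table}"

# Hypothetical sample data, chosen to reproduce the counts 4, 1, 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flags (ted1 INTEGER, ted2 INTEGER, ted3 INTEGER)")
conn.executemany(
    "INSERT INTO flags VALUES (?, ?, ?)",
    [(1, 0, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0), (0, 0, 0)],
)

columns = [f"ted{i}" for i in range(1, 4)]  # use range(1, 81) for ted1..ted80
query = build_count_query("flags", columns)
counts = conn.execute(query).fetchone()
print(dict(zip(columns, counts)))
```

The column names are interpolated directly into the SQL text, so this is only appropriate when they come from a trusted source such as the table's own metadata.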

In PostgreSQL (tested with version 9.4) you could unpivot with a VALUES expression in a LATERAL subquery. You'll need dynamic SQL.
This works for any table with any number of columns matching any pattern as long as selected columns are all numeric or all boolean. Only the value 1 (true) is counted.
Create this function once:
CREATE OR REPLACE FUNCTION f_tagcount(_tbl regclass, col_pattern text)
RETURNS TABLE (tag text, tag_ct bigint)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE (
SELECT
'SELECT l.tag, count(l.val::int = 1 OR NULL)
FROM ' || _tbl || ', LATERAL (VALUES '
|| string_agg( format('(%1$L, %1$I)', attname), ', ')
|| ') l(tag, val)
GROUP BY 1
ORDER BY 1'
FROM pg_catalog.pg_attribute
WHERE attrelid = _tbl
AND attname LIKE col_pattern
AND attnum > 0
AND NOT attisdropped
);
END
$func$;
Call:
SELECT * FROM f_tagcount('tbl', 'ted%');
Result:
tag | tag_ct
-----+-------
ted1 | 4
ted2 | 1
ted3 | 2
The 1st argument is a valid table name, possibly schema-qualified. Defense against SQL-injection is built into the data type regclass.
The 2nd argument is a LIKE pattern for the column names. Hence the wildcard %.
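The counting logic of f_tagcount can be mimicked in plain Python to see what the unpivot step does. This is only a sketch under assumptions: tag_counts and the sample rows are hypothetical, and the LIKE pattern is approximated with fnmatch (% becomes *, _ becomes ?), which is not a full LIKE implementation.

```python
from fnmatch import fnmatchcase

def tag_counts(rows, like_pattern):
    """Count the value 1 (or True) per column whose name matches the pattern."""
    glob = like_pattern.replace("%", "*").replace("_", "?")  # rough LIKE -> glob
    counts = {}
    for row in rows:
        for col, val in row.items():
            if fnmatchcase(col, glob):
                # None (NULL) and 0 add nothing; 1 and True add one.
                counts[col] = counts.get(col, 0) + (1 if val == 1 else 0)
    return dict(sorted(counts.items()))

# Hypothetical rows; "id" shows that non-matching columns are ignored.
rows = [
    {"id": 1, "ted1": 1, "ted2": 0, "ted3": 1},
    {"id": 2, "ted1": 1, "ted2": 1, "ted3": 0},
    {"id": 3, "ted1": 1, "ted2": 0, "ted3": 1},
    {"id": 4, "ted1": 1, "ted2": None, "ted3": 0},
]
print(tag_counts(rows, "ted%"))
```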


match a number using regex for comma separated number

I have a string that contains numbers separated by commas, like below.
15,22,20,26,33,445,40,44,22,225,115,2
I want to know whether a number, say 15, is in that string. The problem is that 15 and 115 both match. The same goes for other numbers: for 2, the values 20, 25, and 225 also match. In both cases it should match only if 15 or 2 itself is in the string. I tried using the LIKE keyword, but it's not working: it also returns the rows with 115, or with 20, 225, 222, while matching 15 and 2 respectively. Can anyone suggest a regex pattern?
Update
I have a query like the one below, where I was using the LIKE keyword, but I was getting wrong results for the reason above.
SELECT DISTINCT A.id,A.title,A.title_hi,A.cId,B.id as cid1,A.report_type ,A.icon_img_url, A.created_at , A.news_date
FROM tfs_report_news A, tfs_commodity_master B
WHERE (',' + RTRIM(A.cId) + ',') LIKE ('%,' + B.id + ',%')
AND A.ccId = B.ccId AND A.`report_type`= "M"
AND A.isDeleted=0 AND A.isActive=1 AND B.isDeleted=0
AND B.status=1
AND A.news_date= (SELECT MAX(T.news_date)
FROM tfs_report_news T WHERE (',' + RTRIM(T.cId) + ',')
LIKE ('%,' + B.id + ',%'))
ORDER BY created_at desc, id desc limit 100;
Here the cId column of tfs_report_news holds a string such as 15,22,20,26,33,445,40,44,22,225,115,2, and each individual value in it, like 15 or 2, is an id in tfs_commodity_master.
In MySQL, what you asked for is the purpose of string function find_in_set():
Returns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings. A string list is a string composed of substrings separated by , characters [...] Returns 0 if str is not in strlist or if strlist is the empty string. Returns NULL if either argument is NULL.
So to check if a value is present in the list, you can just do:
find_in_set('15', '15,22,20,26,33,445,40,44,22,225,115,2') > 0
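The documented behaviour quoted above can be mirrored in a few lines of Python, which makes it easy to see why 15 no longer matches 115. This is an illustrative re-implementation with a hypothetical function name, not MySQL's own code.

```python
def find_in_set(needle, strlist):
    """Mimic MySQL FIND_IN_SET: 1-based position, 0 if absent, None for NULL input."""
    if needle is None or strlist is None:
        return None
    if strlist == "":
        return 0
    items = strlist.split(",")
    return items.index(needle) + 1 if needle in items else 0

csv = "15,22,20,26,33,445,40,44,22,225,115,2"
print(find_in_set("15", csv))  # whole-item match only
print(find_in_set("5", csv))   # a substring of an item is not a match
```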
Use FIND_IN_SET:
SELECT
CASE WHEN FIND_IN_SET('15', csv) > 0 THEN 'yes' ELSE 'no' END AS result
FROM yourTable;
Another option would be to use LIKE:
SELECT
CASE WHEN CONCAT(',', csv, ',') LIKE '%,15,%' THEN 'yes' ELSE 'no' END AS result
FROM yourTable;
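Finally, you could also use REGEXP here. The [[:<:]] and [[:>:]] word-boundary markers work in MySQL before 8.0.4; later versions use the ICU regex library, where the equivalent pattern is '\\b15\\b':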
SELECT
CASE WHEN csv REGEXP '[[:<:]]15[[:>:]]' THEN 'yes' ELSE 'no' END AS result
FROM yourTable;
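The LIKE and REGEXP variants rest on two ideas: wrap both the list and the value in delimiters, or require word boundaries around the value. A small Python sketch of both, with hypothetical helper names, next to the naive substring test that causes the original problem:

```python
import re

def contains_wrapped(csv, value):
    """Same idea as CONCAT(',', csv, ',') LIKE '%,15,%'."""
    return f",{value}," in f",{csv},"

def contains_word(csv, value):
    """Same idea as a word-boundary REGEXP around the value."""
    return re.search(rf"\b{re.escape(value)}\b", csv) is not None

csv = "115,20,225"
print("15" in csv)                  # naive substring test: a false positive
print(contains_wrapped(csv, "15"))  # delimiter-wrapped test
print(contains_word(csv, "15"))     # word-boundary test
```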

Teradata Masking - Retain all characters at position 1,4,8,12,16 .... in a string and mask remaining characters with 'X'

I have a requirement to mask, with 'X', every character of a variable-length string except those at positions 1, 4, 8, 12, 16, and so on.
For example:
Input string - 'John Doe'
Output String - 'JXXn XXe'
SPACE between the two strings must be retained.
Kindly help or reach out for more details if required.
I think maybe an external function would be best here, but if that's too much to bite off, you can get crafty with strtok_split_to_table, xml_agg and regexp_replace to rip the string apart, replace out characters using your criteria, and stitch it back together:
WITH cte AS (SELECT REGEXP_REPLACE('this is a test of this functionality', '(.)', '\1,') AS fullname FROM Sys_Calendar.calendar WHERE calendar_date = CURRENT_DATE)
SELECT
REGEXP_REPLACE(REGEXP_REPLACE((XMLAGG(tokenout ORDER BY tokennum) (VARCHAR(200))), '(.) (.)', '\1\2') , '(.) (.)', '\1\2')
FROM
(
SELECT
tokennum,
outkey,
CASE WHEN tokennum = 1 OR tokennum mod 4 = 0 OR token = ' ' THEN token ELSE 'X' END AS tokenout
FROM TABLE (strtok_split_to_table(cte.fullname, cte.fullname, ',')
RETURNS (outkey VARCHAR(200), tokennum integer, token VARCHAR(200) CHARACTER SET UNICODE)) AS d
) stringshred
GROUP BY outkey
This won't be fast on a large data set, but it might suffice depending on how much data you have to process.
Breaking this down:
WITH cte AS (SELECT REGEXP_REPLACE('this is a test of this functionality', '(.)', '\1,') AS fullname FROM Sys_Calendar.calendar WHERE calendar_date = CURRENT_DATE)
This CTE just adds a comma after every character of our incoming string using that regexp_replace function. Your name will come out like J,o,h,n, ,D,o,e. You can ignore the sys_calendar part; I just put that in so it would return exactly 1 record for testing.
SELECT
tokennum,
outkey,
CASE WHEN tokennum = 1 OR tokennum mod 4 = 0 OR token = ' ' THEN token ELSE 'X' END AS tokenout
FROM TABLE (strtok_split_to_table(cte.fullname, cte.fullname, ',')
RETURNS (outkey VARCHAR(200), tokennum integer, token VARCHAR(200) CHARACTER SET UNICODE)) AS d
This subquery is the important bit. Here we create a record for every character in your incoming name. strtok_split_to_table does the work, splitting that incoming name by comma (which we added in the CTE).
The CASE expression applies your criteria: it keeps the character in record 1, in every record whose number is a multiple of 4, and wherever the character is a space, and swaps in 'X' everywhere else.
SELECT
REGEXP_REPLACE(REGEXP_REPLACE((XMLAGG(tokenout ORDER BY tokennum) (VARCHAR(200))), '(.) (.)', '\1\2') , '(.) (.)', '\1\2')
Finally we use XMLAGG to combine the many records back into one string in a single record. Because XMLAGG adds a space in between each character we have to hit it a couple of times with regexp_replace to flip those spaces back to nothing.
So... it's ugly, but it does the job.
The code above spits out:
tXXs XX X XeXX oX XhXX fXXXtXXXaXXXy
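The rule encoded in the CASE expression (keep position 1, every multiple of 4, and spaces) is easy to check outside the database. A minimal Python sketch, with a hypothetical function name, that reproduces the output above:

```python
def mask(text, keep_every=4, mask_char="X"):
    """Keep position 1, every keep_every-th position, and spaces; mask the rest."""
    out = []
    for pos, ch in enumerate(text, start=1):  # 1-based, like tokennum
        keep = pos == 1 or pos % keep_every == 0 or ch == " "
        out.append(ch if keep else mask_char)
    return "".join(out)

print(mask("this is a test of this functionality"))
print(mask("John Doe"))
```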
I couldn't think of a solution, but then @JNevill inspired me with his idea to add a comma to each character :-)
SELECT
RegExp_Replace(
RegExp_Replace(
RegExp_Replace(inputString, '(.)(.)?(.)?(.)?', '(\1(\2[\3(\4', 2)
,'(\([^ ])', 'X')
,'(\(|\[)')
,'this is a test of this functionality' AS inputString
tXXs XX X XeXX oX XhXX fXXXtXXXaXXXy
The 1st RegExp_Replace starts at the 2nd character (keeping the 1st character as-is) and processes groups of (up to) 4 characters, prefixing each with either a ( (characters #1, #2, #4, to be replaced by X unless it's a space) or a [ (character #3, no replacement), which results in:
t(h(i[s( (i(s[ (a( (t[e(s(t( [o(f( (t[h(i(s( [f(u(n(c[t(i(o(n[a(l(i(t[y(
Of course this assumes that neither character exists in your input data; otherwise you have to choose different ones.
The 2nd RegExp_Replace replaces the ( and the following character with X unless it's a space, which results in:
tXX[s( XX[ X( X[eXX( [oX( X[hXX( [fXXX[tXXX[aXXX[y(
Now there are some ( and [ characters left, which are removed by the 3rd RegExp_Replace.
As I still consider myself a beginner in Regular Expressions, there are probably better solutions :-)
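The three replacement steps can be traced with Python's re module. This is a port for illustration only, with a hypothetical function name; it relies on Python 3.5+ treating unmatched groups as empty strings, and it was checked only against the sample inputs shown. In this port, an input whose final group holds just one or two characters leaves marker characters next to each other and picks up stray X's.

```python
import re

def mask_with_markers(text):
    """Port of the three nested RegExp_Replace calls described above."""
    head, rest = text[:1], text[1:]  # the 1st character is kept as-is
    # Step 1: prefix characters #1, #2, #4 of each group with "(", #3 with "[".
    marked = head + re.sub(r"(.)(.)?(.)?(.)?", r"(\1(\2[\3(\4", rest)
    # Step 2: "(" followed by a non-space character becomes "X".
    xed = re.sub(r"\([^ ]", "X", marked)
    # Step 3: drop the remaining marker characters.
    return re.sub(r"[(\[]", "", xed)

print(mask_with_markers("this is a test of this functionality"))
```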
Edit:
In older Teradata versions not all parameters were optional, so you might have to supply values for them:
RegExp_Replace(
RegExp_Replace(
RegExp_Replace(inputString, '(.)(.)?(.)?(.)?', '(\1(\2[\3(\4', 2, 0, 'c')
,'(\([^ ])', 'X', 1, 0, 'c')
,'(\(|\[)', '', 1, 0, 'c')

Can Sybase CASE expressions have a default column name for their result?

I have a sybase query that is structured like this:
SELECT
case
when isnull(a,'') <> '' then a
else convert(varchar(20), b)
end
FROM table_name
WHERE b=123
It used to return the results of the 'case' in a column named 'converted'. It now returns the results of the 'case' in a column with an empty string name ''.
How could this be? Could there be some database configuration that defaults the results of a 'case' with no name?
(I've fixed the broken query by adding " as converted" after 'end', but I'd like to know how it used to return as 'converted' before I added the fix.)
Is this what you want?
SELECT (case when isnull(a, '') <> '' then a
else convert(varchar(20), b)
end) as converted
-------------^
FROM table_name
WHERE b = 123;
By the way, you could write the select more succinctly, keeping the conversion of b so both branches have the same type:
SELECT coalesce(nullif(a, ''), convert(varchar(20), b)) as converted

Middle Function/Query for odd and even string

I need a single query that returns the two middle characters of an even-length string and the one middle character of an odd-length string.
Currently I am using this code, but it gives an error.
SELECT S_name, MID(S_name, LENGTH(S_name)/2,1) WHERE (LENGTH(S_name) %2) = 1 OR/AND SELECT S_Name, MID(S_name,LENGTH(S_name)/2,2) WHERE (LENGTH(S_name)%2)=0 FROM Student;
I have also tried this code, but it returns an empty result.
SELECT S_name FROM Student WHERE ((LENGTH(S_name) %2) = 1 AND SUBSTRING(S_name, LENGTH(S_name)/2+1, 1)) OR ((LENGTH(S_name) %2) = 0 AND SUBSTRING(S_name, LENGTH(S_name)/2-1, 2))
Please take a look at it and point out my mistake.
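You want to use case in the select clause. Writing the odd-length position as (length + 1) / 2 keeps it a whole number, so the result does not depend on how your database rounds or truncates length / 2:
select s_name,
(case when length(s_name) % 2 = 0 then substring(s_name, length(s_name) / 2, 2)
else substring(s_name, (length(s_name) + 1) / 2, 1)
end) as middle
from student;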
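The even/odd rule is quick to verify outside SQL. A short Python sketch with a hypothetical function name; note that Python slices are 0-based while SQL substring positions are 1-based:

```python
def middle(s):
    """Two middle characters for even length, one for odd length."""
    n = len(s)
    if n == 0:
        return ""
    if n % 2 == 0:
        # 1-based positions n/2 and n/2 + 1
        return s[n // 2 - 1 : n // 2 + 1]
    # 1-based position (n + 1) / 2
    return s[n // 2]

print(middle("abcd"))
print(middle("abcde"))
```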

Detect if value is number in MySQL

Is there a way to detect if a value is a number in a MySQL query? Such as
SELECT *
FROM myTable
WHERE isANumber(col1) = true
You can use a regular expression too. It would look like:
SELECT * FROM myTable WHERE col1 REGEXP '^[0-9]+$';
Reference:
http://dev.mysql.com/doc/refman/5.1/en/regexp.html
This should work in most cases.
SELECT * FROM myTable WHERE concat('',col1 * 1) = col1
It doesn't work for non-standard numbers like
1e4
1.2e5
123. (trailing decimal)
If your data is 'test', 'test0', 'test1111', '111test', '111'
To select all records where the data is a simple int:
SELECT *
FROM myTable
WHERE col1 REGEXP '^[0-9]+$';
Result: '111'
(In regex, ^ means begin, and $ means end)
To select all records where an integer or decimal number exists:
SELECT *
FROM myTable
WHERE col1 REGEXP '^[0-9]+\\.?[0-9]*$'; -- also matches e.g. 123.12
Result: '111' (same as last example)
Finally, to select all records that contain a number anywhere in the value, use this:
SELECT *
FROM myTable
WHERE col1 REGEXP '[0-9]+';
Result: 'test0' and 'test1111' and '111test' and '111'
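The three patterns behave the same way in Python's re module, which makes them easy to try against the sample data. A small sketch; the helper name matching and the extra value '123.12' are added here for illustration:

```python
import re

values = ["test", "test0", "test1111", "111test", "111"]

def matching(pattern, items):
    """Return the items in which the pattern is found."""
    return [v for v in items if re.search(pattern, v)]

only_ints = matching(r"^[0-9]+$", values)
ints_or_decimals = matching(r"^[0-9]+\.?[0-9]*$", values + ["123.12"])
any_digit = matching(r"[0-9]+", values)

print(only_ints)
print(ints_or_decimals)
print(any_digit)
```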
SELECT * FROM myTable
WHERE col1 REGEXP '^[+-]?[0-9]*([0-9]\\.|[0-9]|\\.[0-9])[0-9]*(e[+-]?[0-9]+)?$'
Will also match signed decimals (like -1.2, +0.2, 6., 2e9, 1.2e-10).
Test:
drop table if exists myTable;
create table myTable (col1 varchar(50));
insert into myTable (col1)
values ('00.00'),('+1'),('.123'),('-.23e4'),('12.e-5'),('3.5e+6'),('a'),('e6'),('+e0');
select
col1,
col1 + 0 as casted,
col1 REGEXP '^[+-]?[0-9]*([0-9]\\.|[0-9]|\\.[0-9])[0-9]*(e[+-]?[0-9]+)?$' as isNumeric
from myTable;
Result:
col1 | casted | isNumeric
-------|---------|----------
00.00 | 0 | 1
+1 | 1 | 1
.123 | 0.123 | 1
-.23e4 | -2300 | 1
12.e-5 | 0.00012 | 1
3.5e+6 | 3500000 | 1
a | 0 | 0
e6 | 0 | 0
+e0 | 0 | 0
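The isNumeric column can be reproduced with the same pattern in Python. This is a sketch only: re.IGNORECASE is an assumption added to mirror MySQL's default case-insensitive matching on nonbinary strings, and the casted column is not reproduced because it depends on MySQL's own string-to-number conversion.

```python
import re

NUMERIC = re.compile(
    r"^[+-]?[0-9]*([0-9]\.|[0-9]|\.[0-9])[0-9]*(e[+-]?[0-9]+)?$",
    re.IGNORECASE,
)

samples = ["00.00", "+1", ".123", "-.23e4", "12.e-5", "3.5e+6", "a", "e6", "+e0"]
flags = {s: int(bool(NUMERIC.match(s))) for s in samples}

for value, is_numeric in flags.items():
    print(f"{value:7} | {is_numeric}")
```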
Returns numeric rows
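I found a solution with the following query, which works for me:
SELECT * FROM myTable WHERE col1 > 0;
This query returns only the rows where col1 holds a number greater than zero. Note that a value with a leading number, such as '12abc', is converted to 12 and also passes, while 0 and negative numbers are excluded.
Returns non numeric rows
If you want to find the rows where the column is not numeric, try this one, using the trick (!col1 > 0):
SELECT * FROM myTable WHERE !col1 > 0;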
This answer is similar to Dmitry's, but it also allows decimals as well as positive and negative numbers.
select * from myTable where col1 REGEXP '^[+-]?[[:digit:]]+(\\.[[:digit:]]+)?$'
Use a UDF (user-defined function):
CREATE FUNCTION isnumber(inputValue VARCHAR(50))
RETURNS INT
BEGIN
IF (inputValue REGEXP ('^[0-9]+$'))
THEN
RETURN 1;
ELSE
RETURN 0;
END IF;
END;
Then when you query
select isnumber('383XXXX')
--returns 0
select isnumber('38333434')
--returns 1
select isnumber(mycol) mycol1, col2, colx from tablex;
-- will return 1s and 0s for column mycol1
--you can enhance the function to take decimals, scientific notation , etc...
The advantage of using a UDF is that you can use it on the left or right side of your WHERE clause comparison. This greatly simplifies your SQL before it is sent to the database:
SELECT * from tablex where isnumber(columnX) = isnumber('UnknownUserInput');
Hope this helps.
Another alternative that seems faster than REGEXP on my computer is
SELECT * FROM myTable WHERE col1*0 != col1;
This will select all rows where col1 starts with a nonzero numeric value. A value such as '0' is excluded, because 0 != 0 is false.
Still missing this simple version:
SELECT * FROM myTable WHERE `col1` + 0 = `col1`
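(addition should be faster than multiplication; note that MySQL compares a number with a string numerically, so check the result on non-numeric values such as 'abc' before relying on it)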
Or slowest version for further playing:
SELECT *,
CASE WHEN `col1` + 0 = `col1` THEN 1 ELSE 0 END AS `IS_NUMERIC`
FROM `myTable`
HAVING `IS_NUMERIC` = 1
You can use a regular expression; for more detail see https://dev.mysql.com/doc/refman/8.0/en/regexp.html
I used ^([,|.]?[0-9])+$. This handles decimal and float numbers.
SELECT
*
FROM
mytable
WHERE
myTextField REGEXP "^([,|.]?[0-9])+$"
I recommend: if your search is simple, you can use the comparison
column*1 = column
Interestingly, it works and is fast on varchar/char fields.
SELECT * FROM myTable WHERE column*1 = column;
ABC*1 => 0 (not equal to ABC)
AB15*1 => 0 (not equal to AB15)
15AB*1 => 15 (not equal to 15AB)
15*1 => 15 (equal to 15)
SELECT * FROM myTable WHERE sign (col1)!=0
Of course sign(0) is zero, but then you could extend your query to:
SELECT * FROM myTable WHERE sign (col1)!=0 or col1=0
UPDATE: This is not 100% reliable, because "1abc" would return a sign of 1, but "ab1c" would return zero, so this only works for text that does not begin with a number.
You can do this using CAST:
SELECT * from tbl where col1 = concat(cast(col1 as decimal), "")
I have found that this works quite well
if(col1/col1= 1,'number',col1) AS myInfo
Try dividing by 1:
select if(value/1>0 or value=0,'its a number', 'its not a number') from your_table