I have a bad-projected database which have a ID sets in text columns like "1,2,5,10". I need to get an intersection of two columns which are set by same way.
I don't like to do it using PHP or another scripting language, I also not like MySQL custom functions.
Is there any way to get an intersection of two sets given by comma-delimeter strings?
Actually I don't need to have full intersection, I just need to know is there same numbers in two sets. If yes, I need to have "1", if no same number, I need to have "0".
Thank you.
You may be able to use REGEXP to do this with a bit of clever replacing.
Think this should do it (disclaimer: haven't tested it extensively):
SELECT col1,
col2,
CONCAT('(', REPLACE(col2, ',', '(\\,|$)|'), '(\\,|$))') AS regex,
col1 REGEXP CONCAT('(', REPLACE(col2, ',', '(\\,|$)|'), '(\\,|$))') AS intersect
FROM tbl
See http://sqlfiddle.com/#!2/7b86f/3
To explain: This converts col2 into a regular expression for matching against col1. The (\,|$) bit matches against either a comma or the end of the string. Hope this helps...
The code from Steve does not work in all cases.
For e.x it does not work when a number can be found in another number. INSERT INTO tbl (col1, col2)
VALUES ('60,61,64,68,73', '14,16,17,18,1');
With a little tweak it can work:
SELECT col1,
col2,
CONCAT('((\\,)', REPLACE(col2,',', '(\\,)|(\\,)'), '(\\,))') AS regex,
CONCAT(',',col1,',') REGEXP CONCAT('((\\,)', REPLACE(col2,',', '(\\,)|(\\,)'), '(\\,))') AS intersect
FROM tbl
Related
I want to count how many columns in a row are not NULL.
The table is quite big (more than 100 columns), therefore I would like to not do it manually or using php (since I dont use php) using this approach Counting how many MySQL fields in a row are filled (or empty).
Is there a simple query I can use in a select like SELECT COUNT(NOT ISNULL(*)) FROM big_table;
Thanks in advance...
Agree with comments above:
There is something wrong in the data since there is a need for such analysis.
You can't completely make it automatic.
But I have a recipe for you for simplifying the process. There are only 2 steps needed to achieve your aim.
Step 0. In the step1 you'll need to get the name of your table schema. Normally, the devs know in what schema does the table reside, but still... Here is how you can find it
select *
from information_schema.tables
where table_name = 'test_table';
Step 1. First of all you need to get the list of columns. Getting just the list of cols won't help you out at all, but this list is all we need to be able to create SELECT statement, right? So, let's make database to prepare select statement for us
select concat('select (length(concat(',
group_concat(concat('ifnull(', column_name, ', ''###'')') separator ','),
')) - length(replace(concat(',
group_concat(concat('ifnull(', column_name, ', ''###'')') separator ','),
'), ''###'', ''''))) / length(''###'')
from test_table')
from information_schema.columns
where table_schema = 'test'
and table_name = 'test_table'
order by table_name,ordinal_position;
Step 3. Execute statement you've got on step 2.
select (length(concat(.. list of cols ..)) -
length(replace(concat(.. list of cols .. ), '###', ''))) / length('###')
from test_table
The select looks tricky but it's simple: first replace all nulls with some symbols that you're sure you'll never get in those columns. I usually do that replacing nulls with "###". that what all that "ifnull"s are here for.
Next, count symbols with "length". In my case it was 14
After that, replace all "###" with blanks and count length again. It's 11 now. For that I was using "length(replace" functions together
Last, just divide (14 - 11) by a length of a replacement string ("###" - 3). You'll get 1. This is exactly amount of nulls in my test string.
Here's a test case you can play with
Do not hesitate to ask if needed
I am a beginner so please help me.
There are 2 things you need to combine in this case.
Because you didn't provide enough information in your question we have to guess what you mean by name. I'm going to assume that you have a single name column, but that would be unusual.
With strings, to match a character column that is not an exact match, you need to use LIKE which allows for wildcards.
You also need to negate the match, or in other words show things that are NOT (something).
First to match names that START with 'A'.
SELECT * FROM table_name WHERE name LIKE 'A%';
This should get you all the PEOPLE who have names that "Start with A".
Some databases are case sensitive. I'm not going to deal with that issue. If you were using MySQL that is not an issue. Case sensitivity is not universal. In some RDBMS like Oracle you have to take some steps to deal with mixed case in a column.
Now to deal with what you actually want, which is NOT (starting with A).
SELECT * FROM table_name WHERE name NOT LIKE 'A%';
your question should have more detail however you can use the substr function
SELECT name FROM yourtable
WHERE SUBSTR(name,1,1) <> 'A'
complete list of mysql string functions here
mysql docs
NOT REGXP operator
MySQL NOT REGXP is used to perform a pattern match of a string expression expr against a pattern pat. The pattern can be an extended regular expression.
Syntax:
expr NOT REGEXP pat
Query:
SELECT * FROM emp_table WHERE emp_name NOT REGEXP '^[a]';
or
SELECT * FROM emp_table WHERE emp_name NOT REGEXP '^a';
I want to to create a regex to find all columns that only have a single character ([A-Z]) as name, like N or M but not NM.
I've tried:
SELECT * FROM 'table' WHERE Name REGEXP '^[A-Z]'
But it's not displaying the expected result.
Try ^[A-Z]$.
You then state that this character is first and also last character of the value.
The regex functions in Oracle work only on one column. So, to search for just one character in a column, you would do the following:
select * from yourTable where REGEXP_LIKE (col1, '^[A-z]$');
Now, to search all the char/varchar columns on your table, you'll need to chain the regex expressions together, like so:
select * from yourTable where REGEXP_LIKE (col1, '^[A-z]$') or REGEXP_LIKE (col3, '^[A-z]$');
SQL solution:
where name in ('N','M')
I am looking to compare the results of 2 cells in the same row. the way the data is structured is essentially this:
Col_A: table,row,cell
Col_B: row
What I want to do is compare when Col_A 'row' is the same as Col_B 'row'
SELECT COUNT(*) FROM MyTable WHERE Col_A CONTAINS Col_B;
sample data:
Col_A: a=befd-47a8021a6522,b=7750195008,c=prof
Col_B: b=7750195008
Col_A: a=bokl-e5ac10085202,b=4478542348,c=pedf
Col_B: b=7750195008
I am looking to return the number of times the comparison between Col_A 'b' and Col_B 'b' is true.
This does what I was looking for:
SELECT COUNT(*) FROM MyTable WHERE Col_A LIKE CONCAT('%',Col_B,'%');
I see You answered Your own question.
SELECT COUNT(*) FROM MyTable WHERE Col_A LIKE CONCAT('%',Col_B,'%');
is good from performance perspective. While normalization is very good idea, it would not improve speed much in this particular case. We must simply scan all strings from table. Question is, if the query is always correct. It accepts for example
Col_A: a=befd-47a8021a6522,ab=7750195008,c=prof
Col_B: b=7750195008
or
Col_A: a=befd-47a8021a6522,b=775019500877777777,c=prof
Col_B: b=7750195008
this may be a problem depending on the data format. Solution is quite simple
SELECT COUNT(*) FROM MyTable WHERE CONCAT(',',Col_A,',') LIKE CONCAT('%,',Col_B,',%');
But this is not the end. String in LIKE is interpreted and if You can have things like % in You data You have a problem. This should work on mysql:
SELECT COUNT(*) FROM MyTable WHERE LOCATE(CONCAT(',',Col_B,','), CONCAT(',',Col_A,','))>0;
SELECT * FROM MyTable WHERE Col_A = Col_B (AND Col_A = 'cell')
^^ Maybe you are looking for this statement. The part in brackets is optional.
If this is not the solution, please supply us with further information.
The easiest way would be to use the IN operator.
SELECT COUNT(*) FROM MyTable WHERE Col_A IN (Col_B);
More info on the IN operator: http://www.w3schools.com/sql/sql_in.asp
There's also the SUBSTRING() or MID() (depending on what you're using) function if you know that the substring will be in the same position everytime.
MID()/SUBSTRING() function: http://www.w3schools.com/sql/sql_func_mid.asp
You can use SUBSTRING_INDEX to extract a delimited field from a column.
SELECT COUNT(*)
FROM MyTable
WHERE Col_B = SUBSTRING_INDEX(SUBSTRING_INDEX(Col_A, ',', 2), ',', -1)
You need to call it twice to get a single field. The inner call gets the first two fields, the outer call gets the last field of that.
Note that this will be very slow if the table is large, because it's not possible to index substrings in MySQL. It would be much better if you normalized your schema so each field is in a separate column.
If column Col_a has data with format table,row,cell then search expression will be next:
SELECT COUNT(*) FROM MyTable AS MT
WHERE SUBSTRING(Col_A,
INSTR(Col_A, ',b=') + 3,
INSTR(Col_A, ',c=') - INSTR(Col_A, ',b=') + 3) = Col_B
I am trying to search multiple columns in my Db using a regex. It works but using many and/or statments. I was wondering if it was possible to use something like this;
SELECT * FROM table REGEXP 'regex' IN (col1, col2, col3,.....)
This doesn't work, it was a guess at the syntax because I can't find anything similar by searching online. Is this a stupid idea or am I missing something very simple?
If you want to regexp search a value in multiple columns then you can do:
SELECT * FROM table where CONCAT(col1, col2, col3) REGEXP 'search-pattern';
The syntax for MySQL REGEX comparison is
expr REGEXP pattern_string
You cannot use it with IN. You would have to do:
SELECT * FROM `table` WHERE
col1 REGEXP 'regex'
OR col2 REGEXP 'regex'
OR col3 REGEXP 'regex'
You could also use RLIKE -- they are synonyms.