How to use MySQL regexp with 'and' 'or' in one statement - mysql

I am using MySQL 5.x, and am trying to come up with a SQL statement to select rows based on the following dataset:
ID | Type | Name
1 | Silver | Customer A
2 | Golden | Customer B
3 | Silver, Golden | Customer C
4 | Bronze, Silver | Customer D
I need to use regexp (for legacy system reasons) in the SQL statement, and I need to select only ID=1 and ID=4. That is, I need the "Silver" and "Silver with Bronze" customer types, but not "Silver + Golden".
I am not very familiar with regular expressions and have been trying SQL like the following:
SELECT DISTINCT `customer_type` FROM `customers` WHERE
`customer_type` regexp
"(Silver.*)(^[Golden].*)"
I need to have the regular expression in one place, as above, and not split up as below:
SELECT DISTINCT `customer_type` FROM `customers` WHERE
`customer_type` regexp
"(Silver.*)"
AND NOT
`customer_type` regexp
"(Golden.*)"
LIKE would work, but I can't use it for special reasons.
SELECT DISTINCT `customer_type` FROM `customers` WHERE
`customer_type` LIKE "%Silver%"
AND NOT
`customer_type` LIKE "%Golden%"
I couldn't get the first SQL statement to work, and I'm not sure whether that is even possible.

Just try this one:
SELECT DISTINCT `id`, `customer_type`
FROM `customers`
WHERE `customer_type` regexp "^.*Silver$"
This matches "anything + Silver" or just Silver.
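As a quick check of that pattern outside the database, here is a minimal sketch in Python. It assumes Python's `re` module treats this simple pattern the same way MySQL's REGEXP does (reasonable for anchors, `.` and `*`, although the two engines differ on more advanced features). The `types` list just copies the Type column from the question.

```python
import re

# Values of the Type column from the question, in ID order (1 to 4).
types = ["Silver", "Golden", "Silver, Golden", "Bronze, Silver"]

# Same pattern as in the answer: the value must END with "Silver".
pattern = re.compile(r"^.*Silver$")

# Keep only the values the pattern accepts.
matched = [t for t in types if pattern.search(t)]

for t in matched:
    print(t)
```

Because the pattern is anchored at the end, "Silver, Golden" is rejected only because "Golden" comes last. A value such as "Golden, Silver" would still match, so this relies on the ordering shown in the sample data.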

Related

How to use LIKE operator in MYSQL?

Write an SQL query to report the patient_id, patient_name and conditions of patients who have Type I Diabetes. Type I Diabetes always starts with the DIAB1 prefix.
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| patient_id | int |
| patient_name | varchar |
| conditions | varchar |
+--------------+---------+
This table contains information of the patients in the hospital.
patient_id is the primary key for this table. conditions contains 0 or more codes separated by spaces.
So this was my solution:
SELECT *
FROM Patients
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '%DIAB1%' ;
It worked correctly for all these conditions
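+------------+--------------+--------------+
| patient_id | patient_name | conditions   |
+------------+--------------+--------------+
| 1          | Daniel       | YFEV COUGH   |
| 2          | Alice        |              |
| 3          | Bob          | DIAB100 MYOP |
| 4          | George       | ACNE DIAB100 |
+------------+--------------+--------------+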
except for this condition
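+------------+--------------+------------+
| patient_id | patient_name | conditions |
+------------+--------------+------------+
| 1          | Daniel       | SADIAB100  |
+------------+--------------+------------+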
The posted solution has a space after the first % in the second pattern, which gives the correct answer:
correct query:
SELECT *
FROM Patients
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%' ;
So, can someone please explain why this query handles that particular condition (SADIAB100) correctly and the first query does not?
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%'
The problem this is trying to address is when a condition contains the keyword (DIAB1) - while you only want to match on the beginning of the keyword.
The naive approach fails, because it matches on "SADIAB100":
WHERE conditions LIKE '%DIAB1%'
So the workaround is to search for the keyword:
either at the beginning of the whole string (ie at the beginning of the first condition) ; that's what LIKE 'DIAB1%' does
or after another condition, in which case it is preceded by a space, so ' DIAB1%'
Hence:
WHERE conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%'
A slightly neater expression is:
WHERE CONCAT(' ', conditions) LIKE '% DIAB1%'
Bottom line: if you are using a relational database, you should not be storing multiple values in a single column.
Instead of a CSV-like format, you should have a separate table to store the conditions, with each value on a separate row, allowing you to leverage the powerful set-based features that your product offers.
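The difference between the three predicates above can be checked with a small sketch. It uses Python's built-in `sqlite3` as a stand-in for MySQL, which is an assumption: SQLite's LIKE behaves the same way for these patterns, but it spells string concatenation as `||` instead of `CONCAT()`. Patient 5 ("Eve") is a hypothetical row added so that the SADIAB100 case sits alongside the other sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Patients ("
    "patient_id INTEGER PRIMARY KEY, patient_name TEXT, conditions TEXT)"
)
conn.executemany(
    "INSERT INTO Patients VALUES (?, ?, ?)",
    [
        (1, "Daniel", "YFEV COUGH"),
        (2, "Alice", ""),
        (3, "Bob", "DIAB100 MYOP"),
        (4, "George", "ACNE DIAB100"),
        (5, "Eve", "SADIAB100"),  # hypothetical row: DIAB1 is NOT a prefix here
    ],
)


def ids(where_clause):
    """Return the patient_ids selected by the given WHERE clause."""
    sql = (
        "SELECT patient_id FROM Patients WHERE "
        + where_clause
        + " ORDER BY patient_id"
    )
    return [row[0] for row in conn.execute(sql)]


# Naive: matches DIAB1 anywhere, including inside SADIAB100.
naive = ids("conditions LIKE '%DIAB1%'")

# Start of the string, or right after a space.
two_patterns = ids("conditions LIKE 'DIAB1%' OR conditions LIKE '% DIAB1%'")

# Pad with a leading space so one pattern covers both cases
# (SQLite's || plays the role of MySQL's CONCAT here).
padded = ids("(' ' || conditions) LIKE '% DIAB1%'")

print(naive, two_patterns, padded)
```

The naive pattern picks up patient 5 as well, while the other two predicates agree with each other and return only patients 3 and 4.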
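Create a regex pattern that finds the keyword 'DIAB1' at the start of a condition (at the beginning of the string or right after a space), then filter on the rows where it matches:
with main as (
select
-- '(^| )DIAB1' keeps DIAB100 but rejects SADIAB100
*, REGEXP_LIKE(conditions, '(^| )DIAB1') as is_relevant_patient
from Patients
)
select * from main where is_relevant_patient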

Count number of rows use mysql like condition

I have a table that contains a list of books. I want to count the number of rows with "mysql php" in them.
If I use a regular LIKE (the code below), it only counts the rows with 'mysql php'.
SELECT 'mysql php' as searched_word,
count(case when book_list like '%mysql php%' then 1 else 0 end) AS number_of_occurrences
FROM book_list
What I want is for it to also count the rows with the value "php mysql", but I have no idea how to do it.
This is the table:
+------------------------+
| book_list |
+------------------------+
| mysql php for dummies |
+------------------------+
| learn php mysql |
+------------------------+
| mysql php for students |
+------------------------+
| mysql database |
+------------------------+
my expected result:
+---------------+-----------------------+
| searched_word | number_of_occurrences |
+---------------+-----------------------+
| mysql php | 3 |
+---------------+-----------------------+
Make the condition an or, and use sum() (not count()):
select
'mysql php' as searched_word,
sum(book_list like '%mysql php%' or book_list like '%php mysql%') AS number_of_occurrences
from book_list
Note the way MySQL allows briefer code to count conditions, because 'true' is '1' and 'false' is '0'.
If you wanted to count rows that had both terms somewhere, eg "learn php and mysql", use this:
sum(book_list like '%php%' and book_list like '%mysql%')
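To see how the two sums differ, here is a minimal sketch that runs both against the sample table. It uses Python's `sqlite3` in place of MySQL, which is an assumption; SQLite also evaluates a comparison to 1 or 0, so `sum()` over a condition behaves the same way. The row "learn php and mysql" is a hypothetical addition taken from the example sentence above, not from the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book_list (book_list TEXT)")
conn.executemany(
    "INSERT INTO book_list VALUES (?)",
    [
        ("mysql php for dummies",),
        ("learn php mysql",),
        ("mysql php for students",),
        ("mysql database",),
        ("learn php and mysql",),  # hypothetical row: both terms, not adjacent
    ],
)

# Adjacent terms in either order: a true condition contributes 1 to the sum.
either_order = conn.execute(
    "SELECT sum(book_list LIKE '%mysql php%' OR book_list LIKE '%php mysql%') "
    "FROM book_list"
).fetchone()[0]

# Both terms anywhere in the title.
both_terms = conn.execute(
    "SELECT sum(book_list LIKE '%php%' AND book_list LIKE '%mysql%') "
    "FROM book_list"
).fetchone()[0]

print(either_order, both_terms)
```

The first sum counts the three titles from the question; the second also counts the hypothetical title where the two words are separated.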
WITH
-- parse searching criteria to separate tokens (delimiter - space), remove duplicates
cte1 AS (
SELECT DISTINCT token
FROM JSON_TABLE( CONCAT('["', REPLACE(@criteria, ' ', '","'), '"]'),
"$[*]" COLUMNS( token VARCHAR(254) PATH "$" )
) AS jsontable
),
-- select books whose title contains ALL tokens as complete words
-- words delimiter - space
-- commas, dots and other punctuation are NOT removed
cte2 AS (
SELECT book_list.book_list
FROM book_list
JOIN cte1 ON LOCATE(cte1.token, REPLACE(book_list.book_list, ' ', CHAR(0)))
GROUP BY book_list.book_list
HAVING COUNT(*) = ( SELECT COUNT(*)
FROM cte1 )
)
-- count the amount of matched titles
SELECT @criteria searched_word, COUNT(*) number_of_occurrences
FROM cte2;
PS. Requires MySQL 8.0.4 or newer.

Using REGEXP vs IN on a subquery mysql

I want to use the data from table 'similar' to find results from table 'releases'
Table 'Similar' has this structure
artist similar_artist
Moodymann Theo Parrish
Moodymann Jeff Mills
Moodymann Marcellus Pittman
Moodymann Rick Wilhite
My query so far is
SELECT * FROM releases
WHERE
releases.all_artists REGEXP 'Moodymann'
OR releases.label_no_country='KDJ'
OR releases.all_artists IN (SELECT similar_artist
FROM similar
WHERE artist='Moodymann')
ORDER BY date DESC
the column 'all_artists' has records like this:
Moodymann | Theo Parrish | Rick Wade
Jeff Mills | Moodymann | Rick Wilhite
So the end query that I want will essentially be this
SELECT * FROM releases
WHERE
releases.all_artists REGEXP 'Moodymann'
OR releases.label_no_country='KDJ'
OR releases.all_artists IN ('Theo Parrish','Jeff Mills','Marcellus Pittman','Rick Wilhite')
To make matches I think I need to use REGEXP instead of IN, but REGEXP returns the error 'Subquery returns more than 1 row'. How can I use the data returned from the subquery?
Also, the query is taking a long time to run (up to 20 seconds). Is there any way to speed this up? As it stands it is not usable in my web app.
Thanks!
The only way I know of to use REGEXP with a subquery is to have that subquery produce a REGEXP string.
SELECT * FROM releases
WHERE
releases.all_artists REGEXP 'Moodymann'
OR releases.label_no_country='KDJ'
OR releases.all_artists REGEXP (
SELECT GROUP_CONCAT(similar_artist SEPARATOR '|')
FROM similar
WHERE artist='Moodymann'
GROUP BY similar_artist)
ORDER BY date DESC
The above isn't tested; it is just a theory of what I might try. It's not going to be very optimal, however.
update
Have since tested this and found that GROUP BY similar_artist should be GROUP BY artist
SELECT * FROM releases
WHERE
releases.all_artists REGEXP 'Moodymann'
OR releases.label_no_country='KDJ'
OR releases.all_artists REGEXP (
SELECT GROUP_CONCAT(similar_artist SEPARATOR '|')
FROM similar
WHERE artist='Moodymann'
GROUP BY artist)
ORDER BY date DESC
However, as mentioned by Pheonix you would be better off refactoring your structure to have a releases_artist table. You could then do all this work via JOINs which would be much, much faster.
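The idea of letting the subquery build the pattern can be tried out in a small sketch. It uses Python's `sqlite3` rather than MySQL, so several things are assumptions: SQLite's `group_concat(x, '|')` stands in for `GROUP_CONCAT(x SEPARATOR '|')`, the REGEXP operator is supplied by a user-defined function backed by Python's `re`, and the `releases` table is reduced to a single column. The last release row is hypothetical.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Backs SQLite's `value REGEXP pattern`; NULL in gives NULL out."""
    if pattern is None or value is None:
        return None
    return 1 if re.search(pattern, value) else 0


conn = sqlite3.connect(":memory:")
# SQLite rewrites `X REGEXP Y` into a call to regexp(Y, X).
conn.create_function("REGEXP", 2, regexp)

conn.execute('CREATE TABLE "similar" (artist TEXT, similar_artist TEXT)')
conn.executemany(
    'INSERT INTO "similar" VALUES (?, ?)',
    [
        ("Moodymann", "Theo Parrish"),
        ("Moodymann", "Jeff Mills"),
        ("Moodymann", "Marcellus Pittman"),
        ("Moodymann", "Rick Wilhite"),
    ],
)

conn.execute("CREATE TABLE releases (all_artists TEXT)")
conn.executemany(
    "INSERT INTO releases VALUES (?)",
    [
        ("Moodymann | Theo Parrish | Rick Wade",),
        ("Jeff Mills | Moodymann | Rick Wilhite",),
        ("Jeff Mills | asdasdadasd | Rick Wilhite",),
        ("Unrelated Artist A | Unrelated Artist B",),  # hypothetical row
    ],
)

# The scalar subquery collapses the similar artists into one alternation,
# e.g. 'Theo Parrish|Jeff Mills|...', which is then used as the pattern.
query = """
SELECT all_artists
FROM releases
WHERE all_artists REGEXP (
    SELECT group_concat(similar_artist, '|')
    FROM "similar"
    WHERE artist = 'Moodymann'
)
ORDER BY rowid
"""
matched = [row[0] for row in conn.execute(query)]
print(matched)
```

One caveat with this approach in any engine: the names are pasted into the pattern unescaped, so an artist name containing regex metacharacters such as `.` or `(` would change the meaning of the pattern.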
Try this SQL
SELECT *
FROM releases
WHERE releases.all_artists LIKE '%Moodymann%'
OR releases.label_no_country='KDJ'
ORDER BY date DESC
SQL Fiddle
MySQL 5.5.30 Schema Setup:
CREATE TABLE Table1
(`artist` varchar(9), `similar_artist` varchar(17))
;
INSERT INTO Table1
(`artist`, `similar_artist`)
VALUES
('Moodymann', 'Theo Parrish'),
('Moodymann', 'Jeff Mills'),
('Moodymann', 'Marcellus Pittman'),
('Moodymann', 'Rick Wilhite')
;
create table allt(allf varchar(50));
insert into allt values('Moodymann | Theo Parrish | Rick Wade'),
('Jeff Mills | Moodymann | Rick Wilhite'),
('Jeff Mills | asdasdadasd | Rick Wilhite');
Query 1:
SELECT *
FROM allt
WHERE allt.allf LIKE '%Moodymann%'
Results:
| ALLF |
-----------------------------------------
| Moodymann | Theo Parrish | Rick Wade |
| Jeff Mills | Moodymann | Rick Wilhite |
You can do a join on a comma-separated list (it won't be fast, but it might be quicker than using LIKE with a leading wildcard), and you can replace your existing delimiter with a comma to allow this. You can also use a series of UNIONs to make your list of artists behave like a table to join on.
Further, you can use UNION instead of your other WHERE clauses, which might well help with allowing the use of indexes (MySQL will often use only one index per table in a query, so using OR to query on a different column can force it to not use an index for one of the columns it is checking).
As such, you can do something like the following:
SELECT releases.*
FROM releases
INNER JOIN (SELECT 'Theo Parrish' AS anArtist UNION SELECT 'Jeff Mills' UNION SELECT 'Marcellus Pittman' UNION SELECT 'Rick Wilhite') Sub1
ON FIND_IN_SET(Sub1.anArtist, REPLACE(releases.all_artists, " | ", ",")) > 0
UNION
SELECT releases.*
FROM releases
WHERE releases.label_no_country='KDJ'
However if changing the database design to split the pipe separated list of artists onto a different table is even a slight option then do that instead. It will be far quicker and will cope with far greater numbers of artists.

Select row which contains exact number in column with set of numbers separated by comma

Maybe the answer is very easy, but I can't find the right MySQL query that does what I want.
I have a table user:
| id_user | name | action_type |
+---------------------------------+
| 1 | joshua | 1,13,12,40 |
| 2 | joshua | 2,8 |
I want to select only the rows that contain an exact number in the action_type column.
action_type is stored in MySQL as TEXT.
I've tried this:
SELECT * FROM user WHERE action_type LIKE '%2%'
But it selected rows with 12 which is not what I want :(
Maybe it's possible with IN operator, but I couldn't find a right way to use this.
You are looking for FIND_IN_SET
SELECT *
FROM user
WHERE FIND_IN_SET( '2', action_type )
UPDATE
Just to mention it, this is also possible:
SELECT *
FROM user
WHERE FIND_IN_SET( 2, action_type )
MySQL will do an automatic conversion to char
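To illustrate why FIND_IN_SET avoids the false match on 12, here is a hedged sketch using Python's `sqlite3`. SQLite has no FIND_IN_SET, so the `find_in_set` function below is a hypothetical re-implementation of the behaviour described in the MySQL manual (1-based position of the item, 0 if absent, NULL if either argument is NULL); it is registered under the same name only so the query text can stay close to the one above.

```python
import sqlite3


def find_in_set(needle, strlist):
    """Approximation of MySQL's FIND_IN_SET for comma-separated lists."""
    if needle is None or strlist is None:
        return None
    if strlist == "":
        return 0
    items = strlist.split(",")
    needle = str(needle)  # mirrors the automatic conversion of 2 to '2'
    return items.index(needle) + 1 if needle in items else 0


conn = sqlite3.connect(":memory:")
conn.create_function("FIND_IN_SET", 2, find_in_set)
conn.execute('CREATE TABLE "user" (id_user INTEGER, name TEXT, action_type TEXT)')
conn.executemany(
    'INSERT INTO "user" VALUES (?, ?, ?)',
    [(1, "joshua", "1,13,12,40"), (2, "joshua", "2,8")],
)


def ids(where_clause):
    """Return the id_user values selected by the given WHERE clause."""
    sql = 'SELECT id_user FROM "user" WHERE ' + where_clause + " ORDER BY id_user"
    return [row[0] for row in conn.execute(sql)]


substring_match = ids("action_type LIKE '%2%'")       # also hits the 2 in 12
exact_string = ids("FIND_IN_SET('2', action_type)")   # whole items only
exact_number = ids("FIND_IN_SET(2, action_type)")     # numeric argument

print(substring_match, exact_string, exact_number)
```

The LIKE version returns both rows, while both FIND_IN_SET calls return only the row whose list actually contains the item 2.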
Include the delimiter in your LIKE clause:
SELECT *
FROM user
WHERE action_type = '2'
OR action_type LIKE '2,%'
OR action_type LIKE '%,2,%'
OR action_type LIKE '%,2'
Note that I had to use additional conditions to cover the cases where the item is at the beginning or end of the string, or is the only item in it.
Try
SELECT * FROM user
WHERE CONCAT( ',', action_type, ',' ) LIKE '%,2,%';
correct syntax from Sir Rufo
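Wrapping the column in delimiters means a single pattern covers every position the item can occupy. The sketch below checks that with Python's `sqlite3`, using `||` where MySQL would use `CONCAT()`, which is an assumption about equivalence for this simple case. The table name `actions` and all of its rows are hypothetical test values chosen to cover an item that is alone, first, last, in the middle, or only part of a larger number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actions (id INTEGER, action_type TEXT)")
conn.executemany(
    "INSERT INTO actions VALUES (?, ?)",
    [
        (1, "2"),           # only item
        (2, "2,8"),         # first item
        (3, "1,2"),         # last item
        (4, "1,2,3"),       # middle item
        (5, "12"),          # contains the digit 2 but not the item 2
        (6, "1,13,12,40"),  # same, inside a longer list
    ],
)

# ',' || action_type || ',' turns '2,8' into ',2,8,' so every item,
# wherever it sits, is surrounded by commas.
query = """
SELECT id
FROM actions
WHERE (',' || action_type || ',') LIKE '%,2,%'
ORDER BY id
"""
matched = [row[0] for row in conn.execute(query)]
print(matched)
```

Rows 1 to 4 are returned and rows 5 and 6 are not, so the wrapped form needs no separate cases for the start, the end, or a single-item list.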

How can I use the LIKE operator on a list of strings to compare?

I have a query I need to run on almost 2000 strings where it would be very helpful to be able to do a list like you can with the "IN" operator but using the LIKE comparison operation.
For example I want to check to see if pet_name is like any of these (but not exact): barfy, max, whiskers, champ, big-D, Big D, Sally
Using like it wouldn't be case sensitive and it can also have an underscore instead of a dash. Or a space. It will be a huge pain in the ass to write a large series of OR operators. I am running this on MySQL 5.1.
In my particular case I am looking for file names where the difference is usually a dash where an underscore would be, or vice versa.
For this task I would suggest making use of RegExp capabilities in MySQL like this:
select * from EMP where name RLIKE 'jo|ith|der';
This is a case-insensitive match and saves you from writing multiple LIKE / OR conditions.
You could do something like this -
SELECT FIND_IN_SET(
'bigD',
REPLACE(REPLACE('barfy,max,whiskers,champ,big-D,Big D,Sally', '-', ''), ' ', '')
) has_petname;
+-------------+
| has_petname |
+-------------+
| 5 |
+-------------+
It will give a non-zero value (>0) if there is a pet_name we are looking for.
But I'd suggest creating a table petnames and using the SOUNDS LIKE operator to compare names; in this case 'bigD' will be equal to 'big-D', e.g.:
SELECT 'bigD' SOUNDS LIKE 'big-D';
+---------------------------+
| 'bigD'SOUNDS LIKE 'big-D' |
+---------------------------+
| 1 |
+---------------------------+
Example:
CREATE TABLE petnames(name VARCHAR(40));
INSERT INTO petnames VALUES
('barfy'),('max'),('whiskers'),('champ'),('big-D'),('Big D'),('Sally');
SELECT name FROM petnames WHERE 'bigD' SOUNDS LIKE name;
+-------+
| name |
+-------+
| big-D |
| Big D |
+-------+
As a first step, put all the static values in a temporary table; this will be the lookup dictionary.
SELECT * FROM `Table` t
WHERE EXISTS (
SELECT *
FROM LookupTable l
-- MySQL concatenates with CONCAT(); + would be numeric addition
WHERE t.PetName LIKE CONCAT('%', l.Value, '%')
)
Configure the column containing those 2000 values for full-text searching. Then you can use MySQL's full-text search feature. Refer to their docs
You could use REGEXP instead. It worked like a charm for me
pet_name regexp 'barfy|max|whiskers|champ|you name it'
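For a list of around 2000 names, the alternation string is easier to generate than to type. The following is a minimal sketch in Python of one way to build it so that a dash, an underscore, a space, or nothing at all are treated as interchangeable. The helper name `build_pattern` and the `candidates` list are hypothetical, and the assumption is that the resulting pattern uses only constructs (alternation, a bracket expression, `?`) that MySQL's REGEXP also accepts.

```python
import re


def build_pattern(names):
    """Join names into one alternation; separators become '[-_ ]?'."""
    alternatives = []
    for name in names:
        # Split on the separators, escape each piece, and rejoin with an
        # optional separator class.
        parts = re.split(r"[-_ ]", name)
        alternatives.append("[-_ ]?".join(re.escape(p) for p in parts))
    return "|".join(alternatives)


names = ["barfy", "max", "whiskers", "champ", "big-D", "Big D", "Sally"]
pattern = build_pattern(names)

# IGNORECASE mirrors the case-insensitive matching mentioned above.
compiled = re.compile(pattern, re.IGNORECASE)

candidates = ["Big_D", "bigd", "MAX", "rex", "sally.jpg"]
matched = [c for c in candidates if compiled.search(c)]

print(pattern)
print(matched)
```

The generated string could then be passed as the right-hand side of `pet_name REGEXP ...`. As with LIKE '%...%', this is a substring match, so "max" would also match a name such as "maxine" unless the pattern is anchored.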