I have a user table with a column (say, interests) that stores comma-separated interest ids as its value.
e.g.
user interests
A 12,13,15
B 10,11,12,15
C 9,13
D 10,12
Now I have a string of comma-separated values, "13,15".
I want to fetch the users who have interest 13 or 15 from the table above. It should return users A, B and C: user A has both interests (13 and 15), user B matches on 15, and user C matches on 13.
What would the SQL be? I have a lot of users in my table.
It can be done with regexp as @1000111 said, but with a more complicated regexp. Look at this, for example:
(^|,)(13|15)(,|$)
This will not match 13 in 135, or 1 in 13, and so on. For example, for the number 13 it matches the following strings:
1,13,2
13,1,2
1,13
13,2
13
But it will not match these:
1,135,2
131,2
1,113
And this is the query:
SET @search = '13,15';
SELECT *
FROM test
WHERE interests REGEXP CONCAT('(^|,)(', REPLACE(@search, ',', '|'), ')(,|$)')
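The pattern can be sanity-checked outside the database. The following is a minimal sketch in Python, assuming Python's `re` module treats this particular pattern the same way MySQL's REGEXP does (it uses only grouping, alternation and anchors). The helper name `build_pattern` is hypothetical and simply mirrors the CONCAT/REPLACE expression in the query.

```python
import re

def build_pattern(search: str) -> str:
    """Turn '13,15' into '(^|,)(13|15)(,|$)' (hypothetical helper)."""
    ids = [re.escape(part.strip()) for part in search.split(",") if part.strip()]
    return "(^|,)(" + "|".join(ids) + ")(,|$)"

# Strings from the answer above, checked against the single id 13.
single = re.compile(build_pattern("13"))
should_match = ["1,13,2", "13,1,2", "1,13", "13,2", "13"]
should_not_match = ["1,135,2", "131,2", "1,113"]
matched = [s for s in should_match if single.search(s)]
rejected = [s for s in should_not_match if not single.search(s)]

# Sample rows from the question, checked against the search string '13,15'.
users = {"A": "12,13,15", "B": "10,11,12,15", "C": "9,13", "D": "10,12"}
both = re.compile(build_pattern("13,15"))
hits = [name for name, interests in users.items() if both.search(interests)]
print(hits)
```

Because each id has to be bounded by a comma or by the start or end of the string, ids such as 135 or 113 are not mistaken for 13.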
If you want to get the result based on loose matching then you can follow this query:
Loose matching means interests like 135,151 would also appear while searching for '13,15'.
SET @inputInterest := "13,15";
SELECT
*
FROM userinterests
WHERE interests REGEXP REPLACE(@inputInterest,',','|');
For the given data you will get an output like below:
| ID | user | interests |
|----|------|-------------|
| 1 | A | 12,13,15 |
| 2 | B | 10,11,12,15 |
| 3 | C | 9,13 |
EDIT:
If you want results that match at least one of the interests exactly, you can use the regex that @Andrew mentioned in this answer:
Here I've modified my query based on his insight:
SET @inputInterest := "13,15";
SELECT
*
FROM userinterests
WHERE interests REGEXP CONCAT('(^|,)(', REPLACE(@inputInterest, ',', '|'), ')(,|$)')
Note:
You need to replace the @inputInterest variable with your input string.
Suggestion:
Is storing a delimited list in a database column really that bad?
I have a MySQL database with a varchar column (although the column type can be changed if needed).
The column stores some ids separated with underscores like so:
Row 1: 1
Row 2: 1_2_3
Row 3: 10_2
Row 4: 4_5_1
Is there any way, with this structure, to query that column for 1 and return all rows containing the ID 1 (but not Row 3, which contains the character 1 only as part of the ID 10)?
To get the current results I am searching the column with LIKE '%1%'.
Or do I need to change the structure to achieve the result I want?
Maybe you can try:
select *
from t
where c like '1\_%'
or c like '%\_1'
or c like '%\_1\_%'
or c = '1'
You need to escape the underscore as \_ because in a LIKE pattern an unescaped _ is a wildcard that matches any single character.
If we had a comma separator, then we could use MySQL FIND_IN_SET function.
We can use MySQL REPLACE function to change the underscores to commas,
e.g.
SELECT t.*
FROM t
WHERE FIND_IN_SET('1',REPLACE( t.id ,'_',','))
Reference:
https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_find-in-set
https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_replace
NOTE:
Storing underscore separated lists is an antipattern. See Chapter 2 of Bill Karwin's book "SQL Antipatterns: Avoiding the Pitfalls of Database Programming"
https://www.amazon.com/SQL-Antipatterns-Programming-Pragmatic-Programmers/dp/1934356557
With the LIKE operator:
select * from tablename
where concat('_', id, '_') like '%#_1#_%' escape '#'
Results:
| id |
| ----- |
| 1 |
| 1_2_3 |
| 4_5_1 |
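The delimiter-wrapping trick can be reproduced locally. This is a rough sketch using Python's built-in `sqlite3` module rather than MySQL, so it is an assumption that the two engines agree here: SQLite also supports LIKE ... ESCAPE, but it concatenates with `||` where MySQL uses CONCAT. The table name follows the answer above and the rows are the four from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?)",
    [("1",), ("1_2_3",), ("10_2",), ("4_5_1",)],
)

# Wrap the stored value in underscores so every id, including the first
# and the last, is surrounded by delimiters, then look for '_1_'.
# '#' is the escape character that makes '_' a literal underscore.
query = """
    SELECT id FROM tablename
    WHERE '_' || id || '_' LIKE '%#_1#_%' ESCAPE '#'
    ORDER BY id
"""
rows = [row[0] for row in conn.execute(query)]
print(rows)
```

Wrapping the column value means a single pattern covers the cases that the four-way OR query handles separately (id at the start, at the end, in the middle, or alone).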
I have started learning MySQL and I'm stuck on a case.
I have the following table:
id | value
1 | abc
1 | def
2 |
2 |
3 | pqr
3 |
4 |
4 | xyz
5 |
Please note that the blanks next to the ids denote empty strings.
Problem statement: I need the ids whose group, when rows are grouped by id, contains only empty strings. In this example those ids are 2 and 5.
Explanation: id = 2 appears twice, both times with an empty string, so it is included. id = 5 appears once with an empty string, so it is included. id = 3 is not included because one of its rows has a non-empty value, "pqr".
I am stuck with the query:
SELECT * FROM t1 GROUP BY id;
But this gives a wrong result.
Could you please help me out? What query returns ids 2 and 5? Sorry about the table formatting.
SELECT DISTINCT t1.id
FROM t1
LEFT JOIN t1 t1d ON t1d.id = t1.id AND t1d.value <> ''
WHERE t1d.id IS NULL
without GROUP BY and HAVING = one happy database!
You can achieve the expected outcome with conditional counting compared to counting of all rows within a group:
select id from t1
group by id
having count(*)=count(if(`value`='',1,null))
count(*) returns the number of records with the corresponding id. count(if(value='',1,null)) returns the number of those records whose value field is an empty string.
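To see the conditional count at work on the question's data, here is a small sketch using Python's built-in `sqlite3` module. It is an approximation of the MySQL query, not a copy: the MySQL-specific IF(...) is swapped for a standard CASE expression, which also yields NULL when the condition is false and is therefore skipped by COUNT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(1, "abc"), (1, "def"), (2, ""), (2, ""), (3, "pqr"),
     (3, ""), (4, ""), (4, "xyz"), (5, "")],
)

# A group qualifies when the number of empty-string rows equals the
# total number of rows in that group.
query = """
    SELECT id FROM t1
    GROUP BY id
    HAVING COUNT(*) = COUNT(CASE WHEN value = '' THEN 1 END)
    ORDER BY id
"""
ids = [row[0] for row in conn.execute(query)]
print(ids)
```

For id 3 the total count is 2 but the conditional count is 1, so the group is filtered out.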
Using the query below you will get your desired output.
select id
from test_empty
group by id
having TRIM(TRAILING ',' FROM group_concat(value))=""
group_concat(value) concatenates the values for each id, separated by commas.
TRIM(TRAILING ',' FROM group_concat(value)) removes the trailing commas.
The HAVING clause puts a condition on each group, so only ids whose values are all blank are retrieved.
An empty string will always be "less than" any non-empty string, so this should do the trick:
select id from t1
group by id
having max(value) = ''
I have a column called "Permissions" in my table. The permissions are strings which can be:
"r","w","x","rw","wx","rwx","xwr"
etc. Please note that the order of characters in the string is not fixed. I want to GROUP_CONCAT() on the "Permissions" column of my table. However, this produces very large strings.
Example: "r","wr","wx" group concatenated is "r,wr,wx" but should be "r,w,x" or "rwx". Using the DISTINCT clause doesn't seem to help much. I am thinking that if I could check whether a permission value is a substring of another value, I could skip concatenating it, but I can't find a way to accomplish that.
Any column-based approach using solely string functions would also be appreciated.
EDIT:
Here is some sample data:
+---------+
| perm |
+---------+
| r,x,x,r |
| x |
| w,rw |
| rw |
| rw |
| x |
| w |
| x,x,r |
| r,x |
+---------+
The concatenated result should be:
+---------+
| perm |
+---------+
| r,w,x |
+---------+
I don't have control over the source of the data and would rather not create new tables (because of restricted privileges and memory constraints). I am looking for a post-processing step that converts each column value to the desired format.
A good idea would be to first normalize your data.
You could, for example, try it this way (I assume your source table is named Files):
Create a simple table called PermissionCodes with a single string column named Code.
Put r, w, and x as values into PermissionCodes (three rows total).
In a subquery, join Files to PermissionCodes on the condition that Code occurs as a substring of Permissions.
Perform your GROUP_CONCAT aggregation on the result of the subquery.
If, for the same logical entries in Files, there are multiple overlapping permission sets (i.e. for some file there is a row with rw and another row with w), limit your subquery to distinct combinations of the Files key and Code.
Here's a fiddle to demonstrate the idea:
http://sqlfiddle.com/#!9/6685d6/4
You can try something like:
SELECT user_id, GROUP_CONCAT(DISTINCT perm)
FROM Permissions AS p
INNER JOIN (SELECT 'r' AS perm UNION ALL
SELECT 'w' UNION ALL
SELECT 'x') AS x
ON p.permission LIKE CONCAT('%', x.perm, '%')
GROUP BY user_id
You can include any additional permission codes in the UNION ALL of the derived table that is joined to the Permissions table.
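The join against a derived table of permission codes can be tried on the sample data from the question. The sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL, so `||` replaces CONCAT. The single `user_id` value of 1 is an assumption made only so that there is something to group by, since the question's sample has no such column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Permissions (user_id INTEGER, permission TEXT)")
sample = ["r,x,x,r", "x", "w,rw", "rw", "rw", "x", "w", "x,x,r", "r,x"]
# user_id = 1 for every row is a made-up value for illustration.
conn.executemany(
    "INSERT INTO Permissions VALUES (1, ?)", [(p,) for p in sample]
)

# Each row joins once per code it contains, and DISTINCT collapses
# the repeats inside the aggregate.
query = """
    SELECT p.user_id, GROUP_CONCAT(DISTINCT x.perm)
    FROM Permissions AS p
    INNER JOIN (SELECT 'r' AS perm UNION ALL
                SELECT 'w' UNION ALL
                SELECT 'x') AS x
      ON p.permission LIKE '%' || x.perm || '%'
    GROUP BY p.user_id
"""
user_id, perms = conn.execute(query).fetchone()
# The order inside GROUP_CONCAT is not guaranteed, so sort before use.
codes = sorted(perms.split(","))
print(codes)
```

Since the order of the concatenated codes is not guaranteed without an explicit ordering, the sketch sorts them before comparing.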
I have a table containing the following values:
id | value |
-----------------------
1 | 1,2,5,8,12,20 |
2 | 11,25,26,28 |
-----------------------
Now I want to search the value column above for some comma-separated IDs, e.g. '1,3,6,7,11':
SELECT id FROM tbl_name
WHERE value REGEXP '*some reg exp goes here containing 1,3,6,7,11*'
LIMIT 1,0;
SELECT id FROM tbl_name
WHERE value REGEXP '*some reg exp goes here containing 3,6,27,15*'
LIMIT 1,0;
The first query above should return 1, while the second should return NULL.
I am new to regular expressions. Can anyone help? Thanks.
REGEXP '(^|,)(1|3|6|7|11)(,|$)'
This will match every value that contains at least one of the numbers 1, 3, 6, 7, 11.
You should not use one column to store several values. Normalize your data!
Edited answer
I am storing in a column a list of states that are separated by commas:
Like this: 1,2,3,4,5,6,7,8.. and so on, just their IDs.
Then I am doing a query to get all the rows that have the state with the ID 8, and it works when the list of states has few items.
This is the query and the table:
mysql> select id_partner, name, states from partner where 8 IN (states);
+------------+----------------+--------------------+
| id_partner | name | states |
+------------+----------------+--------------------+
| 1 | Inmo Inmo | 8,9,10 |
| 2 | Foto Piso | 8,9,10,11,12,13,14 |
| 4 | PARTNER 001-A | 8 |
| 6 | EnAlquiler | 8 |
| 7 | Habitaclia.com | 8,43,50 |
+------------+----------------+--------------------+
5 rows in set (0.00 sec)
If the states column contains 10 comma-separated IDs it works, but with 50 or more it no longer works. The result above does not show the row that has many states, even though they include the one with ID 8.
Any idea? I am using this approach to avoid creating another table for the relation between partners and states. Or would it be better to do that?
That's not how the IN clause works--you need to use the FIND_IN_SET function to search comma separated values in a single column:
SELECT *
FROM partner
WHERE FIND_IN_SET(8, states) > 0
Realistically, you should not be storing comma-delimited data. It's known as denormalized data, and it makes it difficult to get at the individual values between the commas.
Your string is a single value, not multiple values. It happens to contain commas, but it's still a single string.
The value of a string in a numeric context is taken from its leading digits. In other words, the numeric value of '123,456,789' is 123. MySQL ignores all characters past the first non-digit. (This is the behavior in MySQL; other databases can behave differently.)
The IN( ) predicate allows you to compare to multiple values, but you have to make them separate SQL expressions. For instance in the following two queries, only the first one returns anything:
SELECT * FROM partner WHERE 5 IN ( 1, 2, 3, 4, 5 );
SELECT * FROM partner WHERE 5 IN ( '1,2,3,4,5' );
This shows one of the many reasons why you shouldn't use a string with comma-separated elements and expect it to behave like it's really a collection of separate values.
The better design in this case is to store the states for a given partner in another table, where each association between a partner and a state is on a separate row.
create table partner_states (
id_partner int not null,
id_state int not null,
primary key (id_partner, id_state)
);
Then populate it:
insert into partner_states (id_partner, id_state)
values (1, 8), (1, 9), (1, 10);
Then you can query for partners that match a certain state easily:
select p.id_partner, p.name, p.states
from partner p join partner_states s on p.id_partner = s.id_partner
where s.id_state = 8
Querying for partners that don't match a state you can do with an outer join:
select p.id_partner, p.name, p.states
from partner p left outer join partner_states s on p.id_partner = s.id_partner
and s.id_state = 8
where s.id_state is null
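The normalized design can be exercised end to end with a small sketch. It uses Python's built-in `sqlite3` module instead of MySQL, and the partner rows are partly made up: partners 1 and 4 come from the question's output, while partner 9 ('No State 8') is a hypothetical row added so that the outer-join query has something to return. Column names in the SELECT list are qualified with the table alias because id_partner exists in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE partner (id_partner INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE partner_states (
        id_partner INTEGER NOT NULL,
        id_state   INTEGER NOT NULL,
        PRIMARY KEY (id_partner, id_state)
    );
    -- Partner 9 is a hypothetical row with no association to state 8.
    INSERT INTO partner VALUES
        (1, 'Inmo Inmo'), (4, 'PARTNER 001-A'), (9, 'No State 8');
    INSERT INTO partner_states VALUES
        (1, 8), (1, 9), (1, 10), (4, 8), (9, 43);
""")

# Partners associated with state 8: a plain inner join.
with_state = [row[0] for row in conn.execute("""
    SELECT p.id_partner
    FROM partner p JOIN partner_states s ON p.id_partner = s.id_partner
    WHERE s.id_state = 8
    ORDER BY p.id_partner
""")]

# Partners not associated with state 8: outer join, keep unmatched rows.
without_state = [row[0] for row in conn.execute("""
    SELECT p.id_partner
    FROM partner p LEFT OUTER JOIN partner_states s
      ON p.id_partner = s.id_partner AND s.id_state = 8
    WHERE s.id_state IS NULL
    ORDER BY p.id_partner
""")]
print(with_state, without_state)
```

With one row per association, both questions become ordinary equality comparisons that the primary key index can serve, and no string parsing is involved.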