I need to find a record who dont have a specific value in CSV column. below is the table structure
CREATE TABLE `employee` (
`id` int NOT NULL AUTO_INCREMENT,
`first_name` varchar(100) NOT NULL,
`last_name` varchar(100) NOT NULL,
`keywords` text,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Sample record1: 100, Sam, Thompson, "50,51,52,53"
Sample record2: 100, Wan, Thompson, "50,52,53"
Sample record3: 100, Kan, Thompson, "53,52,50"
50 = sports
51 = cricket
52 = soccer
53 = baseball
i need to find the employees name who has the tags of "sports,soccer,baseball" excluding cricket
so the result should return only 2nd and 3rd record in this example as they dont have 51(cricket) but all other 3 though in diff pattern.
My query is below, but i couldnt get it worked any more.
SELECT t.first_name,FROM `User` `t` WHERE (keywords like '50,52,53') LIMIT 10
is there anything like unlike option? i am confused how to get this worked.
You could use FIND_IN_SET:
SELECT t.first_name
FROM `User` `t`
WHERE FIND_IN_SET('50', `keywords`) > 0
AND FIND_IN_SET('52', `keywords`) > 0
AND FIND_IN_SET('53', `keywords`) > 0
AND FIND_IN_SET('51', `keywords`) = 0;
Keep in mind it could be slow. The correct way is to normalize your table structure.
FIND_IN_SET will do the job for you but it does not use indexes. This is not a bug it's a feature.
SUBSTRING_INDEX can use an index and return the data as you wish. You don't have an index on it at the moment, But the catch here is that TEXT fields cannot be fully indexed and what you have is a TEXT field.
Normalize!
This is what you really should be doing. It's not a good idea to store comma separated values in a database. You really should be having a keywords table and since the keywords will be short, you can have a char or varchar narrow column which can be fully indexed.
Related
Problem description
I have a table, say trans_flow:
CREATE TABLE trans_flow (
id BIGINT(20) AUTO_INCREMENT PRIMARY KEY,
card_no VARCHAR(50) DEFAULT NULL,
money INT(20) DEFAULT NULL
)
New data is inserted into this table constantly.
Now, I want to fetch only the rows that have not been fetched in the last query. For example, at 5:00, id ranges from 1 to 100, and I read the rows 80 - 100 and do some processing. Then, at 5:01, the id comes to 150, and I want to get exactly the rows 101 - 150. Otherwise, the processing program will read in old and already processed data. Note that such queries are committed continuously. From a certain perspective, I want to implement "streaming process" on MySQL.
A tentative idea
I have a simple but maybe ugly solution. I create an auxiliary table query_cursor which stores the beginning and end ids of one query:
CREATE TABLE query_cursor (
task_id VARCHAR(20) PRIMARY KEY COMMENT 'Specify which task is reading this table',
first_row_id BIGINT(20) DEFAULT NULL,
last_row_id BIGINT(20) DEFAULT NULL
)
During each query, I first update the query range stored in this table by:
UPDATE query_cursor
SET first_row_id = (SELECT last_row_id + 1 FROM query_cursor WHERE task_id = 'xxx'),
last_row_id = (SELECT MAX(id) FROM trans_flow)
WHERE task_id = 'xxx'
And then, doing query on table trans_flow using stored cursors:
SELECT * FROM trans_flow
WHERE id BETWEEN (SELECT first_row_id FROM query_cursor WHERE task_id = 'xxx')
AND (SELECT last_row_id FROM query_cursor WHERE task_id = 'xxx')
Question for help
Is there a simpler and more elegant implementation that can achieve the same effect (the best if no need to use an auxiliary table)? The version of MySQL is 5.7.
I am trying to search comma separated values from database table column contains comma separated string.
MY DB
id interest status
------------------------
1 1,2,3 1
2 4 1
My search combination contains 1,2, 3,2, 3, 1,4 etc. Any combination will occure.
I want to show all the id that contains any digit from comma separated search combination.
For example, search for 1,4 should return
id
--
1
2
For example, search for 3,2 should return
id
--
1
I have tried using IN and FIND_IN_SET but none of them achieved my result. Is there any other option.
SELECT * FROM `tbl_test` WHERE interest IN (3)
The above code return empty set.
Like Jens has pointed out in the comments, it is highly recommended to normalize your schema.
If you wish to continue with string and comma separated values, you should then be looking at complex regex matching (which I leave it to you to explore).
However, one more alternative is to convert your column interest as JSON datatype. MYSQL 5.7 and above supports these datatypes.
CREATE TABLE IF NOT EXISTS `tbl` (
`id` int(6) unsigned NOT NULL,
`interest` JSON DEFAULT NULL,
`status` int(1) NOT NULL,
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
INSERT INTO `tbl` (`id`, `interest`,`status`) VALUES
(1, '[1,2,3,4]',1),
(2, '[1,2]',1),
(3, '[3]',1);
And then query it as follows :
select id from tbl where JSON_CONTAINS( interest ,'[1,2]')
select id from tbl where JSON_CONTAINS( interest ,'[3,4]');
...
You can see it action in this sql fiddle.
I have a table that contains a bunch of numbers seperated by a comma.
I would like to retrieve rows from table where an exact number not a partial number is within the string.
EXAMPLE:
CREATE TABLE IF NOT EXISTS `teams` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`uids` text NOT NULL,
`islive` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;
INSERT INTO `teams` (`id`, `name`, `uids`, `islive`) VALUES
(1, 'Test Team', '1,2,8', 1),
(3, 'Test Team 2', '14,18,19', 1),
(4, 'Another Team', '1,8,20,23', 1);
I would like to search where 1 is within the string.
At present if I use Contains or LIKE it brings back all rows with 1, but 18, 19 etc is not 1 but does have 1 within it.
I have setup a sqlfiddle here
Do I need to do a regex?
You only need 1 condition:
select *
from teams
where concat(',', uids, ',') like '%,1,%'
I would search for all four possible locations of the ID you are searching for:
As the only element of the list.
As the first element of the list.
As the last element of the list.
As an inner element of the list.
The query would look like:
select *
from teams
where uids = '1' -- only
or uids like '1,%' -- first
or uids like '%,1' -- last
or uids like '%,1,%' -- inner
You could probably catch them all with a OR
SELECT ...
WHERE uids LIKE '1,%'
OR uids LIKE '%,1'
OR uids LIKE '%, 1'
OR uids LIKE '%,1,%'
OR uids = '1'
You didn't specify which version of SQL Server you're using, but if you're using 2016+ you have access to the STRING_SPLIT function which you can use in this case. Here is an example:
CREATE TABLE #T
(
id int,
string varchar(20)
)
INSERT INTO #T
SELECT 1, '1,2,8' UNION
SELECT 2, '14,18,19' UNION
SELECT 3, '1,8,20,23'
SELECT * FROM #T
CROSS APPLY string_split(string, ',')
WHERE value = 1
You SQL Fiddle is using MySQL and your syntax is consistent with MySQL. There is a built-in function to use:
select t.*
from teams t
where find_in_set(1, uids) > 0;
Having said that, FIX YOUR DATA MODEL SO YOU ARE NOT STORING LISTS IN A SINGLE COLUMN. Sorry that came out so loudly, it is just an important principle of database design.
You should have a table called teamUsers with one row per team and per user on that team. There are numerous reasons why your method of storing the data is bad:
Numbers should be stored as numbers, not strings.
Columns should contain a single value.
Foreign key relationships should be properly declared.
SQL (in general) has lousy string handling functions.
The resulting queries cannot be optimized.
Simple things like listing the uids in order or removing duplicate are unnecessarily hard.
If I have a table like this:
CREATE TABLE `Suppression` (
`SuppressionId` int(11) NOT NULL AUTO_INCREMENT,
`Address` varchar(255) DEFAULT NULL,
`BooleanOne` bit(1) NOT NULL DEFAULT '0',
`BooleanTwo` bit(1) NOT NULL DEFAULT '0',
`BooleanThree` bit(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`SuppressionId`),
)
Is there a set-based way in which I can select all records which have exactly one of the three bit fields = 1 without writing out the field names?
For example given:
1 10 Pretend Street 1 1 1
2 11 Pretend Street 0 0 0
3 12 Pretend Street 1 1 0
4 13 Pretend Street 0 1 0
5 14 Pretend Street 1 0 1
6 14 Pretend Street 1 0 0
I want to return records 4 and 6.
You could "add them up":
where cast(booleanone as unsigned) + cast(booleantwo as unsigned) + cast(booleanthree as unsigned) = 1
Or, use tuples:
where ( (booleanone, booleantwo, booleanthree) ) in ( (0b1, 0b0, 0b0), (0b0, 0b1, 0b0), (0b0, 0b0, 0b1) )
I'm not sure what you mean by "set-based".
If your number of booleans can vary over time and you don't want to update your code, I suggest you make them lines and not columns.
For example:
CREATE TABLE `Suppression` (
`SuppressionId` int(11) NOT NULL AUTO_INCREMENT,
`Address` varchar(255) DEFAULT NULL,
`BooleanId` int(11) NOT NULL,
`BooleanValue` bit(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`SuppressionId`,`BooleanId`),
)
So with 1 query and a 'group by' you can check all values of your booleans, however numerous they are. Of course, this makes your tables bigger.
EDIT: Just came out with another idea: why don't you have a checksum column added, whose value would be the sum of all your bits? So you would update it at every write into your table, and just check this one in your select
If you
must use this denormalized way of representing these flags, and you
must be able to add new flag columns to your table in production, and you
cannot rewrite your queries by hand when you add columns,
then you must figure out how to write a program to write your queries.
You can use this query to retrieve a result set of boolean-valued columns, then you can use that result set in a program to write a query involving all those columns.
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
AND TABLE_NAME = 'Suppression'
AND COLUMN_NAME LIKE 'Boolean%'
AND DATA_TYPE = 'bit'
AND NUMERIC_PRECISION=1
The approach you have proposed here will work exponentially more poorly as you add columns, unfortunately. Any time a software engineer says "exponential" it's time to run away screaming. Seriously.
A much more scalable approach is to build a one-to-many relationship between your Suppression rows and your flags. Add this table.
CREATE TABLE SuppressionFlags (
SuppressionId int(11) NOT NULL,
FlagName varchar(31) NOT NULL,
Value bit(1) NOT NULL DEFAULT '0',
PRIMARY KEY (SuppressionID, FlagName)
)
Then, when you want to insert a row with some flag variables, do this sequence of queries.
INSERT INTO Suppression (Address) VALUES ('some address');
SET #SuppressionId := LAST_INSERT_ID();
INSERT INTO SuppressionFlags (SuppressionId, FlagName, Value)
VALUES (#SuppressionId, 'BooleanOne', 1);
INSERT INTO SuppressionFlags (SuppressionId, FlagName, Value)
VALUES (#SuppressionId, 'BooleanTwo', 0);
INSERT INTO SuppressionFlags (SuppressionId, FlagName, Value)
VALUES (#SuppressionId, 'BooleanThree', 0);
This gives you one Suppression row with three flags set in the SuppressionFlags table. Note the use of #SuppressionId to set the Id values in the second table.
Then to find all rows with just one flag set, do this.
SELECT Suppression.SuppressionId, Suppression.Address
FROM Suppression
JOIN SuppressionFlags ON Suppression.SuppressionId = SuppressionFlags.SuppressionId
GROUP BY Suppression.SuppressionId, Suppression.Address
HAVING SUM(SuppressionFlags.Value) = 1
It gets a little trickier if you want more elaborate combinations. For example, if you want all rows with BooleanOne and either BooleanTwo or BooleanThree set, you need to do something like this.
SELECT S.SuppressionId, S.Address
FROM Suppression S
JOIN SuppressionFlags A ON S.SuppressionId=A.SuppressionId AND A.FlagName='BooleanOne'
JOIN SuppressionFlags B ON S.SuppressionId=B.SuppressionId AND B.FlagName='BooleanTwo'
JOIN SuppressionFlags C ON S.SuppressionId=C.SuppressionId AND C.FlagName='BooleanThree'
WHERE A.Value = 1 AND (B.Value = 1 OR C.Value = 1)
This common database pattern is called the attribute / value pattern. Because SQL doesn't easily let you use variables for column names (it doesn't really have reflection) this kind of way of naming your attributes is your best path to extensibility.
It's a little more SQL. But you can add as many new flags as you need, in production, without rewriting queries or getting a combinatorial explosion of flag-matching. And SQL is built to handle this kind of query.
I've got this table
CREATE TABLE `subevents` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(150) DEFAULT NULL,
`content` text,
`class` tinyint(4) NOT NULL DEFAULT '1',
PRIMARY KEY (`id`)
) ENGINE=MyISAM
Each row can have a different value in the 'class' field.
I'd like to select any number of rows, ordered randomly, as long as the sum of the values in the 'class' field is equal to 100.
How could I accomplish it directly in the MySQL query without doing it later in PHP?
Thanks everybody!
By "ordered randomly" I assume you mean that the order of the rows doesn't matter but no row can be used more than once. So you are looking for a combination of rows in which the sum of class equals 100. Use the brute force method. Randomly generate possible solutions until you find one that works.
delimiter //
CREATE PROCEDURE subsetsum(total)
BEGIN
DECLARE sum INTEGER;
REPEAT
CREATE OR REPLACE VIEW `solution`
AS SELECT * FROM `subevents`
WHERE 0.5 <= RAND();
SELECT SUM(`class`) INTO sum FROM `solution`;
UNTIL sum = total END REPEAT;
END
//
delimiter ;
CALL subsetsum(100); /* For example */
SELECT * FROM `solution`;
I have tested this with tables having a TINYINT column of random values and it is actually reasonably fast. The only problem is that there is no guarantee that subsetsum() will ever return.
I don't think this is possible with only SQL...the only thing which comes to my mind is to redo a the sql query as long the sum isn't 100
But I have no clue how to select a random number of rows at once.