This question already has answers here:
Is storing a delimited list in a database column really that bad?
(10 answers)
Closed 2 years ago.
genreTable:
id genre
1 Pop
2 Rock
3 Electro
songTable:
id name genre
1 Song1 1
2 Song2 1,2
3 Song3 2,3
Problem: Let's say I want to build a query like:
SELECT * FROM songTable WHERE genre = '1'
It only returns Song1.
How do I make sure it returns both Song1 and Song2?
Any other suggestions for restructuring the table are also welcome.
You should fix your data model! There are many reasons why your data model is broken:
Columns should only contain one value.
Numbers should be stored as numbers, not strings.
Foreign key relationships should be properly declared.
SQL has pretty bad string processing capabilities.
Sometimes, you are stuck with other people's really, really, really bad design decisions. In that case, you can use find_in_set():
select s.*
from songTable s
where find_in_set('1', genre) > 0
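If you can fix the model, a junction table with one row per (song, genre) pair turns the lookup into a plain join. The sketch below is only an illustration: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the table names song, genre and song_genre are assumptions, not names from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here. The table and column names
# (song, genre, song_genre) are illustrative, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE genre (id INTEGER PRIMARY KEY, genre TEXT NOT NULL);
    CREATE TABLE song  (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    -- One row per (song, genre) pair replaces the comma-separated column.
    CREATE TABLE song_genre (
        song_id  INTEGER NOT NULL REFERENCES song(id),
        genre_id INTEGER NOT NULL REFERENCES genre(id),
        PRIMARY KEY (song_id, genre_id)
    );
    INSERT INTO genre VALUES (1, 'Pop'), (2, 'Rock'), (3, 'Electro');
    INSERT INTO song  VALUES (1, 'Song1'), (2, 'Song2'), (3, 'Song3');
    INSERT INTO song_genre VALUES (1, 1), (2, 1), (2, 2), (3, 2), (3, 3);
""")


def songs_in_genre(genre_id):
    """Return the names of all songs tagged with the given genre id."""
    rows = conn.execute(
        """
        SELECT s.name
        FROM song AS s
        JOIN song_genre AS sg ON sg.song_id = s.id
        WHERE sg.genre_id = ?
        ORDER BY s.id
        """,
        (genre_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(songs_in_genre(1))  # ['Song1', 'Song2']
```

The genre id is now an indexed integer comparison instead of a string search, and the foreign keys are declared.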
This question already has answers here:
Is storing a delimited list in a database column really that bad?
(10 answers)
Closed 4 years ago.
What I have with me?
A comma-separated string:
$str_states = "1,2";
I have a table with the following:
id event_name states
1 ABC 1,4,5
2 PQR 1,2,3
3 XYZ 3,4,5
What I want:
id event_name states
1 ABC 1
2 PQR 1,2
What I tried with a query:
SELECT id,event_name FROM events_table WHERE
FIND_IN_SET('1,2', states);
SELECT id,event_name FROM events_table WHERE
states IN ('51,58');
You can't really use FIND_IN_SET the way you want here. FIND_IN_SET searches for a single value against a CSV string. So, you would have to use OR here:
SELECT id, event_name
FROM events_table
WHERE FIND_IN_SET('1', states) > 0 OR FIND_IN_SET('2', states) > 0;
If your starting point for the input is 1,2, then you will have to tease apart the individual values in your app layer.
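A rough sketch of that app-layer step, in Python rather than the PHP the question hints at. Everything here is an assumption for illustration: sqlite3 stands in for MySQL, FIND_IN_SET is emulated with a small user-defined function because SQLite does not ship one, and events_matching is a made-up helper name. With a real MySQL driver the placeholder would typically be %s instead of ?.

```python
import sqlite3


def find_in_set(needle, csv):
    """Rough stand-in for MySQL's FIND_IN_SET: 1-based position, or 0."""
    if needle is None or csv is None:
        return None
    parts = csv.split(",")
    return parts.index(needle) + 1 if needle in parts else 0


conn = sqlite3.connect(":memory:")
# SQLite has no FIND_IN_SET, so register the emulation above under that name.
conn.create_function("FIND_IN_SET", 2, find_in_set)
conn.execute("CREATE TABLE events_table (id INTEGER, event_name TEXT, states TEXT)")
conn.executemany(
    "INSERT INTO events_table VALUES (?, ?, ?)",
    [(1, "ABC", "1,4,5"), (2, "PQR", "1,2,3"), (3, "XYZ", "3,4,5")],
)


def events_matching(str_states):
    """Return (id, event_name) rows whose states contain any requested value."""
    # Split the incoming "1,2" into separate values in the app layer.
    values = [v.strip() for v in str_states.split(",") if v.strip()]
    if not values:
        return []
    # One FIND_IN_SET test per value, joined with OR. The values are bound
    # as parameters rather than concatenated into the SQL text.
    where = " OR ".join("FIND_IN_SET(?, states) > 0" for _ in values)
    sql = f"SELECT id, event_name FROM events_table WHERE {where} ORDER BY id"
    return conn.execute(sql, values).fetchall()


print(events_matching("1,2"))  # [(1, 'ABC'), (2, 'PQR')]
```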
By the way, storing unnormalized CSV data in your SQL tables is bad practice. You would be better off fixing your design than using my answer.
As a bonus, here is a more terse way to write your query using REGEXP:
SELECT id, event_name
FROM events_table
WHERE states REGEXP '[[:<:]](1|2)[[:>:]]';
But again, please fix your data model.
I have two tables, one user table and an items table. In the user table, there is the field "items". The "items" table only consists of a unique id and an item_name.
Now each user can have multiple items. I wanted to avoid creating a third table that would connect the items with the user but rather have a field in the user_table that stores the item ids connected to the user in a "csv" field.
So any given user would have a field "items" that could have a value like "32,3,98,56".
It may be worth mentioning that the maximum number of items per user is rather limited (<5).
The question: Is this approach generally a bad idea compared to having a third table that contains user->item pairs?
Wouldn't a third table create quite an overhead when I want to find all items of a user? (I would have to iterate through all rows returned by MySQL individually.)
You don't want to store the value in the comma separated form.
Consider the case when you decide to join this column with some other table.
Consider you have,
x items
1 1, 2, 3
1 1, 4
2 1
and you want to find the distinct values for each x, i.e.:
x items
1 1, 2, 3, 4
2 1
or maybe you want to check whether it contains 3,
or maybe you want to convert them into separate rows:
x items
1 1
1 2
1 3
1 1
1 4
2 1
It will be a HUGE PAIN.
At least follow first normal form: have a separate row for each value.
Now, say you originally had this as your table:
x item
1 1
1 2
1 3
1 1
1 4
2 1
You can easily convert it into csv values:
select x, group_concat(item order by item) items
from t
group by x
Want to check whether x = 1 has item 3? Easy:
select * from t where x = 1 and item = 3
In the earlier design this would need the horrible find_in_set:
select * from t where x = 1 and find_in_set(3, items);
If you think you can use LIKE to search CSV values: first, LIKE '%x%' can't use indexes. Second, it will produce wrong results.
Say you want to check whether item ab is present and you search for %ab%: it will also return rows containing abc, abcd, abcde, and so on.
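The false positives are easy to reproduce. This is a small sketch with made-up sample strings, run against sqlite3 as a stand-in for MySQL (SQLite concatenates with ||, where MySQL would use CONCAT). Wrapping both sides in commas fixes the wrong matches but still cannot use an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER, items TEXT)")
# Sample values invented for the demo: only row 1 really contains item "ab".
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "ab,cd"), (2, "abc,xy"), (3, "zabcd")],
)

# Naive substring search: matches every row that merely contains "ab".
naive = conn.execute(
    "SELECT x FROM t WHERE items LIKE '%ab%' ORDER BY x"
).fetchall()

# Delimiter-wrapped search: only whole list elements match.
wrapped = conn.execute(
    "SELECT x FROM t WHERE ',' || items || ',' LIKE '%,ab,%' ORDER BY x"
).fetchall()

print(naive)    # [(1,), (2,), (3,)]
print(wrapped)  # [(1,)]
```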
If you have many users and items, then I'd suggest creating a separate table users with a PK userid, another table items with a PK itemid, and lastly a mapping table user_item with userid and itemid columns.
If you know you'll only ever store and retrieve these values and never perform any operation on them such as join, search, distinct, or conversion to separate rows, then maybe, just maybe, you can (I still wouldn't).
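To make the three-table layout concrete, here is a minimal sketch using sqlite3 in place of MySQL. The sample names and item labels are invented, and items_of, users_with and csv_of are hypothetical helpers. Note that fetching all items of a user is still a single query, so the mapping table does not force per-item round trips.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE items (itemid INTEGER PRIMARY KEY, item_name TEXT);
    -- Mapping table: one row per (user, item) pair.
    CREATE TABLE user_item (
        userid INTEGER NOT NULL REFERENCES users(userid),
        itemid INTEGER NOT NULL REFERENCES items(itemid),
        PRIMARY KEY (userid, itemid)
    );
    -- Sample data invented for the demo.
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO items VALUES (3, 'rope'), (32, 'lamp'), (56, 'map'), (98, 'key');
    INSERT INTO user_item VALUES (1, 32), (1, 3), (1, 98), (1, 56), (2, 3);
""")


def items_of(userid):
    """All items of one user, fetched with a single join."""
    return conn.execute(
        """
        SELECT i.itemid, i.item_name
        FROM user_item AS ui
        JOIN items AS i ON i.itemid = ui.itemid
        WHERE ui.userid = ?
        ORDER BY i.itemid
        """,
        (userid,),
    ).fetchall()


def users_with(itemid):
    """The reverse lookup that a CSV column makes painful."""
    rows = conn.execute(
        "SELECT userid FROM user_item WHERE itemid = ? ORDER BY userid",
        (itemid,),
    ).fetchall()
    return [userid for (userid,) in rows]


def csv_of(userid):
    """Rebuild the comma-separated form on demand, if a caller still wants it."""
    return ",".join(str(itemid) for itemid, _ in items_of(userid))


print(items_of(1))    # [(3, 'rope'), (32, 'lamp'), (56, 'map'), (98, 'key')]
print(users_with(3))  # [1, 2]
print(csv_of(1))      # 3,32,56,98
```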
Storing complex data directly in a relational database is a nonstandard use of a relational database. Normally they are designed for normalized data.
There are extensions which vary according to the brand of software which may help. Or you can normalize your CSV file into properly designed table(s). It depends on lots of things. Talk to your enterprise data architect in this case.
Whether it's a bad idea depends on your business needs. I can't assess your business needs from way out here on the internet. Talk to your product manager in this case.
This question already has answers here:
MySQL query finding values in a comma separated string
(11 answers)
Closed 7 years ago.
I have records in user table in following manner
id name address keywords
1 thompsan paris 10,20,30
2 samson paris 10,20,30
3 Nilawa paris 10,20,30
4 Nalama paris 100,30,50
5 Nalama paris 100,300,20
I need to get the users who have the keywords of 10 or 20. I have written this query:
SELECT * from User where keywords REGEXP '[[:<:]]10|20[[:>:]]'
It does not give me the expected output. It should match the keywords 10 or 20 and return records 1, 2, 3 and 5; record 4 should not match.
Why is it not working? Is there a better way to do this?
Try this,
SELECT *
FROM user
WHERE FIND_IN_SET('10', keywords) > 0 OR
FIND_IN_SET('20', keywords) > 0
FIND_IN_SET is a built-in MySQL function.
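As for why the REGEXP in the question matches record 4: alternation binds more loosely than the boundary markers, so the pattern reads as "10 at a word start" OR "20 at a word end", and 100 starts with 10. Grouping the alternatives, as in '[[:<:]](10|20)[[:>:]]', should fix it. The sketch below shows the same precedence effect with Python's re module, using \b as an approximate equivalent of the MySQL boundary markers; it illustrates the regex logic and is not a MySQL test.

```python
import re

# Keyword strings from rows 1, 4 and 5 of the question's table.
keywords = {1: "10,20,30", 4: "100,30,50", 5: "100,300,20"}

# Ungrouped: parsed as (\b10) | (20\b), so "100" matches via "\b10".
ungrouped = re.compile(r"\b10|20\b")
# Grouped: both boundaries apply to each alternative.
grouped = re.compile(r"\b(?:10|20)\b")


def matching_ids(pattern):
    """Return the row ids whose keyword string the pattern matches."""
    return [row_id for row_id, value in keywords.items() if pattern.search(value)]


print(matching_ids(ungrouped))  # [1, 4, 5]
print(matching_ids(grouped))    # [1, 5]
```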
Redesign your database so that it's actually in 1NF. Then you won't have to deal with these headaches, not to mention the horrible performance and bugs that this design is bound to bring you down the line.
Since I know that you won't do that, though: there's no need to use REGEXP at all, assuming that the string in keywords is actually consistent (and if it's not, then you're screwed anyway). Just use LIKE:
SELECT -- We'll list out the columns, since we NEVER use SELECT *
id,
name,
address,
keywords
FROM
User
WHERE
CONCAT(',', keywords, ',') LIKE '%,10,%' OR
CONCAT(',', keywords, ',') LIKE '%,20,%'
Let's say we have a table called Workorders and another table called Parts. I would like to have a column in Workorders called parts_required. This column would contain a single item that tells me what parts were required for that workorder. Ideally, this would contain the quantities as well, but a second column could contain the quantity information if needed.
Workorders looks like
WorkorderID date parts_required
1 2/24 ?
2 2/25 ?
3 3/16 ?
4 4/20 ?
5 5/13 ?
6 5/14 ?
7 7/8 ?
Parts looks like
PartID name cost
1 engine 100
2 belt 5
3 big bolt 1
4 little bolt 0.5
5 quart oil 8
6 Band-aid 0.1
Idea 1: create a string like '1-1:2-3:4-5:5-4'. My application would parse this string and show that I need --> 1 engine, 3 belts, 5 little bolts, and 4 quarts of oil.
Pros - simple enough to create and understand.
Cons - will make deep introspection into our data much more difficult. (costs over time, etc)
Idea 2: use a binary number. For example, the list above (engine, belt, little bolts, oil) stored in an 8-bit integer would be 54, because 54 in binary is 110110.
Pros - datatype is optimal concerning size. Also, I am guessing there are math tricks I could use in my queries to search for parts used (I don't know what those are; correct me if I'm in the clouds here).
Cons - I do not know how to handle quantity using this method. Also, even a 64-bit BIGINT only gives me 64 parts that can be in my table. I expect many hundreds.
Any ideas? I am using MySQL. I may be able to use PostgreSQL, and I understand that they have more flexible datatypes like JSON and arrays, but I am not familiar with how querying those would perform. Also it would be much easier to stay with MySQL
Why not create a Relationship table?
You can create a table named Workorders_Parts with the following content:
|workorderId, partId|
So when you want to get all parts from a specific workorder you just type:
select p.name
from parts p inner join workorders_parts wp on wp.partId = p.partId
where wp.workorderId = x;
What the query says is:
Give me the names of the parts that belong to workorderId = x and are listed in the table workorders_parts.
Remember that an INNER JOIN works like an intersection: the data I'm looking for (generally the id) must exist in both tables.
It will give you all part names that are used to build workorder x.
Let's say we have workorderId = 1 with partId = 1, 2, 3; it will be represented in our relationship table as:
workorderId | partId
1 | 1
1 | 2
1 | 3
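The question also asked about quantities. One way to handle them, sketched here as an assumption and not part of the answer above, is an extra quantity column on the relationship table. The example uses sqlite3 as a stand-in for MySQL and loads the parts list from Idea 1 (1 engine, 3 belts, 5 little bolts, 4 quarts of oil); parts_for and total_cost are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (partId INTEGER PRIMARY KEY, name TEXT, cost REAL);
    CREATE TABLE workorders (workorderId INTEGER PRIMARY KEY, date TEXT);
    CREATE TABLE workorders_parts (
        workorderId INTEGER NOT NULL REFERENCES workorders(workorderId),
        partId      INTEGER NOT NULL REFERENCES parts(partId),
        quantity    INTEGER NOT NULL DEFAULT 1,  -- assumed extra column
        PRIMARY KEY (workorderId, partId)
    );
    INSERT INTO parts VALUES
        (1, 'engine', 100), (2, 'belt', 5), (3, 'big bolt', 1),
        (4, 'little bolt', 0.5), (5, 'quart oil', 8), (6, 'Band-aid', 0.1);
    INSERT INTO workorders VALUES (1, '2/24');
    -- The list from Idea 1: '1-1:2-3:4-5:5-4'
    INSERT INTO workorders_parts VALUES (1, 1, 1), (1, 2, 3), (1, 4, 5), (1, 5, 4);
""")


def parts_for(workorder_id):
    """Part names and quantities required by one workorder."""
    return conn.execute(
        """
        SELECT p.name, wp.quantity
        FROM parts AS p
        INNER JOIN workorders_parts AS wp ON wp.partId = p.partId
        WHERE wp.workorderId = ?
        ORDER BY p.partId
        """,
        (workorder_id,),
    ).fetchall()


def total_cost(workorder_id):
    """Sum of cost * quantity over the workorder's parts (None if it has none)."""
    row = conn.execute(
        """
        SELECT SUM(p.cost * wp.quantity)
        FROM parts AS p
        INNER JOIN workorders_parts AS wp ON wp.partId = p.partId
        WHERE wp.workorderId = ?
        """,
        (workorder_id,),
    ).fetchone()
    return row[0]


print(parts_for(1))   # [('engine', 1), ('belt', 3), ('little bolt', 5), ('quart oil', 4)]
print(total_cost(1))  # 149.5
```

Cost-over-time reports then become ordinary GROUP BY queries, and there is no 64-part ceiling as with the bitmask idea.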
This question already has answers here:
MySQL: Select Random Entry, but Weight Towards Certain Entries
(11 answers)
How to select one row randomly taking into account a weight?
(7 answers)
Closed 8 years ago.
I'm working on a query where users enter in points to a contest. They can enter in as many points as they have. I need to choose a winner at random but people with more points entered should technically have a better chance at getting picked.
I currently run the query using rand and a sum of total points per user.
The Table data looks like this:
fname lname user_id points
John baker 1 300
Robert backster 2 40
jason Doe 3 900
If I were to run the query multiple times, John would have a better chance than Robert, but Jason would have a better chance than both John and Robert.
select * from table order by rand()
Doing it all within a query would be tough, I personally would rely a little bit on your backend. But if you have to do it all within MySQL, I would look for a weighted random selection.
From this SO post, I found this blog post that discusses it.
Basically, you are going to use some combination of the RAND() function and logarithms to achieve your result. From what I can tell, you may need to know the totals/averages to get your multiplier but that should be doable via a query as well.
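One common form of the RAND()-plus-logarithm idea gives each row the key -log(u) / points, with u drawn uniformly from (0, 1], and picks the row with the smallest key; each row then wins with probability proportional to its points. This variant needs no totals or averages. The sketch below does it on the backend in Python with the sample rows from the question. The function name pick_winner is made up, and this is not necessarily the exact method from the linked blog post. An untested MySQL equivalent would presumably be ORDER BY -LOG(1 - RAND()) / points LIMIT 1.

```python
import math
import random

# (name, points) rows from the question's sample table.
entries = [("John", 300), ("Robert", 40), ("jason", 900)]


def pick_winner(entries, rng=random):
    """Pick one name with probability proportional to its points."""
    best_name, best_key = None, math.inf
    for name, points in entries:
        if points <= 0:
            continue  # entries without points cannot win
        u = 1.0 - rng.random()        # uniform in (0, 1], avoids log(0)
        key = -math.log(u) / points   # exponential variate with rate = points
        if key < best_key:
            best_name, best_key = name, key
    return best_name


# Draw many winners with a seeded generator to see the proportions.
rng = random.Random(2024)
draws = 20000
counts = {name: 0 for name, _ in entries}
for _ in range(draws):
    counts[pick_winner(entries, rng)] += 1
print({name: round(n / draws, 3) for name, n in counts.items()})
```

With 300, 40 and 900 points the expected shares are about 24%, 3% and 73%, so the printed frequencies should land close to those values.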