database schema one column entry references many rows from another table - mysql

Let's say we have a table called Workorders and another table called Parts. I would like to have a column in Workorders called parts_required. This column would contain a single item that tells me what parts were required for that workorder. Ideally, this would contain the quantities as well, but a second column could contain the quantity information if needed.
Workorders looks like
WorkorderID date parts_required
1 2/24 ?
2 2/25 ?
3 3/16 ?
4 4/20 ?
5 5/13 ?
6 5/14 ?
7 7/8 ?
Parts looks like
PartID name cost
1 engine 100
2 belt 5
3 big bolt 1
4 little bolt 0.5
5 quart oil 8
6 Band-aid 0.1
Idea 1: create a string like '1-1:2-3:4-5:5-4'. My application would parse this string and show that I need --> 1 engine, 3 belts, 5 little bolts, and 4 quarts of oil.
Pros - simple enough to create and understand.
Cons - will make deep introspection into our data much more difficult. (costs over time, etc)
Idea 2: use a binary number. For example, to reference the above list (engine, belt, little bolts, oil) using an 8-bit integer would be 54, because 54 in binary representation is 110110.
Pros - datatype is optimal concerning size. Also, I am guessing there are tricky math tricks I could use in my queries to search for parts used (don't know what those are, correct me if I'm in the clouds here).
Cons - I do not know how to handle quantity using this method. Also, Even with a 64-bit BIGINT still only gives me 64 parts that can be in my table. I expect many hundreds.
Any ideas? I am using MySQL. I may be able to use PostgreSQL, and I understand that they have more flexible datatypes like JSON and arrays, but I am not familiar with how querying those would perform. Also it would be much easier to stay with MySQL

Why not create a Relationship table?
You can create a table named Workorders_Parts with the following content:
|workorderId, partId|
So when you want to get all parts from a specific workorder you just type:
select p.name
from parts p inner join workorders_parts wp on wp.partId = p.partId
where wp.workorderId = x;
what the query says is:
Give me the name of parts that belongs to workorderId=x and are listed in table workorders_parts
Remembering that INNER JOIN means "INTERSECTION" in other words: data i'm looking for should exist (generally the id) in both tables
IT will give you all part names that are used to build workorder x.
Lets say we have workorderId = 1 with partID = 1,2,3, it will be represented in our relationship table as:
workorderId | partId
1 | 1
1 | 2
1 | 3

Related

Storing csv in MySQL field – bad idea?

I have two tables, one user table and an items table. In the user table, there is the field "items". The "items" table only consists of a unique id and an item_name.
Now each user can have multiple items. I wanted to avoid creating a third table that would connect the items with the user but rather have a field in the user_table that stores the item ids connected to the user in a "csv" field.
So any given user would have a field "items" that could have a value like "32,3,98,56".
It maybe is worth mentioning that the maximum number of items per user is rather limited (<5).
The question: Is this approach generally a bad idea compared to having a third table that contains user->item pairs?
Wouldn't a third table create quite an overhead when you want to find all items of a user (I would have to iterate through all elements returned by MySQL individually).
You don't want to store the value in the comma separated form.
Consider the case when you decide to join this column with some other table.
Consider you have,
x items
1 1, 2, 3
1 1, 4
2 1
and you want to find distinct values for each x i.e.:
x items
1 1, 2, 3, 4
2 1
or may be want to check if it has 3 in it
or may be want to convert them into separate rows:
x items
1 1
1 2
1 3
1 1
1 4
2 1
It will be a HUGE PAIN.
Use atleast normalization 1st principle - have separate row for each value.
Now, say originally you had this as you table:
x item
1 1
1 2
1 3
1 1
1 4
2 1
You can easily convert it into csv values:
select x, group_concat(item order by item) items
from t
group by x
If you want to search if x = 1 has item 3. Easy.
select * from t where x = 1 and item = 3
which in earlier case would use horrible find_in_set:
select * from t where x = 1 and find_in_set(3, items);
If you think you can use like with CSV values to search, then first like %x% can't use indexes. Second, it will produce wrong results.
Say you want check if item ab is present and you do %ab% it will return rows with abc abcd abcde .... .
If you have many users and items, then I'd suggest create separate table users with an PK userid, another items with PK itemid and lastly a mapping table user_item having userid, itemid columns.
If you know you'll just need to store and retrieve these values and not do any operation on it such as join, search, distinct, conversion to separate rows etc. etc. - may be just may be, you can (I still wouldn't).
Storing complex data directly in a relational database is a nonstandard use of a relational database. Normally they are designed for normalized data.
There are extensions which vary according to the brand of software which may help. Or you can normalize your CSV file into properly designed table(s). It depends on lots of things. Talk to your enterprise data architect in this case.
Whether it's a bad idea depends on your business needs. I can't assess your business needs from way out here on the internet. Talk to your product manager in this case.

How to request lists that contain certain items in MySQL

In the application I am developing, the user has to set parameters to define the end product he will get.
My tables look like this :
Categories
-------------
Id Name
1 Material
2 Color
3 Shape
Parameters
-------------
Id CategoryId Name
1 1 Wood
2 1 Plastic
3 1 Metal
4 2 Red
5 2 Green
6 2 Blue
7 3 Round
8 3 Square
9 3 Triangle
Combinations
-------------
Id
1
2
...
ParametersCombinations
----------------------
CombinationId ParameterId
1 1
1 4
1 7
2 1
2 5
2 7
Now only some combinations of parameters are available to the user. In my example, he could get a red round wooden thingy or a green round wooden thingy but not a blue one because I can't produce it.
Let's say the user selected wood and round parameters. How do I make a request to know that there's only red and green available so I can disable the blue option for him ?
Or is there some better way to model my database ?
Let us assume you provide the selected parameters id in the following format
// I call this a **parameterList** for convenience sake.
(1,7) // this is parameter id 1 and id 7.
I am also assuming you are using some scripting language to help you with your app. Like ruby or php.
I am also assuming you want to avoid putting as much logic into your stored procedure or MySQL queries as much as possible.
Another assumption is that you are using one of the Rapid Application MVC Frameworks like Rails, Symfony or CakePHP.
Your logic would be:
Find all the combinations that contain ALL the parameters in your parameterList and put these found combinations in a list called relevantCombinations
Find all the parameters_combinations that contain at least 1 of the combinations in the list relevantCombinations. Retrieve only the unique parameter values.
First two steps can be solved using simple Model::find methods and a forloop in the frameworks I described above.
If you are not using frameworks, it is also cool to use the scripting language raw.
If you require them in MySQL queries, here are some possible queries. Be aware that these are not necessary the best queries.
First one is
SELECT * FROM (
SELECT `PossibleList`.`CombinationId`, COUNT(`PossibleList`.`CombinationId`) as number
FROM (
SELECT `CombinationId` FROM `ParametersCombinations`
WHERE `ParameterId` IN (1, 7)
) `PossibleList` GROUP BY `PossibleList`.`CombinationId`
) `PossibleGroupedList` WHERE `number` = 2;
-- note that the (1, 7) and the number 2 needs to be supplied by your app.
-- 2 refers to the number of parameters supplied.
-- In this case you supplied 1 and 7 therefore 2.
To confirm, look at http://sqlfiddle.com/#!2/16831/3.
Note how I purposely have a Combination 3 which only has the Parameter 1 but not 7. Therefore the query did not give you back 3, but only 1 and 2. Feel free to tweak the asterisk * in the first line.
Second one is
SELECT DISTINCT(`ParameterID`)
FROM `ParametersCombinations`
WHERE `CombinationId` IN (1, 2);
-- note that (1, 2) is the result we expect from the first step.
-- the one we call relevantCombinations
To confirm, look at http://sqlfiddle.com/#!2/16831/5
I do not recommend being a masochist and attempt to get your answer in a single query.
I also do NOT recommend using the MySQL queries I have supplied. It is less masochistic. But sufficiently masochistic for me NOT to recommend this way.
Since you did not indicate any tag other than mysql, I suspect that you are stronger with mysql. Hence my answer contains mysql.
My strongest suggestion would be my first. Make full use of established frameworks and put your logic in the business logic layer. Not in the data layer. Even if you don't use frameworks and just use raw php and ruby, that is still a better place for you to place your logic in than MySQL.
I saw that T gave an answer in a single MySQL query but I can tell you that (s)he considers only 1 parameter.
See this part:
WHERE ParameterId = 7 -- 7 is the selected parameter
You can adapt his/her answer with some trickery using a forloop and appending OR clauses.
Again, I do NOT recommend that in the big picture of building an app.
I have also tested his/her answer with http://sqlfiddle.com/#!2/2eda4/2. There may be 1 or 2 small bugs.
In summary, my recommendations in descending order of strength:
Use a framework like Rails or CakePHP and the pseudocode step 1 and 2 and as many find as you need. (STRONGEST)
Use raw scripting language and the pseudocode step 1 and 2 and as many simple queries as you need.
Use the raw MySQL queries I created. (LEAST STRONG)
P.S. I left out the part in my queries as to how to get the name of the Parameters. But given that you can get the ParameterIDs from my answer, I think that is trivial. I have also left out how you may need to remove the already selected parameters (1, 7). Again, that should be trivial to you.
Try the following
SELECT p.*, pc.CombinationId
FROM Parameters p
-- get the parameter combinations for all the parameters
JOIN ParametersCombinations pc
ON pc.ParameterId = p.Id
-- filter the parameter combinations to only combinations that include the selected parameter
JOIN (
SELECT CombinationId
FROM ParametersCombinations
WHERE ParameterId = 7 -- 7 is the selected parameter
) f ON f.CombinationId = pc.CombinationId
Or removing the already selected parameters
SELECT p.*, pc.CombinationId
FROM Parameters p
JOIN ParametersCombinations pc
ON pc.ParameterId = p.Id
JOIN (
SELECT CombinationId
FROM ParametersCombinations
WHERE ParameterId IN (7, 1)
) f ON f.CombinationId = pc.CombinationId
WHERE ParameterId NOT IN (7, 1)

mysql optimize data content: multi column or simple column hash data

I actually have a table with 30 columns. In one day this table can get around 3000 new records!
The columns datas look like :
IMG Name Phone etc..
http://www.site.com/images/image.jpg John Smith 123456789 etc..
http://www.site.com/images/image.jpg Smith John 987654321 etc..
I'm looking a way to optimize the size of the table but also the response time of the sql queries. I was thinking of doing something like :
Column1
http://www.site.com/images/image.jpg|John Smith|123456789|etc..
And then via php i would store each value into an array..
Would it be faster ?
Edit
So to take an example of the structure, let's say i have two tables :
package
package_content
Here is the structure of the table package :
id | user_id | package_name | date
Here is the structure of the table package_content :
id | package_id | content_name | content_description | content_price | content_color | etc.. > 30columns
The thing is for each package i can get up to 16rows of content. For example :
id | user_id | package_name | date
260 11 Package 260 2013-7-30 10:05:00
id | package_id | content_name | content_description | content_price | content_color | etc.. > 30columns
1 260 Content 1 Content 1 desc 58 white etc..
2 260 Content 2 Content 2 desc 75 black etc..
3 260 Content 3 Content 3 desc 32 blue etc..
etc...
Then with php i make like that
select * from package
while not EOF {
show package name, date etc..
select * from package_content where package_content.package_id = package.id and package.id = package_id
while not EOF{
show package_content name, desc, price, color etc...
}
}
Would it be faster? Definitely not. If you needed to search by Name or Phone or etc... you'd have to pull those values out of Column1 every time. You'd never be able to optimize those queries, ever.
If you want to make the table smaller it's best to look at splitting some columns off into another table. If you'd like to pursue that option, post the entire structure. But note that the number of columns doesn't affect speed that much. I mean it can, but it's way down on the list of things that will slow you down.
Finally, 3,000 rows per day is about 1 million rows per year. If the database is tolerably well designed, MySQL can handle this easily.
Addendum: partial table structures plus sample query and pseudocode added to question.
The pseudocode shows the package table being queried all at once, then matching package_content rows being queried one at a time. This is a very slow way to go about things; better to use a JOIN:
SELECT
package.id,
user_id,
package_name,
date,
package_content.*
FROM package
INNER JOIN package_content on package.id = package_content.id
WHERE whatever
ORDER BY whatever
That will speed things up right away.
If you're displaying on a web page, be sure to limit results with a WHERE clause - nobody will want to see 1,000 or 3,000 or 1,000,000 packages on a single web page :)
Finally, as I mentioned before, the number of columns isn't a huge worry for query optimization, but...
Having a really wide result row means more data has to go across the wire from MySQL to PHP, and
It isn't likely you'll be able to display 30+ columns of information on a web page without it looking terrible, especially if you're reading lots of rows.
With that in mind, you'll be better of picking specific package_content columns in your query instead of picking them all with a SELECT *.
Don't combine any columns, this is no use and might even be slower in the end.
You should use indexes on a column where you query at. I do have a website with about 30 columns where atm are around 600.000 results. If you use EXPLAIN before a query, you should see if it uses any indexes. If you got a JOIN with 2 values and a WHERE at the same table. You should make a combined index with the 3 columns, in order from JOIN -> WHERE. If you join on the same table, you should see this as a seperate index.
For example:
SELECT p.name, p.id, c.name, c2.name
FROM product p
JOIN category c ON p.cat_id=c.id
JOIN category c2 ON c.parent_id=c2.id AND name='Niels'
WHERE p.filterX='blaat'
You should have an combined index at category
parent_id,name
AND
id (probably the AI)
A index on product
cat_id
filterX
With this easy solution you can optimize queries from NOT DOABLE to 0.10 seconds, or even faster.
If you use MySQL 5.6 you should step over to INNODB because MySQL is better with optimizing JOINS and sub queries. Also MySQL will try to run them into MEMORY which will make it a lot faster aswel. Please keep in mind that backupping INNODB tables might need some extra attention.
You might also think about making MEMORY tables for super fast querieing (you do still need indexes).
You can also optimize by making integers size 4 (4 bytes, not 11 characters). And not always using VARCHAR 255.

Storing IDs as comma separated values

How can I select from a database all of the rows with an ID stored in a varchar comma separated. for example, I have a table with this:
, 7, 9, 11
How can I SELECT the rows with those IDs?
Normalize your database. You should be using a lookup table most likely.
You have 2 options:
Use a function to split the string into a temp table and then join the table your selecting from to that temp table.
Use dynamic SQL to query the table where id in (#variable) --- bad choice if you choose this way.
select * from table_name where id in (7, 9, 11)
If you do typically have that comma at the start, you will need to remove it first.
Use match(column) against('7,9,11')
this willl show all varchar column of your id's where 7,9,11 is there.
But you have to be shure that ur column have fulltext index.
Just yesterday I was fixing a bug in an old application here and saw where they handled it like this:
AND (T.ServiceIDs = '#SegmentID#' OR T.ServiceIDs LIKE '#SegmentID#,%'
OR T.ServiceIDs LIKE '%,#SegmentID#,%' OR T.ServiceIDs LIKE '%,#SegmentID#')
I am assuming you are saying something like the value of ServiceIDs from the database might contain 7,9,11 and that the variable SegmentID is one or more values. It was inside a CFIF statement checking to see that SegmentID in fact had a value(which was always the case due to prior logic that would default it.
I personally though would do as others have suggested and I'd create what I always refer to as a bridging table that allows you to have 0 to many PKs from one table related to the PK of another.
I had to tackle this problem years ago where I could not change the table structure and I created a custom table type and a set of functions so I could treat the values via SQL as if they were coming from a table. That custom table type solution though was specific to Oracle and I'd not know how to do that in MySQL without some research on my part.
There is a reason querying lists is so difficult: databases are not designed to work with delimited lists. They are optimized to work best with rows (or sets) of data. Creating the proper table structure will result in much better query performance and simpler sql queries. (So while it is technically possible, you should seriously consider normalizing your database as Todd and others suggested.)
Many-to-many relationships are best represented by three (3) tables. Say you are selling "widgets" in a variety of "sizes". Create two tables representing the main entities:
Widget (unique widgets)
WidgetID | WidgetTitle
1 | Widget 1
2 | Widget 2
....
Size (unique sizes)
SizeID | SizeTitle
7 | X-Small
8 | Small
9 | Medium
10 | Large
11 | X-Large
Then create a junction table, to store the relationships between those two entities, ie Which widgets are available in which sizes
WidgetSize (available sizes for each widget)
WidgetID | SizeID
1 | 7 <== Widget 1 "X-Small"
1 | 8 <== Widget 1 + "Small"
2 | 7 <== Widget 2 + "X-Small"
2 | 9 ....
2 | 10
2 | 11
....
With that structure, you can easily return all widgets having any (or all) of a list of sizes. Not tested, but something similar to the sql below should work.
Find widgets available in any of the sizes: <cfset listOfSizes = "7,9,11">
SELECT w.WidgetID, w.WidgetTitle
FROM Widget w
WHERE EXISTS
( SELECT 1
FROM WidgetSize ws
WHERE ws.WidgetID = w.WidgetID
AND ws.SizeID IN (
<cfqueryparam value="#listOfSizeIds#"
cfsqltype="cf_sql_integer" list="true" >
)
)
Find widgets available in all three sizes: <cfset listOfSizes = "7,9,11">
SELECT w.WidgetID, w.WidgetTitle, COUNT(*) AS MatchCount
FROM Widget w INNER JOIN WidgetSize ws ON ws.WidgetID = w.WidgetID
WHERE ws.SizeID IN (
<cfqueryparam value="#listOfSizeIds#"
cfsqltype="cf_sql_integer" list="true" >
)
GROUP BY w.WidgetID, w.WidgetTitle
HAVING MatchCount = 3

MySQL, how to repeat same line x times

I have a query that outputs address order data:
SELECT ordernumber
, article_description
, article_size_description
, concat(NumberPerBox,' pieces') as contents
, NumberOrdered
FROM customerorder
WHERE customerorder.id = 1;
I would like the above line to be outputted NumberOrders (e.g. 50,000) divided by NumberPerBox e.g. 2,000 = 25 times.
Is there a SQL query that can do this, I'm not against using temporary tables to join against if that's what it takes.
I checked out the previous questions, however the nearest one:
is to be posible in mysql repeat the same result
Only gave answers that give a fixed number of rows, and I need it to be dynamic depending on the value of (NumberOrdered div NumberPerBox).
The result I want is:
Boxnr Ordernr as_description contents NumberOrdered
------+--------------+----------------+-----------+---------------
1 | CORDO1245 | Carrying bags | 2,000 pcs | 50,000
2 | CORDO1245 | Carrying bags | 2,000 pcs | 50,000
....
25 | CORDO1245 | Carrying bags | 2,000 pcs | 50,000
First, let me say that I am more familiar with SQL Server so my answer has a bit of a bias.
Second, I did not test my code sample and it should probably be used as a reference point to start from.
It would appear to me that this situation is a prime candidate for a numbers table. Simply put, it is a table (usually called "Numbers") that is nothing more than a single PK column of integers from 1 to n. Once you've used a Numbers table and aware of how it's used, you'll start finding many uses for it - such as querying for time intervals, string splitting, etc.
That said, here is my untested response to your question:
SELECT
IV.number as Boxnr
,ordernumber
,article_description
,article_size_description
,concat(NumberPerBox,' pieces') as contents
,NumberOrdered
FROM
customerorder
INNER JOIN (
SELECT
Numbers.number
,customerorder.ordernumber
,customerorder.NumberPerBox
FROM
Numbers
INNER JOIN customerorder
ON Numbers.number BETWEEN 1 AND customerorder.NumberOrdered / customerorder.NumberPerBox
WHERE
customerorder.id = 1
) AS IV
ON customerorder.ordernumber = IV.ordernumber
As I said, most of my experience is in SQL Server. I reference http://www.sqlservercentral.com/articles/Advanced+Querying/2547/ (registration required). However, there appears to be quite a few resources available when I search for "SQL numbers table".