I have the following table structure:
Product (id, name, ...)
+-----+------------+
| id | name |
+-----+------------+
| 1 | Product #1 |
| 2 | Product #2 |
| 3 | Product #3 |
| 4 | Product #4 |
+-----+------------+
Attribute (id, title, ...)
+-----+------------+
| id | title |
+-----+------------+
| 1 | shape |
| 2 | colour |
| 3 | height |
| 4 | weight |
+-----+------------+
Option (id, title ... )
+-----+------------+
| id | title |
+-----+------------+
| 1 | round |
| 2 | square |
| 3 | oval |
| 4 | red |
| 5 | blue |
| 6 | green |
| 7 | tall |
| 8 | short |
| 9 | heavy |
| 10 | light |
+-----+------------+
and a fourth one (ProductAttribute - id, product_id, attribute_id, option_id), hoping to get "all the red round products which are also tall and heavy":
+-----+------------+-----------+--------+
| id | product | attribute | option |
+-----+------------+-----------+--------+
| 1 | Product #1 | shape | round |
| 2 | Product #2 | shape | oval |
| 3 | Product #3 | shape | round |
| 4 | Product #4 | shape | square |
| 5 | Product #1 | color | green |
| 6 | Product #2 | color | red |
| 7 | Product #3 | height | tall |
| 8 | Product #4 | height | short |
| 9 | Product #2 | weight | heavy |
| 10 | Product #1 | weight | light |
+-----+------------+-----------+--------+
I'm far from an SQL master, and maybe my idea can't work.
Edit:
Q1. The question is: how do I achieve that, for instance getting all the red, tall, heavy products?
The following queries don't achieve my purpose:
1:
SELECT ProductAttributes.product_id, ProductAttributes.id FROM ProductAttributes
WHERE (ProductAttributes.attribute_id = 1 AND ProductAttributes.option_id = 1)
AND (ProductAttributes.attribute_id = 3 AND ProductAttributes.option_id = 4);
2:
SELECT DISTINCT ProductAttributes.product_id, ProductAttributes.id FROM ProductAttributes
WHERE (ProductAttributes.attribute_id = 1 AND ProductAttributes.option_id = 1)
OR (ProductAttributes.attribute_id = 3 AND ProductAttributes.option_id = 4);
Note: I'm deliberately using only 2 conditions in my query; the real one has many more.
Key/value tables are a nuisance. So avoid them, if you can. You'd have these tables then:
table shapes
+--------+
| shape |
+--------+
| round |
| oval |
| square |
+--------+
table colors
+--------+
| color |
+--------+
| green |
| red |
+--------+
table heights
+--------+
| height |
+--------+
| tall |
| short |
+--------+
table weights
+--------+
| weight |
+--------+
| heavy |
| light |
+--------+
table products
+-------------+--------------+--------+--------+--------+--------+
| product_no | product name | shape | color | height | weight |
+-------------+--------------+--------+--------+--------+--------+
| 14214 | Product #1 | round | red | tall | heavy |
| 22312 | Product #2 | oval | | short | heavy |
| 35757 | Product #3 | square | green | tall | heavy |
| 42468 | Product #4 | | red | short | light |
+-------------+--------------+--------+--------+--------+--------+
The query
select *
from products
where shape = 'round'
and color = 'red'
and height = 'tall'
and weight = 'heavy';
You can do the same with IDs by the way. So all lookup tables would get an ID (round = 1, oval = 2, ... green = 1, red = 2, ...) and the product table would no longer contain the words, but the IDs. The query would then be:
select *
from products
where shape_id = (select id from shapes where shape = 'round')
and color_id = (select id from colors where color = 'red')
and height_id = (select id from heights where height = 'tall')
and weight_id = (select id from weights where weight = 'heavy');
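As a rough check of the ID-based variant, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here, and the column types and foreign keys are assumptions that the answer does not spell out. The rows mirror the products table shown above.

```python
import sqlite3

# In-memory SQLite database; the schema details are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shapes  (id INTEGER PRIMARY KEY, shape  TEXT UNIQUE);
CREATE TABLE colors  (id INTEGER PRIMARY KEY, color  TEXT UNIQUE);
CREATE TABLE heights (id INTEGER PRIMARY KEY, height TEXT UNIQUE);
CREATE TABLE weights (id INTEGER PRIMARY KEY, weight TEXT UNIQUE);
CREATE TABLE products (
    product_no   INTEGER PRIMARY KEY,
    product_name TEXT,
    shape_id  INTEGER REFERENCES shapes(id),
    color_id  INTEGER REFERENCES colors(id),
    height_id INTEGER REFERENCES heights(id),
    weight_id INTEGER REFERENCES weights(id)
);
INSERT INTO shapes  VALUES (1, 'round'), (2, 'oval'), (3, 'square');
INSERT INTO colors  VALUES (1, 'green'), (2, 'red');
INSERT INTO heights VALUES (1, 'tall'), (2, 'short');
INSERT INTO weights VALUES (1, 'heavy'), (2, 'light');
INSERT INTO products VALUES
    (14214, 'Product #1', 1, 2, 1, 1),
    (22312, 'Product #2', 2, NULL, 2, 1),
    (35757, 'Product #3', 3, 1, 1, 1),
    (42468, 'Product #4', NULL, 2, 2, 2);
""")

# Each scalar subquery translates a word into its lookup ID.
matches = conn.execute("""
    SELECT product_name FROM products
    WHERE shape_id  = (SELECT id FROM shapes  WHERE shape  = 'round')
      AND color_id  = (SELECT id FROM colors  WHERE color  = 'red')
      AND height_id = (SELECT id FROM heights WHERE height = 'tall')
      AND weight_id = (SELECT id FROM weights WHERE weight = 'heavy')
""").fetchall()
```

With this data only Product #1 satisfies all four conditions.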
So you want to select products based on the options in the ProductAttribute table.
A better way to store the data is to use unique ID/primary key values in the fourth table's columns; then you can do this:
SELECT * FROM ProductAttribute AS attr
INNER JOIN Product AS product ON product.id = attr.product_id
INNER JOIN Attribute AS attr2 ON attr2.id = attr.attribute_id
WHERE attr.option = 'round' OR attr.option = 'red'
I hope this helps you!
For the key/value approach I'd use composite keys to improve consistency:
attribute (attribute_no, title), PK = attribute_no
+--------------+------------+
| attribute_no | title |
+--------------+------------+
| 1 | shape |
| 2 | colour |
| ... | ... |
+--------------+------------+
attribute_option (attribute_no, option_no, value), PK = attribute_no, option_no
+--------------+-----------+------------+
| attribute_no | option_no | value |
+--------------+-----------+------------+
| 1 | 1 | round |
| 1 | 2 | square |
| 2 | 1 | green |
| 2 | 2 | red |
| ... | ... | ... |
+--------------+-----------+------------+
product (product_no, product_name, ...), PK = product_no
+------------+--------------+
| product_no | product_name |
+------------+--------------+
| 7352871 | Product #1 |
| 8956443 | Product #2 |
| ... | ... |
+------------+--------------+
product_attributes (product_no, attribute_no, option_no), PK = product_no, attribute_no
+------------+--------------+-----------+
| product_no | attribute_no | option_no |
+------------+--------------+-----------+
| 7352871 | 1 | 1 |
| 7352871 | 2 | 1 |
| 8956443 | 1 | 2 |
| 8956443 | 2 | 1 |
+------------+--------------+-----------+
(And you'd want an index on attribute_no + option_no for this table.)
The product_attributes primary key guarantees that each product gets only one value per attribute. This is good for height, weight, etc. If you want to allow multiple colours for a product, however, you need a product_attributes table that includes option_no in the primary key. You may end up with separate tables for single-valued and multi-valued attributes. Maybe later you will even want to introduce product groups with optional and obligatory attributes (a freezer has an energy class, a t-shirt doesn't). So this whole concept may grow, but the tables above should give you an idea of how best to approach this.
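To illustrate the one-value-per-attribute guarantee, here is a small sketch using Python's sqlite3 module (SQLite rather than MySQL, which is an assumption). The index name idx_attr_option is hypothetical. The second insert tries to give the same product a second colour and is expected to be rejected by the composite primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product_attributes (
        product_no   INTEGER NOT NULL,
        attribute_no INTEGER NOT NULL,
        option_no    INTEGER NOT NULL,
        PRIMARY KEY (product_no, attribute_no)
    )
""")
# Secondary index for lookups by attribute and option (name is made up).
conn.execute(
    "CREATE INDEX idx_attr_option ON product_attributes (attribute_no, option_no)"
)

# First colour for product 7352871: attribute 2 (colour), option 1 (green).
conn.execute("INSERT INTO product_attributes VALUES (7352871, 2, 1)")

try:
    # Second colour for the same product: violates the composite primary key.
    conn.execute("INSERT INTO product_attributes VALUES (7352871, 2, 2)")
    second_colour_accepted = True
except sqlite3.IntegrityError:
    second_colour_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM product_attributes").fetchone()[0]
```

Widening the key to (product_no, attribute_no, option_no) would let the second insert through, which is the multi-valued variant described above.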
A query for all the red round products which are also tall and heavy:
select *
from product
where product_no in
(
select product_no
from product_attributes
where (attribute_no, option_no) =
(
select ao.attribute_no, ao.option_no
from attribute_option ao
join attribute a on a.attribute_no = ao.attribute_no
where a.title = 'colour'
and ao.value = 'red'
)
)
and product_no in
(
select product_no
from product_attributes
where (attribute_no, option_no) =
(
select ao.attribute_no, ao.option_no
from attribute_option ao
join attribute a on a.attribute_no = ao.attribute_no
where a.title = 'shape'
and ao.value = 'round'
)
)
and product_no in (...)
and product_no in (...);
Or shorter with aggregation:
select *
from product
where product_no in
(
select pa.product_no
from product_attributes pa
join attribute a on a.attribute_no = pa.attribute_no
join attribute_option ao on ao.attribute_no = pa.attribute_no
and ao.option_no = pa.option_no
group by pa.product_no
having sum(a.title = 'colour' and ao.value = 'red') > 0
and sum(a.title = 'shape' and ao.value = 'round') > 0
and sum(a.title = 'height' and ao.value = 'tall') > 0
and sum(a.title = 'weight' and ao.value = 'heavy') > 0
)
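The aggregation approach can be tried out with the sketch below, which uses Python's sqlite3 module. It assumes SQLite, where a boolean expression evaluates to 1 or 0 much as in MySQL, and it joins attribute_option on its own attribute_no and option_no columns. The three sample products are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attribute (attribute_no INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE attribute_option (
    attribute_no INTEGER, option_no INTEGER, value TEXT,
    PRIMARY KEY (attribute_no, option_no)
);
CREATE TABLE product (product_no INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE product_attributes (
    product_no INTEGER, attribute_no INTEGER, option_no INTEGER,
    PRIMARY KEY (product_no, attribute_no)
);
INSERT INTO attribute VALUES (1, 'shape'), (2, 'colour'), (3, 'height'), (4, 'weight');
INSERT INTO attribute_option VALUES
    (1, 1, 'round'), (1, 2, 'square'), (2, 1, 'green'), (2, 2, 'red'),
    (3, 1, 'tall'), (3, 2, 'short'), (4, 1, 'heavy'), (4, 2, 'light');
-- Invented sample products: only product 1 is red, round, tall and heavy.
INSERT INTO product VALUES (1, 'Product #1'), (2, 'Product #2'), (3, 'Product #3');
INSERT INTO product_attributes VALUES
    (1, 1, 1), (1, 2, 2), (1, 3, 1), (1, 4, 1),
    (2, 1, 1), (2, 2, 2), (2, 3, 1), (2, 4, 2),
    (3, 1, 2), (3, 2, 2), (3, 3, 1), (3, 4, 1);
""")

# Each SUM counts the rows of a product that match one wanted attribute/value pair.
matches = conn.execute("""
    SELECT product_name FROM product
    WHERE product_no IN (
        SELECT pa.product_no
        FROM product_attributes pa
        JOIN attribute a ON a.attribute_no = pa.attribute_no
        JOIN attribute_option ao ON ao.attribute_no = pa.attribute_no
                                AND ao.option_no = pa.option_no
        GROUP BY pa.product_no
        HAVING SUM(a.title = 'colour' AND ao.value = 'red') > 0
           AND SUM(a.title = 'shape' AND ao.value = 'round') > 0
           AND SUM(a.title = 'height' AND ao.value = 'tall') > 0
           AND SUM(a.title = 'weight' AND ao.value = 'heavy') > 0
    )
""").fetchall()
```

Product #2 fails on weight and Product #3 on shape, so only Product #1 remains.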
After searching the web for "mysql key value table" (thank you @Thorsten Kettner for the keywords, as I lack the terminology), I've ended up with something like:
SELECT Product.id FROM Product
INNER JOIN ProductAttributes PA_1 ON
Product.id = PA_1.product_id
INNER JOIN ProductAttributes PA_2 ON
Product.id = PA_2.product_id
WHERE
(PA_1.attribute_id = 1 and PA_1.option_id = 1)
AND
(PA_2.attribute_id = 3 and PA_2.option_id = 4);
Basically, whenever a new attribute is used in the query, another INNER JOIN is needed.
In terms of performance, this will cause a rather noticeable hit.
According to this and this a key/value table should not be used for filtering, but at this point I have no choice, so it will be up to the caching server to save the day.
I've based my answer on this question (no need for GROUP BY in my case, as I don't use aggregate functions): Filtering and Grouping data from table with key/value pairs
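Since every extra condition needs one more self-join, the query text is usually generated. Below is a minimal sketch in Python with sqlite3; build_filter_query is a hypothetical helper, not something from the original post. It moves each condition into its join's ON clause, which is equivalent for inner joins, and the rows are taken from the ProductAttribute table in the question (shape = 1, height = 3; round = 1, tall = 7).

```python
import sqlite3

def build_filter_query(pairs):
    """Return (sql, params) with one self-join per (attribute_id, option_id) pair."""
    joins, params = [], []
    for i, (attribute_id, option_id) in enumerate(pairs, start=1):
        joins.append(
            f"INNER JOIN ProductAttributes PA_{i} ON PA_{i}.product_id = Product.id "
            f"AND PA_{i}.attribute_id = ? AND PA_{i}.option_id = ?"
        )
        params += [attribute_id, option_id]
    sql = "SELECT Product.id FROM Product " + " ".join(joins) + " ORDER BY Product.id"
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE ProductAttributes "
    "(id INTEGER PRIMARY KEY, product_id INT, attribute_id INT, option_id INT)"
)
conn.executemany(
    "INSERT INTO Product VALUES (?, ?)", [(i, f"Product #{i}") for i in range(1, 5)]
)
conn.executemany(
    "INSERT INTO ProductAttributes VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (2, 2, 1, 3), (3, 3, 1, 1), (4, 4, 1, 2), (5, 1, 2, 6),
     (6, 2, 2, 4), (7, 3, 3, 7), (8, 4, 3, 8), (9, 2, 4, 9), (10, 1, 4, 10)],
)

# Round (1, 1) and tall (3, 7): two conditions, so two joins.
sql, params = build_filter_query([(1, 1), (3, 7)])
round_and_tall = [row[0] for row in conn.execute(sql, params)]

# A single condition needs a single join.
sql_one, params_one = build_filter_query([(1, 1)])
round_only = [row[0] for row in conn.execute(sql_one, params_one)]
```

The values are passed as bound parameters, so only the join aliases are interpolated into the SQL string.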
This is a hard question to know how to ask properly, but here goes...
This is the basic format of my table (the actual table has many rows and multiple lang_ids):
----------------------------------
| id | lang_id | key | text |
----------------------------------
| 1 | 1 | k_foo | foo |
----------------------------------
| 2 | 1 | k_bar | bar |
----------------------------------
| 3 | 2 | k_bar | le bar |
----------------------------------
| 4 | 2 | k_foo | le foo |
----------------------------------
What I want to do is return the rows with WHERE lang_id = 2 but order them by results of WHERE lang_id = 1 like so:
----------------------------------
| id | lang_id | key | text |
----------------------------------
| 4 | 2 | k_foo | le foo |
----------------------------------
| 3 | 2 | k_bar | le bar |
----------------------------------
I am driving myself nuts trying to figure it out. I've searched for hours but keep getting results for ordering by multiple columns instead of multiple results of a single column.
I've tried joining it, unioning it, and subqueries but I either return hundreds of rows, or none!
SELECT
l2.id,
l2.lang_id,
l2.key,
l2.text
FROM
language l1
JOIN
language l2
ON l2.key = l1.key
WHERE
l1.lang_id = 1
AND
l2.lang_id = 2
ORDER BY
l1.id -- Replace with whatever column you actually want to order by.
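The self-join above can be checked against the four sample rows with the following sketch, which uses Python's sqlite3 module (SQLite instead of MySQL is an assumption). The key column is double-quoted because KEY is an SQL keyword. Ordering by l1.id means the rows for lang_id = 2 come back in the order of their lang_id = 1 counterparts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE language (id INTEGER PRIMARY KEY, lang_id INTEGER, "key" TEXT, text TEXT);
INSERT INTO language VALUES
    (1, 1, 'k_foo', 'foo'), (2, 1, 'k_bar', 'bar'),
    (3, 2, 'k_bar', 'le bar'), (4, 2, 'k_foo', 'le foo');
""")

# l1 supplies the sort order (language 1), l2 supplies the returned rows (language 2).
ordered = conn.execute("""
    SELECT l2.id, l2.lang_id, l2."key", l2.text
    FROM language l1
    JOIN language l2 ON l2."key" = l1."key"
    WHERE l1.lang_id = 1 AND l2.lang_id = 2
    ORDER BY l1.id
""").fetchall()
```

Keys that exist only in language 2 would be dropped by the inner join; a LEFT JOIN from l2 would keep them.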
I'm currently redesigning a heavily loaded website, and I would appreciate any opinion about a specific database design issue.
The concept is to keep in the db a number of products (500K of them).
Every product can have a number of dynamic properties (around 1K), and every property a number of predefined but dynamic values (let's say 10 on average for every property, so around 10K).
At this point of time this is the simplified db structure:
Products (Products Table)
+--------+--------------+
| ProdID | Product Name |
+--------+--------------+
| 1 | T-Shirt XYZ |
+--------+--------------+
| 2 | Dress ABC |
+--------+--------------+
| ... | ... |
+--------+--------------+
| 500000 | Something |
+--------+--------------+
Properties Definition (Props Table) (it holds the Property Types)
+--------+--------------+
| PropID | Property Name|
+--------+--------------+
| 1 | color |
+--------+--------------+
| 2 | size |
+--------+--------------+
| ... | ... |
+--------+--------------+
| 100 | Some Prop |
+--------+--------------+
Properties Values Definition (Values Table)
+-----------+--------+-------+
| PropValID | PropID | Value |
+-----------+--------+-------+
| 1 | 1 | red |
+-----------+--------+-------+
| 2 | 1 | blue |
+-----------+--------+-------+
| 3 | 2 | m |
+-----------+--------+-------+
| 4 | 2 | xl |
+-----------+--------+-------+
| 5 | 2 | xxl |
+-----------+--------+-------+
| ... | ... | ... |
+-----------+--------+-------+
| 1000 | 100 | xyz |
+-----------+--------+-------+
This way we can add any number of properties and values in any product.
The table below holds this info.
Product Properties & Values (ProdPropVal Table)
+--------+--------+--------+-----------+
| InfoID | ProdID | PropID | PropValID |
+--------+--------+--------+-----------+
| 1 | 1 | 1 | 1 |
+--------+--------+--------+-----------+
| 2 | 1 | 2 | 3 |
+--------+--------+--------+-----------+
| 3 | 2 | 1 | 2 |
+--------+--------+--------+-----------+
| 4 | 2 | 2 | 5 |
+--------+--------+--------+-----------+
| ... | ... | ... | |
+--------+--------+--------+-----------+
In the example above we know that "T-Shirt XYZ" has red color and its size is medium.
And now the tricky part...
If we want to find all products that share a set of property values (e.g. all products of blue color and medium size), which is the best approach?
My ideas:
Search the ProdPropVal table once for each PropValID and compare the results in code. This can be fine-tuned by starting from the rarest PropValIDs and limiting ProdIDs using a WHERE ProdID IN (previous IDs) in the subsequent queries.
Use an inner join on the ProdPropVal table for each PropValID wanted. Something like:
SELECT ppv1.ProdID
FROM ProdPropVal ppv1
INNER JOIN ProdPropVal ppv2 ON ppv1.ProdID = ppv2.ProdID
INNER JOIN ProdPropVal ppv3 ON ppv1.ProdID = ppv3.ProdID
INNER JOIN ProdPropVal ppv4 ON ppv1.ProdID = ppv4.ProdID
WHERE ppv1.PropValID = 10
AND ppv2.PropValID = 20
AND ppv3.PropValID = 30
AND ppv4.PropValID = 150
These are my ideas so far. The fact that the ProdPropVal table has several million rows doesn't leave any room for error.
Any suggestion is most welcome!
To find all products with blue colour and medium size I would do this:
SELECT ProdID
FROM ProdPropVal
WHERE (PropID = 1 AND PropValID = 2)
OR (PropID = 2 AND PropValID = 3)
GROUP BY ProdID
HAVING COUNT(*) = 2
Better still, if PropValID is unique in the Values table, then you would remove the PropID column from the ProdPropVal table, and simplify the query to this:
SELECT ProdID
FROM ProdPropVal
WHERE PropValID IN (2, 3)
GROUP BY ProdID
HAVING COUNT(*) = 2
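The GROUP BY / HAVING pattern generalises to any number of wanted values. The sketch below uses Python with sqlite3; products_with_all is a hypothetical helper, and product 3 (blue, medium) is an invented row, since the sample data contains no blue and medium product. COUNT(DISTINCT PropValID) is used instead of COUNT(*) so that accidental duplicate rows would not inflate the count.

```python
import sqlite3

def products_with_all(conn, prop_val_ids):
    """Return ProdIDs that have every PropValID in prop_val_ids."""
    placeholders = ", ".join("?" for _ in prop_val_ids)
    sql = (
        "SELECT ProdID FROM ProdPropVal "
        f"WHERE PropValID IN ({placeholders}) "
        "GROUP BY ProdID "
        "HAVING COUNT(DISTINCT PropValID) = ? "
        "ORDER BY ProdID"
    )
    params = [*prop_val_ids, len(set(prop_val_ids))]
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProdPropVal "
    "(InfoID INTEGER PRIMARY KEY, ProdID INT, PropID INT, PropValID INT)"
)
conn.executemany(
    "INSERT INTO ProdPropVal VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (2, 1, 2, 3),   # product 1: red, m
     (3, 2, 1, 2), (4, 2, 2, 5),   # product 2: blue, xxl
     (5, 3, 1, 2), (6, 3, 2, 3)],  # product 3 (invented): blue, m
)

blue_and_medium = products_with_all(conn, [2, 3])
red_and_medium = products_with_all(conn, [1, 3])
blue_only = products_with_all(conn, [2])
```

An index on (PropValID, ProdID) would likely help this query on a table with millions of rows, though that is worth measuring on the real data.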
I would like to create a view from a lookup table and actual data. I have two questions.
How would you accomplish this?
Should I try to do it this way?
Scenario
Table Name: steps
Table structure with values:
There are other columns hence the ( ... )
| id | Name | ... |
| 1 | Step One | ... |
| 2 | Step Two | ... |
Table Name: steps_completed
Table structure with values:
| user_id | steps_id |
| 1 | 1 |
| 2 | 1 |
| 2 | 2 |
Results Wanted
View Structure and Values wanted:
| user_id | step_one | step_two |
| 1 | 1 | 0 |
| 2 | 1 | 1 |
Thanks for your help.
Sounds like you need a cross tab query. The only way I've seen it done is via a stored procedure like this: Cross Tab Query
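When the set of steps is small and known in advance, a cross tab can also be written with conditional aggregation, without a stored procedure. This is a sketch under that assumption, using Python's sqlite3 module; the view name user_steps is made up. A dynamic list of steps would still need generated SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE steps (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE steps_completed (user_id INTEGER, steps_id INTEGER);
INSERT INTO steps VALUES (1, 'Step One'), (2, 'Step Two');
INSERT INTO steps_completed VALUES (1, 1), (2, 1), (2, 2);

-- One output column per known step; the comparison yields 1 or 0.
CREATE VIEW user_steps AS
SELECT user_id,
       MAX(steps_id = 1) AS step_one,
       MAX(steps_id = 2) AS step_two
FROM steps_completed
GROUP BY user_id;
""")

pivot = conn.execute("SELECT * FROM user_steps ORDER BY user_id").fetchall()
```

Users with no completed steps do not appear in this view, because it is built from steps_completed only.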
I'd like to find the car_id's of the cars that have 'FORD' AND 'SILVER' AND the user input value of '200' in the value column:
table_cars
+----+--------+----------+-----------+
| id | car_id | name | value |
+----+--------+----------+-----------+
| 1 | 1 | MAKE | FORD |
| 2 | 1 | CARLINE | FIESTA |
| 3 | 1 | COLOR | SILVER |
| 4 | 1 | TOPSPEED | 210KM/H |
| 5 | 2 | MAKE | FORD |
| 6 | 2 | CARLINE | FOCUS |
| 7 | 2 | COLOR | SILVER |
| 8 | 2 | TOPSPEED | 200KM/H |
| 9 | 3 | MAKE | HOLDEN |
| 10 | 3 | CARLINE | ASTRA |
| 11 | 3 | COLOR | WHITE |
| 12 | 3 | TOPSPEED | 212KM/H |
+----+--------+----------+-----------+
Which in this case should return only one car_id: car_id = 2.
What would be the way to go to create the SQL query for this?
What you have is a properties table. When you want to test multiple properties at once you need to join the table to itself:
SELECT c0.car_id
FROM table_cars AS c0
JOIN table_cars AS c1 ON c1.car_id=c0.car_id
JOIN table_cars AS c2 ON c2.car_id=c1.car_id
WHERE c0.name='MAKE' AND c0.value='FORD'
AND c1.name='COLOR' AND c1.value='SILVER'
AND c2.name='TOPSPEED' AND c2.value='200KM/H'
Having the surrogate id present in a properties table is questionable. It doesn't seem to be doing anything; each property isn't an entity of its own. Unless the id is required by some other element, I'd get rid of it and make car_id, name the primary key (a composite primary key).
I assume that every car needs to have variable parameters, otherwise you wouldn't have gone with a setup like this. It would be much easier if MAKE, CARLINE, COLOR, and TOPSPEED each had their own column.
Using the table you've provided, however, you need to use subqueries. http://dev.mysql.com/doc/refman/5.0/en/subqueries.html
The query should look something like this (untested):
SELECT DISTINCT car_id
FROM table_cars
WHERE car_id IN (SELECT car_id FROM table_cars WHERE name='MAKE' AND value='FORD')
AND car_id IN (SELECT car_id FROM table_cars WHERE name='COLOR' AND value='SILVER')
AND car_id IN (SELECT car_id FROM table_cars WHERE name='TOPSPEED' AND value='200KM/H')
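Both the self-join and the subquery approach can be compared on the sample data with the sketch below, which uses Python's sqlite3 module (SQLite instead of MySQL is an assumption). The subquery variant here selects and compares car_id, since that is the column that groups a car's properties.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_cars (id INTEGER PRIMARY KEY, car_id INTEGER, name TEXT, value TEXT)"
)
cars = {
    1: ("FORD", "FIESTA", "SILVER", "210KM/H"),
    2: ("FORD", "FOCUS", "SILVER", "200KM/H"),
    3: ("HOLDEN", "ASTRA", "WHITE", "212KM/H"),
}
names = ("MAKE", "CARLINE", "COLOR", "TOPSPEED")
# Expand each car into one row per property, as in the question's table.
conn.executemany(
    "INSERT INTO table_cars (car_id, name, value) VALUES (?, ?, ?)",
    [(car_id, n, v) for car_id, vals in cars.items() for n, v in zip(names, vals)],
)

self_join = conn.execute("""
    SELECT c0.car_id
    FROM table_cars AS c0
    JOIN table_cars AS c1 ON c1.car_id = c0.car_id
    JOIN table_cars AS c2 ON c2.car_id = c1.car_id
    WHERE c0.name = 'MAKE' AND c0.value = 'FORD'
      AND c1.name = 'COLOR' AND c1.value = 'SILVER'
      AND c2.name = 'TOPSPEED' AND c2.value = '200KM/H'
""").fetchall()

subqueries = conn.execute("""
    SELECT DISTINCT car_id
    FROM table_cars
    WHERE car_id IN (SELECT car_id FROM table_cars WHERE name = 'MAKE' AND value = 'FORD')
      AND car_id IN (SELECT car_id FROM table_cars WHERE name = 'COLOR' AND value = 'SILVER')
      AND car_id IN (SELECT car_id FROM table_cars WHERE name = 'TOPSPEED' AND value = '200KM/H')
""").fetchall()
```

Both queries return only car_id 2 for this data.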