MySQL table relationship and how to query a Key/value table - mysql

I have the following table structure:
Product (id, name, ...)
+-----+------------+
| id | name |
+-----+------------+
| 1 | Product #1 |
| 2 | Product #2 |
| 3 | Product #3 |
| 4 | Product #4 |
+-----+------------+
Attribute (id, title, ...)
+-----+------------+
| id | title |
+-----+------------+
| 1 | shape |
| 2 | colour |
| 3 | height |
| 4 | weight |
+-----+------------+
Option (id, title ... )
+-----+------------+
| id | title |
+-----+------------+
| 1 | round |
| 2 | square |
| 3 | oval |
| 4 | red |
| 5 | blue |
| 6 | green |
| 7 | tall |
| 8 | short |
| 9 | heavy |
| 10 | light |
+-----+------------+
and a fourth one, ProductAttribute (id, product_id, attribute_id, option_id), shown here with names instead of IDs. I'm hoping to get, for example, "all the red round products which are also tall and heavy":
+----+------------+-----------+--------+
| id | product    | attribute | option |
+----+------------+-----------+--------+
| 1  | Product #1 | shape     | round  |
| 2  | Product #2 | shape     | oval   |
| 3  | Product #3 | shape     | round  |
| 4  | Product #4 | shape     | square |
| 5  | Product #1 | colour    | green  |
| 6  | Product #2 | colour    | red    |
| 7  | Product #3 | height    | tall   |
| 8  | Product #4 | height    | short  |
| 9  | Product #2 | weight    | heavy  |
| 10 | Product #1 | weight    | light  |
+----+------------+-----------+--------+
I'm far from being an SQL expert, and maybe my idea can't work.
Edit:
Q1. The question is how do I achieve that? Getting all the red, tall, heavy products for instance.
The following queries don't achieve my purpose:
1:
SELECT ProductAttributes.product_id, ProductAttributes.id FROM ProductAttributes
WHERE (ProductAttributes.attribute_id = 1 AND ProductAttributes.option_id = 1)
AND (ProductAttributes.attribute_id = 3 AND ProductAttributes.option_id = 4);
2:
SELECT DISTINCT ProductAttributes.product_id, ProductAttributes.id FROM ProductAttributes
WHERE (ProductAttributes.attribute_id = 1 AND ProductAttributes.option_id = 1)
OR (ProductAttributes.attribute_id = 3 AND ProductAttributes.option_id = 4);
Note: I'm deliberately using only two conditions in these queries; the real one has many more.

Key/value tables are a nuisance. So avoid them, if you can. You'd have these tables then:
table shapes
+--------+
| shape |
+--------+
| round |
| oval |
| round |
| square |
+--------+
table colors
+--------+
| color |
+--------+
| green |
| red |
+--------+
table heights
+--------+
| height |
+--------+
| tall |
| short |
+--------+
table weights
+--------+
| weight |
+--------+
| heavy |
| light |
+--------+
table products
+-------------+--------------+--------+--------+--------+--------+
| product_no | product name | shape | color | height | weight |
+-------------+--------------+--------+--------+--------+--------+
| 14214 | Product #1 | round | red | tall | heavy |
| 22312 | Product #2 | oval | | short | heavy |
| 35757 | Product #3 | square | green | tall | heavy |
| 42468 | Product #4 | | red | short | light |
+-------------+--------------+--------+--------+--------+--------+
The query
select *
from products
where shape = 'round'
and color = 'red'
and height = 'tall'
and weight = 'heavy';
You can do the same with IDs by the way. So all lookup tables would get an ID (round = 1, oval = 2, ... green = 1, red = 2, ...) and the product table would no longer contain the words, but the IDs. The query would then be:
select *
from products
where shape_id = (select id from shapes where shape = 'round')
and color_id = (select id from colors where color = 'red')
and height_id = (select id from heights where height = 'tall')
and weight_id = (select id from weights where weight = 'heavy');
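To try the ID-based variant out, here is a minimal sketch. It uses Python's sqlite3 with an in-memory database as a stand-in for MySQL, and the id values and the `*_id` column layout are assumptions made for illustration, not something given in the answer.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the id values are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shapes  (id INTEGER PRIMARY KEY, shape  TEXT UNIQUE);
    CREATE TABLE colors  (id INTEGER PRIMARY KEY, color  TEXT UNIQUE);
    CREATE TABLE heights (id INTEGER PRIMARY KEY, height TEXT UNIQUE);
    CREATE TABLE weights (id INTEGER PRIMARY KEY, weight TEXT UNIQUE);
    CREATE TABLE products (
        product_no   INTEGER PRIMARY KEY,
        product_name TEXT,
        shape_id  INTEGER REFERENCES shapes(id),
        color_id  INTEGER REFERENCES colors(id),
        height_id INTEGER REFERENCES heights(id),
        weight_id INTEGER REFERENCES weights(id)
    );
    INSERT INTO shapes  VALUES (1,'round'),(2,'oval'),(3,'square');
    INSERT INTO colors  VALUES (1,'green'),(2,'red');
    INSERT INTO heights VALUES (1,'tall'),(2,'short');
    INSERT INTO weights VALUES (1,'heavy'),(2,'light');
    -- Same four products as in the table above, with words replaced by ids.
    INSERT INTO products VALUES
        (14214,'Product #1',1,2,1,1),
        (22312,'Product #2',2,NULL,2,1),
        (35757,'Product #3',3,1,1,1),
        (42468,'Product #4',NULL,2,2,2);
""")

# One scalar subquery per attribute translates the word into its id.
matches = conn.execute("""
    SELECT product_no, product_name
    FROM products
    WHERE shape_id  = (SELECT id FROM shapes  WHERE shape  = 'round')
      AND color_id  = (SELECT id FROM colors  WHERE color  = 'red')
      AND height_id = (SELECT id FROM heights WHERE height = 'tall')
      AND weight_id = (SELECT id FROM weights WHERE weight = 'heavy')
""").fetchall()
print(matches)
```

With this sample data only Product #1 is round, red, tall and heavy.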

So you want to select products based on the option in the ProductAttribute table.
A better way to store the data is to use a unique ID (primary key) value in the fourth column; then you can do this:
SELECT * FROM ProductAttribute AS attr
INNER JOIN Product AS product ON product.id = attr.product_id
INNER JOIN Attribute AS attr2 ON attr2.id = attr.attribute_id
WHERE attr.option = 'round' OR attr.option = 'red'
I hope this helps!

For the key/value approach I'd use composite keys to improve consistency:
attribute (attribute_no, title), PK = attribute_no
+--------------+------------+
| attribute_no | title |
+--------------+------------+
| 1 | shape |
| 2 | colour |
| ... | ... |
+--------------+------------+
attribute_option (attribute_no, option_no, value), PK = attribute_no, option_no
+--------------+-----------+------------+
| attribute_no | option_no | value |
+--------------+-----------+------------+
| 1 | 1 | round |
| 1 | 2 | square |
| 2 | 1 | green |
| 2 | 2 | red |
| ... | ... | ... |
+--------------+-----------+------------+
product (product_no, product_name, ...), PK = product_no
+------------+--------------+
| product_no | product_name |
+------------+--------------+
| 7352871 | Product #1 |
| 8956443 | Product #2 |
| ... | ... |
+------------+--------------+
product_attributes (product_no, attribute_no, option_no), PK = product_no, attribute_no
+------------+--------------+-----------+
| product_no | attribute_no | option_no |
+------------+--------------+-----------+
| 7352871 | 1 | 1 |
| 7352871 | 2 | 1 |
| 8956443 | 1 | 2 |
| 8956443 | 2 | 1 |
+------------+--------------+-----------+
(And you'd want an index on attribute_no + option_no for this table.)
The product_attributes primary key guarantees that each product gets only one value per attribute. That is fine for height, weight, etc. If you want a product to have multiple colours, however, you need a product_attributes table whose primary key also includes option_no. You may end up with separate tables for single-valued and multi-valued attributes. Later you may even want to introduce product groups with optional and obligatory attributes (a freezer has an energy class, a t-shirt doesn't). So the whole concept may grow, but the tables above should give you an idea of how best to approach this.
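The effect of the two primary key choices can be checked with a small sketch. It uses Python's sqlite3 in place of MySQL, and `product_multi_attributes` and the index name are hypothetical names introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Single-valued attributes: one row per (product, attribute).
    CREATE TABLE product_attributes (
        product_no   INTEGER NOT NULL,
        attribute_no INTEGER NOT NULL,
        option_no    INTEGER NOT NULL,
        PRIMARY KEY (product_no, attribute_no)
    );
    -- The extra index suggested for lookups by attribute and option.
    CREATE INDEX idx_attr_option
        ON product_attributes (attribute_no, option_no);
    INSERT INTO product_attributes VALUES (7352871, 1, 1);
""")

# A second option for the same product and attribute violates the key.
try:
    conn.execute("INSERT INTO product_attributes VALUES (7352871, 1, 2)")
    second_value_rejected = False
except sqlite3.IntegrityError:
    second_value_rejected = True

# Multi-valued attributes (hypothetical table): option_no is part of the key,
# so several options per attribute are allowed, but exact duplicates are not.
conn.executescript("""
    CREATE TABLE product_multi_attributes (
        product_no   INTEGER NOT NULL,
        attribute_no INTEGER NOT NULL,
        option_no    INTEGER NOT NULL,
        PRIMARY KEY (product_no, attribute_no, option_no)
    );
    INSERT INTO product_multi_attributes VALUES (7352871, 2, 1), (7352871, 2, 2);
""")
multi_count = conn.execute(
    "SELECT COUNT(*) FROM product_multi_attributes"
).fetchone()[0]
print(second_value_rejected, multi_count)
```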
A query for all the red round products which are also tall and heavy:
select *
from product
where product_no in
(
select product_no
from product_attributes
where (attribute_no, option_no) =
(
select ao.attribute_no, ao.option_no
from attribute_option ao
join attribute a on a.attribute_no = ao.attribute_no
where a.title = 'colour'
and ao.value = 'red'
)
)
and product_no in
(
select product_no
from product_attributes
where (attribute_no, option_no) =
(
select ao.attribute_no, ao.option_no
from attribute_option ao
join attribute a on a.attribute_no = ao.attribute_no
where a.title = 'shape'
and ao.value = 'round'
)
)
and product_no in (...)
and product_no in (...);
Or shorter with aggregation:
select *
from product
where product_no in
(
select pa.product_no
from product_attributes pa
join attribute a on a.attribute_no = pa.attribute_no
join attribute_option ao on ao.attribute_no = pa.attribute_no
and ao.option_no = pa.option_no
group by pa.product_no
having sum(a.title = 'colour' and ao.value = 'red') > 0
and sum(a.title = 'shape' and ao.value = 'round') > 0
and sum(a.title = 'height' and ao.value = 'tall') > 0
and sum(a.title = 'weight' and ao.value = 'heavy') > 0
)
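A runnable sketch of the aggregation approach, assuming the composite-key tables from this answer and joining attribute_option on both attribute_no and option_no. It runs on Python's sqlite3 rather than MySQL (both treat a comparison as 0 or 1 inside SUM), and the three products and the height/weight options are invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attribute (attribute_no INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE attribute_option (
        attribute_no INTEGER, option_no INTEGER, value TEXT,
        PRIMARY KEY (attribute_no, option_no)
    );
    CREATE TABLE product (product_no INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE product_attributes (
        product_no INTEGER, attribute_no INTEGER, option_no INTEGER,
        PRIMARY KEY (product_no, attribute_no)
    );
    INSERT INTO attribute VALUES (1,'shape'),(2,'colour'),(3,'height'),(4,'weight');
    INSERT INTO attribute_option VALUES
        (1,1,'round'),(1,2,'square'),(2,1,'green'),(2,2,'red'),
        (3,1,'tall'),(3,2,'short'),(4,1,'heavy'),(4,2,'light');
    -- Invented sample products: only product 1 satisfies all four conditions.
    INSERT INTO product VALUES (1,'Product A'),(2,'Product B'),(3,'Product C');
    INSERT INTO product_attributes VALUES
        (1,1,1),(1,2,2),(1,3,1),(1,4,1),   -- round, red, tall, heavy
        (2,1,1),(2,2,2),(2,3,2),(2,4,1),   -- round, red, short, heavy
        (3,1,2),(3,2,2),(3,3,1),(3,4,1);   -- square, red, tall, heavy
""")

# Each SUM counts the rows of a product that satisfy one condition;
# a product qualifies only if every condition is met at least once.
result = conn.execute("""
    SELECT p.product_no, p.product_name
    FROM product p
    WHERE p.product_no IN (
        SELECT pa.product_no
        FROM product_attributes pa
        JOIN attribute a ON a.attribute_no = pa.attribute_no
        JOIN attribute_option ao ON ao.attribute_no = pa.attribute_no
                                AND ao.option_no = pa.option_no
        GROUP BY pa.product_no
        HAVING SUM(a.title = 'colour' AND ao.value = 'red') > 0
           AND SUM(a.title = 'shape'  AND ao.value = 'round') > 0
           AND SUM(a.title = 'height' AND ao.value = 'tall') > 0
           AND SUM(a.title = 'weight' AND ao.value = 'heavy') > 0
    )
    ORDER BY p.product_no
""").fetchall()
print(result)
```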

After searching the web for "mysql key value table" (thank you @Thorsten Kettner for the keywords, as I lacked the terminology), I've ended up with something like:
SELECT Product.id FROM Product
INNER JOIN ProductAttributes PA_1 ON
Product.id = PA_1.product_id
INNER JOIN ProductAttributes PA_2 ON
Product.id = PA_2.product_id
WHERE
(PA_1.attribute_id = 1 and PA_1.option_id = 1)
AND
(PA_2.attribute_id = 3 and PA_2.option_id = 4);
Basically, whenever a new attribute is used in the query, another INNER JOIN is needed.
In terms of performance, this will cause a rather noticeable hit.
According to this and this a key/value table should not be used for filtering, but at this point I have no choice, so it will be up to the caching server to save the day.
I've based my answer on this (no need for GROUP BY in my case, as I don't use aggregate functions): Filtering and Grouping data from table with key/value pairs
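Since one join is needed per filter, the query can be generated from a list of (attribute_id, option_id) pairs. The sketch below is one possible way to do that, using Python's sqlite3 in place of MySQL and the sample rows from the question; `products_matching` is a hypothetical helper, and the pairs use the option ids from the Option table (for example 7 for tall).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ProductAttributes (
        id INTEGER PRIMARY KEY,
        product_id INTEGER, attribute_id INTEGER, option_id INTEGER
    );
    INSERT INTO Product VALUES
        (1,'Product #1'),(2,'Product #2'),(3,'Product #3'),(4,'Product #4');
    -- Rows from the question, written with ids instead of names.
    INSERT INTO ProductAttributes VALUES
        (1,1,1,1),(2,2,1,3),(3,3,1,1),(4,4,1,2),(5,1,2,6),
        (6,2,2,4),(7,3,3,7),(8,4,3,8),(9,2,4,9),(10,1,4,10);
""")

def products_matching(conn, pairs):
    """Return ids of products having every (attribute_id, option_id) pair."""
    joins, params = [], []
    for i, (attribute_id, option_id) in enumerate(pairs, start=1):
        # One aliased self-join per filter; values are bound as parameters.
        joins.append(
            f"INNER JOIN ProductAttributes PA_{i} "
            f"ON PA_{i}.product_id = Product.id "
            f"AND PA_{i}.attribute_id = ? AND PA_{i}.option_id = ?"
        )
        params += [attribute_id, option_id]
    sql = ("SELECT Product.id FROM Product " + " ".join(joins)
           + " ORDER BY Product.id")
    return [row[0] for row in conn.execute(sql, params)]

round_only = products_matching(conn, [(1, 1)])                       # shape = round
round_and_tall = products_matching(conn, [(1, 1), (3, 7)])           # + height = tall
round_green_light = products_matching(conn, [(1, 1), (2, 6), (4, 10)])
print(round_only, round_and_tall, round_green_light)
```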

Related

SQL - Get rows with one result and selected option - Relation Table

I have a table that has the relations of products and colors. Each product has one or multiple colors. Is it possible to do a query that returns only the products that have one color only and the color wanted ?
Value from the api : color_slug = white ;
Sample table :
color_table
+----------+------------+
| color_id | color_slug |
+----------+------------+
| 1 | white |
| 2 | blue |
| 3 | black |
| 4 | green |
| 5 | red |
| 6 | yellow |
+----------+------------+
product_table
+------------+--------------+
| product_id | product_name |
+------------+--------------+
| 1 | shoes |
| 2 | shorts |
| 3 | t-shirt |
| 4 | jacket |
| 5 | watch |
| 6 | glasses |
+------------+--------------+
pc_relation
+----+------------+----------+
| id | product_id | color_id |
+----+------------+----------+
| 1 | 1 | 5 |
| 2 | 1 | 1 |
| 3 | 2 | 1 |
| 4 | 2 | 4 |
| 5 | 2 | 3 |
| 6 | 3 | 2 |
| 7 | 4 | 1 |
| 8 | 5 | 5 |
| 9 | 5 | 6 |
| 10 | 6 | 1 |
+----+------------+----------+
Select products that have only one color (if I add WHERE color_id = 1, the products returned no longer have only one color):
SELECT product_id
FROM pc_relation
-- WHERE color_id = 1
GROUP BY product_id
HAVING MIN(color_id) = MAX(color_id)
pc_relation.id = 6,7,10
SELECT *
FROM color_table
INNER JOIN pc_relation ON pc_relation.color_id = color_table.color_id
INNER JOIN product_table ON pc_relation.product_id = product_table.product_id
WHERE color_table.color_slug = 'white'
Values wanted (color_slug = white):
pc_relation.id = 7,10
product_table.product_name = jacket, glasses
*All the combinations are unique and indexed. For example, I cannot have one product with the same color twice.
You are on the right track with your first query. Move the color comparison to the HAVING clause:
SELECT product_id
FROM pc_relation
GROUP BY product_id
HAVING MIN(color_id) = MAX(color_id) AND
MIN(color_id) = 1;
You can also phrase this using NOT EXISTS:
select r.*
from pc_relation r
where r.color_id = 1 and
not exists (select 1
from pc_relation r2
where r2.product_id = r.product_id and r2.color_id <> r.color_id
);
However, the GROUP BY method is more general.
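A small sketch of the GROUP BY method on the sample data, extended to filter by color_slug and return product names. It runs on Python's sqlite3 instead of MySQL; joining color_table and product_table into the grouped query is an assumption about how the pieces might be combined, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE color_table (color_id INTEGER PRIMARY KEY, color_slug TEXT);
    CREATE TABLE product_table (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE pc_relation (
        id INTEGER PRIMARY KEY, product_id INTEGER, color_id INTEGER,
        UNIQUE (product_id, color_id)
    );
    INSERT INTO color_table VALUES
        (1,'white'),(2,'blue'),(3,'black'),(4,'green'),(5,'red'),(6,'yellow');
    INSERT INTO product_table VALUES
        (1,'shoes'),(2,'shorts'),(3,'t-shirt'),(4,'jacket'),(5,'watch'),(6,'glasses');
    INSERT INTO pc_relation VALUES
        (1,1,5),(2,1,1),(3,2,1),(4,2,4),(5,2,3),
        (6,3,2),(7,4,1),(8,5,5),(9,5,6),(10,6,1);
""")

# MIN = MAX means the product has exactly one distinct color;
# the second HAVING condition checks that this color is the wanted one.
single_color = conn.execute("""
    SELECT p.product_id, p.product_name
    FROM pc_relation r
    JOIN color_table c   ON c.color_id = r.color_id
    JOIN product_table p ON p.product_id = r.product_id
    GROUP BY p.product_id, p.product_name
    HAVING MIN(r.color_id) = MAX(r.color_id)
       AND MIN(c.color_slug) = ?
    ORDER BY p.product_id
""", ("white",)).fetchall()
print(single_color)
```

For color_slug = white this yields the jacket and the glasses, matching the values the question asks for.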

MySQL function: rank table by most similar attributes

I have a table of products ids and keywords that looks like the following:
+------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| product_id | int(10) unsigned | YES | MUL | NULL | |
| keyword | varchar(255) | YES | | NULL | |
+------------+------------------+------+-----+---------+----------------+
This table simply stores product ids, and keywords associated with those products. So for example, it might contain:
+----+------------+---------+
| id | product_id | name |
+----+------------+---------+
| 1 | 1 | soft |
| 2 | 1 | red |
| 3 | 1 | leather |
| 4 | 2 | cloth |
| 5 | 2 | red |
| 6 | 2 | new |
| 7 | 3 | soft |
| 8 | 3 | red |
| 9 | 4 | blue |
+----+------------+---------+
In other words:
product 1 is soft, red, and leather.
product 2 is cloth, red and new.
Product 3 is red and soft,
product 4 is blue.
I need some way to take in a product ID, and get back a sorted list of product ids ranked by the number of common keywords
So for example, if I pass in product_id 1, I'd expect to get back:
+------------+---------+
| product_id | matches |
+------------+---------+
| 3          | 2       | (product 3 has two common keywords with product 1)
| 2          | 1       | (product 2 has one common keyword with product 1)
| 4          | 0       | (product 4 has no common keywords with product 1)
+------------+---------+
One option uses a self right outer join with conditional aggregation to count the number of matched names between, e.g. product ID 1, and all other product IDs:
SELECT t2.product_id,
SUM(CASE WHEN t1.name IS NOT NULL THEN 1 ELSE 0 END) AS matches
FROM yourTable t1
RIGHT JOIN yourTable t2
ON t1.name = t2.name AND
t1.product_id = 1
WHERE t2.product_id <> 1
GROUP BY t2.product_id
ORDER BY matches DESC
You need to use an outer join against the keywords for productid 1:
select y.productid, count(y2.keyword)
from yourtable y
left join (
select keyword from yourtable y2 where y2.productid = 1
) y2 on y.keyword = y2.keyword
where y.productid <> 1
group by y.productid
order by 2 desc
Results:
| productid | count(y2.keyword) |
|-----------|-------------------|
| 3 | 2 |
| 2 | 1 |
| 4 | 0 |
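The outer-join ranking can be wrapped in a function that takes the product ID as a parameter. This is a sketch on Python's sqlite3 rather than MySQL; the table name `product_keywords`, the function name `similar_products` and the tie-break on product_id are assumptions, and it presumes a product never lists the same keyword twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_keywords (
        id INTEGER PRIMARY KEY, product_id INTEGER, keyword TEXT
    );
    INSERT INTO product_keywords VALUES
        (1,1,'soft'),(2,1,'red'),(3,1,'leather'),
        (4,2,'cloth'),(5,2,'red'),(6,2,'new'),
        (7,3,'soft'),(8,3,'red'),(9,4,'blue');
""")

def similar_products(conn, product_id):
    """Rank all other products by number of keywords shared with product_id."""
    return conn.execute("""
        SELECT k.product_id, COUNT(ref.keyword) AS matches
        FROM product_keywords k
        LEFT JOIN (
            SELECT keyword FROM product_keywords WHERE product_id = ?
        ) ref ON ref.keyword = k.keyword
        WHERE k.product_id <> ?
        GROUP BY k.product_id
        ORDER BY matches DESC, k.product_id
    """, (product_id, product_id)).fetchall()

ranking = similar_products(conn, 1)
print(ranking)
```

COUNT(ref.keyword) ignores the NULLs produced by the outer join, which is why product 4 still appears with zero matches.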

MYSQL pivot table create

I've tried various tutorials and examples of how to make a dynamic pivot table, and I can't make it work. Every time I get some SQL error.
Can someone help me make a dynamic pivot table? I am new to this, and I guess filtering by attributes will be easy once I have a pivot table. Here are my tables:
Table 1 name: table_items
------------------------------
| item_id | title | ... |
------------------------------
| 1 | item 1 | ... |
------------------------------
| 2 | item 2 | ... |
------------------------------
Table 2 name: table_item_options
-----------------------------
| option_id | option_name |
-----------------------------
| 1 | Color |
-----------------------------
| 2 | Size |
-----------------------------
Table 3 name: table_attributes
--------------------------------------------------
| attribute_id | option_id | attribute_name |
--------------------------------------------------
| 1 | 1 | Blue |
--------------------------------------------------
| 2 | 1 | Red |
--------------------------------------------------
| 3 | 2 | XL |
--------------------------------------------------
| 4 | 1 | Green |
--------------------------------------------------
| 5 | 2 | L |
--------------------------------------------------
Table 4 name: table_item_attributes
------------------------------------------------------
| assigned_id | item_id | option_id | attribute_id |
------------------------------------------------------
| 1 | 1 | 1 | 1 |
------------------------------------------------------
| 1 | 1 | 1 | 2 |
------------------------------------------------------
Any help is much appreciated
What I want is to make a product filter for items by their attributes. As I understand it, the best approach is to make a pivot table, and to avoid duplicate results by joining tables.
All first columns are primary keys, and with autoincrement.
EDIT:
At this point, using INNER JOIN, I create one big table, and in the SELECT statement I put "WHERE attribute_id = '2'" (which means select all Red), but this way I can use only one filter.
So my problem is that I can't use more than one filter on the attribute_id column, and I want to filter by more attributes (other color, other size, city, etc.).
So how can I do this using a pivot table? My intention is to dynamically create columns named after option_name (from table_item_options) and to populate them with attribute_id or attribute_name, so I can use more filters,
for example:
-----------------------------------------------------
| item_id | ... | color | size | City | etc.. |
------------------------------------------------------
| 1 | ... | 1(or Red) | L | A | ... |
------------------------------------------------------
| 1 | ... | 2(or Blue)| XL | B | ... |
------------------------------------------------------
In table like this, If I Select * .. where color = red , I will be able to filter this table by another column, eg: Where city = a
I hope it is more clear now.

Filtering data in left join

Here is some data:
record
-------------------------------------------------
| id | name |
-------------------------------------------------
| 1 | Round Cookie |
| 2 | Square Cookie |
| 3 | Oval Cookie |
| 4 | Hexagon Cookie |
-------------------------------------------------
record_field_data
----------------------------------------------
| id | record_id | data | type_id |
----------------------------------------------
| 1 | 1 | White | 1 |
| 2 | 1 | Round | 2 |
| 3 | 2 | Green | 1 |
| 4 | 2 | Square | 2 |
| 5 | 3 | Blue | 1 |
| 6 | 3 | Oval | 2 |
| 7 | 4 | Hexagon | 2 |
----------------------------------------------
record_type_field
-------------------------------------------------
| id | data_type |
-------------------------------------------------
| 1 | Color |
| 2 | Shape |
-------------------------------------------------
I am trying to get a list of all records left joined with the record_field_data of type "Color". This needs to be a left join because there may not be record_field_data of a given type, and I still want the record if the case.
This is the query I have come up with but it is returning a left join with ALL record_field_data and not just the specific ones I want.
SELECT record.id AS id, recordfield.data, recordtype.field_name
FROM record
LEFT JOIN record_field_data AS recordfield ON (record.id = recordfield.record_id)
LEFT JOIN record_type_field AS recordtype ON (recordfield.type_id = recordtype.id AND recordtype.data_type = 'Color');
I could do this with a subquery in the JOIN but I can't use a subquery. I have to translate this to HQL and subqueries are not supported in HQL for joins.
The result I am looking for is records ordered by the record_field_data where record_type_field.data_type is 'Color'. Note that "Hexagon cookie" doesn't have a color defined, I don't know if it should be at the top or bottom at this point. Either way will work.
-------------------------------------------------
| id | name |
-------------------------------------------------
| 3 | Oval Cookie |
| 2 | Square Cookie |
| 1 | Round Cookie |
| 4 | Hexagon Cookie |
-------------------------------------------------
SELECT r.id, r.name
FROM record r
JOIN record_type_field rf
ON rf.data_type = 'Color'
LEFT JOIN
record_field_data rd
ON rd.record_id = r.id
AND rd.type_id = rf.id
ORDER BY
rd.data
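The query above can be tried against the sample rows with the sketch below. It uses Python's sqlite3 in place of MySQL and assumes the left-joined table is the question's record_field_data; the extra `rd.data` output column and the tie-break on `r.id` are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE record (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE record_type_field (id INTEGER PRIMARY KEY, data_type TEXT);
    CREATE TABLE record_field_data (
        id INTEGER PRIMARY KEY, record_id INTEGER, data TEXT, type_id INTEGER
    );
    INSERT INTO record VALUES
        (1,'Round Cookie'),(2,'Square Cookie'),(3,'Oval Cookie'),(4,'Hexagon Cookie');
    INSERT INTO record_type_field VALUES (1,'Color'),(2,'Shape');
    INSERT INTO record_field_data VALUES
        (1,1,'White',1),(2,1,'Round',2),(3,2,'Green',1),(4,2,'Square',2),
        (5,3,'Blue',1),(6,3,'Oval',2),(7,4,'Hexagon',2);
""")

# The inner join pins the field type first; the left join then attaches only
# data rows of that type, so records without a color survive with NULL.
rows = conn.execute("""
    SELECT r.id, r.name, rd.data
    FROM record r
    JOIN record_type_field rf
      ON rf.data_type = 'Color'
    LEFT JOIN record_field_data rd
      ON rd.record_id = r.id
     AND rd.type_id = rf.id
    ORDER BY rd.data, r.id
""").fetchall()
for row in rows:
    print(row)
```

In SQLite, as in MySQL, NULL sorts first in ascending order, so the record without a color comes out at the top here.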
Have you tried using the SQL IN clause?
select record.id as id, recordfield.data
from record
left join record_field_data recordfield ON record.id = recordfield.record_id
WHERE recordfield.type_id IN (SELECT id FROM record_type_field where data_type='Color');
The IN clause allows you to specify a list condition. So the subquery gets all of the ids where the type is "Color", and you are then doing the join and selecting all records from that join that have an id in the list of ids corresponding to the type "Color".
You have to change the second join to an INNER JOIN.
If you use two LEFT JOINs you are selecting all the records from record AND from record_field_data.

MySQL Database - Performance Design

I'm currently redesigning a heavily loaded website, and I would appreciate any opinion about a specific database design issue.
The concept is to keep a number of products in the db (500K of them).
Every product can have a number of dynamic properties (around 1K), and every property a number of predefined but dynamic values (let's say 10 on average for every property, so around 10K).
At this point of time this is the simplified db structure:
Products (Products Table)
+--------+--------------+
| ProdID | Product Name |
+--------+--------------+
| 1 | T-Shirt XYZ |
+--------+--------------+
| 2 | Dress ABC |
+--------+--------------+
| ... | ... |
+--------+--------------+
| 500000 | Something |
+--------+--------------+
Properties Definition (Props Table) (it holds the Property Types)
+--------+--------------+
| PropID | Property Name|
+--------+--------------+
| 1 | color |
+--------+--------------+
| 2 | size |
+--------+--------------+
| ... | ... |
+--------+--------------+
| 100 | Some Prop |
+--------+--------------+
Properties Values Definition (Values Table)
+-----------+--------+-------+
| PropValID | PropID | Value |
+-----------+--------+-------+
| 1 | 1 | red |
+-----------+--------+-------+
| 2 | 1 | blue |
+-----------+--------+-------+
| 3 | 2 | m |
+-----------+--------+-------+
| 4 | 2 | xl |
+-----------+--------+-------+
| 5 | 2 | xxl |
+-----------+--------+-------+
| ... | ... | ... |
+-----------+--------+-------+
| 1000 | 100 | xyz |
+-----------+--------+-------+
This way we can add any number of properties and values in any product.
The table below holds this info.
Product Properties & Values (ProdPropVal Table)
+--------+--------+--------+-----------+
| InfoID | ProdID | PropID | PropValID |
+--------+--------+--------+-----------+
| 1 | 1 | 1 | 1 |
+--------+--------+--------+-----------+
| 2 | 1 | 2 | 3 |
+--------+--------+--------+-----------+
| 3 | 2 | 1 | 2 |
+--------+--------+--------+-----------+
| 4 | 2 | 2 | 5 |
+--------+--------+--------+-----------+
| ... | ... | ... | |
+--------+--------+--------+-----------+
In the example above we know that "T-Shirt XYZ" is red and its size is medium.
And now the tricky part...
if we want to find all products that have a common property values set (all products of blue color and medium size) which is the best approach?
My ideas:
Search the ProdPropVal table once for each PropValID and compare the results in code. This can be fine-tuned by starting from the rarest PropValIDs and limiting ProdIDs using a WHERE ProdID IN (previous IDs) in the next queries.
Use an INNER JOIN on the ProdPropVal table for each PropValID wanted. Something like:
SELECT ppv1.ProdID
FROM ProdPropVal ppv1
INNER JOIN ProdPropVal ppv2 ON ppv1.ProdID = ppv2.ProdID
INNER JOIN ProdPropVal ppv3 ON ppv1.ProdID = ppv3.ProdID
INNER JOIN ProdPropVal ppv4 ON ppv1.ProdID = ppv4.ProdID
WHERE ppv1.PropValID = 10
AND ppv2.PropValID = 20
AND ppv3.PropValID = 30
AND ppv4.PropValID = 150
These are my ideas so far. The fact that the ProdPropVal table has several million rows doesn't leave any room for error.
Any suggestion is most welcome!
To find all products with blue colour and medium size I would do this:
SELECT ProdID
FROM ProdPropVal
WHERE (PropID = 1 AND PropValID = 2)
OR (PropID = 2 AND PropValID = 3)
GROUP BY ProdID
HAVING COUNT(*) = 2
Better still, if PropValID is unique in the Values table, then you would remove the PropID column from the ProdPropVal table, and simplify the query to this:
SELECT ProdID
FROM ProdPropVal
WHERE PropValID IN (2, 3)
GROUP BY ProdID
HAVING COUNT(*) = 2
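The simplified query generalises to any number of wanted values by comparing the count with the length of the list. Below is a minimal sketch on Python's sqlite3 in place of MySQL, using the four sample rows from the question; `find_products` is a hypothetical helper, and it uses COUNT(DISTINCT PropValID) as a precaution in case a product/value pair were ever stored twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProdPropVal (
        InfoID INTEGER PRIMARY KEY,
        ProdID INTEGER, PropID INTEGER, PropValID INTEGER
    );
    -- Sample rows from the question: product 1 is red/m, product 2 is blue/xxl.
    INSERT INTO ProdPropVal VALUES (1,1,1,1),(2,1,2,3),(3,2,1,2),(4,2,2,5);
    CREATE INDEX idx_val_prod ON ProdPropVal (PropValID, ProdID);
""")

def find_products(conn, value_ids):
    """Return ProdIDs that have every PropValID in value_ids."""
    wanted = sorted(set(value_ids))
    if not wanted:
        return []  # an empty IN () list is not valid in MySQL
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT ProdID
        FROM ProdPropVal
        WHERE PropValID IN ({placeholders})
        GROUP BY ProdID
        HAVING COUNT(DISTINCT PropValID) = ?
        ORDER BY ProdID
    """
    return [row[0] for row in conn.execute(sql, wanted + [len(wanted)])]

red_medium = find_products(conn, [1, 3])    # red and m
blue_medium = find_products(conn, [2, 3])   # blue and m: nobody in the sample
blue_xxl = find_products(conn, [2, 5])      # blue and xxl
print(red_medium, blue_medium, blue_xxl)
```

An index leading with PropValID, as assumed here, lets the database read only the rows for the wanted values instead of scanning the whole table.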