Convert columns into rows - mysql

I have a table:
+--------------+-------+--------+----------+
| attribute_id | color | brand | category |
+--------------+-------+--------+----------+
| 1 | red | honda | cars |
| 2 | blue | bmw | cars |
| 3 | pink | skonda | vans |
+--------------+-------+--------+----------+
I would like to convert it to the following:
+--------------+---------+
| attribute_id | keyword |
+--------------+---------+
| 1 | red |
| 2 | blue |
| 3 | pink |
| 1 | honda |
| 2 | bmw |
| 3 | skonda |
| 1 | cars |
| 2 | cars |
| 3 | vans |
+--------------+---------+
The only way I can think of is to use UNIONs like this:
SELECT attribute_id, color from attributes
UNION ALL
SELECT attribute_id, brand from attributes
UNION ALL
SELECT attribute_id, category from attributes
The above way is a bit cumbersome, especially since my real use case will need to join multiple tables for each select.
Is there a simpler or less copy/paste way to write this?

A more efficient query (at least for large tables) is:
SELECT attribute_id,
(case when n = 1 then color
when n = 2 then brand
when n = 3 then category
end) as keyword
from attributes a cross join
(select 1 as n union all select 2 union all select 3) n;
The reason this is better than the union all query is performance. The union all will scan the original table three times. This will scan the original table once (and then loop through n). For a large table this could be a significant difference in performance.
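As a rough check that the two forms return the same rows, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL (both statements use only syntax the two engines share) and loads the sample data from the question.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attributes (attribute_id INTEGER, color TEXT, brand TEXT, category TEXT);
    INSERT INTO attributes VALUES
        (1, 'red',  'honda',  'cars'),
        (2, 'blue', 'bmw',    'cars'),
        (3, 'pink', 'skonda', 'vans');
""")

# Three passes over the table, one per column.
union_sql = """
    SELECT attribute_id, color AS keyword FROM attributes
    UNION ALL
    SELECT attribute_id, brand FROM attributes
    UNION ALL
    SELECT attribute_id, category FROM attributes
"""

# One pass over the table, each row repeated for n = 1, 2, 3.
cross_join_sql = """
    SELECT a.attribute_id,
           CASE n.n WHEN 1 THEN a.color
                    WHEN 2 THEN a.brand
                    WHEN 3 THEN a.category
           END AS keyword
    FROM attributes a
    CROSS JOIN (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3) n
"""

union_rows = sorted(conn.execute(union_sql).fetchall())
cross_rows = sorted(conn.execute(cross_join_sql).fetchall())
print(union_rows == cross_rows)  # same set of (attribute_id, keyword) pairs
print(len(cross_rows))           # 3 rows x 3 columns
```

Neither query guarantees row order without an ORDER BY, which is why the sketch sorts both results before comparing them.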

Related

MySQL table relationship and how to query a Key/value table

I have the following table structure:
Product (id, name, ...)
+-----+------------+
| id | name |
+-----+------------+
| 1 | Product #1 |
| 2 | Product #2 |
| 3 | Product #3 |
| 4 | Product #4 |
+-----+------------+
Attribute (id, title, ...)
+-----+------------+
| id | title |
+-----+------------+
| 1 | shape |
| 2 | colour |
| 3 | height |
| 4 | weight |
+-----+------------+
Option (id, title ... )
+-----+------------+
| id | title |
+-----+------------+
| 1 | round |
| 2 | square |
| 3 | oval |
| 4 | red |
| 5 | blue |
| 6 | green |
| 7 | tall |
| 8 | short |
| 9 | heavy |
| 10 | light |
+-----+------------+
and a fourth one (ProductAttribute - id, product_id, attribute_id, option_id), hoping to get "all the red round products which are also tall and heavy":
+-----+------------+-----------+--------+
| id | product | attribute | option |
+-----+------------+-----------+--------+
| 1 | Product #1 | shape | round |
| 2 | Product #2 | shape | oval |
| 3 | Product #3 | shape | round |
| 4 | Product #4 | shape | square |
| 5 | Product #1 | color | green |
| 6 | Product #2 | color | red |
| 7 | Product #3 | height | tall |
| 8 | Product #4 | height | short |
| 9 | Product #2 | weight | heavy |
| 10 | Product #1 | weight | light |
+-----+------------+-----------+--------+
I'm far from an SQL master, and maybe my idea can't work.
Edit:
Q1. The question is how do I achieve that? Getting all the red, tall, heavy products for instance.
The following queries don't achieve my purpose:
1:
SELECT ProductAttributes.product_id, ProductAttributes.id FROM ProductAttributes
WHERE (ProductAttributes.attribute_id = 1 AND ProductAttributes.option_id = 1)
AND (ProductAttributes.attribute_id = 3 AND ProductAttributes.option_id = 4);
2:
SELECT DISTINCT ProductAttributes.product_id, ProductAttributes.id FROM ProductAttributes
WHERE (ProductAttributes.attribute_id = 1 AND ProductAttributes.option_id = 1)
OR (ProductAttributes.attribute_id = 3 AND ProductAttributes.option_id = 4);
Note: I'm purposely putting only 2 conditions in my query, as the real one has many more.
Key/value tables are a nuisance. So avoid them, if you can. You'd have these tables then:
table shapes
+--------+
| shape |
+--------+
| round |
| oval |
| round |
| square |
+--------+
table colors
+--------+
| color |
+--------+
| green |
| red |
+--------+
table heights
+--------+
| height |
+--------+
| tall |
| short |
+--------+
table weights
+--------+
| weight |
+--------+
| heavy |
| light |
+--------+
table products
+-------------+--------------+--------+--------+--------+--------+
| product_no | product name | shape | color | height | weight |
+-------------+--------------+--------+--------+--------+--------+
| 14214 | Product #1 | round | red | tall | heavy |
| 22312 | Product #2 | oval | | short | heavy |
| 35757 | Product #3 | square | green | tall | heavy |
| 42468 | Product #4 | | red | short | light |
+-------------+--------------+--------+--------+--------+--------+
The query
select *
from products
where shape = 'round'
and color = 'red'
and height = 'tall'
and weight = 'heavy';
You can do the same with IDs by the way. So all lookup tables would get an ID (round = 1, oval = 2, ... green = 1, red = 2, ...) and the product table would no longer contain the words, but the IDs. The query would then be:
select *
from products
where shape_id = (select id from shapes where shape = 'round')
and color_id = (select id from colors where color = 'red')
and height_id = (select id from heights where height = 'tall')
and weight_id = (select id from weights where weight = 'heavy');
So you want to select products based on the option in the ProductAttribute table.
A better way to store the data is to use unique ID (primary key) values in the columns of the fourth table; then you can do this:
SELECT * FROM ProductAttribute as attr
INNER JOIN Product as product ON product.id=attr.product_id
INNER JOIN Attribute as attr2 ON attr2.id=attr.attribute_id
WHERE attr.option='round' OR attr.option='red'
I hope this helps!
For the key/value approach I'd use composite keys to improve consistency:
attribute (attribute_no, title), PK = attribute_no
+--------------+------------+
| attribute_no | title |
+--------------+------------+
| 1 | shape |
| 2 | colour |
| ... | ... |
+--------------+------------+
attribute_option (attribute_no, option_no, value), PK = attribute_no, option_no
+--------------+-----------+------------+
| attribute_no | option_no | value |
+--------------+-----------+------------+
| 1 | 1 | round |
| 1 | 2 | square |
| 2 | 1 | green |
| 2 | 2 | red |
| ... | ... | ... |
+--------------+-----------+------------+
product (product_no, product_name, ...), PK = product_no
+------------+--------------+
| product_no | product_name |
+------------+--------------+
| 7352871 | Product #1 |
| 8956443 | Product #2 |
| ... | ... |
+------------+--------------+
product_attributes (product_no, attribute_no, option_no), PK = product_no, attribute_no
+------------+--------------+-----------+
| product_no | attribute_no | option_no |
+------------+--------------+-----------+
| 7352871 | 1 | 1 |
| 7352871 | 2 | 1 |
| 8956443 | 1 | 2 |
| 8956443 | 2 | 1 |
+------------+--------------+-----------+
(And you'd want an index on attribute_no + option_no for this table.)
The product_attributes primary key guarantees that each product gets only one value per attribute. This is good for height, weight, etc. If you want to allow multiple colors etc. for a product, however, you need a product_attributes table that includes option_no in the primary key. You may end up with separate tables for unique attributes and multiple attributes. Maybe later you even want to introduce product groups with optional and obligatory attributes (a freezer has an energy class, a t-shirt doesn't). So this whole concept may grow, but the tables above should give you an idea of how best to approach this.
A query for all the red round products which are also tall and heavy:
select *
from product
where product_no in
(
select product_no
from product_attributes
where (attribute_no, option_no) =
(
select ao.attribute_no, ao.option_no
from attribute_option ao
join attribute a on a.attribute_no = ao.attribute_no
where a.title = 'colour'
and ao.value = 'red'
)
)
and product_no in
(
select product_no
from product_attributes
where (attribute_no, option_no) =
(
select ao.attribute_no, ao.option_no
from attribute_option ao
join attribute a on a.attribute_no = ao.attribute_no
where a.title = 'shape'
and ao.value = 'round'
)
)
and product_no in (...)
and product_no in (...);
Or shorter with aggregation:
select *
from product
where product_no in
(
select pa.product_no
from product_attributes pa
join attribute a on a.attribute_no = pa.attribute_no
join attribute_option ao on ao.attribute_no = pa.attribute_no
and ao.option_no = pa.option_no
group by pa.product_no
having sum(a.title = 'colour' and ao.value = 'red') > 0
and sum(a.title = 'shape' and ao.value = 'round') > 0
and sum(a.title = 'height' and ao.value = 'tall') > 0
and sum(a.title = 'weight' and ao.value = 'heavy') > 0
)
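To try the aggregation approach on a small data set, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL, trims the HAVING clause to two conditions, and uses hypothetical product numbers (100, 200, 300). The join to attribute_option is keyed on ao.attribute_no and ao.option_no.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attribute (attribute_no INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE attribute_option (
        attribute_no INTEGER, option_no INTEGER, value TEXT,
        PRIMARY KEY (attribute_no, option_no));
    CREATE TABLE product (product_no INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE product_attributes (
        product_no INTEGER, attribute_no INTEGER, option_no INTEGER,
        PRIMARY KEY (product_no, attribute_no));

    INSERT INTO attribute VALUES (1, 'shape'), (2, 'colour');
    INSERT INTO attribute_option VALUES
        (1, 1, 'round'), (1, 2, 'square'), (2, 1, 'green'), (2, 2, 'red');
    -- Hypothetical products: only product 100 is both red and round.
    INSERT INTO product VALUES
        (100, 'Product #1'), (200, 'Product #2'), (300, 'Product #3');
    INSERT INTO product_attributes VALUES
        (100, 1, 1), (100, 2, 2),
        (200, 1, 2), (200, 2, 2),
        (300, 1, 1), (300, 2, 1);
""")

# A comparison evaluates to 0 or 1, so SUM(...) counts matching rows per product.
rows = conn.execute("""
    SELECT p.product_name
    FROM product p
    WHERE p.product_no IN (
        SELECT pa.product_no
        FROM product_attributes pa
        JOIN attribute a ON a.attribute_no = pa.attribute_no
        JOIN attribute_option ao ON ao.attribute_no = pa.attribute_no
                                AND ao.option_no = pa.option_no
        GROUP BY pa.product_no
        HAVING SUM(a.title = 'colour' AND ao.value = 'red') > 0
           AND SUM(a.title = 'shape' AND ao.value = 'round') > 0
    )
""").fetchall()
print(rows)
```

Each extra filter only adds one more SUM(...) > 0 line to the HAVING clause, which is what keeps this form shorter than the nested IN version.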
After searching the web for "mysql key value table" (thank you @Thorsten Kettner for the keywords, as I lack the terminology), I've ended up with something like:
SELECT Product.id FROM Product
INNER JOIN ProductAttributes PA_1 ON
Product.id = PA_1.product_id
INNER JOIN ProductAttributes PA_2 ON
Product.id = PA_2.product_id
WHERE
(PA_1.attribute_id = 1 and PA_1.option_id = 1)
AND
(PA_2.attribute_id = 3 and PA_2.option_id = 4);
Basically, whenever a new attribute is used in the query, another INNER JOIN is needed.
In terms of performance, this will cause a rather noticeable hit.
According to this and this a key/value table should not be used for filtering, but at this point I have no choice, so it will be up to the caching server to save the day.
I've based my answer on this (no need for GROUP BY in my case as I don't use aggregate functions): Filtering and Grouping data from table with key/value pairs
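The difference between the single-row AND filter and the self-join can be seen on the question's own data. This is a minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL and assuming the ProductAttribute rows translate to the IDs shown in the Attribute and Option tables. It looks for round (attribute 1, option 1) and tall (attribute 3, option 7) products.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductAttributes (
        id INTEGER PRIMARY KEY, product_id INTEGER,
        attribute_id INTEGER, option_id INTEGER);
    -- The question's rows, written with Attribute and Option IDs.
    INSERT INTO ProductAttributes VALUES
        (1, 1, 1, 1), (2, 2, 1, 3), (3, 3, 1, 1), (4, 4, 1, 2),
        (5, 1, 2, 6), (6, 2, 2, 4), (7, 3, 3, 7), (8, 4, 3, 8),
        (9, 2, 4, 9), (10, 1, 4, 10);
""")

# round = (attribute 1, option 1), tall = (attribute 3, option 7)
params = (1, 1, 3, 7)

# No single row can hold two different attribute_id values at once.
single_row_and = conn.execute("""
    SELECT product_id FROM ProductAttributes
    WHERE (attribute_id = ? AND option_id = ?)
      AND (attribute_id = ? AND option_id = ?)
""", params).fetchall()

# Each alias reads a different row belonging to the same product.
self_join = conn.execute("""
    SELECT pa1.product_id
    FROM ProductAttributes pa1
    INNER JOIN ProductAttributes pa2 ON pa2.product_id = pa1.product_id
    WHERE (pa1.attribute_id = ? AND pa1.option_id = ?)
      AND (pa2.attribute_id = ? AND pa2.option_id = ?)
""", params).fetchall()

print(single_row_and)
print(self_join)
```

The first query comes back empty for any data, while the self-join finds Product #3, the only product that is both round and tall.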

Simplifying MySQL query - 2 queries into 1

I have a table that looks like this:
+----+--------+-------+
| id | entity | word |
+----+--------+-------+
| 1 | 1 | red |
| 2 | 1 | green |
| 3 | 1 | blue |
| 4 | 2 | car |
| 5 | 2 | truck |
| 6 | 2 | train |
| 7 | 3 | water |
| 8 | 3 | milk |
| 9 | 3 | soda |
+----+--------+-------+
If I do a search for blue I would like to get red, green and blue as an answer. Right now I am using 2 queries. One to find the 'entity' number and one to find all the words with the same 'entity' number.
Try this. A join is often faster than a subquery, though that depends on the MySQL version and the available indexes:
select distinct t2.word from TABLE_NAME t1
INNER JOIN TABLE_NAME t2 on t2.entity=t1.entity
where t1.word='blue';
SELECT *
FROM TABLE_NAME
WHERE entity IN
(SELECT entity
FROM TABLE_NAME
WHERE word='blue');
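Both single-query forms can be compared on the sample data. This is a minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL and a hypothetical table name, words, in place of the placeholder name used in the answers.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; "words" is a made-up table name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE words (id INTEGER PRIMARY KEY, entity INTEGER, word TEXT);
    INSERT INTO words (entity, word) VALUES
        (1, 'red'), (1, 'green'), (1, 'blue'),
        (2, 'car'), (2, 'truck'), (2, 'train'),
        (3, 'water'), (3, 'milk'), (3, 'soda');
""")

join_sql = """
    SELECT DISTINCT t2.word
    FROM words t1
    INNER JOIN words t2 ON t2.entity = t1.entity
    WHERE t1.word = ?
"""
subquery_sql = """
    SELECT word
    FROM words
    WHERE entity IN (SELECT entity FROM words WHERE word = ?)
"""

# Sort because neither query specifies an ORDER BY.
join_words = sorted(w for (w,) in conn.execute(join_sql, ("blue",)))
subquery_words = sorted(w for (w,) in conn.execute(subquery_sql, ("blue",)))
print(join_words)
print(subquery_words)
```

Which form runs faster depends on the engine and on whether entity and word are indexed, so it is worth checking with EXPLAIN on the real table.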

mysql select query on two tables

I am trying to get a result from two tables without having a second query nested inside the first query loop.
I have a table products:
product_code_1234 | product_name | otherfields...
And a table categories, where a product can have multiple categories:
category_name_1 | product_code_1234
category_name_2 | product_code_1234
category_name_3 | product_code_1234
Is there a query to get the following result?
product_code_1234 | product_name | ... | category_name_1 | category_name_2 | category_name_3
select * from a, b
will give you all data from table a, combined with all data from table b.
However, if you don't want repeated data and there is no connection between tables a and b, it can't be done without a union or similar.
Assume you have these tables:
+----------------------------+
| PRODUCTS |
+------+-------------+-------+
| code | name | price |
+------+-------------+-------+
| 1 | Bike Helmet | 99.99 |
| 2 | Shirt | 19.99 |
+------+-------------+-------+
+-------------------+
| CATEGORIES |
+------+------------+
| code | category |
+------+------------+
| 1 | Sports |
| 1 | Cycling |
| 1 | Protection |
| 2 | Men |
| 2 | Clothing |
+------+------------+
Here is a query based on another SO answer, that would match your desired result, if my interpretation of it is correct:
SELECT p.code, p.name, p.price,
(SELECT category FROM categories c WHERE c.code = p.code LIMIT 0, 1) as category1,
(SELECT category FROM categories c WHERE c.code = p.code LIMIT 1, 1) as category2,
(SELECT category FROM categories c WHERE c.code = p.code LIMIT 2, 1) as category3
FROM products p
This is the result:
+------+-------------+-------+-----------+-----------+------------+
| code | name | price | category1 | category2 | category3 |
+------+-------------+-------+-----------+-----------+------------+
| 1 | Bike Helmet | 99.99 | Sports | Cycling | Protection |
| 2 | Shirt | 19.99 | Men | Clothing | NULL |
+------+-------------+-------+-----------+-----------+------------+
It is not possible, though, to have a dynamic number of categories in the result. In this case I limited the number to 3, like in your question. There might still be a solution with better performance. Also, this is obviously a nested query and therefore probably not suited for your needs. Still, it's the best I could come up with.
JOIN
There is also the SQL JOIN clause, which might be what you are looking for:
SELECT *
FROM products
NATURAL JOIN categories
You would end up with this result:
+------+-------------+-------+------------+
| code | name | price | category |
+------+-------------+-------+------------+
| 1 | Bike Helmet | 99.99 | Sports |
| 1 | Bike Helmet | 99.99 | Cycling |
| 1 | Bike Helmet | 99.99 | Protection |
| 2 | Shirt | 19.99 | Men |
| 2 | Shirt | 19.99 | Clothing |
+------+-------------+-------+------------+
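If one row per product with a variable number of categories is acceptable as a single delimited column, an aggregate such as GROUP_CONCAT is another option. This is a different technique from the ones in this answer, shown as a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL and relies on the default comma separator, which both engines use.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (code INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE categories (code INTEGER, category TEXT);
    INSERT INTO products VALUES (1, 'Bike Helmet', 99.99), (2, 'Shirt', 19.99);
    INSERT INTO categories VALUES
        (1, 'Sports'), (1, 'Cycling'), (1, 'Protection'),
        (2, 'Men'), (2, 'Clothing');
""")

# LEFT JOIN keeps products that have no category (their list is NULL).
rows = conn.execute("""
    SELECT p.code, p.name, GROUP_CONCAT(c.category) AS categories
    FROM products p
    LEFT JOIN categories c ON c.code = p.code
    GROUP BY p.code, p.name
    ORDER BY p.code
""").fetchall()

# The order of values inside each concatenated string is not guaranteed,
# so split and sort before relying on it.
result = {name: sorted(cats.split(",")) if cats else [] for _, name, cats in rows}
print(result)
```

The trade-off is that the categories arrive as one string that the application has to split, and a category name containing a comma would need a different separator.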
I guess that you will have to do two separate queries.
One to retrieve products, one to retrieve the product categories.
Then use any scripting language (like PHP) to achieve what you want with the results (display, export, whatever).

MySQL Database - Performance Design

I'm currently redesigning a heavily loaded website, and I would appreciate any opinion about a specific database design issue.
The concept is to keep a number of products (500K of them) in the db.
Every product can have a number of dynamic properties (around 1K), and every property a number of predefined but dynamic values (let's say 10 on average for every property, so around 10K).
At this point of time this is the simplified db structure:
Products (Products Table)
+--------+--------------+
| ProdID | Product Name |
+--------+--------------+
| 1 | T-Shirt XYZ |
+--------+--------------+
| 2 | Dress ABC |
+--------+--------------+
| ... | ... |
+--------+--------------+
| 500000 | Something |
+--------+--------------+
Properties Definition (Props Table) (it holds the Property Types)
+--------+--------------+
| PropID | Property Name|
+--------+--------------+
| 1 | color |
+--------+--------------+
| 2 | size |
+--------+--------------+
| ... | ... |
+--------+--------------+
| 100 | Some Prop |
+--------+--------------+
Properties Values Definition (Values Table)
+-----------+--------+-------+
| PropValID | PropID | Value |
+-----------+--------+-------+
| 1 | 1 | red |
+-----------+--------+-------+
| 2 | 1 | blue |
+-----------+--------+-------+
| 3 | 2 | m |
+-----------+--------+-------+
| 4 | 2 | xl |
+-----------+--------+-------+
| 5 | 2 | xxl |
+-----------+--------+-------+
| ... | ... | ... |
+-----------+--------+-------+
| 1000 | 100 | xyz |
+-----------+--------+-------+
This way we can add any number of properties and values in any product.
The table below holds this info.
Product Properties & Values (ProdPropVal Table)
+--------+--------+--------+-----------+
| InfoID | ProdID | PropID | PropValID |
+--------+--------+--------+-----------+
| 1 | 1 | 1 | 1 |
+--------+--------+--------+-----------+
| 2 | 1 | 2 | 3 |
+--------+--------+--------+-----------+
| 3 | 2 | 1 | 2 |
+--------+--------+--------+-----------+
| 4 | 2 | 2 | 5 |
+--------+--------+--------+-----------+
| ... | ... | ... | |
+--------+--------+--------+-----------+
In the example above we know that "T-Shirt XYZ" is red and its size is medium.
And now the tricky part...
If we want to find all products that share a common set of property values (all products of blue color and medium size), which is the best approach?
My ideas:
Search one time the ProdPropVal Table for each PropValID and compare the results in code. This can be fine tuned by starting from the most rare PropValIDs and limiting ProdIDs using a WHERE ProdID IN (previous IDs) in the next queries.
Use an Inner Join in the ProdPropVal Table for each PropValID wanted. Something like:
SELECT ppv1.ProdID
FROM ProdPropVal ppv1
INNER JOIN ProdPropVal ppv2 ON ppv1.ProdID = ppv2.ProdID
INNER JOIN ProdPropVal ppv3 ON ppv1.ProdID = ppv3.ProdID
INNER JOIN ProdPropVal ppv4 ON ppv1.ProdID = ppv4.ProdID
WHERE ppv1.PropValID = 10 AND ppv2.PropValID = 20 AND ppv3.PropValID = 30 AND ppv4.PropValID = 150
These are my ideas so far. The fact that the ProdPropVal table has several million rows doesn't leave any room for error.
Any suggestion is most welcome!
To find all products with blue colour and medium size I would do this:
SELECT ProdID
FROM ProdPropVal
WHERE (PropID = 1 AND PropValID = 2)
OR (PropID = 2 AND PropValID = 3)
GROUP BY ProdID
HAVING COUNT(*) = 2
Better still, if PropValID is unique in the Values table, then you would remove the PropID column from the ProdPropVal table, and simplify the query to this:
SELECT ProdID
FROM ProdPropVal
WHERE PropValID IN (2, 3)
GROUP BY ProdID
HAVING COUNT(*) = 2
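The GROUP BY / HAVING pattern generalises to any number of wanted values. This is a minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL. The helper products_with_all and the extra product 3 are hypothetical, and COUNT(DISTINCT PropValID) is used instead of COUNT(*) as a guard in case the table ever holds duplicate rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProdPropVal (
        InfoID INTEGER PRIMARY KEY, ProdID INTEGER,
        PropID INTEGER, PropValID INTEGER);
    -- Rows 1-4 come from the question; rows 5-6 add a hypothetical
    -- product 3 that is blue (2) and medium (3).
    INSERT INTO ProdPropVal VALUES
        (1, 1, 1, 1), (2, 1, 2, 3),
        (3, 2, 1, 2), (4, 2, 2, 5),
        (5, 3, 1, 2), (6, 3, 2, 3);
""")

def products_with_all(conn, prop_val_ids):
    """Return the ProdIDs that carry every PropValID in prop_val_ids."""
    wanted = sorted(set(prop_val_ids))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT ProdID
        FROM ProdPropVal
        WHERE PropValID IN ({placeholders})
        GROUP BY ProdID
        HAVING COUNT(DISTINCT PropValID) = ?
        ORDER BY ProdID
    """
    return [pid for (pid,) in conn.execute(sql, (*wanted, len(wanted)))]

blue_and_medium = products_with_all(conn, [2, 3])
medium_only = products_with_all(conn, [3])
print(blue_and_medium)
print(medium_only)
```

Unlike the self-join form, the query text keeps the same shape as filters are added; only the IN list and the expected count change.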

MySQL: Select multiple rows containing values from one column

I'd like to find the car_id's of the cars that have 'FORD' AND 'SILVER' AND the user input value of '200' in the value column:
table_cars
+----+--------+----------+-----------+
| id | car_id | name | value |
+----+--------+----------+-----------+
| 1 | 1 | MAKE | FORD |
| 2 | 1 | CARLINE | FIESTA |
| 3 | 1 | COLOR | SILVER |
| 4 | 1 | TOPSPEED | 210KM/H |
| 5 | 2 | MAKE | FORD |
| 6 | 2 | CARLINE | FOCUS |
| 7 | 2 | COLOR | SILVER |
| 8 | 2 | TOPSPEED | 200KM/H |
| 9 | 3 | MAKE | HOLDEN |
| 10 | 3 | CARLINE | ASTRA |
| 11 | 3 | COLOR | WHITE |
| 12 | 3 | TOPSPEED | 212KM/H |
+----+--------+----------+-----------+
Which in this case should return only one car_id: car_id = 2.
What would be the way to go to create the SQL query for this?
What you have is a properties table. When you want to test multiple properties at once you need to join the table to itself:
SELECT c0.car_id
FROM table_cars AS c0
JOIN table_cars AS c1 ON c1.car_id=c0.car_id
JOIN table_cars AS c2 ON c2.car_id=c1.car_id
WHERE c0.name='MAKE' AND c0.value='FORD'
AND c1.name='COLOR' AND c1.value='SILVER'
AND c2.name='TOPSPEED' AND c2.value='200KM/H'
Having the surrogate id present in a properties table is questionable. It doesn't seem to be doing anything; each property isn't an entity of its own. Unless the id is required by some other element, I'd get rid of it and make car_id, name the primary key (a composite primary key).
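Since each extra property needs one more self-join, the statement can be generated from a list of criteria. This is a minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL; the helper find_cars is hypothetical. Only the alias names are formatted into the SQL text, and all values are passed as bound parameters.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_cars (
        id INTEGER PRIMARY KEY, car_id INTEGER, name TEXT, value TEXT);
    INSERT INTO table_cars (car_id, name, value) VALUES
        (1, 'MAKE', 'FORD'),   (1, 'CARLINE', 'FIESTA'),
        (1, 'COLOR', 'SILVER'), (1, 'TOPSPEED', '210KM/H'),
        (2, 'MAKE', 'FORD'),   (2, 'CARLINE', 'FOCUS'),
        (2, 'COLOR', 'SILVER'), (2, 'TOPSPEED', '200KM/H'),
        (3, 'MAKE', 'HOLDEN'), (3, 'CARLINE', 'ASTRA'),
        (3, 'COLOR', 'WHITE'),  (3, 'TOPSPEED', '212KM/H');
""")

def find_cars(conn, criteria):
    """Return car_ids matching every (name, value) pair, one self-join per pair."""
    if not criteria:
        raise ValueError("at least one (name, value) pair is required")
    joins, conditions, params = [], [], []
    for i, (name, value) in enumerate(criteria):
        if i > 0:
            # Every additional criterion reads another row of the same car.
            joins.append(f"JOIN table_cars AS c{i} ON c{i}.car_id = c0.car_id")
        conditions.append(f"(c{i}.name = ? AND c{i}.value = ?)")
        params.extend([name, value])
    sql = (
        "SELECT c0.car_id FROM table_cars AS c0 "
        + " ".join(joins)
        + " WHERE " + " AND ".join(conditions)
        + " ORDER BY c0.car_id"
    )
    return [car_id for (car_id,) in conn.execute(sql, params)]

silver_ford_200 = find_cars(
    conn, [("MAKE", "FORD"), ("COLOR", "SILVER"), ("TOPSPEED", "200KM/H")])
silver_fords = find_cars(conn, [("MAKE", "FORD"), ("COLOR", "SILVER")])
print(silver_ford_200)
print(silver_fords)
```

The sketch relies on each car having at most one row per property name; with duplicates, a car_id could appear more than once and a DISTINCT would be needed.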
I assume that every car needs to have variable parameters, otherwise you wouldn't have gone with a setup like this. It would be much easier if MAKE, CARLINE, COLOR, and TOPSPEED each had their own column.
Using the table you've provided, however, you need to use subqueries. http://dev.mysql.com/doc/refman/5.0/en/subqueries.html
The query should look something like this (untested):
SELECT DISTINCT car_id FROM table_cars
WHERE car_id IN (SELECT car_id FROM table_cars WHERE name='MAKE' AND value='FORD')
AND car_id IN (SELECT car_id FROM table_cars WHERE name='COLOR' AND value='SILVER')
AND car_id IN (SELECT car_id FROM table_cars WHERE name='TOPSPEED' AND value='200KM/H')