MySQL function: rank table by most similar attributes - mysql

I have a table of product ids and keywords that looks like the following:
+------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| product_id | int(10) unsigned | YES | MUL | NULL | |
| keyword | varchar(255) | YES | | NULL | |
+------------+------------------+------+-----+---------+----------------+
This table simply stores product ids, and keywords associated with those products. So for example, it might contain:
+----+------------+---------+
| id | product_id | keyword |
+----+------------+---------+
| 1 | 1 | soft |
| 2 | 1 | red |
| 3 | 1 | leather |
| 4 | 2 | cloth |
| 5 | 2 | red |
| 6 | 2 | new |
| 7 | 3 | soft |
| 8 | 3 | red |
| 9 | 4 | blue |
+----+------------+---------+
In other words:
product 1 is soft, red, and leather.
product 2 is cloth, red, and new.
product 3 is red and soft.
product 4 is blue.
I need some way to take in a product ID and get back a sorted list of product ids ranked by the number of common keywords.
So for example, if I pass in product_id 1, I'd expect to get back:
+------------+---------+
| product_id | matches |
+------------+---------+
| 3 | 2 | (product 3 has two common keywords with product 1)
| 2 | 1 | (product 2 has one common keyword with product 1)
| 4 | 0 | (product 4 has no common keywords with product 1)
+------------+---------+

One option is a right outer self-join with conditional aggregation, which counts the keywords that every other product shares with, e.g., product ID 1:
SELECT t2.product_id,
       SUM(CASE WHEN t1.keyword IS NOT NULL THEN 1 ELSE 0 END) AS matches
FROM yourTable t1
RIGHT JOIN yourTable t2
    ON t1.keyword = t2.keyword AND
       t1.product_id = 1
WHERE t2.product_id <> 1
GROUP BY t2.product_id
ORDER BY matches DESC   -- most similar products first
Follow the link below for a running demo:
SQLFiddle

You need to use an outer join against the keywords for product_id 1:
select y.product_id, count(k.keyword) as matches
from yourtable y
left join (
    select keyword from yourtable where product_id = 1
) k on y.keyword = k.keyword
where y.product_id <> 1
group by y.product_id
order by matches desc
SQL Fiddle Demo
Results:
| product_id | matches |
|------------|---------|
| 3 | 2 |
| 2 | 1 |
| 4 | 0 |
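To try the outer-join idea without a MySQL server, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The table name product_keywords and the helper rank_similar are made up for illustration, and SQLite stands in for MySQL, so treat it as a sanity check rather than a drop-in solution.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_keywords ("
    "id INTEGER PRIMARY KEY, product_id INTEGER, keyword TEXT)"
)
sample = [
    (1, "soft"), (1, "red"), (1, "leather"),
    (2, "cloth"), (2, "red"), (2, "new"),
    (3, "soft"), (3, "red"),
    (4, "blue"),
]
conn.executemany(
    "INSERT INTO product_keywords (product_id, keyword) VALUES (?, ?)", sample
)


def rank_similar(conn, target_id):
    """Return (product_id, matches) for every other product, best match first."""
    sql = """
        SELECT other.product_id, COUNT(mine.keyword) AS matches
        FROM product_keywords AS other
        LEFT JOIN (
            SELECT keyword FROM product_keywords WHERE product_id = ?
        ) AS mine ON other.keyword = mine.keyword
        WHERE other.product_id <> ?
        GROUP BY other.product_id
        ORDER BY matches DESC, other.product_id
    """
    return conn.execute(sql, (target_id, target_id)).fetchall()


ranking = rank_similar(conn, 1)
print(ranking)  # [(3, 2), (2, 1), (4, 0)]
```

COUNT(mine.keyword) skips the NULLs produced by the outer join, which is why product 4 still shows up with 0 matches. The sketch assumes a keyword is stored at most once per product; duplicates would inflate the counts.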

Related

Joining multiple rows with same ID in one

I am having trouble with an SQL query. I have two tables.
My first table:
+------------+-------------+---------------+
| id_mission | Some column | Other column |
+------------+-------------+---------------+
| 1 | ... | ... |
| 2 | ... | ... |
+------------+-------------+---------------+
My second table:
+------------+-------------+---------+
| id_mission | id_category | points |
+------------+-------------+---------+
| 1 | 1 | 3 |
| 1 | 2 | 4 |
| 1 | 3 | 4 |
| 1 | 4 | 8 |
| 2 | 1 | -4 |
| 2 | 2 | 3 |
| 2 | 3 | 1 |
| 2 | 4 | -7 |
+------------+-------------+---------+
And I would like to get this kind of result from my SELECT query:
+------------+-------------+--------------+---------------+----------------+
| id_mission | Some column | Other column | id_category 1 | id_category X |
+------------+-------------+--------------+---------------+----------------+
| 1 | ... | ... | ... | ... |
| 2 | ... | ... | ... | ... |
+------------+-------------+--------------+---------------+----------------+
I have tried this for the first two category columns, but it doesn't work. I also tried GROUP_CONCAT; it works, but it's not the result I want.
SELECT m.id_mission ,mc.id_category 1,mc1.id_category 2
from mission m
left join mission_category mc on m.id_mission = mc.id_mission
left join mission_category mc1 on m.id_mission = mc1.id_mission
Can someone help me?
You can use conditional aggregation. Assuming that you want to pivot the points value per category:
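select
    t1.*,
    -- t1 is the first (mission) table, t2 the second (category/points) table;
    -- max() picks the single non-null points value for each category
    max(case when t2.id_category = 1 then t2.points end) as category_1,
    max(case when t2.id_category = 2 then t2.points end) as category_2,
    max(case when t2.id_category = 3 then t2.points end) as category_3
from t1
inner join t2 on t2.id_mission = t1.id_mission
group by t1.id_mission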
This assumes that id_mission is the primary key of t1 (else, you need to enumerate the columns you want in both the select and group by clauses).
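Since the question asks for "id_category X" columns, the conditional-aggregation pattern can also be generated from whatever categories exist. The following is only a sketch under assumptions: it uses Python's sqlite3 instead of MySQL, the table names mission and mission_category from the question, and a placeholder column label that stands in for "Some column".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mission (id_mission INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE mission_category (
        id_mission INTEGER, id_category INTEGER, points INTEGER
    );
    INSERT INTO mission VALUES (1, 'first'), (2, 'second');
    INSERT INTO mission_category VALUES
        (1, 1, 3), (1, 2, 4), (1, 3, 4), (1, 4, 8),
        (2, 1, -4), (2, 2, 3), (2, 3, 1), (2, 4, -7);
""")

# Discover the categories, then emit one MAX(CASE ...) column per category.
categories = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT id_category FROM mission_category ORDER BY id_category"
    )
]
pivot_columns = ",\n       ".join(
    # int() keeps the interpolated value a plain number, never raw text
    f"MAX(CASE WHEN mc.id_category = {int(c)} THEN mc.points END) "
    f"AS category_{int(c)}"
    for c in categories
)
pivot_sql = f"""
    SELECT m.id_mission, m.label,
       {pivot_columns}
    FROM mission AS m
    LEFT JOIN mission_category AS mc ON mc.id_mission = m.id_mission
    GROUP BY m.id_mission, m.label
    ORDER BY m.id_mission
"""
pivot = conn.execute(pivot_sql).fetchall()
print(pivot)  # [(1, 'first', 3, 4, 4, 8), (2, 'second', -4, 3, 1, -7)]
```

The LEFT JOIN is a deliberate choice here so that a mission without any category rows would still appear, with NULL in every category column.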

Merging two vertical tables onto one horizontal table

Table Definitions
Table 1 (horizontal) This is a table of users
| id | name | phone |
---------------------
| 1 | Bob | 800 |
| 2 | Phil | 800 |
Table 2 (Vertical Table) This is a table of teams
| id | name |
------------------
| 1 | Donkey |
| 2 | Cat |
Table 3 (Vertical Table) This table is connecting the first two
| id | user_id | team_id |
--------------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
My Goal
I would like to be able to query the data in such a way that I get the following back:
| id | name | phone | Donkey | Cat |
-------------------------------------
| 1 | Bob | 800 | 1 | 1 |
| 2 | Phil | 800 | 1 | Null |
This result would contain my horizontal table's data plus columns built from the other two tables: the team names from table 2 become the column headings, and the row values are derived from table 3 as booleans.
You're chasing a pivot table:
select u.*,
    sum(case when t1.name = 'Donkey' then 1 else 0 end) Donkey,
    sum(case when t1.name = 'Cat' then 1 else 0 end) Cat
from users u
inner join user_team ut1
    on u.id = ut1.user_id
inner join teams t1
    on ut1.team_id = t1.id
group by u.id, u.name, u.phone  -- qualified: both users and teams have a name column
demo: http://sqlfiddle.com/#!9/5fd33/7
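If you want NULL rather than 0 for teams a user is not on (as in the desired output), and want users without any team to stay in the result, a variation with LEFT JOIN and MAX may help. This is a hedged sketch run through Python's sqlite3 rather than MySQL; the third user, Ann, is hypothetical and only there to show the outer join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, phone TEXT);
    CREATE TABLE teams (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_team (id INTEGER PRIMARY KEY, user_id INTEGER, team_id INTEGER);
    INSERT INTO users VALUES (1, 'Bob', '800'), (2, 'Phil', '800');
    INSERT INTO users VALUES (3, 'Ann', '800');  -- hypothetical user with no team
    INSERT INTO teams VALUES (1, 'Donkey'), (2, 'Cat');
    INSERT INTO user_team VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
""")

pivot_sql = """
    SELECT u.id, u.name, u.phone,
           -- CASE without ELSE yields NULL, so MAX stays NULL for non-members
           MAX(CASE WHEN t.name = 'Donkey' THEN 1 END) AS Donkey,
           MAX(CASE WHEN t.name = 'Cat' THEN 1 END) AS Cat
    FROM users AS u
    LEFT JOIN user_team AS ut ON ut.user_id = u.id
    LEFT JOIN teams AS t ON t.id = ut.team_id
    GROUP BY u.id, u.name, u.phone
    ORDER BY u.id
"""
team_pivot = conn.execute(pivot_sql).fetchall()
for row in team_pivot:
    print(row)
# (1, 'Bob', '800', 1, 1)
# (2, 'Phil', '800', 1, None)
# (3, 'Ann', '800', None, None)
```

The team names are still hard-coded as column expressions; SQL needs the column list up front, so a truly dynamic set of teams would mean building the statement in application code.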

SQL sorting with multiple criteria

I have two tables.
Tab1:
+------------+
| id | title |
+------------+
| 1 | B |
| 2 | C |
| 3 | A |
| 4 | A |
| 5 | A |
| 6 | A |
| ... |
+------------+
Tab2:
+-------------------------------------------+
| id | item_id | item_key | item_value |
+-------------------------------------------+
| 1 | 1 | value | $4 |
| 2 | 1 | url | http://h.com/ |
| 3 | 2 | value | $5 |
| 4 | 3 | url | http://i.com/ |
| 5 | 3 | value | $1 |
| 6 | 3 | url | http://y.com/ |
| 7 | 4 | value | $2 |
| 8 | 4 | url | http://z.com/ |
| 9 | 5 | value | $1 |
| 10 | 5 | url | http://123.com/ |
| ... |
+-------------------------------------------+
item_id is a foreign key from tab1.
How do I get a table of ids from Tab1 ordered according to criteria from both tables? The criteria are the following:
Order ASC by title. If the title is the same,
Order DESC by value. If both title and value are the same,
Prioritize items whose 'url' key contains '123.com'.
The resulting table with the ordered results would be:
+------------+
| id | title |
+------------+
| 4 | A |
| 5 | A |
| 3 | A |
| 6 | A |
| 1 | B |
| 2 | C |
| ... |
+------------+
The results should include items that have one, both, or none of the fields from Tab2 set.
As far as I understand, a simple join will do it. You'll have to join Tab2 twice, since you want to order by values from two different rows. Use a LEFT JOIN so that items missing either key are still included:
SELECT Tab1.id, Tab1.title
FROM Tab1
LEFT JOIN Tab2 t2_val ON t2_val.item_id = Tab1.id AND t2_val.item_key='value'
LEFT JOIN Tab2 t2_url ON t2_url.item_id = Tab1.id AND t2_url.item_key='url'
ORDER BY Tab1.title,
    t2_val.item_value DESC,
    t2_url.item_value LIKE '%123.com%' DESC
An SQLfiddle to test with.
This is a little complicated, because a plain join returns multiple rows per item. Here is an approach that aggregates Tab2 before doing the join:
select t1.*
from Tab1 t1 left outer join
    (select item_id,
        max(case when item_key = 'value' then item_value end) as value,
        max(case when item_key = 'url' then item_value end) as url
    from Tab2 t2
    group by item_id
    ) t2
    on t1.id = t2.item_id
order by t1.title, t2.value desc,
    (t2.url like '%123.com%') desc;
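The aggregate-then-join idea can be checked against the sample data with a small script. This is a sketch only: it uses Python's sqlite3 in place of MySQL, assumes the prices always look like '$4' so the dollar sign can be stripped and the rest compared numerically, and replaces the url column with a has_123 flag so that an item with several urls is handled too. The names price and has_123 are mine, not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tab1 (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Tab2 (
        id INTEGER PRIMARY KEY, item_id INTEGER, item_key TEXT, item_value TEXT
    );
""")
conn.executemany(
    "INSERT INTO Tab1 VALUES (?, ?)",
    [(1, "B"), (2, "C"), (3, "A"), (4, "A"), (5, "A"), (6, "A")],
)
conn.executemany(
    "INSERT INTO Tab2 VALUES (?, ?, ?, ?)",
    [
        (1, 1, "value", "$4"), (2, 1, "url", "http://h.com/"),
        (3, 2, "value", "$5"),
        (4, 3, "url", "http://i.com/"), (5, 3, "value", "$1"),
        (6, 3, "url", "http://y.com/"),
        (7, 4, "value", "$2"), (8, 4, "url", "http://z.com/"),
        (9, 5, "value", "$1"), (10, 5, "url", "http://123.com/"),
    ],
)

order_sql = """
    SELECT t1.id, t1.title
    FROM Tab1 AS t1
    LEFT JOIN (
        SELECT item_id,
               -- assumption: values look like '$4', so strip '$' and compare as numbers
               MAX(CASE WHEN item_key = 'value'
                        THEN CAST(REPLACE(item_value, '$', '') AS REAL) END) AS price,
               -- 1 if any url of the item mentions 123.com, else 0
               MAX(CASE WHEN item_key = 'url' AND item_value LIKE '%123.com%'
                        THEN 1 ELSE 0 END) AS has_123
        FROM Tab2
        GROUP BY item_id
    ) AS t2 ON t2.item_id = t1.id
    ORDER BY t1.title ASC, t2.price DESC, COALESCE(t2.has_123, 0) DESC, t1.id
"""
ordered = conn.execute(order_sql).fetchall()
print([row[0] for row in ordered])  # [4, 5, 3, 6, 1, 2]
```

Item 6 has no rows in Tab2, so its price is NULL; both SQLite and MySQL place NULLs last in a DESC sort, which matches the expected order in the question.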

MySQL: Using NULL as wildcard in JOIN statement

I have a table containing fully defined items and a second table of potentially vague/greedy orders, where a NULL in a column means that any available value for that attribute is acceptable.
items
+-----------------------+
| item_id | color | size |
|---------+-------+------|
| 1 | blue | 8 |
| 2 | red | 6 |
| 3 | green | 7 |
| 4 | black | 6 |
+------------------------+
orders
+-------------------------+
| order_id | color | size |
|----------+-------+------|
| 1 | red | 6 |
| 2 | green | 8 |
| 3 | NULL | 6 |
| 4 | blue | NULL |
| 5 | NULL | NULL |
+-------------------------+
Is there an efficient way to generate a complete list of items needed to fill all orders?
+--------------------+
| order_id | item_id |
|----------+---------|
| 1 | 2 |
| 3 | 2 |
| 3 | 4 |
| 4 | 1 |
| 5 | 1 |
| 5 | 2 |
| 5 | 3 |
| 5 | 4 |
+--------------------+
It seems to me like an INNER JOIN should be able to do this, but something like this obviously doesn't consider the possibility of NULL values as greedy wildcards in the orders table:
SELECT order_id, item_id
FROM orders
INNER JOIN items ON orders.color = items.color AND orders.size = items.size
Any ideas?
Try the following:
SELECT order_id, item_id
FROM orders
INNER JOIN items ON (orders.color IS NULL OR orders.color = items.color)
AND (orders.size IS NULL OR orders.size = items.size)
Let me know if that helps, or if I misunderstood the question.
If you rewrite the conditions for the JOIN you can get the desired result:
SELECT order_id, item_id
FROM orders
JOIN items
ON ((orders.color = items.color OR orders.color IS NULL)
AND (orders.size = items.size OR orders.size IS NULL))
However, the orders table should probably look more like the result of this query than the current orders table.
You could use the IFNULL function for this:
SELECT order_id, item_id
FROM orders
JOIN items ON IFNULL(orders.color, items.color) = items.color
AND IFNULL(orders.size, items.size) = items.size
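If the value in orders is NULL, IFNULL falls back to the value from items, so the comparison is always true (and thus matches). This relies on the items columns themselves never being NULL.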
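To compare the OR-based condition with the IFNULL variant on the sample data, here is a short sketch using Python's sqlite3 (SQLite also provides IFNULL, but it is still a stand-in for MySQL here). The variable names are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, color TEXT, size INTEGER);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, color TEXT, size INTEGER);
""")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "blue", 8), (2, "red", 6), (3, "green", 7), (4, "black", 6)],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    # None is stored as NULL, i.e. "any value is fine"
    [(1, "red", 6), (2, "green", 8), (3, None, 6), (4, "blue", None), (5, None, None)],
)

or_sql = """
    SELECT o.order_id, i.item_id
    FROM orders AS o
    JOIN items AS i
      ON (o.color IS NULL OR o.color = i.color)
     AND (o.size IS NULL OR o.size = i.size)
    ORDER BY o.order_id, i.item_id
"""
ifnull_sql = """
    SELECT o.order_id, i.item_id
    FROM orders AS o
    JOIN items AS i
      ON IFNULL(o.color, i.color) = i.color
     AND IFNULL(o.size, i.size) = i.size
    ORDER BY o.order_id, i.item_id
"""
or_result = conn.execute(or_sql).fetchall()
ifnull_result = conn.execute(ifnull_sql).fetchall()
print(or_result)
# [(1, 2), (3, 2), (3, 4), (4, 1), (5, 1), (5, 2), (5, 3), (5, 4)]
print(or_result == ifnull_result)  # True
```

Order 2 (green, size 8) is absent because no item satisfies both conditions, which agrees with the expected list in the question.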

Querying across 6 tables, is there a better way of doing this?

I wanted each user to have their own "unique" numbering system. Instead of auto-incrementing the item number globally, Bob's first item starts at #1 and Alice's first item also starts at #1. The same goes for rooms and categories. I achieved this by creating "mapping" tables for items, rooms and categories.
The query below works, but I know it can definitely be refactored. I have primary keys in each table (on the "ids").
SELECT unique_item_id as item_id, item_name, category_name, item_value, room_name
FROM
users_items, users_map_item, users_room, users_map_room, users_category, users_map_category
WHERE
users_items.id = users_map_item.map_item_id AND
item_location = users_map_room.unique_room_id AND
users_map_room.map_room_id = users_room.room_id AND
users_map_room.map_user_id = 1 AND
item_category = users_map_category.unique_category_id AND
users_map_category.map_category_id = users_category.category_id AND
users_category.user_id = users_map_category.map_user_id AND
users_map_category.map_user_id = 1
ORDER BY item_name
users_items
| id | item_name | item_location |item_category |
--------------------------------------------------------
| 1 | item_a | 1 | 1 |
| 2 | item_b | 2 | 1 |
| 3 | item_c | 1 | 1 |
users_map_item
| map_item_id | map_user_id | unique_item_id |
----------------------------------------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
users_room
| id | room_name |
----------------------
| 1 | basement |
| 2 | kitchen |
| 3 | attic |
users_map_room
| map_room_id | map_user_id | unique_room_id |
----------------------------------------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
users_category
| id | category_name |
----------------------
| 1 | antiques |
| 2 | appliance |
| 3 | sporting goods |
users_map_category
| map_category_id | map_user_id | unique_category_id |
----------------------------------------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
Rewriting your query with explicit JOIN conditions makes it more readable (while doing the same).
SELECT mi.unique_item_id AS item_id
, i.item_name
, c.category_name
, i.item_value
, r.room_name
FROM users_map_item mi
JOIN users_items i ON i.id = mi.map_item_id
JOIN users_map_room mr ON mr.unique_room_id = i.item_location
JOIN users_room r ON r.room_id = mr.map_room_id
JOIN users_map_category mc ON mc.unique_category_id = i.item_category
JOIN users_category c ON (c.user_id, c.category_id)
= (mc.map_user_id, mc.map_category_id)
WHERE mr.map_user_id = 1
AND mc.map_user_id = 1
ORDER BY i.item_name
The result is unchanged. Query plan should be the same. I see no way to improve the query further.
You should use LEFT [OUTER] JOIN instead of [INNER] JOIN if you want to keep rows in the result for which no matching row is found in the right-hand table. In that case you will probably want to move the conditions on the right-hand tables from the WHERE clause into the JOIN conditions: left in WHERE, they discard the NULL-extended rows and effectively turn the LEFT JOIN back into an inner join.
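The difference between filtering in WHERE and filtering in the JOIN condition is easy to see on a cut-down version of the tables. The sketch below uses Python's sqlite3 rather than MySQL, only two of the six tables, and a hypothetical item_d stored in a room (9) that has no mapping row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users_items (id INTEGER PRIMARY KEY, item_name TEXT, item_location INTEGER);
    CREATE TABLE users_map_room (map_room_id INTEGER, map_user_id INTEGER, unique_room_id INTEGER);
    INSERT INTO users_items VALUES
        (1, 'item_a', 1), (2, 'item_b', 2), (3, 'item_c', 1),
        (4, 'item_d', 9);  -- hypothetical item whose room has no mapping row
    INSERT INTO users_map_room VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
""")

# User filter in WHERE: NULL-extended rows fail the test and disappear.
filter_in_where = conn.execute("""
    SELECT i.item_name, mr.map_room_id
    FROM users_items AS i
    LEFT JOIN users_map_room AS mr ON mr.unique_room_id = i.item_location
    WHERE mr.map_user_id = 1
    ORDER BY i.item_name
""").fetchall()

# User filter in the JOIN condition: unmatched items are kept with NULL.
filter_in_on = conn.execute("""
    SELECT i.item_name, mr.map_room_id
    FROM users_items AS i
    LEFT JOIN users_map_room AS mr
           ON mr.unique_room_id = i.item_location
          AND mr.map_user_id = 1
    ORDER BY i.item_name
""").fetchall()

print(filter_in_where)  # [('item_a', 1), ('item_b', 2), ('item_c', 1)]
print(filter_in_on)     # [('item_a', 1), ('item_b', 2), ('item_c', 1), ('item_d', None)]
```

Which behaviour is right depends on whether items without a mapped room should be listed at all; the original query, being a set of inner joins, silently drops them.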