Find items with maximum matching attributes - mysql

Here is my table structure - table name "propAssign"
(productId is indexed; attributeName and attributeValue share a composite index)
productId attributeName attributeValue
1 Height 3
1 Weight 1
1 Class X1
1 Category C1
2 Height 2
2 Weight 2
2 Class X2
2 Category C1
3 Height 3
3 Weight 1
3 Class X1
3 Category C1
4 Height 4
4 Weight 5
4 Class X2
4 Category C3
What I want is a list of productId values, sorted by the number of matching attribute-value pairs, highest first. In the real table I use numeric IDs for attribute names and values; I've used text here for readability.
So if I want to find products matching productId=1, the query should look for the products with the most matches (Height=3, Weight=1, Class=X1 and Category=C1). There may not be any with a 100% match (all 4 matching), but if there are, they should come first; next come the products with any 3 attributes matching, then any 2, and so on.
I could add more indexes if required, but I'd rather not, since there are millions of rows. It's MariaDB v10, to be exact.
Desired result - If I try to find matching products for productId=1, it should return the following, in this order.
productId
-----------
3
2
Reason - 3 has all attributes matching with 1, 2 has some matches and 4 has no match.

You can use conditional aggregation to retrieve the productId's with the highest number of matches first.
select productId,
       count(case when attributeName = 'Height' and attributeValue = '3' then 1 end)
     + count(case when attributeName = 'Weight' and attributeValue = '1' then 1 end)
     + count(case when attributeName = 'Category' and attributeValue = 'C1' then 1 end) as match_count
from propAssign
group by productId
order by match_count desc
The query above returns every productId, even those with 0 matches. If you only want products with 1 or more matches, use the query below, which should be able to take advantage of your composite index:
select productId, count(*) as match_count
from propAssign
where (attributeName = 'Height' and attributeValue = '3')
   or (attributeName = 'Weight' and attributeValue = '1')
   or (attributeName = 'Category' and attributeValue = 'C1')
group by productId
order by match_count desc

Related

Postgres - How to search and aggregate from a JSON column

I have an asset_quantities table as below
id | asset_type | quantity | site_id | asset_ids_json
1 'Container' 3 1 [{"id":1,"make":"am1","model":"amo1"},{"id":2,"make":"am1","model":"amo2"},{"id":3,"make":"am3","model":"amo3"}]
2 'Cage' 3 1 [{"id":4,"make":"bm1","model":"bmo1"},{"id":5,"make":"bm2","model":"bmo2"},{"id":6,"make":"bm2","model":"cmo3"}]
3 'Crate' 3 1 [{"id":7,"make":"cm1","model":"cmo1"},{"id":8,"make":"cm1","model":"cmo1"},{"id":9,"make":"cm1","model":"cmo2"}]
I want to write a SQL query in Postgres that will give me the quantity count of each asset type for a given make or model.
E.g. If I wanted to fetch the quantity for each asset type where make='am1',
site_id | Container_qty | Cage_qty | Crate_qty
1 2 0 0
E.g. If I wanted to fetch the quantity for each asset type where make='cm1', the result set would look like
site_id | Container_qty | Cage_qty | Crate_qty
1 0 0 3
I have written the query below to pivot the values from the 'asset_type' rows into columns but can't figure out how to filter and aggregate the counts based on the attributes inside the field 'asset_ids_json'. It is safe to assume that the length of the json array inside asset_ids_json will always be the same as the value in the 'quantity' column.
select
aq.site_id,
sum(case when aq.asset_type = 'Container' then aq.quantity end) container_qty,
sum(case when aq.asset_type = 'Cage' then aq.quantity end) cage_qty ,
sum(case when aq.asset_type = 'Crate' then aq.quantity end) crate_qty
from asset_quantities aq
group by aq.site_id;
The crux of my question is how can I filter & aggregate results based on the attributes inside the json column 'asset_ids_json'. I'm using Postgres 9.4.
Step-by-step demo: db<>fiddle
SELECT
site_id,
SUM(case when asset_type = 'Container' then quantity end) container_qty,
SUM(case when asset_type = 'Cage' then quantity end) cage_qty ,
SUM(case when asset_type = 'Crate' then quantity end) crate_qty
FROM (
SELECT DISTINCT ON (id)
site_id,
asset_type,
quantity
FROM asset_quantities aq,
json_array_elements(asset_ids_json)
WHERE value ->> 'make' = 'cm1'
) s
GROUP BY site_id
To filter on the content of a JSON array in a WHERE clause, you have to expand the array. json_array_elements() creates one row for each element, which makes it possible to test for a certain value.
Because of this expansion, each original row is multiplied (three times here, because there are three elements in each array). Since you are only interested in the original site_id, asset_type and quantity values, which are simply copied into the new rows, you can eliminate the duplicates with DISTINCT. DISTINCT ON (id) keeps one row per id, so if two different rows contain JSON arrays with the same key/value, both rows are kept.
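Note that the query above sums the quantity column, so a row counts in full as soon as one element matches. The expected output in the question (Container_qty = 2 for make='am1') suggests counting the matching array elements instead. The following is a minimal sketch of that variant, run against SQLite through Python's sqlite3 module, where json_each() plays the role of Postgres's json_array_elements(). It assumes an SQLite build with the JSON functions enabled; the function name qty_by_make is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE asset_quantities ("
    "id INTEGER PRIMARY KEY, asset_type TEXT, quantity INTEGER, "
    "site_id INTEGER, asset_ids_json TEXT)"
)
conn.executemany(
    "INSERT INTO asset_quantities VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Container", 3, 1,
         '[{"id":1,"make":"am1","model":"amo1"},{"id":2,"make":"am1","model":"amo2"},'
         '{"id":3,"make":"am3","model":"amo3"}]'),
        (2, "Cage", 3, 1,
         '[{"id":4,"make":"bm1","model":"bmo1"},{"id":5,"make":"bm2","model":"bmo2"},'
         '{"id":6,"make":"bm2","model":"cmo3"}]'),
        (3, "Crate", 3, 1,
         '[{"id":7,"make":"cm1","model":"cmo1"},{"id":8,"make":"cm1","model":"cmo1"},'
         '{"id":9,"make":"cm1","model":"cmo2"}]'),
    ],
)


def qty_by_make(make):
    """Count matching JSON array elements per asset type (hypothetical helper)."""
    sql = """
        SELECT aq.site_id,
               SUM(CASE WHEN aq.asset_type = 'Container'
                         AND json_extract(e.value, '$.make') = :make THEN 1 ELSE 0 END),
               SUM(CASE WHEN aq.asset_type = 'Cage'
                         AND json_extract(e.value, '$.make') = :make THEN 1 ELSE 0 END),
               SUM(CASE WHEN aq.asset_type = 'Crate'
                         AND json_extract(e.value, '$.make') = :make THEN 1 ELSE 0 END)
        FROM asset_quantities AS aq,
             json_each(aq.asset_ids_json) AS e   -- one row per array element
        GROUP BY aq.site_id
        ORDER BY aq.site_id
    """
    return conn.execute(sql, {"make": make}).fetchall()


print(qty_by_make("am1"))  # [(1, 2, 0, 0)]
print(qty_by_make("cm1"))  # [(1, 0, 0, 3)]
```

Putting the make test inside the CASE expressions rather than in a WHERE clause keeps a site in the result even when an asset type has no matching elements, so those columns come out as 0.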

How to make COUNT not count(fieldname) NULL values?

I'm setting up requests for e-commerce backend.
I have 3 tables: product (with id as index), category (with id as index) and product_category, which binds first 2 tables, since one product can be in several categories and there can be several products in one category.
The request is to get a list of all categories, with the category name and the number of products in each category, including zero values (when a category contains no products). The last 2 columns of the results below show this.
Unfortunately, COUNT(fieldname) gives me 1 instead of 0.
Here's my SQL request:
SELECT product_category.id_category AS pr_cat_cat_id,
id_product AS pr_cat_pr_id,
product.name AS productname,
categories.id,
categories.name,
COUNT ('pr_cat_cat_id') AS quantity
FROM product
LEFT JOIN product_category ON product_category.id_product = product.id
RIGHT JOIN (SELECT * FROM category) AS categories
ON categories.id = product_category.id_category
GROUP BY name
ORDER BY id ASC
and get this result:
pr_cat_cat_id pr_cat_pr_id productname id name quantity
1 1 Product "Name1" 1 Category 1 2
2 3 Product "Name 3" 2 Category 2 2
NULL NULL NULL 3 Category 3 1
NULL NULL NULL 4 Category 4 1
NULL NULL NULL 5 Category 5 1
NULL NULL NULL 6 Category 6 1
I do expect quantity to be zero on categories without products.
You are counting a constant string value, which is never NULL. Use quotes correctly. You don't need them here:
COUNT(product_category.id_category) AS quantity
You cannot use an alias for the COUNT(). You need to refer to the original column.
Note that your query is malformed. The only things in the select should be name and the aggregation functions.
Column aliases from the same select list can't be referenced. Instead use its original column name:
COUNT(product_category.id_category)
Note: Single quotes are for string literals, and those are never null.
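To make the difference concrete, here is a small sketch that runs both forms side by side in SQLite via Python's sqlite3 module. The table and column names follow the question; the three categories and three links are invented sample data, and the aliases literal_count and column_count are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_category (id_product INTEGER, id_category INTEGER);
    INSERT INTO category VALUES (1, 'Category 1'), (2, 'Category 2'), (3, 'Category 3');
    -- Category 3 deliberately has no products.
    INSERT INTO product_category VALUES (1, 1), (2, 1), (3, 2);
""")

sql = """
    SELECT c.name,
           COUNT('pr_cat_cat_id')   AS literal_count,  -- counts a constant: never NULL
           COUNT(pc.id_category)    AS column_count    -- skips NULLs from the LEFT JOIN
    FROM category AS c
    LEFT JOIN product_category AS pc ON pc.id_category = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
"""
result = conn.execute(sql).fetchall()
for row in result:
    print(row)
# ('Category 1', 2, 2)
# ('Category 2', 1, 1)
# ('Category 3', 1, 0)
```

The empty category still produces one joined row (with NULLs on the product_category side), which is why counting a string literal reports 1 while counting the column reports 0.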

SELECT on same table for Multiple key value pairs

I have a table like
id keyword_id value category_id asset_id
1 2 abc.jpg 4424 479
2 3 Jpeg 4424 479
3 4 400*600 4424 479
4 2 def.jpg 4424 603
5 3 Jpeg 4424 603
6 4 500*700 4424 603
I want to fetch values depending on multiple pairs like (keyword id = 3 and value like '%Jpeg%') And (keyword id = 2 and value like '%abc%').
This should return only one value with asset_id 479 because it meets both the criteria.
I am running a query like
SELECT DISTINCT asset_id FROM asset_keyword_table where category_id = 4424
AND (( keyword_id = 2 AND value LIKE '%abc%') AND ( keyword_id = 3 AND
value LIKE '%Jpeg%'));
But running EXPLAIN on this query returns "Impossible WHERE clause".
How can I get this working?
This query is generated by backend code, so there can be many blocks like this -
( keyword_id = 2 AND value LIKE '%abc%')
depending on user input. Whether the blocks are joined by AND or OR is also determined by the user. Using aliases is not possible because there is no limit on the number of blocks.
Can anyone help?
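No single row can satisfy both blocks at once, which is why the WHERE clause is impossible. Instead, match rows against any of the conditions with OR, group by asset, and keep only the assets whose number of matching rows equals the number of conditions.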
SELECT Asset_ID
FROM asset_keyword_table
WHERE category_id = 4424
AND
(( keyword_id = 2 AND value LIKE '%abc%')
OR (keyword_id = 3 AND value LIKE '%Jpeg%'))
GROUP BY Asset_ID
HAVING COUNT(*) = 2 -- number of rows that matched the condition
Here's a Demo.
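Since the question says the blocks are generated by backend code, the OR-plus-HAVING pattern can be built from a list of pairs. Below is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The function name find_assets is hypothetical, and the sketch assumes every pair uses a different keyword_id, which is why it compares COUNT(DISTINCT keyword_id) with the number of pairs rather than COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE asset_keyword_table ("
    "id INTEGER PRIMARY KEY, keyword_id INTEGER, value TEXT, "
    "category_id INTEGER, asset_id INTEGER)"
)
conn.executemany(
    "INSERT INTO asset_keyword_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2, "abc.jpg", 4424, 479),
        (2, 3, "Jpeg", 4424, 479),
        (3, 4, "400*600", 4424, 479),
        (4, 2, "def.jpg", 4424, 603),
        (5, 3, "Jpeg", 4424, 603),
        (6, 4, "500*700", 4424, 603),
    ],
)


def find_assets(category_id, pairs):
    """Return asset_ids matching ALL (keyword_id, LIKE pattern) pairs.

    Assumes each pair has a distinct keyword_id (illustrative helper).
    """
    # One "(keyword_id = ? AND value LIKE ?)" block per pair, joined with OR.
    blocks = " OR ".join(["(keyword_id = ? AND value LIKE ?)"] * len(pairs))
    sql = f"""
        SELECT asset_id
        FROM asset_keyword_table
        WHERE category_id = ? AND ({blocks})
        GROUP BY asset_id
        HAVING COUNT(DISTINCT keyword_id) = ?   -- every pair must have matched
        ORDER BY asset_id
    """
    params = [category_id]
    for keyword_id, pattern in pairs:
        params += [keyword_id, pattern]
    params.append(len(pairs))
    return [row[0] for row in conn.execute(sql, params)]


print(find_assets(4424, [(2, "%abc%"), (3, "%Jpeg%")]))  # [479]
```

Only the placeholders are interpolated into the SQL text; the user-supplied values travel as bound parameters.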
Nested queries should give you the desired result:
SELECT DISTINCT asset_id FROM asset_keyword_table WHERE
( category_id = 4424 AND keyword_id = 2 AND value LIKE '%abc%' )
AND asset_id IN
( SELECT DISTINCT asset_id FROM asset_keyword_table WHERE
category_id = 4424 AND keyword_id = 3 AND value LIKE '%Jpeg%' )
/* OR asset_id IN
( SELECT DISTINCT asset_id FROM asset_keyword_table WHERE
category_id = 4424 AND keyword_id = 4 AND value LIKE '%500%' ) */
SQL Fiddle

Conditional condition in ON clause

I am trying to apply a conditional condition inside ON clause of a LEFT JOIN. What I am trying to achieve is somewhat like this:
Pseudo Code
SELECT * FROM item AS i
LEFT JOIN sales AS s ON i.sku = s.item_no
AND (some condition)
AND (
IF (s.type = 0 AND s.code = 'me')
ELSEIF (s.type = 1 AND s.code = 'my-group')
ELSEIF (s.type = 2)
)
I want the query to return the row, if it matches any one of the conditions (Edit: and if it matches one, should omit the rest for the same item).
Sample Data
Sales
item_no | type | code | price
1 0 me 10
1 1 my-group 12
1 2 14
2 1 my-group 20
2 2 22
3 2 30
4 0 not-me 40
I want the query to return
item_no | type | code | price
1 0 me 10
2 1 my-group 20
3 2 30
Edit: The sales table is used to apply special prices for individual users, user groups, and/or all users.
if type = 0, code contains username. (for a single user)
if type = 1, code contains user-group. (for users in a group)
if type = 2, code contains empty-string (for all users).
Use the following SQL (assuming the sales table has a unique id field, as is usual in Yii):
SELECT * FROM item AS i
LEFT JOIN sales AS s ON i.sku = s.item_no
  AND s.id = (
    SELECT s2.id FROM sales AS s2
    WHERE s2.item_no = i.sku
      AND ((s2.type = 0 AND s2.code = 'me') OR
           (s2.type = 1 AND s2.code = 'my-group') OR
           s2.type = 2)
    ORDER BY s2.type
    LIMIT 1
  )
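As a quick check of the correlated-subquery approach, the sketch below loads the sample data from the question into SQLite through Python's sqlite3 module. The id values on sales and the contents of the item table are assumptions added for the example, as is the function name best_prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (sku INTEGER PRIMARY KEY);
    CREATE TABLE sales (id INTEGER PRIMARY KEY, item_no INTEGER,
                        type INTEGER, code TEXT, price INTEGER);
    INSERT INTO item VALUES (1), (2), (3), (4);
    INSERT INTO sales (item_no, type, code, price) VALUES
        (1, 0, 'me', 10), (1, 1, 'my-group', 12), (1, 2, '', 14),
        (2, 1, 'my-group', 20), (2, 2, '', 22),
        (3, 2, '', 30),
        (4, 0, 'not-me', 40);
""")


def best_prices(user, group):
    """Pick one sales row per item: user price, else group price, else general price."""
    sql = """
        SELECT i.sku, s.type, s.code, s.price
        FROM item AS i
        LEFT JOIN sales AS s
          ON s.item_no = i.sku
         AND s.id = (
             SELECT s2.id
             FROM sales AS s2
             WHERE s2.item_no = i.sku
               AND ((s2.type = 0 AND s2.code = :user)
                 OR (s2.type = 1 AND s2.code = :grp)
                 OR  s2.type = 2)
             ORDER BY s2.type      -- lowest type wins: 0, then 1, then 2
             LIMIT 1
         )
        ORDER BY i.sku
    """
    return conn.execute(sql, {"user": user, "grp": group}).fetchall()


for row in best_prices("me", "my-group"):
    print(row)
# (1, 0, 'me', 10)
# (2, 1, 'my-group', 20)
# (3, 2, '', 30)
# (4, None, None, None)
```

Item 4 has no applicable sales row, so the LEFT JOIN returns it with NULLs; switching to an inner join would drop it and reproduce the three-row result shown in the question.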
Try the following -
SELECT *,
  SUBSTRING_INDEX(GROUP_CONCAT(s.type ORDER BY s.type), ',', 1) AS `type`,
  SUBSTRING_INDEX(GROUP_CONCAT(s.code ORDER BY s.type), ',', 1) AS `code`,
  SUBSTRING_INDEX(GROUP_CONCAT(s.price ORDER BY s.type), ',', 1) AS `price`
FROM item AS i
LEFT JOIN sales AS s
ON i.sku = s.item_no AND (SOME CONDITION)
GROUP BY i.sku

Return set(s) ID only if all items from that set meet a certain criteria

I have looked at similar questions, but I can't seem to wrap my head around how the answers work in order to apply them to my case.
I have sets of articles (set_table)
ID SET ID ART
1 1
1 4
2 1
2 4
3 2
1 3
Those articles have a table with their parent ID. (article_table)
ID ART ID PARENT
1 1
2 3
3 2
4 1
Then those parents have a condition they have to meet, but it could be multiple (parent_table):
PARENT ID GROUP ID
1 6
2 15
3 12
Meaning, I have to select all sets whose articles (all of them) are in GROUP 6, then the result should be ID SET: 2. Or I could need to select all sets whose articles (all of them) are in GROUPS 6 and 15, then the result should be ID SET: 1. Or I could need to select all sets whose articles (all of them) are in GROUPS 6, 12; then the result should be NULL.
I have tried:
SELECT parent_id
FROM parent_table
WHERE group_id IN (6,15)
GROUP BY parent_id
HAVING COUNT(DISTINCT group_id) = 2; -- Number of group ids
Which is cool, but I don't manage to filter the sets correctly, my attempts in selecting the set are not working.
The query below is not so painful once you start writing it. Just join the three tables together, and then use conditional aggregation to count the number of desired groups present in each ID_SET.
The following query finds the ID_SET values which contain both group 6 and group 12. Note that this will return an empty result set for the sample data you gave in your original question. The DISTINCT subquery is needed to remove duplicate group values, which would otherwise throw off the conditional aggregation.
SELECT t.ID_SET,
SUM(CASE WHEN t.GROUP_ID IN (6, 12) THEN 1 ELSE 0 END) AS groupCount
FROM
(
SELECT DISTINCT s.ID_SET, p.GROUP_ID
FROM set_table s
INNER JOIN article_table a
ON s.ID_ART = a.ID_ART
INNER JOIN parent_table p
ON a.ID_PARENT = p.PARENT_ID
) t
GROUP BY t.ID_SET
HAVING groupCount = 2 -- change 2 to however many group values you want to match
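The question also requires that all articles of a set fall inside the requested groups. One way to express that, sketched here in SQLite through Python's sqlite3 module, is to add a second HAVING condition that rejects any set containing an article outside the list. The function name sets_matching is hypothetical, and the sketch assumes each parent belongs to a single group, as in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE set_table (id_set INTEGER, id_art INTEGER);
    CREATE TABLE article_table (id_art INTEGER, id_parent INTEGER);
    CREATE TABLE parent_table (parent_id INTEGER, group_id INTEGER);
    INSERT INTO set_table VALUES (1, 1), (1, 4), (2, 1), (2, 4), (3, 2), (1, 3);
    INSERT INTO article_table VALUES (1, 1), (2, 3), (3, 2), (4, 1);
    INSERT INTO parent_table VALUES (1, 6), (2, 15), (3, 12);
""")


def sets_matching(groups):
    """Return sets whose articles cover exactly the given groups (illustrative)."""
    placeholders = ", ".join("?" * len(groups))
    sql = f"""
        SELECT s.id_set
        FROM set_table AS s
        JOIN article_table AS a ON a.id_art = s.id_art
        JOIN parent_table AS p ON p.parent_id = a.id_parent
        GROUP BY s.id_set
        HAVING SUM(CASE WHEN p.group_id NOT IN ({placeholders}) THEN 1 ELSE 0 END) = 0
           AND COUNT(DISTINCT p.group_id) = ?   -- every requested group is present
        ORDER BY s.id_set
    """
    return [row[0] for row in conn.execute(sql, (*groups, len(groups)))]


print(sets_matching([6]))      # [2]
print(sets_matching([6, 15]))  # [1]
print(sets_matching([6, 12]))  # []
```

The three calls reproduce the three expected outcomes listed in the question.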
Use the query below:
select s.ID_ART
from set_table s
join article_table a on s.ID_ART = a.ID_ART
join parent_table p on a.ID_PARENT = p.PARENT_ID and p.GROUP_ID in (6, 12)