Faceted search using MySQL

I'm currently trying to build a MySQL (version 8) query to get a list of filters with an article count. I know I can use Elasticsearch to achieve the desired result, but the requirement is to use MySQL.
Data
DB Fiddle
Query
SELECT sf.name, sff.title, sff.key, COUNT(DISTINCT sfa.id) AS articles_count
FROM shop_filters AS sf
INNER JOIN shop_filter_facets AS sff ON sff.filter_id = sf.id
LEFT JOIN shop_facetables AS sfa ON (
sfa.facet_id = sff.id AND sfa.facetable_id IN (
SELECT sfa.facetable_id
FROM shop_filter_facets AS sff
INNER JOIN shop_facetables AS sfa ON sfa.facet_id = sff.id
INNER JOIN shop_filters AS sf ON sf.id = sff.filter_id
GROUP BY sfa.facetable_id
HAVING (
sf.name = 'filter_1' AND MAX(sff.key = 1252884110) = 1
OR MAX(sff.key = 1741157870) = 1
)
)
)
GROUP BY sf.name, sff.title, sff.key
Output
As you can see, the other filter_1 items have a count of 0. They should display a count higher than zero. What am I missing in the query above?
Expected output
An example of how the faceted search should behave:

You just needed to change an AND to an OR in the HAVING clause:
SELECT sf.name, sff.title, sff.key, COUNT(DISTINCT sfa.id) AS articles_count
FROM shop_filters AS sf
INNER JOIN shop_filter_facets AS sff ON sff.filter_id = sf.id
LEFT JOIN shop_facetables AS sfa ON (
sfa.facet_id = sff.id AND sfa.facetable_id IN (
SELECT sfa.facetable_id
FROM shop_filter_facets AS sff
INNER JOIN shop_facetables AS sfa ON sfa.facet_id = sff.id
INNER JOIN shop_filters AS sf ON sf.id = sff.filter_id
GROUP BY sfa.facetable_id
HAVING (
sf.name = 'filter_1' OR MAX(sff.key = 1252884110) = 1
OR MAX(sff.key = 1741157870) = 1
)
)
)
GROUP BY sf.name, sff.title, sff.key;
The Results:
| name | title | key | articles_count |
| -------- | --------- | ---------- | -------------- |
| filter_1 | Facet 1-A | 1741157870 | 5 |
| filter_1 | Facet 1-B | 9401707597 | 4 |
| filter_1 | Facet 1-C | 8395537669 | 27 |
| filter_1 | Facet 1-D | 1252884110 | 18 |
| filter_1 | Facet 1-E | 885500301 | 1 |
| filter_2 | Facet 2-A | 5454540233 | 4 |
| filter_2 | Facet 2-B | 2418516648 | 3 |
| filter_2 | Facet 2-C | 2808696733 | 4 |
| filter_2 | Facet 2-D | 8692535611 | 5 |
| filter_2 | Facet 2-E | 6389292333 | 0 |
| filter_2 | Facet 2-F | 5107586138 | 4 |
| filter_2 | Facet 2-G | 9464620325 | 3 |
| filter_2 | Facet 2-H | 1166556565 | 0 |
| filter_2 | Facet 2-I | 2739765054 | 0 |
| filter_3 | Facet 3-A | 1112385648 | 23 |
| filter_4 | Facet 4-A | 2883255908 | 2 |
| filter_4 | Facet 4-B | 1507996583 | 3 |
| filter_4 | Facet 4-C | 7632658109 | 3 |
| filter_4 | Facet 4-D | 2990697496 | 2 |
| filter_5 | Facet 5-A | 2051629771 | 16 |
| filter_5 | Facet 5-B | 6620949318 | 6 |
| filter_5 | Facet 5-C | 8962757449 | 2 |
| filter_5 | Facet 5-D | 2020077129 | 2 |
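The AND-to-OR change matters because AND binds more tightly than OR, so the original HAVING clause was grouped as `(sf.name = 'filter_1' AND MAX(...) = 1) OR MAX(...) = 1`. The following is a minimal sketch of that precedence rule only, using Python's built-in `sqlite3` as a stand-in for MySQL (both give AND higher precedence than OR). The `evaluate` helper is a hypothetical name introduced for illustration.

```python
import sqlite3

# In-memory SQLite database, used only to evaluate boolean expressions.
conn = sqlite3.connect(":memory:")

def evaluate(expr):
    """Evaluate a constant SQL expression and return its single value."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]

# Without parentheses, AND is applied before OR.
unparenthesized = evaluate("0 AND 0 OR 1")
explicit_and_first = evaluate("(0 AND 0) OR 1")
explicit_or_first = evaluate("0 AND (0 OR 1)")

print(unparenthesized, explicit_and_first, explicit_or_first)
```

The first two expressions agree with each other and differ from the third, which is why adding explicit parentheses to a mixed AND/OR condition is usually worth the extra characters.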

Related

How to group by columns in a different table

I am trying to write a query that returns the sum of totalRxCount grouped by zipcode.
I have two tables named fact2 and demographic.
My problem is that the demographic table contains duplicate rows, which inflate the sum of totalRxCount.
To avoid the duplicates, I want to return only results where npiNum is distinct.
Right now I have this working, but it groups by relId (the primary key).
I cannot figure out how to group by zipcode, since that column and totalRxCount are in separate tables.
When I try, I get wrong results because the duplicate rows are counted.
Here is my query. I want to modify it to return results grouped by zipcode instead of relId.
Any input will be greatly appreciated!
SELECT fact2.relID
, SUM(fact2.`totalRxCount`)
FROM fact2
LEFT JOIN (
SELECT O1.relId, COUNT(DISTINCT O1.npiNum)
FROM demographic As O1
GROUP BY O1.relId
) AS d1
ON d1.`relId` = fact2.relID
LEFT JOIN (
SELECT O2.relID, Sum(O2.totalRxCount)
FROM fact2 AS O2
GROUP BY O2.relID
) AS p1
ON p1.relID = d1.relId
WHERE (monthEndDate BETWEEN 201911 AND 202010)
GROUP BY fact2.relID;
Results:
+-------+---------------------------+
| relID | SUM(fact2.totalRxCount) |
+-------+---------------------------+
| 2465 | 2 |
+-------+---------------------------+
What I've tried
SELECT zipcode, SUM(fact2.`totalRxCount`)
FROM fact2
INNER JOIN demographic ON demographic.relId=fact2.relID
LEFT JOIN (
SELECT O1.`relId`, COUNT(DISTINCT O1.`npiNum`)
FROM demographic As O1
GROUP BY O1.`relId`
) AS d1
ON d1.`relId` = fact2.`relID`
LEFT JOIN (
SELECT O2.`relID`, Sum(O2.`totalRxCount`)
FROM fact2 AS O2
GROUP BY O2.`relID`
) AS p1
ON p1.`relID` = d1.`relId`
WHERE (`monthEndDate` BETWEEN 201911 AND 202010)
GROUP BY zipcode;
This returns the sum multiplied by the number of duplicate rows in demographic.
Results:
+---------+---------------------------+
| zipcode | SUM(fact2.`totalRxCount`) |
+---------+---------------------------+
| 66097 | 4 |
+---------+---------------------------+
^ This should be 2
demographic table:
+-------+---------+------------+------------+-----------+------------+------------------------------------+-------+----------+----------+-----------------+------------+-------+--------------+---------+----------+-----------+--------+-------------+--------+--------+----------------+
| relId | zipcode | providerId | writerType | firstName | middleName | lastName | title | specCode | specDesc | address | city | state | amaNoContact | pdrpInd | pdrpDate | deaNum | amaNum | amaCheckDig | npiNum | terrId | callStatusCode |
+-------+---------+------------+------------+-----------+------------+------------------------------------+-------+----------+----------+-----------------+------------+-------+--------------+---------+----------+-----------+--------+-------------+--------+--------+----------------+
| 2465 | 66097 | | A | | | JEFFERSON COUNTY MEMORIAL HOSPITAL | | | | 408 DELAWARE ST | WINCHESTER | KS | | | | AJ4281096 | | | | 11604 | |
| 2465 | 66097 | | A | | | JEFFERSON COUNTY MEMORIAL HOSPITAL | | | | 408 DELAWARE ST | WINCHESTER | KS | | | | AJ4281096 | | | | 11604 | |
+-------+---------+------------+------------+-----------+------------+------------------------------------+-------+----------+----------+-----------------+------------+-------+--------------+---------+----------+-----------+--------+-------------+--------+--------+----------------+
fact2
+-------+----------+-----------------+-----------+-------------------+----------+------------+------------+--------+------------+--------------+------------+---------------+--------------+-----------+--------------+-------------+-----------+--------------+-------------+
| relID | marketId | marketName | productID | productName | dataType | providerId | writerType | planId | pmtTypeInd | monthEndDate | newRxCount | refillRxCount | totalRxCount | newRxQuan | refillRxQuan | totalRxQuan | newRxCost | refillRxCost | totalRxCost |
+-------+----------+-----------------+-----------+-------------------+----------+------------+------------+--------+------------+--------------+------------+---------------+--------------+-----------+--------------+-------------+-----------+--------------+-------------+
| 2465 | 10871 | GALT PP MONTHLY | 1399451 | ZOLPIDEM TARTRATE | 15 | | A | 900145 | C | 202004 | 1 | 0 | 1 | 30 | 0 | 30 | 139 | 0 | 139 |
| 2465 | 10871 | GALT PP MONTHLY | 1399458 | ESZOPICLONE | 15 | | A | 900145 | C | 202006 | 1 | 0 | 1 | 30 | 0 | 30 | 350 | 0 | 350 |
+-------+----------+-----------------+-----------+-------------------+----------+------------+------------+--------+------------+--------------+------------+---------------+--------------+-----------+--------------+-------------+-----------+--------------+-------------+
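One possible way to avoid the doubled sum is to de-duplicate demographic before joining it to fact2. The sketch below reproduces the sample rows in an in-memory SQLite database (a stand-in for MySQL) with only the columns the query needs. It assumes each relId maps to a single zipcode, which holds for the sample data but is not confirmed for the full table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE demographic (relId INTEGER, zipcode TEXT, npiNum TEXT);
CREATE TABLE fact2 (relID INTEGER, monthEndDate INTEGER, totalRxCount INTEGER);
-- Two identical demographic rows, as in the sample data.
INSERT INTO demographic VALUES (2465, '66097', ''), (2465, '66097', '');
INSERT INTO fact2 VALUES (2465, 202004, 1), (2465, 202006, 1);
""")

# Joining the raw table repeats every fact2 row once per duplicate.
naive = conn.execute("""
SELECT d.zipcode, SUM(f.totalRxCount)
FROM fact2 AS f
JOIN demographic AS d ON d.relId = f.relID
WHERE f.monthEndDate BETWEEN 201911 AND 202010
GROUP BY d.zipcode
""").fetchall()

# Joining a de-duplicated derived table keeps one row per (relId, zipcode).
deduplicated = conn.execute("""
SELECT d.zipcode, SUM(f.totalRxCount)
FROM fact2 AS f
JOIN (SELECT DISTINCT relId, zipcode FROM demographic) AS d
  ON d.relId = f.relID
WHERE f.monthEndDate BETWEEN 201911 AND 202010
GROUP BY d.zipcode
""").fetchall()

print(naive, deduplicated)
```

If duplicates must be defined by npiNum instead, the derived table's column list would need to change accordingly.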

Join 3 Tables in a MySql Query

I have these 3 tables in a MySQL database.
users
+----+------+--------+------+
| Id | Name | Status | Role |
+----+------+--------+------+
| 1 | A | Aktiv | Op |
| 2 | B | Aktiv | Op |
| 3 | C | Aktiv | Op |
| 4 | D | Aktiv | Op |
+----+------+--------+------+
cnt
+----+------+------------+------+
| Id | Name | Date | Type |
+----+------+------------+------+
| 1 | A | 2017-11-09 | Web |
| 2 | B | 2017-11-09 | Web |
| 3 | C | 2017-11-09 | Web |
| 4 | C | 2017-11-09 | Inb |
| 5 | A | 2017-11-09 | Web |
+----+------+------------+------+
Lead
+----+------+------------------+------------+
| Id | Name | Date | Type |
+----+------+------------------+------------+
| 1 | A | 2017-11-09 00:24 | Imported |
| 2 | B | 2017-11-09 09:32 | Activation |
| 3 | B | 2017-11-09 10:56 | Activation |
| 4 | D | 2017-11-09 12:21 | Activation |
| 5 | D | 2017-11-10 12:22 | Activation |
+----+------+------------------+------------+
I'm trying to join them into one result table, but without success. The query I'm using is:
SELECT IFNULL(u.Name,'Total') as "Name",
Sum(IF(c.Type = 'Web',1,0)) as "Cnt Web",
Sum(IF(l.Type = 'Activation',1,0)) as "Lead Web"
FROM users u
LEFT JOIN cnt c ON u.Name = c.Name and c.Date = '2017-11-09'
LEFT JOIN lead l ON u.Name = l.Name and l.Date>= '2017-11-09' AND l.Date< '2017-11-10'
WHERE u.Status = 'Aktiv' AND u.Role = 'Op'
GROUP BY u.Name WITH ROLLUP
The result I need is a table like this:
+----+------+---------+----------+
| Id | Name | Cnt Web | Lead Web |
+----+------+---------+----------+
| 1 | A | 2 | 0 |
| 2 | B | 1 | 2 |
| 3 | C | 1 | 0 |
| 4 | D | 0 | 1 |
+----+------+---------+----------+
When I join the first table with the second, or the first with the third, I get the correct result, but I can't get the needed result when I join all three.
Any answer is most welcome. Thank you in advance.
Here's a solution using correlated sub-queries:
SELECT u.Id,
u.Name,
(SELECT COUNT(Name) FROM cnt WHERE Name = u.name AND type = 'Web' AND Date = '2017-11-09') AS cnt_web,
(SELECT COUNT(Name) FROM lead WHERE Name = u.name AND type = 'Activation' AND Date>= '2017-11-09' AND Date< '2017-11-10') AS cnt_lead
FROM users u
WHERE u.Status = 'Aktiv' AND u.Role = 'Op'
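The double LEFT JOIN fails because each cnt row for a user is paired with each lead row for that user, so counts get multiplied. The sketch below loads the sample data into an in-memory SQLite database (a stand-in for MySQL) and runs both the joined query and the correlated sub-query version. The table name lead is quoted as a precaution, IF() is rewritten as CASE, and WITH ROLLUP is omitted because SQLite does not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (Id INTEGER, Name TEXT, Status TEXT, Role TEXT);
CREATE TABLE cnt (Id INTEGER, Name TEXT, Date TEXT, Type TEXT);
CREATE TABLE "lead" (Id INTEGER, Name TEXT, Date TEXT, Type TEXT);
INSERT INTO users VALUES (1,'A','Aktiv','Op'), (2,'B','Aktiv','Op'),
                         (3,'C','Aktiv','Op'), (4,'D','Aktiv','Op');
INSERT INTO cnt VALUES (1,'A','2017-11-09','Web'), (2,'B','2017-11-09','Web'),
                       (3,'C','2017-11-09','Web'), (4,'C','2017-11-09','Inb'),
                       (5,'A','2017-11-09','Web');
INSERT INTO "lead" VALUES (1,'A','2017-11-09 00:24','Imported'),
                          (2,'B','2017-11-09 09:32','Activation'),
                          (3,'B','2017-11-09 10:56','Activation'),
                          (4,'D','2017-11-09 12:21','Activation'),
                          (5,'D','2017-11-10 12:22','Activation');
""")

# Joined version: user B has 1 cnt row and 2 lead rows, giving 2 joined rows.
joined = {
    name: (cnt_web, lead_web)
    for name, cnt_web, lead_web in conn.execute("""
    SELECT u.Name,
           SUM(CASE WHEN c.Type = 'Web' THEN 1 ELSE 0 END),
           SUM(CASE WHEN l.Type = 'Activation' THEN 1 ELSE 0 END)
    FROM users u
    LEFT JOIN cnt c ON u.Name = c.Name AND c.Date = '2017-11-09'
    LEFT JOIN "lead" l ON u.Name = l.Name
         AND l.Date >= '2017-11-09' AND l.Date < '2017-11-10'
    WHERE u.Status = 'Aktiv' AND u.Role = 'Op'
    GROUP BY u.Name
    """)
}

# Correlated sub-queries count each table independently per user.
correlated = conn.execute("""
SELECT u.Id, u.Name,
       (SELECT COUNT(Name) FROM cnt
         WHERE Name = u.Name AND Type = 'Web' AND Date = '2017-11-09'),
       (SELECT COUNT(Name) FROM "lead"
         WHERE Name = u.Name AND Type = 'Activation'
           AND Date >= '2017-11-09' AND Date < '2017-11-10')
FROM users u
WHERE u.Status = 'Aktiv' AND u.Role = 'Op'
ORDER BY u.Id
""").fetchall()

print(joined["B"], correlated)
```

In the joined result only user B is wrong here, because B is the only user with rows in both cnt and lead where one side has more than one match.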

Left join select using Propel ORM

I have 3 tables.
major table:
+----+------------+
| id | major |
+----+------------+
| 1 | Computer |
| 2 | Architect |
| 3 | Designer |
+----+------------+
classroom table:
+----+----------+-------+
| id | major_id | name |
+----+----------+-------+
| 1 | 1 | A |
| 2 | 1 | B |
| 3 | 1 | C |
| 4 | 2 | A |
| 5 | 2 | B |
| 6 | 3 | A |
+----+----------+-------+
and finally, student_classroom table
+----+------------+--------------+----------+
| id | student | classroom_id | status |
+----+------------+--------------+----------+
| 1 | John | 1 | Inactive |
| 2 | Defou | 2 | Active |
| 3 | John | 2 | Active |
| 4 | Alexa | 1 | Active |
| 5 | Nina | 1 | Active |
+----+------------+--------------+----------+
How can I use Propel to build the query below?
select
a.id,
a.major,
b.number_of_student,
c.number_of_classroom
from major a
left join (
select
major.id as major_id,
count(student_classroom.id) as number_of_student
from major
left join classroom on classroom.major_id = major.id
left join student_classroom on student_classroom.classroom_id = classroom.id
where student_classroom.`status` = 'Active'
group by major.id
) b on b.major_id = a.id
left join (
select
major.id as major_id,
count(classroom.id) as number_of_classroom
from major
left join classroom on classroom.major_id = major.id
group by major.id
) c on c.major_id = a.id
I want the final result to look like this. I spent hours trying to figure it out without success.
+----+------------+-------------------+---------------------+
| id | major | number_of_student | number_of_classroom |
+----+------------+-------------------+---------------------+
| 1 | Computer | 4 | 3 |
| 2 | Architect | 0 | 2 |
| 3 | Designer | 0 | 1 |
+----+------------+-------------------+---------------------+
Try this:
select
m.id,
m.major,
count(distinct s.id) as number_of_student ,
count(distinct c.id) as number_of_classroom
from major m
left join classroom c on
(m.id = c.major_id)
left join student_classroom s
on (s.classroom_id = c.id and c.major_id = m.id and s.status = 'active')
group by m.id
order by m.id
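To check the query above against the sample tables, here is a small sketch using Python's built-in `sqlite3` as a stand-in for MySQL. It does not use Propel. One assumed adjustment: the status literal is written as 'Active' because SQLite compares strings case-sensitively by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE major (id INTEGER, major TEXT);
CREATE TABLE classroom (id INTEGER, major_id INTEGER, name TEXT);
CREATE TABLE student_classroom (id INTEGER, student TEXT,
                                classroom_id INTEGER, status TEXT);
INSERT INTO major VALUES (1,'Computer'), (2,'Architect'), (3,'Designer');
INSERT INTO classroom VALUES (1,1,'A'), (2,1,'B'), (3,1,'C'),
                             (4,2,'A'), (5,2,'B'), (6,3,'A');
INSERT INTO student_classroom VALUES (1,'John',1,'Inactive'),
                                     (2,'Defou',2,'Active'),
                                     (3,'John',2,'Active'),
                                     (4,'Alexa',1,'Active'),
                                     (5,'Nina',1,'Active');
""")

# COUNT(DISTINCT ...) undoes the row multiplication caused by the two joins.
# The status filter sits in the ON clause so majors without active
# students still appear with a count of 0.
rows = conn.execute("""
SELECT m.id,
       m.major,
       COUNT(DISTINCT s.id) AS number_of_student,
       COUNT(DISTINCT c.id) AS number_of_classroom
FROM major m
LEFT JOIN classroom c ON m.id = c.major_id
LEFT JOIN student_classroom s
       ON s.classroom_id = c.id AND s.status = 'Active'
GROUP BY m.id
ORDER BY m.id
""").fetchall()

print(rows)
```

Note that number_of_student counts active student_classroom rows, so a student enrolled in two classrooms of the same major is counted twice.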

How to sub-sort a LEFT JOIN in a MySQL query?

How can I build a query that sorts the results alphabetically by vl_name (from vldescription) and, for each result, joins only the most recent row from vllinks (ordered by vlk_addeddate, newest first, with an internal limit of 1)?
SELECT
aa.vl_id, aa.vl_name, aa.vl_code, aa.vl_vcc, aa.vl_description,
bb.vlk_id, bb.vlk_vlid, bb.vlk_link, bb.vlk_platform, bb.vlk_location, bb.vlk_addeddate,
le.vop_vlid, le.vop_thumbnail,
cx.vcc_id, cx.vcc_type, cx.vcc_brand, cx.vcc_variant
FROM vldescription AS aa
LEFT JOIN vllinks AS bb ON aa.vl_id = bb.vlk_vlid
LEFT JOIN vlofferphotos AS le ON le.vop_vlid = aa.vl_id
LEFT JOIN vlcarcats AS cx ON cx.vcc_id = aa.vl_vcc
WHERE aa.vl_vcc = '$change_me_if_you_need'
GROUP BY vl_id
ORDER BY vl_name
Table vlcarcats (vcc_
vcc_id | vcc_type | vcc_brand | vcc_variant
1 | OpenPlace | SomeCorp1 | website
2 | ForPrive | SomeCorp2 | other way
Table vldescription
vl_id | vl_name | vl_code | vl_vcc | vl_description
1 | OpTECC | xDAOcm | 1023 | text, text,...
2 | NewCop | d9MMo2 | 42 | more text,...
Table vllinks (vlk_vlid == vl_id)
vlk_id | vlk_vlid | vlk_link | vlk_platform | vlk_location | vlk_addeddate
1 | 1 | http://... | 1 | USA | 2014-01-10
2 | 2 | http://... | 1 | UK | 2014-01-12
3 | 2 | ftp://... | 2 | UK | 2014-01-15
4 | 2 | ftp://... | 2 | India | 2014-01-19
5 | 1 | ftp://... | 2 | Austria | 2014-01-22
Table vlofferphotos (vop_vlid == vl_id)
vop_vlid | vop_thumbnail
1 | abcdefg.jpg
2 | hijklmn.jpg
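One common way to get "the latest row per group" is to restrict the join to the row whose date equals the group's maximum date. The sketch below is not from the original thread. It uses an in-memory SQLite database as a stand-in for MySQL, keeps only the columns needed from vldescription and vllinks, and leaves out vlofferphotos and vlcarcats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vldescription (vl_id INTEGER, vl_name TEXT);
CREATE TABLE vllinks (vlk_id INTEGER, vlk_vlid INTEGER,
                      vlk_location TEXT, vlk_addeddate TEXT);
INSERT INTO vldescription VALUES (1, 'OpTECC'), (2, 'NewCop');
INSERT INTO vllinks VALUES (1, 1, 'USA', '2014-01-10'),
                           (2, 2, 'UK', '2014-01-12'),
                           (3, 2, 'UK', '2014-01-15'),
                           (4, 2, 'India', '2014-01-19'),
                           (5, 1, 'Austria', '2014-01-22');
""")

# For each description, keep only the link with the newest vlk_addeddate,
# then sort the outer result alphabetically by vl_name.
rows = conn.execute("""
SELECT aa.vl_name, bb.vlk_id, bb.vlk_location, bb.vlk_addeddate
FROM vldescription AS aa
LEFT JOIN vllinks AS bb
       ON bb.vlk_vlid = aa.vl_id
      AND bb.vlk_addeddate = (SELECT MAX(b2.vlk_addeddate)
                              FROM vllinks AS b2
                              WHERE b2.vlk_vlid = aa.vl_id)
ORDER BY aa.vl_name
""").fetchall()

print(rows)
```

If two links for the same vl_id share the newest date, this returns both, so a tie-breaker such as the highest vlk_id would be needed to guarantee exactly one row.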

SQL update after choosing a MAX from some AVERAGEs

I have 2 tables with the same columns but different data. I need to compute the average of a column in one table (with some filters), choose the MAX of those averages, and then put that value in the 2nd table.
So far I've built this query:
UPDATE st16
INNER JOIN st17 ON st17.parent = st16.uid
SET
st16.p1 = SELECT MAX(
(SELECT AVG(st17.p1) FROM st17 WHERE st17.parent = st16.uid AND st17.row = st16.row)),
st16.p2 = SELECT MAX(
(SELECT AVG(st17.p2) FROM st17 WHERE st17.parent = st16.uid AND st17.row = st16.row))
but I get this error: "#1111 - Invalid use of group function".
Any ideas? Thanks!
Sample data (st17 first, st16 below):
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----+
| uid | parent | fen | p1 | p2 | row |
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----+
| ee95b564f2b3fa1573b451d8f4e00f5d | bc5ef0d66b3bde08b0ba35a91412c058 | QS7D8D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/6D5D3D2DKSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD | -10.481481481481481 | 10.481481481481481 | 1 |
| 691ed545dd5375cb3e75f0b8d032534b | bc5ef0d66b3bde08b0ba35a91412c058 | QS7D6D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/5D3D2DKSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D | -10.481481481481481 | 10.481481481481481 | 1 |
| b6e2a3f4ea51c8e6638a2cc657bf3511 | bc5ef0d66b3bde08b0ba35a91412c058 | QS7D5D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/3D2DKSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D | -10.481481481481481 | 10.481481481481481 | 1 |
| 0dbe5038d01e457e4f65415ac081d0dd | bc5ef0d66b3bde08b0ba35a91412c058 | QS7D3D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/2DKSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D | -10.481481481481481 | 10.481481481481481 | 1 |
| ca1e85058ed8294d60a9922d36f8c1fa | bc5ef0d66b3bde08b0ba35a91412c058 | QS7D2D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/KSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D3D | -10.481481481481481 | 10.481481481481481 | 1 |
| e85179f395ba8e441ff7b1544e05404c | c75eb9315dee4e3b42fb52e8cd509910 | QS7DJS/4H9HQH4D4S/6H8HTHJHKH/4CKS/9S9D7CJC9C/6C8CQCKCAC/TS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D3D2DAS | -9.703703703703704 | 9.703703703703704 | 1 |
| eb3c352febe8ff25f375032bbb6cc5d7 | c75eb9315dee4e3b42fb52e8cd509910 | QS7DTS/4H9HQH4D4S/6H8HTHJHKH/4CKS/9S9D7CJC9C/6C8CQCKCAC/8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D3D2DASJS | -9.703703703703704 | 9.703703703703704 | 1 |
| 69f06801edf9b3cf669df56dc9152271 | c75eb9315dee4e3b42fb52e8cd509910 | QS7D8S/4H9HQH4D4S/6H8HTHJHKH/4CKS/9S9D7CJC9C/6C8CQCKCAC/7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D3D2DASJSTS | -9.703703703703704 | 9.703703703703704 | 1 |
| 5f78082dd3aee8b51bf096286df5e4e7 | c75eb9315dee4e3b42fb52e8cd509910 | QS7D5H/4H9HQH4D4S/6H8HTHJHKH/4CKS/9S9D7CJC9C/6C8CQCKCAC/3H2HTC5C3C2CADKDQDJDTD8D6D5D3D2DASJSTS8S7S6S5S3S2SAH7H | -9.703703703703704 | 9.703703703703704 | 1 |
| 7ee50e8aa1afd3af703b3a5b3cdf3cf8 | c75eb9315dee4e3b42fb52e8cd509910 | QS7D3H/4H9HQH4D4S/6H8HTHJHKH/4CKS/9S9D7CJC9C/6C8CQCKCAC/2HTC5C3C2CADKDQDJDTD8D6D5D3D2DASJSTS8S7S6S5S3S2SAH7H5H | -9.703703703703704 | 9.703703703703704 | 1 |
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----+
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+----+----+-----+
| uid | parent | fen | p1 | p2 | row |
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+----+----+-----+
| bc5ef0d66b3bde08b0ba35a91412c058 | 9e123e356e468b847d4493cf55809fcd | QS7D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/KSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D3D2D | 0 | 0 | 1 |
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+----+----+-----+
As Gordon Linoff mentioned, you can't pass a subquery to an aggregate function such as MAX(). Another problem that you will likely run into: you cannot select from the table you are updating in MySQL. So something like this
UPDATE st16
SET
st16.p1 = (SELECT AVG(st17.p1) FROM st17 JOIN st16 ON st17.parent = st16.uid WHERE st17.row = st16.row ORDER BY AVG(st17.p1) DESC LIMIT 1),
st16.p2 = (SELECT AVG(st17.p2) FROM st17 JOIN st16 ON st17.parent = st16.uid WHERE st17.row = st16.row ORDER BY AVG(st17.p2) DESC LIMIT 1);
will not work, unfortunately. You might just want to break this into multiple queries; that is, retrieve the maximum averages first in a SELECT, then ship those results in a second, separate UPDATE.
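The two-step approach could look roughly like the sketch below, written with Python's built-in `sqlite3` as a stand-in for MySQL. It rests on assumptions not confirmed by the question: that "MAX of the averages" means the largest per-row average among a parent's children, and that st16 rows are matched by uid. The uids and numbers are made-up illustrative values, not the sample data. Aggregates are nested by computing the averages in a derived table and taking MAX over it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE st16 (uid TEXT, "row" INTEGER, p1 REAL, p2 REAL);
CREATE TABLE st17 (uid TEXT, parent TEXT, "row" INTEGER, p1 REAL, p2 REAL);
-- Hypothetical data: one parent with two groups of children.
INSERT INTO st16 VALUES ('parent_a', 1, 0, 0);
INSERT INTO st17 VALUES ('c1', 'parent_a', 1, 1.0, -1.0),
                        ('c2', 'parent_a', 1, 3.0, -3.0),
                        ('c3', 'parent_a', 2, 4.0, -4.0),
                        ('c4', 'parent_a', 2, 6.0, -6.0);
""")

# Step 1: the inner query averages per (parent, row); the outer query
# takes the maximum of those averages per parent.
max_averages = conn.execute("""
SELECT parent, MAX(avg_p1), MAX(avg_p2)
FROM (
    SELECT parent, "row", AVG(p1) AS avg_p1, AVG(p2) AS avg_p2
    FROM st17
    GROUP BY parent, "row"
) AS per_row
GROUP BY parent
""").fetchall()

# Step 2: a separate UPDATE writes the values back, so the statement
# never selects from the table it is updating.
conn.executemany(
    "UPDATE st16 SET p1 = ?, p2 = ? WHERE uid = ?",
    [(p1, p2, parent) for parent, p1, p2 in max_averages],
)

updated = conn.execute("SELECT uid, p1, p2 FROM st16").fetchall()
print(max_averages, updated)
```

If the maximum should be taken over a different grouping, only the GROUP BY clauses in step 1 would need to change.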