Joining a variable number of tables based on column value - mysql

I have a few models in my code which are modeled in my MySQL database with this structure:
Properties
+----+------+---------+
| id | name | address |
|----+------+---------|
| 1 | p1 | 123 st |
| 2 | p2 | 123 st |
| 3 | p3 | 123 st |
+----+------+---------+
Tenants (belongs to property)
+----+-------------+-------+
| id | property_id | suite |
|----+-------------+-------|
| 1 | 1 | s1 |
| 2 | 1 | s2 |
| 3 | 2 | s3 |
+----+-------------+-------+
Costs (can belong to property or tenants)
+----+--------------+-----------+--------------+
| id | parent_model | parent_id | name |
|----+--------------+-----------+--------------+
| 1 | property | 1 | gardening |
| 2 | property | 2 | construction |
| 3 | tenant | 1 | renovation |
+----+--------------+-----------+--------------+
Files (can belong to any model)
+----+--------------+-----------+--------------+
| id | parent_model | parent_id | name |
|----+--------------+-----------+--------------+
| 1 | property | 1 | file1.jpg |
| 2 | tenant | 2 | file2.pdf |
| 3 | costs | 3 | file3.doc |
+----+--------------+-----------+--------------+
As you can see from the table structure all models can be linked back to a property record (either directly or via one or more intermediary tables).
In my code I want to write one query that will get the property.id of a file.
After looking over this question: "Joining different tables based on column value", I realized that finding a "link" from a file to a property can be done via a few joins.
The number of joins needed differs depending on the parent_model. For file.id = 1 it's a matter of joining in the properties table. For file.id = 3 we must join in costs, tenants, and properties.
How should a query be written that can get a property.id for all of my files records?
Edit:
This would be a sample output:
+---------+-------------+
| file_id | property_id |
|---------+-------------|
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
+---------+-------------+
In this case all the files worked out to be associated with property_id 1 but this may not always be the case.

I don't think there's a shortcut. You need to traverse all the paths, something along these lines:
select -- files attached directly to a property
Files.id as file_id, Files.parent_id as property_id
from Files
where Files.parent_model='property'
union
select -- files attached to property costs
Files.id, c.parent_id
from
Files inner join Costs c on Files.parent_id=c.id and c.parent_model='property'
where Files.parent_model='costs'
union
select -- files attached to tenant costs
Files.id, t.property_id
from
Files inner join Costs c on Files.parent_id=c.id and c.parent_model='tenant'
inner join Tenants t on t.id=c.parent_id
where Files.parent_model='costs'
.... etc.
i.e. just string together all the variations then UNION
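To check the idea against the sample data, here is a minimal sketch that runs all four paths (property files, tenant files, property costs, tenant costs) as one UNION. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption made only so the example is self-contained. The column aliases file_id and property_id are illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tenants (id INTEGER, property_id INTEGER, suite TEXT);
CREATE TABLE Costs (id INTEGER, parent_model TEXT, parent_id INTEGER, name TEXT);
CREATE TABLE Files (id INTEGER, parent_model TEXT, parent_id INTEGER, name TEXT);
INSERT INTO Tenants VALUES (1, 1, 's1'), (2, 1, 's2'), (3, 2, 's3');
INSERT INTO Costs VALUES (1, 'property', 1, 'gardening'),
                         (2, 'property', 2, 'construction'),
                         (3, 'tenant', 1, 'renovation');
INSERT INTO Files VALUES (1, 'property', 1, 'file1.jpg'),
                         (2, 'tenant', 2, 'file2.pdf'),
                         (3, 'costs', 3, 'file3.doc');
""")

# One SELECT per path from a file to its property, glued together with UNION.
QUERY = """
SELECT f.id AS file_id, f.parent_id AS property_id      -- property files
FROM Files f
WHERE f.parent_model = 'property'
UNION
SELECT f.id, t.property_id                               -- tenant files
FROM Files f
JOIN Tenants t ON t.id = f.parent_id
WHERE f.parent_model = 'tenant'
UNION
SELECT f.id, c.parent_id                                 -- property costs
FROM Files f
JOIN Costs c ON c.id = f.parent_id AND c.parent_model = 'property'
WHERE f.parent_model = 'costs'
UNION
SELECT f.id, t.property_id                               -- tenant costs
FROM Files f
JOIN Costs c ON c.id = f.parent_id AND c.parent_model = 'tenant'
JOIN Tenants t ON t.id = c.parent_id
WHERE f.parent_model = 'costs'
ORDER BY file_id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 1), (2, 1), (3, 1)]
```

Each new parent_model adds one more branch to the UNION, which is the cost of this table design.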

You should be able to do this like so:
SELECT
P.id,
P.name,
P.address,
F.name,
...
FROM
Files F
LEFT OUTER JOIN Costs C ON
F.parent_model = 'costs' AND C.id = F.parent_id
LEFT OUTER JOIN Tenants T ON
(F.parent_model = 'tenant' AND T.id = F.parent_id) OR
(C.parent_model = 'tenant' AND T.id = C.parent_id)
LEFT OUTER JOIN Properties P ON
(F.parent_model = 'property' AND P.id = F.parent_id) OR
(C.parent_model = 'property' AND P.id = C.parent_id) OR
(P.id = T.property_id)
Depending on your data, this might break down. I don't think that I like the table design in this case.

Related

joining multiple tables and returning a zero if no row in one

I'm trying to set up a new permissions database in MySQL and I'm breaking my brain over something that I'm sure is very simple. I'm certain something to this tune has been answered here before but after hours of searching I have found nothing that works.
I have 4 tables that are relevant
Permission (contains every possible permission)
|permission_name | description |
--------------------------------
|users.list | etc. etc. |
|users.update | etc. etc. |
|users.delete | etc. etc. |
User
| id | fname | group_id |
------------------------------
| 1 | John | 1 |
| 2 | Nancy | 1 |
| 3 | Paul | 2 |
Group
| group_id | group_name |
-------------------------
| 1 | Webmasters |
| 2 | Corporate |
| 3 | HR |
Group_permission (contains permissions relevant to each group)
| group_id | permission_name | permission_type (1=Y|0=not set|-1=N)
----------------------------------------------
| 1 | users.list | 1 |
| 1 | users.update | 1 |
| 2 | users.list | 1 |
OK so lots of relations going on, but I'm trying to get ALL the group permissions for a specific user EVEN if the group permission doesn't exist yet.
I imagined this being some sort of left join using a permission table as a base, but whenever I include the WHERE user.id = 2 it limits my result set down and won't include nulls on the right side.
SELECT a.permission_name, IFNULL(b.permission_type, 0)
FROM permission a
LEFT JOIN group_permission b on b.permission_name = a.permission_name
LEFT JOIN user c on c.group_id = b.group_id
WHERE c.id = 2
the result I want to see for Nancy is
|permission_name | permission_type |
------------------------------------
|users.list | 1 |
|users.update | 0 |
|users.delete | 0 |
I won't know what group the user is in on the PHP side, so I have to query by using the users ID only.
All I'm getting is
|permission_name | permission_type |
------------------------------------
|users.list | 1 |
Any help appreciated. TIA
It ended up being just a subquery that did the trick.
SELECT a.permission_name, IFNULL(b.permission_type, 0)
FROM permission a
NATURAL LEFT JOIN
(
SELECT a.group_id, a.permission_name, a.permission_type FROM group_permission a
NATURAL LEFT JOIN user b
WHERE b.id = 2
) as b
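The original query lost its NULL rows because a WHERE condition on the right-hand table of a LEFT JOIN filters out the rows that the join padded with NULLs. One alternative, sketched here under assumptions, is to start from the user, pair that user with every permission, and put the group condition inside the LEFT JOIN's ON clause. The sketch uses Python's sqlite3 module as a stand-in for MySQL, and permissions_for is a hypothetical helper name.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE permission (permission_name TEXT, description TEXT);
CREATE TABLE user (id INTEGER, fname TEXT, group_id INTEGER);
CREATE TABLE group_permission (group_id INTEGER, permission_name TEXT,
                               permission_type INTEGER);
INSERT INTO permission VALUES ('users.list', 'etc.'),
                              ('users.update', 'etc.'),
                              ('users.delete', 'etc.');
INSERT INTO user VALUES (1, 'John', 1), (2, 'Nancy', 1), (3, 'Paul', 2);
INSERT INTO group_permission VALUES (1, 'users.list', 1),
                                    (1, 'users.update', 1),
                                    (2, 'users.list', 1);
""")

# The user filter is on the left-hand side, and the group match lives in ON,
# so permissions with no group_permission row survive with a NULL type.
QUERY = """
SELECT p.permission_name, IFNULL(gp.permission_type, 0) AS permission_type
FROM user u
CROSS JOIN permission p
LEFT JOIN group_permission gp
       ON gp.permission_name = p.permission_name
      AND gp.group_id = u.group_id
WHERE u.id = ?
ORDER BY p.permission_name
"""

def permissions_for(user_id):
    """Hypothetical helper: every permission with the user's group setting or 0."""
    return conn.execute(QUERY, (user_id,)).fetchall()

print(permissions_for(3))
```

Every permission appears exactly once per user, and only the user's ID is needed as input.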
Not 100% sure that this will work, but try joining the group_permissions table to the permissions table.
SELECT a.permission_name, IFNULL(a.permission_type, 0)
FROM group_permission a
LEFT JOIN permission b on a.permission_name = b.permission_name
LEFT JOIN user c on c.group_id = a.group_id
WHERE c.id = 2

Getting no of products in all categories and parent categories contains a keyword

I am trying to fetch all the categories and their count (no of products in that category) of those products where keyword matches. The query I tried doesn't give me the correct result.
Also I want the parent categories till level 1 and their count as well.
e.g. I am trying with keyword watch, then category "watches" should be there with some count. Also the parent category "accessories" with the sum of its descendant categories count.
my table structures are:
tblProducts: There are 5 categories of a product, fldCategoryId1, fldCategoryId2, fldCategoryId3, fldCategoryId4 and fldCategoryId5. fldProductStatus should be 'A'
+-----------------------------+-------------------+
| Field | Type |
+-----------------------------+-------------------+
| fldUniqueId | bigint(20) |
| fldCategoryId1 | bigint(20) |
| fldCategoryId2 | bigint(20) |
| fldCategoryId3 | bigint(20) |
| fldCategoryId4 | bigint(20) |
| fldCategoryId5 | bigint(20) |
| fldProductStatus | enum('A','P','D') |
| fldForSearch | longtext |
+-----------------------------+-------------------+
tblCategory:
+------------------------------+-----------------------+
| Field | Type |
+------------------------------+-----------------------+
| fldCategoryId | bigint(20) |
| fldCategoryName | varchar(128) |
| fldCategoryParent | int(11) |
| fldCategoryLevel | enum('0','1','2','3') |
| fldCategoryActive | enum('Y','N') |
+------------------------------+-----------------------+
Search Query:
SELECT count( c.fldCategoryId ) AS cnt, c.fldCategoryLevel, c.fldCategoryParent, c.fldCategoryId, c.fldCategoryName, p.fldForSearch, c.fldCategoryParent
FROM tblCategory c, tblProducts p
WHERE (
c.fldCategoryId = p.fldCategoryId1
OR c.fldCategoryId = p.fldCategoryId2
OR c.fldCategoryId = p.fldCategoryId3
OR c.fldCategoryId = p.fldCategoryId4
OR c.fldCategoryId = p.fldCategoryId5
)
AND p.fldProductStatus = 'A'
AND (
MATCH ( p.fldForSearch )
AGAINST (
'+(watches watch)'
IN BOOLEAN MODE
)
)
GROUP BY c.fldCategoryId
Note: The table uses the InnoDB engine and has a FULLTEXT search index on the 'fldForSearch' column.
EDIT: sample data can be found in sqlfiddle
I'm not sure what you mean by:
Also I want the parent categories till level 1 and their count as well.
But the following query will show you a count for each category (including those with 0 found products), and a general rollup:
SELECT
c.fldCategoryId,
c.fldCategoryLevel,
c.fldCategoryName,
COUNT( p.fldUniqueId ) AS cnt
FROM tblCategory c
LEFT JOIN tblProducts p ON
(c.fldCategoryId = p.fldCategoryId1
OR c.fldCategoryId = p.fldCategoryId2
OR c.fldCategoryId = p.fldCategoryId3
OR c.fldCategoryId = p.fldCategoryId4
OR c.fldCategoryId = p.fldCategoryId5)
AND p.fldProductStatus = 'A'
AND MATCH ( p.fldForSearch )
AGAINST (
'+(watches watch)'
IN BOOLEAN MODE
)
GROUP BY
c.fldCategoryId,
c.fldCategoryLevel,
c.fldCategoryName
WITH ROLLUP;
Notes:
You cannot select p.fldForSearch if you expect a count of all the products in the category. fldForSearch is a per-product value, so selecting it defeats the purpose of grouping.
I left joined with products so it returns the categories with 0 products matching your keywords. If you don't want this to happen, just remove the LEFT keyword.
I haven't checked the MATCH condition; I assume it's correct.
Start by not splaying an array (fldCategoryId...) across columns. Instead, add a new table.
Once you have done that, the queries change, such as getting rid of OR clauses.
Hopefully, any further issues will fall into place.
Since your category tree has a fixed height (4 levels), you can create a transitive closure table on the fly with
SELECT c1.fldCategoryId AS descendantId, c.fldCategoryId AS ancestorId
FROM tblCategory c1
LEFT JOIN tblCategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
LEFT JOIN tblCategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
JOIN tblCategory c ON c.fldCategoryId IN (
c1.fldCategoryId,
c1.fldCategoryParent,
c2.fldCategoryParent,
c3.fldCategoryParent
)
The result will look like
| descendantId | ancestorId |
|--------------|------------|
| 1 | 1 |
| 2 | 1 |
| 2 | 2 |
| ... | ... |
| 5 | 1 |
| 5 | 2 |
| 5 | 5 |
| ... | ... |
You can now use it in a subquery (derived table) to join it with products using descendantId and with categories using ancestorId. That means that a product from category X will be indirectly associated with all ancestors of X (as well as with X). For example: Category 5 is a child of 2 - and 2 is a child of 1. So all products from category 5 must be counted for categories 5, 2 and 1.
Final query:
SELECT c.*, coalesce(sub.cnt, 0) as cnt
FROM tblCategory c
LEFT JOIN (
SELECT tc.ancestorId, COUNT(DISTINCT p.fldUniqueId) AS cnt
FROM tblProducts p
JOIN (
SELECT c1.fldCategoryId AS descendantId, c.fldCategoryId AS ancestorId
FROM tblCategory c1
LEFT JOIN tblCategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
LEFT JOIN tblCategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
JOIN tblCategory c ON c.fldCategoryId IN (
c1.fldCategoryId,
c1.fldCategoryParent,
c2.fldCategoryParent,
c3.fldCategoryParent
)
) tc ON tc.descendantId IN (
p.fldCategoryId1,
p.fldCategoryId2,
p.fldCategoryId3,
p.fldCategoryId4,
p.fldCategoryId5
)
WHERE p.fldProductStatus = 'A'
AND MATCH ( p.fldForSearch )
AGAINST ( '+(watches watch)' IN BOOLEAN MODE )
GROUP BY tc.ancestorId
) sub ON c.fldCategoryId = sub.ancestorId
Result for your sample data (without level, since it seems to be wrong anyway):
| fldCategoryId | fldCategoryName | fldCategoryParent | fldCategoryActive | cnt |
|---------------|-----------------|-------------------|-------------------|-----|
| 1 | Men | 0 | Y | 5 |
| 2 | Accessories | 1 | Y | 5 |
| 3 | Men Watch | 1 | Y | 3 |
| 5 | Watch | 2 | Y | 5 |
| 6 | Clock | 2 | Y | 3 |
| 7 | Wrist watch | 1 | Y | 2 |
| 8 | Watch | 2 | Y | 4 |
| 9 | watch2 | 3 | Y | 2 |
| 10 | fastrack | 8 | Y | 3 |
| 11 | swish | 8 | Y | 2 |
| 12 | digital | 5 | Y | 2 |
| 13 | analog | 5 | Y | 2 |
| 14 | dual | 5 | Y | 1 |
Demos:
sqlfiddle
rextester
Note that the outer (left joined) subquery is logically not necessary. But in my experience MySQL doesn't perform well without it.
There is still room for performance optimisation. One option is to store the transitive closure table in an indexed temporary table. If categories rarely change, you can also persist it in a regular table and maintain it with triggers.
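The on-the-fly closure query can be checked in isolation on the Men > Accessories > Watch chain (categories 1, 2 and 5) from the sample data. This is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL, which is an assumption. It leaves out the product join and the FULLTEXT search.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblCategory (fldCategoryId INTEGER, fldCategoryName TEXT,
                          fldCategoryParent INTEGER);
INSERT INTO tblCategory VALUES (1, 'Men', 0),
                               (2, 'Accessories', 1),
                               (5, 'Watch', 2);
""")

# c1 is the descendant; c2 and c3 walk up to the parent and grandparent.
# c then matches the category itself or any ancestor found on the way up.
CLOSURE = """
SELECT c1.fldCategoryId AS descendantId, c.fldCategoryId AS ancestorId
FROM tblCategory c1
LEFT JOIN tblCategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
LEFT JOIN tblCategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
JOIN tblCategory c ON c.fldCategoryId IN (
    c1.fldCategoryId,
    c1.fldCategoryParent,
    c2.fldCategoryParent,
    c3.fldCategoryParent
)
ORDER BY descendantId, ancestorId
"""

pairs = conn.execute(CLOSURE).fetchall()
for descendant_id, ancestor_id in pairs:
    print(descendant_id, ancestor_id)
```

The pairs for category 5 are (5, 1), (5, 2) and (5, 5), which is why a product in category 5 is counted for all three categories.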

MySQL Performance - LEFT JOIN / HAVING vs Sub Query

Which of the following query styles is better for performance?
Basically, I'm returning many related records into one row with GROUP_CONCAT and I need to filter by another join on the GROUP_CONCAT value, and I will need to add many more joins/group_concats/havings or sub queries in order to filter by more related values. I saw that, officially, LEFT JOIN was faster, but I wonder if the GROUP_CONCAT and HAVING throw that off.
(This is a very simplified example, the actual data has many more attributes and it's reading from a Drupal MySQL architecture)
Thanks!
Main Records
+----+-----------------+----------------+-----------+-----------+
| id | other_record_id | value | type | attribute |
+----+-----------------+----------------+-----------+-----------+
| 1 | 0 | Red Building | building | |
| 2 | 1 | ACME Plumbing | attribute | company |
| 3 | 1 | east_side | attribute | location |
| 4 | 0 | Green Building | building | |
| 5 | 4 | AJAX Heating | attribute | company |
| 6 | 4 | west_side | attribute | location |
| 7 | 0 | Blue Building | building | |
| 8 | 7 | ZZZ Mattresses | attribute | company |
| 9 | 7 | south_side | attribute | location |
+----+-----------------+----------------+-----------+-----------+
location_translations
+-------------+------------+
| location_id | value |
+-------------+------------+
| 1 | east_side |
| 2 | west_side |
| 3 | south_side |
+-------------+------------+
locations
+----+--------------------+
| id | name |
+----+--------------------+
| 1 | Arts District |
| 2 | Warehouse District |
| 3 | Suburb |
+----+--------------------+
Query #1
SELECT
a.id,
GROUP_CONCAT(
IF(b.attribute = 'company', b.value, NULL)
) AS company_value,
GROUP_CONCAT(
IF(b.attribute = 'location', b.value, NULL)
) AS location_value,
GROUP_CONCAT(
IF(b.attribute = 'location', lt.location_id, NULL)
) AS location_id
FROM
records a
LEFT JOIN records b ON b.other_record_id = a.id AND b.type = 'attribute'
LEFT JOIN location_translations lt ON lt.value = b.value
WHERE a.type = 'building'
GROUP BY a.id
HAVING location_id = 2
Query #2
SELECT temp.* FROM (
SELECT
a.id,
GROUP_CONCAT(
IF(b.attribute = 'company', b.value, NULL)
) AS company_value,
GROUP_CONCAT(
IF(b.attribute = 'location', b.value, NULL)
) AS location_value
FROM
records a
LEFT JOIN records b ON b.other_record_id = a.id AND b.type = 'attribute'
WHERE a.type = 'building'
GROUP BY a.id
) as temp
LEFT JOIN location_translations lt ON lt.value = temp.location_value
WHERE location_id = 2
Using a JOIN is preferable in most cases, because it helps the optimizer work out which indexes it can use. In your case, query #1 looks good enough.
Of course, this only works if the tables have indexes. Check that table records has indexes on the id, other_record_id, value and type columns, and that table location_translations has one on value.

SQL Join vs Sub-query

I'm running MySQL 5.1.71. In my database there are three tables - load, brass and mfg with load being my "main" table. My goal is to query load and have mfg.name included in the results. I've tried various iterations of JOIN clauses vs sub-queries both with and without WHERE clauses. It seems this should be pretty trivial so I'm not sure how I can't arrive at the solution.
load
-------------------------
| id | desc | brass_id |
-------------------------
| 1 | One | 2 |
| 2 | Two | 1 |
-------------------------
brass
---------------
| id | mfg_id |
---------------
| 1 | 6 |
| 2 | 8 |
---------------
brass_mfg
------------------------
| id | name |
------------------------
| 6 | This Company |
| 8 | That Company |
------------------------
My desired results would be...
results
---------------------------
| load | mfg |
---------------------------
| One | That Company |
| Two | This Company |
---------------------------
A load ID will always have only a single brass ID
A brass ID will always have only a single mfg ID
EDIT
The previously provided sample data (above) has been updated. Also, below are the query I'm running and the results I'm getting. The company is wrong in each record that is returned. I've included the IDs across the tables in both the query and the results. The company names that appear are not the names for the IDs in the mfg table.
SELECT
load.id AS "load.id",
load.brass_id AS "load.brass_id",
brass.id AS "brass.id",
brass.mfg_id AS "brass.mfg_id",
brass_mfg.id AS "brass_mfg.id",
brass_mfg.name AS "brass_mfg.name"
FROM `load`
LEFT JOIN brass ON load.brass_id = brass.id
LEFT JOIN brass_mfg ON brass.id = brass_mfg.id
-----------------------------------------------------------------------------------------
| load.id | load.brass_id | brass.id | brass.mfg_id | brass_mfg.id | brass_mfg.name |
-----------------------------------------------------------------------------------------
| 1 | 2 | 2 | 6 | 2 | Wrong Company |
| 2 | 1 | 1 | 8 | 1 | Incorrect Company |
-----------------------------------------------------------------------------------------
Look at your tables and see what data relates to one another then build up joins table by table to get your desired output.
SELECT p.desc AS Product, m.name AS mfg
FROM product p
INNER JOIN lot l ON p.lot_id = l.id
INNER JOIN mfg m ON l.mfg_id = m.id
If this is a one-to-one relationship, why have a middle table?
In your case the best option is a simple join.
SELECT pt.desc AS Product, mf.name AS Mfg
FROM Product pt
JOIN Lot lt ON lt.id = pt.lot_id
JOIN Mfg mf ON mf.id = lt.mfg_id
You have an error in your join query.
Try this one:
Select
l.id AS "load.id",
l.brass_id AS "load.brass_id",
b.id AS "brass.id",
b.mfg_id AS "brass.mfg_id",
m.id AS "brass_mfg.id",
m.`name` AS "brass_mfg.name"
FROM `load` as l
LEFT JOIN brass as b ON l.brass_id = b.id
LEFT JOIN brass_mfg as m ON b.mfg_id = m.id
You need LEFT JOIN only
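The corrected join condition (brass.mfg_id matched to brass_mfg.id, rather than brass.id) can be verified against the sample tables. This is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL 5.1, which is an assumption. The aliases load_desc and mfg are illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE load (id INTEGER, "desc" TEXT, brass_id INTEGER);
CREATE TABLE brass (id INTEGER, mfg_id INTEGER);
CREATE TABLE brass_mfg (id INTEGER, name TEXT);
INSERT INTO load VALUES (1, 'One', 2), (2, 'Two', 1);
INSERT INTO brass VALUES (1, 6), (2, 8);
INSERT INTO brass_mfg VALUES (6, 'This Company'), (8, 'That Company');
""")

# load -> brass via load.brass_id, then brass -> brass_mfg via brass.mfg_id.
QUERY = """
SELECT l."desc" AS load_desc, m.name AS mfg
FROM load AS l
LEFT JOIN brass AS b ON l.brass_id = b.id
LEFT JOIN brass_mfg AS m ON b.mfg_id = m.id
ORDER BY l.id
"""

results = conn.execute(QUERY).fetchall()
print(results)  # [('One', 'That Company'), ('Two', 'This Company')]
```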

select rows where related record doesn't exist

I need to retrieve rows from a mysql database as follows: I have a contract table, a contract line item table, and another table called udac. I need all contracts which DO NOT have a line item record with criteria based on a relationship between contract line item and udac. If there is a better way to state this question, let me know.
Table Structures
----contract--------------------- ---contractlineitem-----------
| id | customer_id | entry_date | | id | contract_id | udac_id |
--------------------------------- ------------------------------
| 1 | 1234 | 2010-01-01 | | 1 | 1 | 5 |
| 2 | 2345 | 2016-01-31 | | 2 | 1 | 2 |
--------------------------------- | 3 | 1 | 1 |
| 4 | 2 | 4 |
| 5 | 2 | 2 |
------------------------------
---udac----------
| id | udaccode |
-----------------
| 1 | SWBL/R |
| 2 | SWBL |
| 3 | ABL/R |
| 4 | ABL |
| 5 | XRS/F |
-----------------
Given the above data, contract 2 would show up but contract 1 would not, because it has contractlineitems that point to udacs that end in /F or /R.
Here's what I have so far, but it's not correct.
SELECT c.*
FROM contract c
JOIN contractlineitem cli
ON c.id = cli.contract_id
WHERE c.entry_timestamp > '2016-01-01 00:00:00'
AND NOT EXISTS (
SELECT cli.id
FROM contractlineitem cli_i
JOIN udac u
ON cli_i.udac_id = u.id
WHERE u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R'
AND cli_i.contract_id = cli.contract_id);
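Tom's comment that your WHERE clause is wrong may be the problem you are chasing. AND binds more tightly than OR, so without parentheses around the two LIKE conditions the contract_id correlation applies only to the '%/R' branch. Plus, using a correlated subquery may be problematic for performance if the optimizer can't figure out a better way to do it.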
Here is a way to do it using an OUTER JOIN:
SELECT c.*
FROM contract c
LEFT OUTER JOIN (
SELECT cli.contract_id
FROM contractlineitem cli
JOIN udac u
ON u.id = cli.udac_id
WHERE u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R'
) bad
ON bad.contract_id = c.id
WHERE c.entry_timestamp > '2016-01-01 00:00:00'
AND bad.contract_id IS NULL
Try that out and see if it does what you want. The query essentially does what you stated: the subquery collects the contracts that have a line item whose udac code ends in '/F' or '/R', and the outer query then only accepts the contracts for which it can't find a match (bad.contract_id IS NULL).
If the same row is returned multiple times incorrectly, throw a distinct on the front.
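For comparison, the NOT EXISTS form can be tried on the sample data once the subquery is correlated to the contract itself and the two LIKE conditions are parenthesized. This is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL, which is an assumption. The column is named entry_timestamp as in the queries (the table listing calls it entry_date), and clean_contracts is a hypothetical helper name.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract (id INTEGER, customer_id INTEGER, entry_timestamp TEXT);
CREATE TABLE contractlineitem (id INTEGER, contract_id INTEGER, udac_id INTEGER);
CREATE TABLE udac (id INTEGER, udaccode TEXT);
INSERT INTO contract VALUES (1, 1234, '2010-01-01 00:00:00'),
                            (2, 2345, '2016-01-31 00:00:00');
INSERT INTO contractlineitem VALUES (1, 1, 5), (2, 1, 2), (3, 1, 1),
                                    (4, 2, 4), (5, 2, 2);
INSERT INTO udac VALUES (1, 'SWBL/R'), (2, 'SWBL'), (3, 'ABL/R'),
                        (4, 'ABL'), (5, 'XRS/F');
""")

# Keep a contract only if it has no line item whose udac code ends in /F or /R.
# The parentheses make the correlation apply to both LIKE branches.
QUERY = """
SELECT c.id
FROM contract c
WHERE c.entry_timestamp > ?
  AND NOT EXISTS (
      SELECT 1
      FROM contractlineitem cli
      JOIN udac u ON u.id = cli.udac_id
      WHERE cli.contract_id = c.id
        AND (u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R')
  )
ORDER BY c.id
"""

def clean_contracts(since):
    """Hypothetical helper: ids of contracts newer than `since` with no /F or /R item."""
    return [row[0] for row in conn.execute(QUERY, (since,))]

print(clean_contracts("2016-01-01 00:00:00"))
print(clean_contracts("2000-01-01 00:00:00"))
```

Both calls return only contract 2. The second call shows that contract 1 is excluded by its line items and not just by the date filter.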