MySQL - return reference via multiple table joins - mysql

TABLE 'styles'
id|style_code
1|6110
2|6120
3|6250
TABLE 'colour'
id|colour_code
1|1001
2|1012
3|1033
4|1050
TABLE 'styleColour'
id|style_id|colour_id|cancelled
1 | 1 | 1 |
2 | 1 | 2 | y
3 | 2 | 1 |
4 | 2 | 3 |
5 | 2 | 4 |
6 | 3 | 1 |
7 | 3 | 2 |
8 | 3 | 3 | y
9 | 3 | 4 | y
TABLE 'orders'
id|style_code|colour_code
1 | 6110 | 1001
2 | 6110 | 1012
3 | 6130 | 1001
4 | 6130 | 1033
5 | 6130 | 1050
6 | 6250 | 1033
7 | 6250 | 1050
Output wanted (based on the 'orders' table):
style_code|colour_code|cancelled
6110 | 1001 |
6110 | 1012 | y
6130 | 1001 |
6130 | 1033 |
6130 | 1050 |
6250 | 1033 | y
6250 | 1050 | y
What joins are needed to attach the 'cancelled' column to the appropriate style_code and colour_code combination in the 'orders' table output?
Please bear in mind that, although it may seem odd that the 'orders' table stores style_code and colour_code rather than style_id and colour_id, this is required for importing reasons.
Thanks and kind regards,
Derek.

Updated answer as per comments below -
SELECT orders.style_code,orders.colour_code,styleColour.cancelled
FROM orders
LEFT JOIN colours ON orders.colour_code=colours.colour_code
LEFT JOIN styles ON orders.style_code=styles.style_code
LEFT JOIN styleColour ON styleColour.style_id=styles.id
AND styleColour.colour_id=colours.id;
However, if the joins to colours and styles are made on the id column instead of the code column (I assume id is the primary key), the query might be faster, since primary keys are indexed.
This is my attempt at that. I haven't tested it, but give it a try:
SELECT orders.style_code,orders.colour_code,styleColour.cancelled
FROM styleColour
RIGHT JOIN colours ON styleColour.colour_id=colours.id
RIGHT JOIN styles ON styleColour.style_id=styles.id
RIGHT JOIN orders ON orders.style_code=styles.style_code
AND orders.colour_code=colours.colour_code;
The reason it's all RIGHT JOINs is to do all the joining based on the rows in orders as opposed to styleColour.
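As a quick check of the LEFT JOIN version, here is a minimal sketch that loads the sample data from the question into an in-memory SQLite database and runs the same joins. SQLite stands in for MySQL here, the table is named colours as in the answer (the question calls it colour), the blank cancelled cells are assumed to be NULL, and the helper name cancelled_per_order is made up for illustration.

```python
import sqlite3

def cancelled_per_order(conn):
    """Return (style_code, colour_code, cancelled) for every row in orders."""
    return conn.execute("""
        SELECT o.style_code, o.colour_code, sc.cancelled
        FROM orders o
        LEFT JOIN colours c ON o.colour_code = c.colour_code
        LEFT JOIN styles s ON o.style_code = s.style_code
        LEFT JOIN styleColour sc ON sc.style_id = s.id AND sc.colour_id = c.id
        ORDER BY o.id
    """).fetchall()

conn = sqlite3.connect(":memory:")
# Sample data copied from the question; blank 'cancelled' cells are stored as NULL.
conn.executescript("""
    CREATE TABLE styles (id INTEGER PRIMARY KEY, style_code INTEGER);
    CREATE TABLE colours (id INTEGER PRIMARY KEY, colour_code INTEGER);
    CREATE TABLE styleColour (id INTEGER PRIMARY KEY, style_id INTEGER,
                              colour_id INTEGER, cancelled TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, style_code INTEGER,
                         colour_code INTEGER);
    INSERT INTO styles VALUES (1, 6110), (2, 6120), (3, 6250);
    INSERT INTO colours VALUES (1, 1001), (2, 1012), (3, 1033), (4, 1050);
    INSERT INTO styleColour VALUES
        (1, 1, 1, NULL), (2, 1, 2, 'y'), (3, 2, 1, NULL), (4, 2, 3, NULL),
        (5, 2, 4, NULL), (6, 3, 1, NULL), (7, 3, 2, NULL), (8, 3, 3, 'y'),
        (9, 3, 4, 'y');
    INSERT INTO orders VALUES
        (1, 6110, 1001), (2, 6110, 1012), (3, 6130, 1001), (4, 6130, 1033),
        (5, 6130, 1050), (6, 6250, 1033), (7, 6250, 1050);
""")

rows = cancelled_per_order(conn)
for row in rows:
    print(row)
```

Style 6130 does not exist in styles, so the LEFT JOINs keep those order rows and leave cancelled as NULL, which matches the blank cells in the wanted output.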

Related

Merge references to duplicate rows in mysql

This feels very simple and complex at the same time, but I can't quite get my head around an appropriate way of going about this as a MySQL query.
I have a table of tags called categories that should only have unique titles for the field cat_title. However, I've noticed that there are multiple rows with the same cat_title value.
I want to delete all but the first instance of any duplicates. Simple enough, yes. But another table, tagging, has a field called tagging_cat_id that references the identifier field, cat_id, in the categories table. Deleting duplicates will break these references and leave them pointing to nothing.
So, the more complex aspect is finding any tagging_cat_id field that references a duplicate row that's about to be deleted and changing it to reference the (soon to be unique, single) first row with this cat_title.
I am a novice at mysql and this is a bit out of my depth. I was almost tempted to do this manually by hand in a gui. Is there a simple enough method of doing this as a query that I could run on occasion to perform the above? (until what's causing duplicates to be created is resolved). Distrib version is 5.7.21.
Sample Data
Categories
+--------+-----------+
| cat_id | cat_title |
+--------+-----------+
| 1 | green |
| 2 | red |
| 3 | blue |
| 4 | green |
| 5 | green |
| 6 | red |
| 7 | white |
+--------+-----------+
Tagging
+------------+-------------------+----------------+
| tagging_id | tagging_record_id | tagging_cat_id |
+------------+-------------------+----------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 7 |
| 4 | 3 | 5 |
| 5 | 4 | 6 |
| 6 | 5 | 4 |
| 7 | 5 | 3 |
| 8 | 6 | 5 |
+------------+-------------------+----------------+
I want to convert the above to the following:
Categories
+--------+-----------+
| cat_id | cat_title |
+--------+-----------+
| 1 | green |
| 2 | red |
| 3 | blue |
| 7 | white |
+--------+-----------+
Tagging
+------------+-------------------+----------------+
| tagging_id | tagging_record_id | tagging_cat_id |
+------------+-------------------+----------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 7 |
| 4 | 3 | 1 |
| 5 | 4 | 2 |
| 6 | 5 | 1 |
| 7 | 5 | 3 |
| 8 | 6 | 1 |
+------------+-------------------+----------------+
If your version of MySQL is 8.0+ you can use this query:
SELECT cat_id, MIN(cat_id) OVER (PARTITION BY cat_title) min_id
FROM categories
to identify for each cat_id the minimum cat_id with the same cat_title so you can update the table:
WITH ids AS (
SELECT cat_id, MIN(cat_id) OVER (PARTITION BY cat_title) min_id
FROM categories
)
UPDATE tagging t
INNER JOIN ids i ON i.cat_id = t.tagging_cat_id
SET t.tagging_cat_id = i.min_id
Then you can delete the duplicates:
WITH ids AS (
SELECT cat_id, MIN(cat_id) OVER (PARTITION BY cat_title) min_id
FROM categories
)
DELETE c
FROM categories c INNER JOIN ids i
ON i.cat_id = c.cat_id AND i.min_id < c.cat_id
See the demo.
For previous versions of MySQL that do not support window functions and CTEs:
UPDATE tagging t
INNER JOIN categories c ON c.cat_id = t.tagging_cat_id
INNER JOIN (
SELECT cat_title, MIN(cat_id) min_id
FROM categories
GROUP BY cat_title
) m ON m.cat_title = c.cat_title
SET t.tagging_cat_id = m.min_id
and:
DELETE c1
FROM categories c1 INNER JOIN categories c2
ON c2.cat_title = c1.cat_title
WHERE c1.cat_id > c2.cat_id
See the demo.
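Results:
+--------+-----------+
| cat_id | cat_title |
+--------+-----------+
| 1 | green |
| 2 | red |
| 3 | blue |
| 7 | white |
+--------+-----------+
and:
+------------+-------------------+----------------+
| tagging_id | tagging_record_id | tagging_cat_id |
+------------+-------------------+----------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 7 |
| 4 | 3 | 1 |
| 5 | 4 | 2 |
| 6 | 5 | 1 |
| 7 | 5 | 3 |
| 8 | 6 | 1 |
+------------+-------------------+----------------+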
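The same two-step idea (repoint the references first, then delete the duplicates) can be tried out on the sample data with a small sketch. It uses an in-memory SQLite database as a stand-in for MySQL; because SQLite has no multi-table UPDATE ... JOIN or DELETE ... JOIN syntax, the statements are rewritten with subqueries, so this illustrates the logic rather than the exact MySQL statements. The function name merge_duplicate_categories is hypothetical.

```python
import sqlite3

def merge_duplicate_categories(conn):
    """Repoint tagging rows to the lowest cat_id per title, then drop duplicates."""
    # Step 1: every tagging row gets the minimum cat_id sharing its category's title.
    conn.execute("""
        UPDATE tagging
        SET tagging_cat_id = (
            SELECT MIN(c2.cat_id)
            FROM categories c1
            JOIN categories c2 ON c2.cat_title = c1.cat_title
            WHERE c1.cat_id = tagging.tagging_cat_id
        )
        WHERE tagging_cat_id IN (SELECT cat_id FROM categories)
    """)
    # Step 2: keep only the first (lowest cat_id) row for each title.
    conn.execute("""
        DELETE FROM categories
        WHERE cat_id NOT IN (
            SELECT MIN(cat_id) FROM categories GROUP BY cat_title
        )
    """)
    conn.commit()

conn = sqlite3.connect(":memory:")
# Sample data copied from the question.
conn.executescript("""
    CREATE TABLE categories (cat_id INTEGER PRIMARY KEY, cat_title TEXT);
    CREATE TABLE tagging (tagging_id INTEGER PRIMARY KEY,
                          tagging_record_id INTEGER, tagging_cat_id INTEGER);
    INSERT INTO categories VALUES
        (1, 'green'), (2, 'red'), (3, 'blue'), (4, 'green'),
        (5, 'green'), (6, 'red'), (7, 'white');
    INSERT INTO tagging VALUES
        (1, 1, 1), (2, 1, 2), (3, 2, 7), (4, 3, 5),
        (5, 4, 6), (6, 5, 4), (7, 5, 3), (8, 6, 5);
""")

merge_duplicate_categories(conn)
categories = conn.execute(
    "SELECT cat_id, cat_title FROM categories ORDER BY cat_id").fetchall()
tagging = conn.execute(
    "SELECT tagging_id, tagging_cat_id FROM tagging ORDER BY tagging_id").fetchall()
```

The order matters: the update has to run while the duplicate rows still exist, since it needs their titles to find the surviving cat_id.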

Need help/explanation to JOINED query

I'm kinda lost on what kind of SQL query I should do to achieve what I want.
Let's say I have three tables :
select * FROM trip;
| trip_id | title | description
----------------------------------
| 1 | title1 | desc1 |
| 2 | title2 | desc2 |
| 3 | title3 | desc3 |
| 4 | title4 | desc4 |
| 5 | title5 | desc5 |
| 6 | title6 | desc6 |
select * FROM weekly_report;
| report_id | trip_id| incident_id
----------------------------------
| 1 | 1 | (null) |
| 2 | 1 | (null) |
| 3 | 1 | 1 |
| 4 | 2 | 2 |
| 5 | 3 | 3 |
| 6 | 3 | (null) |
select * FROM incident;
| incident_id | error_code |
----------------------------------
| 1 | 22223 |
| 2 | 25456 |
| 3 | 25456 |
So for a little operational knowledge:
The trip table contains 1 record per trip done by the customer.
The weekly_report table contains a report per week of the trip (a trip of 2 weeks will have 2 records, a trip of 5 weeks will have 5, and so on).
The incident table contains 1 record per incident (if an incident happened during a week, we create a record in the incident table; otherwise we do nothing).
I'd like to find in a single query (or, if it has to be, with subqueries) the number of trips where, during at least one week, an incident was declared with the error_code "25456".
Expected result from the sample data: 2 (because for trips 2 and 3 there exists an incident with the error code 25456).
I can explain more if needed. Is there anybody out there willing to help me?
Thanks,
You need to count the distinct trips that have a related incident:
select count(distinct w.trip_id)
from weekly_report w
inner join incident i
on w.incident_id = i.incident_id
where i.error_code = 25456;
Try this:
SELECT DISTINCT w.trip_id
FROM incident i
INNER JOIN weekly_report w ON i.incident_id=w.incident_id
WHERE i.error_code='25456'
and if you want the count, then use COUNT(DISTINCT ...) so that a trip with several matching incidents is counted only once:
SELECT COUNT(DISTINCT w.trip_id)
FROM incident i
INNER JOIN weekly_report w ON i.incident_id=w.incident_id
WHERE i.error_code='25456'
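To check the counting query against the sample data, here is a small sketch using an in-memory SQLite database in place of MySQL. The function name count_trips_with_error is invented for the example, and the trip table is left out because the count only needs weekly_report and incident.

```python
import sqlite3

def count_trips_with_error(conn, error_code):
    """Count trips with at least one weekly report tied to the given error code."""
    row = conn.execute("""
        SELECT COUNT(DISTINCT w.trip_id)
        FROM weekly_report w
        INNER JOIN incident i ON w.incident_id = i.incident_id
        WHERE i.error_code = ?
    """, (error_code,)).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
# Sample data copied from the question; (null) cells are stored as NULL.
conn.executescript("""
    CREATE TABLE weekly_report (report_id INTEGER PRIMARY KEY,
                                trip_id INTEGER, incident_id INTEGER);
    CREATE TABLE incident (incident_id INTEGER PRIMARY KEY, error_code INTEGER);
    INSERT INTO weekly_report VALUES
        (1, 1, NULL), (2, 1, NULL), (3, 1, 1),
        (4, 2, 2), (5, 3, 3), (6, 3, NULL);
    INSERT INTO incident VALUES (1, 22223), (2, 25456), (3, 25456);
""")

trip_count = count_trips_with_error(conn, 25456)
print(trip_count)
```

Reports with a NULL incident_id never match the inner join, so weeks without incidents drop out on their own.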

JOIN and SUM values from multiple tables

I have 3 tables like the tables below
tbl_GasExpense
GID | Gas_Expense | Date_Occured
-----------------------------------
1 | 400 | 11/30/2014
2 | 500 | 11/30/2014
3 | 300 | 11/30/2014
tbl_Food Expense
FID | Food_Expense | Date_Occured
-----------------------------------
1 | 450 | 11/30/2014
2 | 250 | 11/30/2014
3 | 390 | 11/30/2014
tbl_Drink Expense
DID | Drink_Expense | Date_Occured
-----------------------------------
1 | 150 | 11/30/2014
2 | 250 | 11/30/2014
3 | 360 | 11/30/2014
and with those tables above, I want an output like this.
ID | Gas_Sum | Food_Sum | Drink_Sum | Date_Occured
-----------------------------------------------------------
1 | 1200 | 1090 | 760 | 11/30/2014
The values from the three tables that are dated 11/30/2014 are summed in a fourth table.
The IDs from the first three tables are used as foreign keys in the fourth table to establish the relation: Gas_Sum is a mask for GID, Food_Sum for FID, and Drink_Sum for DID.
Thanks guys, but I found my answer after several rounds of trial and error.
It is something like this, though this is taken from my own code:
SELECT o.eh_ID, SUM(o.others_amt) as 'OTHERS SUM'
FROM tbl_Others o
INNER JOIN tbl_ExpenseHead hd ON hd.eh_ID = o.eh_ID
GROUP BY o.eh_ID
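For the three tables as posted in the question, one possible approach (not necessarily what the poster ended up with) is to total each table per date in its own subquery and then join the three totals on the date. The sketch below runs that against the sample data in an in-memory SQLite database; the table names tbl_FoodExpense and tbl_DrinkExpense (without the space) and the function name daily_totals are assumptions made for the example.

```python
import sqlite3

def daily_totals(conn):
    """Return (Date_Occured, Gas_Sum, Food_Sum, Drink_Sum), one row per date."""
    return conn.execute("""
        SELECT g.Date_Occured, g.Gas_Sum, f.Food_Sum, d.Drink_Sum
        FROM (SELECT Date_Occured, SUM(Gas_Expense) AS Gas_Sum
              FROM tbl_GasExpense GROUP BY Date_Occured) g
        JOIN (SELECT Date_Occured, SUM(Food_Expense) AS Food_Sum
              FROM tbl_FoodExpense GROUP BY Date_Occured) f
          ON f.Date_Occured = g.Date_Occured
        JOIN (SELECT Date_Occured, SUM(Drink_Expense) AS Drink_Sum
              FROM tbl_DrinkExpense GROUP BY Date_Occured) d
          ON d.Date_Occured = g.Date_Occured
        ORDER BY g.Date_Occured
    """).fetchall()

conn = sqlite3.connect(":memory:")
# Sample data copied from the question; dates are kept as plain text.
conn.executescript("""
    CREATE TABLE tbl_GasExpense (GID INTEGER PRIMARY KEY,
                                 Gas_Expense INTEGER, Date_Occured TEXT);
    CREATE TABLE tbl_FoodExpense (FID INTEGER PRIMARY KEY,
                                  Food_Expense INTEGER, Date_Occured TEXT);
    CREATE TABLE tbl_DrinkExpense (DID INTEGER PRIMARY KEY,
                                   Drink_Expense INTEGER, Date_Occured TEXT);
    INSERT INTO tbl_GasExpense VALUES
        (1, 400, '11/30/2014'), (2, 500, '11/30/2014'), (3, 300, '11/30/2014');
    INSERT INTO tbl_FoodExpense VALUES
        (1, 450, '11/30/2014'), (2, 250, '11/30/2014'), (3, 390, '11/30/2014');
    INSERT INTO tbl_DrinkExpense VALUES
        (1, 150, '11/30/2014'), (2, 250, '11/30/2014'), (3, 360, '11/30/2014');
""")

totals = daily_totals(conn)
print(totals)
```

Aggregating before joining matters here: joining the three raw tables on the date first would pair every gas row with every food and drink row (3 x 3 x 3 = 27 rows) and inflate each sum.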

Work around the 61 table JOIN limit in MySQL by nesting subqueries within each other

I figured out that you can get around the 61 table join limit in MySQL by using subqueries.
https://stackoverflow.com/a/20134402/2843690
I'm trying to figure out how to easily use this in a program I'm working on to get a detailed product list from Magento (but I think the answer to this question could apply to a lot of situations where eav is involved). The tables that need to be joined look something like this:
catalog_product_entity
+-----------+----------------+
| entity_id | entity_type_id |
+-----------+----------------+
| 1 | 4 |
| 2 | 4 |
| 3 | 4 |
| 4 | 4 |
| 5 | 4 |
| 6 | 4 |
| 7 | 4 |
| 8 | 4 |
| 9 | 4 |
+-----------+----------------+
catalog_product_entity_int
+----------+----------------+--------------+-----------+-------+
| value_id | entity_type_id | attribute_id | entity_id | value |
+----------+----------------+--------------+-----------+-------+
| 1 | 4 | 2 | 1 | 245 |
| 2 | 4 | 3 | 1 | 250 |
| 3 | 4 | 4 | 1 | 254 |
| 4 | 4 | 2 | 2 | 245 |
| 5 | 4 | 3 | 2 | 249 |
| 6 | 4 | 4 | 2 | 253 |
| 7 | 4 | 2 | 3 | 247 |
| 8 | 4 | 3 | 3 | 250 |
| 9 | 4 | 4 | 3 | 254 |
+----------+----------------+--------------+-----------+-------+
eav_attribute
+--------------+----------------+----------------+--------------+
| attribute_id | entity_type_id | attribute_code | backend_type |
+--------------+----------------+----------------+--------------+
| 1 | 4 | name | varchar |
| 2 | 4 | brand | int |
| 3 | 4 | color | int |
| 4 | 4 | size | int |
| 5 | 4 | price | decimal |
| 6 | 4 | cost | decimal |
| 7 | 4 | created_at | datetime |
| 8 | 3 | name | varchar |
| 9 | 3 | description | text |
+--------------+----------------+----------------+--------------+
eav_attribute_option
+-----------+--------------+
| option_id | attribute_id |
+-----------+--------------+
| 245 | 2 |
| 246 | 2 |
| 247 | 2 |
| 248 | 3 |
| 249 | 3 |
| 250 | 3 |
| 251 | 4 |
| 252 | 4 |
| 253 | 4 |
| 254 | 4 |
+-----------+--------------+
eav_attribute_option_value
+----------+-----------+-------------------+
| value_id | option_id | value |
+----------+-----------+-------------------+
| 15 | 245 | Fruit of the Loom |
| 16 | 246 | Hanes |
| 17 | 247 | Jockey |
| 18 | 248 | White |
| 19 | 249 | Black |
| 20 | 250 | Gray |
| 21 | 251 | Small |
| 22 | 252 | Medium |
| 23 | 253 | Large |
| 24 | 254 | Extra Large |
+----------+-----------+-------------------+
The program that I'm writing generated sql queries that looked something like this:
SELECT cpe.entity_id
, brand_int.value as brand_int, brand.value as brand
, color_int.value as color_int, color.value as color
, size_int.value as size_int, size.value as size
FROM catalog_product_entity as cpe
LEFT JOIN catalog_product_entity_int as brand_int
ON (cpe.entity_id = brand_int.entity_id
AND brand_int.attribute_id = 2)
LEFT JOIN eav_attribute_option as brand_option
ON (brand_option.attribute_id = 2
AND brand_int.value = brand_option.option_id)
LEFT JOIN eav_attribute_option_value as brand
ON (brand_option.option_id = brand.option_id)
LEFT JOIN catalog_product_entity_int as color_int
ON (cpe.entity_id = color_int.entity_id
AND color_int.attribute_id = 3)
LEFT JOIN eav_attribute_option as color_option
ON (color_option.attribute_id = 3
AND color_int.value = color_option.option_id)
LEFT JOIN eav_attribute_option_value as color
ON (color_option.option_id = color.option_id)
LEFT JOIN catalog_product_entity_int as size_int
ON (cpe.entity_id = size_int.entity_id
AND size_int.attribute_id = 4)
LEFT JOIN eav_attribute_option as size_option
ON (size_option.attribute_id = 4
AND size_int.value = size_option.option_id)
LEFT JOIN eav_attribute_option_value as size
ON (size_option.option_id = size.option_id)
;
It was relatively easy to write the code to generate the query, and the query was fairly easy to understand; however, it's pretty easy to hit the 61 table join limit, which I did with my real-life data. I believe the math says 21 integer-type attributes would go over the limit, and that is before I even start adding varchar, text, and decimal attributes.
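The arithmetic behind that estimate can be spelled out in a few lines. This is only a sketch of the counting: it assumes each integer-type attribute adds three joined tables (the _int table, eav_attribute_option and eav_attribute_option_value) on top of the one base table, as in the generated query above. The names MAX_TABLES and tables_in_query are made up for illustration.

```python
MAX_TABLES = 61  # MySQL's limit on tables referenced in a single join

def tables_in_query(n_attributes, tables_per_attribute=3):
    """Base table plus the tables joined in for each attribute."""
    return 1 + tables_per_attribute * n_attributes

# Largest number of integer-type attributes that still fits under the limit.
max_attributes = (MAX_TABLES - 1) // 3

print(tables_in_query(20))  # 20 attributes sit exactly at the limit
print(tables_in_query(21))  # 21 attributes go over it
```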
So the solution I came up with was to use subqueries to overcome the 61 table limit.
One way to do it is to group the joins in subqueries of 61 joins. And then all of the groups would be joined. I think I can figure out what the sql queries should look like, but it seems difficult to write the code to generate the queries. There is a further (albeit theoretical) problem in that one could again violate the 61 table limit if there were enough attributes. In other words, if I have 62 groups of 61 tables, there will be a MySQL error. Obviously, one could get around this by then grouping the groups of groups into 61. But that just makes the code even more difficult to write and understand.
I think the solution I want is to nest subqueries within subqueries such that each subquery is using a single join of 2 tables (or one table and one subquery). Intuitively, it seems like the code would be easier to write for this kind of query. Unfortunately, thinking about what these queries should look like is making my brain hurt. That's why I need help.
What would such a MySQL query look like?
You're right that joining too many attributes through an EAV design is likely to exceed the limit of joins. Even before that, there's probably a practical limit of joins because the cost of so many joins gets higher and higher geometrically. How bad this is depends on your server's capacity, but it's likely to be quite a bit lower than 61.
So querying an EAV data model to produce a result as if it were stored in a conventional relational model (one column per attribute) is problematic.
Solution: don't do it with a join per attribute, which means you can't expect to produce the result in a conventional row-per-entity format purely with SQL.
I'm not intimately familiar with the Magento schema, but I can infer from your query that something like this might work:
SELECT cpe.entity_id
, o.attribute_id
, v.value AS option_value
FROM catalog_product_entity AS cpe
INNER JOIN catalog_product_entity_int AS i
ON cpe.entity_id = i.entity_id AND i.attribute_id IN (2,3,4)
INNER JOIN eav_attribute_option AS o
ON i.value = o.option_id AND i.attribute_id = o.attribute_id
INNER JOIN eav_attribute_option_value AS v
ON v.option_id = o.option_id;
The IN(2,3,4,...) predicate is where you specify multiple attributes. There's no need to add more joins to get more attributes. They're simply returned as rows rather than columns.
This means you have to write application code to fetch all the rows of this result set and map them into fields of a single object.
From comments by @Axel, it sounds like Magento provides helper functions to consume a result set and map it into an object.
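The application-side step could look roughly like the sketch below. It assumes the query returns one row per (entity, attribute) as (entity_id, attribute_id, option_value) and that a mapping from attribute_id to attribute_code has already been read from eav_attribute. The function pivot_rows and both input structures are hypothetical, not part of Magento.

```python
from collections import defaultdict

def pivot_rows(rows, attribute_codes):
    """Fold (entity_id, attribute_id, value) rows into one dict per entity."""
    entities = defaultdict(dict)
    for entity_id, attribute_id, value in rows:
        # Fall back to the numeric id if the attribute code is unknown.
        field = attribute_codes.get(attribute_id, str(attribute_id))
        entities[entity_id][field] = value
    return dict(entities)

# Attribute ids and option values taken from the sample tables in the question.
attribute_codes = {2: "brand", 3: "color", 4: "size"}
rows = [
    (1, 2, "Fruit of the Loom"),
    (1, 3, "Gray"),
    (1, 4, "Extra Large"),
    (2, 2, "Fruit of the Loom"),
    (2, 3, "Black"),
    (2, 4, "Large"),
]

products = pivot_rows(rows, attribute_codes)
print(products[1])
```

Adding another attribute only means another id in the IN (...) list and another entry in the mapping; the number of joins stays the same.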

Select Distinct Set Common to Subset From Join Table

Given a join table for a many-to-many relationship between booth and user:
+-----------+------------------+
| booth_id | user_id |
+-----------+------------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 5 |
| 1 | 9 |
| 2 | 1 |
| 2 | 2 |
| 2 | 5 |
| 2 | 10 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
| 3 | 4 |
| 3 | 6 |
| 3 | 11 |
+-----------+------------------+
How can I get a distinct set of booth records that are common between a subset of user ids? For example, if I am given user_id values of 1,2,3, I expect the result set to include only booth with id 3 since it is the only common booth in the join table above between all user_id's provided.
I'm hoping I'm missing a keyword in MySQL to accomplish this. The furthest I've come so far is using ... user_id = all (1,2,3), but this always returns an empty result set (I believe I understand why, though).
The SQL query for this will be:
select booth_id from table1 where user_id
in (1,2,3) group by booth_id
having count(distinct user_id) = 3
Here table1 stands for your join table and 3 is the number of user ids in the list, so only booths that contain every one of the given users are returned.
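The idea of keeping only the booths whose count of matching users equals the number of users asked for can be tried on the sample data with a short sketch. It uses an in-memory SQLite database in place of MySQL, keeps the placeholder table name table1 from the answer, and introduces a hypothetical helper, common_booths.

```python
import sqlite3

def common_booths(conn, user_ids):
    """Return booth ids that contain every one of the given user ids."""
    ids = sorted(set(user_ids))
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    sql = f"""
        SELECT booth_id
        FROM table1
        WHERE user_id IN ({placeholders})
        GROUP BY booth_id
        HAVING COUNT(DISTINCT user_id) = ?
        ORDER BY booth_id
    """
    return [row[0] for row in conn.execute(sql, (*ids, len(ids)))]

conn = sqlite3.connect(":memory:")
# Sample data copied from the question.
conn.executescript("""
    CREATE TABLE table1 (booth_id INTEGER, user_id INTEGER);
    INSERT INTO table1 VALUES
        (1, 1), (1, 2), (1, 5), (1, 9),
        (2, 1), (2, 2), (2, 5), (2, 10),
        (3, 1), (3, 2), (3, 3), (3, 4), (3, 6), (3, 11);
""")

booths = common_booths(conn, [1, 2, 3])
print(booths)
```

Comparing against the length of the supplied list, rather than against a count taken from the table, keeps the result empty when one of the requested users has no rows at all.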