complex sql query issue - mysql

I have a small SQL query, but I can't find a way to get text back, only numbers. (Revised!)
SELECT if( `linktype` = "group",
(SELECT contactgroups.grname
FROM contactgroups, groupmembers
WHERE contactgroups.id = groupmembers.id ???
AND contactgroups.id = groupmembers.link_id),
(SELECT contactmain.contact_sur
FROM contactmain, groupmembers
WHERE contactmain.id = groupmembers.id ???
AND contactmain.id = groupmembers.link_id) ) AS adat
FROM groupmembers;
I have now improved it a bit and it gives back some info, but the ??? (thanks to minitech) mark my problem. I can't see how I could fix it... Any advice is welcome! Thanks
Contactmain (id, contact_sur, email2)
data:
1 | Peter | email#email.com
2 | Andrew| email2#email.com
Contactgroups (id, grname)
data:
1 | All
2 | Trustee
3 | Comitee
Groupmembers (id, group_id, linktype, link_id)
data:
1 | 1 | contact | 1
2 | 1 | contact | 2
3 | 2 | contact | 1
4 | 3 | group | 2
I would like to list who is in the 'Comitee' group; the result should be Andrew and Trustee, if I am right :)

The join looks a bit redundant, since you are implying that both the id and link_id columns hold the same value. Because BOTH selected values are derived from a qualification against the groupmembers table, I have restructured the query to use THAT as the primary table and LEFT JOIN each of the other tables to it, anticipating from your query that the link will be found in ONE or the OTHER table. With the two LEFT JOINs you go through the groupmembers table only ONCE. Now, your IF(): since groupmembers is the basis and we have BOTH tables available and linked, we just grab the column from one table or the other. I've included "linktype" too, just for reference. STRAIGHT_JOIN tells the optimizer to join the tables in the order they are listed instead of reordering them.
SELECT STRAIGHT_JOIN
gm.linktype,
if( gm.linktype = "group", cg.grname, cm.contact_sur ) ADat
from
groupmembers gm
left join contactgroups cg
ON gm.link_id = cg.id
left join contactmain cm
ON gm.link_id = cm.id
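To check the LEFT JOIN approach against the sample data from the question, here is a minimal sketch that uses Python's built-in sqlite3 module instead of MySQL. That swap is an assumption made only to keep the example self-contained: SQLite has no IF() or STRAIGHT_JOIN, so the sketch uses the standard CASE expression, and the variable name `rows` is purely illustrative.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contactmain (id INTEGER PRIMARY KEY, contact_sur TEXT, email2 TEXT);
CREATE TABLE contactgroups (id INTEGER PRIMARY KEY, grname TEXT);
CREATE TABLE groupmembers (id INTEGER PRIMARY KEY, group_id INTEGER,
                           linktype TEXT, link_id INTEGER);
INSERT INTO contactmain VALUES (1, 'Peter', 'email#email.com'),
                               (2, 'Andrew', 'email2#email.com');
INSERT INTO contactgroups VALUES (1, 'All'), (2, 'Trustee'), (3, 'Comitee');
INSERT INTO groupmembers VALUES (1, 1, 'contact', 1), (2, 1, 'contact', 2),
                                (3, 2, 'contact', 1), (4, 3, 'group', 2);
""")

# groupmembers is the driving table; each lookup table is LEFT JOINed on link_id
# and CASE (standing in for MySQL's IF) picks the column that matches linktype.
rows = conn.execute("""
SELECT gm.id, gm.linktype,
       CASE WHEN gm.linktype = 'group' THEN cg.grname
            ELSE cm.contact_sur END AS adat
FROM groupmembers gm
LEFT JOIN contactgroups cg ON gm.link_id = cg.id
LEFT JOIN contactmain cm ON gm.link_id = cm.id
ORDER BY gm.id
""").fetchall()

for row in rows:
    print(row)
```

Adding a condition such as WHERE gm.group_id = 3 would restrict the listing to the members of a single group.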

If contactgroups.id must equal groupmembers.id but must also equal 2, that's redundant and also probably where your problem is. It works fine as you've written it: http://ideone.com/7EGLZ so without knowing what it's actually supposed to do I can't help more.
EDIT: I'm unfamiliar with the comma-separated FROM, but it gives the same result since you don't select anything from the other table so it doesn't really matter.

Related

SQL LEFT JOIN "JOIN expression is not supported"

This question relates to one I raised here SQL help in finding missing operations on the Sparx website. They suggested I try StackOverflow so here I am. I had tried to do a lot of research on the problem before posting on the Sparx website. I have slightly tweaked it to make it more of a general SQL issue than fixed to the tool that I am using.
I am not an SQL guru so please be kind to me!
The situation
I have 2 tables with the following columns
Table t_object
--------------
Object_ID AutoNumber
Object_Type Text
Name Text
ParentID Number
Table t_operation
-----------------
OperationID AutoNumber
Object_ID Number
Name Text
The table t_object contains lots of different types of items. The Object_ID is the unique key. The items that I am interested in are those where Object_Type = 'class' OR 'activity' OR 'activityparameter'. The Name is the name of the item. The ParentID is only applicable to 'activityparameter' items and is the Object_ID of the 'activity' that the 'activityparameter' belongs to.
The table t_operation contains all operations belonging to a class. The OperationID is the unique key. The Object_ID is how this operation is linked to its class in t_object.
The problem
In our system all classes have an equally named activity and all operations belonging to a class have an equally named activityparameter belonging to the activity.
I am trying to find erroneous entries where operation MyOp in class MyClass does not have an equally named activityparameter MyOp in activity MyClass.
Using the test data below
t_object
+-----------+-------------------+-------+----------+
| Object_ID | Object_Type | Name | ParentID |
+-----------+-------------------+-------+----------+
| 1 | Class | c1 | 0 |
| 2 | Class | c2 | 0 |
| 3 | Activity | c1 | 0 |
| 4 | Activity | c2 | 0 |
| 5 | ActivityParameter | MyOp1 | 3 |
| 6 | ActivityParameter | MyOp2 | 3 |
| 7 | ActivityParameter | MyOp3 | 3 |
| 8 | ActivityParameter | MyOp1 | 4 |
| 9 | ActivityParameter | MyOp2 | 4 |
+-----------+-------------------+-------+----------+
t_operation
+-------------+-----------+-------+
| OperationID | Object_ID | Name |
+-------------+-----------+-------+
| 1 | 1 | MyOp1 |
| 2 | 1 | MyOp2 |
| 3 | 2 | MyOp1 |
| 4 | 2 | MyOp2 |
| 5 | 2 | MyOp3 |
| 6 | 2 | MyOp4 |
+-------------+-----------+-------+
The above tables represent the following
Operation c1::MyOp1 (from class c1)
Operation c1::MyOp2 (from class c1)
Operation c2::MyOp1 (from class c2)
Operation c2::MyOp2 (from class c2)
Operation c2::MyOp3 (from class c2)
Operation c2::MyOp4 (from class c2)
Activity parameter c1::MyOp1 (from activity c1)
Activity parameter c1::MyOp2 (from activity c1)
Activity parameter c1::MyOp3 (from activity c1)
Activity parameter c2::MyOp1 (from activity c2)
Activity parameter c2::MyOp2 (from activity c2)
We can see the following errors
Operation c2::MyOp3 has no equivalent activity parameter
Operation c2::MyOp4 has no equivalent activity parameter
Activity parameter c1::MyOp3 has no equivalent operation
For the purpose of this question I am not interested in the final error. When I get the SQL query for "operation class::operation has no equivalent activity parameter" then I will have the logic to do the reverse.
I tried the SQL query (I am using Access and also MySQL). Note, text searches are case insensitive. The 2 IDs at the end of the SELECT should be equal if the activity parameter belongs to the mentioned activity. If they are different then the returned activity parameter belongs to a different activity. This acts as a quick cross-check.
SELECT o_class.name, o_operation.name, o_activity.name, o_actparam.name, o_activity.object_ID AS "activity ID", o_actparam.parentID AS "belongs to activity ID"
FROM
(((
t_object o_class
INNER JOIN t_object o_activity ON
( o_activity.name = o_class.name
AND
o_class.object_type = 'class'
AND
o_activity.object_type = 'activity'
)
)
INNER JOIN t_operation o_operation ON o_operation.object_id = o_class.object_id)
LEFT JOIN t_object o_actparam ON
( o_actparam.name = o_operation.name
AND
o_actparam.object_type = 'activityparameter'
AND
o_actparam.parentid = o_activity.object_id
)
)
WHERE
o_actparam.name is NULL
ORDER BY
o_class.name, o_operation.name, o_activity.name, o_actparam.name, o_activity.object_ID, o_actparam.parentID
The above aims to get a class, then an activity with the same name, then all operations belonging to the class, then for each operation try and find an activity parameter in this activity with the same name. Any that didn't match should return NULL (since it is a LEFT JOIN) and so the WHERE statement shows the operations that didn't have a related activityparameter, i.e. the errors.
The above does not work; I get a "JOIN expression not supported".
If I take out the "o_actparam.parentid = o_activity.object_id" then it returns no results at all. This is clearly wrong. I believe this is because the LEFT JOIN matches on the first expression, i.e. "o_actparam.name = o_operation.name", then applies any other expressions to that result. So it returns 10 rows (one NULL for c2::MyOp4) but the 2nd expression (o_actparam.object_type = 'activityparameter') then throws away the NULL (c2::MyOp4). Then all 9 results are thrown away by the WHERE clause.
If I change the LEFT JOIN to
LEFT JOIN t_object o_actparam ON
( o_actparam.name = o_operation.name
AND
( o_actparam.object_type = 'activityparameter'
OR
o_actparam.object_type is NULL
)
)
)
then I get the result
c2::MyOp4
It has failed to find c2::MyOp3. This is because operation c2::MyOp3 matches activityparameter c1::MyOp3 (same activity parameter name even though it belongs to the wrong class/activity c1). The LEFT JOIN comparison ignores the class/activity. Remember, if I put in the 'o_actparam.parentid = o_activity.object_id' check then I get "JOIN expression not supported".
If I change the WHERE (and keep the above LEFT JOIN) to
WHERE
o_actparam.parentid <> o_activity.object_id
OR
o_actparam.parentid is NULL
then I get the result
c1::MyOp1
c1::MyOp2
c2::MyOp1
c2::MyOp2
c2::MyOp3
c2::MyOp4
It is now finding lots of wrong items since the same activityparameter name exists in different activities. The WHERE is too late to throw away items, however, putting the parentid in the LEFT JOIN gave me the "JOIN expression is not supported" error.
In my research I noticed that I could concatenate the expressions in an ON clause, e.g.
LEFT JOIN t_object o_actparam ON
o_actparam.name + o_actparam.object_type + o_actparam.parentid = o_operation.name + 'activityparameter' + o_activity.object_id
)
The idea here is that if the LEFT JOIN works on the first expression then filters on any subsequent ones then putting them all as one would do what I wanted. It also has the benefit of not requiring "OR xxx is NULL" all over the place. It also reinforces what I am looking for, the combination of name, object type and parent. Naturally the above didn't work (or I wouldn't be asking here); it still gave the same "JOIN expression not supported". Again, removing the "parentID" aspect gave wrong results.
I hope I have given a detailed situation, test cases, expected answers, my reasonings and my research to show this is not just a "I couldn't be bothered to work it out myself, please help". I have spent days googling and trying out SQL but, as I said, I am not an SQL expert.
Is someone able to help me here?
Thanks
Darren
I have found the answer so am posting it here in case someone finds it useful. Thanks Sandra for your help.
The answer is that my problem LEFT JOIN was trying to access entries in earlier tables but it couldn't see them because I referred to 2 different tables. I am not an SQL expert so I don't know the correct terminology or exactly why.
Anyway, the solution is to place all previous tables into a subquery so that it then gets a new single table name.
Note, since I am using Access, I have been told that the parentheses after the ON part of the LEFT JOIN are essential since I am joining on a constant value ('activityparameter').
The changes needed were to insert
(SELECT * FROM
before the first 2 tables. Note that this required me to break the single line with 3 parentheses into 2 lines as the SELECT needed to go after the first parenthesis.
After that add an extra
) AS q1
after the 2 tables. In my final answer below I've indented this whole section and put a blank line before and after for clarity.
Then in my LEFT JOIN I refer to the old tables by adding q1. at the start of each reference.
The following query works perfectly and gives the results that I expected.
SELECT o_class.name, o_operation.name, o_activity.name, o_actparam.name, o_activity.object_ID AS "activity ID", o_actparam.parentID AS "belongs to activity ID"
FROM
(
(SELECT * FROM
((
t_object o_class
INNER JOIN t_object o_activity ON
( o_activity.name = o_class.name
AND
o_class.object_type = 'class'
AND
o_activity.object_type = 'activity'
)
)
INNER JOIN t_operation o_operation ON o_operation.object_id = o_class.object_id)
) AS q1
LEFT JOIN t_object o_actparam ON
( o_actparam.name = q1.o_operation.name
AND
o_actparam.object_type = 'activityparameter'
AND
o_actparam.parentid = q1.o_activity.object_id
)
)
WHERE
o_actparam.name is NULL
ORDER BY
o_class.name, o_operation.name, o_activity.name, o_actparam.name, o_activity.object_ID, o_actparam.parentID
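The underlying anti-join idea (LEFT JOIN to the activity parameters, then keep only the rows where nothing matched) can be checked against the test data with a minimal sketch using Python's built-in sqlite3 module. This is an assumption-laden illustration, not the Access query: SQLite accepts an ON clause that refers to several earlier tables, so no q1 subquery is needed here, LOWER() is used to mimic the case-insensitive comparisons mentioned in the question, and the short aliases and the `missing` variable are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_object (Object_ID INTEGER PRIMARY KEY, Object_Type TEXT,
                       Name TEXT, ParentID INTEGER);
CREATE TABLE t_operation (OperationID INTEGER PRIMARY KEY, Object_ID INTEGER,
                          Name TEXT);
INSERT INTO t_object VALUES
  (1, 'Class', 'c1', 0), (2, 'Class', 'c2', 0),
  (3, 'Activity', 'c1', 0), (4, 'Activity', 'c2', 0),
  (5, 'ActivityParameter', 'MyOp1', 3), (6, 'ActivityParameter', 'MyOp2', 3),
  (7, 'ActivityParameter', 'MyOp3', 3), (8, 'ActivityParameter', 'MyOp1', 4),
  (9, 'ActivityParameter', 'MyOp2', 4);
INSERT INTO t_operation VALUES
  (1, 1, 'MyOp1'), (2, 1, 'MyOp2'), (3, 2, 'MyOp1'),
  (4, 2, 'MyOp2'), (5, 2, 'MyOp3'), (6, 2, 'MyOp4');
""")

# class -> activity with the same name -> operations of the class ->
# LEFT JOIN to the activity parameter with the same name in that activity.
# Rows with no matching parameter have NULLs on the right side of the join.
missing = conn.execute("""
SELECT c.Name, op.Name
FROM t_object c
JOIN t_object a
  ON a.Name = c.Name AND LOWER(a.Object_Type) = 'activity'
JOIN t_operation op
  ON op.Object_ID = c.Object_ID
LEFT JOIN t_object ap
  ON ap.Name = op.Name
 AND LOWER(ap.Object_Type) = 'activityparameter'
 AND ap.ParentID = a.Object_ID
WHERE LOWER(c.Object_Type) = 'class'
  AND ap.Object_ID IS NULL
ORDER BY c.Name, op.Name
""").fetchall()

for class_name, op_name in missing:
    print(f"{class_name}::{op_name} has no equivalent activity parameter")
```

On the test data this reports c2::MyOp3 and c2::MyOp4, the two operation errors listed in the question.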

MySQL Query WHERE through multiple pivot tables

products
+----+--------+
| id | title |
+----+--------+
| 1 | Apple |
| 2 | Pear |
| 3 | Banana |
| 4 | Tomato |
+----+--------+
product_variants
+----+------------+------------+
| id | product_id | is_default |
+----+------------+------------+
| 1 | 1 | 0 |
| 2 | 1 | 1 |
| 3 | 2 | 1 |
| 4 | 3 | 1 |
| 5 | 4 | 1 |
+----+------------+------------+
properties
+----+-----------------+-----------+
| id | property_key_id | value |
+----+-----------------+-----------+
| 1 | 1 | Yellow |
| 2 | 1 | Green |
| 3 | 1 | Red |
| 4 | 2 | Fruit |
| 5 | 2 | Vegetable |
| 6 | 1 | Blue |
+----+-----------------+-----------+
property_keys
+----+-------+
| id | value |
+----+-------+
| 1 | Color |
| 2 | Type |
+----+-------+
product_has_properties
+----+------------+-------------+
| id | product_id | property_id |
+----+------------+-------------+
| 1 | 1 | 4 |
| 2 | 1 | 3 |
| 3 | 2 | 4 |
| 4 | 3 | 4 |
| 5 | 3 | 4 |
| 6 | 4 | 4 |
| 7 | 4 | 5 |
+----+------------+-------------+
product_variant_has_properties
+----+------------+-------------+
| id | variant_id | property_id |
+----+------------+-------------+
| 1 | 1 | 2 |
| 2 | 1 | 3 |
| 3 | 2 | 6 |
| 4 | 3 | 4 |
| 5 | 4 | 1 |
| 6 | 5 | 1 |
+----+------------+-------------+
I need to query my DB so it selects products which have certain properties attached to the product itself OR have those properties attached to one of its related product_variants. Also, properties with the same properties.property_key_id should be grouped like this: (pkey1='red' OR pkey1='blue') AND (pkey2='fruit' OR pkey2='vegetable')
Example cases:
Select all products with (color='red' AND type='vegetable'). This should return only Tomato.
Select all products with ((color='red' OR color='yellow') AND type='fruit') should return Apple and Banana
Please note that in the example cases above I don't really need to query by properties.value, I can query by properties.id.
I played around a lot with MySQL queries, but the biggest problem I'm struggling with is the properties being loaded through two pivot tables. Loading them is no problem, but loading them and combining them with the correct WHERE, AND and OR statements is.
The following code should give you what you're looking for, however you should note that your table currently has a Tomato listed as yellow and a vegetable. Obviously you want the Tomato as red and a Tomato is actually a fruit not a vegetable:
Select distinct title
from products p
inner join
product_variants pv on pv.product_id = p.id
inner join
product_variant_has_properties pvp on pvp.variant_id = pv.id
inner join
product_has_properties php on php.product_id = p.id
inner join
properties ps1 on ps1.id = pvp.property_id -- Color
inner join
properties ps2 on ps2.id = php.property_id -- Type
inner join
property_keys pk on pk.id = ps1.property_key_id or pk.id = ps2.property_key_id
where ps1.value = 'Red' and ps2.value = 'Vegetable'
Here is the SQL Fiddle: http://www.sqlfiddle.com/#!9/309ad/3/0
This is a convoluted answer, and it may be possible to do it in a far simpler way. However given that you seem to want to be able to query by color = xx and type = xx, we clearly need to have columns with those names, which as you've intimated, means we need to pivot the data.
Furthermore, since we want to get all the combinations of colours and types for each product, we need to perform a sort of cross join, to combine them.
This leads us to the query - first we get all the types for a product and its variants, then we join that to all the colours for a product and its variant. We use union to combine the product and variant properties in order to keep them all in the same column, rather than having multiple columns to check.
Of course all products may not have this information specified, so we use left joins all the way through. If it is guaranteed that a product will always have at least one colour, and at least one type - they can all be changed to inner joins.
Also, in your example you say tomato should have a colour of red, yet in the sample data you provide I'm sure the tomato has a colour of yellow.
Anyway, here's the query:
select distinct title from
(select q1.title, q1.value as color, q2.value as type from
(
select products.id, products.title, properties.value, properties.property_key_id
from products
left join product_has_properties
on products.id = product_has_properties.product_id
left join properties
on properties.id = product_has_properties.property_id and properties.property_key_id = 1
union
select product_variants.product_id, products.title, properties.value, properties.property_key_id
from product_variants
inner join products
on product_variants.product_id = products.id
left join product_variant_has_properties
on product_variants.id = product_variant_has_properties.variant_id
left join properties
on properties.id = product_variant_has_properties.property_id and properties.property_key_id = 1
) q1
left join
(
select products.id, products.title, properties.value, properties.property_key_id
from products
left join product_has_properties
on products.id = product_has_properties.product_id
left join properties
on properties.id = product_has_properties.property_id and properties.property_key_id = 2
union
select product_variants.product_id, products.title, properties.value, properties.property_key_id
from product_variants
inner join products
on product_variants.product_id = products.id
left join product_variant_has_properties
on product_variants.id = product_variant_has_properties.variant_id
left join properties
on properties.id = product_variant_has_properties.property_id and properties.property_key_id = 2
) q2
on q1.id = q2.id
where q1.value is not null or q2.value is not null
) main
where ((color = 'red' or color = 'yellow') and type = 'fruit')
And here's a demo: http://sqlfiddle.com/#!9/d3ded/76
If you were to get more types of property, in addition to colour and type, the query would need to be modified. Sorry, but that's pretty much what you're stuck with when trying to pivot in MySQL.
I think that you make unnecessary complications for your data model, your code and your queries.
Those eventually will be a performance killer for your application.
Your best solution is to consider an easier approach.
Try to flatten your data structure so you will not have such dependencies.
I don't know what exactly product_variants mean so I can't tell exactly how to do the change.
But the main idea is to save the properties always for each variant.
When you have only 1 variant - define it as a variant too.
And I suggest you make the properties table reference the exact variant, instead of having global numbering with referencing tables, in a structure like this:
+----+-----------------+-------------+-----------+
| id | property_key_id | variant_id| value |
+----+-----------------+-------------+-----------+
| 1 | 1 | 1 | Yellow |
| 2 | 1 | 1 | Green |
| 3 | 1 | 1 | Red |
| 4 | 2 | 1 | Fruit |
| 5 | 2 | 2 | Vegetable |
| 6 | 1 | 2 | Blue |
| 7 | 1 | 2 | Yellow |
+----+-----------------+-------------+-----------+
With this approach you will have duplicate values, but all your queries will be simpler and you will have the freedom to save the values that you want for each specific product variant.
UPDATE
If you have no option to change the structure of the data, "LEFT OUTER JOIN" is your only hope.
Check the below query that selects the ones with color 'Yellow'
select p.* from products p
left outer join product_has_properties pp
on p.id=pp.product_id
left outer join product_variants v
on p.id=v.product_id
left outer join product_variant_has_properties vp
on v.id = vp.variant_id
where vp.property_id=1 or pp.property_id=1;
Considering products and not variants, you can simulate this (at least to some extent) with joins so that you
substitute each OR in your query with an equivalent condition in the WHERE clause. E.g. to have (color='red' OR color='yellow'),
SELECT product_id FROM product_has_properties
WHERE property_id IN (1, 3)
substitute each AND in your query with a self-join and a condition in the WHERE clause. This should yield rows that correspond to products that have the pair of properties in question. E.g. to have (color='red' AND type='vegetable'),
SELECT p1.product_id
FROM product_has_properties p1
INNER JOIN product_has_properties p2 ON (p1.product_id = p2.product_id)
WHERE p1.property_id = 3 AND p2.property_id = 5
Obviously this gets complicated as the number of conditions grows. To get ((color='red' OR color='yellow') AND type='fruit'), you would need to do
SELECT p1.product_id
FROM product_has_properties p1
INNER JOIN product_has_properties p2 ON (p1.product_id = p2.product_id)
WHERE (p1.property_id = 1 OR p1.property_id = 3) AND p2.property_id = 4
Assuming that some fruit could be both blue and red, to get pkey1='red' AND pkey1='blue' AND pkey2='fruit', you'd have to do
SELECT p1.product_id
FROM product_has_properties p1
INNER JOIN product_has_properties p2 ON (p1.product_id = p2.product_id)
INNER JOIN product_has_properties p3 ON (p1.product_id = p3.product_id)
WHERE p1.property_id = 3 AND p2.property_id = 6 AND p3.property_id = 4
There might be some case which isn't covered by this approach, though.
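The substitution rules above (an IN list for each OR group, one more self-join for each AND) are mechanical enough to generate. Below is a minimal sketch in Python with the built-in sqlite3 module standing in for MySQL; the function name `products_matching` and its `groups` argument are hypothetical, and, like the queries above, it looks only at product_has_properties and ignores variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_has_properties (id INTEGER PRIMARY KEY,
                                     product_id INTEGER, property_id INTEGER);
INSERT INTO product_has_properties VALUES
  (1, 1, 4), (2, 1, 3), (3, 2, 4), (4, 3, 4), (5, 3, 4), (6, 4, 4), (7, 4, 5);
""")

def products_matching(conn, groups):
    """Return product ids matching every group of property ids.

    groups is a non-empty list of non-empty lists: ids inside one list are
    ORed (IN list), and the lists themselves are ANDed (one self-join each).
    """
    joins, conditions, params = [], [], []
    for i, ids in enumerate(groups):
        alias = f"p{i}"
        if i > 0:
            # Each additional AND group needs its own copy of the table.
            joins.append(
                f"INNER JOIN product_has_properties {alias} "
                f"ON {alias}.product_id = p0.product_id"
            )
        placeholders = ", ".join("?" for _ in ids)
        conditions.append(f"{alias}.property_id IN ({placeholders})")
        params.extend(ids)
    sql = (
        "SELECT DISTINCT p0.product_id FROM product_has_properties p0 "
        + " ".join(joins)
        + " WHERE " + " AND ".join(conditions)
        + " ORDER BY p0.product_id"
    )
    return [row[0] for row in conn.execute(sql, params)]

# (color='red' OR color='yellow') AND type='fruit' -> property ids (1 or 3) and 4
red_or_yellow_fruit = products_matching(conn, [[1, 3], [4]])
# color='red' AND type='vegetable' -> property ids 3 and 5
red_vegetable = products_matching(conn, [[3], [5]])
print(red_or_yellow_fruit, red_vegetable)
```

With the sample rows only product 1 (Apple) carries both a matching colour and the fruit type at product level, and no product carries both red and vegetable.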
Short answer
I'm going to throw out a bit of a different answer to the ones you've been getting. While it is very possible to have a purely SQL answer to this, the question I would pose to you is: Why?
That answer will determine your next step.
If your answer is to try to learn the pure SQL way to do it, there are some great answers here which get you most if not all of the way there.
If your answer is to create scalable dynamic queries for an end application, then you may find your job eased by leaning on your programming language.
A little personal background
I had a requirement to pivot data with more tables. I was determined I'd try to do this the best possible way, and I spent a lot of time working out what was best for my application. Knowing full well this may not be the same experience you have, I will share my experience here in case it helps you.
I tried to create pure SQL solutions, which did work for specific use cases but required extensive tweaking for each additional use case. When I tried to scale the queries up I first attempted to create Stored Procedures. That was a nightmare and pretty early on in my development I realized it would be a headache to maintain.
I went on to use PHP and create my own query generating. While some of this code has morphed into something that is quite useful for me today, I learned that much of it was going to be challenging to maintain unless I created service libraries. At that point, I realized I was basically going to be creating an Object-relational Mapper (ORM). Unless my application was SO special and SO unique that no ORM on the market could come close to doing what I wanted, then I needed to take that opportunity to explore employing an ORM for my application. Despite my initial reservations which caused me to do everything BUT look at an ORM, I have started using one and it helped my development speed increase significantly.
Reaching your desired end result
Select all products with (color='red' AND type='vegetable'). This should return only Tomato.
Select all products with ((color='red' OR color='yellow') AND type='fruit') should return Apple and Banana
This is possible in an ORM. What you're describing is only loosely defined in your SQL but is in fact perfectly summarized in OOP. This is what it would look like in PHP, just as an example.
<?php
abstract class AbstractProductType {
public function __construct() {
}
}
class Color extends AbstractProductType {
}
class Yellow extends Color {
}
class Red extends Color {
}
class Type extends AbstractProductType {
}
class Vegetable extends Type {
}
class Fruit extends Type {
}
class Product {
public function setColor(Color $color) {
//
}
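public function setType(Type $type) {
//
}
public function find() {
// hypothetical: build and run the query from the color and type set above
}
}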
$product = new Product();
$product->setColor(new Red());
$product->setType(new Fruit());
$result = $product->find();
?>
The idea behind this is that you can make full use of SQL in object oriented programming.
A slightly lower-key version of this would be to create a class which generates SQL snippets. My personal experience was that that's a lot of work for a limited payback. If your project is going to remain relatively small, it may work out just fine. However, if you anticipate that your project will grow, then an ORM may well be worth exploring.
Conclusion
Although I am not sure what language you will be utilizing to query and manipulate your data, there are great ORMs out there which should not be discounted. Despite their many cons (you can find a lot of debate about this all over the internet), I am a reluctant believer that, although certainly not ideal for all situations, they should be considered for some. If this is not one of those situations for you, be prepared to write lots of JOINs yourself. When referencing a table n times and requiring a reference back to the table, the only method I am aware of to add a reference is to create n JOINs.
I'll be very interested to see if there is a better way, of course!
Conditional Aggregation
You can use conditional aggregation in your having clause to see if a product has specific properties. For example, to query all products that have both the "type vegetable" and "color red" properties.
You have to group by both the product id and the product variant id in order to make sure that all the properties you're searching for exist on the same variant or the product itself.
select p.id, pv.id from products p
left join product_has_properties php on php.product_id = p.id
left join properties pr on pr.id = php.property_id
left join property_keys pk on pk.id = pr.property_key_id
left join product_variants pv on pv.product_id = p.id
left join product_variant_has_properties pvhp on pvhp.variant_id = pv.id
left join properties pr2 on pr2.id = pvhp.property_id
left join property_keys pk2 on pk2.id = pr2.property_key_id
group by p.id, pv.id
having (
count(case when pk.value = 'Color' and pr.value = 'Red' then 1 end) > 0
and count(case when pk.value = 'Type' and pr.value = 'Vegetable' then 1 end) > 0
) or (
count(case when pk2.value = 'Color' and pr2.value = 'Red' then 1 end) > 0
and count(case when pk2.value = 'Type' and pr2.value = 'Vegetable' then 1 end) > 0
)
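As a runnable variation on conditional aggregation, the sketch below uses Python's built-in sqlite3 module in place of MySQL. It rests on an assumption that differs from the query above: product-level and variant-level properties are pooled per product with a UNION, so a product matches if the required values appear anywhere on the product or any of its variants. The function name `find_titles` and its parameters are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE product_variants (id INTEGER PRIMARY KEY, product_id INTEGER,
                               is_default INTEGER);
CREATE TABLE property_keys (id INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE properties (id INTEGER PRIMARY KEY, property_key_id INTEGER,
                         value TEXT);
CREATE TABLE product_has_properties (id INTEGER PRIMARY KEY,
                                     product_id INTEGER, property_id INTEGER);
CREATE TABLE product_variant_has_properties (id INTEGER PRIMARY KEY,
                                             variant_id INTEGER,
                                             property_id INTEGER);
INSERT INTO products VALUES (1,'Apple'),(2,'Pear'),(3,'Banana'),(4,'Tomato');
INSERT INTO product_variants VALUES (1,1,0),(2,1,1),(3,2,1),(4,3,1),(5,4,1);
INSERT INTO property_keys VALUES (1,'Color'),(2,'Type');
INSERT INTO properties VALUES (1,1,'Yellow'),(2,1,'Green'),(3,1,'Red'),
                              (4,2,'Fruit'),(5,2,'Vegetable'),(6,1,'Blue');
INSERT INTO product_has_properties VALUES
  (1,1,4),(2,1,3),(3,2,4),(4,3,4),(5,3,4),(6,4,4),(7,4,5);
INSERT INTO product_variant_has_properties VALUES
  (1,1,2),(2,1,3),(3,2,6),(4,3,4),(5,4,1),(6,5,1);
""")

def find_titles(colors, types):
    """Titles of products having any of `colors` AND any of `types`."""
    color_marks = ", ".join("?" for _ in colors)
    type_marks = ", ".join("?" for _ in types)
    sql = f"""
    SELECT p.title
    FROM products p
    JOIN (
        -- pool the properties of the product and of all its variants
        SELECT product_id, property_id FROM product_has_properties
        UNION
        SELECT pv.product_id, pvhp.property_id
        FROM product_variants pv
        JOIN product_variant_has_properties pvhp ON pvhp.variant_id = pv.id
    ) ap ON ap.product_id = p.id
    JOIN properties pr ON pr.id = ap.property_id
    JOIN property_keys pk ON pk.id = pr.property_key_id
    GROUP BY p.id, p.title
    HAVING SUM(CASE WHEN pk.value = 'Color' AND pr.value IN ({color_marks})
                    THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN pk.value = 'Type' AND pr.value IN ({type_marks})
                    THEN 1 ELSE 0 END) > 0
    ORDER BY p.title
    """
    return [row[0] for row in conn.execute(sql, list(colors) + list(types))]

print(find_titles(["Red", "Yellow"], ["Fruit"]))
print(find_titles(["Red"], ["Vegetable"]))
```

Because the posted sample data gives Tomato a yellow variant and both the fruit and vegetable types, the first call returns Apple, Banana and Tomato and the second returns nothing, which differs from the results the question expects.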
What was the question? (I read through the post several times, and I'm still failing to see any actual question that is being asked.) A lot of the answers here seem to be answering the question "What SQL statement would return a result from these tables?" My answer doesn't provide an example or a "how to" guide to writing SQL. My answer addresses a fundamentally different question.
The difficulty that OP is experiencing writing SQL against the tables shown in the "question" is due to (what I refer to as) the "impedance mismatch" between the "Relational" model and the "Entity-Attribute-Value" (EAV) model.
SQL is designed to work with the "Relational" model. Each instance of an entity is represented as a tuple, stored as a row in a table. The attributes of an entity are stored as values in columns of the entity row.
The EAV model differs significantly from the Relational model. It moves attribute values off of the entity row, and moves them into multiple, separate rows in other tables. And that makes writing queries more complicated, if the queries are attempting to emulate queries against a "Relational" model by transforming the data from the "EAV" representation back into a "Relational" representation.
There are a couple of approaches to writing SQL queries against the EAV model that emulate the results returned from a Relational model (as demonstrated by the example SQL provided in other answers to this "question").
One approach is to use subqueries in the SELECT list to return values of attributes as columns in the entity row.
Another approach is to join the row in the entity table to the rows in the attribute table(s), use a GROUP BY to collapse the rows, and in the SELECT list use conditional expressions to "pick out" the value to be returned for a column.
There's lots of examples of both of those approaches. And neither is really better than the other, the suitability of each approach really depends on the particular use case.
While it is possible to write SQL queries against the EAV-style tables shown, those queries are an order of magnitude more complicated than equivalent queries against data stored in a "relational" model.
A result returned by a trivial query in the relational model, e.g.
SELECT p.id
FROM product p
WHERE p.color = 'red'
To return that same set from data in the EAV model requires a much more complex SQL query, involving joins of several tables and/or subqueries.
And once we move beyond the trivial query, to a query where we want to return attributes from multiple related entities... as a simple example, return information about orders in the past 30 days for products that were 'red'
SELECT c.customer_name
, c.address
, o.order_date
, p.product_name
, l.qty
FROM customer c
JOIN `order` o ON ...
JOIN line_item l ON ...
JOIN product p ON ...
WHERE p.color = 'red'
AND o.order_date >= DATE(NOW()) - INTERVAL 30 DAY
getting that same result, using SQL, from the EAV model is way more convoluted and confusing, and can be an excruciating exercise in frustration.
Certainly, it's possible to write the SQL. And once we do manage to get SQL statements that work to return a "correct" resultset, when the number of rows in the tables scale up beyond the trivial demonstration, up to the kind of volumes we expect databases to handle... the performance of those queries is horrendous (as compared to queries returning the same results from a traditional Relational model).
(And we've not even touched on the additional complexity for just adding and updating the attributes of entities, enforcing referential integrity between entities, etc.)
But why would we want to do that? Why do we need (or want) to write SQL statements against the EAV model tables that emulate the results returned from queries against Relational model tables?
Bottom line, if we are going to use an EAV model, we are much better off not attempting to use a single SQL statement to return results like we'd get back from a query of a "Relational" model.
The problem of retrieving information from the EAV model is much better suited to a programming language that is object-oriented and provides a framework, something that is entirely lacking in SQL.

Collating data from two tables

I'm using the following statement to try to collect and display data correctly. It is necessary to do a 'LEFT JOIN' with one table to collect more information, but I should say that it's not necessary to do this for the second case (but such is my work-around).
SELECT
COALESCE(building.campus_id, campus.campus_id) AS campus,
member.*
FROM location
LEFT JOIN cu_member AS member ON
(member.member_id = location.member_id)
LEFT JOIN cu_building AS building ON
(location.params LIKE 'building_id=%' = building.id)
LEFT JOIN cu_campus AS campus ON
(location.params LIKE 'campus_id=%' = campus.id)
In my query above, I want to use the value matched by the wildcard, as if the join were written like this:
LEFT JOIN cu_building AS building ON
('39' = building.id)
Below is how my location table looks. I'm trying to use the data from the params column to get the resulting campus from another table (building). I only need to do this for fields containing the building_id tag, not for those with 'campus_id', because that is already known.
-----------------------------
member_id | params
-----------------------------
1 | building_id=39
2 | building_id=24
3 | campus_id=6
4 | campus_id=3
5 | building_id=11
6 | campus_id=14
7 | building_id=15
This is how my building table looks. It lists which building is part of which campus.
--------------------------
building_id | campus_id
--------------------------
39 | 5
24 | 4
11 | 2
15 | 2
I have another table named 'campus'. My problem is that this table only lists the campuses, but I was hoping to use it to display the correct data in the final result.
--------------------------
campus_id | name
--------------------------
6 | ...
3 | ..
14 | .
The result I want to achieve with the MySQL query is this. Here, the collected results are shown in one table.
-----------------------------
member_id | campus
-----------------------------
1 | 5 (building_id=39)
2 | 4 (building_id=24)
3 | 6 (campus_id=6)
4 | 3 (campus_id=3)
5 | 2 (building_id=11)
6 | 14 (campus_id=14)
7 | 2 (building_id=15)
First things first. You really need to revise your database structure. That location table will give you an infinite number of problems, complicating each and every query you ever need to join members with buildings and campuses. As per the data shown, you really should have building_id and campus_id on the member table. Another, softer, solution would be to have a building_id and a campus_id column in the location table.
That said, you need neither regexes nor LIKE to make this query work. Something like this should work adequately:
SELECT
COALESCE(b.campus_id, c.id) AS campus,
m.*
FROM cu_member m
JOIN location l ON l.member_id = m.member_id
LEFT JOIN cu_building b ON l.params = CONCAT('building_id=',b.id)
LEFT JOIN cu_campus c ON l.params = CONCAT('campus_id=',c.id)
I can see that the last join seems a bit redundant, since you really only need the campus id, and not the name or other info. That already resides in the location table, so it would seem unnecessary to JOIN the campus table. The problem is that it is embedded in the params column. Extracting it is a mess.
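To check the idea against the sample data from the question, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. It assumes the key columns are named id, as in the query above, uses made-up campus names, and leaves out the member table. SQLite's || operator plays the role of CONCAT().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (member_id INTEGER, params TEXT);
CREATE TABLE cu_building (id INTEGER, campus_id INTEGER);
CREATE TABLE cu_campus (id INTEGER, name TEXT);
INSERT INTO location VALUES
  (1,'building_id=39'),(2,'building_id=24'),(3,'campus_id=6'),
  (4,'campus_id=3'),(5,'building_id=11'),(6,'campus_id=14'),
  (7,'building_id=15');
INSERT INTO cu_building VALUES (39,5),(24,4),(11,2),(15,2);
-- Campus names are placeholders; the question does not show them.
INSERT INTO cu_campus VALUES (6,'A'),(3,'B'),(14,'C');
""")

# Each location row matches at most one of the two LEFT JOINs,
# so COALESCE picks whichever campus id was found.
rows = conn.execute("""
SELECT l.member_id, COALESCE(b.campus_id, c.id) AS campus
FROM location l
LEFT JOIN cu_building b ON l.params = 'building_id=' || b.id
LEFT JOIN cu_campus   c ON l.params = 'campus_id='   || c.id
ORDER BY l.member_id
""").fetchall()
```

With this data, rows pairs every member_id with the campus listed in the desired result table of the question.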
In the example you provide, it would be better to expand your params column into a building_id and a campus_id column.
Joining this then becomes very easy.
SELECT
COALESCE(building.campus_id, campus.campus_id) AS campus,
member.*
FROM location
LEFT JOIN cu_member AS member ON (member.member_id = location.member_id)
LEFT JOIN cu_building AS building ON (location.building_id = building.building_id)
LEFT JOIN cu_campus AS campus ON (location.campus_id = campus.campus_id)
If there are other reasons that make creating separate columns difficult, then use the code provided by @Frazz.
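If you do go the separate-columns route, the existing params values have to be split once. The following is only a sketch of such a one-off migration, again run through Python's sqlite3 rather than MySQL; it assumes every params value has the form name=number, and in MySQL the string functions would be spelled differently (for example SUBSTRING_INDEX(params, '=', -1)).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (member_id INTEGER, params TEXT);
INSERT INTO location VALUES
  (1,'building_id=39'),(3,'campus_id=6'),(5,'building_id=11');

ALTER TABLE location ADD COLUMN building_id INTEGER;
ALTER TABLE location ADD COLUMN campus_id INTEGER;

-- Copy the number after the '=' into the matching new column.
UPDATE location
   SET building_id = CAST(substr(params, length('building_id=') + 1) AS INTEGER)
 WHERE params LIKE 'building_id=%';
UPDATE location
   SET campus_id = CAST(substr(params, length('campus_id=') + 1) AS INTEGER)
 WHERE params LIKE 'campus_id=%';
""")

rows = conn.execute(
    "SELECT member_id, building_id, campus_id FROM location ORDER BY member_id"
).fetchall()
```

After this step each row has exactly one of the two columns filled in, and the joins can compare plain integers.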

How to SELECT row B only if row A doesn't exist on GROUP BY

I'm facing the following situation and have not found a good solution to the problem. I am optimizing an API, so I am looking for the fastest possible solution.
The following description is not exactly what I am doing, but I think it represents the problem well.
Let's say I have a table of products:
+----+----------+
| id | name |
+----+----------+
| 1 | product1 |
| 2 | product2 |
+----+----------+
And I have a table of attachments for each product, separated by language:
+----+----------+------------+-----------------------+
| id | language | product_id | attachment_url |
+----+----------+------------+-----------------------+
| 1 | bb | 1 | image1_bb.jpg |
| 1 | en | 1 | image1_en.jpg |
| 1 | pt | 1 | image1_pt.jpg |
| 2 | bb | 1 | image2_bb.jpg |
| 2 | pt | 1 | image2_pt.jpg |
+----+----------+------------+-----------------------+
What I intend to do is get the correct attachment according to the language selected in the request. As you can see above, I can have several attachments for each product. We use Babel (bb) as a generic language, so whenever I don't have an attachment in the requested language, I should get the Babel version. It is also important to note that the primary key of the attachments table is a composite of id + language.
So, supposing I try to get all the data in pt, my first option to create a SQL query was:
SELECT p.id, p.name,
GROUP_CONCAT( '{',a.id,',',a.attachment_url, '}' ) as attachments_list
FROM products p
LEFT JOIN attachments a
ON (a.product_id=p.id AND (a.language='pt' OR a.language='bb'))
The problem is that with this query I always get the bb data, and I only want to get it when there is no attachment in the requested language.
I already tried replacing attachments with a subquery:
(SELECT * FROM attachments GROUP BY id ORDER BY id ASC, language DESC)
but it doubles the time of the request.
I also tried using DISTINCT inside the GROUP_CONCAT, but it only works if the whole result of each row is equal, so it does not work for me.
Does anyone know of any other solution that I can apply directly in the query?
EDIT:
Combining the answers of @Vulcronos and @Barmar made the final solution at least 2x faster than the one I first suggested.
Just to add some context, for anybody else who is looking for it. I am using Phalcon. Because of it, I had a lot of trouble putting the pieces together, as Phalcon PHQL does not support subqueries, nor a lot of the other stuff I had to use.
For my scenario, where I had to deliver approximately 1.2MB of JSON content with more than 2100 objects, using custom queries made the total request time up to 3x faster than Phalcon's native relation management methods (hasMany(), hasManyToMany(), etc.) and 10x faster than my original solution (which relied heavily on the find() method).
Try doing two joins instead of one:
SELECT p.id, p.name,
GROUP_CONCAT( '{',COALESCE(a.id, b.id),',',COALESCE(a.attachment_url, b.attachment_url), '}' ) as attachments_list
FROM products p
LEFT JOIN attachments a
ON (a.product_id=p.id AND a.language='pt')
LEFT JOIN attachments b
ON (b.product_id=p.id AND b.language='bb')
Then use COALESCE to return the value from b instead of a if a doesn't exist. You can also do it with a subselect if the above doesn't work.
OR conditions tend to make queries slow, because it's hard to optimize them with indexes. Try joining separately using the two different languages.
SELECT p.id, p.name,
IFNULL(apt.attachment_url, abb.attachment_url) AS attachment_url
FROM products AS p
JOIN attachments AS abb ON abb.product_id = p.id
LEFT JOIN attachments AS apt ON apt.product_id = p.id AND apt.language = 'pt'
WHERE abb.language = 'bb'
This assumes that all products have a bb attachment, while pt is optional.
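Because a product can have several attachments, the two joins also need to be paired per attachment, otherwise every bb row is combined with every pt row. The sketch below is one way to do that, run through Python's sqlite3 as a stand-in for MySQL. Matching on the attachment id is an assumption added here, and the query still relies on every attachment having a bb row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE attachments (
  id INTEGER, language TEXT, product_id INTEGER, attachment_url TEXT,
  PRIMARY KEY (id, language)
);
INSERT INTO products VALUES (1,'product1'),(2,'product2');
INSERT INTO attachments VALUES
  (1,'bb',1,'image1_bb.jpg'),(1,'en',1,'image1_en.jpg'),
  (1,'pt',1,'image1_pt.jpg'),(2,'bb',1,'image2_bb.jpg'),
  (2,'pt',1,'image2_pt.jpg');
""")

def attachments_for(language):
    """Return (product_id, attachment_id, url), falling back to bb."""
    return conn.execute("""
    SELECT p.id, abb.id,
           COALESCE(alang.attachment_url, abb.attachment_url)
    FROM products p
    JOIN attachments abb
      ON abb.product_id = p.id AND abb.language = 'bb'
    LEFT JOIN attachments alang
      ON alang.id = abb.id            -- same attachment (assumption)
     AND alang.product_id = abb.product_id
     AND alang.language = ?
    ORDER BY p.id, abb.id
    """, (language,)).fetchall()

en_rows = attachments_for("en")
pt_rows = attachments_for("pt")
```

For 'en', attachment 1 comes back in English while attachment 2 falls back to its bb version. product2 has no attachments in the sample data, so the inner join leaves it out.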
I left out the join to products because it's not relevant to this problem; it's only needed to include the product name in the result set.
SELECT a.product_id, a.id, a.attachment_url FROM attachments a
WHERE a.language = ?
OR (a.language = 'bb'
AND NOT EXISTS
(SELECT * FROM attachments
WHERE language = ?
AND id = a.id
AND product_id = a.product_id));
Notes: problems like this usually have many possible solutions. This is not necessarily the most efficient one.
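A quick way to try the NOT EXISTS query against the sample data from the question is sketched below, using Python's sqlite3 as a stand-in for MySQL. The inner table gets an alias x for readability; the function name is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attachments (
  id INTEGER, language TEXT, product_id INTEGER, attachment_url TEXT,
  PRIMARY KEY (id, language)
);
INSERT INTO attachments VALUES
  (1,'bb',1,'image1_bb.jpg'),(1,'en',1,'image1_en.jpg'),
  (1,'pt',1,'image1_pt.jpg'),(2,'bb',1,'image2_bb.jpg'),
  (2,'pt',1,'image2_pt.jpg');
""")

def pick_attachments(language):
    """Rows in the requested language, plus bb rows that have no
    counterpart in that language."""
    return conn.execute("""
    SELECT a.product_id, a.id, a.attachment_url
    FROM attachments a
    WHERE a.language = ?
       OR (a.language = 'bb'
           AND NOT EXISTS (SELECT 1 FROM attachments x
                           WHERE x.language = ?
                             AND x.id = a.id
                             AND x.product_id = a.product_id))
    ORDER BY a.product_id, a.id
    """, (language, language)).fetchall()

en_rows = pick_attachments("en")
pt_rows = pick_attachments("pt")
```

Note that the language has to be bound twice, once for each placeholder.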

What type of Join to use?

I've got a core table and 3 tables that extend the 'core' table in different ways.
I'm working with MLS data. I have a 'common' table that contains information common to all MLS listings, plus a table with specifically "residential" information, one for "commercial", etc. I have been using the MLS number to join a single table when the property type of a listing is known, but for searching I want to join all of them and have the special fields available as search criteria (not simply search the common table).
What type of join will give me a dataset that contains all listings (including the extended fields in the idx tables)?
For each Common table record there is a single corresponding record in ONLY ONE of the idx tables.
                    ___________
                   |           |
                   |  COMMON   |
                   |           |
                   |___________|
                        _|_
                         |
      ___________________|_____________________
     _|_                _|_                  _|_
 _____|_____        _____|______          ____|______
|           |      |            |        |           |
|   IDX1    |      |    IDX2    |        |   IDX3    |
|           |      |            |        |           |
|___________|      |____________|        |___________|
If you want everything in one row, you can use something like this format. Basically it gives you all the "Common" fields, then the other fields if there is a match otherwise NULL:
SELECT Common.*,
Idx1.*,
Idx2.*,
Idx3.*
FROM Common
LEFT JOIN Idx1
ON Idx1.MLSKey = Common.MLSKey
LEFT JOIN Idx2
ON Idx2.MLSKey = Common.MLSKey
LEFT JOIN Idx3
ON Idx3.MLSKey = Common.MLSKey
Bear in mind it's better to list out fields than to use the SELECT * whenever possible...
Also I'm assuming MySQL syntax is the same as SQL Server, which is what I use.
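Here is a small sketch of that LEFT JOIN layout, run with Python's sqlite3 rather than MySQL or SQL Server. The extension columns (bedrooms, zoning, dock_doors) and the sample listings are invented for illustration; only MLSKey and the table names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Common (MLSKey INTEGER PRIMARY KEY, price INTEGER);
CREATE TABLE Idx1 (MLSKey INTEGER, bedrooms INTEGER);    -- hypothetical
CREATE TABLE Idx2 (MLSKey INTEGER, zoning TEXT);         -- hypothetical
CREATE TABLE Idx3 (MLSKey INTEGER, dock_doors INTEGER);  -- hypothetical
INSERT INTO Common VALUES (100, 250000), (200, 900000), (300, 1500000);
INSERT INTO Idx1 VALUES (100, 3);
INSERT INTO Idx2 VALUES (200, 'C-2');
INSERT INTO Idx3 VALUES (300, 4);
""")

joined = """
FROM Common
LEFT JOIN Idx1 ON Idx1.MLSKey = Common.MLSKey
LEFT JOIN Idx2 ON Idx2.MLSKey = Common.MLSKey
LEFT JOIN Idx3 ON Idx3.MLSKey = Common.MLSKey
"""

# One row per listing; columns from non-matching idx tables are NULL.
rows = conn.execute(
    "SELECT Common.MLSKey, Idx1.bedrooms, Idx2.zoning, Idx3.dock_doors"
    + joined + "ORDER BY Common.MLSKey"
).fetchall()

# Search criteria can then mix fields from different idx tables.
matches = conn.execute(
    "SELECT Common.MLSKey" + joined
    + "WHERE Idx1.bedrooms >= 3 OR Idx2.zoning = 'C-2' ORDER BY Common.MLSKey"
).fetchall()
```

Each listing appears exactly once, with NULL (None in Python) in the columns of the idx tables it does not belong to.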
I have a similar set up of tables where the table 'jobs' is the core table.
I have this query that selects certain elements from each of the other 2 tables:
SELECT jobs.frequency, twitterdetails.accountname, feeds.feed
FROM jobs
JOIN twitterdetails ON twitterdetails.ID = jobs.accountID
JOIN feeds ON jobs.FeedID = feeds.FeedID
WHERE jobs.username = ?; -- bind the current user's name as a parameter
So, as you can see, no specific JOIN type is named, but the linking fields are defined. You'd probably just need an extra JOIN line for your setup.
Ugly solution / poor attempt / may have misunderstood question:
SELECT common.*,IDX1.field,NULL,NULL FROM COMMON
LEFT JOIN IDX1 ON COMMON.ID = IDX1.ID
WHERE TYPE="RESIDENTIAL"
UNION ALL
SELECT common.*,NULL,IDX2.field,NULL FROM COMMON
LEFT JOIN IDX2 ON COMMON.ID = IDX2.ID
WHERE TYPE="COMMERCIAL"
UNION ALL
SELECT common.*,NULL,NULL,IDX3.field FROM COMMON
LEFT JOIN IDX3 ON COMMON.ID = IDX3.ID
WHERE TYPE="INDUSTRIAL"
Orbit is close. Use inner join, not left join. You don't want common to show up in the join if it does not have a row in idx.
You MUST union 3 queries to get the proper results assuming each record in common can only have 1 idx table. Plug in "NULL" to fill in the columns that each idx table is missing so they can be unioned.
BTW your table design is good.
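A minimal sketch of the union-of-inner-joins variant, again using Python's sqlite3 as a stand-in for MySQL. The extension columns and sample listings are hypothetical; listing 400 has no idx row, to show that the inner joins drop it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Common (MLSKey INTEGER PRIMARY KEY, price INTEGER);
CREATE TABLE Idx1 (MLSKey INTEGER, bedrooms INTEGER);    -- hypothetical
CREATE TABLE Idx2 (MLSKey INTEGER, zoning TEXT);         -- hypothetical
CREATE TABLE Idx3 (MLSKey INTEGER, dock_doors INTEGER);  -- hypothetical
INSERT INTO Common VALUES (100, 250000), (200, 900000),
                          (300, 1500000), (400, 50000);
INSERT INTO Idx1 VALUES (100, 3);
INSERT INTO Idx2 VALUES (200, 'C-2');
INSERT INTO Idx3 VALUES (300, 4);
""")

# Each branch pads the columns of the other idx tables with NULL so
# that all three SELECTs have the same shape and can be unioned.
rows = conn.execute("""
SELECT c.MLSKey, i.bedrooms AS bedrooms, NULL AS zoning, NULL AS dock_doors
FROM Common c JOIN Idx1 i ON i.MLSKey = c.MLSKey
UNION ALL
SELECT c.MLSKey, NULL, i.zoning, NULL
FROM Common c JOIN Idx2 i ON i.MLSKey = c.MLSKey
UNION ALL
SELECT c.MLSKey, NULL, NULL, i.dock_doors
FROM Common c JOIN Idx3 i ON i.MLSKey = c.MLSKey
ORDER BY 1
""").fetchall()
```

Listing 400 is absent from rows, which is the behavior the inner join is meant to give.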