MySQL, Best way to select entries with many one-to-many relationships? - mysql

Let's say I have a table Products and 3-4 other tables (Comments, Pictures, Orders and so on) connected to Products with one-to-many relationships.
How should I select one product and its connected entries from the other tables (Comments, Pictures, Orders) in an easy manner?
I tried using left joins to connect the other tables, but I get duplicated entries. This will get even worse if I want to select many products instead of one.
I was also thinking of one query for each related table, but isn't this too slow?

If I understand you correctly, you have e.g.
1 Product
4 comments for said product
10 pictures and
5 orders
and if you do a left join you get 4*10*5=200 results with lots of duplicate comments, pictures and orders. But you only want one row per comment, picture and order.
You will need a separate query for each related table. If there was a way around that, it would be more complex and slower than the separate solution.
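A minimal sketch of the separate-query approach, assuming hypothetical tables `products`, `comments`, `pictures` and `orders` in which every child table carries an indexed `product_id` column (none of these names or columns come from the question). Each query returns exactly one row per child row, so nothing gets multiplied:

```sql
-- Hypothetical schema: every child table references products.id via product_id.
CREATE TABLE products (id INT PRIMARY KEY, name VARCHAR(100));
CREATE TABLE comments (id INT PRIMARY KEY, product_id INT, body TEXT, INDEX (product_id));
CREATE TABLE pictures (id INT PRIMARY KEY, product_id INT, url VARCHAR(255), INDEX (product_id));
CREATE TABLE orders   (id INT PRIMARY KEY, product_id INT, quantity INT, INDEX (product_id));

-- One query per table; each one is an indexed lookup on product_id.
SELECT id, name     FROM products WHERE id = 1;
SELECT id, body     FROM comments WHERE product_id = 1;
SELECT id, url      FROM pictures WHERE product_id = 1;
SELECT id, quantity FROM orders   WHERE product_id = 1;

-- For several products the number of queries stays the same;
-- product_id is selected so the rows can be regrouped in application code.
SELECT id, product_id, body FROM comments WHERE product_id IN (1, 2, 3);
```

For the numbers in this answer, that is 1 + 4 + 10 + 5 = 20 rows transferred in total instead of 200.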

You are not getting duplicate rows; you are getting the correct result of performing a one-to-many join.
Read up on:
http://en.wikipedia.org/wiki/Relational_algebra#Joins_and_join-like_operators
http://dev.mysql.com/doc/refman/5.0/en/join.html
If you want one row per product entry, consider using an aggregate function:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
-- DISTINCT stops each comment from repeating once per joined rating row
-- (it also merges comments whose text is identical)
select product.id, product.name, group_concat(distinct comment.text), avg(rating.value)
from product
left join comment on comment.product_id=product.id
left join rating on rating.product_id=product.id
group by product.id, product.name

Related

SQL Return All Rows Where 2 Values Are Repeated

I am sure this question has already been answered, but I can't find it or the answer was too complicated. I am new to SQL and am not sure how to word this generically.
I have a MySQL database of software installed on devices. My query to pull all the data has more fields and more joins, but for brevity I just included a few. I need to add another dimension to create a report that lists every case where a device has more than one installation of software from the same product family.
Right now I have code kind of like this and it is not doing what I need. I have seen some info on exists but the examples didn't account for multiple joins so the syntax escapes me. Help?
select
devices.name,
sw_inventory.product,
products.family_name,
sw_inventory.ignore_usage
from sw_inventory
inner join products
on sw_inventory.product=products.product_name
inner join devices
on sw_inventory.device_name=devices.name
where sw_inventory.ignore=0
group by devices.name, products.family_name
There are plenty of answers out there on this topic, but I understand not always knowing the terminology. You are looking for how to find duplicate values.
Basically this is a two-step process: (1) find the duplicates, then (2) relate them back to the original records if you want those. The second step is optional.
To literally find all of the duplicates in the query you provided, add HAVING COUNT(*) > 1 after the GROUP BY clause. If you want to know how many duplicates there are, add a calculated column to count them.
select
devices.name,
sw_inventory.product,
products.family_name,
sw_inventory.ignore_usage,
COUNT(*) AS NumberOfDuplicates
from sw_inventory
inner join products
on sw_inventory.product=products.product_name
inner join devices
on sw_inventory.device_name=devices.name
where sw_inventory.ignore=0
group by devices.name, products.family_name
HAVING COUNT(*) > 1
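The optional second step, relating the duplicates back to the original records, can be sketched by joining the inventory against the grouped result. The table and column names are taken from the question; the column types, the sample rows and the `dup` alias are assumptions made for illustration. The join to `devices` is left out because `device_name` already holds the device name.

```sql
-- Minimal schema inferred from the question (types are assumed).
CREATE TABLE products (product_name VARCHAR(100) PRIMARY KEY, family_name VARCHAR(100));
CREATE TABLE sw_inventory (
    device_name  VARCHAR(100),
    product      VARCHAR(100),
    `ignore`     TINYINT,   -- IGNORE is a reserved word, hence the backticks
    ignore_usage TINYINT
);

INSERT INTO products VALUES
    ('Word 2010', 'Office'), ('Word 2013', 'Office'), ('Photoshop CS6', 'Adobe');
INSERT INTO sw_inventory VALUES
    ('pc1', 'Word 2010', 0, 0), ('pc1', 'Word 2013', 0, 0),
    ('pc1', 'Photoshop CS6', 0, 0), ('pc2', 'Word 2010', 0, 0);

-- Step 1 (derived table dup): device/family pairs that occur more than once.
-- Step 2 (outer query): every installation row that belongs to such a pair.
SELECT si.device_name, si.product, p.family_name
FROM sw_inventory AS si
INNER JOIN products AS p ON si.product = p.product_name
INNER JOIN (
    SELECT si2.device_name, p2.family_name
    FROM sw_inventory AS si2
    INNER JOIN products AS p2 ON si2.product = p2.product_name
    WHERE si2.`ignore` = 0
    GROUP BY si2.device_name, p2.family_name
    HAVING COUNT(*) > 1
) AS dup
    ON dup.device_name = si.device_name
   AND dup.family_name = p.family_name
WHERE si.`ignore` = 0
ORDER BY si.device_name, p.family_name, si.product;
-- With the sample rows this returns the two Office installations on pc1.
```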

Slow Mysql Query with 3 left join

We have an e-store, and in this e-store there are many complicated links between categories and products.
I'm using a Taxonomy table to store the Products-Categories relations and the Products-Products relations (sub products).
A product may be a member of more than one category.
A product may be a sub product of another product (possibly of more than one).
A product may be a module of another product (possibly of more than one).
Aliases used in the query:
pr-Product
ct-Category
sp-Sub Product
md-Module
Select pr.*,ifnull(sp.destination_id,0) as `top_id`,
ifnull(ct.destination_id,0) as `category_id`
from Products as pr
Left join Taxonomy as ct
on (ct.source_id=pr.id and ct.source='Products' and ct.destination='Categories')
Left join Taxonomy as sp
on (sp.source_id=pr.id and sp.source='Products' and sp.destination='Products' and sp.type='TOPID')
Left join Modules as md
on(pr.id = md.product_id)
where pr.deleted=false
and ct.destination_id='47'
and sp.destination_id is null
and md.product_id is null
order by pr.order,pr.sub_order
With this query I'm trying to get all products that are under category_id=47, are not a module of any product and are not a sub product of any product.
This query takes 23 seconds.
There are 7,820 records in Products, 3,200 records in Modules and 19,000 records in Taxonomy.
I was going to say that MySQL can only use one index per query but it looks like that is no longer the case. I also came across this in another answer:
http://dev.mysql.com/doc/mysql/en/index-merge-optimization.html
However that may not help you.
In the past, when I've come across queries that MySQL couldn't optimise, I've settled for precomputing the answers in another table using a background job.
What you're trying to do looks like a good fit for a graph database like neo4j.
MySQL's optimizer is known to be bad at converting outer joins to inner joins automatically; it performs the outer join first and only then starts to filter the data.
In your case the join between Products and Taxonomy can be rewritten as an inner join, because the WHERE condition ct.destination_id='47' discards every row where ct is NULL anyway.
Check whether this changes the execution plan and improves performance.
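A sketch of that rewrite, under the assumption that the schema looks roughly like the minimal one below (the column types are guesses; only the names come from the question). The category filter moves into an INNER JOIN, while the two "has no match" checks stay as LEFT JOIN ... IS NULL. Whether this actually helps has to be confirmed by running EXPLAIN on both versions.

```sql
-- Minimal schema inferred from the columns the question's query uses.
CREATE TABLE Products (
    id        INT PRIMARY KEY,
    deleted   BOOLEAN NOT NULL DEFAULT FALSE,
    `order`   INT,          -- ORDER is a reserved word, hence the backticks
    sub_order INT
);
CREATE TABLE Taxonomy (
    source         VARCHAR(32),
    source_id      INT,
    destination    VARCHAR(32),
    destination_id INT,
    type           VARCHAR(32)
);
CREATE TABLE Modules (product_id INT);

SELECT pr.*,
       IFNULL(sp.destination_id, 0) AS top_id,
       ct.destination_id            AS category_id
FROM Products AS pr
-- Inner join: only products that are linked to category 47 survive.
INNER JOIN Taxonomy AS ct
    ON ct.source_id = pr.id
   AND ct.source = 'Products'
   AND ct.destination = 'Categories'
   AND ct.destination_id = 47
-- Anti-joins: keep only products with no parent product and no module row.
LEFT JOIN Taxonomy AS sp
    ON sp.source_id = pr.id
   AND sp.source = 'Products'
   AND sp.destination = 'Products'
   AND sp.type = 'TOPID'
LEFT JOIN Modules AS md
    ON md.product_id = pr.id
WHERE pr.deleted = FALSE
  AND sp.destination_id IS NULL
  AND md.product_id IS NULL
ORDER BY pr.`order`, pr.sub_order;
```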

Joining 5 tables - 1 master plus 4 with multiple rows to the master but master data is duplicated

I am writing queries in MySQL, but I am new to this. I am joining 5 tables where each table has an identifier and one table is the master. Each related table may have more than one record associated with the master table. I am attempting to join these tables but I can't seem to get rid of the duplicated data.
I want all of the related records to be displayed, but I don't want the data in the master table to display for all results in the related tables. I have tried so many different methods but nothing has worked. Currently I have 4 queries that work for the separate tables, but I have not successfully joined them to have the results display the multiple records in the related table but just one record from the master table.
Here are my individual queries that work:
SELECT
GovernmaxAdditionsExtract.AdditionDescr,
GovernmaxAdditionsExtract.BaseArea,
GovernmaxAdditionsExtract.Value
FROM
GovernmaxExtract
INNER JOIN GovernmaxAdditionsExtract
ON GovernmaxExtract.mpropertyNumber = GovernmaxAdditionsExtract.PropertyNumber
WHERE (((GovernmaxExtract.mpropertyNumber)="xxx-xxx-xx-xxx"));
SELECT
GovernmaxExtract.mpropertyNumber,
GovernmaxDwellingExtract.CardNumber,
GovernmaxDwellingExtract.MainBuildingType,
GovernmaxDwellingExtract.BaseArea
FROM
GovernmaxExtract INNER JOIN
GovernmaxDwellingExtract ON GovernmaxExtract.mpropertyNumber = GovernmaxDwellingExtract.PropertyNumber
WHERE (((GovernmaxExtract.mpropertyNumber)="xxx-xxx-xx-xxx"));
Using these sub queries, I tried to put together 2 of the tables, but now I am getting all records back and it is not reading my input parameter:
SELECT GE.mpropertynumber
FROM
GovernmaxExtract AS GE,
(SELECT
GovernmaxAdditionsExtract.AdditionDescr,
GovernmaxAdditionsExtract.BaseArea,
GovernmaxAdditionsExtract.Value
FROM GovernmaxExtract INNER JOIN
GovernmaxAdditionsExtract ON
governmaxextract.mpropertyNumber = GovernmaxAdditionsExtract.PropertyNumber) AS AE
WHERE GE.mpropertynumber = 'xxx-xxx-xx-xxx'
I tried nested queries, lots of different joins, and I am just not able to wrap my head around this. I am pretty sure I want to do a nested query since I want the main data from the Governmax table to display once with the main data and all records with all info for the associated tables. Maybe I am going about it all wrong.
Our original code was:
SELECT
ge.*,
gde.*,
gfe.*,
gae.*,
goie.*
FROM governmaxextract AS ge
LEFT JOIN governmaxdwellingextract AS gde
ON ge.mpropertyNumber = gde.PropertyNumber
LEFT JOIN governmaxfeaturesextract AS gfe
ON gde.PropertyNumber = gfe.PropertyNumber
LEFT JOIN governmaxadditionsextract AS gae
ON gde.PropertyNumber = gae.PropertyNumber
RIGHT JOIN governmaxotherimprovementsextract AS goie
ON gde.PropertyNumber = goie.PropertyNumber
WHERE ge.mpropertyNumber = '$codeword'
ORDER BY goie.CardNumber
But this gives multiple rows from the master table for each record in the associated tables. I thought about concatenate, but I need the data from the associated tables to be displayed individually. Not sure what to try next. Any help is much appreciated.
Sorry, but there is no way to do this the way you want. JOINs can't do that.
I suggest keeping the solution with separate queries.
By the way, you could experiment with the UNION operator:
http://en.wikipedia.org/wiki/Union_(SQL)#UNION_operator
P.S. You could extract the main data separately, then extract the data from all related tables at once using UNION. With UNION you get one result row for each row in the related tables.
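A rough sketch of the UNION idea, using two of the question's related tables. The column types, the `source` label and the NULL padding are assumptions: every branch of a UNION must return the same number of columns, so columns that exist in only one table are padded with NULL in the other. The sketch uses UNION ALL rather than plain UNION, because plain UNION would silently drop rows that happen to be identical.

```sql
-- Minimal versions of two related tables (column types are assumed).
CREATE TABLE governmaxadditionsextract (
    PropertyNumber VARCHAR(20),
    AdditionDescr  VARCHAR(100),
    BaseArea       INT,
    `Value`        DECIMAL(12,2)
);
CREATE TABLE governmaxdwellingextract (
    PropertyNumber   VARCHAR(20),
    CardNumber       INT,
    MainBuildingType VARCHAR(100),
    BaseArea         INT
);

-- One result row per related row; the source column tells the rows apart.
SELECT 'addition'        AS source,
       gae.AdditionDescr AS descr,
       gae.BaseArea      AS BaseArea,
       gae.`Value`       AS `Value`,
       NULL              AS CardNumber
FROM governmaxadditionsextract AS gae
WHERE gae.PropertyNumber = 'xxx-xxx-xx-xxx'
UNION ALL
SELECT 'dwelling',
       gde.MainBuildingType,
       gde.BaseArea,
       NULL,
       gde.CardNumber
FROM governmaxdwellingextract AS gde
WHERE gde.PropertyNumber = 'xxx-xxx-xx-xxx';
```

The master row from governmaxextract is fetched once with its own SELECT, so it is never repeated.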
In order to join any two of the Detail tables together without generating duplicate rows, you will have to perform the following operation on each one:
Group on the foreign key to the Master table, and aggregate all other columns being projected onto the join.
Numeric columns are commonly aggregated with SUM(), COUNT(), MAX(), and MIN(). MAX() and MIN() are also applicable to character data. A PIVOT operation is also sometimes useful as an aggregation operator for this type of circumstance.
Once you have two of the Detail tables grouped and aggregated in this way, they will join without duplicates. Additional Detail tables can be added to the join by first grouping and aggregating them also, in the same fashion.
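The group-then-join technique might look like the sketch below. The table names come from the question; the column types, the aliases `a` and `d`, and the choice of COUNT(*) and SUM(BaseArea) as aggregates are assumptions for illustration. Because each derived table has at most one row per PropertyNumber, the join cannot multiply rows.

```sql
-- Minimal versions of the master and two Detail tables (types are assumed).
CREATE TABLE governmaxextract (mpropertyNumber VARCHAR(20) PRIMARY KEY);
CREATE TABLE governmaxadditionsextract (PropertyNumber VARCHAR(20), BaseArea INT);
CREATE TABLE governmaxdwellingextract  (PropertyNumber VARCHAR(20), BaseArea INT);

SELECT ge.mpropertyNumber,
       a.addition_count,
       a.addition_area,
       d.dwelling_count,
       d.dwelling_area
FROM governmaxextract AS ge
-- Each Detail table is grouped on its foreign key before it is joined.
LEFT JOIN (
    SELECT PropertyNumber,
           COUNT(*)      AS addition_count,
           SUM(BaseArea) AS addition_area
    FROM governmaxadditionsextract
    GROUP BY PropertyNumber
) AS a ON a.PropertyNumber = ge.mpropertyNumber
LEFT JOIN (
    SELECT PropertyNumber,
           COUNT(*)      AS dwelling_count,
           SUM(BaseArea) AS dwelling_area
    FROM governmaxdwellingextract
    GROUP BY PropertyNumber
) AS d ON d.PropertyNumber = ge.mpropertyNumber
WHERE ge.mpropertyNumber = 'xxx-xxx-xx-xxx';
```

This yields one summary row per master record, so it fits reports and totals; it does not help when every related row has to be displayed individually.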

How do I get MYSQL to join a whole table?

I have a SELECT query that returns its result based on a unique ID, so I always get just one row.
I thought that I could save my machine an extra SELECT query if I simply added the prices table to the result and read it into memory later on.
Would that be a good approach, or am I missing something?
(I tried it out and seems to get the job done)
SELECT *
FROM subscriptions
LEFT JOIN prices ON 1=1
WHERE subscriptions.ID = 100
edit: The prices table has no ID. I just need to get the complete table, I used to have a different SELECT just for that
This looks like a terrible idea... you should join the subscriptions table to the prices table using the foreign key that you (supposedly/should) have.
Assuming your prices table has a subscription ID column then your query should look something like this:
SELECT *
FROM subscriptions LEFT JOIN prices ON subscriptions.ID=prices.ID
WHERE subscriptions.ID=100
Your original ON 1=1 query produces a Cartesian join. That is not too bad here, since you're limiting the 'subscriptions' side of things to a single record, but it still produces as many rows as there are records on the prices side. Where this gets bad is when you've got multiple rows on both sides: then you get n x m results. Think of how big the result set would be if you had 50,000 subscriptions joined against 1,000 prices: 50,000 x 1,000 = 50 million result rows.
First off, unless there is an actual relation between the tables, this approach makes it much less clear what you're doing than two SELECT statements would. Second, it's probably going to be slower, because you're transferring much more data (each row of prices additionally gets all the fields from subscriptions copied).
If subscriptions and prices are related, you want to change that ON condition to use the relation, so you're only pulling the data you need.
SELECT *
FROM subscriptions s LEFT JOIN prices p ON (s.subscription_id = p.subscription_id)
WHERE s.subscription_id = 100
One thing you definitely don't want to do is this:
SELECT *
FROM subscriptions s LEFT JOIN prices p ON (1=1)
as that'd pull the full Cartesian product. Once your tables get sufficiently large, that will run you out of temporary table space.
Why does your condition have 1=1?
I think it should be something like this:
SELECT s.*,p.*
FROM subscriptions as s
LEFT JOIN prices as p ON p.product_id=s.product_id
WHERE s.ID = 100
Show me the full list of fields in the subscriptions and prices tables so I can help you.
This?
SELECT *
FROM subscriptions, prices
WHERE subscriptions.ID = 100
You'll get horrible results like this, but it seems this is what you wanted.
The table with fewer rows will have its rows repeated. Again, this is not good practice.
Use two SELECTs.
This is a cross join http://en.wikipedia.org/wiki/Join_(SQL)#Cross_join
which means your resultset will contain as many rows as you have in the prices table.
So I guess it is not a good idea
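The two-SELECT alternative recommended in the answers above is short enough to sketch in full. The columns of `subscriptions` and `prices` are hypothetical, since the question does not list them; only the table names and `subscriptions.ID` come from the question.

```sql
-- Hypothetical columns; the question only states that prices has no ID.
CREATE TABLE subscriptions (ID INT PRIMARY KEY, customer VARCHAR(100));
CREATE TABLE prices (plan VARCHAR(50), amount DECIMAL(10,2));

-- Query 1: the single subscription row.
SELECT * FROM subscriptions WHERE ID = 100;

-- Query 2: the whole prices table, each row exactly once and without
-- a copy of the subscription columns attached to it.
SELECT * FROM prices;
```

The second result set has the same number of rows as the cross join would, but each row is narrower, and the intent of each statement is obvious to the next reader.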

mysql table problem?

I have these two tables for a chat app:
users{user_id, username, pictures}
chat_data{con_id, chat_text}
I used this SQL query:
SELECT c.chat_text, u.username
FROM chat_data c, users u
WHERE c.con_id =1
but it's giving me duplicate results, when I know there's only one row with con_id = 1. What is the problem with the query?
You need to "join" the tables to avoid duplicates. For example, assuming chat_data also has a user_id column:
SELECT c.chat_text, u.username
FROM chat_data c, users u
WHERE c.con_id =1
and u.user_id = c.user_id
You can read a bit about relational algebra which is the theory behind relational databases.
The users and chat_data tables should be JOINED in order to get a unique tuple as result.
Since users and chat_data cannot be joined as they stand, you simply get the Cartesian product of the two tables.
Cartesian Products
If two tables in a join query have no join condition, then Oracle Database returns their Cartesian product. Oracle combines each row of one table with each row of the other. A Cartesian product always generates many rows and is rarely useful. For example, the Cartesian product of two tables, each with 100 rows, has 10,000 rows. Always include a join condition unless you specifically need a Cartesian product. If a query joins three or more tables and you do not specify a join condition for a specific pair, then the optimizer may choose a join order that avoids producing an intermediate Cartesian product.
Refer: http://www.stanford.edu/dept/itss/docs/oracle/10g/server.101/b10759/queries006.htm
With a query like that, you will get as many rows in the results as there are rows in the table users.
It's because of the type of join the SQL is doing. Is it returning a row for each user you have? This is what I expect it is doing, i.e. if you have 2 users John and Jack then are you getting a row for both of these users being returned?
Are you just trying to get the data related to the users involved in a conversation? If so you need some link between the 2 tables, like foreign key references from the chat_data table referencing users.
As previously stated, you're missing a link between the two tables. If you are trying to retrieve the user associated with a particular chat, you will need to add a foreign key in chat_data that references users.user_id. But if you are trying to get multiple users associated with a chat, you need to add a new table. Your tables would then look something like this:
users{user_id, username, pictures}
chat_data{con_id, chat_text}
user_chat{user_id, con_id} //By adding this new table you can have multiple users per chat
Then the query would look something like
SELECT u.username, c.chat_text
FROM users u, chat_data c, user_chat uc
WHERE u.user_id = uc.user_id
AND c.con_id = uc.con_id
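The three-table layout above could be declared and queried as in the following sketch. The column types, key definitions and sample rows are assumptions; only the table and column names come from this thread. The query uses explicit JOIN ... ON syntax, which makes a forgotten join condition much harder to overlook than the comma-separated FROM list.

```sql
CREATE TABLE users (
    user_id  INT PRIMARY KEY,
    username VARCHAR(50),
    pictures VARCHAR(255)
);
CREATE TABLE chat_data (
    con_id    INT PRIMARY KEY,
    chat_text TEXT
);
-- Link table: one row per (user, conversation) pair.
CREATE TABLE user_chat (
    user_id INT NOT NULL,
    con_id  INT NOT NULL,
    PRIMARY KEY (user_id, con_id),
    FOREIGN KEY (user_id) REFERENCES users (user_id),
    FOREIGN KEY (con_id)  REFERENCES chat_data (con_id)
);

INSERT INTO users VALUES (1, 'John', NULL), (2, 'Jack', NULL), (3, 'Jill', NULL);
INSERT INTO chat_data VALUES (1, 'hello'), (2, 'bye');
INSERT INTO user_chat VALUES (1, 1), (2, 1), (3, 2);

-- Returns one row per participant of conversation 1 (John and Jack),
-- not one row per user in the whole users table.
SELECT u.username, c.chat_text
FROM chat_data AS c
INNER JOIN user_chat AS uc ON uc.con_id = c.con_id
INNER JOIN users     AS u  ON u.user_id = uc.user_id
WHERE c.con_id = 1;
```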