sql inner and left join + performance - mysql

I have 3 tables:
room
room_location
room_storys
Whenever someone creates a room, a row is automatically added to room_location. The room_storys table can be empty.
Now I want to join the tables.
If I do it like this, I get no results:
SELECT
r.name,
r.date,
rl.city,
rl.street,
rl.number,
rl.name,
rs.source,
rs.date
FROM room r
INNER JOIN room_location rl
ON rl.room_id = 67
INNER JOIN room_storys rs
ON rs.room_id = 67
LIMIT 1;
If I change this:
INNER JOIN room_storys rs
ON rs.room_id = 67
to this:
LEFT JOIN room_storys rs
ON rs.room_id = 67
then it works. But I have heard that LEFT JOIN performs poorly. How would you write the query above? Or is it OK as it is?

Every JOIN type has its pros and cons but, for a query like yours, the “cost” is negligible. One recommendation would be to have the tables reference each other rather than a specific id as part of the ON clause. Specificity can be crucial with OUTER JOINs in some situations. Your query can be written like this:
SELECT r.`name`,
r.`date`,
rl.`city`,
rl.`street`,
rl.`number`,
rl.`name`,
rs.`source`,
rs.`date`
FROM `room` r INNER JOIN `room_location` rl ON r.`id` = rl.`room_id`
LEFT OUTER JOIN `room_storys` rs ON rl.`room_id` = rs.`room_id`
WHERE r.`id` = 67;
This change solves the problem of returning every row in room, which was being masked by the LIMIT 1. It also ensures a record is returned even if there is nothing in room_storys for the room_id.
Hope this gives you something to think about with future queries 👍🏻
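To see the difference between the two join types on an empty room_storys table, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs without a server; the sample rows and the trimmed column list are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the join behaviour shown is standard SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE room          (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE room_location (room_id INTEGER, city TEXT);
    CREATE TABLE room_storys   (room_id INTEGER, source TEXT);
    -- Hypothetical data: room 67 has a location but no stories.
    INSERT INTO room VALUES (67, 'Studio');
    INSERT INTO room_location VALUES (67, 'Berlin');
""")

BASE = (
    "SELECT r.name, rl.city, rs.source FROM room r "
    "INNER JOIN room_location rl ON rl.room_id = r.id "
)

# INNER JOIN needs a matching story row, so nothing comes back.
inner_rows = conn.execute(
    BASE + "INNER JOIN room_storys rs ON rs.room_id = r.id WHERE r.id = 67"
).fetchall()

# LEFT JOIN keeps the room and fills the story column with NULL (None).
left_rows = conn.execute(
    BASE + "LEFT JOIN room_storys rs ON rs.room_id = r.id WHERE r.id = 67"
).fetchall()

print(inner_rows)
print(left_rows)
```

The INNER JOIN version returns an empty list, while the LEFT JOIN version returns the room with None in place of the story column.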

Think of it this way:
ON says how the tables are related, such as
ON room.id = room_story.room_id
WHERE is used for filtering, such as
WHERE room.id = 67
Also, JOIN (or INNER JOIN) requires the matching rows to exist in both of the 2 tables. LEFT JOIN says that the matching row in the right table may be missing.
Putting those together, I think this is what you needed:
FROM room r
INNER JOIN room_location rl ON rl.room_id = r.id
LEFT JOIN room_storys rs ON rs.room_id = r.id
WHERE r.id = 67
This becomes irrelevant (I think): LIMIT 1
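A small sketch of why the ON clause should relate the tables instead of comparing against a constant. It again uses Python's sqlite3 as a stand-in for MySQL, and the two rooms and their cities are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE room          (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE room_location (room_id INTEGER, city TEXT);
    -- Hypothetical data: two rooms, one location each.
    INSERT INTO room VALUES (66, 'Attic'), (67, 'Studio');
    INSERT INTO room_location VALUES (66, 'Hamburg'), (67, 'Berlin');
""")

# ON compares against a constant only: every room is paired with
# the location of room 67, because nothing ties rl to r.
constant_on = conn.execute("""
    SELECT r.id, rl.city
    FROM room r
    INNER JOIN room_location rl ON rl.room_id = 67
    ORDER BY r.id
""").fetchall()

# ON relates the tables, WHERE does the filtering.
related_on = conn.execute("""
    SELECT r.id, rl.city
    FROM room r
    INNER JOIN room_location rl ON rl.room_id = r.id
    WHERE r.id = 67
""").fetchall()

print(constant_on)
print(related_on)
```

With the constant in ON, room 66 shows up with Berlin as its city, which is the kind of row the LIMIT 1 in the question was hiding.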

Related

SQL query optimization for speed

I was working on optimizing the following query. I have already optimized it as far as I can on my side. Can it be optimized further?
select distinct name ad_type
from dim_ad_type x where exists ( select 1
from sum_adserver_dimensions sum
left join dim_ad_tag_map on dim_ad_tag_map.id=sum.ad_tag_map_id and dim_ad_tag_map.client_id=sum.client_id
left join dim_site on dim_site.id = dim_ad_tag_map.site_id
left join dim_geo on dim_geo.id = sum.geo_id
left join dim_region on dim_region.id=dim_geo.region_id
left join dim_device_category on dim_device_category.id=sum.device_category_id
left join dim_ad_unit on dim_ad_unit.id=dim_ad_tag_map.ad_unit_id
left join dim_monetization_channel on dim_monetization_channel.id=dim_ad_tag_map.monetization_channel_id
left join dim_os on dim_os.id = sum.os_id
left join dim_ad_type on dim_ad_type.id = dim_ad_tag_map.ad_type_id
left join dim_integration_type on dim_integration_type.id = dim_ad_tag_map.integration_type_id
where sum.client_id = 50
and dim_ad_type.id=x.id
)
order by 1
Your query, although joined correctly, is bloated overall. You are using the dim_ad_type table on the outside just to make sure it exists on the inside as well. All those left joins have NO bearing on the final outcome, so why are they even there? I would simplify by reversing the logic. Tracing your INNER query back to the same dim_ad_type table, I find that the direct line is sum -> dim_ad_tag_map -> dim_ad_type. Just run that.
select distinct
dat.name Ad_Type
from
sum_adserver_dimensions sum
join dim_ad_tag_map tm
on sum.ad_tag_map_id = tm.id
and sum.client_id = tm.client_id
join dim_ad_type dat
on tm.ad_type_id = dat.id
where
sum.client_id = 50
order by
1
Your query was running through ALL dim_ad_types, then finding all the sums just to find those that matched. Start directly with the one client, then go straight through the JOINs.
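If you want to convince yourself that the shorter query returns the same names, a quick check along these lines may help. This is only a sketch: it uses Python's sqlite3 instead of MySQL, keeps just the three tables on the direct line, uses the alias s instead of sum, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_ad_type (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_ad_tag_map (id INTEGER, client_id INTEGER, ad_type_id INTEGER);
    CREATE TABLE sum_adserver_dimensions (ad_tag_map_id INTEGER, client_id INTEGER);
    -- Hypothetical data: client 50 uses banner and video, client 99 uses native.
    INSERT INTO dim_ad_type VALUES (1, 'banner'), (2, 'video'), (3, 'native');
    INSERT INTO dim_ad_tag_map VALUES (10, 50, 1), (11, 50, 2), (12, 99, 3);
    INSERT INTO sum_adserver_dimensions VALUES (10, 50), (10, 50), (11, 50), (12, 99);
""")

# Shape of the original query, reduced to the joins that matter.
exists_version = conn.execute("""
    SELECT DISTINCT x.name AS ad_type
    FROM dim_ad_type x
    WHERE EXISTS (
        SELECT 1
        FROM sum_adserver_dimensions s
        LEFT JOIN dim_ad_tag_map tm
               ON tm.id = s.ad_tag_map_id AND tm.client_id = s.client_id
        LEFT JOIN dim_ad_type dat ON dat.id = tm.ad_type_id
        WHERE s.client_id = 50 AND dat.id = x.id
    )
    ORDER BY 1
""").fetchall()

# The simplified query: start from the client, follow the direct line.
join_version = conn.execute("""
    SELECT DISTINCT dat.name AS ad_type
    FROM sum_adserver_dimensions s
    JOIN dim_ad_tag_map tm
      ON s.ad_tag_map_id = tm.id AND s.client_id = tm.client_id
    JOIN dim_ad_type dat ON tm.ad_type_id = dat.id
    WHERE s.client_id = 50
    ORDER BY 1
""").fetchall()

print(exists_version)
print(join_version)
```

Both queries list banner and video for client 50 on this sample data. Which one is faster on the real tables depends on the data and indexes, so compare them there.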

How to Optimize my JOIN's to speed up my Query?

I have 6 queries like the one listed below.
Each takes 6 seconds to run,
for a total of 36 seconds for the page to load.
Is there a way to optimize these kinds of queries?
SELECT
tickets.ticketID,
tickets.ticket,
tickets.name1,
tickets.address1,
tickets.city,
tickets.cstate,
tickets.zip,
tickets.caller_type,
tickets.phone,
tickets.caller,
tickets.caller_phone,
tickets.contact,
tickets.contact_phone,
tickets.call_back,
tickets.location,
tickets.printable_text,
tblnotes.ntDate,
tblnotes.ntText,
tblstatus.stDesc,
tblUsers.username
FROM tblusers
RIGHT OUTER JOIN tickets ON tblusers.ID = tickets.ownerID
LEFT OUTER JOIN tblstatus ON tblstatus.stID = tickets.statusID
LEFT OUTER JOIN tblnotes ON tblnotes.ntID = tickets.noteID
WHERE tblstatus.stDesc <> "Closed"
EDIT: try this
SELECT
tickets.ticketID,
tickets.ticket,
tickets.name1,
tickets.address1,
tickets.city,
tickets.cstate,
tickets.zip,
tickets.caller_type,
tickets.phone,
tickets.caller,
tickets.caller_phone,
tickets.contact,
tickets.contact_phone,
tickets.call_back,
tickets.location,
tickets.printable_text,
tblnotes.ntDate,
tblnotes.ntText,
tblstatus.stDesc,
tblUsers.username
FROM tickets
INNER JOIN tblusers ON tblusers.ID = tickets.ownerID
INNER JOIN tblstatus ON tblstatus.stID = tickets.statusID
LEFT OUTER JOIN tblnotes ON tblnotes.ntID = tickets.noteID
WHERE tickets.statusID <> 3
Posting as an answer, as I am unable to comment.
You have the condition tblstatus.stDesc <> "Closed".
Assuming you have an index on stID,
change that to tblstatus.stID <> (the id value of the "Closed" status).
Also change your left outer joins to inner joins, since the WHERE condition discards the unmatched rows anyway. You can keep the left join on tblnotes, as I am not sure whether every row in tickets has a corresponding note.
I would also move the tickets table into the FROM clause and then do an inner join with tblusers.
Use a left outer join only when the joined table may not have data but you still want to show data from your main table.
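Here is a rough sketch of why the left outer join on tblstatus was already behaving like an inner join. It runs on Python's sqlite3 rather than MySQL, the tables are cut down to two columns each, and the status id 3 for "Closed" is an assumption taken from the rewritten query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblstatus (stID INTEGER PRIMARY KEY, stDesc TEXT);
    CREATE TABLE tickets   (ticketID INTEGER PRIMARY KEY, statusID INTEGER);
    -- Hypothetical data: id 3 is assumed to be the 'Closed' status.
    INSERT INTO tblstatus VALUES (1, 'Open'), (3, 'Closed');
    -- Ticket 102 has no status at all.
    INSERT INTO tickets VALUES (100, 1), (101, 3), (102, NULL);
""")

# The WHERE condition is on the right-hand table. For ticket 102 the
# joined stDesc is NULL, NULL <> 'Closed' is not true, so the row is dropped.
left_with_where = conn.execute("""
    SELECT t.ticketID
    FROM tickets t
    LEFT OUTER JOIN tblstatus s ON s.stID = t.statusID
    WHERE s.stDesc <> 'Closed'
    ORDER BY t.ticketID
""").fetchall()

# Inner join, filtering on the id column of tickets instead of the text.
inner_by_id = conn.execute("""
    SELECT t.ticketID
    FROM tickets t
    INNER JOIN tblstatus s ON s.stID = t.statusID
    WHERE t.statusID <> 3
    ORDER BY t.ticketID
""").fetchall()

print(left_with_where)
print(inner_by_id)
```

Both queries return only ticket 100 here, so on data like this nothing is lost by switching that join to INNER.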

Sql query taking long time with inner join

I am supposed to write a query which requires joining 3 tables.
The query designed by me works fine, but it takes a lot of time to execute.
SELECT v.LinkID, r.SourcePort, r.DestPort, r.NoOfBytes, r.StartTime , r.EndTime, r.Direction, r.nFlows
FROM LINK_TBL v
INNER JOIN NODEIF_TBL n
INNER JOIN RAW_TBL r ON
r.RouterIP=n.ifipaddress
and n.NodeNumber=v.orinodenumber
and v.oriIfIndex=r.OriIfIndex;
Is there any issue w.r.t performance in this query ?
Try this one, with each ON condition placed in the join it belongs to:
SELECT v.LinkID, r.SourcePort, r.DestPort, r.NoOfBytes, r.StartTime , r.EndTime, r.Direction, r.nFlows
FROM LINK_TBL v
INNER JOIN NODEIF_TBL n ON (n.NodeNumber=v.orinodenumber )
INNER JOIN RAW_TBL r ON (r.RouterIP=n.ifipaddress and v.oriIfIndex=r.OriIfIndex)
Try this:
SELECT v.LinkID, r.SourcePort, r.DestPort, r.NoOfBytes, r.StartTime , r.EndTime, r.Direction, r.nFlows
FROM LINK_TBL v
INNER JOIN NODEIF_TBL n ON
n.NodeNumber=v.orinodenumber
INNER JOIN RAW_TBL r ON
r.RouterIP=n.ifipaddress
and v.oriIfIndex=r.OriIfIndex;
The join order is somewhat unusual. I don't work with MySQL, so maybe it is just a MySQL-specific way to join, but usually you join like this:
FROM
a
INNER JOIN b ON a.id1 = b.id2
INNER JOIN c ON b.id3 = c.id4
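Conceptually, since you are using INNER JOIN, this way you first narrow down a by joining it with b, and only then use the remaining join to filter further, which saves a lot of comparisons. Imagine each table has 1,000 rows. If a and b are combined without a condition, adding c means working through 1,000 * 1,000 = 1 million combinations. With my example it would be on the order of 1,000 + 1,000 comparisons instead. The optimizer may reorder inner joins on its own, so check the actual execution plan before relying on this.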
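As a sanity check that moving the conditions changes only the layout and not the result, here is a small sketch. It uses Python's sqlite3 as a stand-in for MySQL (both accept an INNER JOIN without its own ON clause), the column list is trimmed, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LINK_TBL   (LinkID INTEGER, orinodenumber INTEGER, oriIfIndex INTEGER);
    CREATE TABLE NODEIF_TBL (NodeNumber INTEGER, ifipaddress TEXT);
    CREATE TABLE RAW_TBL    (RouterIP TEXT, OriIfIndex INTEGER, NoOfBytes INTEGER);
    -- Hypothetical data: only link 1 has a matching raw record.
    INSERT INTO LINK_TBL VALUES (1, 10, 2), (2, 20, 3);
    INSERT INTO NODEIF_TBL VALUES (10, '10.0.0.1'), (20, '10.0.0.2');
    INSERT INTO RAW_TBL VALUES ('10.0.0.1', 2, 500), ('10.0.0.2', 9, 700);
""")

# Layout from the question: all three conditions on the last join.
all_conditions_last = conn.execute("""
    SELECT v.LinkID, r.NoOfBytes
    FROM LINK_TBL v
    INNER JOIN NODEIF_TBL n
    INNER JOIN RAW_TBL r
            ON r.RouterIP = n.ifipaddress
           AND n.NodeNumber = v.orinodenumber
           AND v.oriIfIndex = r.OriIfIndex
    ORDER BY v.LinkID
""").fetchall()

# Layout from the answers: each join carries its own conditions.
conditions_per_join = conn.execute("""
    SELECT v.LinkID, r.NoOfBytes
    FROM LINK_TBL v
    INNER JOIN NODEIF_TBL n ON n.NodeNumber = v.orinodenumber
    INNER JOIN RAW_TBL r ON r.RouterIP = n.ifipaddress
                        AND v.oriIfIndex = r.OriIfIndex
    ORDER BY v.LinkID
""").fetchall()

print(all_conditions_last)
print(conditions_per_join)
```

The rows are identical in both layouts. Whether the timings differ on the real tables depends on the optimizer and on indexes over the join columns, which this sketch does not measure.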

Getting Repeated values in SQL

I have been trying hard to get this done but haven't succeeded yet. I am getting repeated values when I run this query.
select
tbl_ShipmentStatus.ShipmentID
,Tbl_Contract.ContractID,
Tbl_Contract.KeyWinCountNumber,
Tbl_Item.ItemName,
Tbl_CountryFrom.CountryFromName,
Tbl_CountryTo.CountryToName,
Tbl_Brand.BrandName,
Tbl_Count.CountName,
Tbl_Seller.SellerName,
Tbl_Buyer.BuyerName,
Tbl_Contract.ContractNumber,
Tbl_Contract.ContractDate,
tbl_CountDetail.TotalQty,
tbl_CostUnit.CostUnitName,
tbl_Comission.Payment,
tbl_Port.PortName,
Tbl_Contract.Vans,
tbl_Comission.ComissionPay,
tbl_Comission.ComissionRcv,
tbl_CountDetail.UnitPrice,
tbl_Comission.ComissionRemarks,
tbl_CountDetail.Amount,
tbl_LCStatus.LCNumber,
tbl_ShipmentStatus.InvoiceNumber,
tbl_ShipmentStatus.InvoiceDate,
tbl_ShipmentStatus.BLNumber,
tbl_ShipmentStatus.BLDate,
tbl_ShipmentStatus.VesselName,
tbl_ShipmentStatus.DueDate
from tbl_ShipmentStatus
inner join tbl_LCStatus
on
tbl_LCStatus.LCID = tbl_ShipmentStatus.LCStatusID
inner join Tbl_Contract
on
tbl_LCStatus.ContractID = Tbl_Contract.ContractID
inner join Tbl_CountDetail
on Tbl_Contract.ContractID = Tbl_CountDetail.ContractId
inner join tbl_Comission
on
tbl_Comission.ContractID = Tbl_Contract.ContractID
inner join Tbl_Item
on
Tbl_Item.ItemID = Tbl_Contract.ItemID
inner join Tbl_Brand
on Tbl_Brand.BrandID = Tbl_Contract.BrandID
inner join Tbl_Buyer
on Tbl_Buyer.BuyerID = Tbl_Contract.BuyerID
inner join Tbl_Seller
on Tbl_Seller.SellerID = Tbl_Contract.SellerID
inner join Tbl_CountryFrom
on Tbl_CountryFrom.CountryFromID = Tbl_Contract.CountryFromID
inner join Tbl_CountryTo
on
Tbl_CountryTo.CountryToID = Tbl_Contract.CountryToID
inner join Tbl_Count
on
Tbl_Count.CountID = Tbl_CountDetail.CountId
inner join tbl_CostUnit
on tbl_Comission.CostUnitID = tbl_CostUnit.CostUnitID
inner join tbl_Port
on tbl_Port.PortID = tbl_Comission.PortID
where tbl_LCStatus.isDeleted = 0
and tbl_ShipmentStatus.isDeleted =0
and tbl_LCStatus.isDeleted = 0
and Tbl_CountDetail.isDeleted = 0
and Tbl_Contract.isDeleted = 0
and tbl_ShipmentStatus.LCStatusID = 5
I have also attached a picture of my result set of rows.
Any suggestions as to why this is happening would be really appreciated.
Typically this happens when you have an implicit partial cross join (Cartesian product) between two of your tables. That's what it looks like to me here.
This happens most often when you have a many-to-many relationship. For example, if a single Album allows both multiple Artists and multiple Songs and the only relationship between Artists and Songs is Album, then there's essentially a many-to-many relationship between Artists and Songs. If you select from all three tables at once you're going to implicitly cross join Artists and Songs, and this may not be what you want.
Looking at your query, I see many-to-many between Tbl_CountDetail and tbl_Comission through Tbl_Contract. Try eliminating one of those joins to test to see if the behavior disappears.
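The multiplication is easy to reproduce on a toy version of those three tables. This is just a sketch using Python's sqlite3 in place of the real database; the contract id, count ids and payment values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tbl_Contract    (ContractID INTEGER PRIMARY KEY);
    CREATE TABLE Tbl_CountDetail (ContractId INTEGER, CountId INTEGER);
    CREATE TABLE tbl_Comission   (ContractID INTEGER, Payment TEXT);
    -- Hypothetical data: one contract with 2 count details and 2 commissions.
    INSERT INTO Tbl_Contract VALUES (6);
    INSERT INTO Tbl_CountDetail VALUES (6, 1), (6, 2);
    INSERT INTO tbl_Comission VALUES (6, 'LC'), (6, 'TT');
""")

# Joining both child tables through the contract pairs every
# count detail with every commission: 2 * 2 = 4 rows.
both_children = conn.execute("""
    SELECT c.ContractID, d.CountId, m.Payment
    FROM Tbl_Contract c
    INNER JOIN Tbl_CountDetail d ON d.ContractId = c.ContractID
    INNER JOIN tbl_Comission m ON m.ContractID = c.ContractID
""").fetchall()

# Dropping one of the two joins brings it back to 2 rows.
one_child = conn.execute("""
    SELECT c.ContractID, d.CountId
    FROM Tbl_Contract c
    INNER JOIN Tbl_CountDetail d ON d.ContractId = c.ContractID
""").fetchall()

print(len(both_children), len(one_child))
```

If the real data has several rows per contract in both Tbl_CountDetail and tbl_Comission, the repeated values in the result set would come from exactly this pairing.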
Try using the DISTINCT keyword. It should solve your issue:
Select DISTINCT ....
Wait, as far as I can see your records are not duplicates.
HOWEVER
Notice the CountName column and the ShipmentID column.
The combination is unique for every row, so the rows are unique as far as I can see. Try not selecting CountName.
Well, if you have distinct rows, it's not a duplication problem. The issue is that the join produces a combination you don't want, and that multiplies the results.
Either don't select CountName, or you have a mistake in your data.
Only one of those rows should be true: either 6 with Count2 or 6 with Count1. Likewise for 7. The fact that you're getting both when you're not supposed to indicates a logic mistake.

Optimizing a where statement MYSQL

I'm writing this complex query to return a large dataset of about 100,000 records. The query runs fine until I add this OR condition to the WHERE clause:
AND (responses.StrategyFk = strategies.Id OR responses.StrategyFk IS NULL)
Now I understand that putting the OR condition in there adds a lot of overhead.
Without it, with just:
AND responses.StrategyFk = strategies.Id
the query runs within 15 seconds, but it doesn't return any records that have no FK linking to a strategy.
I would like these records as well. Is there an easier way to find both kinds of records with a simple WHERE condition? I can't just add another AND condition for the NULL records because that would break the previous condition. I'm unsure where to go from here.
Here's the lower half of my query.
FROM
responses, subtestinstances, students, schools, items,
strategies, subtests
WHERE
subtestinstances.Id = responses.SubtestInstanceFk
AND subtestinstances.StudentFk = students.Id
AND students.SchoolFk = schools.Id
AND responses.ItemFk = items.Id
AND (responses.StrategyFk = strategies.Id Or responses.StrategyFk IS Null)
AND subtests.Id = subtestinstances.SubtestFk
try:
SELECT ... FROM
responses
JOIN subtestinstances ON subtestinstances.Id = responses.SubtestInstanceFk
JOIN students ON subtestinstances.StudentFk = students.Id
JOIN schools ON students.SchoolFk = schools.Id
JOIN items ON responses.ItemFk = items.Id
JOIN subtests ON subtests.Id = subtestinstances.SubtestFk
LEFT JOIN strategies ON responses.StrategyFk = strategies.Id
That's it. No OR condition is really needed, because that's what a LEFT JOIN does in this case. Any row where responses.StrategyFk IS NULL will find no match in the strategies table, and the query will still return a row for it.
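To illustrate the difference, here is a minimal sketch on two cut-down tables. It uses Python's sqlite3 instead of MySQL, and the strategy names and response ids are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE strategies (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE responses  (Id INTEGER PRIMARY KEY, StrategyFk INTEGER);
    -- Hypothetical data: response 101 has no strategy.
    INSERT INTO strategies VALUES (1, 'Counting'), (2, 'Estimating');
    INSERT INTO responses VALUES (100, 1), (101, NULL);
""")

# Comma join with the OR condition: the NULL response is paired
# with every row in strategies.
comma_join_with_or = conn.execute("""
    SELECT responses.Id, strategies.Name
    FROM responses, strategies
    WHERE (responses.StrategyFk = strategies.Id OR responses.StrategyFk IS NULL)
    ORDER BY responses.Id, strategies.Id
""").fetchall()

# LEFT JOIN: the NULL response appears once, with no strategy attached.
left_join = conn.execute("""
    SELECT responses.Id, strategies.Name
    FROM responses
    LEFT JOIN strategies ON responses.StrategyFk = strategies.Id
    ORDER BY responses.Id
""").fetchall()

print(comma_join_with_or)
print(left_join)
```

On this sample the OR version returns response 101 twice, once per strategy, which hints at why it gets slow on a large strategies table. The LEFT JOIN returns it once with None for the strategy name.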
See this link for a simple explanation of joins: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
After that, if you're still having performance issues, you can start looking at the EXPLAIN SELECT ... ; output and looking for indexes that may need to be added. See "Optimizing Queries With EXPLAIN" in the MySQL Manual.
Try using explicit JOINs:
...
FROM responses a
INNER JOIN subtestinstances b
ON b.id = a.subtestinstancefk
INNER JOIN students c
ON c.id = b.studentfk
INNER JOIN schools d
ON d.id = c.schoolfk
INNER JOIN items e
ON e.id = a.itemfk
INNER JOIN subtests f
ON f.id = b.subtestfk
LEFT JOIN strategies g
ON g.id = a.strategyfk