I have a MySQL database with an MS Access front end. The MySQL tables are linked to MS Access via an ODBC connection.
ANY query with multiple joined tables runs extremely slowly as soon as there is anything in the "WHERE" (or "HAVING") clause.
For example:
SELECT tblGuests.GuestName, Sum(tblPayments.Payment) AS SumOfPayment, tblRooms.RoomName
FROM (tblGuests LEFT JOIN tblPayments ON tblGuests.GuestID = tblPayments.GuestNo) LEFT JOIN tblRooms ON tblGuests.RoomNo = tblRooms.RoomID
WHERE tblGuests.NoShow=False
GROUP BY tblGuests.GuestName, tblRooms.RoomName;
takes ages (approx. 3 minutes for 20K records). Exactly the same statement takes 1-1.5 seconds as a Pass Through Query, so the problem shouldn't be related to indexes or settings on the server side. (By the way, indexes are set up on the necessary columns, and the relations are set up, too.)
The problem happens ONLY if there are more than 2 tables involved in the query AND there is something in the "WHERE" clause or in "HAVING".
For example, if you modify the query above to
SELECT tblGuests.GuestName, Sum(tblPayments.Payment) AS SumOfPayment
FROM tblGuests LEFT JOIN tblPayments ON tblGuests.GuestID = tblPayments.GuestNo
WHERE tblGuests.NoShow=False
GROUP BY tblGuests.GuestName;
then it will be very quick again. (Only 2 tables are involved in the query.) Also
SELECT tblGuests.GuestName, Sum(tblPayments.HUFpayment) AS SumOfPayment, tblGuests.NoShow, tblRooms.RoomName
FROM (tblGuests LEFT JOIN tblPayments ON tblGuests.GuestID = tblPayments.GuestNo) LEFT JOIN tblRooms ON tblGuests.RoomNo = tblRooms.RoomID
GROUP BY tblGuests.GuestName, tblGuests.NoShow, tblRooms.RoomName;
will have no problem at all because there is no "WHERE" clause. However the very similar code I mentioned in the beginning of the post will be very slow, unless I run it directly on the server (or via Pass Through Query).
Do you have any idea what can cause this problem and how to avoid it (except to run Pass Through Queries all the time)?
Related
I am trying to do something that I assume is super simple, but I am missing something. I have two tables: one is a list of all company drivers, and the other is a list of the most recent incidents for the drivers. I want a query that takes the list of drivers and shows whether each one has an incident or not. The query I have now only shows the drivers that have an incident, not the ones that do not. I assume it has something to do with my joins.
SELECT DriverProfile.DriverID, DriverProfile.FirstName, DriverProfile.LastName, dbo_manpowerprofile.mpp_senioritydate AS [Seniority Date], Last(Incidents.[Event Date]) AS [Incident Date]
FROM (DriverProfile LEFT JOIN dbo_manpowerprofile ON DriverProfile.DriverID = dbo_manpowerprofile.mpp_id) LEFT JOIN Incidents ON DriverProfile.DriverID = Incidents.Driver
WHERE (((Incidents.Type)<>"OBS") AND ((Incidents.Preventability)<>"TNP" And (Incidents.Preventability)<>"NTNP") AND ((DriverProfile.ActiveYN)="Y"))
GROUP BY DriverProfile.DriverID, DriverProfile.FirstName, DriverProfile.LastName, dbo_manpowerprofile.mpp_senioritydate;
The problem is actually in your WHERE clause. Applying criteria to a left-joined table (Incidents, in your query) effectively turns that join into an inner join: for a driver with no incident, the Incidents columns are NULL, so the <> comparisons are not true and the row is filtered out. If you are working in Access, you need to create a separate query for that table and apply the criteria there. Then left join that query to DriverProfile. If you were working in SQL Server, you could include the criteria in the join itself.
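A minimal sketch of that advice, using Python's built-in sqlite3 module as a stand-in for Access (an assumption made only so the example runs anywhere). The tables are cut-down, hypothetical versions of DriverProfile and Incidents with made-up rows, and MAX replaces the Access-specific Last. The first query filters the left-joined table in WHERE; the second applies the criteria in a separate derived query and then left joins it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DriverProfile (DriverID INTEGER PRIMARY KEY, LastName TEXT, ActiveYN TEXT);
CREATE TABLE Incidents (Driver INTEGER, Type TEXT, EventDate TEXT);
-- Hypothetical sample data: driver 2 is active but has no incidents.
INSERT INTO DriverProfile VALUES (1, 'Adams', 'Y'), (2, 'Baker', 'Y'), (3, 'Clark', 'N');
INSERT INTO Incidents VALUES (1, 'ACC', '2012-03-01'), (1, 'OBS', '2012-04-01');
""")

# Criteria on Incidents in WHERE: for driver 2 the Incidents columns are NULL,
# NULL <> 'OBS' is not true, so the row is dropped (the join acts like INNER).
where_rows = conn.execute("""
    SELECT d.DriverID, MAX(i.EventDate)
    FROM DriverProfile AS d
    LEFT JOIN Incidents AS i ON d.DriverID = i.Driver
    WHERE i.Type <> 'OBS' AND d.ActiveYN = 'Y'
    GROUP BY d.DriverID
    ORDER BY d.DriverID
""").fetchall()

# Criteria applied inside a separate (derived) query, which is then left joined.
subquery_rows = conn.execute("""
    SELECT d.DriverID, MAX(f.EventDate)
    FROM DriverProfile AS d
    LEFT JOIN (SELECT Driver, EventDate FROM Incidents WHERE Type <> 'OBS') AS f
        ON d.DriverID = f.Driver
    WHERE d.ActiveYN = 'Y'
    GROUP BY d.DriverID
    ORDER BY d.DriverID
""").fetchall()

print(where_rows)     # [(1, '2012-03-01')]
print(subquery_rows)  # [(1, '2012-03-01'), (2, None)]
```

In Access itself the derived query would typically be a saved query that DriverProfile is left joined to; the criteria on DriverProfile.ActiveYN can stay in WHERE because DriverProfile is the preserved side of the join.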
Let's say I have the following query:
SELECT occurs.*, events.*
FROM occurs
INNER JOIN events ON (events.event_id = occurs.event_id)
WHERE events.event_state = 'visible'
Another way to do the same query and get the same results would be:
SELECT occurs.*, events.*
FROM occurs
INNER JOIN events ON (events.event_id = occurs.event_id
AND events.event_state = 'visible')
My question: is there a real difference? Is one way faster than the other? Why would I choose one way over the other?
For an INNER JOIN, there's no conceptual difference between putting a condition in ON and in WHERE. It's a common practice to use ON for conditions that connect a key in one table to a foreign key in another table, such as your event_id, so that other people maintaining your code can see how the tables relate.
If you suspect that your database engine is mis-optimizing a query plan, you can try it both ways. Make sure to time the query several times to isolate the effect of caching, and make sure to run ANALYZE TABLE occurs and ANALYZE TABLE events to provide more info to the optimizer about the distribution of keys. If you do find a difference, have the database engine EXPLAIN the query plans it generates. If there's a gross mis-optimization, you can create an Oracle account and file a feature request against MySQL to optimize a particular query better.
But for a LEFT JOIN, there's a big difference. A LEFT JOIN is often used to add details from a separate table if the details exist or return the rows without details if they do not. This query will return result rows with NULL values for b.* if no row of b matches both conditions:
SELECT a.*, b.*
FROM a
LEFT JOIN b
ON (condition_one
AND condition_two)
WHERE condition_three
Whereas this one will completely omit results that do not match condition_two:
SELECT a.*, b.*
FROM a
LEFT JOIN b ON some_condition
WHERE condition_two
AND condition_three
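The difference can be reproduced with a small sketch. It uses Python's sqlite3 module rather than MySQL (an assumption for portability; the LEFT JOIN semantics shown are standard SQL), and the tables a and b with their columns and rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE b (a_id INTEGER, state TEXT);
-- Row 1 has a matching 'visible' detail, row 2 a 'hidden' one, row 3 none.
INSERT INTO a VALUES (1, 'first'), (2, 'second'), (3, 'third');
INSERT INTO b VALUES (1, 'visible'), (2, 'hidden');
""")

# Both conditions in ON: every row of a survives, b.* is NULL where no row of b matches.
on_rows = conn.execute("""
    SELECT a.id, b.state
    FROM a
    LEFT JOIN b ON (b.a_id = a.id AND b.state = 'visible')
    ORDER BY a.id
""").fetchall()

# Second condition in WHERE: rows of a without a 'visible' match are omitted.
where_rows = conn.execute("""
    SELECT a.id, b.state
    FROM a
    LEFT JOIN b ON b.a_id = a.id
    WHERE b.state = 'visible'
    ORDER BY a.id
""").fetchall()

print(on_rows)     # [(1, 'visible'), (2, None), (3, None)]
print(where_rows)  # [(1, 'visible')]
```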
Code in this answer is dual licensed: CC BY-SA 3.0 or the MIT License as published by OSI.
Currently, I have some common tables created and waiting for their respective SELECT queries (including their INNER JOIN and LEFT JOIN instances).
I've run into cases where a table exists in some databases but not in others.
SELECT c.fname, c.lname, c2.colname
FROM `tbl_cus` c
LEFT JOIN `tbl_cus_optional_data` c2 ON c.cusid = c2.cusid
WHERE 1
tbl_cus_optional_data will exist in some databases but not in others.
When a database table does not exist, MySQL will throw an error and the query stops, as expected.
Is there a way to indicate inside the query that, if one of the LEFT JOIN tables does not exist, it should be ignored along with any of its columns in the SELECT portion?
I'm probably searching for a unicorn here, but I get surprised by database capabilities often enough.
SELECT
a.cdrID as cdrID,
a.userName as userName,
a.callingStationID as callingStationID,
a.orgClientAccountID as orgClientAccountID,
a.terClientAccountID as terClientAccountID,
a.calledStationID as calledStationID,
a.setupTime as setupTime,
a.connectTime as connectTime,
a.disconnectTime as disconnectTime,
a.orgDestCode as orgDestCode,
a.orgBilledDuration as orgBilledDuration,
a.orgBilledAmount as orgBilledAmount,
a.terDestCode as terDestCode,
a.terBilledDuration as terBilledDuration,
a.terBilledAmount as terBilledAmount,
a.orgRateID as orgRateID,
a.terRateID as terRateID,
b.dtDestName as orgDestName,
c.dtDestName as terDestName,
d.clCustomerID as terClientName,
1 as cdrwsid,
cast((e.crFlatRate*a.orgBilledAmount)as decimal(10,4)) as cdrsale,
cast((f.crFlatRate*a.terBilledAmount)as decimal(10,4)) as cdrpurchase,
cast(((e.crFlatRate*a.orgBilledAmount)-(f.crFlatRate*a.terBilledAmount))as decimal(10,4)) as cdrprofit
FROM Successful.vbSuccessfulCDR_508 a
inner join iTelBilling.vbDestination b on a.orgDestCode=b.dtDestCode
inner join iTelBilling.vbDestination c on a.terDestCode=c.dtDestCode
inner join iTelBilling.vbClient d on a.terClientAccountID=d.clAccountID
inner join iTelBilling.vbCallRate e on a.orgRateID=e.crCallRateID
inner join iTelBilling.vbCallRate f on a.terRateID=f.crCallRateID
where setupTime between '1317761709564' and '1317804909564' and a.terBilledDuration!=0
I have a problem with this query: sometimes it runs fine, sometimes it hangs on the server, and sometimes it throws a "too many connections" error. Can anyone tell me what to do?
It sounds like this query is running for a very long time. A likely cause is the indexes that the query uses (or fails to use). To get an overview (and perhaps optimize your indexes and primary keys), use the command:
> EXPLAIN SELECT
a.cdrID as cdrID,
a.userName as userName,
...
Another possible reason is a deadlock, or a situation where the query runs very long because a table is locked. If this happens, other users who execute that query (I assume you are using it in a web server context) build up a queue of waiting queries. Each waiting query needs a connection of its own, so your server runs out of concurrent connections in a short time.
This can be solved in three ways:
1) Make your query faster (check the primary keys and indexes).
2) Increase the concurrent connections setting of your MySQL server:
This can be done by setting the following value to 200 connections (for example) in your my.cnf
max_connections = 200
3) Optimize your MySQL configuration. Make sure your query cache, key buffer, ... are set to fitting values. Further information on MySQL performance tuning can be found here.
It might be that the engine is trying to use your SMALLER lookup tables to drive the join instead of your PRIMARY table of CDR (call data records), as in phone system billing. You are trying to get proper origination / destination billing codes and rates. Sometimes MySQL will try to think for you by using the smaller tables first.
Ensure you have an index on "setupTime" in your Successful table. In addition, add the "STRAIGHT_JOIN" keyword right after SELECT:
SELECT STRAIGHT_JOIN ... rest of query.
This tells MySQL to join the tables in the order in which you have listed them. It appears the joins to your destination, client and call rate tables WOULD have the corresponding index on their join keys respectively... if not, create them.
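To illustrate why the index on setupTime matters, here is a rough sketch that compares the query plan before and after creating it. It runs on SQLite through Python's sqlite3 module, not MySQL, so the plan text differs from MySQL's EXPLAIN output and STRAIGHT_JOIN is not shown. The table cdr, the index name idx_cdr_setuptime and the generated rows are hypothetical, and setupTime is stored as an integer here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cdr (cdrID INTEGER PRIMARY KEY, setupTime INTEGER, terBilledDuration INTEGER)"
)
# Hypothetical call records, one per second starting shortly before the queried range.
conn.executemany(
    "INSERT INTO cdr (setupTime, terBilledDuration) VALUES (?, ?)",
    [(1317761000000 + i * 1000, i % 3) for i in range(5000)],
)

QUERY = (
    "SELECT cdrID FROM cdr "
    "WHERE setupTime BETWEEN 1317761709564 AND 1317804909564 "
    "AND terBilledDuration != 0"
)

def plan(connection):
    # The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
    return [row[-1] for row in connection.execute("EXPLAIN QUERY PLAN " + QUERY)]

plan_before = plan(conn)  # no usable index yet, so the table is scanned
conn.execute("CREATE INDEX idx_cdr_setuptime ON cdr (setupTime)")
plan_after = plan(conn)   # the range condition on setupTime can now use the index

print(plan_before)
print(plan_after)
```

On MySQL the equivalent check is to run EXPLAIN on the real query before and after adding the index and compare the key column for table a.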
I have a query in MS-Access like this:
select DISTINCTROW companies.* from companies, contacts, companies left join contacts on contacts.com_uid = companies.com_uid (This is the ms-access form of a standard "left-join")
[companies] and [contacts] are linked views on a SQL Server 2008; the ODBC driver is "SQL Server Native Client 10.0". The views look like "select * from [companies] where deleted = 0" and "select * from [contacts] where deleted = 0".
The result is wrong: each company is shown as many times as it has contacts.
If the Views are stored on a SQL2000 and linked with the ODBC-driver "SQL Server" everything is fine: All the companies are shown exactly once.
Are there any solutions to get the result with DISTINCTROW again?
I'm surprised it executes that query at all. You're specifying the table "contacts" twice.
Your LEFT JOIN should return every row from "companies". Since you're not retrieving any columns from contacts, I'm pretty sure your query is equivalent to
SELECT *
FROM companies
as long as "companies" means what it does in ordinary language.
If that turns out not to be the case, you can hand the burden off to SQL Server either by creating a view in SQL Server, or by creating a passthrough query in Access. A passthrough query will have to be written in your server's dialect of SQL (SQL Server 2008 dialect of SQL).
Your revision, reproduced below, does nothing to change my earlier comments.
select DISTINCTROW companies.*
from companies, contacts, companies
left join contacts on contacts.com_uid = companies.com_uid
(This is the ms-access form of a standard "left-join")
That's not Access's form of a left join. Access won't allow this:
from companies, contacts, companies
left join contacts
because you're now specifying both tables twice.
Based on your edit, I'd say the query you're trying to write is still equivalent to
SELECT *
FROM companies
What do you get if you run that?
Let's stop talking about the syntax of a left-join in ms-access. Fact is that if the linked tables are views on sql-server 2000:
create view [companies] as
select * from [TabCompanies] where deleted = 0
and
create view [contacts] as
select * from [TabContcts] where deleted = 0
These views are ODBC-linked tables in an MS Access 2003/2007 mdb.
The question arises in MS Access with a query like
select distinctrow [companies].* from [companies] left join [contacts] on [companies].com_uid = [contacts].com_uid where [contacts].[function] like 'C*'
(Let's forget the alternative syntax and look at the result, assuming that the left join works without an error or syntax error.)
DISTINCTROW is an MS Access feature that is not known in SQL Server. From my point of view the result is the same as with DISTINCT, but it also works if there are columns with an image data type, for example.
Altogether, we would now expect the same result as Catcall said in his answer, "select * from companies", BUT IT IS NOT. Why?
This is only an excerpt of the whole query and may make no sense for production, but it shows the changed behaviour when SQL Server 2008 is connected.
The purpose of DISTINCTROW is to make editable the two sides of an N:1 join. With a Cartesian product (from companies, contacts, companies), the result cannot be editable, so DISTINCTROW has no advantage over DISTINCT.
Secondly, no matter what you say, it is not possible to have the same table twice in a FROM clause without an alias. The SQL you've posted could not have worked in any version of Access.
The only way I can possibly imagine there's any sense in what you've posted is if you've omitted a WHERE clause.
EDIT BASED ON COMMENTS:
This should work:
SELECT DISTINCT companies.*
FROM companies INNER JOIN contacts ON companies.com_uid = contacts.com_uid
WHERE contacts.function LIKE "C*"
First off, I'd assume a normal N:1 relationship between contacts and companies (i.e., many contact records are linked to any single company record), so with both tables in the FROM clause, you do need a DISTINCT to return a single row for each company.
Secondly, if you place criteria on the table on the many side of the JOIN, there's no reason to attempt to use a LEFT JOIN, as it won't change the records returned (use a LEFT JOIN when you want to return records regardless of whether or not there are records in the table on the many side of the JOIN). So, an INNER JOIN is going to do the job for you, and be more efficient (outer JOINs are just slower, even with criteria).
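A small sketch of the effect of DISTINCT on the N:1 join, run on SQLite through Python's sqlite3 module instead of Access (an assumption; this is also why the pattern is written 'C%' rather than Access's "C*"). The company and contact rows, and the con_uid and name columns, are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (com_uid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE contacts (con_uid INTEGER PRIMARY KEY, com_uid INTEGER, function TEXT);
-- Hypothetical data: Acme has two contacts whose function starts with 'C'.
INSERT INTO companies VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
INSERT INTO contacts VALUES (1, 1, 'CEO'), (2, 1, 'CFO'), (3, 2, 'Sales'), (4, 3, 'CTO');
""")

JOIN_AND_FILTER = """
    FROM companies
    INNER JOIN contacts ON companies.com_uid = contacts.com_uid
    WHERE contacts.function LIKE 'C%'
    ORDER BY companies.com_uid
"""

# Without DISTINCT, a company appears once per matching contact.
plain_rows = conn.execute(
    "SELECT companies.com_uid, companies.name" + JOIN_AND_FILTER
).fetchall()

# With DISTINCT, each matching company appears exactly once.
distinct_rows = conn.execute(
    "SELECT DISTINCT companies.com_uid, companies.name" + JOIN_AND_FILTER
).fetchall()

print(plain_rows)     # [(1, 'Acme'), (1, 'Acme'), (3, 'Initech')]
print(distinct_rows)  # [(1, 'Acme'), (3, 'Initech')]
```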