I am trying to do something I assume is super simple, but I am missing something. I have two tables: one is a list of all company drivers, and the other is a list of the most recent incidents for those drivers. I want a query that takes the list of drivers and shows whether each one has an incident or not. The query I have now only shows the drivers that have an incident, not the ones that do not. I am assuming it has something to do with my joins.
SELECT DriverProfile.DriverID, DriverProfile.FirstName, DriverProfile.LastName, dbo_manpowerprofile.mpp_senioritydate AS [Seniority Date], Last(Incidents.[Event Date]) AS [Incident Date]
FROM (DriverProfile LEFT JOIN dbo_manpowerprofile ON DriverProfile.DriverID = dbo_manpowerprofile.mpp_id) LEFT JOIN Incidents ON DriverProfile.DriverID = Incidents.Driver
WHERE (((Incidents.Type)<>"OBS") AND ((Incidents.Preventability)<>"TNP" And (Incidents.Preventability)<>"NTNP") AND ((DriverProfile.ActiveYN)="Y"))
GROUP BY DriverProfile.DriverID, DriverProfile.FirstName, DriverProfile.LastName, dbo_manpowerprofile.mpp_senioritydate;
The problem is actually in your WHERE clause. Applying criteria to a left-joined table (here, Incidents) effectively turns that join into an inner join: for drivers with no incident the Incidents columns are Null, so the comparisons in the WHERE clause are never true and those rows are dropped. If you are working in Access, create a separate query for that table and apply the criteria there, then left join that query to DriverProfile. If you were working in SQL Server, you could include the criteria in the join itself.
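To make the effect concrete, here is a minimal sketch using Python's sqlite3 module rather than Access or SQL Server. The table contents are invented, the column list is trimmed, and MAX() stands in for Access's Last(); it only illustrates the difference between filtering in WHERE and filtering in the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DriverProfile (DriverID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE Incidents (Driver INTEGER, Type TEXT, EventDate TEXT);
    -- Invented sample data: driver 3 has no incidents at all.
    INSERT INTO DriverProfile VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
    INSERT INTO Incidents VALUES (1, 'ACC', '2015-03-01'), (2, 'OBS', '2015-04-01');
""")

# Criteria in WHERE: for unmatched drivers i.Type is NULL, the comparison
# is not true, and the row is filtered out (the join behaves like an inner join).
where_rows = conn.execute("""
    SELECT d.DriverID, MAX(i.EventDate)
    FROM DriverProfile AS d
    LEFT JOIN Incidents AS i ON d.DriverID = i.Driver
    WHERE i.Type <> 'OBS'
    GROUP BY d.DriverID
    ORDER BY d.DriverID
""").fetchall()

# Criteria in the join: every driver is kept, with NULL where nothing qualifies.
on_rows = conn.execute("""
    SELECT d.DriverID, MAX(i.EventDate)
    FROM DriverProfile AS d
    LEFT JOIN Incidents AS i ON d.DriverID = i.Driver AND i.Type <> 'OBS'
    GROUP BY d.DriverID
    ORDER BY d.DriverID
""").fetchall()

print(where_rows)
print(on_rows)
```

In this sketch only driver 1 survives the WHERE version, while the ON version returns all three drivers with an empty incident date for drivers 2 and 3.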
SELECT dbo.postst.cost as Cost2014,
case when (dbo.InvNum.OrderDate) between '2014/01/01' and '2014/12/31' then dbo.Postst.Cost end as Cost2014
from dbo.postst INNER JOIN dbo.invnum on dbo.invnum.autoindex = dbo.postst.accountLink
Here is the technique I use for tracking down an issue like this:
SELECT dbo.postst.cost as Cost2014,
dbo.InvNum.OrderDate,
case when (dbo.InvNum.OrderDate) between '2014/01/01' and '2014/12/31' then dbo.Postst.Cost end as Cost2014
from dbo.postst
left JOIN dbo.invnum on dbo.invnum.autoindex = dbo.postst.accountLink
So I add the column I am processing so I can see the values being returned. Then I change the inner join to a left join (temporarily). When I run the select, I can usually see why the values are not meeting my expectations and why my query is not returning the correct results.
In this case, you may not have any data in the right range or the join might be incorrect and thus no records are picked up at all.
What you have is a relatively simple query. If it were more complex, I would often use select * instead of the specific columns, just to see whether something in the other columns is affecting the results. This is often the case when you have a one-to-many relationship and want only one record but are getting duplicates in the fields you selected.
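As a rough illustration of this debugging approach, the sketch below runs both forms against an in-memory SQLite database through Python's sqlite3 module. The rows are made up, and the date literals use ISO format so that BETWEEN compares correctly on text; your real data and date type will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE postst (accountLink INTEGER, cost REAL);
    CREATE TABLE invnum (autoindex INTEGER, OrderDate TEXT);
    -- Invented data: one order outside 2014, one postst row with no match.
    INSERT INTO postst VALUES (10, 5.0), (11, 7.5);
    INSERT INTO invnum VALUES (10, '2013-06-30'), (99, '2014-02-01');
""")

# Original shape: inner join, only the computed column is visible.
inner_rows = conn.execute("""
    SELECT CASE WHEN i.OrderDate BETWEEN '2014-01-01' AND '2014-12-31'
                THEN p.cost END AS Cost2014
    FROM postst AS p
    INNER JOIN invnum AS i ON i.autoindex = p.accountLink
""").fetchall()

# Debugging shape: expose the inputs and switch to a left join temporarily.
debug_rows = conn.execute("""
    SELECT p.accountLink, p.cost, i.OrderDate,
           CASE WHEN i.OrderDate BETWEEN '2014-01-01' AND '2014-12-31'
                THEN p.cost END AS Cost2014
    FROM postst AS p
    LEFT JOIN invnum AS i ON i.autoindex = p.accountLink
    ORDER BY p.accountLink
""").fetchall()

print(inner_rows)
print(debug_rows)
```

With the extra columns visible, it is clear that one row has an order date outside the range and the other has no matching invnum row at all.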
In one of my tables, some customers have multiple lines; this could be due to re-visits from technicians etc. What I want to do is, for each customer ID, analyse whether a re-visit has taken place and place a marker against their name.
I have tried to combine an if/in statement that analyses the max/min visit dates for each customer ID, so if the max > min it is classed as a "re-visit". However, I keep getting a syntax error.
Can someone help?
This is a job for two SQL queries:
1st query:
SELECT customerID, count(customerID) as visitCount
FROM tableOfInterest
GROUP BY customerID
2nd query uses first query:
UPDATE customerManifest INNER JOIN queryAbove ON queryAbove.customerID = customerManifest.customerID
SET customerManifest.multipleVisitIndicatorField = queryAbove.visitCount
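Here is a small sketch of the same count-then-update idea, run through Python's sqlite3 module. SQLite does not support the UPDATE ... INNER JOIN form, so this version uses a correlated subquery instead; the sample rows and the visitDate column are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableOfInterest (customerID INTEGER, visitDate TEXT);
    CREATE TABLE customerManifest (
        customerID INTEGER PRIMARY KEY,
        multipleVisitIndicatorField INTEGER DEFAULT 0
    );
    -- Invented data: customer 1 was visited twice, customer 3 never.
    INSERT INTO tableOfInterest VALUES
        (1, '2015-01-05'), (1, '2015-02-10'), (2, '2015-01-20');
    INSERT INTO customerManifest (customerID) VALUES (1), (2), (3);
""")

# Store the visit count per customer (the "1st query" folded into the update).
conn.execute("""
    UPDATE customerManifest
    SET multipleVisitIndicatorField = (
        SELECT COUNT(*)
        FROM tableOfInterest AS t
        WHERE t.customerID = customerManifest.customerID
    )
""")

# A count above 1 means a re-visit took place; SQLite returns 1 or 0 here.
manifest = conn.execute("""
    SELECT customerID,
           multipleVisitIndicatorField,
           multipleVisitIndicatorField > 1 AS isRevisit
    FROM customerManifest
    ORDER BY customerID
""").fetchall()

print(manifest)
```

Comparing the stored count with 1 gives the re-visit marker without needing the max/min date comparison.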
I'm having an issue getting this SQL query to work properly.
I have the following query
SELECT apps.*,
SUM(IF(adtracking.appId = apps.id AND adtracking.id = transactions.adTrackingId, transactions.payoutAmount, 0)) AS 'revenue',
SUM(IF(adtracking.appId = apps.id AND adtracking.type = 'impression', 1, 0)) AS 'impressions'
FROM apps, adtracking, transactions
WHERE apps.userId = '$userId'
GROUP BY apps.id
Everything is working, HOWEVER for the 'impressions' column I am generating in the query, I am getting a WAY larger number than there should be. For example, one matching app for this query should only have 72 for 'Impressions' yet it is coming up with a value of over 3,000 when there aren't even that many rows in the adtracking table. Why is this? What is wrong here?
Your problem is that you have no join conditions, so your query result contains every combination of rows from the tables being joined. This is called a cartesian product.
To fix, change your FROM clause to this:
FROM apps a
LEFT JOIN adtracking ad ON ad.appId = a.id
LEFT JOIN transactions t ON t.adTrackingId = ad.id
You haven't provided the schema for your tables, so I guessed the names of the relevant columns - you may have to adjust them. Also, your transaction table may join to adtracking - it's impossible to know from your question, so again you may have to alter things slightly. Hopefully you get the idea.
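The inflation is easy to reproduce on a tiny data set. The sketch below uses Python's sqlite3 module with a guessed schema and invented rows; SQLite has no IF(), so CASE is used in its place, and the user id is passed as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apps (id INTEGER PRIMARY KEY, userId INTEGER);
    CREATE TABLE adtracking (id INTEGER PRIMARY KEY, appId INTEGER, type TEXT);
    CREATE TABLE transactions (
        id INTEGER PRIMARY KEY, adTrackingId INTEGER, payoutAmount REAL
    );
    -- Invented data: one app, two impressions, one click with two payouts.
    INSERT INTO apps VALUES (1, 7);
    INSERT INTO adtracking VALUES
        (1, 1, 'impression'), (2, 1, 'impression'), (3, 1, 'click');
    INSERT INTO transactions VALUES (1, 3, 0.5), (2, 3, 0.25);
""")

# No join conditions: each impression is repeated once per transactions row.
cartesian = conn.execute("""
    SELECT apps.id,
           SUM(CASE WHEN adtracking.appId = apps.id
                     AND adtracking.type = 'impression' THEN 1 ELSE 0 END)
    FROM apps, adtracking, transactions
    WHERE apps.userId = ?
    GROUP BY apps.id
""", (7,)).fetchone()

# Explicit join conditions: rows are only paired when they actually relate.
joined = conn.execute("""
    SELECT a.id,
           COALESCE(SUM(t.payoutAmount), 0) AS revenue,
           SUM(CASE WHEN ad.type = 'impression' THEN 1 ELSE 0 END) AS impressions
    FROM apps AS a
    LEFT JOIN adtracking AS ad ON ad.appId = a.id
    LEFT JOIN transactions AS t ON t.adTrackingId = ad.id
    WHERE a.userId = ?
    GROUP BY a.id
""", (7,)).fetchone()

print(cartesian)
print(joined)
```

With these rows the unjoined query counts 4 impressions where only 2 exist. Note that even the joined form would count an impression once per matching transaction if a single adtracking row could have several, so check that against your real data.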
Edit:
Note: your group-by clause is incorrect. You either need to list every column of apps (not recommended), or change your select to only select the id column from apps (recommended). Change your select to this:
SELECT apps.id,
-- rest of query the same
Otherwise you'll get weird, incorrect results.
I have a reasonably complex MySQL query being run on another developer's database. I am trying to copy over his data to our new database structure, so I'm running this query to get a load of the data over to copy. The main table has around 45,000 rows.
As you can see from the query below, there's a lot of fields from several different tables. My problem is that the field Ref.refno (as ref_id) is being pulled through, in some cases, two or three times. This is because in the table LandlordOnlineRef (LLRef) there are sometimes multiple rows with this same reference number - in this case, because the row should have been edited, but instead was duplicated...
Here's what I've tried doing: -
SELECT DISTINCT(Ref.refno) [...] - this makes no difference to the output at all, although I would've assumed it would stop selecting duplicate refno IDs. Is this a MySQL bug, or me?
I also tried adding GROUP BY ref_id to the end of my query. The query normally takes a few milliseconds to run, but when I add GROUP BY to the end, it seems to run infinitely - I waited several minutes but nothing was happening. I thought it might be struggling because I'm using LIMIT 1000, so I also tried LIMIT 10 but still get the same effect.
Here's the problem query - thanks!
SELECT
-- progress
Ref.refno AS ref_id,
Ref.tenantid AS tenant_id,
Ref.productid AS product_id,
Ref.guarantorid AS guarantor_id,
Ref.agentid AS agent_id,
Ref.companyid AS company_id,
Ref.status AS status,
Ref.startdate AS ref_start_date,
Ref.enddate AS ref_end_date,
-- ReferenceDetails
RefDetails.creditscore AS credit_score,
-- LandlordOnlineRef
LLRef.propaddress AS prev_ll_address,
LLRef.rent AS prev_ll_rent,
LLRef.startdate AS prev_ll_start_date,
LLRef.enddate AS prev_ll_end_date,
LLRef.arrears AS prev_ll_arrears,
LLRef.arrearsreason AS prev_ll_arrears_reason,
LLRef.propertycondition AS prev_ll_property_condition,
LLRef.conditionreason AS prev_ll_condition_reason,
LLRef.consideragain AS prev_ll_consider_again,
LLRef.completedby AS prev_ll_completed_by,
LLRef.contactno AS prev_ll_contact_no,
LLRef.landlordagent AS prev_ll_or_agent,
-- EmpDetails
EmpRef.cempname AS emp_name,
EmpRef.cempadd1 AS emp_address_1,
EmpRef.cempadd2 AS emp_address_2,
EmpRef.cemptown AS emp_address_town,
EmpRef.cempcounty AS emp_address_county,
EmpRef.cemppostcode AS emp_address_postcode,
EmpRef.ctelephone AS emp_telephone,
EmpRef.cemail AS emp_email,
EmpRef.ccontact AS emp_contact,
EmpRef.cgross AS emp_income,
EmpRef.cyears AS emp_years,
EmpRef.cmonths AS emp_months,
EmpRef.cposition AS emp_position,
-- EmpLlodReference
ELRef.lod_ref_status AS prev_ll_status,
ELRef.lod_ref_email AS prev_ll_email,
ELRef.lod_ref_tele AS prev_ll_telephone,
ELRef.emp_ref_status AS emp_status,
ELRef.emp_ref_tele AS emp_telephone,
ELRef.emp_ref_email AS emp_email
FROM ReferenceDetails AS RefDetails
LEFT JOIN progress AS Ref ON Ref.refno
LEFT JOIN LandlordOnlineRef AS LLRef ON LLRef.refno = Ref.refno
LEFT JOIN EmpLlodReference AS ELRef ON ELRef.refno = Ref.refno
LEFT JOIN EmpDetails AS EmpRef ON EmpRef.tenantid = Ref.tenantid
-- For testing purposes to speed things up, limit it to 1000 rows
LIMIT 1000
LEFT JOIN progress AS Ref ON Ref.refno
is basically going to turn that into a cartesian join. You're not doing an explicit comparison; you're saying "join all records where refno is non-null and non-zero".
Shouldn't it be
LEFT JOIN progress AS Ref ON Ref.refno = RefDetails.something
?
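A quick way to see the difference is to count the rows each form produces. This sketch uses Python's sqlite3 module, which, like MySQL, treats a bare numeric column in ON as a truth value. The refno column on ReferenceDetails is purely an assumption for the demo, since the real linking column is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- ReferenceDetails.refno is a hypothetical linking column.
    CREATE TABLE ReferenceDetails (refno INTEGER, creditscore INTEGER);
    CREATE TABLE progress (refno INTEGER, status TEXT);
    INSERT INTO ReferenceDetails VALUES (1, 600), (2, 700);
    INSERT INTO progress VALUES (1, 'open'), (2, 'closed'), (3, 'open');
""")

# ON with a bare column: true for every non-zero refno, so each
# ReferenceDetails row is paired with every progress row.
no_comparison = conn.execute("""
    SELECT COUNT(*)
    FROM ReferenceDetails AS RefDetails
    LEFT JOIN progress AS Ref ON Ref.refno
""").fetchone()[0]

# ON with an explicit comparison: one match per reference.
with_comparison = conn.execute("""
    SELECT COUNT(*)
    FROM ReferenceDetails AS RefDetails
    LEFT JOIN progress AS Ref ON Ref.refno = RefDetails.refno
""").fetchone()[0]

print(no_comparison, with_comparison)
```

Two ReferenceDetails rows against three progress rows give six result rows without the comparison and two with it.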
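DISTINCT is not a function: it applies to the whole selected row, so SELECT DISTINCT(Ref.refno), ... only removes rows that are identical in every selected column. If you want to keep the renaming, wrap another SELECT DISTINCT * FROM (YOUR_SELECT) AS t around it (MySQL requires the alias on a derived table).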
Are there indexes on the columns in the GROUP BY clause? LIMIT is applied after GROUP BY, so limiting does not reduce the work the grouping has to do.
The general rule is to never group by more columns than you have to. Use a subquery with a GROUP BY on the table that's returning duplicate rows to get rid of them.
Change:
LEFT JOIN LandlordOnlineRef AS LLRef ON LLRef.refno = Ref.refno
to:
LEFT JOIN (select refno
, othercolumns you need
from LandlordOnlineRef
group by refno, othercolumn) as LLRef ON LLRef.refno = Ref.refno
Not sure which columns you'll want to include here, but at any table level you can change that table to a subquery to eliminate duplicate rows before the join. As MarkBannister says, you'll need some logic to identify a unique refno within LLRef. You can also use a date column for 'most recent', or any other logic you can think of, to get back a unique LLRef row and the info related to that record.
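One possible form of the 'most recent' logic is sketched below with Python's sqlite3 module. It assumes LandlordOnlineRef has an auto-incrementing id column where a higher id means a later row; that column, the trimmed column list, and the sample rows are all assumptions, so substitute whatever date or key actually identifies the latest row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE progress (refno INTEGER PRIMARY KEY, status TEXT);
    -- id is a hypothetical auto-increment key used to pick the latest row.
    CREATE TABLE LandlordOnlineRef (id INTEGER PRIMARY KEY, refno INTEGER, rent REAL);
    INSERT INTO progress VALUES (1, 'open'), (2, 'closed');
    -- refno 1 was duplicated instead of edited.
    INSERT INTO LandlordOnlineRef VALUES (1, 1, 500), (2, 1, 550), (3, 2, 700);
""")

# Plain join: refno 1 comes back twice.
plain = conn.execute("""
    SELECT Ref.refno, LLRef.rent
    FROM progress AS Ref
    LEFT JOIN LandlordOnlineRef AS LLRef ON LLRef.refno = Ref.refno
    ORDER BY Ref.refno, LLRef.id
""").fetchall()

# Deduplicate inside a subquery first, keeping the highest id per refno.
deduped = conn.execute("""
    SELECT Ref.refno, LLRef.rent
    FROM progress AS Ref
    LEFT JOIN (
        SELECT l.refno, l.rent
        FROM LandlordOnlineRef AS l
        JOIN (SELECT refno, MAX(id) AS max_id
              FROM LandlordOnlineRef
              GROUP BY refno) AS latest ON latest.max_id = l.id
    ) AS LLRef ON LLRef.refno = Ref.refno
    ORDER BY Ref.refno
""").fetchall()

print(plain)
print(deduped)
```

Because the grouping touches only LandlordOnlineRef and only its key columns, the outer query needs no GROUP BY at all.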