MySQL Inner join naming error? - mysql

http://sqlfiddle.com/#!9/e6effb/1
I'm trying to get the top 10 brands by revenue for France in December.
There are 2 tables (the first table has the date, the second table has the brand, and I'm trying to join them).
I get this error: "FUNCTION db_9_d870e5.SUM does not exist. Check the 'Function Name Parsing and Resolution' section in the Reference Manual"
Is my use of INNER JOIN there correct?

It's because you had an extra space after SUM. Please change it from
SUM (o1.total_net_revenue) to SUM(o1.total_net_revenue).
See the 'Function Name Parsing and Resolution' section of the MySQL Reference Manual, which the error message points to.
Also, after correcting that, your query still had another error: you were not selecting order_id in your intermediate table i2. The edited query is:
SELECT o1.order_id, o1.country, i2.brand,
SUM(o1.total_net_revenue)
FROM orders o1
INNER JOIN (
SELECT i1.brand, SUM(i1.net_revenue) AS total_net_revenue,order_id
FROM ordered_items i1
WHERE i1.country = 'France'
GROUP BY i1.brand
) i2
ON o1.order_id = i2.order_id AND o1.total_net_revenue = i2.total_net_revenue
WHERE o1.country = 'France' AND o1.created_at BETWEEN '2016-12-01' AND '2016-12-31'
GROUP BY 1,2,3
ORDER BY 4
LIMIT 10

EDIT: stack Fan is correct that o1.total_net_revenue exists. My confusion arose because the data structure duplicates three columns between the tables, including the one that was being looked for.
There were a couple of errors in your SQL statement:
1. You were referencing an invalid column in your outer-select SUM function. I believe you're actually after i2.total_net_revenue.
The table structure is terrible: the "important" columns (country, revenue, order_id) are duplicated between the two tables. I would also expect the revenue columns to share the same name if they always hold the same values. In the example, there's no difference between i1.net_revenue and o1.total_net_revenue.
2. In the subquery of your inner join, you didn't select i1.order_id, which meant that your ON clause couldn't execute correctly.
PROTIP:
When you run into an issue like this, take all the complicated bits out of your query and get the base query working correctly first. THEN add your functions.
PROTIP:
In your GROUP BY clause, reference the actual columns, NOT the column numbers. It makes your query more robust.
This is the query I ended up with:
SELECT o1.order_id, o1.country, i2.brand,
SUM(i2.total_net_revenue) AS total_rev
FROM orders o1
INNER JOIN (
SELECT i1.order_id, i1.brand, SUM(i1.net_revenue) AS total_net_revenue
FROM ordered_items i1
WHERE i1.country = 'France'
GROUP BY i1.brand
) i2
ON o1.order_id = i2.order_id AND o1.total_net_revenue = i2.total_net_revenue
WHERE o1.country = 'France' AND o1.created_at BETWEEN '2016-12-01' AND '2016-12-31'
GROUP BY o1.order_id, o1.country, i2.brand
ORDER BY total_rev
LIMIT 10
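For comparison, here is a minimal sketch of the "top 10 brands by revenue for France in December" goal, run against SQLite through Python's sqlite3 module so it is self-contained. The miniature tables and sample rows are assumptions made up for illustration (only the column names come from the thread), and the query is one possible shape, not the query from the fiddle. It joins on order_id only, groups by brand, sorts descending, and uses a half-open date range so that timestamps later on December 31 would still be included.

```python
import sqlite3

# Hypothetical miniature of the two tables; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, country TEXT, created_at TEXT);
CREATE TABLE ordered_items (order_id INTEGER, brand TEXT, net_revenue REAL);
INSERT INTO orders VALUES
  (1, 'France',  '2016-12-05'),
  (2, 'France',  '2016-12-20'),
  (3, 'France',  '2016-11-30'),
  (4, 'Germany', '2016-12-10');
INSERT INTO ordered_items VALUES
  (1, 'Acme', 10.0), (1, 'Bolt', 5.0),
  (2, 'Acme', 7.5),
  (3, 'Acme', 100.0),
  (4, 'Bolt', 50.0);
""")

# Filter the orders first, then total the item revenue per brand.
top_brands = conn.execute("""
    SELECT i.brand, SUM(i.net_revenue) AS total_rev
    FROM orders o
    INNER JOIN ordered_items i ON i.order_id = o.order_id
    WHERE o.country = 'France'
      AND o.created_at >= '2016-12-01'
      AND o.created_at <  '2017-01-01'
    GROUP BY i.brand
    ORDER BY total_rev DESC
    LIMIT 10
""").fetchall()

print(top_brands)
```

Only the two French December orders contribute, so the November order and the German order do not affect the totals.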

Related

SQL Aggregate with join giving incorrect results

In a bid to learn SQL, I've added some dummy data, generated in Excel, into a few tables. I've got tables for customers, order headers and order lines.
I'm trying to check that the customer's balance, order header total and line totals all match.
But when I run this query I get incorrect output for the order header total. I believe this is because the SUM counts each header once for every order line that references it.
Can anyone tell me the correct way I should be doing it?
SELECT
cus.cus_id,
cus.cus_name,
cus.cus_balance,
SUM(orderheader.orderheader_currentsell) AS orderHeader_total,
SUM(orderlines.orderlines_currentsell) AS orderLines_total
FROM
cus
JOIN
orderheader ON orderheader.orderHeader_customer = cus.cus_id
JOIN
orderlines ON orderlines.orderlines_orderid = orderheader.orderHeader_id
GROUP BY cus.cus_name
Output (the highlighted column should be the same as the other values):
You have multiple rows for the header. To solve this, aggregate before doing the join. In your case, just aggregating the order lines should be sufficient:
SELECT c.cus_id, c.cus_name, c.cus_balance,
SUM(oh.orderheader_currentsell) AS orderHeader_total,
SUM(ol.orderLines_total) AS orderLines_total
FROM cus c JOIN
orderheader oh
ON oh.orderHeader_customer = c.cus_id JOIN
(SELECT ol.orderlines_orderid, SUM(ol.orderlines_currentsell) AS orderLines_total
FROM orderlines ol
GROUP BY ol.orderlines_orderid
) ol
ON ol.orderlines_orderid = oh.orderHeader_id
GROUP BY c.cus_id, c.cus_name, c.cus_balance;
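The effect of the duplicated header rows can be reproduced on a tiny data set. The sketch below uses Python's sqlite3 module with invented sample rows (one customer, one order header worth 100, two order lines worth 60 and 40); the table and column names follow the question, everything else is an assumption for illustration. It runs the naive join and the pre-aggregated join side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cus (cus_id INTEGER PRIMARY KEY, cus_name TEXT, cus_balance REAL);
CREATE TABLE orderheader (orderHeader_id INTEGER PRIMARY KEY,
                          orderHeader_customer INTEGER,
                          orderheader_currentsell REAL);
CREATE TABLE orderlines (orderlines_orderid INTEGER, orderlines_currentsell REAL);
INSERT INTO cus VALUES (1, 'Ann', 100.0);
INSERT INTO orderheader VALUES (10, 1, 100.0);
INSERT INTO orderlines VALUES (10, 60.0), (10, 40.0);
""")

# Naive version: the header row is repeated once per order line before SUM runs.
naive = conn.execute("""
    SELECT c.cus_name,
           SUM(oh.orderheader_currentsell) AS header_total,
           SUM(ol.orderlines_currentsell)  AS lines_total
    FROM cus c
    JOIN orderheader oh ON oh.orderHeader_customer = c.cus_id
    JOIN orderlines ol  ON ol.orderlines_orderid = oh.orderHeader_id
    GROUP BY c.cus_id, c.cus_name
""").fetchone()

# Pre-aggregated version: one row per order comes out of the derived table,
# so each header is counted exactly once.
fixed = conn.execute("""
    SELECT c.cus_name,
           SUM(oh.orderheader_currentsell) AS header_total,
           SUM(ol.lines_total)             AS lines_total
    FROM cus c
    JOIN orderheader oh ON oh.orderHeader_customer = c.cus_id
    JOIN (SELECT orderlines_orderid, SUM(orderlines_currentsell) AS lines_total
          FROM orderlines
          GROUP BY orderlines_orderid) ol
      ON ol.orderlines_orderid = oh.orderHeader_id
    GROUP BY c.cus_id, c.cus_name
""").fetchone()

print("naive:", naive)
print("fixed:", fixed)
```

In the naive result the header total is doubled because the order has two lines; in the pre-aggregated result both totals agree with the customer balance.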
Because you have different levels of grouping, it's not that trivial, and you need subselects.
You can calculate the total per customer as a subselect in the field list. In the code below I've done that just for the orders, but you could do the same for the order lines which are still solved by the grouping.
SELECT
cus.cus_id,
cus.cus_name,
cus.cus_balance,
( SELECT
SUM(orderheader_currentsell)
FROM
orderheader
WHERE
orderheader.orderHeader_customer = cus.cus_id) AS orderHeader_total,
SUM(orderlines.orderlines_currentsell) AS orderLines_total
FROM
cus
JOIN
orderheader ON orderheader.orderHeader_customer = cus.cus_id
JOIN
orderlines ON orderlines.orderlines_orderid = orderheader.orderHeader_id
GROUP BY cus.cus_id, cus.cus_name, cus.cus_balance
At first glance, I notice that you have:
cus.cus_id,
cus.cus_name,
cus.cus_balance,
as the non-aggregate columns. But in your Group-By you only have:
GROUP BY cus.cus_name
Group By should include all of the non-aggregate columns. This may be why you're not getting the expected results. That would be changed to:
GROUP BY cus.cus_id,
cus.cus_name,
cus.cus_balance

Grouping method

I am working on a query with the following format:
I require all the columns from table 'A', while I only require the summed amount (SUM(amount)) from table 'B'.
SELECT A.*, sum(B.CURTRXAM) as 'Current Transaction Amt'
FROM A
LEFT JOIN C
ON A.Schedule_Number = C.Schedule_Number
LEFT JOIN B
ON A.DOCNUMBR = B.DOCNUMBR
ON A.CUSTNMBR = B.CUSTNMBR
GROUP BY A
ORDER BY A.CUSTNMBR
My question is about the GROUP BY clause. Table A has about 12 columns, and grouping by each one individually is tedious. Is there a cleaner way to do this, such as:
GROUP BY A
I am not sure whether a simpler way exists, as I am new to SQL. I have previously investigated GROUPING_ID, but that's about it.
Any help with grouping by a whole table at once would be appreciated.
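Since DOCNUMBR is the primary key of A, you can group by it alone in engines that recognize functional dependence on the primary key (MySQL and PostgreSQL do; SQL Server does not):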
SELECT A.*, sum(B.CURTRXAM) as 'Current Transaction Amt'
FROM A
LEFT JOIN C
ON A.Schedule_Number = C.Schedule_Number
LEFT JOIN B
ON A.DOCNUMBR = B.DOCNUMBR
GROUP BY A.DOCNUMBR
ORDER BY A.CUSTNMBR
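Another way to avoid listing every column of A in the GROUP BY, and one that does not depend on how a particular engine treats grouping by a primary key, is to aggregate B on its own in a derived table and then join the one-row-per-document result to A. The sketch below is an illustration using Python's sqlite3 module; the three-column version of A, the sample rows, and the alias b_tot are assumptions, and the join to C is left out because no column from it is selected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (DOCNUMBR TEXT PRIMARY KEY, CUSTNMBR TEXT, Schedule_Number TEXT);
CREATE TABLE B (DOCNUMBR TEXT, CURTRXAM REAL);
INSERT INTO A VALUES ('D1', 'C2', 'S1'), ('D2', 'C1', 'S2'), ('D3', 'C3', 'S3');
INSERT INTO B VALUES ('D1', 10.0), ('D1', 15.0), ('D2', 7.0);
""")

# B is summed per document first, so the outer query needs no GROUP BY at all
# and A.* can be selected freely.
rows = conn.execute("""
    SELECT A.*, b_tot.current_trx_amt
    FROM A
    LEFT JOIN (SELECT DOCNUMBR, SUM(CURTRXAM) AS current_trx_amt
               FROM B
               GROUP BY DOCNUMBR) AS b_tot
      ON b_tot.DOCNUMBR = A.DOCNUMBR
    ORDER BY A.CUSTNMBR
""").fetchall()

for row in rows:
    print(row)
```

Documents with no rows in B come back with a NULL amount (None in Python) because of the LEFT JOIN.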

SubQuery Join Failed

I am trying to find the records that are missing in the target. I need the employees whose records are missing.
Suppose I have input source as
1,Jack,type1,add1,reg3,..,..,..,
2,Jack,type2,add1,reg3,..,,.,..,
3,Jack,type3,add2,reg4,..,.,..,.,
4,Rock,,,,,,,,
and I have output as
1,Jack,type1,add1,reg3,..,..,..,
4,Rock,,,,,,,,
I have 1000 rows for other employees, and in the target I don't have any duplicate records.
I need the employees who are present in both source and target with a different number of occurrences.
For example, in the sample data above I have 3 entries for Jack and 1 entry for Rock in the source,
and in the target I have only one entry for Jack and one for Rock.
I am running the query below, and the required output is Jack,3.
How can I get it? I am getting an error in the query below:
select A.EMP_NUMBER,A.CNT1
from
(select EMP_NUMBER,count(EMP_NUMBER) as CNT1
from EMPLOYEE_SOURCE
group by EMP_NUMBER ) as A
INNER JOIN
(select B.EMP_NUMBER,B.CNT2
from (select EMP_NUMBER,count(EMP_NUMBER) as CNT2
from EMPLOYEE_TARGET
group by EMP_NUMBER )as B )
ON (A.EMP_NUMBER = B.EMP_NUMBER)
where A.CNT1 != B.CNT2
Please help.
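Why not get the employees that have a different number of rows in the two tables when grouped by name? (I suppose Emp_Number is the field that contains the name, if that is what the query in the question returns.) Note that the join repeats the target row for every matching source row, so the two counts below differ only when an employee has no rows in the target at all: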
SELECT s.Emp_Number, Count(s.Emp_Number)
FROM EMPLOYEE_SOURCE s
LEFT JOIN EMPLOYEE_TARGET t ON s.Emp_Number = t.Emp_Number
GROUP BY s.Emp_Number
HAVING Count(s.Emp_Number) != Count(t.Emp_Number)
It would be really helpful if you specified the exact error you get.
If this is your actual query, there are two things: there's no alias name for the 2nd derived table (btw, you don't need it at all), and at least in Teradata != is not valid; this is SQL and not C.
select A.EMP_NUMBER,A.CNT1
from
(
select EMP_NUMBER,count(EMP_NUMBER) as CNT1
from EMPLOYEE_SOURCE
group by EMP_NUMBER
) as A
INNER JOIN
(
select EMP_NUMBER,count(EMP_NUMBER) as CNT2
from EMPLOYEE_TARGET
group by EMP_NUMBER
) as B
ON (A.EMP_NUMBER = B.EMP_NUMBER)
where A.CNT1 <> B.CNT2
If an employee is missing in the 2nd table, you might have to use an outer join, as Serpiton suggested, and add an additional WHERE condition:
where A.CNT1 <> B.CNT2
or b.CNT2 IS NULL
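Putting the pieces together (per-table counts in derived tables, an outer join, and the extra IS NULL condition) can be sketched as follows. This is an illustration run on SQLite through Python's sqlite3 module, not on Teradata; the two-column tables are a simplification, EMP_NUMBER is assumed to hold the name, and 'Mia' is a hypothetical employee added to show the case where the target has no rows at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE_SOURCE (EMP_NUMBER TEXT, EMP_TYPE TEXT);
CREATE TABLE EMPLOYEE_TARGET (EMP_NUMBER TEXT, EMP_TYPE TEXT);
INSERT INTO EMPLOYEE_SOURCE VALUES
  ('Jack', 'type1'), ('Jack', 'type2'), ('Jack', 'type3'),
  ('Rock', NULL),
  ('Mia',  'type1');
INSERT INTO EMPLOYEE_TARGET VALUES
  ('Jack', 'type1'),
  ('Rock', NULL);
""")

# Count each table separately, then compare the counts per employee.
# The LEFT JOIN keeps employees that are absent from the target (CNT2 is NULL).
mismatches = conn.execute("""
    SELECT a.EMP_NUMBER, a.CNT1
    FROM (SELECT EMP_NUMBER, COUNT(*) AS CNT1
          FROM EMPLOYEE_SOURCE
          GROUP BY EMP_NUMBER) AS a
    LEFT JOIN (SELECT EMP_NUMBER, COUNT(*) AS CNT2
               FROM EMPLOYEE_TARGET
               GROUP BY EMP_NUMBER) AS b
      ON a.EMP_NUMBER = b.EMP_NUMBER
    WHERE a.CNT1 <> b.CNT2
       OR b.CNT2 IS NULL
    ORDER BY a.EMP_NUMBER
""").fetchall()

print(mismatches)
```

Jack is reported with a source count of 3 against a target count of 1, Rock is omitted because the counts match, and Mia is reported only because of the IS NULL condition.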

MySql query runs very slow(actually never gives output) without where clause

I have a MySQL query that works fine when I use a WHERE clause, but when I do not use
a WHERE clause it never produces any output and finally times out.
I used the EXPLAIN command to check the performance of the query, and in both cases EXPLAIN reports the same number of rows used in joining.
I have attached an image of the EXPLAIN output.
Below is the query.
I can't figure out what the problem is here.
Any help is highly appreciated.
Thanks.
SELECT
MCI.CLIENT_ID AS CLIENT_ID, MCI.NAME AS CLIENT_NAME, MCI.PRIMARY_CONTACT AS CLIENT_PRIMARY_CONTACT,
MCI.ADDED_BY AS SP_ID, CONCAT(MUD_SP.FIRST_NAME, ' ', MUD_SP.LAST_NAME) AS SP_NAME,
MCI.FK_PROSPECT_ID AS PROSPECT_ID, MCI.DATE_ADDED AS ADDED_ON,
(SELECT GROUP_CONCAT(LT.TAG_TEXT SEPARATOR ', ')
FROM LK_TAG LT
INNER JOIN M_OBJECT_TAG_MAPPING MOTM
ON LT.PK_ID = MOTM.FK_TAG_ID
WHERE MOTM.FK_OBJECT_ID = MCI.FK_PROSPECT_ID
AND MOTM.OBJECT_TYPE = 1
AND MOTM.IS_ACTIVE = 1
) AS TAGS,
IFNULL(SUM(GET_DIGITS(MMR.RCP_AMOUNT)), 0) AS REVENUE_SO_FAR,
IFNULL(SUM(GET_DIGITS(MMR.RCP_RUPEES)), 0) AS REVENUE_INR,
COUNT(DISTINCT PMI_MONTHLY.PROJECT_ID) AS MONTHLY,
COUNT(DISTINCT PMI_FIXED.PROJECT_ID) AS FIXED,
COUNT(DISTINCT PMI_HOURLY.PROJECT_ID) AS HOURLY,
COUNT(DISTINCT PMI_ANNUAL.PROJECT_ID) AS ANNUAL,
COUNT(DISTINCT PMI_CURRENTLY_RUNNING.PROJECT_ID) AS CURRENTLY_RUNNING_PROJECTS,
COUNT(DISTINCT PMI_YET_TO_START.PROJECT_ID) AS YET_TO_START_PROJECTS,
COUNT(DISTINCT PMI_TECH_SALES_CLOSED.PROJECT_ID) AS TECH_SALES_CLOSED_PROJECTS
FROM
M_CLIENT_INFO MCI
INNER JOIN M_USER_DETAILS MUD_SP
ON MCI.ADDED_BY = MUD_SP.PK_ID
LEFT OUTER JOIN M_MONTH_RECEIPT MMR
ON MMR.CLIENT_ID = MCI.CLIENT_ID
LEFT OUTER JOIN M_PROJECT_INFO PMI_FIXED
ON PMI_FIXED.CLIENT_ID = MCI.CLIENT_ID AND PMI_FIXED.PROJECT_TYPE = 1
LEFT OUTER JOIN M_PROJECT_INFO PMI_MONTHLY
ON PMI_MONTHLY.CLIENT_ID = MCI.CLIENT_ID AND PMI_MONTHLY.PROJECT_TYPE = 2
LEFT OUTER JOIN M_PROJECT_INFO PMI_HOURLY
ON PMI_HOURLY.CLIENT_ID = MCI.CLIENT_ID AND PMI_HOURLY.PROJECT_TYPE = 3
LEFT OUTER JOIN M_PROJECT_INFO PMI_ANNUAL
ON PMI_ANNUAL.CLIENT_ID = MCI.CLIENT_ID AND PMI_ANNUAL.PROJECT_TYPE = 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_CURRENTLY_RUNNING
ON PMI_CURRENTLY_RUNNING.CLIENT_ID = MCI.CLIENT_ID AND PMI_CURRENTLY_RUNNING.STATUS = 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_YET_TO_START
ON PMI_YET_TO_START.CLIENT_ID = MCI.CLIENT_ID AND PMI_YET_TO_START.STATUS < 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_TECH_SALES_CLOSED
ON PMI_TECH_SALES_CLOSED.CLIENT_ID = MCI.CLIENT_ID AND PMI_TECH_SALES_CLOSED.STATUS > 4
WHERE YEAR(MCI.DATE_ADDED) = '2012'
GROUP BY MCI.CLIENT_ID ORDER BY CLIENT_NAME ASC
Yes, as many people have said, the key is that when you have the WHERE clause, the MySQL engine filters the table M_CLIENT_INFO, probably dramatically.
Adding this WHERE clause gives a result similar to removing the WHERE clause:
where 1 = 1
You will see that performance also degrades, because MySQL will try to get all the data.
Remove the where clause and all columns from select and add a count to see how many records you get. If it is reasonable, say up to 10k, then do the following,
put back the select columns related to M_CLIENT_INFO
do not include the nested one "TAGS"
remove all your joins
run your query without where clause and gradually include the joins
this way you'll find out when the timeout is caused.
I would try the following. First, MySQL has a keyword "STRAIGHT_JOIN" which tells the optimizer to do the query in the table order you've specified. Since all your left joins are child-related (like a lookup table), you don't want MySQL to try and interpret one of those as the primary basis of the query.
SELECT STRAIGHT_JOIN ... rest of query.
Next, your M_PROJECT_INFO table. I don't know how many columns of data it has, but you appear to be concentrating on just a few columns in your DISTINCT aggregates. I would make sure you have a covering index on these columns to help the query, i.e. an index on
( Client_ID, Project_Type, Status, Project_ID )
This way the engine can apply the criteria and get the distinct all out of the index instead of having to go back to the raw data pages for the query.
Third, your M_CLIENT_INFO table. Ensure it has an index covering your criteria, GROUP BY, and ORDER BY columns, and change your ORDER BY from the aliased "CLIENT_NAME" to the actual column of the SQL table so it matches the index
( Date_Added, Client_ID, `Name` )
I have "Name" in backticks as it is also a keyword, and the ticks make clear that the column is meant, not the keyword.
Next, the WHERE clause. Whenever you apply a function to an indexed column name, it doesn't work the greatest, especially on date/time fields... You might want to change your where clause to
WHERE MCI.Date_Added between '2012-01-01' and '2012-12-31 23:59:59'
so the BETWEEN range is showing the entire year and the index can better be utilized.
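The difference between filtering on a function of an indexed column and filtering on a plain range can be observed in a query plan. The sketch below is an illustration on SQLite through Python's sqlite3 module, not on MySQL: strftime('%Y', ...) stands in for MySQL's YEAR(), the table is a cut-down assumption of M_CLIENT_INFO, the index name is made up, and the plan wording is specific to SQLite. The range is written half-open, which is equivalent to the BETWEEN above for second-precision values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE m_client_info (client_id INTEGER PRIMARY KEY, name TEXT, date_added TEXT);
CREATE INDEX idx_client_date_added ON m_client_info (date_added);
INSERT INTO m_client_info VALUES
  (1, 'A', '2011-12-31 23:59:59'),
  (2, 'B', '2012-01-01 00:00:00'),
  (3, 'C', '2012-12-31 23:59:59'),
  (4, 'D', '2013-01-01 00:00:00');
""")

# Wrapping the column in a function hides it from the index.
by_function = ("SELECT client_id FROM m_client_info "
               "WHERE strftime('%Y', date_added) = '2012'")
# Comparing the bare column against constants lets the index be searched.
by_range = ("SELECT client_id FROM m_client_info "
            "WHERE date_added >= '2012-01-01' AND date_added < '2013-01-01'")

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

def ids(sql):
    """Return the matching client ids in ascending order."""
    return sorted(row[0] for row in conn.execute(sql))

print(plan(by_function))
print(plan(by_range))
print(ids(by_function), ids(by_range))
```

Both queries return the same clients, but the plan for the function version starts with a full SCAN while the range version starts with an index SEARCH.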
Finally, if the above do not help, I would consider splitting your query up. The GROUP_CONCAT inline select for the TAGS might be a bit of a killer for you. You might want to get all the distinct elements for the grouping per client first, THEN get those details. Something like
select
PQ.*,
group_concat(...) tags
from
( the entire primary part of the query ) as PQ
Left join yourGroupConcatTableBasis on key columns

COUNT evaluate to zero if no matching records

Take the following:
SELECT
Count(a.record_id) AS newrecruits
,a.studyrecord_id
FROM
visits AS a
INNER JOIN
(
SELECT
record_id
, MAX(modtime) AS latest
FROM
visits
GROUP BY
record_id
) AS b
ON (a.record_id = b.record_id) AND (a.modtime = b.latest)
WHERE (((a.visit_type_id)=1))
GROUP BY a.studyrecord_id;
I want to amend the COUNT part to display a zero if there are no records since I assume COUNT will evaluate to Null.
I have tried the following but still get no results:
IIF(ISNULL(COUNT(a.record_id)),0,COUNT(a.record_id)) AS newrecruits
Is this an issue because the join is on record_id? I tried changing the INNER to LEFT but also received no results.
Q
How do I get the above to evaluate to zero if there are no records matching the criteria?
Edit:
To give a little detail to the reasoning.
The studies table contains a field called 'original_recruits', based on activity before the database was in use.
The visits table tracks new_recruits (a count of records for each study).
I combine these in another query (original_recruits + new_recruits). If there have been no new recruits, I still need to display the original_recruits, so if there are no records I need it to evaluate to zero instead of null so that the final sum still works.
It seems like you want to count records per study record.
If you need a count of zero when there are no records, you need to join to a table of study records, here assumed to be named StudyRecords.
Do you have one? Otherwise it makes no sense to ask for rows where no rows exist!
Let's suppose that table exists (called studyrecord below). The query should then look something like this, with the study table on the preserved side of the outer join so that studies with no matching visits still appear; Count(v.record_id) skips the nulls and returns zero for them:
SELECT
sr.Id
, Count(v.record_id) AS newrecruits
FROM
studyrecord AS sr
LEFT OUTER JOIN
(
SELECT
a.record_id
, a.studyrecord_id
FROM
visits AS a
INNER JOIN
(
SELECT
record_id
, MAX(modtime) AS latest
FROM
visits
GROUP BY
record_id
) AS b
ON (a.record_id = b.record_id) AND (a.modtime = b.latest)
WHERE a.visit_type_id = 1
) AS v
ON sr.Id = v.studyrecord_id
GROUP BY sr.Id
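The zero-count behaviour can be checked on a small data set. The sketch below is an illustration on SQLite through Python's sqlite3 module rather than the asker's database; the studyrecord table, its columns, and the sample rows are assumptions. The study table drives the query, the filtered "latest visit" rows are outer-joined to it, and COUNT over the joined column yields 0 (not NULL) for a study with no matching visits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE studyrecord (id INTEGER PRIMARY KEY, original_recruits INTEGER);
CREATE TABLE visits (record_id INTEGER, studyrecord_id INTEGER,
                     visit_type_id INTEGER, modtime TEXT);
INSERT INTO studyrecord VALUES (1, 5), (2, 8);
INSERT INTO visits VALUES
  (100, 1, 1, '2020-01-01'),
  (100, 1, 1, '2020-01-02'),
  (101, 1, 1, '2020-01-03'),
  (102, 2, 2, '2020-01-04');
""")

# Every study appears once; visits that pass the filter are counted per study.
rows = conn.execute("""
    SELECT sr.id, COUNT(v.record_id) AS newrecruits
    FROM studyrecord AS sr
    LEFT JOIN (
        SELECT a.record_id, a.studyrecord_id
        FROM visits AS a
        INNER JOIN (SELECT record_id, MAX(modtime) AS latest
                    FROM visits
                    GROUP BY record_id) AS b
          ON a.record_id = b.record_id AND a.modtime = b.latest
        WHERE a.visit_type_id = 1
    ) AS v
      ON v.studyrecord_id = sr.id
    GROUP BY sr.id
    ORDER BY sr.id
""").fetchall()

print(rows)
```

Study 1 has two records whose latest visit is of type 1, and study 2 has only a type 2 visit, so it is reported with a count of zero instead of disappearing from the result.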
I solved the problem by amending the final query where I display the result of combining the original and new recruits to include the IIF there.
SELECT
a.*
, IIF(IsNull([totalrecruits]),consents,totalrecruits)/a.target AS prog
, IIf(IsNull([totalrecruits]),consents,totalrecruits) AS trecruits
FROM
q_latest_studies AS a
LEFT JOIN q_totalrecruitment AS b
ON a.studyrecord_id=b.studyrecord_id
;