MySQL inner join for three to four tables

I need to figure out what I am doing wrong. I wrote this inner join for MySQL. It works, but it returns repeated values, such as the same row or the same categoryid twice. Each of the tables (users, paymentnotification, monthlyreturns) has a categoryid column. I use it to look up and display the username (users.pname) from the users table, then to find and display those who have made a payment, using the monthlyreturns and paymentnotification tables.
$r="SELECT monthlyreturns.categoryid, monthlyreturns.month, monthlyreturns.quarter, monthlyreturns.year,paymentnotification.amount, users.pname, monthlyreturns.ototal, paymentnotification.payee, status
FROM paymentnotification
INNER JOIN (monthlyreturns INNER JOIN users ON monthlyreturns.categoryid=users.categoryid)
ON monthlyreturns.categoryid=paymentnotification.categoryid
ORDER BY monthlyreturns.categoryid DESC";

I think the query you want is more like this:
SELECT b.categoryid, b.month, b.quarter, b.year, a.amount, c.pname, b.ototal, a.payee, status
FROM paymentnotification a
INNER JOIN monthlyreturns b
ON a.categoryid = b.categoryid
INNER JOIN users c
ON b.categoryid = c.categoryid
ORDER BY b.categoryid DESC
The way you are doing the correlations doesn't seem clear and may cause problems. Try this one out and see what happens. If it's still producing duplicates, perhaps the nature of the data requires further filtering.

Assuming I understand what you're trying to do, you are not joining your tables properly. Try joining one at a time
SELECT DISTINCT monthlyreturns.categoryid, monthlyreturns.month, monthlyreturns.quarter, monthlyreturns.year, paymentnotification.amount, users.pname, monthlyreturns.ototal, paymentnotification.payee, status
FROM paymentnotification
INNER JOIN monthlyreturns
ON paymentnotification.categoryid = monthlyreturns.categoryid
INNER JOIN users
ON monthlyreturns.categoryid = users.categoryid
ORDER BY monthlyreturns.categoryid DESC

I don't see any problem. I get 4 result rows; check this fiddle: http://sqlfiddle.com/#!2/165a22/5
This is the query I used:
SELECT m.categoryid, m.month, m.quarter, m.year,p.amount, u.pname, m.ototal, p.payee, m.status
FROM paymentnotification p JOIN monthlyreturns m ON p.categoryid = m.categoryid
JOIN users u ON u.categoryid = m.categoryid
ORDER BY m.categoryid DESC
There are no duplicated rows; every row is unique once you consider all the columns you selected.
Hope it helps
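To illustrate why rows can look repeated even though none of them is a true duplicate, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The tables are cut down to a few columns and the data is invented for illustration: one user, two monthly returns and two payments, all sharing categoryid 1. Joining on categoryid alone pairs every return with every payment.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (categoryid INTEGER, pname TEXT);
CREATE TABLE monthlyreturns (categoryid INTEGER, month TEXT, ototal INTEGER);
CREATE TABLE paymentnotification (categoryid INTEGER, amount INTEGER, payee TEXT);

-- Hypothetical data: one user, two returns, two payments, same categoryid.
INSERT INTO users VALUES (1, 'alice');
INSERT INTO monthlyreturns VALUES (1, 'Jan', 100), (1, 'Feb', 120);
INSERT INTO paymentnotification VALUES (1, 50, 'bank'), (1, 70, 'cash');
""")

rows = conn.execute("""
SELECT m.categoryid, m.month, p.amount, u.pname
FROM paymentnotification p
JOIN monthlyreturns m ON p.categoryid = m.categoryid
JOIN users u ON u.categoryid = m.categoryid
ORDER BY m.month, p.amount
""").fetchall()

# 2 returns x 2 payments x 1 user = 4 rows; categoryid and pname repeat,
# but each (month, amount) combination appears exactly once.
for row in rows:
    print(row)
```

If only one row per categoryid is wanted, the join needs an extra condition (for example matching month or year as well), or one side has to be aggregated first. Which of these applies depends on the real data.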

SELECT M.categoryid, M.month, M.quarter, M.year, M.ototal,
P.amount, P.payee, P.status,
U.pname
FROM paymentnotification AS P
INNER JOIN monthlyreturns AS M ON P.categoryid = M.categoryid
INNER JOIN users AS U ON M.categoryid = U.categoryid
ORDER BY M.categoryid DESC

Related

SQL - order by is breaking my query when there is no reviews

I have the rather lengthy SQL query included below. As you can see, it orders by AvgRating and NumReviews, both of which rely on data from the reviews table. Unfortunately I need to see the rows in my results even when there are no reviews; currently, if there are no reviews to order by, that row just doesn't show up in the results. All help greatly appreciated.
SELECT travisor_tradesperson.name, travisor_tradesperson.id, travisor_catagory.catname,
travisor_company.cname, travisor_company.description, travisor_company.city, travisor_company.address, travisor_company.postcode, travisor_company.phone,
ROUND(AVG(travisor_review.rating)) as RoundAvgRating, AVG(travisor_review.rating) as AvgRating, COUNT(travisor_review.rating) as NumReviews
FROM `travisor_tradesperson`
INNER JOIN travisor_company
ON travisor_tradesperson.company = travisor_company.id
INNER JOIN travisor_catagory
ON travisor_tradesperson.catagory = travisor_catagory.id
INNER JOIN travisor_review
ON travisor_review.tradesperson = travisor_tradesperson.id
WHERE travisor_catagory.catname = '$catagory'
AND travisor_company.city = '$city'
GROUP BY travisor_tradesperson.name, travisor_catagory.catname, travisor_company.cname,
travisor_company.description
ORDER BY AvgRating DESC, NumReviews DESC
Use a LEFT JOIN on travisor_review instead of an INNER JOIN. An inner join only returns records that are present in both tables, so a tradesperson with no reviews drops out of the result set.
A left join returns NULL for the right-hand columns when the join predicate finds no match. In this case the tradesperson is still returned, with NULL for the rating. Convert the NULL to 0 if needed and that should fix your AVG.
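A small sketch of the difference, run against SQLite through Python's sqlite3 module rather than MySQL. The tables tradesperson and review are simplified, hypothetical versions of the ones in the question, and COALESCE is one possible way of turning the NULL average into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tradesperson (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE review (tradesperson INTEGER, rating INTEGER);

-- Hypothetical data: Ann has two reviews, Bob has none.
INSERT INTO tradesperson VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO review VALUES (1, 4), (1, 5);
""")

# The same query is run twice; only the join type changes.
query = """
SELECT t.name,
       COALESCE(AVG(r.rating), 0) AS AvgRating,  -- NULL average becomes 0
       COUNT(r.rating) AS NumReviews             -- COUNT(column) skips NULLs
FROM tradesperson t
{join} JOIN review r ON r.tradesperson = t.id
GROUP BY t.id, t.name
ORDER BY AvgRating DESC, NumReviews DESC
"""

inner_rows = conn.execute(query.format(join="INNER")).fetchall()
left_rows = conn.execute(query.format(join="LEFT")).fetchall()

print(inner_rows)  # [('Ann', 4.5, 2)]
print(left_rows)   # [('Ann', 4.5, 2), ('Bob', 0, 0)]
```

Counting r.rating rather than * matters here: COUNT(*) would report 1 for Bob, because the left join still produces one row for him.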

What Would be the Correct SELECT Statement for This?

SELECT *
FROM notifications
INNER JOIN COMMENT
ON COMMENT.id = notifications.source_id
WHERE idblog IN (SELECT blogs_id
FROM blogs
WHERE STATUS = "active")
INNER JOIN reportmsg
ON reportmsg.msgid = notifications.source_id
WHERE uid =: uid
ORDER BY notificationid DESC
LIMIT 20;
Here I am INNER JOINing notifications with comment and reportmsg; then filtering content with WHERE.
But my problem is that for the first INNER JOIN [i.e, with comment], before joining notifications with comment, I want to match notifications.idblog with blogs.blogs_id and SELECT only those rows where blogs.status = "active".
For better understanding of the code above:
Here, for INNER JOIN, with comment I want to SELECT only those rows in notifications whose idblog matches blogs.blogs_id and has status = "active".
The second INNER JOIN with reportmsg needs not to be altered. I.e, it only filters through uid.
As you can see from the image below, you just need to join the other tables to the notifications table using LEFT JOIN, like this:
SELECT n.notificationid, n.uid, n.idblog, n.source_id,
b.blogs_id, b.status,
c.id,
r.msgid
-- ... and the other columns you want
FROM notifications n
LEFT JOIN blogs b ON b.blogs_id = n.idblog AND b.STATUS = "active" AND n.uid = :uid
LEFT JOIN comment c ON c.id = n.source_id
LEFT JOIN reportmsg r ON r.msgid = n.source_id
ORDER BY n.notificationid DESC
LIMIT 20;
There's no need/reason to filter before the second join because you only use inner joins and then the order of joins and WHERE-conditions don't matter:
SELECT n.*, c.*, r.*
FROM notifications AS n
JOIN COMMENT AS c
ON n.source_id = c.id
LEFT JOIN blogs AS b
ON n.idblog = b.blogs_id
AND b.STATUS = 'active'
JOIN reportmsg AS r
ON n.source_id = r.msgid
WHERE uid = :uid
ORDER BY notificationid DESC
LIMIT 20
You can switch the order of the joins, or move b.STATUS = 'active' into the join condition, and all of these queries return the same result. (After the edit it's a LEFT JOIN, so of course the result now differs.)
And of course you shouldn't use *, better list only the columns you actually need.
If the query optimizer does its work, it does not matter where you put a filter condition with an INNER JOIN, but with a LEFT JOIN it does. A condition in the LEFT JOIN's ON clause filters the joined table before the join, so unmatched rows from the left table are kept with NULLs; a condition in the WHERE clause filters the result of the join. Hence, if you want to use LEFT JOIN, your query must look like:
SELECT nt.*
FROM notifications nt
LEFT JOIN blogs bg ON nt.idblog = bg.blogs_id AND bg.STATUS = "active"
LEFT JOIN COMMENT cm ON cm.id = nt.source_id
LEFT JOIN reportmsg rm ON rm.msgid = nt.source_id
WHERE uid = :uid
ORDER BY nt.notificationid DESC
LIMIT 20;
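The ON-versus-WHERE distinction for LEFT JOIN described above can be checked with a small sketch. It uses Python's sqlite3 module in place of MySQL, and the two tables are reduced to the columns needed, with invented rows: one notification pointing at an active blog and one pointing at an inactive blog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE notifications (notificationid INTEGER PRIMARY KEY, idblog INTEGER);
CREATE TABLE blogs (blogs_id INTEGER PRIMARY KEY, status TEXT);

-- Hypothetical data: blog 10 is active, blog 20 is not.
INSERT INTO blogs VALUES (10, 'active'), (20, 'inactive');
INSERT INTO notifications VALUES (1, 10), (2, 20);
""")

# Filter inside ON: every notification survives; the inactive blog is
# simply not matched, so its columns come back as NULL.
on_rows = conn.execute("""
SELECT n.notificationid, b.status
FROM notifications n
LEFT JOIN blogs b ON b.blogs_id = n.idblog AND b.status = 'active'
ORDER BY n.notificationid
""").fetchall()

# Filter inside WHERE: applied after the join, so the notification whose
# blog is inactive is removed and the join behaves like an INNER JOIN.
where_rows = conn.execute("""
SELECT n.notificationid, b.status
FROM notifications n
LEFT JOIN blogs b ON b.blogs_id = n.idblog
WHERE b.status = 'active'
ORDER BY n.notificationid
""").fetchall()

print(on_rows)     # [(1, 'active'), (2, None)]
print(where_rows)  # [(1, 'active')]
```

So if the goal is to drop notifications for inactive blogs, the condition belongs in WHERE (or in an INNER JOIN); if the goal is to keep every notification and only attach blog data when the blog is active, it belongs in ON.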
It's very unclear what you are after here. While your table diagram is useful, you should really supply some sample data and an expected result, even if it is just a couple of dummy rows for each table.
Queries work row by row, both INNER JOINs are applied to the same notification row and non-matching rows are discarded.
Any filter applies to both JOIN and any returned rows must have a match in BOTH comment and reportmsg.
Perhaps you want two LEFT JOINs that can apply different filters and guessing from the table names perhaps it could look like this:
SELECT *
FROM notifications n
LEFT JOIN blogs b
ON n.idblog = b.blogs_id
LEFT JOIN comment c
ON c.id = n.source_id
AND b.status = "Active"
LEFT JOIN reportmsg rm
ON rm.msgid = n.source_id
WHERE n.uid = :uid
AND (c.id IS NOT NULL OR rm.msgid IS NOT NULL)
ORDER BY n.notificationid DESC
LIMIT 20
You also should work on your naming convention:
notifications, comment -> pick either plural or singular table names
notifications.notificationid, comment.id -> pick adding table name to id
notificationid, source_id -> pick underscore or no separation
idblog, notificationid -> pick prepending or appending id
Currently you pretty much have to look up every id field every time you want to use one.
You should change your query to this:
SELECT *
FROM notifications
INNER JOIN comment ON comment.id = notifications.source_id
INNER JOIN reportmsg ON reportmsg.msgid=notifications.source_id
LEFT JOIN blogs ON notifications.idblog = blogs.blogs_id
WHERE blogs.status = 'active'
ORDER BY notificationid DESC
LIMIT 20;

Summing multiple columns - unexpected results

Why does this give incorrect results?
SELECT
people.name,
SUM(allorders.TOTAL),
SUM(allorders.DISCOUNT),
SUM(allorders.SERVICECHARGE),
SUM(payments.AMOUNT)
FROM
people
INNER JOIN
allorders ON allorders.CUSTOMER = people.ID
INNER JOIN
payments ON payments.CUSTOMER = people.ID
WHERE
people.ID = 7 AND allorders.VOIDED = 0 AND payments.VOIDED = 0
Gives: (the name), 1644000, 1100000, 50000, 1485000
If I do it two tables at a time (INNER JOIN people ON allorders.CUSTOMER = people.ID) in separate queries, I get the correct results. I don't even know where the numbers I get come from. Like:
SELECT
people.name,
SUM(allorders.TOTAL),
SUM(allorders.DISCOUNT),
SUM(allorders.SERVICECHARGE)
FROM
people
INNER JOIN
allorders ON allorders.CUSTOMER = people.ID
WHERE people.ID = 7 AND allorders.VOIDED = 0
Gives: (the name), 822000, 550000, 25000
SELECT
people.name,
SUM(payments.AMOUNT)
FROM
people
INNER JOIN payments ON payments.CUSTOMER = people.ID
WHERE people.ID = 7 AND payments.VOIDED = 0
Gives: (the name), 297000
It looks like it doubles, but I don't know why.
The odd thing is I have a similar query that does this sum correctly. I'll post it, but it's a bit complex. Here goes:
SELECT
t1.IDENTIFIER,
ifnull(t1.NAME,""),
t1.PRICE,
t1.GUESTS,
t1.STATUS,
ifnull(t1.NOTE,""),
t1.LINK,
ifnull(t1.EDITOR,""),
concat(t2.FIRSTNAME,"",t2.LASTNAME),
t2.ID,
t3.ID,
ifnull(t1.EMAIL,""),
ifnull(t3.PHONE,""),
ifnull(SUM(p1.AMOUNT),0),
ifnull(SUM(o1.DISCOUNT),0),
ifnull(SUM(o1.TOTAL),0),
ifnull(SUM(o1.SERVICECHARGE),0)
FROM
tables t1
INNER JOIN
people t2 ON t1.SELLER = t2.ID
INNER JOIN
people t3 ON t1.CUSTOMER = t3.ID
INNER JOIN
orderpaymentinfo ON orderpaymentinfo.TABLEID = t1.IDENTIFIER
INNER JOIN
payments p1 ON orderpaymentinfo.PAYMENTID = p1.PAYMENTID
INNER JOIN
allorders o1 ON o1.ORDERID = orderpaymentinfo.ORDERID
WHERE
p1.VOIDED = 0 AND o1.VOIDED = 0 AND t1.DATE = "2014-12-20"
GROUP BY t1.IDENTIFIER
The latter statement does the same join, only it uses an additional helper-table. I'm sorry it's a bit poorly formatted (I'm not great with SO's formatter), but if someone can tell me the difference between the logic in these two statements and how one can be completely wrong while the other right, I'd be very happy.
In response to answer:
Result 1:
Name - 5
Result 2:
Name - 2
Result 3:
Name - 10
Result 4 is truncated in phpMyAdmin - where would I get this easily?
Table structure for the three tables looks like:
SHOW create on the way.
Okay, so I am pretty sure you've a join condition that's basically exploding your result set into something like a Cartesian product. Here's what I think you should try
First, run the following and share the output:
SELECT p.name,COUNT(*)
FROM people as p
INNER JOIN allorders AS a
ON a.CUSTOMER = p.ID
WHERE p.ID = 7 AND a.VOIDED = 0
GROUP BY p.name
Then run
SELECT p.name,COUNT(*)
FROM people AS p
INNER JOIN payments AS pay
ON pay.CUSTOMER = p.ID
WHERE p.ID = 7 AND pay.VOIDED = 0
GROUP BY p.name
Then run
SELECT
p.name,
COUNT(*)
FROM
people as p
INNER JOIN
allorders as a ON a.CUSTOMER = p.ID
INNER JOIN
payments as pay ON pay.CUSTOMER = p.ID
WHERE
p.ID = 7 AND a.VOIDED = 0 AND pay.VOIDED = 0
GROUP BY p.name
Last run the following
SHOW CREATE TABLE people;
SHOW CREATE TABLE payments;
SHOW CREATE TABLE allorders;
The problem is that you don't have the correct understanding of your data. You need to give us a bit more info about the data and the relationships, and the output I've described here should help. Mine is not an answer. But if you run these queries and paste the output of them, you should be able to get an answer, either from me or someone else.
Based on the discussion and edits above, please try:
SELECT
p.name,
SUM(o.TOTAL),
SUM(o.DISCOUNT),
SUM(o.SERVICECHARGE),
MAX(pay.amt)
FROM
people as p
INNER JOIN
allorders AS o ON o.CUSTOMER = p.ID
INNER JOIN (SELECT customer,
SUM(amount) as amt
FROM payments
WHERE voided = 0 AND customer = 7
GROUP BY customer) AS pay
ON pay.customer = p.id
WHERE
p.ID = 7 AND o.VOIDED = 0
GROUP BY p.name
You could also do a subquery in your SELECT statement, but it's pretty obnoxious imo. You could also do min(pay.amt) or avg or even just leave the aggregate out altogether. The above should work... even though there are cleaner ways. I'm providing this answer so you can reason about why you were getting the unexpected result... actually optimizing your query is a different question that you can dive into later, once you've had a chance to look over this
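The row counts posted in the question (5 orders, 2 payments, 10 joined rows) explain the numbers: each order row is repeated once per payment, so the order sums are doubled (822000 x 2 = 1644000), and each payment row is repeated once per order, so the payment sum is multiplied by five (297000 x 5 = 1485000). The sketch below reproduces that fan-out with Python's sqlite3 module standing in for MySQL. The amounts are invented, and aggregating both child tables in subqueries is a variation on the answer above, not the exact query it gives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE allorders (CUSTOMER INTEGER, TOTAL INTEGER, VOIDED INTEGER);
CREATE TABLE payments (CUSTOMER INTEGER, AMOUNT INTEGER, VOIDED INTEGER);

-- Hypothetical data: 5 orders (sum 1500) and 2 payments (sum 500).
INSERT INTO people VALUES (7, 'Name');
INSERT INTO allorders VALUES (7,100,0),(7,200,0),(7,300,0),(7,400,0),(7,500,0);
INSERT INTO payments VALUES (7,300,0),(7,200,0);
""")

# Joining both child tables directly pairs every order with every payment.
naive = conn.execute("""
SELECT COUNT(*), SUM(o.TOTAL), SUM(pay.AMOUNT)
FROM people p
JOIN allorders o ON o.CUSTOMER = p.ID
JOIN payments pay ON pay.CUSTOMER = p.ID
WHERE p.ID = 7 AND o.VOIDED = 0 AND pay.VOIDED = 0
""").fetchone()

# Aggregating each child table first leaves one row per customer on each
# side, so the join can no longer multiply rows.
fixed = conn.execute("""
SELECT p.name, o.total, pay.amt
FROM people p
JOIN (SELECT CUSTOMER, SUM(TOTAL) AS total
      FROM allorders WHERE VOIDED = 0 GROUP BY CUSTOMER) o
  ON o.CUSTOMER = p.ID
JOIN (SELECT CUSTOMER, SUM(AMOUNT) AS amt
      FROM payments WHERE VOIDED = 0 GROUP BY CUSTOMER) pay
  ON pay.CUSTOMER = p.ID
WHERE p.ID = 7
""").fetchone()

print(naive)  # (10, 3000, 2500): 1500 x 2 payments, 500 x 5 orders
print(fixed)  # ('Name', 1500, 500)
```

The larger query in the question presumably avoids the problem because orderpaymentinfo links specific payments to specific orders instead of pairing everything that shares a customer, though that depends on how that table is populated.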

SQL: Querying several tables, only most recent result from one, and all results from another

I have a number of tables and I am trying to pull out data from many of them in one query. I've worked out how to successfully use LEFT JOIN to pull in related data from other tables, with a common ID.
With this query I'm trying to get data for each contact in our ASSOCIATES_BACKGROUND_DATA table.
Firstly, I am trying to find their most recent meeting in ASSOCIATES_1_TO_1S.DATE_OF_MEETING (this is stored as a timestamp, so I'm looking for the largest timestamp only). I am not getting the correct result; am I using the MAX function correctly?
Secondly, I want to get all records in the ASSOCIATES_ACTION_STEPS for that ASSOCIATE_ID. The right join doesn't seem to be pulling this in either?
Can anyone help? Feel like I'm so close to getting the result for this, but this is really bugging me!
SELECT *,
ASSOCIATES_BACKGROUND_DATA.ASSOCIATE_ID as ABG_ASSOCIATE_ID,
ASSOCIATES_BACKGROUND_DATA.NAME_KNOWN_AS as ABG_NAME_KNOWN_AS,
ASSOCIATES_BACKGROUND_DATA.LAST_NAME as ABG_LAST_NAME,
USERS.NAME_KNOWN_AS as USERS_NAME_KNOWN_AS,
USERS.LAST_NAME as USERS_LAST_NAME,
MAX(ASSOCIATES_1_TO_1S.DATE_OF_MEETING)
FROM ASSOCIATES_BACKGROUND_DATA
LEFT JOIN ASSOCIATES_1_TO_1S
ON ASSOCIATES_BACKGROUND_DATA.ASSOCIATE_ID = ASSOCIATES_1_TO_1S.ASSOCIATE_ID
LEFT JOIN LIST_OF_UNIVERSITIES
ON ASSOCIATES_BACKGROUND_DATA.UNIVERSITY = LIST_OF_UNIVERSITIES.ID
LEFT JOIN USERS
ON ASSOCIATES_BACKGROUND_DATA.PROGRAMME_COORDINATOR = USERS.ID
RIGHT JOIN ASSOCIATES_ACTION_STEPS
ON ASSOCIATES_BACKGROUND_DATA.ASSOCIATE_ID = ASSOCIATES_ACTION_STEPS.ASSOCIATE_ID
GROUP BY ASSOCIATES_1_TO_1S.ASSOCIATE_ID
ORDER BY `ABG_NAME_KNOWN_AS` ASC
Many thanks in advance for any help or pointers that you can give...
I checked your query. The first step for me was to reformat it to make it more readable (very important if you spend time on a query, you don't need any kind of overhead).
After that, I reviewed the logic and came to 2 conclusions:
you don't need a right join
your grouping algorithm was not quite correct.
Please have a look at this query and tell me if there's something missing:
SELECT
abg.ASSOCIATE_ID as ABG_ASSOCIATE_ID,
abg.NAME_KNOWN_AS as ABG_NAME_KNOWN_AS,
abg.LAST_NAME as ABG_LAST_NAME,
u.NAME_KNOWN_AS as USERS_NAME_KNOWN_AS,
u.LAST_NAME as USERS_LAST_NAME,
t.lastmeeting
FROM ASSOCIATES_BACKGROUND_DATA abg
LEFT JOIN (
SELECT
abg.ASSOCIATE_ID as id,
MAX(a1s.DATE_OF_MEETING) AS lastmeeting
FROM ASSOCIATES_BACKGROUND_DATA abg
LEFT JOIN ASSOCIATES_1_TO_1S a1s ON abg.ASSOCIATE_ID = a1s.ASSOCIATE_ID
GROUP BY abg.ASSOCIATE_ID
) t ON t.id = abg.ASSOCIATE_ID
LEFT JOIN ASSOCIATES_ACTION_STEPS aas ON abg.ASSOCIATE_ID = aas.ASSOCIATE_ID
LEFT JOIN USERS u ON abg.PROGRAMME_COORDINATOR = u.ID
ORDER BY abg.NAME_KNOWN_AS
I try to not use the permissiveness of mysql in regards to its group by function (mysql allows you to get away with a group by instruction not including all the selected fields), and stick to standards. But then it is necessary to make a subquery (see the t alias) precalculating the MAX(DATE) for each associate, to later join it back with the main recordset.
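The "precalculate the MAX in a subquery, then join it back" pattern can be sketched in isolation. The example runs on SQLite via Python's sqlite3 module rather than MySQL, and the tables associates and meetings are shortened, hypothetical versions of ASSOCIATES_BACKGROUND_DATA and ASSOCIATES_1_TO_1S with integer timestamps made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE associates (ASSOCIATE_ID INTEGER PRIMARY KEY, NAME_KNOWN_AS TEXT);
CREATE TABLE meetings (ASSOCIATE_ID INTEGER, DATE_OF_MEETING INTEGER);

-- Hypothetical data: Ann has two meetings, Bob has none.
INSERT INTO associates VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO meetings VALUES (1, 1000), (1, 2000);
""")

rows = conn.execute("""
SELECT a.ASSOCIATE_ID, a.NAME_KNOWN_AS, t.lastmeeting
FROM associates a
LEFT JOIN (
    -- One row per associate, holding only the latest meeting timestamp.
    SELECT ASSOCIATE_ID, MAX(DATE_OF_MEETING) AS lastmeeting
    FROM meetings
    GROUP BY ASSOCIATE_ID
) t ON t.ASSOCIATE_ID = a.ASSOCIATE_ID
ORDER BY a.NAME_KNOWN_AS
""").fetchall()

print(rows)  # [(1, 'Ann', 2000), (2, 'Bob', None)]
```

Because the subquery yields at most one row per associate, the outer query needs no GROUP BY, and associates without any meeting still appear with a NULL lastmeeting.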
Cheers.
PS: you may add again the join with the universities, I skipped it to isolate the problem only.
PPS: to reflect your comment, I built this query:
SELECT
abg.ASSOCIATE_ID as ABG_ASSOCIATE_ID,
abg.NAME_KNOWN_AS as ABG_NAME_KNOWN_AS,
abg.LAST_NAME as ABG_LAST_NAME,
u.NAME_KNOWN_AS as USERS_NAME_KNOWN_AS,
u.LAST_NAME as USERS_LAST_NAME,
t.lastmeeting,
t2.av AS averagetimedue
FROM ASSOCIATES_BACKGROUND_DATA abg
LEFT JOIN (
SELECT
abg.ASSOCIATE_ID as id,
MAX(a1s.DATE_OF_MEETING) AS lastmeeting
FROM ASSOCIATES_BACKGROUND_DATA abg
LEFT JOIN ASSOCIATES_1_TO_1S a1s ON abg.ASSOCIATE_ID = a1s.ASSOCIATE_ID
GROUP BY abg.ASSOCIATE_ID
) t ON t.id = abg.ASSOCIATE_ID
LEFT JOIN (
SELECT
(AVG(DATE_DUE)+AVG(DATE_COMPLETED))/2 AS av,
ASSOCIATE_ID
FROM ASSOCIATES_ACTION_STEPS
GROUP BY ASSOCIATE_ID
) t2 ON abg.ASSOCIATE_ID = t2.ASSOCIATE_ID
LEFT JOIN ASSOCIATES_ACTION_STEPS aas ON abg.ASSOCIATE_ID = aas.ASSOCIATE_ID
LEFT JOIN USERS u ON abg.PROGRAMME_COORDINATOR = u.ID
ORDER BY abg.NAME_KNOWN_AS
I'm not sure of your definition of average time between DATE_DUE and DATE_COMPLETED though.

Fetching records with a long SQL query with multiple joins

I will try to explain things as much as I can.
I have following query to fetch records from different tables.
SELECT
p.p_name,
p.id,
cat.cat_name,
p.property_type,
p.p_type,
p.address,
c.client_name,
p.price,
GROUP_CONCAT(pr.price) AS c_price,
pd.land_area,
pd.land_area_rp,
p.tagline,
p.map_location,
r.id,
p.status,
co.country_name,
p.`show`,
u.name,
p.created_date,
p.updated_dt,
o.type_id,
p.furnished,
p.expiry_date
FROM
property p
LEFT OUTER JOIN region AS r
ON p.district_id = r.id
LEFT OUTER JOIN country AS co
ON p.country_id = co.country_id
LEFT OUTER JOIN property_category AS cat
ON p.cat_id = cat.id
LEFT OUTER JOIN property_area_details AS pd
ON p.id = pd.property_id
LEFT OUTER JOIN sc_clients AS c
ON p.client_id = c.client_id
LEFT OUTER JOIN admin AS u
ON p.adminid = u.id
LEFT OUTER JOIN sc_property_orientation_type AS o
ON p.orientation_type = o.type_id
LEFT OUTER JOIN property_amenities_details AS pad
ON p.id = pad.property_id
LEFT OUTER JOIN sc_commercial_property_price AS pr
ON p.id = pr.property_id
WHERE p.id > 0
AND (
p.created_date > DATE_SUB(NOW(), INTERVAL 1 YEAR)
OR p.updated_dt > DATE_SUB(NOW(), INTERVAL 1 YEAR)
)
AND p.p_type = 'sale'
Everything works fine if I exclude GROUP_CONCAT(pr.price) AS c_price from the above query, but when I include it, the query returns just one row. My intention in using GROUP_CONCAT is to fetch the comma-separated prices from the table sc_commercial_property_price that match the property id, in this case p.id. If records for the property exist in sc_commercial_property_price, fetch them in comma-separated form along with the other columns. If not, it should return blank. What am I doing wrong here?
I will try to explain again if my problem is not clear. Thanks in advance
GROUP_CONCAT is an aggregation function. When you include it, you are telling SQL that there is an aggregation. Without a GROUP BY, only one row is returned, as in:
select count(*)
from table
The query that you have is accepted by MySQL (when the ONLY_FULL_GROUP_BY SQL mode is off) but not by most other databases. The query does not automatically group by the columns with no functions. Instead, it returns an arbitrary value for each of them. You could imagine a function ANY, so your query is:
select any(p.p_name) as p_name, any(p.tagline) as tagline, . . .
To fix this, put all your current variables in a group by clause:
GROUP BY
p.p_name,
p.id,
cat.cat_name,
p.property_type,
p.p_type,
p.address,
c.client_name,
p.price,
pd.land_area,
pd.land_area_rp,
p.tagline,
p.map_location,
r.id,
p.status,
co.country_name,
p.`show`,
u.name,
p.created_date,
p.updated_dt,
o.type_id,
p.furnished,
p.expiry_date
Most people who write SQL think it is good form to include all the group by variables in the group by clause, even though MySQL does not necessarily require this.
Add a GROUP BY clause enumerating whatever you intend to have separate rows for. What happens now is that it picks some value for each result column and group_concats every pr.price.
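The one-row behaviour and the effect of adding GROUP BY can be seen in a short sketch. It uses Python's sqlite3 module, since SQLite, like MySQL in its permissive mode, accepts an aggregate next to ungrouped columns. The tables property and price are trimmed-down, hypothetical versions of property and sc_commercial_property_price.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE property (id INTEGER PRIMARY KEY, p_name TEXT);
CREATE TABLE price (property_id INTEGER, price INTEGER);

-- Hypothetical data: property 1 has two prices, 2 has one, 3 has none.
INSERT INTO property VALUES (1, 'Shop'), (2, 'Office'), (3, 'Plot');
INSERT INTO price VALUES (1, 100), (1, 150), (2, 200);
""")

# No GROUP BY: the whole result is one group, so a single row comes back
# containing every price, next to arbitrarily chosen id and p_name values.
no_group = conn.execute("""
SELECT p.id, p.p_name, GROUP_CONCAT(pr.price) AS c_price
FROM property p
LEFT JOIN price pr ON pr.property_id = p.id
""").fetchall()

# GROUP BY on the property columns: one row per property, each carrying
# only its own prices (NULL when the property has none).
grouped = conn.execute("""
SELECT p.id, p.p_name, GROUP_CONCAT(pr.price) AS c_price
FROM property p
LEFT JOIN price pr ON pr.property_id = p.id
GROUP BY p.id, p.p_name
ORDER BY p.id
""").fetchall()

print(len(no_group))  # 1
for row in grouped:
    print(row)
```

A property without prices gets NULL rather than an empty string; wrapping the aggregate in COALESCE(GROUP_CONCAT(pr.price), '') would return blank instead.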