mysql inner join giving bad results (?) - mysql

The following SQL query works fine and returns the correct total retail for customers:
SELECT customer.id,
customer.first_name,
customer.last_name,
SUM(sales_line_item_detail.retail) AS total_retail
FROM sales_line_item_detail
INNER JOIN sales_header
ON sales_header.id = sales_line_item_detail.sales_header_id
INNER JOIN customer
ON customer.id = sales_header.customer_id
GROUP BY sales_header.customer_Id
ORDER BY total_Retail DESC
LIMIT 10
However, I need it to return each customer's telephone number and email address as well. Please keep in mind that not all customers have an email address and telephone number. Whenever I LEFT JOIN the email and phone number tables, the total_retail amount is off by thousands, and I am not sure why.
The following query gives completely wrong results for the total_retail field:
SELECT customer.id,
customer.first_name,
customer.last_name,
IF(
ISNULL( gemstore.customer_phone_numbers.Number),
'No Number..',
gemstore.customer_phone_numbers.Number
) AS Number,
IF(
ISNULL(gemstore.customer_emails.Email),
'No Email...',
gemstore.customer_emails.Email
) AS Email,
SUM(sales_line_item_detail.retail) AS total_retail
FROM sales_line_item_detail
INNER JOIN sales_header
ON sales_header.id = sales_line_item_detail.sales_header_id
INNER JOIN customer
ON customer.id = sales_header.customer_id
LEFT JOIN gemstore.customer_emails
ON gemstore.customer_emails.Customer_ID = gemstore.customer.ID
LEFT JOIN gemstore.customer_phone_numbers
ON gemstore.customer_phone_numbers.Customer_ID = gemstore.customer.ID
GROUP BY sales_header.customer_Id
ORDER BY total_Retail DESC
LIMIT 10
Any help figuring out why it is throwing off my results is greatly appreciated.
Thanks!

Is it possible that there are multiple records for a Customer_ID in either the customer_emails or customer_phone_numbers tables?

You'll be matching too many records. Try the query without the GROUP BY clause and you'll see which ones and how. Most likely the LEFT JOINs duplicate the sales rows for every matching customer email/phone row.

I am not totally sure, as I can't test this, but the following might be happening.
If there is more than one email or phone number per customer, the final result gets multiplied by the new joins.
Imagine the query without the GROUP BY and the join to sales:
CustomerId Email phoneNumber
1 test@gmx.com 0122233
1 mail@yahoo.com 0122233
The customer in this example has 2 email addresses.
If you now add the join to sales and the GROUP BY, total_retail is doubled.
If this is the case, note that swapping LEFT JOIN for LEFT OUTER JOIN will not help, because the two are synonyms in MySQL. Instead, reduce the email and phone number tables to one row per customer before joining (for example in a subquery that groups by Customer_ID), or compute the totals in a subquery first. With one row per customer you will, however, only see one email/phone number per customer.
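The row multiplication described in these answers can be reproduced on a tiny data set. The sketch below is only an illustration: it uses SQLite through Python's standard sqlite3 module as a stand-in for MySQL, the table layouts are simplified guesses at the asker's schema, and all rows are made up. It compares the joined-then-summed query with one that aggregates sales and emails in separate subqueries.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumed, simplified).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE sales_header (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE sales_line_item_detail (sales_header_id INTEGER, retail REAL);
    CREATE TABLE customer_emails (customer_id INTEGER, email TEXT);

    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO sales_header VALUES (10, 1), (20, 2);
    INSERT INTO sales_line_item_detail VALUES (10, 100.0), (10, 50.0), (20, 30.0);
    -- Ann has two email addresses, Bob has none.
    INSERT INTO customer_emails VALUES (1, 'ann@example.com'), (1, 'ann@work.example');
""")

# Joining emails before aggregating repeats every sales line once per email.
fanned_out = conn.execute("""
    SELECT c.id, SUM(d.retail) AS total_retail
    FROM sales_line_item_detail d
    JOIN sales_header h ON h.id = d.sales_header_id
    JOIN customer c ON c.id = h.customer_id
    LEFT JOIN customer_emails e ON e.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

# Aggregating sales and emails separately keeps one row per customer on each side.
pre_aggregated = conn.execute("""
    SELECT c.id, t.total_retail, COALESCE(e.email, 'No Email...') AS email
    FROM customer c
    JOIN (SELECT h.customer_id, SUM(d.retail) AS total_retail
          FROM sales_line_item_detail d
          JOIN sales_header h ON h.id = d.sales_header_id
          GROUP BY h.customer_id) t ON t.customer_id = c.id
    LEFT JOIN (SELECT customer_id, MIN(email) AS email
               FROM customer_emails
               GROUP BY customer_id) e ON e.customer_id = c.id
    ORDER BY c.id
""").fetchall()

print(fanned_out)      # Ann's total is doubled: [(1, 300.0), (2, 30.0)]
print(pre_aggregated)  # [(1, 150.0, 'ann@example.com'), (2, 30.0, 'No Email...')]
```

Picking MIN(email) is an arbitrary way to keep a single address per customer; GROUP_CONCAT would be an alternative if all addresses should be shown.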

Related

Use SELECT through three table

I tried to write a query, but unfortunately I didn't succeed.
I want to know how many packages were delivered by a given person over a given period.
For example, how many packages were delivered by John (user_id = 1) between 01-02-18 and 28-02-18? John drives a different car (a different plate_id) every day.
(orders_drivers.user_id, plates.plate_name, orders.delivery_date, orders.package_amount)
I have 3 tables:
orders with plate_id, delivery_date, package_amount
plates with plate_id, plate_name
orders_drivers with plate_id, plate_date, user_id
I tried some solutions but didn't get the expected result. Thanks!
Try using JOINS as shown below:
SELECT SUM(o.package_amount)
FROM orders o INNER JOIN orders_drivers od
ON o.plate_id=od.plate_id
WHERE od.user_id=<the_user_id>;
See MySQL Join Made Easy for insight.
You can also use a subquery:
SELECT SUM(o.package_amount)
FROM orders o
WHERE EXISTS (SELECT 1
FROM orders_drivers od
WHERE user_id=<user_id> AND o.plate_id=od.plate_id);
SELECT SUM(orders.package_amount) AS amount
FROM orders
LEFT JOIN plates ON orders.plate_id = plates.plate_id
LEFT JOIN orders_drivers ON orders.plate_id = orders_drivers.plate_id
WHERE orders.delivery_date > date1 AND orders.delivery_date < date2 AND orders_drivers.user_id = userid
GROUP BY orders_drivers.user_id
But seriously, you need to ask questions that make more sense.
SUM is a function that adds up all the values in each group produced by GROUP BY.
LEFT JOIN connects the tables on id = id. Any other join type would do here, as all ids are unique (at least I hope).
WHERE is where you give the dates and the user.
GROUP BY user_id means that if there are several records with the same id, they are returned as one row (with their package amounts summed).
With AS, the result is returned under the name 'amount'.
If you want the total package_amount per user in a period, you can use this query:
UPDATE: added a WHERE clause on user_id to retrieve only John's data.
SELECT od.user_id
, p.plate_name
, SUM(o.package_amount) AS TotalPackageAmount
FROM orders_drivers od
JOIN plates p
ON p.plate_id = od.plate_id
JOIN orders o
ON o.plate_id = od.plate_id
-- CONVERT(datetime, ..., 103) is SQL Server syntax (dd/mm/yyyy); in MySQL use date literals such as '2018-02-01'
WHERE o.delivery_date BETWEEN convert(datetime,'01/02/2018',103) AND convert(datetime,'28/02/2018',103)
AND od.user_id = 1
GROUP BY od.user_id
, p.plate_name
It groups the rows by user_id and plate_name, filters on a period of delivery dates, and then calculates the sum of package_amount for each group.
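One detail none of the queries above use is orders_drivers.plate_date. Since the question says John drives a different car every day, joining on plate_id alone may credit him with packages delivered by whoever drove that plate on other days. The sketch below illustrates this under the assumption that plate_date holds the day a user drove the plate and is comparable to orders.delivery_date; it runs on SQLite via Python's sqlite3 module with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (plate_id INTEGER, delivery_date TEXT, package_amount INTEGER);
    CREATE TABLE orders_drivers (plate_id INTEGER, plate_date TEXT, user_id INTEGER);

    -- Plate 7 is driven by user 1 on Feb 1 and by user 2 on Feb 2.
    INSERT INTO orders_drivers VALUES (7, '2018-02-01', 1), (7, '2018-02-02', 2),
                                      (8, '2018-02-02', 1);
    INSERT INTO orders VALUES (7, '2018-02-01', 10), (7, '2018-02-02', 20),
                              (8, '2018-02-02', 5);
""")

params = (1, '2018-02-01', '2018-02-28')

# Join on plate only: user 1 also gets the 20 packages user 2 delivered with plate 7.
plate_only = conn.execute("""
    SELECT SUM(o.package_amount)
    FROM orders o
    JOIN orders_drivers od ON od.plate_id = o.plate_id
    WHERE od.user_id = ? AND o.delivery_date BETWEEN ? AND ?
""", params).fetchone()[0]

# Join on plate and day: only deliveries made while user 1 drove the plate count.
plate_and_date = conn.execute("""
    SELECT SUM(o.package_amount)
    FROM orders o
    JOIN orders_drivers od
      ON od.plate_id = o.plate_id
     AND od.plate_date = o.delivery_date
    WHERE od.user_id = ? AND o.delivery_date BETWEEN ? AND ?
""", params).fetchone()[0]

print(plate_only, plate_and_date)  # 35 15
```

If plate_date and delivery_date are stored with a time component, both sides would need to be reduced to a date (for example with DATE() in MySQL) before comparing.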

Can i use the row result of a query to run a sub query and get the data returned?

To be clear, I want to avoid a for loop in my Node.js program.
My current approach is a GROUP_CONCAT() query (which is working correctly):
SELECT DISTINCT(c.main), GROUP_CONCAT(c.cId) AS cId_List
FROM customers c
LEFT JOIN boxes b ON b.boxId = c.boxId
WHERE c.opId = ?
GROUP BY c.conNo
ORDER BY c.conNo ASC;
// response as JSON
{
"main": 2,
"cId_List": "512,513"
},{
"main": 3,
"cId_List": "514,515,516,517"
},....
The next query I need to run is one for every "cId_List":
for(every cId_List){
qry = "SELECT SUM(amount) FROM payments p WHERE p.cId IN (cId_List);"
}
How can I avoid it?
The reason to avoid it is that there is no limit to the number of queries. It can be 10000+ for a single request.
Added Info
What is happening?
There are two tables, namely customers and payments.
There can be multiple rows in the customers table with the same "connection number [main]".
By doing the GROUP_CONCAT I am getting the ids of those rows into cId_List.
Now for every cId_List I want to run the SUM() query on the payments table,
so my result should be
{
"main": 2,
"cId_List": "512,513", //multiple rows of customers table
"amount_sum": 500 //data from payments table using above cId_List
},{
"main": 3,
"cId_List": "514,515,516,517",
"amount_sum": -200
},....
sqlFiddle
as asked: sqlfiddle explanation
customers.conNo is a unifying column for multiple customers (basically of a family, they are billed together)
customers.cId is the primary key and the separator factor (when we need to bill per person basis)
payments.cId is foreign key of customers.cId and payments are entered as per cId
report needs to be generated according to conNo
so to get all the payments of a conNo I need to send all the appropriate cId to payments table.
I hope this will clear the doubts.
EDIT:
I am checking this query, which may be the answer. I would like to know whether this query format is good performance-wise.
SELECT GROUP_CONCAT(DISTINCT(customers.cId)) AS cId_List, customers.*, payments.cId, SUM(amount) AS amt FROM `payments` left join customers on customers.cId = payments.cId GROUP BY `customers`.`conNo` ORDER BY `customers`.`conNo` ASC
So it seems that you can simply replace all of your code with the following:
SELECT c.conno
, SUM(p.amount) total
FROM customers c
LEFT
JOIN payments p
ON p.cid = c.cid
GROUP
BY c.conno
http://sqlfiddle.com/#!9/a65cf6/11
SELECT SUM(p.amount)
FROM customers AS c
LEFT JOIN payments AS p ON p.cid = c.cid
GROUP BY c.cid
This query seems to work. Can anyone tell me if it is appropriate performance-wise? I would also like your suggestions, if any. Thanks to @Strawberry and @Luca Giardina.
SELECT
GROUP_CONCAT(DISTINCT(customers.cId)) AS cId_List,
customers.*, payments.cId,
SUM(amount) AS amt
FROM `payments` LEFT JOIN customers ON customers.cId = payments.cId
GROUP BY `customers`.`conNo`
ORDER BY `customers`.`conNo` ASC
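To make the "one query instead of a loop" idea concrete, here is a small sketch that produces the id list and the payment sum per conNo in a single statement. It is an illustration only: it runs on SQLite via Python's sqlite3 module rather than MySQL, the two tables are cut down to the columns mentioned in the question, and the rows are invented to match the shape of the expected JSON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cId INTEGER PRIMARY KEY, conNo INTEGER);
    CREATE TABLE payments (cId INTEGER, amount INTEGER);

    INSERT INTO customers VALUES (512, 2), (513, 2), (514, 3), (515, 3);
    INSERT INTO payments VALUES (512, 300), (512, 100), (513, 100),
                                (514, -250), (515, 50);
""")

# Starting from customers keeps families without payments in the report.
# DISTINCT is needed because a customer appears once per payment row.
rows = conn.execute("""
    SELECT c.conNo,
           GROUP_CONCAT(DISTINCT c.cId) AS cId_List,
           COALESCE(SUM(p.amount), 0) AS amount_sum
    FROM customers c
    LEFT JOIN payments p ON p.cId = c.cId
    GROUP BY c.conNo
    ORDER BY c.conNo
""").fetchall()

# Shape the rows like the JSON in the question; ids are sorted because
# the concatenation order is not guaranteed.
report = [
    {"conNo": con_no,
     "cId_List": sorted(int(c) for c in ids.split(",")),
     "amount_sum": total}
    for con_no, ids, total in rows
]
print(report)
```

The sum is not inflated here because customers.cId is unique, so each payment row is joined exactly once. On the performance question, an index on payments.cId is the usual first thing to check, though that depends on the real data.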

Returning distinct records based on left join

I'm having some trouble formulating a complex SQL query. I'm getting the result I'm looking for and the performance is fine but whenever I try to grab distinct rows for my LEFT JOIN of product_groups, I'm either hitting some performance issues or getting incorrect results.
Here's my query:
SELECT
pl.name, pl.description,
p.rows, p.columns,
pr.sku,
m.filename, m.ext, m.type,
ptg.product_group_id AS `group`
FROM
product_region AS pr
INNER JOIN
products AS p ON (p.product_id = pr.product_id)
INNER JOIN
media AS m ON (p.media = m.media_id)
INNER JOIN
product_language AS pl ON (p.product_id = pl.product_id)
LEFT JOIN
products_groups AS ptg ON (ptg.product_id = pr.product_id)
WHERE
(pl.lang = :lang) AND
(pr.region = :region) AND
(pt.product_id = p.product_id)
GROUP BY
p.product_id
LIMIT
:offset, :limit
The result I'm getting is correct; however, I want only distinct rows returned for "group". For example, I'm getting many results back that have the same "group" value, but I only want the first row to show and the following records with the same group value to be left out.
I tried GROUP BY group and DISTINCT, but they give me incorrect results. Also note that group can come back as NULL, and I don't want those rows to be affected.
Thanks in advance.
I worked out a solution an hour after posting this. My goal was to group by product_group_id first and then the individual product_id. The requirement was that I would eliminate product duplicates and have ONE product represent the group set.
I ended up using COALESCE(ptg.product_group_id, p.product_id). This accounts for the fact that most of my group IDs were NULL except for a few dispersed products. With COALESCE I'm first grouping by the group ID; if that value is NULL, it ignores the group and groups by product_id instead.
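A minimal sketch of the COALESCE grouping described above, assuming the expression is used in the GROUP BY clause. It runs on SQLite via Python's sqlite3 module, with only the two tables needed to show the effect and invented rows; the MIN() aggregates are my choice for picking a representative, not something stated in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products_groups (product_id INTEGER, product_group_id INTEGER);

    INSERT INTO products VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    -- Products 1 and 2 share group 100; products 3 and 4 are ungrouped.
    INSERT INTO products_groups VALUES (1, 100), (2, 100);
""")

# Grouped products collapse into one row per group; ungrouped products
# (NULL group id) fall back to their own product_id and stay separate.
rows = conn.execute("""
    SELECT MIN(p.product_id) AS representative_id,
           MIN(ptg.product_group_id) AS group_id
    FROM products p
    LEFT JOIN products_groups ptg ON ptg.product_id = p.product_id
    GROUP BY COALESCE(ptg.product_group_id, p.product_id)
    ORDER BY representative_id
""").fetchall()

print(rows)  # [(1, 100), (3, None), (4, None)]
```

One caveat: if a group id can have the same numeric value as the product_id of an ungrouped product, the two would be merged into one group, so this only works cleanly when the two id ranges cannot collide.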

Wrong use of inner join function / group function?

I have the following problem with my query:
I have two tables:
Customer
Subscriber
linked together by customer.id=subscriber.customer_id
In the subscriber table, I have records with customer_id=0 (these are email records that do not have a full customer account).
Now I want to show how many customers I have per day, how many subscribers with a customer_id, and how many subscribers with customer_id=0 (I call them emailonlies).
Somehow, I cannot manage to get those emailonlies.
Perhaps it has something to do with not using the right join type.
When I use LEFT JOIN, I get the right number of customers, but not the right number of emailonlies. When I use INNER JOIN, I get the wrong number of customers. Am I using the group function correctly? I think it has something to do with that.
THIS IS MY QUERY:
SELECT DATE(c.date_register),
COUNT(DISTINCT c.id) AS newcustomers,
COUNT(DISTINCT s.customer_id) AS newsubscribedcustomers,
COUNT(DISTINCT s.subscriber_id AND s.customer_id=0) AS emailonlies
FROM customer c
LEFT JOIN subscriber s ON s.customer_id=c.id
GROUP BY DATE(c.date_register)
ORDER BY DATE(c.date_register) DESC
LIMIT 10;
I'm not entirely sure, but I think in DISTINCT s.subscriber_id AND s.customer_id=0, it runs the AND before the DISTINCT, so the DISTINCT only ever sees true and false.
Why don't you just take
COUNT(DISTINCT s.subscriber_id) - (COUNT(DISTINCT s.customer_id) - 1)?
(The -1 is there because DISTINCT s.customer_id will count 0.)
Got it. The only risk is that I get no emailonlies for a day on which there are no customers, because of the LEFT JOIN. But this one works:
SELECT customers.regdatum,customers.customersqty,subscribers.emailonlies
FROM (
(SELECT DATE(c.date_register) AS regdatum,COUNT(DISTINCT c.id) AS customersqty
FROM customer c
GROUP BY DATE(c.date_register)
) AS customers
LEFT JOIN
(SELECT DATE(s.added) AS voegdatum,COUNT(DISTINCT s.subscriber_id) AS emailonlies
FROM subscriber s
WHERE s.customer_id=0
GROUP BY DATE(s.added)
) AS subscribers
ON customers.regdatum=subscribers.voegdatum
)
ORDER BY customers.regdatum DESC
;
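The underlying reason the original query never finds any emailonlies is that rows with customer_id = 0 have no matching customer, so they never appear in a join that starts from customer. The sketch below shows this, together with the CASE form of a conditional distinct count. It is an illustration on SQLite via Python's sqlite3 module; the column types and rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, date_register TEXT);
    CREATE TABLE subscriber (subscriber_id INTEGER PRIMARY KEY,
                             customer_id INTEGER, added TEXT);

    INSERT INTO customer VALUES (1, '2020-01-01 09:00:00'), (2, '2020-01-01 10:00:00');
    -- Subscriber 11 belongs to customer 1; 12 and 13 are email-only (customer_id = 0).
    INSERT INTO subscriber VALUES (11, 1, '2020-01-01 09:05:00'),
                                  (12, 0, '2020-01-01 11:00:00'),
                                  (13, 0, '2020-01-02 08:00:00');
""")

# Rows with customer_id = 0 never match a customer, so the join drops them.
via_join = conn.execute("""
    SELECT DATE(c.date_register),
           COUNT(DISTINCT CASE WHEN s.customer_id = 0 THEN s.subscriber_id END)
    FROM customer c
    LEFT JOIN subscriber s ON s.customer_id = c.id
    GROUP BY DATE(c.date_register)
""").fetchall()

# Counting directly on subscriber, keyed by its own date column, finds them.
# CASE yields NULL for non-matching rows, and COUNT ignores NULLs.
direct = conn.execute("""
    SELECT DATE(s.added),
           COUNT(DISTINCT CASE WHEN s.customer_id = 0 THEN s.subscriber_id END) AS emailonlies,
           COUNT(DISTINCT CASE WHEN s.customer_id <> 0 THEN s.customer_id END) AS subscribed
    FROM subscriber s
    GROUP BY DATE(s.added)
    ORDER BY DATE(s.added)
""").fetchall()

print(via_join)  # [('2020-01-01', 0)]
print(direct)    # [('2020-01-01', 1, 1), ('2020-01-02', 1, 0)]
```

This is why the working solution aggregates subscriber in its own subquery and only then joins the two per-day results on the date.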

Table join issue with MySQL

I have a table for referred users (contains an email address and date columns) and a table for users.
I run this to get the top referrers:
SELECT count(r.Email) as count, r.Email
FROM refs r
WHERE r.referredOn > '2011-12-13'
GROUP BY email
ORDER BY count DESC
But I want to join this with the users table so it displays with other data from the users table, and I thought a join would work. I used a LEFT JOIN because emails may be entered incorrectly; some people put a first name etc. in refs.Email.
SELECT count(r.Email) as count, r.Email, u.*
FROM refs r LEFT JOIN users u ON u.email_primary = r.Email
WHERE r.referredOn > '2011-12-13'
GROUP BY email
ORDER BY count DESC
With the above query the count is incorrect, but I don't know why.
Try this one:
SELECT count(r.Email) as count, r.Email
FROM refs r
INNER JOIN users u ON u.email_primary = r.Email
WHERE r.referredOn > '2011-12-13'
GROUP BY email
ORDER BY count DESC
If you're adding a new column from users u, you also need to add it to your GROUP BY clause.
Regards
Unfortunately, a LEFT JOIN won't help you here. This type of join says: give me all the rows in refs, together with the rows in users that match on email; where there is no match on email, the users columns come back as NULL. If the email doesn't match, you won't get the user data you want.
So you can't use the LEFT JOIN condition here the way you want.
If you enforced that they had to enter an email every time, and that it was a valid email etc., then you could use an INNER JOIN.
JOINs are usually used to follow referential integrity. So, for example, I have a user with an id in one table, and another table with the column userid: there is a strong relationship between the two tables that I can join on.
Jeff Atwood has a good explanation of how joins work.
See if this will help you:
SELECT e.count, e.email, u.col1, u.col2 -- etc
FROM (
SELECT count(r.Email) as count, r.Email
FROM refs r
WHERE r.referredOn > '2011-12-13'
GROUP BY email
) e
INNER JOIN
users u ON u.email_primary = e.Email
Instead of a direct join, you could try using your counting query as a derived table (a subquery in the FROM clause).
I wrote this query
SELECT *, count(r.Email) as count FROM refs r
LEFT OUTER JOIN users u ON r.email = u.email_primary
WHERE u.uid IS NOT NULL
GROUP BY u.uid
ORDER BY count DESC
This showed me that the count was wrong because some of the email addresses are used twice in the users table (users sharing a 'family' email address), which doubled my count. The above query shows each separate user account.
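A quick way to confirm this kind of problem is to look for duplicate values of the join key before joining. The sketch below is an illustration on SQLite via Python's sqlite3 module, with made-up rows and only the columns named in the question; it lists the shared addresses and shows the inflated count they cause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (uid INTEGER PRIMARY KEY, email_primary TEXT);
    CREATE TABLE refs (Email TEXT, referredOn TEXT);

    -- Users 1 and 2 share a family address.
    INSERT INTO users VALUES (1, 'family@example.com'), (2, 'family@example.com'),
                             (3, 'solo@example.com');
    INSERT INTO refs VALUES ('family@example.com', '2011-12-14'),
                            ('family@example.com', '2011-12-15'),
                            ('solo@example.com', '2011-12-16');
""")

# Diagnostic: join keys that occur more than once on the users side.
duplicates = conn.execute("""
    SELECT email_primary, COUNT(*) AS n
    FROM users
    GROUP BY email_primary
    HAVING COUNT(*) > 1
""").fetchall()

# Each referral is repeated once per user sharing the address,
# so the two family referrals are counted as four.
joined_counts = conn.execute("""
    SELECT r.Email, COUNT(r.Email) AS count
    FROM refs r
    LEFT JOIN users u ON u.email_primary = r.Email
    WHERE r.referredOn > '2011-12-13'
    GROUP BY r.Email
    ORDER BY r.Email
""").fetchall()

print(duplicates)     # [('family@example.com', 2)]
print(joined_counts)  # [('family@example.com', 4), ('solo@example.com', 1)]
```

Counting the referrals in a subquery first, as in the derived-table answer above, keeps the count at 2 even though the outer join still returns one row per user account.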