Is this join possible in SQL? - mysql

I am not sure how else to ask the question, so I will give an example, to see if this is possible with SQL.
Let's say a customer visits a store, and a System creates a VisitID. Then he places an Order in a table of Orders, linked to the VisitID. Then he fills in a Shipping Form, in a table of ShippingForms, linked to the Visit ID. Then the system generates a Receipt, in a table of ReceiptForms, linked to the VisitID. Then he fills in a Return Form, in a table of ReturnForms, linked to the VisitID.
So, is there a way to query the system to show for a VisitID there is/isn't an Orders record, Shipping Form record, Receipt record, Return record? This would be handy in a DBGrid to show all the activities of the customer on that VisitID. Each table (Orders, ShippingForm, ReceiptForm, ReturnForm, etc.) is of a different structure and different fields, but linked by the Visit ID, and may be present or not present, or may have several Orders during that VisitID.
So -- Select Orders, ShippingForm, ReceiptForm, ReturnForm where VisitID=x.
so I could present the information in a grid such as:
{
VisitID 2315
OrderID 1256
OrderID 1257
OrderID 1258
ReceiptID 5124
ReceiptID 5125
ReceiptID 5126
ShippingID 99023
ReturnID 582812
}

Visits that do not have records in all tables:
Select *
from Visits v -- visit always exists
left join Orders o on o.visitId = v.visitId
left join ShippingForm s on s.visitId = v.visitId
left join ReceiptForm r on r.visitId = v.visitId
left join ReturnForm rf on rf.visitId = v.visitId
where v.VisitID=x
and (o.visitId is null or s.visitId is null or r.visitId is null or rf.visitId is null ) -- it is null when record does not exist
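As a variation on the left-join check above, here is a minimal sketch that reports one present/absent flag per table with EXISTS, so a visit with several orders still yields a single row. It runs on Python's built-in sqlite3 as a stand-in for MySQL; the id column names (orderId, shippingId, returnId), the sample rows, and the helper presence() are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real MySQL schema.
# Column names other than visitId are assumed for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Visits (visitId INTEGER PRIMARY KEY);
CREATE TABLE Orders (orderId INTEGER PRIMARY KEY, visitId INTEGER);
CREATE TABLE ShippingForm (shippingId INTEGER PRIMARY KEY, visitId INTEGER);
CREATE TABLE ReturnForm (returnId INTEGER PRIMARY KEY, visitId INTEGER);
INSERT INTO Visits VALUES (2315), (2316);
INSERT INTO Orders VALUES (1256, 2315), (1257, 2315), (1300, 2316);
INSERT INTO ShippingForm VALUES (99023, 2315);
""")

def presence(visit_id):
    """Return (visitId, has_order, has_shipping, has_return) or None."""
    return conn.execute("""
        SELECT v.visitId,
               EXISTS (SELECT 1 FROM Orders o WHERE o.visitId = v.visitId),
               EXISTS (SELECT 1 FROM ShippingForm s WHERE s.visitId = v.visitId),
               EXISTS (SELECT 1 FROM ReturnForm rf WHERE rf.visitId = v.visitId)
        FROM Visits v
        WHERE v.visitId = ?
    """, (visit_id,)).fetchone()

print(presence(2315))
print(presence(2316))
```

EXISTS is used here instead of joins because joining several one-to-many tables on the same visit multiplies the rows; the flags stay at one row per visit.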

Maybe you are looking for UNION ALL:
select visitid, type, id
from
(
select visitid, 'visit' as type, 1 as sortkey, null as id from visits
union all
select visitid, 'order' as type, 2 as sortkey, orderid as id from orders
union all
select visitid, 'receipt' as type, 3 as sortkey, receiptid as id from receipts
union all
select visitid, 'shipping' as type, 4 as sortkey, shippingid as id from shippings
union all
select visitid, 'return' as type, 5 as sortkey, returnid as id from returns
) data
order by visitid, sortkey;
Or you may want string aggregation. This is DBMS dependent and you haven't mentioned your DBMS. This is for MySQL:
select
visitid,
ord.ids as orderids,
rcp.ids as receiptids,
shp.ids as shippingids,
ret.ids as returnids
from visits v
left join
(select visitid, group_concat(orderid) as ids from orders group by visitid) ord
using (visitid)
left join
(select visitid, group_concat(receiptid) as ids from receipts group by visitid) rcp
using (visitid)
left join
(select visitid, group_concat(shippingid) as ids from shippings group by visitid) shp
using (visitid)
left join
(select visitid, group_concat(returnid) as ids from returns group by visitid) ret
using (visitid)
order by visitid;
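To see the UNION ALL idea on actual rows, here is a small sketch filtered to one visit. It uses Python's built-in sqlite3 instead of MySQL, only two of the child tables, and made-up sample rows, so treat the table contents as assumptions.

```python
import sqlite3

# SQLite stand-in; table and column names follow the UNION ALL answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (visitid INTEGER PRIMARY KEY);
CREATE TABLE orders (orderid INTEGER PRIMARY KEY, visitid INTEGER);
CREATE TABLE receipts (receiptid INTEGER PRIMARY KEY, visitid INTEGER);
INSERT INTO visits VALUES (2315), (2316);
INSERT INTO orders VALUES (1256, 2315), (1257, 2315), (1300, 2316);
INSERT INTO receipts VALUES (5124, 2315);
""")

# Each branch emits the same three-column shape, so differently structured
# tables can be stacked; sortkey keeps the record types grouped in the grid.
rows = conn.execute("""
    SELECT type, id
    FROM (
        SELECT visitid, 'order' AS type, 2 AS sortkey, orderid AS id FROM orders
        UNION ALL
        SELECT visitid, 'receipt' AS type, 3 AS sortkey, receiptid AS id FROM receipts
    ) AS data
    WHERE visitid = ?
    ORDER BY sortkey, id
""", (2315,)).fetchall()

for record_type, record_id in rows:
    print(record_type, record_id)
```

The result is one row per record, which maps directly onto a line-by-line grid.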

Thank you all, you have allowed me to answer the question. For my purposes, I combined the results to make this solution:
select
OrderID as FormID,
1 as FormType,
'Order ID' as FormTitle
from Orders O, Visits V where (O.VisitID=V.VisitID) and (V.VisitID=112)
union all
select
ShippingID as FormID,
2 as FormType,
'Shipping ID' as FormTitle
from Shipping S, Visits V where (S.VisitID=V.VisitID) and (V.VisitID=112)
etc.
So now I have a line-by-line table for VisitID=112:
FormID FormType FormTitle
23 1 Order ID
26 1 Order ID
28 1 Order ID
342 2 Shipping ID
343 2 Shipping ID
367 2 Shipping ID

Related

Aggregating three tables but getting wrong values during the aggregation operation

"employee" table:
emp_id  empName
1       ABC
2       xyz
"client" table:
id  emp_id  clientName
1   1       a
2   1       b
3   1       c
4   2       d
"collection" table:
id  emp_id  Amount
1   2       1000
2   1       2000
3   1       1000
4   1       1200
I want to aggregate values from the three input tables shown above as samples. For each employee I need to find:
- the total collection amount for that employee (as a sum)
- the clients that are involved with the corresponding employee (as a comma-separated value)
Here follows my current query.
MyQuery:
SELECT emp_id,
empName,
GROUP_CONCAT(client.clientName ORDER BY client.id SEPARATOR '') AS clientName,
SUM(collection.Amount)
FROM employee
LEFT JOIN client
ON client.emp_id = employee.emp_id
LEFT JOIN collection
ON collection.emp_id = employee.emp_id
GROUP BY employee.emp_id;
The problem with this query is that I'm getting wrong sums and client lists when an employee is associated with multiple clients or collections.
Current output:
emp_id  empName  clientName         TotalCollection
1       ABC      a,b,c,c,b,a,a,b,c  8400
2       xyz      d,d                1000
Expected output:
emp_id  empName  clientName  TotalCollection
1       ABC      a , b , c   4200
2       xyz      d           1000
How can I solve this problem?
There are some typos in your query:
- the separator inside the GROUP_CONCAT function should be a comma, given your current output; the comma is the default value, so you can simply omit that clause.
- each column in your SELECT needs to be qualified with the table it comes from whenever that column name exists in more than one of the tables you're joining.
- your GROUP BY clause should contain at least every field that is not aggregated inside the SELECT clause in order to have a potentially correct output.
The overall conceptual problem in your query is that, for each employee, the two joins pair every matching row of the "client" table with every matching row of the "collection" table (resulting in repeated client names and an inflated sum of amounts during the aggregation). One way out of the rabbit hole is to aggregate the "client" table first (to have one row for each "emp_id" value), then join back with the other tables.
SELECT emp.emp_id,
emp.empName,
cl.clientName,
SUM(coll.Amount)
FROM employee emp
LEFT JOIN (SELECT emp_id,
GROUP_CONCAT(client.clientName
ORDER BY client.id) AS clientName
FROM client
GROUP BY emp_id) cl
ON cl.emp_id = emp.emp_id
LEFT JOIN (SELECT emp_id, Amount FROM collection) coll
ON coll.emp_id = emp.emp_id
GROUP BY emp.emp_id,
emp.empName,
cl.clientName
Check the demo here.
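The row multiplication described above can be reproduced on the question's sample tables. This is a minimal sketch using Python's built-in sqlite3 rather than MySQL; it compares the sum from the double join with the sum obtained when "collection" is aggregated in a derived table before joining. The variable names naive and fixed are only illustrative.

```python
import sqlite3

# SQLite stand-in loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, empName TEXT);
CREATE TABLE client (id INTEGER PRIMARY KEY, emp_id INTEGER, clientName TEXT);
CREATE TABLE collection (id INTEGER PRIMARY KEY, emp_id INTEGER, Amount INTEGER);
INSERT INTO employee VALUES (1, 'ABC'), (2, 'xyz');
INSERT INTO client VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 1, 'c'), (4, 2, 'd');
INSERT INTO collection VALUES (1, 2, 1000), (2, 1, 2000), (3, 1, 1000), (4, 1, 1200);
""")

# Joining both child tables directly: each client row of an employee is
# paired with each collection row, so every amount is counted several times.
naive = dict(conn.execute("""
    SELECT e.emp_id, SUM(col.Amount)
    FROM employee e
    LEFT JOIN client c ON c.emp_id = e.emp_id
    LEFT JOIN collection col ON col.emp_id = e.emp_id
    GROUP BY e.emp_id
""").fetchall())

# Aggregating collection first leaves one row per employee to join against.
fixed = dict(conn.execute("""
    SELECT e.emp_id, t.total
    FROM employee e
    LEFT JOIN (SELECT emp_id, SUM(Amount) AS total
               FROM collection
               GROUP BY emp_id) t ON t.emp_id = e.emp_id
""").fetchall())

print("double join:", naive)
print("pre-aggregated:", fixed)
```

The same reasoning applies to the client list: aggregate each child table on its own, then join the one-row-per-employee results.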
Regardless of my comment, here is a query for your desired output:
SELECT
a.emp_id,
a.empName,
a.clientName,
SUM(col.Amount) AS totalCollection
FROM (SELECT e.emp_id,
e.`empName`,
GROUP_CONCAT(DISTINCT c.clientName ORDER BY c.id ) AS clientName
FROM employee e
LEFT JOIN `client` c
ON c.emp_id = e.emp_id
GROUP BY e.`emp_id`) a
LEFT JOIN collection col
ON col.emp_id = a.emp_id
GROUP BY a.emp_id, a.empName, a.clientName;
When you have multiple joins, be careful about the relations between the tables and the number of rows your query generates. You may well end up with more rows in the output than you intended.
Hope this helps
SELECT employee.emp_id,
       employee.empName,
       GROUP_CONCAT(client.clientName ORDER BY client.id SEPARATOR ',') AS clientName,
       C.Amount
FROM employee
LEFT JOIN client
       ON client.emp_id = employee.emp_id
LEFT JOIN (SELECT collection.emp_id, SUM(collection.Amount) AS Amount
           FROM collection
           GROUP BY collection.emp_id) C
       ON C.emp_id = employee.emp_id
GROUP BY employee.emp_id, employee.empName, C.Amount;
It works for me now.

MySQL - Marking duplicates from several table fields, as well as data from another table

I have two tables - one shows user purchases, and one shows a product id with its corresponding product type.
My client wants to make duplicate users inactive based on last name and email address, but wants to run the query by product type (based on what type of product they purchased), and only wants to include user_ids who haven't purchased paint (product ids 5 and 6). So the query will be run multiple times - once for all people who have purchased lawnmowers, and then for all people who have purchased leafblowers etc (and there will be some overlap between these two). No user_id that has purchased paint should be made inactive.
In terms of who should stay active among the duplicates, the one to stay active will be the one with the highest product id purchased (as products are released annually). If they have multiple records with the same product id, the record to stay active will be the one with most recent d_modified and t_modified.
I also want to shift the current value of 'inactive' to the 'previously_inactive' column, so that this can be easily reversed if need be.
Here is some sample table data
If the query was run by leafblower purchases, rows 5, 6, and 7 would be made inactive. This is the expected output:
If the query was run by lawnmower purchases, rows 1 and 2 would be made inactive. This would be the expected output:
If row 4 was not the most recent, it would still not be made inactive, as user_id 888 had bought paint (and we want to exclude these user_ids from being made inactive).
This is an un-optimised version of the query for 'leafblower' purchases (it is working, but will probably be too slow in the interface):
UPDATE test.user_purchases
SET inactive = 1
WHERE id IN (
    SELECT z.id
    FROM (SELECT * FROM test.user_purchases) z
    WHERE z.product_id IN (
        SELECT product_id
        FROM test.products
        WHERE product_type IN ("leafblower")
    )
    AND id NOT IN (
        SELECT a.id
        FROM (SELECT * FROM test.user_purchases) a
        INNER JOIN (
            SELECT r.surname, r.email
            FROM (SELECT * FROM test.user_purchases) r
            JOIN test.products s on r.product_id = s.product_id
            WHERE s.product_type IN ("paint")
        ) b
        WHERE a.surname = b.surname
        AND a.email = b.email
    )
    AND id NOT IN (
        SELECT MAX(z.id)
        FROM (SELECT * FROM test.user_purchases) z
        WHERE z.product_id IN (
            SELECT product_id
            FROM test.products
            WHERE product_type IN ("leafblower")
        )
        AND id NOT IN (
            SELECT a.id
            FROM (SELECT * FROM test.user_purchases) a
            INNER JOIN (
                SELECT r.surname, r.email
                FROM (SELECT * FROM test.user_purchases) r
                JOIN test.products s on r.product_id = s.product_id
                WHERE s.product_type IN ("paint")
            ) b
            WHERE a.surname = b.surname
            AND a.email = b.email
        )
        GROUP BY surname, email
    )
)
Any suggestions on how I can streamline this query and optimise the speed of it would be much appreciated.

"Combine" FULL OUTER JOIN and NOT IN?

I have two tables:
+-----------------------+
| Tables_in_my_database |
+-----------------------+
| orders |
| orderTaken |
+-----------------------+
In orders, there are attributes
orderId, orderName, isClosed and orderCreationTime.
In orderTaken, there are attributes
userId, orderId and orderStatus.
Let's say when
orderStatus = 1 --> the customer has taken the order
orderStatus = 2 --> the order has been shipped
orderStatus = 3 --> the order is completed
orderStatus = 4 --> the order is canceled
orderStatus = 5 --> the order has an exception
Basically, my project works like this: a user with a unique userId can take an order from the web page, and each order has its own unique orderId as well. Once an order is taken, the orderTaken table records the userId and orderId and initially sets orderStatus = 1. The shop then updates the orderStatus based on various situations. Once the shop has set isClosed = 1, the order is no longer displayed at all, whether or not the user has taken it (this may not make sense, but in the query it is just an isClosed = 0 condition).
Now, I want to construct a web page that will show both the new orders that the user hasn't taken yet (the orders whose orderIds are not recorded in the orderTaken table under this user's userId) and the orders that the user has already taken, with the orderStatus shown, BUT only where the orderStatus IS NOT 4 or 5, sorted by orderCreationTime DESC (yes, maybe that doesn't make sense without an orderTakenTime, but let's keep it that way), like:
OrderId 4
Order Name: PetPikachu
orderStatus = 1
CreationTime: 5am
OrderId 3
Order Name: A truck of hamsters
orderStatus = 3
CreationTime: 4am
OrderId 2
New order
Order Name: Macbuk bull
CreationTime: 3am
OrderId 1
Order Name: Jay Chou's Album
orderStatus = 2
CreationTime: 2am
I have this query written based on the knowledge I've learned:
SELECT * FROM orders A WHERE A.isClosed == '0' FULL OUTER JOIN orderTaken B WHERE B.userId = '4' AND (B.orderStatus<>'4' OR B.orderStatus<>'5') ORDER BY A.orderCreationTime DESC;
Apparently this query doesn't work, but I'm afraid to have a
ON A.orderId = B.orderId
since then the table returned will eliminate the new orders that the orderId hasn't been recorded in orderTaken B. I've also tried a NOT IN clause like
SELECT * FROM orders A WHERE A.isClosed = '0' AND A.orderId NOT IN (SELECT orderId FROM orderTaken B WHERE B.userId = '$userId' AND (B.orderStatus='4' OR B.orderStatus='5')) ORDER BY creationTime DESC;
This query works but it doesn't have the field orderStatus from orderTaken B in the returned table. I was thinking to add another JOIN orderTaken B clause after this query to get the fields from B but I think that's not a good way to write a query.
I just wanna kinda combine "NOT IN" and "FULL JOIN". Can anybody help me out? Thanks!
Just like @terje-d said, what you need is a LEFT JOIN. I updated this with the original table names and fixed the $userId filter.
For all open orders and incomplete orders.
SELECT o.`orderId`,
o.`orderName`,
ot.`orderStatus`,
o.`orderCreationTime`
FROM orders o
LEFT JOIN orderTaken ot
ON o.orderId = ot.orderId
WHERE o.isClosed = 0
AND (
ot.orderId IS NULL
OR ot.orderStatus NOT IN (4,5)
)
ORDER BY o.`orderCreationTime` DESC
For all open orders and incomplete orders for a particular user
SELECT o.`orderId`,
o.`orderName`,
ot.`orderStatus`,
o.`orderCreationTime`
FROM orders o
LEFT JOIN orderTaken ot
ON o.orderId = ot.orderId
WHERE o.isClosed = 0
AND ( ot.orderStatus IS NULL
OR (
ot.userId = ?
AND ot.orderStatus NOT IN (4,5)
)
)
ORDER BY o.`orderCreationTime` DESC
You seem to want to find the records in orders that are not assigned to a user (i.e. do not have a related record in orderTaken), plus the ones that are assigned to a user but where the orderStatus is not 4 or 5.
Then a full outer join is not needed, as there will be no records in orderTaken without a related record in orders. A left outer join can be used to find all the records from orders; the ON clause brings in data from the related records in orderTaken, and the WHERE clause can then filter out orders taken by other users, or where orderStatus is 4 or 5:
SELECT o.*, ot.userID, ot.orderStatus
FROM orders o
LEFT JOIN orderTaken ot
ON ot.orderID = o.orderID
WHERE o.isClosed = 0
AND (ot.userID IS NULL OR (ot.userID = $userID AND ot.orderStatus NOT IN (4,5)))
ORDER BY o.orderCreationTime DESC
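One variation worth sketching, which none of the answers above spell out: moving the user filter into the ON clause, so that an order taken only by somebody else still shows up as a new order for the current user. The example below uses Python's built-in sqlite3 as a stand-in for MySQL; the sample rows (including user 7, the closed order 5 and the canceled order 6) and the text timestamps are assumptions for illustration.

```python
import sqlite3

# SQLite stand-in with made-up rows modelled on the question's example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (orderId INTEGER PRIMARY KEY, orderName TEXT,
                     isClosed INTEGER, orderCreationTime TEXT);
CREATE TABLE orderTaken (userId INTEGER, orderId INTEGER, orderStatus INTEGER);
INSERT INTO orders VALUES
    (1, 'Jay Chou''s Album', 0, '02:00'),
    (2, 'Macbuk bull', 0, '03:00'),
    (3, 'A truck of hamsters', 0, '04:00'),
    (4, 'PetPikachu', 0, '05:00'),
    (5, 'Closed order', 1, '06:00'),
    (6, 'Canceled order', 0, '07:00');
INSERT INTO orderTaken VALUES
    (4, 1, 2), (4, 3, 3), (4, 4, 1), (4, 6, 4),
    (7, 2, 1);
""")

# The userId condition sits in ON, so rows of other users never match and
# their orders come back with NULL status (i.e. "new" for this user).
rows = conn.execute("""
    SELECT o.orderId, ot.orderStatus
    FROM orders o
    LEFT JOIN orderTaken ot
           ON ot.orderId = o.orderId AND ot.userId = ?
    WHERE o.isClosed = 0
      AND (ot.orderStatus IS NULL OR ot.orderStatus NOT IN (4, 5))
    ORDER BY o.orderCreationTime DESC
""", (4,)).fetchall()

for order_id, status in rows:
    print(order_id, "New order" if status is None else status)
```

Order 2 was taken by user 7 only, so for user 4 it is listed as a new order, while the canceled order 6 and the closed order 5 are filtered out.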

Filter on second left join - SQL

I have three tables. One consists of customers, one consists of products they have purchased and the last one of the returns they have done:
Table customer
CustID, Name
1, Tom
2, Lisa
3, Fred
Table product
CustID, Item
1, Toaster
1, Breadbox
2, Toaster
3, Toaster
Table Returns
CustID, Date, Reason
1, 2014, Guarantee
2, 2013, Guarantee
2, 2014, Guarantee
3, 2015, Guarantee
I would like to get all the customers that bought a Toaster, unless they also bought a breadbox, but not if they have returned a product more than once.
So I have tried the following:
SELECT * FROM Customer
LEFT JOIN Product ON Customer.CustID=Product.CustID
LEFT JOIN Returns ON Customer.CustID=Returns.CustID
WHERE Item = 'Toaster'
AND Customer.CustID NOT IN (
Select CustID FROM Product Where Item = 'Breadbox'
)
That gives me the ones that have bought a Toaster but not a breadbox. Hence, Lisa and Fred.
But I suspect Lisa of breaking the products on purpose, so I do not want to include the ones that have returned a product more than once. Hence, what do I add to the statement to only get Fred's information?
How about
SELECT * FROM Customer
LEFT JOIN Product ON Customer.CustID=Product.CustID
WHERE Item = 'Toaster'
AND Customer.CustID NOT IN (
Select CustID FROM Product Where Item = 'Breadbox'
)
AND (SELECT COUNT(*) FROM Returns WHERE Customer.CustId = Returns.CustID) <= 1
The filter condition goes in the ON clause for all but the first table (in a series of LEFT JOINs):
SELECT *
FROM Customer c LEFT JOIN
Product p
ON c.CustID = p.CustID AND p.Item = 'Toaster' LEFT JOIN
Returns r
ON c.CustID = r.CustID
WHERE c.CustID NOT IN (Select p.CustID FROM Product p Where p.Item = 'Breadbox');
Conditions on the first table remain in the WHERE clause.
As a note: A table called Product that contains a CustId seems awkward. The table behaves more like its name should be CustomerProducts.
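The difference between filtering in WHERE and filtering in ON can be seen on the question's Customer and Product rows. This is a small sketch on Python's built-in sqlite3 (standing in for whatever DBMS is in use); the variable names in_where and in_on are only illustrative.

```python
import sqlite3

# SQLite stand-in loaded with the Customer and Product rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Product (CustID INTEGER, Item TEXT);
INSERT INTO Customer VALUES (1, 'Tom'), (2, 'Lisa'), (3, 'Fred');
INSERT INTO Product VALUES (1, 'Toaster'), (1, 'Breadbox'), (2, 'Toaster'), (3, 'Toaster');
""")

# Filter on the joined table in WHERE: unmatched customers have NULL in
# p.Item, fail the comparison, and are dropped, as with an inner join.
in_where = conn.execute("""
    SELECT c.Name, p.Item
    FROM Customer c
    LEFT JOIN Product p ON p.CustID = c.CustID
    WHERE p.Item = 'Breadbox'
    ORDER BY c.CustID
""").fetchall()

# Same filter in ON: every customer is kept, with NULL where nothing matched.
in_on = conn.execute("""
    SELECT c.Name, p.Item
    FROM Customer c
    LEFT JOIN Product p ON p.CustID = c.CustID AND p.Item = 'Breadbox'
    ORDER BY c.CustID
""").fetchall()

print(in_where)
print(in_on)
```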
You can use a conditional COUNT:
SELECT C.CustID, C.Name
FROM Customer C
JOIN ( SELECT CustID
       FROM Product
       GROUP BY CustID
       HAVING COUNT(CASE WHEN Item = 'Toaster' THEN 1 END) >= 1
          AND COUNT(CASE WHEN Item = 'Breadbox' THEN 1 END) = 0
     ) P -- Get customers with at least one Toaster and no Breadbox
  ON C.CustID = P.CustID
JOIN ( SELECT CustID
       FROM Returns
       GROUP BY CustID
       HAVING COUNT(*) < 2
     ) R -- Get only customers with fewer than 2 returns
  ON C.CustID = R.CustID
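Here is a runnable sketch of the conditional COUNT idea on the question's data, using Python's built-in sqlite3 as a stand-in. It deviates from the answer above in one labeled way: the returns are attached with a LEFT JOIN plus COALESCE, so a customer with no returns at all is kept. Customer 4 ("Ann") is a hypothetical row added only to show that case.

```python
import sqlite3

# SQLite stand-in; rows 1-3 come from the question, customer 4 is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Product (CustID INTEGER, Item TEXT);
CREATE TABLE Returns (CustID INTEGER, Date INTEGER, Reason TEXT);
INSERT INTO Customer VALUES (1, 'Tom'), (2, 'Lisa'), (3, 'Fred'), (4, 'Ann');
INSERT INTO Product VALUES (1, 'Toaster'), (1, 'Breadbox'), (2, 'Toaster'),
                           (3, 'Toaster'), (4, 'Toaster');
INSERT INTO Returns VALUES (1, 2014, 'Guarantee'), (2, 2013, 'Guarantee'),
                           (2, 2014, 'Guarantee'), (3, 2015, 'Guarantee');
""")

rows = conn.execute("""
    SELECT c.CustID, c.Name
    FROM Customer c
    JOIN (SELECT CustID
          FROM Product
          GROUP BY CustID
          -- at least one Toaster and no Breadbox
          HAVING COUNT(CASE WHEN Item = 'Toaster' THEN 1 END) >= 1
             AND COUNT(CASE WHEN Item = 'Breadbox' THEN 1 END) = 0) p
      ON p.CustID = c.CustID
    LEFT JOIN (SELECT CustID, COUNT(*) AS n
               FROM Returns
               GROUP BY CustID) r
      ON r.CustID = c.CustID
    -- customers without any return have r.n NULL, treated as 0
    WHERE COALESCE(r.n, 0) <= 1
    ORDER BY c.CustID
""").fetchall()

print(rows)
```

Tom is excluded by the Breadbox condition and Lisa by her two returns, which leaves Fred and the hypothetical no-return customer.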

MYSQL: How to join two tables using Inner join and then calculating the total number from the second table for the following examples

I am stuck with the following requirement and I am finding it difficult to crack the query for it.
Consider a table customer with the following fields
id signup_date first_payment_date
10 2015-03-20 null
11 2015-03-20 null
12 2015-03-20 null
13 2015-03-20 null
14 2015-05-23 null
15 2015-05-23 null
Consider another table transaction_history
id product_name
10 vod trial
10 vod trial
11 vod trial
12 vod trial
12 vod
13 vod trial
14 vod trial
15 vod trial
15 vod trial
I need to pick the id from the customer table, based on the signup_date and on first_payment_date being null, and look it up in the transaction_history table.
Then I need to check whether this id is present in transaction_history with at least 1 entry where product_name = "vod trial". If so, it is a row in the result I want.
At the end I need to calculate the total number of ids from transaction_history that have at least one row where product_name = "vod trial", grouped by the signup_date in the customer table.
I wrote a query in the following manner:
SELECT
ts.guid,
cs.signup_date,
(SELECT
COUNT(ts2.guid)
FROM
transaction_history ts2
WHERE
cs.guid = ts2.guid
AND ts2.product_name = "vod trial"
HAVING COUNT(ts2.guid) = 1) AS count_ts_guid
FROM
customer AS cs,
transaction_history AS ts
WHERE
cs.guid = ts.guid
AND cs.first_payment_date IS NULL;
But with the above query I am not able to calculate the total count per signup_date.
Would be great if someone could help me out.
Sample result:
date new trials
2015-03-20 2
2015-05-23 1
I am not sure I fully understand. You want customers without first_payment_date that have a trial entry in the transaction table?
select *
from customer
where first_payment_date is null
and id in (select id from transaction_history where product_name = 'vod trial');
Okay, from your last comment it seems, you want customers that have no trial entry in the transaction table, too. And you want to display them with their trial transaction count. So:
select signup_date,
(
select count(*)
from transaction_history th
where th.product_name = 'vod trial'
and th.id = c.id
)
from customer c
where first_payment_date is null;
If you even want to group by date, then aggregate:
select signup_date,
sum((
select count(*)
from transaction_history th
where th.product_name = 'vod trial'
and th.id = c.id
))
from customer c
where first_payment_date is null
group by signup_date;
Next try: Join all customers and transactions, such as to only get customers present in the transactions table. Then aggregate.
select c.signup_date, count(*)
from customer c
join transaction_history th on th.id = c.id and th.product_name = 'vod trial'
where c.first_payment_date is null
group by c.signup_date;
Or do you want this:
select c.signup_date, count(case when th.product_name = 'vod trial' then 1 end)
from customer c
join transaction_history th on th.id = c.id
where c.first_payment_date is null
group by c.signup_date;
I'd better make this a separate answer. You want to find customers that have only one entry in transaction_history and that entry must be 'vod trial'. So read the transaction table, group by customer id and count. Check your criteria with HAVING. Then join the found IDs with the customer table and group by date.
select c.signup_date, count(*)
from customer c
join
(
select id
from transaction_history
group by id
having count(*) = 1
and min(product_name) = 'vod trial'
) t on t.id = c.id
group by c.signup_date;
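As a check, the last query can be run against the sample rows from the question. This is a minimal sketch on Python's built-in sqlite3 standing in for MySQL; the first_payment_date filter and the ORDER BY are additions made here to follow the question's wording and to get a stable row order.

```python
import sqlite3

# SQLite stand-in loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, signup_date TEXT, first_payment_date TEXT);
CREATE TABLE transaction_history (id INTEGER, product_name TEXT);
INSERT INTO customer VALUES
    (10, '2015-03-20', NULL), (11, '2015-03-20', NULL), (12, '2015-03-20', NULL),
    (13, '2015-03-20', NULL), (14, '2015-05-23', NULL), (15, '2015-05-23', NULL);
INSERT INTO transaction_history VALUES
    (10, 'vod trial'), (10, 'vod trial'), (11, 'vod trial'), (12, 'vod trial'),
    (12, 'vod'), (13, 'vod trial'), (14, 'vod trial'), (15, 'vod trial'),
    (15, 'vod trial');
""")

# Inner derived table: customers with exactly one transaction, which must be
# a 'vod trial'. Outer query: count those customers per signup date.
rows = conn.execute("""
    SELECT c.signup_date, COUNT(*)
    FROM customer c
    JOIN (SELECT id
          FROM transaction_history
          GROUP BY id
          HAVING COUNT(*) = 1
             AND MIN(product_name) = 'vod trial') t ON t.id = c.id
    WHERE c.first_payment_date IS NULL
    GROUP BY c.signup_date
    ORDER BY c.signup_date
""").fetchall()

for signup_date, new_trials in rows:
    print(signup_date, new_trials)
```

Customers 11 and 13 qualify for 2015-03-20 and customer 14 for 2015-05-23, which matches the sample result in the question.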