"Combine" FULL OUTER JOIN and NOT IN? - mysql

I have two tables:
+-----------------------+
| Tables_in_my_database |
+-----------------------+
| orders |
| orderTaken |
+-----------------------+
In orders, there are attributes
orderId, orderName, isClosed and orderCreationTime.
In orderTaken, there are attributes
userId, orderId and orderStatus.
Let's say when
orderStatus = 1 --> the customer has taken the order
orderStatus = 2 --> the order has been shipped
orderStatus = 3 --> the order is completed
orderStatus = 4 --> the order is canceled
orderStatus = 5 --> the order has an exception
The project works like this: a user with a unique userId can take an order from the web page, and each order has its own unique orderId. When an order is taken, the orderTaken table records the userId and orderId and initially sets orderStatus = 1. The shop then updates orderStatus as the situation changes. Once the shop sets isClosed = 1, the order is no longer displayed at all, whether or not a user has taken it (that may not make much sense, but it just means an isClosed = 0 condition in the query).
Now I want to build a web page that shows both the new orders the user hasn't taken yet (orders whose orderId is not recorded in orderTaken under this user's userId) and the orders the user has already taken, with their orderStatus shown, as long as the orderStatus is NOT 4 or 5. Everything should be ordered by orderCreationTime DESC (yes, that may not make sense without an orderTakenTime, but let's keep it that way), like:
OrderId 4
Order Name: PetPikachu
orderStatus = 1
CreationTime: 5am
OrderId 3
Order Name: A truck of hamsters
orderStatus = 3
CreationTime: 4am
OrderId 2
New order
Order Name: Macbuk bull
CreationTime: 3am
OrderId 1
Order Name: Jay Chou's Album
orderStatus = 2
CreationTime: 2am
I have this query written based on the knowledge I've learned:
SELECT * FROM orders A WHERE A.isClosed == '0' FULL OUTER JOIN orderTaken B WHERE B.userId = '4' AND (B.orderStatus<>'4' OR B.orderStatus<>'5') ORDER BY A.orderCreationTime DESC;
Apparently this query doesn't work, but I'm afraid to have a
ON A.orderId = B.orderId
since then the result would drop the new orders whose orderId hasn't been recorded in orderTaken B. I've also tried a NOT IN clause like
SELECT * FROM orders A WHERE A.isClosed = '0' AND A.orderId NOT IN (SELECT orderId FROM orderTaken B WHERE B.userId = '$userId' AND (B.orderStatus='4' OR B.orderStatus='5')) ORDER BY creationTime DESC;
This query works, but the result doesn't include the orderStatus field from orderTaken B. I thought about adding another JOIN orderTaken B clause to this query to get the fields from B, but I don't think that's a good way to write it.
I just want to somehow combine "NOT IN" and "FULL JOIN". Can anybody help me out? Thanks!

Just like @terje-d said, what you need is a LEFT JOIN. I've updated the queries with the original table names and fixed the $userId filter.
For all open orders and incomplete orders:
SELECT o.`orderId`,
o.`orderName`,
ot.`orderStatus`,
o.`orderCreationTime`
FROM orders o
LEFT JOIN orderTaken ot
ON o.orderId = ot.orderId
WHERE o.isClosed = 0
AND (
ot.orderId IS NULL
OR ot.orderStatus NOT IN (4,5)
)
ORDER BY o.`orderCreationTime` DESC
For all open orders and incomplete orders for a particular user
SELECT o.`orderId`,
o.`orderName`,
ot.`orderStatus`,
o.`orderCreationTime`
FROM orders o
LEFT JOIN orderTaken ot
ON o.orderId = ot.orderId
WHERE o.isClosed = 0
AND ( ot.orderStatus IS NULL
OR (
ot.userId = ?
AND ot.orderStatus NOT IN (4,5)
)
)
ORDER BY o.`orderCreationTime` DESC

You seem to want to find the records in orders that are not assigned to a user (i.e. do not have a related record in orderTaken) plus the ones that are assigned to a user, but where the orderStatus is not 4 or 5.
In that case a full outer join is not needed, since there will be no records in orderTaken without a related record in orders. A LEFT JOIN can be used to return all the records from orders; the ON clause brings in data from the related records in orderTaken, and the WHERE clause can then filter out orders taken by other users, or where orderStatus is 4 or 5:
SELECT o.*, ot.userID, ot.orderStatus
FROM orders o
LEFT JOIN orderTaken ot
ON ot.orderID = o.orderID
WHERE o.isClosed = 0
AND (ot.userID IS NULL OR (ot.userID = $userID AND ot.orderStatus NOT IN (4,5)))
ORDER BY o.orderCreationTime DESC
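One thing the queries above leave open is an order that another user has taken but the current user has not: filtering on userId in the WHERE clause drops that row instead of showing it as a new order. A possible variant, sketched here under assumptions, moves the user filter into the ON clause. The sample rows are hypothetical, and SQLite (through Python's sqlite3 module) stands in for MySQL so that the snippet is self-contained.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (orderId INTEGER PRIMARY KEY, orderName TEXT,
                     isClosed INTEGER, orderCreationTime TEXT);
CREATE TABLE orderTaken (userId INTEGER, orderId INTEGER, orderStatus INTEGER);
INSERT INTO orders VALUES
  (1, 'Jay Chou''s Album',   0, '02:00'),
  (2, 'Macbuk bull',         0, '03:00'),
  (3, 'A truck of hamsters', 0, '04:00'),
  (4, 'PetPikachu',          0, '05:00'),
  (5, 'Closed order',        1, '06:00'),
  (6, 'Canceled order',      0, '01:00');
INSERT INTO orderTaken VALUES
  (4, 1, 2),   -- user 4: shipped
  (4, 3, 3),   -- user 4: completed
  (4, 4, 1),   -- user 4: taken
  (4, 6, 4),   -- user 4: canceled, must be hidden
  (9, 2, 1);   -- a different user took order 2
""")

# The user filter lives in the ON clause, so rows of other users never match
# and the order still appears with a NULL orderStatus (a "new order").
QUERY = """
SELECT o.orderId, o.orderName, ot.orderStatus, o.orderCreationTime
FROM orders o
LEFT JOIN orderTaken ot
  ON ot.orderId = o.orderId AND ot.userId = ?
WHERE o.isClosed = 0
  AND (ot.orderId IS NULL OR ot.orderStatus NOT IN (4, 5))
ORDER BY o.orderCreationTime DESC
"""

rows = conn.execute(QUERY, (4,)).fetchall()
for row in rows:
    print(row)
```

With this data, order 2 is listed for user 4 with no orderStatus even though user 9 has taken it, while the closed order and the canceled order are left out.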

Related

MySQL - Marking duplicates from several table fields, as well as data from another table

I have two tables - one shows user purchases, and one shows a product id with its corresponding product type.
My client wants to make duplicate users inactive based on last name and email address, but wants to run the query by product type (based on what type of product they purchased), and only wants to include user_ids who haven't purchased paint (product ids 5 and 6). So the query will be run multiple times - once for all people who have purchased lawnmowers, and then for all people who have purchased leafblowers etc (and there will be some overlap between these two). No user_id that has purchased paint should be made inactive.
In terms of who should stay active among the duplicates, the one to stay active will be the one with the highest product id purchased (as products are released annually). If they have multiple records with the same product id, the record to stay active will be the one with most recent d_modified and t_modified.
I also want to shift the current value of 'inactive' to the 'previously_inactive' column, so that this can be easily reversed if need be.
Here is some sample table data
If the query was run by leafblower purchases, rows 5, 6, and 7 would be made inactive. This is the expected output:
If the query was run by lawnmower purchases, rows 1 and 2 would be made inactive. This would be the expected output:
If row 4 was not the most recent, it would still not be made inactive, as user_id 888 had bought paint (and we want to exclude these user_ids from being made inactive).
This is an un-optimised version of the query for 'leafblower' purchases (it is working, but will probably be too slow in the interface):
UPDATE test.user_purchases
SET inactive = 1
WHERE id IN (
SELECT z.id
FROM (SELECT * FROM test.user_purchases) z
WHERE z.product_id IN (
SELECT product_id
FROM test.products
WHERE product_type IN ("leafblower")
)
AND id NOT IN (
SELECT a.id
FROM (SELECT * FROM test.user_purchases) a
INNER JOIN (
SELECT r.surname, r.email
FROM (SELECT * FROM test.user_purchases) r
JOIN test.products s on r.product_id = s.product_id
WHERE s.product_type IN ("paint")
) b
WHERE a.surname = b.surname
AND a.email = b.email
)
AND id NOT IN (
SELECT MAX(z.id)
FROM (SELECT * FROM test.user_purchases) z
WHERE z.product_id IN (
SELECT product_id
FROM test.products
WHERE product_type IN ("leafblower")
)
AND id NOT IN (
SELECT a.id
FROM (SELECT * FROM test.user_purchases) a
INNER JOIN (
SELECT r.surname, r.email
FROM (SELECT * FROM test.user_purchases) r
JOIN test.products s on r.product_id = s.product_id
WHERE s.product_type IN ("paint")
) b
WHERE a.surname = b.surname
AND a.email = b.email
)
GROUP BY surname, email
)
)
Any suggestions on how I can streamline this query and optimise the speed of it would be much appreciated.

MySQL INNER JOIN USE

I have two tables:
+-----------------------+
| Tables_in_my_database |
+-----------------------+
| orders |
| orderTaken |
+-----------------------+
In orders, there are attributes like orderId, orderName.
In orderTaken, there are attributes like userId, orderId and orderStatus.
Basically, the mechanism of my project is running like:
A user with a unique userId will be able to take an order from the web page, where each order has its own unique orderId as well. After taken, the orderTaken table will record the userId, orderId and initially set orderStatus = 1.
Now, I want to select the orders taken by one user where the orderStatus = 1, but also I will need the orderName from my orders table to be shown. I've learned inner join and written a query as this:
SELECT * FROM orders
WHERE orders.orderId IN
(SELECT orderId FROM orderTaken
WHERE orderTaken.userId = '8' AND orderTaken.orderStatus = '1')
INNER JOIN orderTaken ON orders.orderId = orderTaken.orderId
However, MySQL keeps complaining about syntax error. I guess I cannot use inner join this way? Can anybody correct me? Thanks!
SELECT * FROM orders
INNER JOIN orderTaken ON orders.orderId = orderTaken.orderId
WHERE orderTaken.userId = '8' AND orderTaken.orderStatus = '1'
You have overcomplicated it. This will work.
I have left the quotes in for the userId since I assumed that's right, but if the columns are of data type INT then you should remove them.
If you want some reading on JOINs then see these links
Join terminology: inner, outer, semi, anti
W3Schools
Try like this
SELECT *
FROM (orders INNER JOIN orderTaken ON orders.orderId = orderTaken.orderId)
WHERE orders.orderId IN
(SELECT orderId
FROM orderTaken
WHERE orderTaken.userId = '8' AND orderTaken.orderStatus = '1')
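To see the plain INNER JOIN form in action, here is a minimal sketch. The three orders and their names are made-up sample data, and SQLite (through Python's sqlite3 module) is assumed as a stand-in for MySQL.

```python
import sqlite3

# Hypothetical sample data; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (orderId INTEGER PRIMARY KEY, orderName TEXT);
CREATE TABLE orderTaken (userId INTEGER, orderId INTEGER, orderStatus INTEGER);
INSERT INTO orders VALUES (1, 'Order A'), (2, 'Order B'), (3, 'Order C');
INSERT INTO orderTaken VALUES (8, 1, 1), (8, 2, 3), (7, 3, 1);
""")

# The JOIN comes first, then a single WHERE clause filters the joined rows.
rows = conn.execute("""
SELECT o.orderId, o.orderName, ot.orderStatus
FROM orders o
INNER JOIN orderTaken ot ON ot.orderId = o.orderId
WHERE ot.userId = ? AND ot.orderStatus = ?
""", (8, 1)).fetchall()

print(rows)
```

Only the order that user 8 holds with orderStatus = 1 comes back, together with its orderName from the orders table.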

How to select users without specific one to many rows in MySQL

Consider the following data set:
users table:
id (int) email (string)
1 first@example.com
2 second@example.com
order_items table:
id (int) user_id (int) generation (string)
1 1 '11'
2 1 '12'
2 1 '12.50'
3 1 '16.00'
4 2 '11'
5 2 '12'
UPDATED question
How can I select users who have at least one order_item but no order_items with generation 16.00?
So:
email
second@example.com
1) Returning users who don't have an order item with generation 16, including users with no orders at all.
Assuming you have some kind of id column in order_items table:
select u.* from users u
left outer join order_items oi on (u.id = oi.user_id and oi.generation = 16)
where oi.id is null;
Otherwise use whatever primary key you have in order_items in the where condition to be NULL.
Updated to include answer for the question in comment
2) Returning users who don't have an order item with generation 16 but have at least one order.
select distinct u.* from users u
left outer join order_items oi16 on (u.id = oi16.user_id and oi16.generation = 16)
join order_items oiother on (u.id = oiother.user_id and oiother.generation != 16)
where oi16.id is null;
We do the filtering with a second (normal) join, which only returns users for whom it finds matching rows in the order_items table.
Here we need the distinct because the second join multiplies the rows by the number of other order items the user has.
Alternatively you can also do a count or sum like this:
select u.*, count(distinct oiother.id) from users u
left outer join order_items oi16 on (u.id = oi16.user_id and oi16.generation = 16)
join order_items oiother on (u.id = oiother.user_id and oiother.generation != 16)
where oi16.id is null
group by u.id;
This also gives you how many other order items each returned user has. Or omit the count completely and use group by just to return distinct rows.
You can use NOT EXISTS() like this:
SELECT * FROM Users u
WHERE NOT EXISTS(SELECT 1 FROM order_items o
WHERE o.user_id = u.id
AND o.generation = 16)
That checks if there is a record for this user with order.generation = 16, and if there isn't it selects him.
Or not in()
SELECT * FROM Users u
WHERE u.id NOT IN(SELECT user_id FROM order_items o
WHERE o.generation = 16)
That selects the list of users who have order.generation = 16, and then selects every id except them.
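The updated question also requires at least one order item, which NOT EXISTS alone does not enforce. A rough sketch of pairing it with EXISTS follows. It runs on SQLite through Python's sqlite3 module as a stand-in for MySQL, the third user is a hypothetical addition with no items, and the generation is compared as the string '16.00' because SQLite does not coerce text to numbers the way MySQL does.

```python
import sqlite3

# Sample rows follow the question; user 3 is a hypothetical user with no items.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE order_items (id INTEGER, user_id INTEGER, generation TEXT);
INSERT INTO users VALUES
  (1, 'first@example.com'),
  (2, 'second@example.com'),
  (3, 'third@example.com');
INSERT INTO order_items VALUES
  (1, 1, '11'), (2, 1, '12'), (2, 1, '12.50'), (3, 1, '16.00'),
  (4, 2, '11'), (5, 2, '12');
""")

# EXISTS requires at least one item; NOT EXISTS rules out generation 16.00.
emails = [row[0] for row in conn.execute("""
SELECT u.email
FROM users u
WHERE EXISTS (SELECT 1 FROM order_items oi WHERE oi.user_id = u.id)
  AND NOT EXISTS (SELECT 1 FROM order_items oi
                  WHERE oi.user_id = u.id AND oi.generation = '16.00')
ORDER BY u.id
""")]

print(emails)
```

User 1 is excluded because of the 16.00 item and user 3 because there are no items at all, so only the second user remains.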
Following query should give you the desired output:
*update*
changed query as per the new result format in the question
As we want the data only from generation table, join with user table is not needed anymore. Here's the updated query:
select id, generation
from mytable where id not in (
select id from mytable
where generation = 16
group by id
);
Here is the SQL fiddle for it.

MYSQL: How to join two tables using Inner join and then calculating the total number from the second table for the following examples

I am stuck with the following requirement and I am finding it difficult to crack the query for it.
Consider a table customer with the following fields
id signup_date first_payment_date
10 2015-03-20 null
11 2015-03-20 null
12 2015-03-20 null
13 2015-03-20 null
14 2015-05-23 null
15 2015-05-23 null
Consider another table transaction_history
id product_name
10 vod trial
10 vod trial
11 vod trial
12 vod trial
12 vod
13 vod trial
14 vod trial
15 vod trial
15 vod trial
I need to pick the id from the customer table, where first_payment_date is null, and look it up in the transaction_history table, grouping by signup_date.
Then I need to check whether this id is present in transaction_history and whether it has at least 1 entry with product_name = "vod trial". If it does, it is a row in the result I want.
At the end I need to calculate the total number of ids from transaction_history that have at least one row where product_name = "vod trial", broken down by the signup_date in the customer table.
I wrote a query in the following manner:
SELECT
ts.guid,
cs.signup_date,
(SELECT
COUNT(ts2.guid)
FROM
transaction_history ts2
WHERE
cs.guid = ts2.guid
AND ts2.product_name = "vod trial"
HAVING COUNT(ts2.guid) = 1) AS count_ts_guid
FROM
customer AS cs,
transaction_history AS ts
WHERE
cs.guid = ts.guid
AND cs.first_payment_date IS NULL;
But with the above query I am not able to calculate the total count per signup_date.
Would be great if someone could help me out.
Sample result:
date new trials
2015-03-20 2
2015-05-23 1
I am not sure I fully understand. You want customers without first_payment_date that have a trial entry in the transaction table?
select *
from customer
where first_payment_date is null
and id in (select id from transaction_history where product_name = 'vod trial');
Okay, from your last comment it seems, you want customers that have no trial entry in the transaction table, too. And you want to display them with their trial transaction count. So:
select signup_date,
(
select count(*)
from transaction_history th
where th.product_name = 'vod trial'
and th.id = c.id
)
from customer c
where first_payment_date is null;
If you even want to group by date, then aggregate:
select signup_date,
sum((
select count(*)
from transaction_history th
where th.product_name = 'vod trial'
and th.id = c.id
))
from customer c
where first_payment_date is null
group by signup_date;
Next try: Join all customers and transactions, such as to only get customers present in the transactions table. Then aggregate.
select c.signup_date, count(*)
from customer c
join transaction_history th on th.id = c.id and th.product_name = 'vod trial'
where c.first_payment_date is null
group by c.signup_date;
Or do you want this:
select c.signup_date, count(case when th.product_name = 'vod trial' then 1 end)
from customer c
join transaction_history th on th.id = c.id
where c.first_payment_date is null
group by c.signup_date;
I'd better make this a separate answer. You want to find customers that have only one entry in transaction_history and that entry must be 'vod trial'. So read the transaction table, group by customer id and count. Check your criteria with HAVING. Then join the found IDs with the customer table and group by date.
select c.signup_date, count(*)
from customer c
join
(
select id
from transaction_history
group by id
having count(*) = 1
and min(product_name) = 'vod trial'
) t on t.id = c.id
group by c.signup_date;
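As a quick check, the last query can be run against the sample rows from the question. This is only a sketch: SQLite (through Python's sqlite3 module) stands in for MySQL, and a first_payment_date IS NULL filter is added on the assumption that the question's condition should still apply.

```python
import sqlite3

# Sample rows copied from the question; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, signup_date TEXT,
                       first_payment_date TEXT);
CREATE TABLE transaction_history (id INTEGER, product_name TEXT);
INSERT INTO customer VALUES
  (10, '2015-03-20', NULL), (11, '2015-03-20', NULL),
  (12, '2015-03-20', NULL), (13, '2015-03-20', NULL),
  (14, '2015-05-23', NULL), (15, '2015-05-23', NULL);
INSERT INTO transaction_history VALUES
  (10, 'vod trial'), (10, 'vod trial'), (11, 'vod trial'),
  (12, 'vod trial'), (12, 'vod'),       (13, 'vod trial'),
  (14, 'vod trial'), (15, 'vod trial'), (15, 'vod trial');
""")

# The derived table keeps ids whose single transaction is a 'vod trial';
# the outer query counts those customers per signup date.
rows = conn.execute("""
SELECT c.signup_date, COUNT(*)
FROM customer c
JOIN (
  SELECT id
  FROM transaction_history
  GROUP BY id
  HAVING COUNT(*) = 1 AND MIN(product_name) = 'vod trial'
) t ON t.id = c.id
WHERE c.first_payment_date IS NULL   -- assumption: keep the question's filter
GROUP BY c.signup_date
ORDER BY c.signup_date
""").fetchall()

print(rows)
```

Customers 11, 13 and 14 are the only ones with exactly one transaction, so the counts come out as 2 for 2015-03-20 and 1 for 2015-05-23, matching the sample result in the question.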

How can I join a table in MySQL, deriving one boolean result based on some criteria?

I have three tables which I am joining in MySQL. For simplicity's sake, let's call them CUSTOMERS, ORDERS and NOTES. My goal is to create one query which is used in a Windows application to generate a list of customers, the number of orders they have, and then a flag to indicate if a human has written some notes.
Before I added the requirement of notes, my query to do this was:
SELECT c.customer `Customer`, COUNT(o.id) `Order Qty`
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.customer ASC;
Which produces:
Customer Order Qty
-------- ---------
Acme Corp. 2
Bee Inc. 3
I'd like to add a boolean column to indicate if any human-entered notes exist. (There are automatically generated notes, which are ignored for this query.)
The NOTES table is basically:
id int(3)
customer_id int(3)
note_datetime datetime
is_auto tinyint(1)
note text
Any customer may have any number of notes (one to many relationship), and those notes may be automatically generated (is_auto is true) or not.
I'd like to add a column that simply returns true if any of the notes for that customer are not automatic, letting users know that there are human-entered notes to read.
For this example, ACME (1) has three notes, one of which is not automatic. Bee (2) has two automatic notes:
id customer_id is_auto
-- ----------- -------
1 1 1
2 1 1
3 1 0
4 2 1
5 2 1
I've come up with the following query:
SELECT
c.customer `Customer`,
COUNT(o.id) `Order Qty`,
IF(SUM(IF(n.is_auto, 0, 1) > 0), TRUE, FALSE) `Human Notes`
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
LEFT JOIN notes n ON n.customer_id = c.id
GROUP BY c.customer ASC;
This produces:
Customer Order Qty Human Notes
-------- --------- -----------
ACME Corp. 6 TRUE # (6 orders, should be 2)
Bee Inc. 6 FALSE # (6 orders, should be 3)
The problem is that rows from the notes table inflate the number of orders artificially (2 orders multiplied by 3 notes; and 3 orders multiplied by 2 notes). I understand that this is because the LEFT JOIN creates duplicate rows of orders which match multiple rows from the NOTES table.
I have not been able to determine a proper way to rewrite this query. What would be a better way to join the notes table so that I get the desired boolean, but avoid extra rows which alter the order count?
Try just using exists instead:
SELECT c.customer, COUNT(o.id) as `Order Qty`,
(CASE WHEN EXISTS (SELECT 1 FROM notes n WHERE n.customer_id = c.id and not n.is_auto)
THEN TRUE ELSE FALSE
END) as `Human Notes`
FROM customers c LEFT JOIN
orders o
ON o.customer_id = c.id
GROUP BY c.customer ASC;
Otherwise, the number of notes messes up the count for the orders -- as you discovered.
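The difference between the two approaches can be reproduced with the sample data from the question. The sketch below assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, groups by c.id as well as c.customer, and reports the flag as 1 or 0 rather than TRUE or FALSE.

```python
import sqlite3

# Sample data from the question: Acme has 2 orders and 3 notes (one human),
# Bee has 3 orders and 2 automatic notes. SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE notes (id INTEGER PRIMARY KEY, customer_id INTEGER, is_auto INTEGER);
INSERT INTO customers VALUES (1, 'Acme Corp.'), (2, 'Bee Inc.');
INSERT INTO orders VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 2);
INSERT INTO notes VALUES (1, 1, 1), (2, 1, 1), (3, 1, 0), (4, 2, 1), (5, 2, 1);
""")

# Joining both child tables multiplies orders by notes for each customer.
inflated = conn.execute("""
SELECT c.customer, COUNT(o.id)
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
LEFT JOIN notes n ON n.customer_id = c.id
GROUP BY c.id, c.customer
ORDER BY c.customer
""").fetchall()

# Checking notes with a correlated EXISTS adds no rows to the join.
fixed = conn.execute("""
SELECT c.customer,
       COUNT(o.id) AS order_qty,
       EXISTS (SELECT 1 FROM notes n
               WHERE n.customer_id = c.id AND n.is_auto = 0) AS human_notes
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.customer
ORDER BY c.customer
""").fetchall()

print(inflated)
print(fixed)
```

The first query reports 6 orders for both customers, while the EXISTS version reports 2 and 3 with the human-notes flag set only for Acme.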