I'm trying to run a query that shows all the members of a customer that do not belong to any of their groups. I'm comparing two tables that share a common CustomerID and using their member IDs to show which members are not in the second table, CustomerGroupMember.
Here is a sample of the two tables.
Customer Member
id | CustomerID | First | Last
---------------------------------
123 | 1234 | Jim | Sample
129 | 1234 | Julie | Clark
137 | 1234 | Jack | Thomas
289 | 1234 | Sue | Smith
Customer Group Member
MemberID | CustomerID | GroupID
---------------------------------
129 | 1234 | 19
289 | 1234 | 20
Below is my query which does not seem to produce any results. I'd like it to output anyone not found in that Customer Group Member table. In the table examples above I'd see an output of members 123 and 137.
SELECT CustomerMember.* FROM CustomerMember
LEFT JOIN
CustomerGroupMember ON CustomerMember.id = CustomerGroupMember.MemberID
WHERE
CustomerMember.CustomerID = '1234' AND CustomerGroupMember.CustomerID = '1234'
AND CustomerGroupMember.MemberID IS NULL
With the second condition (CustomerGroupMember.CustomerID = '1234') you are converting your LEFT JOIN into an INNER JOIN. All rows which have NULLs in the CustomerGroupMember columns will be filtered out, since NULL cannot be equal to '1234'. You need to move that condition into the ON clause:
SELECT CustomerMember.* FROM CustomerMember
LEFT JOIN
CustomerGroupMember
ON CustomerMember.id = CustomerGroupMember.MemberID
AND CustomerGroupMember.CustomerID = '1234'
WHERE
CustomerMember.CustomerID = '1234'
AND CustomerGroupMember.MemberID IS NULL;
http://rextester.com/DLTQ86207
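To see the difference between the two queries without a MySQL server, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here, and the helper name run and the constants BROKEN and FIXED are made up for this illustration; the SQL itself is the same as above apart from selecting only the id column.

```python
import sqlite3

# Filter on CustomerGroupMember in WHERE: unmatched rows carry NULL there and are dropped.
BROKEN = """
    SELECT CustomerMember.id
    FROM CustomerMember
    LEFT JOIN CustomerGroupMember
           ON CustomerMember.id = CustomerGroupMember.MemberID
    WHERE CustomerMember.CustomerID = ?
      AND CustomerGroupMember.CustomerID = ?
      AND CustomerGroupMember.MemberID IS NULL
    ORDER BY CustomerMember.id
"""

# Same filter moved into ON: unmatched members survive with NULLs and pass the IS NULL test.
FIXED = """
    SELECT CustomerMember.id
    FROM CustomerMember
    LEFT JOIN CustomerGroupMember
           ON CustomerMember.id = CustomerGroupMember.MemberID
          AND CustomerGroupMember.CustomerID = ?
    WHERE CustomerMember.CustomerID = ?
      AND CustomerGroupMember.MemberID IS NULL
    ORDER BY CustomerMember.id
"""

def run(query, customer_id):
    """Load the sample data from the question and return the member ids the query yields."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE CustomerMember (id INTEGER, CustomerID TEXT, First TEXT, Last TEXT);
        CREATE TABLE CustomerGroupMember (MemberID INTEGER, CustomerID TEXT, GroupID INTEGER);
        INSERT INTO CustomerMember VALUES
            (123, '1234', 'Jim', 'Sample'), (129, '1234', 'Julie', 'Clark'),
            (137, '1234', 'Jack', 'Thomas'), (289, '1234', 'Sue', 'Smith');
        INSERT INTO CustomerGroupMember VALUES (129, '1234', 19), (289, '1234', 20);
    """)
    rows = conn.execute(query, (customer_id, customer_id)).fetchall()
    conn.close()
    return [row[0] for row in rows]

print(run(BROKEN, "1234"))  # no rows
print(run(FIXED, "1234"))   # members 123 and 137
```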
I have a simple table like this:
group | name | price
1 | john |
2 | mike |
3 | paul |
1 | sean |
4 | jack |
2 | brad |
5 | mick |
1 | bill |
4 | chad |
I have two different price values: 100EUR for the first member of a group and 50EUR for each additional member of that same group.
Detailed explanation: if a group has only one member, that member gets a price of 100EUR. If a group has multiple members, the first member gets a price of 100EUR, and all additional members of that same group get a price of 50EUR. Any number of groups may be added later.
The result should be like this:
group | name | price
1 | john | 100
2 | mike | 100
3 | paul | 100
1 | sean | 50
4 | jack | 100
2 | brad | 50
5 | mick | 100
1 | bill | 50
4 | chad | 50
I'd need a query which would be able to INSERT/UPDATE all missing price fields whenever I manually run it.
Thank you in advance for looking into that matter.
After a lot of trial and error I found a fully functional solution, based on daviid's clever method. The issue is that MySQL will not update a table that the same statement also selects from in a subquery. However, a self-join (JOIN or INNER JOIN) against a derived table can be used instead. I also had to add an auto-increment id to the table, so the final table structure is:
id | group_id | name | price
1 | 1 | john |
2 | 2 | mike |
3 | 3 | paul |
4 | 1 | sean |
5 | 4 | jack |
6 | 2 | brad |
7 | 5 | mick |
8 | 1 | bill |
9 | 4 | chad |
---
SET SQL_SAFE_UPDATES=0;
UPDATE table_name
SET price = 50;
UPDATE table_name AS a
JOIN
( SELECT MIN(id) AS id -- lowest id = first member of each group
FROM table_name
GROUP BY group_id
) AS b
ON a.id = b.id
SET a.price = 100;
Thanks also to Cody and Barmar for usable hints...
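The two-step update above (everything to 50, then the first member of each group to 100) can be tried out with a small sketch using Python's sqlite3 module. It rests on a few assumptions: SQLite stands in for MySQL, the table is called members (a made-up name), and because SQLite has no UPDATE ... JOIN the second step uses WHERE id IN (SELECT MIN(id) ...) instead of the self-join. The idea of treating the lowest id in each group as the first member is the same.

```python
import sqlite3

def assign_prices():
    """Price the first (lowest-id) member of each group at 100 and everyone else at 50."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE members (id INTEGER PRIMARY KEY, group_id INTEGER, name TEXT, price INTEGER)"
    )
    conn.executemany(
        "INSERT INTO members (group_id, name) VALUES (?, ?)",
        [(1, "john"), (2, "mike"), (3, "paul"), (1, "sean"), (4, "jack"),
         (2, "brad"), (5, "mick"), (1, "bill"), (4, "chad")],
    )
    # Step 1: every member starts at the "additional member" price.
    conn.execute("UPDATE members SET price = 50")
    # Step 2: the lowest id per group is treated as that group's first member.
    conn.execute(
        "UPDATE members SET price = 100 "
        "WHERE id IN (SELECT MIN(id) FROM members GROUP BY group_id)"
    )
    rows = conn.execute("SELECT name, price FROM members ORDER BY id").fetchall()
    conn.close()
    return rows

for name, price in assign_prices():
    print(name, price)
```

Running both statements again after new rows are inserted recomputes every price, which matches the "run it manually whenever needed" requirement.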
A partial answer: you can GROUP BY your "group" field and tack on a HAVING COUNT(group) > 1 to determine if that group has more than 1 member.
That is, to see all groups with more than one member it would look like:
SELECT
`group`
FROM table_name
GROUP BY `group`
HAVING COUNT(*) > 1
That will just tell you which groups have multiple members. Without another way to ensure ordering you cannot tell which member is "first" in their group and thus should be priced at 100 and all others priced at 50.
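As a rough check of the GROUP BY ... HAVING idea, the sketch below runs it against the sample rows in an in-memory SQLite database via Python's sqlite3 module. SQLite is a stand-in for MySQL, the table name members is invented for the example, and the column is written as "group" in double quotes because GROUP is a reserved word (MySQL would use backticks).

```python
import sqlite3

def groups_with_multiple_members():
    """Return the group numbers that have more than one member."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE members ("group" INTEGER, name TEXT, price INTEGER)')
    conn.executemany(
        'INSERT INTO members ("group", name) VALUES (?, ?)',
        [(1, "john"), (2, "mike"), (3, "paul"), (1, "sean"), (4, "jack"),
         (2, "brad"), (5, "mick"), (1, "bill"), (4, "chad")],
    )
    rows = conn.execute("""
        SELECT "group"
        FROM members
        GROUP BY "group"
        HAVING COUNT(*) > 1
        ORDER BY "group"
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]

print(groups_with_multiple_members())  # groups 1, 2 and 4
```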
The following queries are not tested and might contain syntax errors. But they are good enough to understand the principle. There are many possible ways to achieve your result.
Here is my take: I would make use of one query to UPDATE the price on every row and set it to 50 whether it is the first group member or not. >table_name<, of course, needs to be changed to the name of your mentioned table.
UPDATE >table_name<
SET price = 50;
Then I would take care of each individual group and the respective first member by running the following query. Adapt the query to each group by changing the >groupId<.
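UPDATE >table_name<
SET price = 100
WHERE id = (
-- the extra derived table lets MySQL read from the table it is updating
SELECT id FROM (
SELECT id
FROM >table_name<
WHERE `group` = >groupId<
ORDER BY id
LIMIT 1
) AS first_member
);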
Take a look at the nested query: it queries the table for all members of a single group, orders them by id in ascending order and returns only the id of each member. Applying LIMIT to the query reduces the result to just the first group member's id. That id can then be used by the outer query to update the price and set it to 100.
But be careful: If you insert/delete (new) members with an id that is not just counting up, this query might select a "new first member".
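The per-group approach can be sketched as a loop that runs the second UPDATE once for every group. This is only an illustration under some assumptions: it uses Python's sqlite3 module with SQLite standing in for MySQL, the table name members is invented, and an id column is assumed to exist as the ordering key.

```python
import sqlite3

def price_groups_one_by_one():
    """Set everyone to 50, then promote the lowest-id member of each group to 100."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE members (id INTEGER PRIMARY KEY, "group" INTEGER, name TEXT, price INTEGER)'
    )
    conn.executemany(
        'INSERT INTO members ("group", name) VALUES (?, ?)',
        [(1, "john"), (2, "mike"), (3, "paul"), (1, "sean"), (4, "jack"),
         (2, "brad"), (5, "mick"), (1, "bill"), (4, "chad")],
    )
    conn.execute("UPDATE members SET price = 50")
    # Collect the group ids first, then run one UPDATE per group.
    group_ids = [row[0] for row in conn.execute('SELECT DISTINCT "group" FROM members').fetchall()]
    for group_id in group_ids:
        conn.execute(
            """
            UPDATE members
            SET price = 100
            WHERE id = (SELECT id FROM members WHERE "group" = ? ORDER BY id LIMIT 1)
            """,
            (group_id,),
        )
    prices = dict(conn.execute("SELECT name, price FROM members").fetchall())
    conn.close()
    return prices

print(price_groups_one_by_one())
```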
I have a Purchases table, where I'm trying to select all rows where first name, surname and email are duplicates (for all 3).
Purchases table:
| purchase_id | product_id | user_id | firstname | surname | email |
| ------------- | -----------| ------------- | ----------- | --------- | ----------- |
| 1 | 1 | 777 | Sally | Smith | s#gmail.com |
| 2 | 2 | 777 | Sally | Smith | s#gmail.com |
| 3 | 3 | 777 | Sally | Smith | s#gmail.com |
| 4 | 1 | 888 | Bob | Smith | b#gmail.com |
Further to this, each product ID corresponds to a product type in a 'Products' table, and I'm trying to filter by 'lawnmower' purchases (so only product ID 1 & 2)
Products table:
| product_type | product_id |
| ------------- | -----------|
| lawnmower | 1 |
| lawnmower | 2 |
| leafblower | 3 |
I'm hoping to write a query that will return all purchases of the 'lawnmower' type where first name, last name, and email are duplicates (so would return the first two rows of the Purchases table).
This is where my query is at so far, however it's not returning accurate data (e.g. I know I have around 350 duplicates and it's returning 10,000 rows):
SELECT t. *
FROM database_name.purchases t
JOIN (
SELECT firstname, surname, email, count( * ) AS NumDuplicates
FROM database_name.purchases
GROUP BY firstname, surname, email
HAVING NumDuplicates >1
)tsum ON t.firstname = tsum.firstname
AND t.surname = tsum.surname
AND t.email = tsum.email
INNER JOIN database_name.products p2 ON t.product_id = p2.product_id
WHERE p2.product_type = 'lawnmower'
Just wanting to know what I need to tweak in my query syntax.
You know that you should be returning Sally Smith. Create a table from the results of your query above. Then SELECT * from that table WHERE firstname = 'Sally' AND surname = 'Smith'. See if you can figure out where you are going wrong based on that. This will help you debug these types of issues yourself in the future.
Your inner SELECT does not filter on the product type. It gets all customers who have purchased any two items. Then you join it to purchases and therefore also get the purchases of customers who have bought any two items and, possibly only one, lawnmower. Add a filter on the product type in the subquery too:
SELECT t.*
FROM database_name.purchases t
INNER JOIN (SELECT purchases.user_id
FROM database_name.purchases
INNER JOIN database_name.products
ON products.product_id = purchases.product_id
WHERE products.product_type = 'lawnmower'
GROUP BY purchases.user_id
HAVING count(*) > 1) s
ON t.user_id = s.user_id
INNER JOIN database_name.products p
ON t.product_id = p.product_id
WHERE p.product_type = 'lawnmower';
Your schema is also problematic because it is denormalised. firstname, surname and email depend on user_id, so they shouldn't be stored in purchases; only user_id should be. (Note that I grouped and joined on user_id alone, which is enough.) Likewise, product_type would be better stored as an ID referencing a separate product type table.
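To check the filtered-subquery idea against the sample data, here is a small sketch using Python's sqlite3 module with an in-memory database. SQLite stands in for MySQL, the database_name. prefix is dropped, and the function name is made up; the query groups on user_id only, as in the answer.

```python
import sqlite3

def repeat_lawnmower_purchases():
    """Return purchase_ids of lawnmower purchases by users who bought more than one lawnmower."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE purchases (purchase_id INTEGER, product_id INTEGER, user_id INTEGER,
                                firstname TEXT, surname TEXT, email TEXT);
        CREATE TABLE products (product_type TEXT, product_id INTEGER);
        INSERT INTO purchases VALUES
            (1, 1, 777, 'Sally', 'Smith', 's#gmail.com'),
            (2, 2, 777, 'Sally', 'Smith', 's#gmail.com'),
            (3, 3, 777, 'Sally', 'Smith', 's#gmail.com'),
            (4, 1, 888, 'Bob',   'Smith', 'b#gmail.com');
        INSERT INTO products VALUES ('lawnmower', 1), ('lawnmower', 2), ('leafblower', 3);
    """)
    rows = conn.execute("""
        SELECT t.purchase_id
        FROM purchases t
        INNER JOIN (SELECT pu.user_id
                    FROM purchases pu
                    INNER JOIN products pr ON pr.product_id = pu.product_id
                    WHERE pr.product_type = 'lawnmower'   -- filter inside the subquery too
                    GROUP BY pu.user_id
                    HAVING COUNT(*) > 1) s
                ON t.user_id = s.user_id
        INNER JOIN products p ON t.product_id = p.product_id
        WHERE p.product_type = 'lawnmower'
        ORDER BY t.purchase_id
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]

print(repeat_lawnmower_purchases())  # purchases 1 and 2
```

Sally's leafblower purchase (3) and Bob's single lawnmower purchase (4) are both excluded.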
I'm having issues with a select query and can't quite figure out how to fix. I have two tables:
TABLE_students
|--------|------------|--------|
| STU_ID | EMAIL | NAME |
|--------|------------|--------|
| 1 | a#e.com | Bob |
| 2 | b#e.com | Joe |
| 3 | c#e.com | Tim |
--------------------------------
TABLE_scores
|--------|------------|-------------|--------|
| SRE_ID | STU_ID | DATE | SCORE |
|--------|------------|-------------|--------|
| 91 | 2 | 2018-04-03 | 78 |
| 92 | 2 | 2018-04-06 | 89 |
| 93 | 3 | 2018-04-03 | 67 |
| 94 | 3 | 2018-04-06 | 72 |
| 95 | 3 | 2018-04-07 | 81 |
----------------------------------------------
I'm trying to select data from both tables but have a few requirements. I need to select the student even if they don't have a score in the scores table. I also only want the latest score record.
The query below only returns those students that have a score, and it also returns duplicates, for a total of 5 rows (since there are five scores). What I want is for the query to return three rows (one for each student) and their latest score value (or NULL if they don't have a score):
SELECT students.NAME, scores.SCORE FROM TABLE_students as students, TABLE_scores AS scores WHERE students.STU_ID = scores.STU_ID;
I'm having difficulty figuring out how to pull all students regardless of whether they have a score and how to pull only the latest score if they do have one.
Thank you!
This is a variation of the greatest-n-per-group question, which is common on Stack Overflow.
I would do this with a couple of joins:
SELECT s.NAME, c1.DATE, c1.SCORE
FROM students AS s
LEFT JOIN scores AS c1 ON c1.STU_ID = s.STU_ID
LEFT JOIN scores AS c2 ON c2.STU_ID = s.STU_ID
AND (c2.DATE > c1.DATE OR c2.DATE = c1.DATE AND c2.SRE_ID > c1.SRE_ID)
WHERE c2.STU_ID IS NULL;
If c2.STU_ID is null, it means the LEFT JOIN matched no rows that have a greater date (or greater SRE_ID in case of a tie) than the row in c1. This means the row in c1 must be the most recent, because there is no other row that is more recent.
P.S.: Please learn the JOIN syntax, and avoid "comma-style" joins. JOIN has been standard since 1992.
P.P.S.: I removed the superfluous "TABLE_" prefix from your table names. You don't need to use the table name to remind yourself that it's a table! :-)
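The double LEFT JOIN can be tried on the sample rows with the following sketch, which uses Python's sqlite3 module and an in-memory database. SQLite stands in for MySQL, the table names students and scores follow the answer, and the function name and the final ORDER BY are additions for the example.

```python
import sqlite3

def latest_score_per_student():
    """Return (name, date, score) per student, with None where no score exists."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE students (STU_ID INTEGER, EMAIL TEXT, NAME TEXT);
        CREATE TABLE scores (SRE_ID INTEGER, STU_ID INTEGER, DATE TEXT, SCORE INTEGER);
        INSERT INTO students VALUES (1, 'a#e.com', 'Bob'), (2, 'b#e.com', 'Joe'), (3, 'c#e.com', 'Tim');
        INSERT INTO scores VALUES
            (91, 2, '2018-04-03', 78), (92, 2, '2018-04-06', 89),
            (93, 3, '2018-04-03', 67), (94, 3, '2018-04-06', 72), (95, 3, '2018-04-07', 81);
    """)
    rows = conn.execute("""
        SELECT s.NAME, c1.DATE, c1.SCORE
        FROM students AS s
        LEFT JOIN scores AS c1 ON c1.STU_ID = s.STU_ID
        -- c2 looks for any row of the same student that is newer than c1
        LEFT JOIN scores AS c2 ON c2.STU_ID = s.STU_ID
             AND (c2.DATE > c1.DATE OR (c2.DATE = c1.DATE AND c2.SRE_ID > c1.SRE_ID))
        WHERE c2.STU_ID IS NULL      -- keep c1 only when nothing newer exists
        ORDER BY s.STU_ID
    """).fetchall()
    conn.close()
    return rows

for row in latest_score_per_student():
    print(row)
```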
You could use a correlated subquery:
SELECT *,
(SELECT score FROM TABLE_scores sc
WHERE sc.stu_id = s.stu_id ORDER BY DATE DESC LIMIT 1) AS score
FROM TABLE_students s
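A quick way to try the correlated subquery on the sample rows is the sketch below, again using Python's sqlite3 module with SQLite as a stand-in for MySQL. The function name is made up, and the SRE_ID tie-breaker in the ORDER BY is an addition that the original query does not have.

```python
import sqlite3

def latest_score_via_subquery():
    """Return (name, latest score) per student using a correlated scalar subquery."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TABLE_students (STU_ID INTEGER, EMAIL TEXT, NAME TEXT);
        CREATE TABLE TABLE_scores (SRE_ID INTEGER, STU_ID INTEGER, DATE TEXT, SCORE INTEGER);
        INSERT INTO TABLE_students VALUES
            (1, 'a#e.com', 'Bob'), (2, 'b#e.com', 'Joe'), (3, 'c#e.com', 'Tim');
        INSERT INTO TABLE_scores VALUES
            (91, 2, '2018-04-03', 78), (92, 2, '2018-04-06', 89),
            (93, 3, '2018-04-03', 67), (94, 3, '2018-04-06', 72), (95, 3, '2018-04-07', 81);
    """)
    rows = conn.execute("""
        SELECT s.NAME,
               (SELECT sc.SCORE
                FROM TABLE_scores sc
                WHERE sc.STU_ID = s.STU_ID          -- correlated on the outer student
                ORDER BY sc.DATE DESC, sc.SRE_ID DESC
                LIMIT 1) AS score
        FROM TABLE_students s
        ORDER BY s.STU_ID
    """).fetchall()
    conn.close()
    return rows

print(latest_score_via_subquery())
```

The subquery yields NULL (None in Python) for a student with no score rows, so every student still appears once.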
I have two tables with users, one in an old format and one in a new format. I want to match the users in the old format to a separate table, then exclude all users who also show up in the new-format table. My data is like this:
Table newUsers:
+----+-------+-------+----------+
| id | oldid | first | last |
+----+-------+-------+----------+
| 1 | 10 | John | Kennedy |
| 2 | 66 | Mitch | Kupchak |
+----+-------+-------+----------+
Table posts:
+----+---------+
| id | user_id |
+----+---------+
| 1 | 10 |
| 1 | 66 |
| 1 | 88 |
| 2 | 88 |
| 2 | 28 |
| 3 | 10 |
+----+---------+
Table oldUsers:
+----+----------+-------+----------+
| id | username | first | last |
+----+----------+-------+----------+
| 10 | A | John | Kennedy |
| 66 | B | Mitch | Kupchak |
| 88 | C | Dale | Earnhardt|
+----+----------+-------+----------+
Result wanted:
+----+----------+-------+----------+
| id | username | first | last |
+----+----------+-------+----------+
| 88 | C | Dale | Earnhardt|
+----+----------+-------+----------+
I want to select my result by specifying: posts.id = 1 and posts.user_id = oldUsers.id and newUsers.oldid != oldUsers.id so that I only receive oldUser.id equaling 88 because he wasn't in the newUsers list.
I have tried all kinds of JOINS and SUBQUERIES. I keep getting all of the results and not the results minus corresponding entries in the newUsers table.
select * from oldusers where id in
(select id from
(select id from oldusers where id in
(select distinct user_id from posts where id=1)) as t
where id not in (select oldid from newusers));
Here is a way to do it
select
o.* from oldUsers o
left join newUsers n on o.id = n.oldid
left join posts p on n.oldid = p.user_id or o.id = p.user_id
where n.id is null and p.id= 1;
For better performance add the following indexes
alter table newUsers add index oldid_idx(oldid);
alter table posts add index user_post_idx (id,user_id);
I ended up finding my answer on my own and then came back here to find that others had tried too. Abhik's code did work, but was too inefficient to use. I kept playing with my own code and IS NULL until I found something that was much more efficient.
select o.* from posts p, oldUsers o
LEFT JOIN newUsers n ON o.id = n.oldid
WHERE p.user_id = o.id AND p.id = 1 AND n.id IS NULL
Executes in .0044 seconds, which is something I can use on a production site.
With the indexes from the previous answer added, it now executes in .001x seconds, so I'm definitely going with my own code.
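For reference, the anti-join can be checked against the sample tables with the sketch below, which uses Python's sqlite3 module and an in-memory database. SQLite stands in for MySQL, the function name is made up, and the comma join is written as an explicit JOIN, which should be equivalent here. No timing claims are implied.

```python
import sqlite3

def old_users_without_new_account(post_id):
    """Return old users who posted under post_id and have no row in newUsers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE newUsers (id INTEGER, oldid INTEGER, first TEXT, last TEXT);
        CREATE TABLE posts (id INTEGER, user_id INTEGER);
        CREATE TABLE oldUsers (id INTEGER, username TEXT, first TEXT, last TEXT);
        INSERT INTO newUsers VALUES (1, 10, 'John', 'Kennedy'), (2, 66, 'Mitch', 'Kupchak');
        INSERT INTO posts VALUES (1, 10), (1, 66), (1, 88), (2, 88), (2, 28), (3, 10);
        INSERT INTO oldUsers VALUES
            (10, 'A', 'John', 'Kennedy'), (66, 'B', 'Mitch', 'Kupchak'), (88, 'C', 'Dale', 'Earnhardt');
    """)
    rows = conn.execute("""
        SELECT o.*
        FROM posts p
        JOIN oldUsers o ON p.user_id = o.id
        LEFT JOIN newUsers n ON o.id = n.oldid
        WHERE p.id = ?
          AND n.id IS NULL          -- no matching new-format user
        ORDER BY o.id
    """, (post_id,)).fetchall()
    conn.close()
    return rows

print(old_users_without_new_account(1))  # only Dale Earnhardt (id 88)
```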
I am having issues trying to combine DISTINCT and ORDER BY. I have a Users table with the attributes id and name, and a Purchases table with the attributes id, user_id, date_purchased and returned.
I want to retrieve all unique Users that have a returned Purchase sorted by date_purchased.
Here is some sample data
Users
id | name
---+-----------
1 | Bob
2 | John
3 | Bill
4 | Frank
5 | Fred
6 | Al
Purchases
id | user_id | startdate | returned
-----+------------------+------------+---------------
100 | 1 | 2015-02-06 | true
101 | 1 | 2015-01-06 | true
102 | 1 | 2015-02-05 | false
103 | 2 | 2015-02-05 | false
104 | 2 | 2015-02-05 | false
105 | 3 | 2015-01-05 | true
106 | 3 | 2015-02-04 | true
107 | 4 | 2015-01-07 | true
108 | 5 | 2015-02-05 | false
109 | 6 | 2015-02-07 | false
110 | 6 | 2015-01-05 | true
The result should be the following user id's 1,3,4,6
Here is the query I wrote
SELECT DISTINCT (id) FROM (
SELECT users.id as id, purchases.startdate FROM
users INNER JOIN purchases on users.id=purchases.user_id
WHERE returned=true
ORDER BY startdate )
This query correctly returns the results; however, they are in the incorrect order. Reading other answers I found that you can't maintain the subquery ordering. I tried to move the ordering to the outer query; however, startdate would then also need to be present in the select list, and that is not what I want.
Just remove the subquery and use GROUP BY:
SELECT u.id as id
FROM users u INNER JOIN
purchases p
on u.id = p.user_id
WHERE returned = true
GROUP BY u.id
ORDER BY MIN(startdate);
You can only rely on the result set being in a particular order when you use ORDER BY for the outermost SELECT. There is no guarantee of ordering in any other case.
As a note: ordering usually does work with a subquery (sadly, because many people look at the results from some queries and generalize to all of them). The problem in this case is the DISTINCT. It rearranges the data (i.e. sorts it) to remove duplicates.
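The GROUP BY with ORDER BY MIN(startdate) approach can be tried on the sample data with this sketch, which uses Python's sqlite3 module and an in-memory database. Several things are assumptions made for the example: SQLite stands in for the real database, returned is stored as 1/0, the join uses purchases.user_id, and u.id is added as a tie-breaker so that users 3 and 6 (both first returned on 2015-01-05) come out in a fixed order.

```python
import sqlite3

def users_with_returns_by_first_date():
    """Return ids of users with a returned purchase, ordered by their earliest returned purchase."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER, name TEXT);
        CREATE TABLE purchases (id INTEGER, user_id INTEGER, startdate TEXT, returned INTEGER);
        INSERT INTO users VALUES (1, 'Bob'), (2, 'John'), (3, 'Bill'),
                                 (4, 'Frank'), (5, 'Fred'), (6, 'Al');
        INSERT INTO purchases VALUES
            (100, 1, '2015-02-06', 1), (101, 1, '2015-01-06', 1), (102, 1, '2015-02-05', 0),
            (103, 2, '2015-02-05', 0), (104, 2, '2015-02-05', 0), (105, 3, '2015-01-05', 1),
            (106, 3, '2015-02-04', 1), (107, 4, '2015-01-07', 1), (108, 5, '2015-02-05', 0),
            (109, 6, '2015-02-07', 0), (110, 6, '2015-01-05', 1);
    """)
    rows = conn.execute("""
        SELECT u.id
        FROM users u
        INNER JOIN purchases p ON u.id = p.user_id
        WHERE p.returned = 1
        GROUP BY u.id                      -- one row per user, so no DISTINCT is needed
        ORDER BY MIN(p.startdate), u.id    -- u.id only breaks ties between equal dates
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]

print(users_with_returns_by_first_date())
```

Because the ORDER BY sits on the outermost SELECT, the order of the result is guaranteed rather than incidental.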
Gordon's script gives you the data you want, but to answer your question of how to maintain a subquery's order, you can pull the column you want to order by out of the subquery and then order by it.
SELECT DISTINCT (id), innerTable.startdate FROM (
SELECT users.id as id, purchases.startdate FROM
users INNER JOIN purchases on users.id=purchases.user_id
WHERE returned=true) as innerTable
ORDER BY innerTable.startdate