SQL: Keep subquery order in outer query - mysql

I am having trouble combining DISTINCT and ORDER BY. I have a Users table with the attributes id and name, and a Purchases table with the attributes id, user_id, date_purchased (shown as startdate in the sample data below) and returned.
I want to retrieve all unique Users that have a returned Purchase, sorted by date_purchased.
Here is some sample data
Users
id | name
---+-----------
1 | Bob
2 | John
3 | Bill
4 | Frank
5 | Fred
6 | Al
Purchases
id | user_id | startdate | returned
-----+------------------+------------+---------------
100 | 1 | 2015-02-06 | true
101 | 1 | 2015-01-06 | true
102 | 1 | 2015-02-05 | false
103 | 2 | 2015-02-05 | false
104 | 2 | 2015-02-05 | false
105 | 3 | 2015-01-05 | true
106 | 3 | 2015-02-04 | true
107 | 4 | 2015-01-07 | true
108 | 5 | 2015-02-05 | false
109 | 6 | 2015-02-07 | false
110 | 6 | 2015-01-05 | true
The result should be the following user ids: 1, 3, 4, 6
Here is the query I wrote
SELECT DISTINCT (id) FROM (
SELECT users.id as id, purchases.startdate FROM
users INNER JOIN purchases on users.id=purchases.user_id
WHERE returned=true
ORDER BY startdate ) AS t
This query returns the correct users, but in the wrong order. Reading other answers, I found that you can't rely on the subquery's ordering being preserved. I tried moving the ORDER BY to the outer query, but then startdate would also have to appear in the select list, and that is not what I want.

Just remove the subquery and use GROUP BY:
SELECT u.id as id
FROM users u INNER JOIN
purchases p
on u.id = p.user_id
WHERE p.returned = true
GROUP BY u.id
ORDER BY MIN(p.startdate);
You can only rely on the result set being in a particular order when you use ORDER BY for the outermost SELECT. There is no guarantee of ordering in any other case.
As a note: ordering in a subquery often does appear to work (sadly, because many people look at the results from some queries and generalize to all of them). The problem in this case is the DISTINCT. It rearranges the data (i.e. sorts it) to remove duplicates.
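As a quick check of the GROUP BY approach, here is a minimal sketch that loads the sample data into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, the returned flag is stored as 1/0, the join uses purchases.user_id (the foreign key listed in the question), and the extra u.id sort key is an addition of this sketch so that ties on the date come out in a fixed order.

```python
import sqlite3

# SQLite (in-memory) stands in for MySQL; booleans are stored as 1/0.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE purchases (id INTEGER PRIMARY KEY, user_id INTEGER,
                        startdate TEXT, returned INTEGER);
INSERT INTO users VALUES (1,'Bob'),(2,'John'),(3,'Bill'),
                         (4,'Frank'),(5,'Fred'),(6,'Al');
INSERT INTO purchases VALUES
  (100,1,'2015-02-06',1),(101,1,'2015-01-06',1),(102,1,'2015-02-05',0),
  (103,2,'2015-02-05',0),(104,2,'2015-02-05',0),(105,3,'2015-01-05',1),
  (106,3,'2015-02-04',1),(107,4,'2015-01-07',1),(108,5,'2015-02-05',0),
  (109,6,'2015-02-07',0),(110,6,'2015-01-05',1);
""")

# One row per user, ordered by that user's earliest returned purchase.
# The u.id tie-breaker is an assumption added for deterministic output.
ordered_ids = [row[0] for row in conn.execute("""
    SELECT u.id
    FROM users u
    INNER JOIN purchases p ON u.id = p.user_id
    WHERE p.returned = 1
    GROUP BY u.id
    ORDER BY MIN(p.startdate), u.id
""")]
print(ordered_ids)  # [3, 6, 1, 4]
```

Users 3 and 6 share the earliest date (2015-01-05), which is why a tie-breaker matters if the order has to be reproducible.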

Gordon's script gives you the data you want, but to answer your question of how to maintain a subquery's order: you can pull the column you want to order by out of the subquery and order by it in the outer query. Note that DISTINCT then applies to the (id, startdate) pair, so a user with several returned purchases appears once per distinct date.
SELECT DISTINCT innerTable.id, innerTable.startdate FROM (
SELECT users.id as id, purchases.startdate FROM
users INNER JOIN purchases on users.id=purchases.user_id
WHERE returned=true) as innerTable
ORDER BY innerTable.startdate
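To see what the outer DISTINCT does once startdate is part of the select list, here is a small sketch on the same sample data, again using an in-memory SQLite database from Python as a stand-in for MySQL (an assumption of this sketch, as are the user_id join column and the extra id sort key used to break ties).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE purchases (id INTEGER PRIMARY KEY, user_id INTEGER,
                        startdate TEXT, returned INTEGER);
INSERT INTO users VALUES (1,'Bob'),(2,'John'),(3,'Bill'),
                         (4,'Frank'),(5,'Fred'),(6,'Al');
INSERT INTO purchases VALUES
  (100,1,'2015-02-06',1),(101,1,'2015-01-06',1),(102,1,'2015-02-05',0),
  (103,2,'2015-02-05',0),(104,2,'2015-02-05',0),(105,3,'2015-01-05',1),
  (106,3,'2015-02-04',1),(107,4,'2015-01-07',1),(108,5,'2015-02-05',0),
  (109,6,'2015-02-07',0),(110,6,'2015-01-05',1);
""")

# DISTINCT de-duplicates whole (id, startdate) rows, not ids alone.
pairs = conn.execute("""
    SELECT DISTINCT innerTable.id, innerTable.startdate
    FROM (SELECT users.id AS id, purchases.startdate AS startdate
          FROM users
          INNER JOIN purchases ON users.id = purchases.user_id
          WHERE purchases.returned = 1) AS innerTable
    ORDER BY innerTable.startdate, innerTable.id
""").fetchall()

ids = [user_id for user_id, _ in pairs]
print(ids)  # [3, 6, 1, 4, 3, 1]
```

The rows are ordered by date as requested, but users 1 and 3 each show up twice, so this form only fits when one row per purchase date is acceptable.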

Different price for additional members of the same group

I have a simple table like this:
group | name | price
1 | john |
2 | mike |
3 | paul |
1 | sean |
4 | jack |
2 | brad |
5 | mick |
1 | bill |
4 | chad |
I have two different price values: 100EUR for the first member of a group and 50EUR for every additional member of that same group.
Detailed explanation: if a group has only one member, that member gets a price of 100EUR. If a group has multiple members, the first member gets a price of 100EUR, and all additional members of that same group get a price of 50EUR. Any number of groups may be added later.
The result should be like this:
group | name | price
1 | john | 100
2 | mike | 100
3 | paul | 100
1 | sean | 50
4 | jack | 100
2 | brad | 50
5 | mick | 100
1 | bill | 50
4 | chad | 50
I'd need a query which would be able to INSERT/UPDATE all missing price fields whenever I manually run it.
Thank you in advance for looking into that matter.
After a lot of trial and error I found a fully functional solution, based on daviid's clever method. The issue with MySQL is that it won't update a table while selecting from that same table in a subquery. However, a self-join (JOIN or INNER JOIN) can be used instead in this case. I also had to add an auto-increment id to the table, so the final table structure is:
id | group_id | name | price
1 | 1 | john |
2 | 2 | mike |
3 | 3 | paul |
4 | 1 | sean |
5 | 4 | jack |
6 | 2 | brad |
7 | 5 | mick |
8 | 1 | bill |
9 | 4 | chad |
---
SET SQL_SAFE_UPDATES=0;
UPDATE table_name
SET price = 50;
UPDATE table_name AS a
JOIN
( SELECT MIN(id) AS id
FROM table_name
GROUP BY group_id
) AS b
ON a.id = b.id
SET a.price = 100;
Thanks also to Cody and Barmar for usable hints...
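The two-step update can be tried out with a small sketch like the one below. It uses an in-memory SQLite database from Python rather than MySQL, so the second step is written as a portable IN (SELECT MIN(id) ...) filter instead of MySQL's UPDATE ... JOIN. The table name members and the rule "lowest id in a group is the first member" are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (id INTEGER PRIMARY KEY, group_id INTEGER,
                      name TEXT, price INTEGER);
INSERT INTO members (id, group_id, name) VALUES
  (1,1,'john'),(2,2,'mike'),(3,3,'paul'),(4,1,'sean'),(5,4,'jack'),
  (6,2,'brad'),(7,5,'mick'),(8,1,'bill'),(9,4,'chad');
""")

# Step 1: give every row the additional-member price.
conn.execute("UPDATE members SET price = 50")

# Step 2: the lowest id in each group is treated as the first member.
conn.execute("""
    UPDATE members SET price = 100
    WHERE id IN (SELECT MIN(id) FROM members GROUP BY group_id)
""")

prices = dict(conn.execute("SELECT name, price FROM members ORDER BY id"))
print(prices)
```

Because both statements rewrite every row, running them again after new members are added simply recomputes all prices.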
A partial answer: you can GROUP BY your "group" field and tack on a HAVING COUNT(`group`) > 1 to determine if that group has more than 1 member. (GROUP and TABLE are reserved words in MySQL, so identifiers with those names must be quoted with backticks.)
That is, to see all groups with more than one member it would look like:
SELECT
`group`
FROM `table`
GROUP BY `group`
HAVING COUNT(`group`) > 1
That will just tell you which groups have multiple members. Without another way to ensure ordering you cannot tell which member is "first" in their group and thus should be priced at 100 and all others priced at 50.
The following queries are not tested and might contain syntax errors. But they are good enough to understand the principle. There are many possible ways to achieve your result.
Here is my take: I would make use of one query to UPDATE the price on every row and set it to 50 whether it is the first group member or not. >table_name<, of course, needs to be changed to the name of your mentioned table.
UPDATE >table_name<
SET price = 50;
Then I would take care of each individual group and the respective first member by running the following query. Adapt the query to each group by changing the >groupId<. The inner SELECT is wrapped in a derived table because MySQL does not allow a subquery to select directly from the table being updated.
UPDATE >table_name<
SET price = 100
WHERE id = (
SELECT id FROM (
SELECT id
FROM >table_name<
WHERE `group` = >groupId<
ORDER BY id
LIMIT 1
) AS first_member
);
Take a look at the nested query: it selects the ids of all members of one group, orders them in ascending order and, because of LIMIT 1, returns only the first member's id. That id is then used by the outer query to update the price and set it to 100.
But be careful: If you insert/delete (new) members with an id that is not just counting up, this query might select a "new first member".

How to avoid calculating rows caused by the join inside SUM()?

Here is my database schema simplified:
// wallet
+----+--------+---------+
| id | credit | user_id |
+----+--------+---------+
| 1 | 1000 | 1 |
| 2 | 1500 | 2 |
+----+--------+---------+
// where_to_pay_ability
+----+-------------+-----------+
| id | business_id | wallet_id |
+----+-------------+-----------+
| 1 | 5 | 1 |
| 2 | 4 | 1 |
+----+-------------+-----------+
And this is the current query I have:
select sum(credit)
from wallet w
left join where_to_pay_ability wtpa on w.id = wtpa.wallet_id
where user_id = 1
It returns 2000, because there are two matching rows in the where_to_pay_ability table. That's the wrong credit for me. I want each row of the wallet table to be summed only once, so the expected result is 1000.
How can I do that?
It should be noted, I can do that left join with a sub-query that is GROUP BYed wallet_id (Or DISTINCT). But, I need to have those business_ids.
So, I need a condition inside the SUM() to avoid calculating rows caused by the join.
You would need to aggregate before joining:
select sum(w.credit)
from wallet w left join
(select wtpa.wallet_id, count(*) as cnt
from where_to_pay_ability wtpa
group by wtpa.wallet_id
) wtpa
on w.id = wtpa.wallet_id
where w.user_id = 1;
In your particular example, though, you could use max() because there is only one row.
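The effect of aggregating before the join can be seen in a short sketch on the sample rows. It runs against an in-memory SQLite database from Python as a stand-in for MySQL (an assumption of this sketch), and the variable names naive and fixed are only illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wallet (id INTEGER PRIMARY KEY, credit INTEGER, user_id INTEGER);
CREATE TABLE where_to_pay_ability (id INTEGER PRIMARY KEY,
                                   business_id INTEGER, wallet_id INTEGER);
INSERT INTO wallet VALUES (1,1000,1),(2,1500,2);
INSERT INTO where_to_pay_ability VALUES (1,5,1),(2,4,1);
""")

# Joining first repeats wallet 1 once per matching row, inflating the sum.
naive = conn.execute("""
    SELECT SUM(w.credit)
    FROM wallet w
    LEFT JOIN where_to_pay_ability wtpa ON w.id = wtpa.wallet_id
    WHERE w.user_id = 1
""").fetchone()[0]

# Collapsing the right-hand side to one row per wallet keeps the sum intact.
fixed = conn.execute("""
    SELECT SUM(w.credit)
    FROM wallet w
    LEFT JOIN (SELECT wallet_id, COUNT(*) AS cnt
               FROM where_to_pay_ability
               GROUP BY wallet_id) wtpa ON w.id = wtpa.wallet_id
    WHERE w.user_id = 1
""").fetchone()[0]

print(naive, fixed)  # 2000 1000
```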

MySQL Select from Multiple Tables and most recent record

I'm having issues with a select query and can't quite figure out how to fix. I have two tables:
TABLE_students
|--------|------------|--------|
| STU_ID | EMAIL | NAME |
|--------|------------|--------|
| 1 | a#e.com | Bob |
| 2 | b#e.com | Joe |
| 3 | c#e.com | Tim |
--------------------------------
TABLE_scores
|--------|------------|-------------|--------|
| SRE_ID | STU_ID | DATE | SCORE |
|--------|------------|-------------|--------|
| 91 | 2 | 2018-04-03 | 78 |
| 92 | 2 | 2018-04-06 | 89 |
| 93 | 3 | 2018-04-03 | 67 |
| 94 | 3 | 2018-04-06 | 72 |
| 95 | 3 | 2018-04-07 | 81 |
----------------------------------------------
I'm trying to select data from both tables but have a few requirements. I need to select the student even if they don't have a score in the scores table. I also only want the latest scores record.
The query below only returns those students that have a score, and it returns duplicates: a total of 5 rows (since there are five scores). What I want is for the query to return three rows (one for each student) and their latest score value (or NULL if they don't have a score):
SELECT students.NAME, scores.SCORE FROM TABLE_students as students, TABLE_scores AS scores WHERE students.STU_ID = scores.STU_ID;
I'm having difficulty figuring out how to pull all students regardless of whether they have a score and how to pull only the latest score if they do have one.
Thank you!
This is a variation of the greatest-n-per-group question, which is common on Stack Overflow.
I would do this with a couple of joins:
SELECT s.NAME, c1.DATE, c1.SCORE
FROM students AS s
LEFT JOIN scores AS c1 ON c1.STU_ID = s.STU_ID
LEFT JOIN scores AS c2 ON c2.STU_ID = s.STU_ID
AND (c2.DATE > c1.DATE OR c2.DATE = c1.DATE AND c2.SRE_ID > c1.SRE_ID)
WHERE c2.STU_ID IS NULL;
If c2.STU_ID is null, it means the LEFT JOIN matched no rows that have a greater date (or greater SRE_ID in case of a tie) than the row in c1. This means the row in c1 must be the most recent, because there is no other row that is more recent.
P.S.: Please learn the JOIN syntax, and avoid "comma-style" joins. JOIN has been standard since 1992.
P.P.S.: I removed the superfluous "TABLE_" prefix from your table names. You don't need to use the table name to remind yourself that it's a table! :-)
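Here is a minimal sketch of the double LEFT JOIN on the sample data, run against an in-memory SQLite database from Python as a stand-in for MySQL (an assumption of this sketch). The ORDER BY s.STU_ID and the extra parentheses around the tie-break condition are additions for readable, repeatable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (STU_ID INTEGER PRIMARY KEY, EMAIL TEXT, NAME TEXT);
CREATE TABLE scores (SRE_ID INTEGER PRIMARY KEY, STU_ID INTEGER,
                     DATE TEXT, SCORE INTEGER);
INSERT INTO students VALUES (1,'a#e.com','Bob'),(2,'b#e.com','Joe'),
                            (3,'c#e.com','Tim');
INSERT INTO scores VALUES
  (91,2,'2018-04-03',78),(92,2,'2018-04-06',89),(93,3,'2018-04-03',67),
  (94,3,'2018-04-06',72),(95,3,'2018-04-07',81);
""")

# c2 looks for a newer score than c1; keeping only rows where no such c2
# exists leaves each student's latest score (or NULLs if there is none).
latest = conn.execute("""
    SELECT s.NAME, c1.SCORE
    FROM students AS s
    LEFT JOIN scores AS c1 ON c1.STU_ID = s.STU_ID
    LEFT JOIN scores AS c2 ON c2.STU_ID = s.STU_ID
      AND (c2.DATE > c1.DATE
           OR (c2.DATE = c1.DATE AND c2.SRE_ID > c1.SRE_ID))
    WHERE c2.STU_ID IS NULL
    ORDER BY s.STU_ID
""").fetchall()

print(latest)  # [('Bob', None), ('Joe', 89), ('Tim', 81)]
```

Bob has no scores at all, so both joins come back empty for him and he is still returned, with a NULL score.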
You could use a correlated subquery:
SELECT *,
(SELECT score FROM TABLE_scores sc
WHERE sc.stu_id = s.stu_id ORDER BY DATE DESC LIMIT 1) AS score
FROM TABLE_students s

Comparing two tables in MySQL based on a common ID

I'm trying to run a query that shows all the members of a customer that do not belong in one of their groups. I'm comparing two tables that have a common CustomerID and using their member id's to show which members are not in the second table, CustomerGroupMember.
Here is a sample of the two tables.
Customer Member
id | CustomerID | First | Last
---------------------------------
123 | 1234 | Jim | Sample
129 | 1234 | Julie | Clark
137 | 1234 | Jack | Thomas
289 | 1234 | Sue | Smith
Customer Group Member
MemberID | CustomerID | GroupID
---------------------------------
129 | 1234 | 19
289 | 1234 | 20
Below is my query which does not seem to produce any results. I'd like it to output anyone not found in that Customer Group Member table. In the table examples above I'd see an output of members 123 and 137.
SELECT CustomerMember.* FROM CustomerMember
LEFT JOIN
CustomerGroupMember ON CustomerMember.id = CustomerGroupMember.MemberID
WHERE
CustomerMember.CustomerID = '1234' AND CustomerGroupMember.CustomerID = '1234'
AND CustomerGroupMember.MemberID IS NULL
With the second condition (CustomerGroupMember.CustomerID = '1234') you are converting your LEFT JOIN to an INNER JOIN. All rows which have NULLs in the CustomerGroupMember columns will be filtered out, since NULL can not be equal to '1234'. You need to move that condition into the ON clause:
SELECT CustomerMember.* FROM CustomerMember
LEFT JOIN
CustomerGroupMember
ON CustomerMember.id = CustomerGroupMember.MemberID
AND CustomerGroupMember.CustomerID = '1234'
WHERE
CustomerMember.CustomerID = '1234'
AND CustomerGroupMember.MemberID IS NULL;
http://rextester.com/DLTQ86207
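The difference between filtering in WHERE and filtering in ON can be reproduced with a short sketch. It uses an in-memory SQLite database from Python as a stand-in for MySQL and leaves out the name columns for brevity (both are assumptions of this sketch); the variable names broken and fixed are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustomerMember (id INTEGER PRIMARY KEY, CustomerID TEXT);
CREATE TABLE CustomerGroupMember (MemberID INTEGER, CustomerID TEXT,
                                  GroupID INTEGER);
INSERT INTO CustomerMember VALUES (123,'1234'),(129,'1234'),
                                  (137,'1234'),(289,'1234');
INSERT INTO CustomerGroupMember VALUES (129,'1234',19),(289,'1234',20);
""")

# Filter on the right-hand table in WHERE: unmatched rows have NULL there,
# so the comparison removes them and nothing is left.
broken = [r[0] for r in conn.execute("""
    SELECT cm.id
    FROM CustomerMember cm
    LEFT JOIN CustomerGroupMember cgm ON cm.id = cgm.MemberID
    WHERE cm.CustomerID = '1234'
      AND cgm.CustomerID = '1234'
      AND cgm.MemberID IS NULL
""")]

# Same filter moved into ON: unmatched members survive the join.
fixed = [r[0] for r in conn.execute("""
    SELECT cm.id
    FROM CustomerMember cm
    LEFT JOIN CustomerGroupMember cgm
      ON cm.id = cgm.MemberID AND cgm.CustomerID = '1234'
    WHERE cm.CustomerID = '1234'
      AND cgm.MemberID IS NULL
    ORDER BY cm.id
""")]

print(broken, fixed)  # [] [123, 137]
```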

Getting all items without open end date

I need to solve following problem using (My)SQL, given is this example table:
id | item | start | end
1 | 100 | 2015-01-01 | 2015-01-14
2 | 100 | 2015-01-01 | NULL
3 | 101 | 2015-03-01 | 2015-04-15
4 | 101 | 2015-04-17 | 2015-04-22
5 | 101 | 2015-04-27 | 2015-05-11
I need a query that gives me all items where there is no open end date. So from the above I'd expect to get 101.
I tried it with GROUP BY and some sub-selects, but the results were not what I expected. Any help on this?
You can do this using group by and having:
select item
from example
group by item
having count(end) = count(*);
count() with a column name counts the number of non-NULL values in that column. If this equals the number of rows in the group, then no values are NULL.
You could also use:
having sum(end is null) = 0
EDIT:
I should add that the following might be faster, assuming you have the right indexes and a table for items:
select i.item
from items i
where not exists (select 1
from example e
where i.item = e.item and e.end is null
);
For performance, you want an index on example(item, end).
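Both variants can be compared on the sample rows with a sketch like this one. It runs on an in-memory SQLite database from Python as a stand-in for MySQL, quotes the end column defensively because END is an SQL keyword, and creates a hypothetical items table, since the NOT EXISTS form assumes one exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE example (id INTEGER PRIMARY KEY, item INTEGER,
                      start TEXT, "end" TEXT);
INSERT INTO example VALUES
  (1,100,'2015-01-01','2015-01-14'),(2,100,'2015-01-01',NULL),
  (3,101,'2015-03-01','2015-04-15'),(4,101,'2015-04-17','2015-04-22'),
  (5,101,'2015-04-27','2015-05-11');
-- Hypothetical lookup table with one row per item.
CREATE TABLE items (item INTEGER PRIMARY KEY);
INSERT INTO items VALUES (100),(101);
""")

# COUNT(column) skips NULLs, so equality with COUNT(*) means no open end.
by_having = [r[0] for r in conn.execute("""
    SELECT item FROM example
    GROUP BY item
    HAVING COUNT("end") = COUNT(*)
""")]

# Anti-join form: keep items with no row whose end date is NULL.
by_not_exists = [r[0] for r in conn.execute("""
    SELECT i.item FROM items i
    WHERE NOT EXISTS (SELECT 1 FROM example e
                      WHERE e.item = i.item AND e."end" IS NULL)
""")]

print(by_having, by_not_exists)  # [101] [101]
```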