I'm having issues with a select query and can't quite figure out how to fix it. I have two tables:
TABLE_students
|--------|------------|--------|
| STU_ID | EMAIL | NAME |
|--------|------------|--------|
| 1 | a@e.com | Bob |
| 2 | b@e.com | Joe |
| 3 | c@e.com | Tim |
--------------------------------
TABLE_scores
|--------|------------|-------------|--------|
| SRE_ID | STU_ID | DATE | SCORE |
|--------|------------|-------------|--------|
| 91 | 2 | 2018-04-03 | 78 |
| 92 | 2 | 2018-04-06 | 89 |
| 93 | 3 | 2018-04-03 | 67 |
| 94 | 3 | 2018-04-06 | 72 |
| 95 | 3 | 2018-04-07 | 81 |
----------------------------------------------
I'm trying to select data from both tables, but I have a few requirements. I need to select each student even if they don't have a score in the scores table. I also only want the latest scores record.
The query below only returns students that have a score, and it also returns duplicates, for a total of 5 rows (since there are five scores). What I want is for the query to return three rows (one for each student) with their latest score value (or NULL if they don't have a score):
SELECT students.NAME, scores.SCORE FROM TABLE_students as students, TABLE_scores AS scores WHERE students.STU_ID = scores.STU_ID;
I'm having difficulty figuring out how to pull all students regardless of whether they have a score and how to pull only the latest score if they do have one.
Thank you!
This is a variation of the greatest-n-per-group question, which is common on Stack Overflow.
I would do this with a couple of joins:
SELECT s.NAME, c1.DATE, c1.SCORE
FROM students AS s
LEFT JOIN scores AS c1 ON c1.STU_ID = s.STU_ID
LEFT JOIN scores AS c2 ON c2.STU_ID = s.STU_ID
AND (c2.DATE > c1.DATE OR c2.DATE = c1.DATE AND c2.SRE_ID > c1.SRE_ID)
WHERE c2.STU_ID IS NULL;
If c2.STU_ID is null, it means the LEFT JOIN matched no rows that have a greater date (or greater SRE_ID in case of a tie) than the row in c1. This means the row in c1 must be the most recent, because there is no other row that is more recent.
P.S.: Please learn the JOIN syntax, and avoid "comma-style" joins. JOIN has been standard since 1992.
P.P.S.: I removed the superfluous "TABLE_" prefix from your table names. You don't need to use the table name to remind yourself that it's a table! :-)
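To check the self-join approach against the sample data, here is a minimal sketch that runs it through Python's built-in sqlite3 module. Using SQLite instead of MySQL is an assumption made only so the example is self-contained; the EMAIL column is left out because the query does not use it, and the ORDER BY and the extra parentheses are added only to make the output order and the precedence explicit.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (STU_ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE scores (SRE_ID INTEGER PRIMARY KEY, STU_ID INTEGER, DATE TEXT, SCORE INTEGER);
    INSERT INTO students VALUES (1, 'Bob'), (2, 'Joe'), (3, 'Tim');
    INSERT INTO scores VALUES
        (91, 2, '2018-04-03', 78), (92, 2, '2018-04-06', 89),
        (93, 3, '2018-04-03', 67), (94, 3, '2018-04-06', 72),
        (95, 3, '2018-04-07', 81);
""")

# c1 is the candidate score row; c2 looks for any later row of the same student.
# Keeping only rows where no c2 exists leaves the latest score per student,
# and the first LEFT JOIN keeps students that have no score at all.
latest_scores = conn.execute("""
    SELECT s.NAME, c1.DATE, c1.SCORE
    FROM students AS s
    LEFT JOIN scores AS c1 ON c1.STU_ID = s.STU_ID
    LEFT JOIN scores AS c2 ON c2.STU_ID = s.STU_ID
        AND (c2.DATE > c1.DATE OR (c2.DATE = c1.DATE AND c2.SRE_ID > c1.SRE_ID))
    WHERE c2.STU_ID IS NULL
    ORDER BY s.STU_ID
""").fetchall()

for row in latest_scores:
    print(row)
```

With the sample rows this should print one row per student, with Bob showing None for both date and score.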
You could use a correlated subquery:
SELECT *,
(SELECT score FROM TABLE_scores sc
WHERE sc.stu_id = s.stu_id ORDER BY DATE DESC LIMIT 1) AS score
FROM TABLE_students s
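A minimal sketch of the correlated-subquery variant, again run through Python's sqlite3 module (SQLite is an assumption here, standing in for MySQL). The extra `SRE_ID DESC` tie-breaker is not in the answer above; it is added on the assumption that two scores on the same date should resolve to the higher SRE_ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_students (STU_ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE TABLE_scores (SRE_ID INTEGER PRIMARY KEY, STU_ID INTEGER, DATE TEXT, SCORE INTEGER);
    INSERT INTO TABLE_students VALUES (1, 'Bob'), (2, 'Joe'), (3, 'Tim');
    INSERT INTO TABLE_scores VALUES
        (91, 2, '2018-04-03', 78), (92, 2, '2018-04-06', 89),
        (93, 3, '2018-04-03', 67), (94, 3, '2018-04-06', 72),
        (95, 3, '2018-04-07', 81);
""")

# The scalar subquery runs once per student and yields NULL when the
# student has no rows in TABLE_scores.
rows = conn.execute("""
    SELECT s.NAME,
           (SELECT sc.SCORE
            FROM TABLE_scores sc
            WHERE sc.STU_ID = s.STU_ID
            ORDER BY sc.DATE DESC, sc.SRE_ID DESC  -- tie-breaker is an assumption
            LIMIT 1) AS SCORE
    FROM TABLE_students s
    ORDER BY s.STU_ID
""").fetchall()

print(rows)
```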
Related
I have a table of students with temporary test values like this:
Table students
+----+-------------+-------+-----------+
| id | section_id | age | name |
+----+-------------+-------+-----------+
| 1 | 1 | 18 | Justin |
+----+-------------+-------+-----------+
| 2 | 2 | 14 | Jillian |
+----+-------------+-------+-----------+
| 3 | 2 | 16 | Cherry |
+----+-------------+-------+-----------+
| 4 | 3 | 19 | Ronald |
+----+-------------+-------+-----------+
| 5 | 3 | 21 | Marie |
+----+-------------+-------+-----------+
| 6 | 3 | 21 | Arthur |
+----+-------------+-------+-----------+
I want to query the table to get the student with the maximum age in each section. However, if two students in a section share the maximum age, the result should return the student with the smallest id.
Return:
+----+------------+-----+--------+
| id | section_id | age | name |
+----+------------+-----+--------+
| 1 | 1 | 18 | Justin |
+----+------------+-----+--------+
| 3 | 2 | 16 | Cherry |
+----+------------+-----+--------+
| 5 | 3 | 21 | Marie |
+----+------------+-----+--------+
I tried this query:
SELECT ANY_VALUE(id), ANY_VALUE(section_id), MAX(age), ANY_VALUE(name) FROM
(SELECT id, section_id, age, name FROM students ORDER BY id) as X
GROUP BY section_id
Unfortunately, there are cases where the returned id and name do not belong to the row with the maximum age.
I have on my end:
sql_mode = only_full_group_by
and I don't have the privilege to change that, hence the ANY_VALUE function, but I have no idea how to use it correctly.
This will do what you want.
It starts by finding the maximum age per section (including duplicates).
Then it joins those results with the minimum id per section (to eliminate duplicates).
And finally, select all fields for the matching id and section combinations.
SELECT s3.*
FROM students s3
INNER JOIN (
SELECT MIN(s2.id) AS id, s2.section_id
FROM students s2
INNER JOIN (
SELECT s1.section_id, MAX(s1.age) AS age
FROM students s1
GROUP BY s1.section_id
) s1 USING (section_id, age)
GROUP BY s2.section_id
) s2 USING (id, section_id);
Working SQL fiddle: https://www.db-fiddle.com/f/aezgAYM6A5KnXykceB7At1/0
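The three steps described above can be looked at one at a time. The sketch below builds the query up in stages using Python's sqlite3 module (SQLite in place of MySQL is an assumption; the ORDER BY clauses are only there to make the printed output stable).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, section_id INTEGER, age INTEGER, name TEXT);
    INSERT INTO students VALUES
        (1, 1, 18, 'Justin'), (2, 2, 14, 'Jillian'), (3, 2, 16, 'Cherry'),
        (4, 3, 19, 'Ronald'), (5, 3, 21, 'Marie'), (6, 3, 21, 'Arthur');
""")

# Step 1: maximum age per section.
STEP1 = """
    SELECT s1.section_id, MAX(s1.age) AS age
    FROM students s1
    GROUP BY s1.section_id
"""

# Step 2: among the students at that maximum age, the smallest id per section.
STEP2 = f"""
    SELECT MIN(s2.id) AS id, s2.section_id
    FROM students s2
    INNER JOIN ({STEP1}) s1 USING (section_id, age)
    GROUP BY s2.section_id
"""

# Step 3: all columns for the matching (id, section_id) pairs.
STEP3 = f"""
    SELECT s3.*
    FROM students s3
    INNER JOIN ({STEP2}) s2 USING (id, section_id)
    ORDER BY s3.id
"""

max_ages = conn.execute(STEP1 + " ORDER BY s1.section_id").fetchall()
min_ids = conn.execute(STEP2 + " ORDER BY s2.section_id").fetchall()
result = conn.execute(STEP3).fetchall()

print(max_ages)  # (section_id, age) pairs
print(min_ids)   # (id, section_id) pairs
print(result)    # full student rows
```

Section 3 is the interesting case: both Marie and Arthur are 21, and step 2 keeps only id 5.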
I would simply use a correlated subquery:
select s.*
from students s
where s.id = (select s2.id
from students s2
where s2.section_id = s.section_id
order by s2.age desc, s2.id asc
limit 1
);
This is pretty much the simplest way to express the logic. And with an index on students(section_id, age, id), it should be the most performant as well.
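A minimal sketch of the correlated subquery together with the suggested index, run through Python's sqlite3 module. SQLite is an assumption here and its planner differs from MySQL's, so the plan printed at the end only shows how to inspect what the engine does; whether a given MySQL version can use the index for the mixed `age DESC, id ASC` ordering is something to verify with EXPLAIN on the real server. The index name is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, section_id INTEGER, age INTEGER, name TEXT);
    INSERT INTO students VALUES
        (1, 1, 18, 'Justin'), (2, 2, 14, 'Jillian'), (3, 2, 16, 'Cherry'),
        (4, 3, 19, 'Ronald'), (5, 3, 21, 'Marie'), (6, 3, 21, 'Arthur');
    -- Index name is hypothetical; the column list follows the answer.
    CREATE INDEX idx_students_section_age_id ON students (section_id, age, id);
""")

QUERY = """
    SELECT s.*
    FROM students s
    WHERE s.id = (SELECT s2.id
                  FROM students s2
                  WHERE s2.section_id = s.section_id
                  ORDER BY s2.age DESC, s2.id ASC
                  LIMIT 1)
    ORDER BY s.id
"""

oldest_per_section = conn.execute(QUERY).fetchall()
print(oldest_per_section)

# Inspect the plan instead of assuming the index is used.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
for step in plan:
    print(step)
```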
Please help me, I have no idea how to do this.
I have table like this (create_at YYYY-MM-DD). ID is auto increment
-----------------------------------------------------------------
| ID | id_user | activity | create_at |
-----------------------------------------------------------------
| 1 | 10 | A | 2017-10-11 |
| 2 | 52 | A | 2017-10-11 |
| 3 | 41 | A | 2017-10-12 |
| 4 | 52 | A | 2017-10-12 |
| 5 | 41 | B | 2017-10-12 |
| 6 | 52 | B | 2017-10-13 |
| 7 | 10 | B | 2017-10-14 |
-----------------------------------------------------------------
How can I count (in MySQL) the users who did activity "B" after activity "A" on the same create_at day? In this case, the result is 1 (id_user 41). Thank you.
We could use a semi-join or a correlated subquery.
We start like this, with the users that are doing activity B:
SELECT t.id_user
FROM table_like_this t
WHERE t.activity = 'B'
We can match those rows to users that are doing activity A on the "same day" with a JOIN operation back to the same table:
SELECT t.id_user
FROM table_like_this t
JOIN table_like_this r
ON r.id_user = t.id_user
AND r.create_at = t.create_at
AND r.activity = 'A'
WHERE t.activity = 'B'
As far as whether activity B is occurring "after" activity A, I don't see any information in the table that can tell us that (we can't tell what time each activity A and B occurred, and can't determine which one was "after" the other.)
For testing, we can include other columns in the SELECT list, to verify which rows from t and r are being returned, if the matching is being done properly.
Once we are satisfied, we can replace the SELECT list, to get a count of distinct id_user
SELECT COUNT(DISTINCT t.id_user)
FROM ...
Note that this will collapse occurrences of id_user that performed activity A and B on several different days so that the id_user will be counted only once.
If we want to count the number of days for each id_user, and include each of those days in the count, the query would need to be changed.
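Here is a minimal sketch of both counts, run through Python's sqlite3 module (SQLite in place of MySQL is an assumption). It also adds `r.ID < t.ID` to the join, which assumes that the auto-increment ID reflects the order in which activities happened; that assumption comes from the question's note about ID, not from any timestamp, so drop the condition if it does not hold. The per-day count uses a DISTINCT derived table, which is one possible way to change the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_like_this (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        id_user INTEGER, activity TEXT, create_at TEXT);
    INSERT INTO table_like_this (id_user, activity, create_at) VALUES
        (10, 'A', '2017-10-11'), (52, 'A', '2017-10-11'),
        (41, 'A', '2017-10-12'), (52, 'A', '2017-10-12'),
        (41, 'B', '2017-10-12'), (52, 'B', '2017-10-13'),
        (10, 'B', '2017-10-14');
""")

# Rows t (activity B) matched to rows r (activity A) of the same user and day.
MATCH = """
    FROM table_like_this t
    JOIN table_like_this r
      ON r.id_user = t.id_user
     AND r.create_at = t.create_at
     AND r.activity = 'A'
     AND r.ID < t.ID          -- assumption: a lower ID means "earlier"
    WHERE t.activity = 'B'
"""

# Each user counted once, no matter on how many days the pattern occurs.
users_count = conn.execute(
    "SELECT COUNT(DISTINCT t.id_user)" + MATCH
).fetchone()[0]

# Each (user, day) combination counted separately.
user_days_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT t.id_user, t.create_at" + MATCH + ") AS d"
).fetchone()[0]

print(users_count, user_days_count)
```

With the sample data only id_user 41 qualifies, on a single day, so both counts come out the same; they would differ once a user repeats the A-then-B pattern on more than one day.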
These are my example table
course table include
course_id | course name
1 | java
2 | .net
3 | php
4 | ruby and rails
Indian_student table include
course_id | no.of student
2 | 10
3 | 30
Japan_student table include
course_id | no.of student
1 | 50
2 | 30
Chinese_student table include
course_id | no.of student
2 | 60
4 | 20
I want the output as
Course_id | in_stu | ja_stu | ch_stu | total
1 | 0 | 50 | 0 | 50
2 | 10 | 30 | 60 | 100
3 | 30 | 0 | 0 | 30
4 | 0 | 0 | 20 | 20
But I only get the result
Course_id | in_stu | ja_stu | ch_stu | total
2 | 10 | 30 | 60 | 100
my view is
create view total_student as select
i.indian_stu as in_stu,
j.japan_stu as ja_stu,
c.chinese_stu as ch_stu,
main.course_id as course_id,
(i.indian_stu + j.japan_stu + c.chinese_stu) as total
from indian_student i, japan_student j,chinese_student c, course c
where i.course_id=main.course_id and j.course_id=main.course_id and c.course_id=main.course_id group by course_id;
Can I get any advice, please?
You're using the 1990s-vintage method of joining tables together: comma-separated lists of tables. This is inadequate to your task. You need LEFT JOIN because the comma syntax is an INNER JOIN syntax. INNER JOIN omits rows without matches.
(Also, your sample query uses the c alias twice; that won't work. It also uses GROUP BY but I believe you want ORDER BY)
You want this:
-- coalesce turns the NULLs produced by unmatched left joins into 0
select coalesce(i.indian_stu, 0) as in_stu,
coalesce(j.japan_stu, 0) as ja_stu,
coalesce(x.chinese_stu, 0) as ch_stu,
c.course_id as course_id,
(coalesce(i.indian_stu, 0) + coalesce(j.japan_stu, 0) + coalesce(x.chinese_stu, 0)) as total
from course c
left join indian_student i ON c.course_id = i.course_id
left join japan_student j ON c.course_id = j.course_id
left join chinese_student x on c.course_id = x.course_id
order by c.course_id
When you develop a view, first get the SELECT statement to work, then use it to create the view. That's easier than continually dropping and recreating the view.
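The "get the SELECT working first, then create the view" workflow can be sketched as follows, using Python's sqlite3 module (SQLite in place of MySQL is an assumption). The COALESCE calls assume that courses with no students in a table should show 0, as in the desired output from the question; without them an unmatched LEFT JOIN yields NULL and the total becomes NULL too. The column names `indian_stu`, `japan_stu` and `chinese_stu` are taken from the question's view definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, course_name TEXT);
    CREATE TABLE indian_student (course_id INTEGER, indian_stu INTEGER);
    CREATE TABLE japan_student (course_id INTEGER, japan_stu INTEGER);
    CREATE TABLE chinese_student (course_id INTEGER, chinese_stu INTEGER);
    INSERT INTO course VALUES (1, 'java'), (2, '.net'), (3, 'php'), (4, 'ruby and rails');
    INSERT INTO indian_student VALUES (2, 10), (3, 30);
    INSERT INTO japan_student VALUES (1, 50), (2, 30);
    INSERT INTO chinese_student VALUES (2, 60), (4, 20);
""")

# Every course is kept by the LEFT JOINs; missing counts become 0.
SELECT_SQL = """
    SELECT c.course_id AS course_id,
           COALESCE(i.indian_stu, 0) AS in_stu,
           COALESCE(j.japan_stu, 0) AS ja_stu,
           COALESCE(x.chinese_stu, 0) AS ch_stu,
           COALESCE(i.indian_stu, 0) + COALESCE(j.japan_stu, 0)
               + COALESCE(x.chinese_stu, 0) AS total
    FROM course c
    LEFT JOIN indian_student i ON c.course_id = i.course_id
    LEFT JOIN japan_student j ON c.course_id = j.course_id
    LEFT JOIN chinese_student x ON c.course_id = x.course_id
"""

# 1) Check the plain SELECT first.
select_rows = conn.execute(SELECT_SQL + " ORDER BY c.course_id").fetchall()

# 2) Only then wrap it in the view and read it back.
conn.execute("CREATE VIEW total_student AS " + SELECT_SQL)
view_rows = conn.execute("SELECT * FROM total_student ORDER BY course_id").fetchall()

for row in view_rows:
    print(row)
```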
I am having issues trying to combine DISTINCT & ORDER BY. I have a Users table with the following attributes id, name & I have a Purchases table with the following attributes id,user_id,date_purchased,returned
I want to retrieve all unique Users that have a returned Purchase sorted by date_purchased.
Here is some sample data
Users
id | name
---+-----------
1 | Bob
2 | John
3 | Bill
4 | Frank
5 | Fred
6 | Al
Purchases
id | user_id | startdate | returned
-----+------------------+------------+---------------
100 | 1 | 2015-02-06 | true
101 | 1 | 2015-01-06 | true
102 | 1 | 2015-02-05 | false
103 | 2 | 2015-02-05 | false
104 | 2 | 2015-02-05 | false
105 | 3 | 2015-01-05 | true
106 | 3 | 2015-02-04 | true
107 | 4 | 2015-01-07 | true
108 | 5 | 2015-02-05 | false
109 | 6 | 2015-02-07 | false
110 | 6 | 2015-01-05 | true
The result should be the following user id's 1,3,4,6
Here is the query I wrote
SELECT DISTINCT (id) FROM (
SELECT users.id as id, purchases.startdate FROM
users INNER JOIN purchases on users.id=purchases.user_id
WHERE returned=true
ORDER BY startdate )
This query correctly returns the results; however it is in the incorrect order. Reading other answers I found that you can't maintain the subquery ordering. I tried to move the ordering to the outer query; however, startdate would also need to be present in the select query & that is not what I want
Just remove the subquery and use GROUP BY:
SELECT u.id as id
FROM users u INNER JOIN
purchases p
on u.id = p.user_id
WHERE returned = true
GROUP BY u.id
ORDER BY MIN(startdate);
You can only rely on the result set being in a particular order when you use ORDER BY for the outermost SELECT. There is no guarantee of ordering in any other case.
As a note: ordering usually does work with subquery (sadly, because many people look at the results from some queries and generalize to all of them). The problem in this case is the distinct. It rearranges the data (i.e. sorts it) to remove duplicates.
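A minimal sketch of the GROUP BY approach on the sample data, run through Python's sqlite3 module (SQLite in place of MySQL or PostgreSQL is an assumption, so `returned` is stored as 1/0). It joins on `purchases.user_id`, and adds `u.id` as a second sort key, which is an assumption made only so that users 3 and 6, who share the same earliest date, come out in a stable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchases (id INTEGER PRIMARY KEY, user_id INTEGER, startdate TEXT, returned INTEGER);
    INSERT INTO users VALUES (1, 'Bob'), (2, 'John'), (3, 'Bill'), (4, 'Frank'), (5, 'Fred'), (6, 'Al');
    INSERT INTO purchases VALUES
        (100, 1, '2015-02-06', 1), (101, 1, '2015-01-06', 1), (102, 1, '2015-02-05', 0),
        (103, 2, '2015-02-05', 0), (104, 2, '2015-02-05', 0),
        (105, 3, '2015-01-05', 1), (106, 3, '2015-02-04', 1),
        (107, 4, '2015-01-07', 1), (108, 5, '2015-02-05', 0),
        (109, 6, '2015-02-07', 0), (110, 6, '2015-01-05', 1);
""")

# One row per user with a returned purchase, sorted by that user's
# earliest returned purchase; the date itself never enters the SELECT list.
user_ids = [row[0] for row in conn.execute("""
    SELECT u.id
    FROM users u
    INNER JOIN purchases p ON u.id = p.user_id
    WHERE p.returned = 1
    GROUP BY u.id
    ORDER BY MIN(p.startdate), u.id
""")]

print(user_ids)
```

The set of ids matches the expected 1, 3, 4, 6, ordered by each user's earliest returned purchase rather than by id.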
Gordon's script gives you the data you want, but to answer your question of how to maintain a subquery's order, you can pull the column you want to order by out of the subquery and then order by it.
SELECT DISTINCT (id), innerTable.startdate FROM (
SELECT users.id as id, purchases.startdate FROM
users INNER JOIN purchases on users.id=purchases.user_id
WHERE returned=true) as innerTable
ORDER BY innerTable.startdate
I have some MySQL results like this:
---------------------------
| name | something_random |
---------------------------
| john | ekjalsdjalfjkldd |
| alex | akjsldfjaekallee |
| alex | jkjlkjslakjfjflj |
| alex | kajslejajejjaddd |
| bob | ekakdie33kkd93ld |
| bob | 33kd993kakakl3ll |
| paul | 3k309dki595k3lkd |
| paul | 3k399kkfkg93lk3l |
etc...
This goes on for 1000's of rows of results. I need to limit the number of results to the first 50 unique names. I think there is a simple solution to this but I'm not sure.
I've tried using derived tables and variables but can't quite get there. If I could figure out how to increment a variable once every time a name is different I think I could say WHERE variable <= 50.
UPDATED
I've tried the Inner Join approach(es) suggested below. The problem is this:
The subselect SELECT DISTINCT name FROM testTable LIMIT 50 grabs the first 50 distinct names. Perhaps I wasn't clear enough in my original post, but this limits my query too much. In my query, not every name in the table is returned in the result. Let me modify my original example:
----------------------------------
| id | name | something_random |
----------------------------------
| 1 | john | ekjalsdjalfjkldd |
| 4 | alex | akjsldfjaekallee |
| 4 | alex | jkjlkjslakjfjflj |
| 4 | alex | kajslejajejjaddd |
| 6 | bob | ekakdie33kkd93ld |
| 6 | bob | 33kd993kakakl3ll |
| 12 | paul | 3k309dki595k3lkd |
| 12 | paul | 3k399kkfkg93lk3l |
etc...
So I added in some id numbers here. These ID numbers pertain to the people's names in the tables. So you can see in the results, not every single person/name in the table is necessarily in the result (due to some WHERE condition). So the 50th distinct name in the list will always have an ID number higher than 49. The 50th person could be id 79, 234, 4954 etc...
So back to the problem. The subselect SELECT DISTINCT name FROM testTable LIMIT 50 selects the first 50 names in the table. That means that my search results will be limited to names that have ID <=50, which is too constricting. If there are certain names that don't show up in the query (due to some WHERE condition), then they are still counted as one of the 50 distinct names. So you end up with too few results.
UPDATE 2
To @trapper: This is a basic simplification of what my query looks like:
SELECT
t1.id,
t1.name,
t2.details
FROM t1
LEFT JOIN t2 ON t1.id = t2.some_id
INNER JOIN
(SELECT DISTINCT name FROM t1 ORDER BY id LIMIT 0,50) s ON s.name = t1.name
WHERE
SOME CONDITIONS
ORDER BY
t1.id,
t1.name
And my results look like this:
----------------------------------
| id | name | details |
----------------------------------
| 1 | john | ekjalsdjalfjkldd |
| 3 | alex | akjsldfjaekallee |
| 3 | alex | jkjlkjslakjfjflj |
| 4 | alex | kajslejajejjaddd |
| 6 | bob | ekakdie33kkd93ld |
| 6 | bob | 33kd993kakakl3ll |
| 12 | paul | 3k309dki595k3lkd |
| 12 | paul | 3k399kkfkg93lk3l |
...
| 37 | bill | kajslejajejjaddd |
| 37 | bill | ekakdie33kkd93ld |
| 41 | matt | 33kd993kakakl3ll |
| 50 | jake | 3k309dki595k3lkd |
| 50 | jake | 3k399kkfkg93lk3l |
----------------------------------
The results stop at id=50. There are NOT 50 distinct names in the list. There are only roughly 23 distinct names.
My MySQL syntax may be rusty, but the idea is to use a query to select the top 50 distinct names, then do a self-join on name and select the name and other information from the join.
select a.name, b.something_random
from `Table` b
inner join (select distinct name from `Table` order by RAND() limit 0,50) a
on a.name = b.name
SELECT DISTINCT name FROM `table` LIMIT 0,50
Edited: Ahh yes I misread question first time, this should do the trick though :)
SELECT a.name, b.something_random
FROM `table` b
INNER JOIN (SELECT DISTINCT name FROM `table` ORDER BY RAND() LIMIT 0,50) a
ON a.name = b.name ORDER BY a.name
How this works: the (SELECT DISTINCT name FROM `table` ORDER BY RAND() LIMIT 0,50) part is what pulls out the names to include in the join. So here I am taking 50 unique names at random, but you can change this to any other selection criteria if you want.
Then you join those results back into your table. This links each of those 50 selected names back to all of the rows with a matching name for your final results. Finally ORDER BY a.name just to be sure all the rows for each name end up grouped together.
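The asker's update points out that names removed by the outer WHERE still use up slots among the 50. One possible way around that, assuming the same filter can be repeated inside the derived table, is sketched below with Python's sqlite3 module (SQLite in place of MySQL is an assumption). The `active` column, the 'mary' row and the limit of 3 are hypothetical stand-ins for the real WHERE condition and for 50; `GROUP BY name ORDER BY MIN(id)` replaces `DISTINCT ... ORDER BY id` so that "first" has a defined meaning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- active is a hypothetical column standing in for the real WHERE condition.
    CREATE TABLE testTable (id INTEGER, name TEXT, something_random TEXT, active INTEGER);
    INSERT INTO testTable VALUES
        (1, 'john', 'ekjalsdjalfjkldd', 1),
        (2, 'mary', 'hypothetical-row', 0),
        (4, 'alex', 'akjsldfjaekallee', 1),
        (4, 'alex', 'jkjlkjslakjfjflj', 1),
        (6, 'bob', 'ekakdie33kkd93ld', 1),
        (6, 'bob', '33kd993kakakl3ll', 1),
        (12, 'paul', '3k309dki595k3lkd', 1);
""")


def names_in_result(filter_in_subquery, n):
    """Return the distinct names that survive, limiting to the first n names."""
    inner_where = "WHERE active = 1" if filter_in_subquery else ""
    sql = f"""
        SELECT t.id, t.name, t.something_random
        FROM testTable t
        INNER JOIN (
            SELECT name FROM testTable {inner_where}
            GROUP BY name ORDER BY MIN(id) LIMIT ?
        ) s ON s.name = t.name
        WHERE t.active = 1
        ORDER BY t.id, t.name
    """
    rows = conn.execute(sql, (n,)).fetchall()
    return sorted({name for _, name, _ in rows})


# Filter only outside: 'mary' takes one of the 3 slots and is then dropped.
naive = names_in_result(filter_in_subquery=False, n=3)
# Filter repeated inside: the 3 slots go to names that actually pass the filter.
fixed = names_in_result(filter_in_subquery=True, n=3)

print(naive)
print(fixed)
```

The first call ends up with fewer than 3 names, which mirrors the "only roughly 23 distinct names" symptom from the question; the second call returns a full set of 3.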
This should do it:
SELECT tA.*
FROM
testTable tA
INNER JOIN
(SELECT distinct name FROM testTable LIMIT 50) tB ON tA.name = tB.name
;