How to get the opposite of a join? - mysql

I am trying to get the rows that exist in one table but not in another. One table, schedules (match_week, player_home_id, player_away_id), holds the scheduled matches; the other, match (match_week, Winner_id, Defeated_id), holds the matches that were played. The players look at their schedule and play a match. I want a list of the scheduled matches that do not exist in the match table. A player's ID in the match table can be in either column, Winner_id or Defeated_id.
I have reviewed a number of Stack Exchange examples, but most use "IS NULL" and I don't have null values. I have used a join that gives the matches played; I would like the matches that have not been played.
CSV - wp_schedule_test
+----+------------+--------------+--------------+-----------------+-----------------+
| ID | match_week | home_player1 | away_player1 | player1_home_id | player1_away_id |
+----+------------+--------------+--------------+-----------------+-----------------+
| 1 | WEEK 1 | James Rives | Dale Hemme | 164 | 169 |
| 2 | WEEK 1 | John Head | David Foster | 81 | 175 |
| 3 | WEEK 1 | John Dalton | Eric Simmons | 82 | 23 |
| 4 | WEEK 2 | John Head | James Rives | 81 | 164 |
| 5 | WEEK 2 | Dale Hemme | John Dalton | 169 | 82 |
| 6 | WEEK 2 | David Foster | Eric Simmons | 175 | 23 |
| 7 | WEEK 3 | John Dalton | James Rives | 82 | 164 |
| 8 | WEEK 3 | John Head | Eric Simmons | 81 | 23 |
| 9 | WEEK 3 | Dale Hemme | David Foster | 169 | 175 |
| 10 | WEEK 4 | Eric Simmons | James Rives | 23 | 164 |
| 11 | WEEK 4 | David Foster | John Dalton | 175 | 82 |
| 12 | WEEK 4 | Dale Hemme | John Head | 169 | 81 |
+----+------------+--------------+--------------+-----------------+-----------------+
CSV - wp_match_scores_test
+----+------------+------------+------------+
| ID | match_week | player1_id | player2_id |
+----+------------+------------+------------+
| 5 | WEEK 1 | 82 | 23 |
| 20 | WEEK 1 | 164 | 169 |
| 21 | WEEK 2 | 164 | 81 |
| 25 | WEEK 2 | 82 | 169 |
| 61 | WEEK 3 | 175 | 169 |
| 62 | WEEK 4 | 175 | 82 |
| 69 | WEEK 2 | 175 | 23 |
| 85 | WEEK 3 | 164 | 82 |
| 86 | WEEK 4 | 164 | 23 |
+----+------------+------------+------------+
The output of my MySQL query is the list of matches that have been played. I am trying to figure out how to list the matches from the schedule table that have not been played.
CSV - MySQL Output
+------------+------------+------------+
| match_week | player1_id | player2_id |
+------------+------------+------------+
| WEEK 1 | 164 | 169 |
| WEEK 1 | 82 | 23 |
| WEEK 2 | 164 | 81 |
| WEEK 2 | 82 | 169 |
| WEEK 2 | 175 | 23 |
| WEEK 3 | 175 | 169 |
| WEEK 3 | 164 | 82 |
| WEEK 4 | 175 | 82 |
| WEEK 4 | 164 | 23 |
+------------+------------+------------+
MYSQL
SELECT DISTINCT ms.match_week, ms.player1_id, ms.player2_id
FROM wp_match_scores_test ms
JOIN wp_schedules_test s
ON (s.player1_home_id = ms.player1_id OR s.player1_away_id = ms.player2_id)
ORDER BY ms.match_week
The expected output is:
CSV - Desired Output
+------------+----------------+----------------+
| match_week | player_home_id | player_away_id |
+------------+----------------+----------------+
| WEEK 1 | 81 | 175 |
| WEEK 3 | 81 | 23 |
| WEEK 4 | 169 | 81 |
+------------+----------------+----------------+
The added code I would like to use is
SELECT s.*
FROM wp_schedules_test s
WHERE NOT EXISTS
(select DISTINCT ms.match_week, ms.player1_id , ms.player2_id FROM
wp_match_scores_test ms
JOIN wp_schedules_test s
ON (s.player1_home_id = ms.player1_id or s.player1_away_id =
ms.player2_id)
Order by ms.match_week)
Unfortunately, the output yields "No Rows"

You can use a LEFT JOIN to achieve the desired result, joining the two tables on matching player ids (noting that each player id in wp_match_scores_test can correspond to either player1_home_id or player1_away_id in wp_schedule_test). Where a scheduled match has no matching row, the columns from wp_match_scores_test will be NULL in the result, and you can use that to select the matches which have not been played:
SELECT sch.*
FROM wp_schedule_test sch
LEFT JOIN wp_match_scores_test ms
ON (ms.player1_id = sch.player1_home_id
OR ms.player2_id = sch.player1_home_id)
AND (ms.player1_id = sch.player1_away_id
OR ms.player2_id = sch.player1_away_id)
WHERE ms.ID IS NULL
Output:
ID match_week home_player1 away_player1 player1_home_id player1_away_id
2 Week 1 John Head David Foster 81 175
8 Week 3 John Head Eric Simmons 81 23
12 Week 4 Dale Hemme John Head 169 81
Note that you can also use a NOT EXISTS query, using the same condition as I used in the JOIN:
SELECT sch.*
FROM wp_schedule_test sch
WHERE NOT EXISTS (SELECT *
FROM wp_match_scores_test ms
WHERE (ms.player1_id = sch.player1_home_id
OR ms.player2_id = sch.player1_home_id)
AND (ms.player1_id = sch.player1_away_id
OR ms.player2_id = sch.player1_away_id))
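The output of this query is the same. Note, though, that its performance relative to the LEFT JOIN equivalent depends on the MySQL version and the available indexes: older versions may evaluate the correlated subquery once for every row of wp_schedule_test, while MySQL 8.0.17 and later can execute NOT EXISTS as an anti-join, much like the LEFT JOIN ... IS NULL form.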
Demo on dbfiddle
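To check that the two forms really agree, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs both queries. SQLite is only a stand-in for MySQL here (an assumption made so the example is self-contained), and only the columns the queries need are created.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL tables (table names as in the answer).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_schedule_test (
    ID INTEGER PRIMARY KEY, match_week TEXT,
    player1_home_id INTEGER, player1_away_id INTEGER);
CREATE TABLE wp_match_scores_test (
    ID INTEGER PRIMARY KEY, match_week TEXT,
    player1_id INTEGER, player2_id INTEGER);
INSERT INTO wp_schedule_test VALUES
    (1,'WEEK 1',164,169),(2,'WEEK 1',81,175),(3,'WEEK 1',82,23),
    (4,'WEEK 2',81,164),(5,'WEEK 2',169,82),(6,'WEEK 2',175,23),
    (7,'WEEK 3',82,164),(8,'WEEK 3',81,23),(9,'WEEK 3',169,175),
    (10,'WEEK 4',23,164),(11,'WEEK 4',175,82),(12,'WEEK 4',169,81);
INSERT INTO wp_match_scores_test VALUES
    (5,'WEEK 1',82,23),(20,'WEEK 1',164,169),(21,'WEEK 2',164,81),
    (25,'WEEK 2',82,169),(61,'WEEK 3',175,169),(62,'WEEK 4',175,82),
    (69,'WEEK 2',175,23),(85,'WEEK 3',164,82),(86,'WEEK 4',164,23);
""")

# Shared pairing condition: both scheduled players appear in the match row,
# in either column.
PAIRING = """
    (ms.player1_id = sch.player1_home_id OR ms.player2_id = sch.player1_home_id)
AND (ms.player1_id = sch.player1_away_id OR ms.player2_id = sch.player1_away_id)
"""

left_join_sql = f"""
SELECT sch.ID
FROM wp_schedule_test sch
LEFT JOIN wp_match_scores_test ms ON {PAIRING}
WHERE ms.ID IS NULL
ORDER BY sch.ID
"""

not_exists_sql = f"""
SELECT sch.ID
FROM wp_schedule_test sch
WHERE NOT EXISTS (SELECT 1 FROM wp_match_scores_test ms WHERE {PAIRING})
ORDER BY sch.ID
"""

unplayed_left_join = [row[0] for row in conn.execute(left_join_sql)]
unplayed_not_exists = [row[0] for row in conn.execute(not_exists_sql)]
print(unplayed_left_join, unplayed_not_exists)  # [2, 8, 12] [2, 8, 12]
```

Neither query compares match_week. If the same two players could be scheduled against each other in more than one week, you would probably want to add ms.match_week = sch.match_week to the pairing condition.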

Related

Extracting data from table with results from different search

I am trying to extract data from tables using the results of a previous search. I am not really familiar with database queries and have written one that crashes my computer by drawing too much memory.
This data is coming from a board tester and I want certain information.
How many boards were run during a given period
How many failed
All the failure data for those boards EDIT: This is the one I need to figure out. See Edit at bottom.
The first time a board is run it creates a record in the Board table
+----------+-------+-----+
| Board_id | Board | rev |
+----------+-------+-----+
| 1 | 1234 | 1 |
| 2 | 1234 | 1 |
| 3 | 1235 | 2 |
| 4 | 5869 | 15 |
+----------+-------+-----+
Each time the board is run it creates a Test record
+----------+----------+---------+---------------------+
| Test_id | Board_id | Operator| Date_Time |
+----------+----------+---------+---------------------+
| 34 | 1 | 1 | 2017-08-02 09:13:34 |
| 35 | 1 | 1 | 2017-08-02 09:13:36 |
| 36 | 1 | 1 | 2017-08-02 09:13:39 |
| 37 | 2 | 1 | 2017-08-02 09:14:10 |
| 38 | 3 | 1 | 2017-08-02 09:16:24 |
| 39 | 3 | 2 | 2017-08-03 10:40:45 |
| 40 | 4 | 2 | 2017-08-03 10:43:34 |
+----------+----------+---------+---------------------+
...and Results are stored in Results
+-----------+---------+--------+-------------+-------------+
| Result_id | Test_id | Result | Upper_Limit | Lower_Limit |
+-----------+---------+--------+-------------+-------------+
| 40 | 34 | 2 | 4 | 1 |
| 41 | 34 | 3 | 4 | 1 |
| 42 | 34 | 4 | 4 | 1 |
| 43 | 34 | 0 | 4 | 1 |
| 44 | 35 | 2 | 4 | 1 |
| 45 | 35 | 3 | 4 | 1 |
| 46 | 35 | 4 | 4 | 1 |
| 47 | 35 | 0 | 4 | 1 |
| 48 | 36 | 2 | 4 | 1 |
| 49 | 36 | 3 | 4 | 1 |
| 50 | 36 | 4 | 4 | 1 |
| 51 | 36 | 2 | 4 | 1 |
| 52 | 37 | 2 | 4 | 1 |
| 53 | 37 | 3 | 4 | 1 |
| 54 | 37 | 4 | 4 | 1 |
| 55 | 37 | 2 | 4 | 1 |
| 56 | 38 | 2 | 4 | 1 |
| 57 | 38 | 3 | 4 | 1 |
| 58 | 38 | 4 | 4 | 1 |
| 59 | 38 | 5 | 4 | 1 |
| 60 | 39 | 2 | 4 | 1 |
| 61 | 39 | 3 | 4 | 1 |
| 62 | 39 | 4 | 4 | 1 |
| 63 | 39 | 5 | 4 | 1 |
| 64 | 40 | 2 | 4 | 1 |
| 65 | 40 | 3 | 4 | 1 |
| 66 | 40 | 4 | 4 | 1 |
| 67 | 40 | 3 | 4 | 1 |
+-----------+---------+--------+-------------+-------------+
To get the boards (by Board_ID) that were run during a given period, I query:
SELECT a.Board_ID FROM
Tests a, Results b
WHERE a.Date_Time>='2017-08-02' AND a.Date_Time<'2017-08-03' and
a.Test_ID = b.Test_ID
group by a.Board_ID
To get all tests associated with those Board_IDs, I query:
SELECT * from
Tests x, (
SELECT a.Board_ID FROM
Tests a, Results b
WHERE a.Date_Time>='2017-08-02' AND a.Date_Time<'2017-08-03' and
a.Test_ID = b.Test_ID
group by a.Board_ID
) y
where x.Board_ID = y.Board_ID
This gives me the correct results, although the query seems off. I have the most trouble when I try to get the failed results from the query above.
SELECT d.Test_ID FROM
Boards a, Tests b, (
SELECT x.Test_ID, x.Board_ID, x.Operator, x.Date_Time from
Tests x, (
SELECT a.Board_ID FROM
Tests a, Results b
WHERE a.Date_Time>='2017-08-02' AND a.Date_Time<'2017-08-03' and
a.Test_ID = b.Test_ID
group by a.Board_ID
) y
)d
WHERE d.Test_ID = b.Test_ID and
b.Result not between Lower_Limit and Upper_Limit
EDIT:
If you look at the Test table I created, you will see that board_id 3 was tested twice, on two different days. I need to see the boards that were run on a given day (in this example 2017-08-02) and all records associated with those boards. So since Board_ID 3 was run on two days, one of which is the day in question, I would need all of its records included in my query.
My Solution
SELECT * FROM
(
SELECT x.Test_ID, x.Board_ID, x.Operator, x.Date_Time from
Test x, (
SELECT a.Board_ID FROM
Test a
join Results b on a.Test_ID = b.Test_ID
WHERE a.Date_Time>='2017-08-11' AND a.Date_Time<'2017-08-12'
group by a.Board_ID
) y
where x.Board_ID = y.Board_ID
)d
join Boards a on a.Board_ID = d.Board_ID
join Results b on b.Test_ID = d.Test_ID
join Test_Names c on c.Test_Name_ID = b.Test_Name_ID -- Table not shown
WHERE
b.result not between Lower_Limit and Upper_Limit
As you can see, I have combined 3 nested searches into 1. With the 3 individual searches I get all the data I need and then parse out the information I want. The next step is to find a way to query the database directly for what I need instead of parsing.
I think you're overthinking this. You don't need all the inline views. Here's how I would write it using ANSI joins (like @CptMisery suggested in the comments):
SELECT t.test_id, b.board, b.rev, r.result_id, r.result -- and whatever else you need.
from tests t
join results r on t.test_id = r.test_id
join boards b on t.board_id = b.board_id
where t.Date_Time>='2017-08-02' AND t.Date_Time<'2017-08-03'
and r.result > r.Lower_Limit -- or >=
and r.result < r.Upper_Limit -- or <=, if it can be the limit value
-- the two limit tests keep in-range results; for failures use r.result NOT BETWEEN r.Lower_Limit AND r.Upper_Limit
JOIN all the tables based on their relationships (Foreign Key to Primary Key), choose your filters in the where clause, and choose the columns to "project" with Select.
SELECT d.Test_ID FROM
Results b, ( SELECT x.Test_ID,
x.Board_ID,
x.Operator,
x.Date_Time
from Tests x,
(SELECT a.Board_ID
FROM Tests a, Results b
WHERE a.Date_Time>='2017-08-02'
AND a.Date_Time<'2017-08-03'
and a.Test_ID = b.Test_ID
group by a.Board_ID
) y
where x.Board_ID = y.Board_ID
)d
WHERE d.Test_ID = b.Test_ID
and b.Result >= b.Lower_Limit
and b.Result <= b.Upper_Limit
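For the requirement in the EDIT (every failing result for any board that was run on the day in question, including that board's tests from other days), a minimal sketch is shown below. It assumes the table names Tests and Results used in the question's queries, loads the sample rows into an in-memory SQLite database as a stand-in for the real server, and leaves out the Boards table because nothing here needs it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tests (Test_id INTEGER PRIMARY KEY, Board_id INTEGER,
                    Operator INTEGER, Date_Time TEXT);
CREATE TABLE Results (Result_id INTEGER PRIMARY KEY, Test_id INTEGER,
                      Result INTEGER, Upper_Limit INTEGER, Lower_Limit INTEGER);
INSERT INTO Tests VALUES
    (34,1,1,'2017-08-02 09:13:34'),(35,1,1,'2017-08-02 09:13:36'),
    (36,1,1,'2017-08-02 09:13:39'),(37,2,1,'2017-08-02 09:14:10'),
    (38,3,1,'2017-08-02 09:16:24'),(39,3,2,'2017-08-03 10:40:45'),
    (40,4,2,'2017-08-03 10:43:34');
""")

# Readings per test, in the same order as the Results table in the question.
readings = {34: [2, 3, 4, 0], 35: [2, 3, 4, 0], 36: [2, 3, 4, 2],
            37: [2, 3, 4, 2], 38: [2, 3, 4, 5], 39: [2, 3, 4, 5],
            40: [2, 3, 4, 3]}
result_id = 40
for test_id, values in readings.items():
    for value in values:
        # Upper_Limit = 4 and Lower_Limit = 1 for every row, as in the sample.
        conn.execute("INSERT INTO Results VALUES (?, ?, ?, 4, 1)",
                     (result_id, test_id, value))
        result_id += 1

failed_sql = """
SELECT t.Test_id, t.Board_id, r.Result_id, r.Result
FROM Tests t
JOIN Results r ON r.Test_id = t.Test_id
WHERE t.Board_id IN (SELECT Board_id          -- boards run on the day in question
                     FROM Tests
                     WHERE Date_Time >= :day_start AND Date_Time < :day_end)
  AND r.Result NOT BETWEEN r.Lower_Limit AND r.Upper_Limit
ORDER BY r.Result_id
"""
failed = conn.execute(
    failed_sql, {"day_start": "2017-08-02", "day_end": "2017-08-03"}
).fetchall()
for row in failed:
    print(row)
# (34, 1, 43, 0)
# (35, 1, 47, 0)
# (38, 3, 59, 5)
# (39, 3, 63, 5)
```

Test 39 was run on 2017-08-03, but it belongs to board 3, which was also run on 2017-08-02, so its failing result is included. Board 4 was only run on 2017-08-03 and is left out.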

MYSQL/Query: How to make table rows into column

I have 3 tables: tbl_contestant, tbl_criteria and tbl_judges. I also have one more table, tbl_score, which combines these 3 tables to hold my results.
tbl_criteria
------------------------
crit_id | criteria_name
16 | sports
tbl_judges
------------------------
judge_id | judge_name
61 | first
62 | second
63 | third
tbl_contestant
--------------------------------------
con_id | contestant_number | contestant_name |
1 | 1 | john |
2 | 2 | sy |
3 | 3 | Nah |
tbl_score
--------------------------------------------------
score_id | crit_id | judge_id | contestant_number | score
1 | 16 | 61 | 1 | 25
2 | 16 | 61 | 2 | 25
3 | 16 | 61 | 3 | 25
4 | 16 | 62 | 1 | 25
5 | 16 | 62 | 2 | 73
6 | 16 | 62 | 3 | 59
7 | 16 | 63 | 1 | 70
8 | 16 | 63 | 2 | 80
9 | 16 | 63 | 3 | 70
How can I achieve this output, where the judge_id rows turn into columns based on crit_id?
contestant_number | contestant_name | 16_judge_61 | 16_judge_62 | 16_judge_63 | total
1 | john | 25 | 25 | 70 |
2 | sy | 25 | 73 | 80 |
3 | Nah | 25 | 59 | 70 |
Please correct my query
SELECT DISTINCT(c.contestant_number) , contestant_name , j1.sports as
16_judge_61, j2.sports as 16_judge_62, j3.sports as 16_judge_63 from
tbl_criteria , tbl_score, tbl_contestant c
LEFT JOIN tbl_ // <-- i have no idea how start from here joining those 4 tables together
You could use CASE WHEN to solve this.
SELECT
s.contestant_number,
c.contestant_name,
SUM(CASE WHEN s.crit_id='16' AND s.judge_id='61' THEN s.score END) as 16_judge_61,
SUM(CASE WHEN s.crit_id='16' AND s.judge_id='62' THEN s.score END) as 16_judge_62,
SUM(CASE WHEN s.crit_id='16' AND s.judge_id='63' THEN s.score END) as 16_judge_63,
SUM(s.score) as Total
FROM tbl_score s
INNER JOIN tbl_contestant c ON s.contestant_number = c.contestant_number
GROUP BY s.contestant_number, c.contestant_name
see SQL Fiddle http://sqlfiddle.com/#!9/9efa5/1
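If the list of judges is not fixed, the CASE WHEN columns can be generated from tbl_judges instead of being typed by hand. The sketch below does that in Python against an in-memory SQLite database loaded with the sample rows. SQLite as the engine, the column aliases judge_61 etc. (used instead of 16_judge_61) and the hard-coded crit_id of 16 are all assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_judges (judge_id INTEGER PRIMARY KEY, judge_name TEXT);
CREATE TABLE tbl_contestant (con_id INTEGER PRIMARY KEY,
                             contestant_number INTEGER, contestant_name TEXT);
CREATE TABLE tbl_score (score_id INTEGER PRIMARY KEY, crit_id INTEGER,
                        judge_id INTEGER, contestant_number INTEGER, score INTEGER);
INSERT INTO tbl_judges VALUES (61,'first'),(62,'second'),(63,'third');
INSERT INTO tbl_contestant VALUES (1,1,'john'),(2,2,'sy'),(3,3,'Nah');
INSERT INTO tbl_score VALUES
    (1,16,61,1,25),(2,16,61,2,25),(3,16,61,3,25),
    (4,16,62,1,25),(5,16,62,2,73),(6,16,62,3,59),
    (7,16,63,1,70),(8,16,63,2,80),(9,16,63,3,70);
""")

# One conditional-aggregation column per judge. The ids come from the
# database and are forced to int before being placed into the SQL text.
judge_ids = [int(row[0]) for row in
             conn.execute("SELECT judge_id FROM tbl_judges ORDER BY judge_id")]
judge_columns = ",\n    ".join(
    f"SUM(CASE WHEN s.judge_id = {j} THEN s.score END) AS judge_{j}"
    for j in judge_ids
)

pivot_sql = f"""
SELECT
    s.contestant_number,
    c.contestant_name,
    {judge_columns},
    SUM(s.score) AS total
FROM tbl_score s
INNER JOIN tbl_contestant c ON s.contestant_number = c.contestant_number
WHERE s.crit_id = 16
GROUP BY s.contestant_number, c.contestant_name
ORDER BY s.contestant_number
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
# (1, 'john', 25, 25, 70, 120)
# (2, 'sy', 25, 73, 80, 178)
# (3, 'Nah', 25, 59, 70, 154)
```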

SQL Query to Sort the result according to maximum common results

I have a problem writing an SQL query. I am making a small search engine in which the word-to-page mapping (the index) is kept like this.
Sorry I wasn't able to post images here so I tried writing the output like this.
+---------+---------+-----------+--------+
| word_id | page_id | frequency | degree |
+---------+---------+-----------+--------+
| 2331 | 29 | 2 | 1 |
| 2332 | 29 | 7 | 1 |
| 2333 | 29 | 4 | 1 |
| 2334 | 29 | 1 | 1 |
| 2335 | 29 | 1 | 1 |
| 2336 | 29 | 1 | 1 |
| 2337 | 29 | 2 | 1 |
| 2338 | 29 | 7 | 1 |
| 2343 | 29 | 1 | 3 |
| 2344 | 29 | 1 | 3 |
......
......
...... and so on.
Word_id points to Words present in other table and page_id points to URLs present in other table.
Now suppose I want to search "Rapid 3D Prototyping Services". I get the union of the results for the individual words with this query:
select * from words_detail where word_id=2353 or word_id=2364 or word_id=2709 or word_id=2710;
In above query the word_ids corresponds to the 4 words in the search query and the results are as below.
Union of page_id corresponding to individual words...
mysql> select * from words_detail where word_id=2353 or word_id=2364 or word_id=2709 or word_id=2710;
+---------+---------+-----------+--------+
| word_id | page_id | frequency | degree |
+---------+---------+-----------+--------+
| 2353 | 29 | 2 | 4 |
| 2353 | 33 | 2 | 2 |
| 2353 | 36 | 5 | 9 |
| 2353 | 40 | 1 | 4 |
| 2353 | 41 | 1 | 9 |
| 2353 | 45 | 4 | 9 |
| 2353 | 47 | 2 | 9 |
| 2353 | 49 | 4 | 9 |
| 2353 | 52 | 1 | 4 |
| 2353 | 53 | 1 | 9 |
| 2353 | 66 | 2 | 9 |
| 2364 | 29 | 1 | 4 |
| 2364 | 34 | 1 | 4 |
| 2364 | 36 | 9 | 2 |
| 2709 | 36 | 1 | 9 |
| 2710 | 36 | 1 | 9 |
+---------+---------+-----------+--------+
16 rows in set (0.00 sec)
But I want the result to be sorted by the number of matching words. The first results should be the pages where all 4 words match, the next should be those with 3 matches, and so on. In other words, the earliest results should be the page_ids that are common to all 4 word_ids, next those common to 3 word_ids, and so on.
I checked here but this is not working in my case because in my case OR conditions are not matched in a single row.
How can such a query be designed?
Use the number of occurrences of each page_id as your match count and then order by it.
select * from words_detail A
inner join
(SELECT PAGE_ID
, COUNT(PAGE_ID) matchCount
from words_detail
where word_id=2353 or word_id=2364 or word_id=2709 or word_id=2710
group by PAGE_ID) B
on A.PAGE_ID=B.PAGE_ID
where word_id=2353 or word_id=2364 or word_id=2709 or word_id=2710
order by matchCount desc
Try this
select p.*
from words_detail p
, (select page_id, count(1) as count
from words_detail where
word_id in (2353,2364,2709,2710) group by page_id) t
where p.page_id = t.page_id
and p.word_id in (2353,2364,2709,2710)
order by t.count desc;
You can use a subquery to get the number of appearances of each page. Then join the subquery with your table, and you will be able to order the results by the number of times each page appears.
Your final query should look like this:
SELECT *
FROM words_detail,
(
SELECT page_id,
COUNT(*) AS npages
FROM words_detail
WHERE word_id IN (2353, 2364, 2709, 2710)
GROUP BY page_id
) AS matches
WHERE words_detail.page_id = matches.page_id
AND word_id IN (2353, 2364, 2709, 2710)
ORDER BY matches.npages DESC
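If you only need the ranked list of pages, one row per page rather than one row per word/page pair, the grouped subquery is enough on its own. Here is a minimal sketch using the 16 rows from the question in an in-memory SQLite database. SQLite as the engine, the alias matched_words and the page_id tie-breaker in the ORDER BY are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE words_detail (
    word_id INTEGER, page_id INTEGER, frequency INTEGER, degree INTEGER)""")
conn.executemany("INSERT INTO words_detail VALUES (?, ?, ?, ?)", [
    (2353, 29, 2, 4), (2353, 33, 2, 2), (2353, 36, 5, 9), (2353, 40, 1, 4),
    (2353, 41, 1, 9), (2353, 45, 4, 9), (2353, 47, 2, 9), (2353, 49, 4, 9),
    (2353, 52, 1, 4), (2353, 53, 1, 9), (2353, 66, 2, 9), (2364, 29, 1, 4),
    (2364, 34, 1, 4), (2364, 36, 9, 2), (2709, 36, 1, 9), (2710, 36, 1, 9),
])

search_word_ids = [2353, 2364, 2709, 2710]
# One "?" placeholder per searched word, so the ids are bound, not pasted in.
placeholders = ", ".join("?" for _ in search_word_ids)

rank_sql = f"""
SELECT page_id, COUNT(*) AS matched_words
FROM words_detail
WHERE word_id IN ({placeholders})
GROUP BY page_id
ORDER BY matched_words DESC, page_id
"""
ranked_pages = conn.execute(rank_sql, search_word_ids).fetchall()
print(ranked_pages[:3])  # [(36, 4), (29, 2), (33, 1)]
```

This counts rows, so it relies on each (word_id, page_id) pair appearing at most once in words_detail, as it does in the sample data. COUNT(DISTINCT word_id) would be the safer choice if duplicates are possible.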

Selecting data with a range condition on two columns

I have some articles in my database:
+------+--------------+----------+--------+
| id | article_name | age_from | age_to |
+------+--------------+----------+--------+
| 1337 | article 1 | 30 | 60 |
+------+--------------+----------+--------+
| 1338 | article 2 | 16 | 35 |
+------+--------------+----------+--------+
| 1338 | article 3 | 26 | 28 |
+------+--------------+----------+--------+
The user can set some filters in the front end, using two input fields (age from and age to). For example, he can search for articles that are made for people from 19 to 22 years old. The database should return this:
+------+--------------+----------+--------+
| id | article_name | age_from | age_to |
+------+--------------+----------+--------+
| 1338 | article 2 | 16 | 35 |
+------+--------------+----------+--------+
How do I do that? I can't do it with WHERE age_from >= 19 AND age_to <= 22.
greetings
Flip the logic:
WHERE age_from <= 22
AND age_to >= 19
My favourite explanation of this kind of problem, courtesy of Rudy Limeback (aka r937): http://www.dbforums.com/6318776-post14.html
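The flipped condition is the usual overlap test for two ranges: they overlap when each one starts no later than the other one ends. A small sketch with the three sample rows follows. The table name articles, the helper articles_for_ages and the use of an in-memory SQLite database are assumptions for the illustration, since the question does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE articles (
    id INTEGER, article_name TEXT, age_from INTEGER, age_to INTEGER)""")
conn.executemany("INSERT INTO articles VALUES (?, ?, ?, ?)", [
    (1337, "article 1", 30, 60),
    (1338, "article 2", 16, 35),
    (1338, "article 3", 26, 28),  # id repeated exactly as in the question
])


def articles_for_ages(search_from, search_to):
    """Return the names of articles whose age range overlaps the searched range."""
    sql = """
        SELECT article_name
        FROM articles
        WHERE age_from <= :search_to      -- article starts before the search ends
          AND age_to >= :search_from      -- article ends after the search starts
        ORDER BY article_name
    """
    rows = conn.execute(sql, {"search_from": search_from, "search_to": search_to})
    return [name for (name,) in rows]


print(articles_for_ages(19, 22))  # ['article 2']
print(articles_for_ages(27, 31))  # ['article 1', 'article 2', 'article 3']
```

Note that this returns every article whose range overlaps the search at all. If an article should only match when it covers the whole searched range, the test would be age_from <= search_from AND age_to >= search_to instead.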

How can I SELECT rows from a table when I MAX(ColA) and GROUP BY ColB

I found this question which is very similar but I'm still having some troubles.
So I start with table named Scores
id | player | time | scoreA | scoreB |
~~~|~~~~~~~~|~~~~~~|~~~~~~~~|~~~~~~~~|
1 | John | 10 | 70 | 80 |
2 | Bob | 22 | 75 | 85 |
3 | John | 52 | 55 | 75 |
4 | Ted | 39 | 60 | 90 |
5 | John | 35 | 90 | 90 |
6 | Bob | 27 | 65 | 85 |
7 | John | 33 | 60 | 80 |
I would like to select the best average score for each player along with the information from that record. To clarify, best average score would be the highest value for (scoreA + scoreB)/2.
The results would look like this
id | player | time | scoreA | scoreB | avg_score |
~~~|~~~~~~~~|~~~~~~|~~~~~~~~|~~~~~~~~|~~~~~~~~~~~|
5 | John | 35 | 90 | 90 | 90 |
2 | Bob | 22 | 75 | 85 | 80 |
4 | Ted | 39 | 60 | 90 | 75 |
Based on the question I linked to above, I tried a query like this,
SELECT
s.*,
avg_score
FROM
Scores AS s
INNER JOIN (
SELECT
MAX((scoreA + scoreB)/2) AS avg_score,
player,
id
FROM
Scores
GROUP BY
player
) AS avg_s ON s.id = avg_s.id
ORDER BY
avg_score DESC,
s.time ASC
What this actually gives me is,
id | player | time | scoreA | scoreB | avg_score |
~~~|~~~~~~~~|~~~~~~|~~~~~~~~|~~~~~~~~|~~~~~~~~~~~|
1 | John | 10 | 70 | 80 | 90 |
2 | Bob | 22 | 75 | 85 | 80 |
4 | Ted | 39 | 60 | 90 | 75 |
As you can see, it has gotten the correct max avg_score, from record 5, but gets the rest of the information from another record, record 1. What am I missing? How do I ensure that the data all comes from the same record? I'm getting the correct avg_score but I want the rest of the data associated with that record, record 5 in this case.
Thanks in advance!
SELECT x.*
, (scoreA+scoreB)/2 avg_score
FROM scores x
JOIN
( SELECT player, MAX((scoreA+scoreB)/2) max_avg_score FROM scores GROUP BY player) y
ON y.player = x.player
AND y.max_avg_score = (x.scoreA+x.scoreB)/2;
Try
SELECT s.*,
q.avg_score
FROM scores s JOIN
(
SELECT player,
MAX((scoreA + scoreB)/2) AS avg_score
FROM scores
GROUP BY player
) q ON s.player = q.player
AND (s.scoreA + s.scoreB)/2 = q.avg_score
ORDER BY q.avg_score DESC, s.time ASC
Sample output:
| ID | PLAYER | TIME | SCOREA | SCOREB | AVG_SCORE |
----------------------------------------------------
| 5 | John | 35 | 90 | 90 | 90 |
| 2 | Bob | 22 | 75 | 85 | 80 |
| 4 | Ted | 39 | 60 | 90 | 75 |
Here is SQLFiddle demo
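To see why joining back on both the player and the computed average picks up the whole record, here is a minimal sketch that runs the same join on the sample Scores rows in an in-memory SQLite database (an assumption, used so the example runs without a MySQL server). It divides by 2.0 rather than 2 because SQLite, unlike MySQL, truncates integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Scores (
    id INTEGER PRIMARY KEY, player TEXT, time INTEGER,
    scoreA INTEGER, scoreB INTEGER)""")
conn.executemany("INSERT INTO Scores VALUES (?, ?, ?, ?, ?)", [
    (1, "John", 10, 70, 80), (2, "Bob", 22, 75, 85), (3, "John", 52, 55, 75),
    (4, "Ted", 39, 60, 90), (5, "John", 35, 90, 90), (6, "Bob", 27, 65, 85),
    (7, "John", 33, 60, 80),
])

best_sql = """
SELECT s.id, s.player, s.time, s.scoreA, s.scoreB, q.avg_score
FROM Scores s
JOIN (
    -- one row per player holding only that player's best average
    SELECT player, MAX((scoreA + scoreB) / 2.0) AS avg_score
    FROM Scores
    GROUP BY player
) q ON s.player = q.player
   AND (s.scoreA + s.scoreB) / 2.0 = q.avg_score  -- keep only the matching record
ORDER BY q.avg_score DESC, s.time ASC
"""
best = conn.execute(best_sql).fetchall()
for row in best:
    print(row)
# (5, 'John', 35, 90, 90, 90.0)
# (2, 'Bob', 22, 75, 85, 80.0)
# (4, 'Ted', 39, 60, 90, 75.0)
```

The subquery never selects id, so there is no non-aggregated column for the database to fill from an arbitrary row of the group, which is what produced record 1 in the original attempt. If a player has two records with the same best average, this join returns both of them.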