A right outer join is similar to a union in a Venn diagram, right?
I mean, for A RIGHT OUTER JOIN B we should get all the rows of B and any matching rows of A.
For some reason I am confused by the following:
Assume table Orders:
mysql> select * from orders;
+------------+------------+---------+----------+---------+
| orderedon | name | partnum | quantity | remarks |
+------------+------------+---------+----------+---------+
| 1996-05-19 | TRUE-WHEEL | 76 | 3 | PAID |
| 1996-09-02 | TRUE-WHEEL | 10 | 1 | PAID |
| 1996-06-30 | TRUE-WHEEL | 42 | 8 | PAID |
| 1996-06-30 | BIKE SPEC | 54 | 10 | PAID |
| 1996-05-30 | BIKE SPEC | 23 | 8 | PAID |
| 1996-01-17 | BIKE SPEC | 76 | 11 | PAID |
| 1996-01-17 | LE SHOPPE | 76 | 5 | PAID |
| 1996-06-01 | LE SHOPPE | 10 | 3 | PAID |
| 1996-06-01 | AAA BIKE | 10 | 1 | PAID |
| 1996-07-01 | AAA BIKE | 76 | 4 | PAID |
| 1996-07-01 | AAA BIKE | 46 | 14 | PAID |
| 1996-07-11 | JACKS BIKE | 76 | 14 | PAID |
| 1996-05-15 | TRUE-WHEEL | 23 | 6 | PAID |
| 1996-05-30 | BIKE SPEC | 20 | 2 | PAID |
+------------+------------+---------+----------+---------+
14 rows in set (0.00 sec)
and table Part:
mysql> select * from part;
+---------+---------------+---------+
| partnum | description | price |
+---------+---------------+---------+
| 54 | PEDALS | 54.25 |
| 42 | SEATS | 24.50 |
| 46 | TIRES | 15.25 |
| 23 | MOUNTAIN BIKE | 350.45 |
| 76 | ROAD BIKE | 530.00 |
| 10 | TANDEM | 1200.00 |
+---------+---------------+---------+
6 rows in set (0.00 sec)
I was expecting that the following query:
select p.partnum p_partnum,p.description p_desc,p.price p_price,o.name o_name,o.partnum o_partnum from part p right outer join orders o on o.partnum=54;
Would give me all the rows of Orders and just the rows of part that have partnum=54.
But I get this:
mysql> select p.partnum p_partnum,p.description p_desc,p.price p_price,o.name o_name,o.partnum o_partnum from part p right outer join orders o on o.partnum=54;
+-----------+---------------+---------+------------+-----------+
| p_partnum | p_desc | p_price | o_name | o_partnum |
+-----------+---------------+---------+------------+-----------+
| NULL | NULL | NULL | TRUE-WHEEL | 76 |
| NULL | NULL | NULL | TRUE-WHEEL | 10 |
| NULL | NULL | NULL | TRUE-WHEEL | 42 |
| 54 | PEDALS | 54.25 | BIKE SPEC | 54 |
| 42 | SEATS | 24.50 | BIKE SPEC | 54 |
| 46 | TIRES | 15.25 | BIKE SPEC | 54 |
| 23 | MOUNTAIN BIKE | 350.45 | BIKE SPEC | 54 |
| 76 | ROAD BIKE | 530.00 | BIKE SPEC | 54 |
| 10 | TANDEM | 1200.00 | BIKE SPEC | 54 |
| NULL | NULL | NULL | BIKE SPEC | 23 |
| NULL | NULL | NULL | BIKE SPEC | 76 |
| NULL | NULL | NULL | LE SHOPPE | 76 |
| NULL | NULL | NULL | LE SHOPPE | 10 |
| NULL | NULL | NULL | AAA BIKE | 10 |
| NULL | NULL | NULL | AAA BIKE | 76 |
| NULL | NULL | NULL | AAA BIKE | 46 |
| NULL | NULL | NULL | JACKS BIKE | 76 |
| NULL | NULL | NULL | TRUE-WHEEL | 23 |
| NULL | NULL | NULL | BIKE SPEC | 20 |
+-----------+---------------+---------+------------+-----------+
19 rows in set (0.00 sec)
Why am I getting the extra rows? Why does it combine the row of orders with partnum=54 with all the rows of part?
Your query is
select p.partnum p_partnum,p.description p_desc,p.price p_price,o.name o_name,o.partnum o_partnum
from part p right outer join orders o on o.partnum=54;
You are using a right join, so every row of the table on the right side (orders) is displayed even when it has no match in the joined table, and that is what is happening in your query.
The same goes for a left join: all the rows of the table on the left side are displayed even if they don't match.
See http://www.w3schools.com/sql/sql_join_right.asp for a detailed explanation.
Hope this helps.
FROM part p
RIGHT OUTER JOIN orders o
ON o.partnum=54
...only has a condition on orders. What you need to add is a condition that the part also corresponds to the order; otherwise the database will consider any part a match:
FROM part p
RIGHT OUTER JOIN orders o
ON o.partnum=54
AND o.partnum = p.partnum
Of course, if you only want to show the rows where partnum=54, you're better off moving the o.partnum=54 to a WHERE condition instead. JOIN conditions are usually for connecting tables, WHERE usually for filtering.
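To make the ON versus WHERE difference concrete, here is a minimal sketch that replays the question's data. It runs on SQLite through Python's sqlite3 module rather than on MySQL, and it writes the join as orders LEFT JOIN part, which is the mirror image of part RIGHT JOIN orders (older SQLite versions have no RIGHT JOIN). The helper name run and the variable names are made up for the illustration.

```python
import sqlite3

# In-memory copy of the two tables from the question (only the columns used).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (partnum INTEGER, description TEXT, price REAL);
    CREATE TABLE orders (name TEXT, partnum INTEGER);
""")
conn.executemany("INSERT INTO part VALUES (?, ?, ?)", [
    (54, "PEDALS", 54.25), (42, "SEATS", 24.50), (46, "TIRES", 15.25),
    (23, "MOUNTAIN BIKE", 350.45), (76, "ROAD BIKE", 530.00),
    (10, "TANDEM", 1200.00),
])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [
    ("TRUE-WHEEL", 76), ("TRUE-WHEEL", 10), ("TRUE-WHEEL", 42),
    ("BIKE SPEC", 54), ("BIKE SPEC", 23), ("BIKE SPEC", 76),
    ("LE SHOPPE", 76), ("LE SHOPPE", 10), ("AAA BIKE", 10),
    ("AAA BIKE", 76), ("AAA BIKE", 46), ("JACKS BIKE", 76),
    ("TRUE-WHEEL", 23), ("BIKE SPEC", 20),
])

def run(join_and_filter):
    """Hypothetical helper: keep every order, vary only the ON/WHERE part."""
    sql = ("SELECT p.partnum, o.name, o.partnum "
           "FROM orders o LEFT JOIN part p " + join_and_filter)
    return conn.execute(sql).fetchall()

# 1. Constant-only ON: the order for part 54 pairs with every part row.
on_constant_only = run("ON o.partnum = 54")
# 2. ON also links the tables: every order once, part columns filled for 54 only.
on_with_link = run("ON o.partnum = 54 AND o.partnum = p.partnum")
# 3. Link in ON, filter in WHERE: only the order for part 54 remains.
filter_in_where = run("ON o.partnum = p.partnum WHERE o.partnum = 54")

print(len(on_constant_only), len(on_with_link), len(filter_in_where))
```

The first variant reproduces the 19 rows from the question (13 unmatched orders plus 1 order times 6 parts). Which of the other two is wanted depends on whether the remaining orders should still be listed.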
Because you RIGHT JOIN to orders with the join condition o.partnum=54, you get every row of orders, and rows of part are attached only where o.partnum=54. One order row has partnum=54, and because the condition says nothing about part, that row is joined with every row of part (effectively a cross join).
You're using o.partnum = 54 in the ON condition; that's why you're getting extra rows in your result.
You need to put the condition o.partnum = 54 in the WHERE clause.
How about this
select p.partnum p_partnum,p.description p_desc,p.price p_price,o.name o_name,o.partnum o_partnum
from part p right outer join orders o
on o.partnum = p.partnum
where o.partnum=54;
Edit: You may refer to this beautiful article on how the result set changes when you put a condition in the ON clause versus the WHERE clause.
I have a mysql table that holds data for team games.
Objective:
Count the number of times each other SquadID has shared the same Team value as SquadID=21
// Selections table
+--------+---------+------+
| GameID | SquadID | Team |
+--------+---------+------+
| 1 | 5 | A |
| 1 | 7 | B |
| 1 | 11 | A |
| 1 | 21 | A |
| 2 | 5 | A |
| 2 | 7 | B |
| 2 | 11 | A |
| 2 | 21 | A |
| 3 | 5 | A |
| 3 | 7 | B |
| 3 | 11 | A |
| 3 | 21 | A |
| 4 | 5 | A |
| 4 | 11 | B |
| 4 | 21 | A |
| 5 | 5 | A |
| 5 | 11 | B |
| 5 | 21 | A |
| 6 | 5 | A |
| 6 | 11 | B |
| 6 | 21 | A |
+--------+---------+------+
// Desired Result
+---------+----------+
| SquadID | TeamMate |
+---------+----------+
| 5 | 6 |
| 7 | 0 |
| 11 | 3 |
| 21 | 6 |
+---------+----------+
I've attempted to use a subquery specifying the player I wish to compare with, and because this subquery returns multiple rows, I've used IN instead of =.
// Current Query
SELECT
SquadID,
COUNT(Team IN (SELECT Team FROM selections WHERE SquadID=21) AND GameID IN (SELECT GameID FROM selections WHERE SquadID=21)) AS TeamMate
FROM
selections
GROUP BY
SquadID;
The result I'm getting is the number of Games a user has played rather than the number of games a user has been on the same team as SquadID=21
// Current Result
+---------+----------+
| SquadID | TeamMate |
+---------+----------+
| 5 | 6 |
| 7 | 3 |
| 11 | 6 |
| 21 | 6 |
+---------+----------+
What am I missing?
// DESCRIBE selections;
+---------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| GameID | int(11) | NO | PRI | 0 | |
| SquadID | int(4) | NO | PRI | NULL | |
| Team | char(1) | NO | | NULL | |
| TeamID | int(11) | NO | | 1 | |
+---------+---------+------+-----+---------+-------+
The general rule is to avoid nested selects and look for a better way of logically arranging joins. Let's look at a self-join:
From selections s1
inner join selections s2 on s1.gameid = s2.gameid and s1.team = s2.team
This will produce a list of each squadID paired with every squadID it participated with (i.e. they were in the same game and on the same team). We are only interested in the times where the squad participated with squad 21, so add a where clause:
where s2.squadid = 21
Then it's simply choosing the field/count you want:
select s1.squadid, count(1) as teammate
Any aggregate needs a group by:
group by s1.squadid
Combine it together and give it a go. Oddly, this will produce a list where squad 21 shows as playing on its own team all 6 times. Adding a further condition to the where clause can eliminate this:
and s1.squadid <> s2.squadid
SELECT SquadID, count(t1.team) as TeamMate
FROM selections as t1 join
(select distinct team, gameid from selections where SquadID=21) as t2
on t1.Team=t2.Team and t1.Gameid=t2.Gameid
GROUP BY SquadID
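Two things are worth spelling out. First, COUNT(expr) counts every row where expr is not NULL, and a false comparison evaluates to 0, not NULL, so the query in the question counts all of a squad's rows. Second, both join-based queries above use an inner join, so a squad that never shared a team with squad 21 (squad 7 here) disappears instead of showing 0. A minimal sketch of a LEFT JOIN variant follows; it runs on SQLite through Python's sqlite3 module rather than on MySQL, with the data copied from the question and the variable names made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE selections (GameID INTEGER, SquadID INTEGER, Team TEXT)"
)

# Rows from the question: squad 7 only plays games 1-3, squad 11 switches
# from team A to team B after game 3.
data = []
for game in (1, 2, 3):
    data += [(game, 5, "A"), (game, 7, "B"), (game, 11, "A"), (game, 21, "A")]
for game in (4, 5, 6):
    data += [(game, 5, "A"), (game, 11, "B"), (game, 21, "A")]
conn.executemany("INSERT INTO selections VALUES (?, ?, ?)", data)

# Keep every row of s1; attach squad 21's row only when it is in the same
# game and on the same team. COUNT(s2.SquadID) skips the NULLs produced by
# unmatched rows, so squads with no shared games get 0.
team_mates = conn.execute("""
    SELECT s1.SquadID, COUNT(s2.SquadID) AS TeamMate
    FROM selections s1
    LEFT JOIN selections s2
           ON s2.GameID = s1.GameID
          AND s2.Team = s1.Team
          AND s2.SquadID = 21
    GROUP BY s1.SquadID
    ORDER BY s1.SquadID
""").fetchall()

print(team_mates)  # [(5, 6), (7, 0), (11, 3), (21, 6)]
```

The condition on squad 21 has to sit in the ON clause; moving it to WHERE would discard the unmatched rows and turn the query back into an inner join.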
This query
SELECT station_id, station_name,
COUNT(event_station) as `total_visit_count`
FROM taps AS t
JOIN event_stations AS s
ON t.event_station = s.station_id
WHERE s.event_id=6
GROUP BY s.station_id
ORDER BY s.station_id;
returns
+------------+--------------+-------------------+
| station_id | station_name | total_visit_count |
+------------+--------------+-------------------+
| 5 | Station one | 24 |
| 6 | Station two | 35 |
| 7 | St. Pancras | 34 |
+------------+--------------+-------------------+
which is just fine.
However, there are some stations which have not been visited (they have no rows in taps) and I would like them to be shown with a total_visit_count of zero.
+------------+--------------+-------------------+
| station_id | station_name | total_visit_count |
+------------+--------------+-------------------+
| 5 | Station one | 24 |
| 6 | Station two | 35 |
| 7 | St. Pancras | 34 |
| 8 | Station four | 0 |
+------------+--------------+-------------------+
How do I rewrite my query to do that? I imagine some kind of JOIN is required, but I can't quite see it :-(
[Update]
describe event_Stations;
+--------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+------------+------+-----+---------+----------------+
| station_id | int(11) | NO | PRI | NULL | auto_increment |
| event_id | int(11) | NO | | NULL | |
| station_name | text | NO | | NULL | |
| allocated | tinyint(1) | NO | | 0 | |
+--------------+------------+------+-----+---------+----------------+
4 rows in set (0.20 sec)
describe taps;
+---------------+-----------+------+-----+-------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-----------+------+-----+-------------------+-------+
| tag_id | int(11) | NO | | NULL | |
| time_stamp | timestamp | NO | | CURRENT_TIMESTAMP | |
| event_station | int(11) | NO | | NULL | |
| device_id | text | YES | | NULL | |
| device_type | text | YES | | NULL | |
| event_id | int(11) | NO | | NULL | |
+---------------+-----------+------+-----+-------------------+-------+
6 rows in set (0.00 sec)
select * from event_stations where event_id=6;
+------------+----------+-----------------+-----------+
| station_id | event_id | station_name | allocated |
+------------+----------+-----------------+-----------+
| 5 | 6 | Station one | 0 |
| 6 | 6 | Station two | 0 |
| 7 | 6 | St. Pancras | 0 |
| 8 | 6 | Station three | 0 |
| 9 | 6 | Station four | 0 |
| 10 | 6 | Station five | 0 |
| 11 | 6 | Station six | 0 |
| 12 | 6 | Station seven | 0 |
| 13 | 6 | Station eight | 0 |
| 14 | 6 | Station nine | 0 |
| 15 | 6 | Station ten | 0 |
| 16 | 6 | Station eleven | 0 |
+------------+----------+-----------------+-----------+
12 rows in set (0.00 sec)
First, swap the order of your join, so the primary table is listed first (this is for organizational purposes only).
Then, use a LEFT JOIN to accomplish what you're looking for. This will ensure you pull all event_stations records (the left portion of the join), even if there is no corresponding record in the taps table (the right portion of the join). In place of the missing taps, you'll get NULL values.
COUNT will ignore nulls in aggregate, so will only return the count of non-null records. Thus, it will return 0 for your missing event_stations records.
SELECT
station_id,
station_name,
COUNT(event_station) as `total_visit_count`
FROM event_stations AS s
LEFT JOIN taps AS t
ON t.event_station = s.station_id
WHERE s.event_id = 6
GROUP BY s.station_id
ORDER BY s.station_id;
Alternatively, you could just use a RIGHT JOIN with your original join order. I personally don't like doing that, though, because I'm a LTR reader (first in order is more important).
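As a quick check of the COUNT behaviour described above, here is a small sketch on SQLite via Python's sqlite3 module (not MySQL). The station rows follow the question, but the taps rows and their tag_id values are invented for the example. It also shows why COUNT(*) would be the wrong choice here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event_stations (
        station_id INTEGER PRIMARY KEY, event_id INTEGER, station_name TEXT);
    CREATE TABLE taps (tag_id INTEGER, event_station INTEGER);

    INSERT INTO event_stations VALUES
        (5, 6, 'Station one'), (6, 6, 'Station two'),
        (7, 6, 'St. Pancras'), (8, 6, 'Station three');
    -- Invented sample taps: station 8 is never visited.
    INSERT INTO taps VALUES
        (101, 5), (102, 5), (103, 6), (104, 7), (105, 7), (106, 7);
""")

# COUNT(column) ignores the NULL that the LEFT JOIN yields for station 8.
visits = conn.execute("""
    SELECT s.station_id, s.station_name,
           COUNT(t.event_station) AS total_visit_count
    FROM event_stations AS s
    LEFT JOIN taps AS t ON t.event_station = s.station_id
    WHERE s.event_id = 6
    GROUP BY s.station_id
    ORDER BY s.station_id
""").fetchall()

# COUNT(*) counts rows, including the single all-NULL row for station 8.
row_counts = conn.execute("""
    SELECT s.station_id, COUNT(*)
    FROM event_stations AS s
    LEFT JOIN taps AS t ON t.event_station = s.station_id
    WHERE s.event_id = 6
    GROUP BY s.station_id
    ORDER BY s.station_id
""").fetchall()

print(visits)
print(row_counts)
```

Note that the WHERE clause here only touches the left table. A WHERE condition on a taps column would remove the NULL rows again, so such a filter would belong in the ON clause instead.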
I have several tables that I combine in an application I'm creating in PHP that essentially creates a check list. I realize I could solve this problem using a conditional in PHP, but am curious if MySQL is capable of accomplishing this and if so, how? Specifically, I have four tables which are queried using the following statement:
SELECT
cl_status.status,
users.user_first,
cl_status.date AS status_date,
cl_status.id AS status_id,
cl_status.criteria_id,
cl_criteria.id AS cid,
cl_criteria.description AS description
FROM cl_criteria
LEFT JOIN cl_lists
ON cl_criteria.cl_id = cl_lists.id
RIGHT JOIN cl_status
ON cl_criteria.id = cl_status.criteria_id
LEFT JOIN users
ON cl_status.user_id = users.user_id
WHERE cl_lists.id = '1'
Table one - cl_lists:
+----+------------------+------------+------------+-------+
| id | title | date | comp_level | owner |
+----+------------------+------------+------------+-------+
| 1 | Newcomer's guide | 1452473606 | 1 | 1 |
+----+------------------+------------+------------+-------+
Table two - cl_assign:
+----+-------+-------+------------+
| id | cl_id | owner | date |
+----+-------+-------+------------+
| 1 | 1 | 1 | 1455843514 |
+----+-------+-------+------------+
Table three - cl_status:
+----+-------------+---------+-------------+------------+--------+----------+
| id | criteria_id | user_id | description | date | status | comments |
+----+-------------+---------+-------------+------------+--------+----------+
| 2 | 66 | 1 | | NULL | 1 | NULL |
| 15 | 65 | 1 | | 1455842197 | 5 | NULL |
| 16 | 67 | 1 | | 1455842201 | 5 | NULL |
| 17 | 68 | 1 | | 1455842203 | 5 | NULL |
| 18 | 69 | 1 | | 1455842217 | 0 | NULL |
| 19 | 70 | 1 | | 1455842222 | 5 | NULL |
| 20 | 72 | 1 | | 1455842237 | 1 | NULL |
| 21 | 71 | 1 | | 1455842234 | 0 | NULL |
| 22 | 73 | 1 | | 1455842246 | 5 | NULL |
| 23 | 76 | 1 | | 1455842249 | 5 | NULL |
| 24 | 77 | 1 | | 1455842268 | 5 | NULL |
| 25 | 78 | 152 | | 1455854420 | 3 | NULL |
| 26 | 81 | 1 | | 1455843660 | 5 | NULL |
+----+-------------+---------+-------------+------------+--------+----------+
Table four - users:
+---------+------------+
| user_id | user_first |
+---------+------------+
| 1 | Mark |
| 2 | Test |
+---------+------------+
Ideally, I'd like the join to look like this:
+--------+------------+-------------+-----------+-------------+------+-----------------------------+
| status | user_first | status_date | status_id | criteria_id | cid | description |
+--------+------------+-------------+-----------+-------------+------+-----------------------------+
| 5 | Mark | 1455842197 | 15 | 65 | 65 | Tour of facility |
| 5 | Mark | 1455842201 | 16 | 67 | 67 | Tax forms |
| 5 | Mark | 1455842203 | 17 | 68 | 68 | 2 forms of ID |
| 0 | Mark | 1455842217 | 18 | 69 | 69 | Benefits |
| 5 | Mark | 1455842246 | 22 | 73 | 73 | Intro to policies |
| 5 | Mark | 1455842249 | 23 | 76 | 76 | Setup email account |
| NULL | NULL | NULL | NULL | 78 | 78 | Setup Computer account |
+--------+------------+-------------+-----------+-------------+------+-----------------------------+
However, it looks like this:
+--------+------------+-------------+-----------+-------------+------+-----------------------------+
| status | user_first | status_date | status_id | criteria_id | cid | description |
+--------+------------+-------------+-----------+-------------+------+-----------------------------+
| 5 | Mark | 1455842197 | 15 | 65 | 65 | Tour of facility |
| 5 | Mark | 1455842201 | 16 | 67 | 67 | Tax forms |
| 5 | Mark | 1455842203 | 17 | 68 | 68 | 2 forms of ID |
| 0 | Mark | 1455842217 | 18 | 69 | 69 | Benefits |
| 5 | Mark | 1455842246 | 22 | 73 | 73 | Intro to policies |
| 5 | Mark | 1455842249 | 23 | 76 | 76 | Setup email account |
| 3 | Temp | 1455854420 | 25 | 78 | 78 | Setup Computer account |
+--------+------------+-------------+-----------+-------------+------+-----------------------------+
Is there a way to apply the conditional before the join? Or another way to accomplish the result set that I want?
EDIT
This is a screenshot of what the application looks like:
The criteria table will include steps of every checklist I have. The list table is a list of the various checklists. The status table allows every user (such as Mark, or Test) to look at the same checklist and complete it as if it was a separate document. It also populates the date/time that the item was updated by that user.
I suspect that the RIGHT JOIN in your query is filtering out the records you want to see. Remember that t1 RIGHT JOIN t2 is the same as t2 LEFT JOIN t1, meaning that t1 loses any record which does not appear in t2, while t2 keeps all its records. Try this:
SELECT cl_status.status, users.user_first, cl_status.date AS status_date,
cl_status.id AS status_id, cl_status.criteria_id, cl_criteria.id AS cid,
cl_criteria.description AS description
FROM cl_criteria LEFT JOIN cl_lists
ON cl_criteria.cl_id = cl_lists.id
LEFT JOIN cl_status
ON cl_criteria.id = cl_status.criteria_id
LEFT JOIN users
ON cl_status.user_id = users.user_id
WHERE cl_lists.id = '1'
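On "applying the conditional before the join": the question does not show which condition is meant, so the sketch below assumes it is a per-user filter such as cl_status.user_id = 1. Under that assumption, putting the filter in the ON clause of the LEFT JOIN keeps every criterion and leaves the status columns NULL when the status row belongs to someone else, whereas the same filter in WHERE drops the criterion. It runs on SQLite via Python's sqlite3 module, with a two-row sample cut down from the question; the cl_criteria layout is inferred from the query and the variable names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column layout of cl_criteria is inferred from the query (assumption).
    CREATE TABLE cl_criteria (id INTEGER PRIMARY KEY, cl_id INTEGER,
                              description TEXT);
    CREATE TABLE cl_status (id INTEGER PRIMARY KEY, criteria_id INTEGER,
                            user_id INTEGER, status INTEGER);

    INSERT INTO cl_criteria VALUES
        (65, 1, 'Tour of facility'), (78, 1, 'Setup Computer account');
    -- Criterion 65 was updated by user 1, criterion 78 by user 152.
    INSERT INTO cl_status VALUES (15, 65, 1, 5), (25, 78, 152, 3);
""")

# Filter in WHERE: applied after the join, so criterion 78 is removed.
filter_in_where = conn.execute("""
    SELECT c.id, st.status
    FROM cl_criteria c
    LEFT JOIN cl_status st ON st.criteria_id = c.id
    WHERE c.cl_id = 1 AND st.user_id = 1
    ORDER BY c.id
""").fetchall()

# Filter in ON: applied while matching, so criterion 78 stays with NULLs.
filter_in_on = conn.execute("""
    SELECT c.id, st.status
    FROM cl_criteria c
    LEFT JOIN cl_status st
           ON st.criteria_id = c.id
          AND st.user_id = 1      -- assumed per-user condition
    WHERE c.cl_id = 1
    ORDER BY c.id
""").fetchall()

print(filter_in_where)  # [(65, 5)]
print(filter_in_on)     # [(65, 5), (78, None)]
```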
I'm trying to produce a formula which pits our students' reward points against their negative behaviour flags.
Students are given LEAP points (in the transactions table) for their positive behaviour. They get more points depending on the category of their reward, i.e. Model Citizen gives the student 10 points.
On the other hand, students are given single Flags for negative behaviour. The category of the Flag is then weighted in a database table, i.e. the Aggressive Defiance category will have a high weighting of 4 whereas Low Level Disruption will only be worth 1.
The difficulty therefore is trying to factor in the Flag categories' weightings. They're stored in the categories table under the Weight column.
Here's the SQL fiddle: http://sqlfiddle.com/#!2/2e5756
In my head, the pseudo-SQL code would look something like this...
SELECT
CONCAT( stu.Surname, ", ", stu.Firstname ) AS `Student`,
SUM(t.Points) AS `LEAP Points`,
SUM(<<formula>>) AS `Flags`,
( `LEAP Points` - `Flags` ) AS `Worked Out Points Thing`
FROM student stu
LEFT JOIN transactions t ON t.Recipient_ID = stu.id
LEFT JOIN flags f ON f.Student_ID = stu.id
LEFT JOIN categories c ON f.Category_ID = c.ID
GROUP BY stu.id
However, it's the <<formula>> that I have no idea how to implement in MySQL. It needs to be something like this:
SUM OF[ Each of Student's Flags * that Flag's Category Weighting ]
So, if a student has these flags...
#1 f.Reason "Being naughty", f.Category_ID "1", c.Title "Low Level Disruption", c.Weight "1"
#1 Reason "Aggressively naughty!", Category "Aggressive Defiance", Category Weighting "4"
#1 Reason "Missed detention", Category "Missed Detention", Category Weighting "3"
They would have a total of 1+4+3 = 9 points to use in the Worked Out Points Thing equation.
The desired output therefore is essentially...
Student LEAP Points Flags Equation Points LEAP Points minus Flag Points
D Wraight 1000 800 200
D Wraight2 500 800 -300
D Wraight3 1200 300 900
From the SQL fiddle above, here is the required output.. I've missed out some students because I had to work these out manually:
STUDENT FLAGS LEAP EQUATION
137608 4 (2+2) 12 (2+5+5) 8 (12-4)
139027 2 (2) 7 (2+5) 5 (7-2)
139041 4 (2+1+1+NULL) 8 (2+2+2+2) 4 (8-4)
139892 4 (4) 0 -4 (0-4)
138832 4 (4) 0 -4 (0-4)
34533 4 (4) 0 -4 (0-4)
137434 0 10 (2*5) 10 (10-0)
Which will help us to work out the choices we make available to each student when looking at end of year reward trips.
Hope that makes sense.. it's kinda boggled my head trying to explain it..
Thanks in advance,
figure out your 'formula' bit first because it's the deepest part. work outwards.
build a table of flags * weight per student
select sum(weight), student_id from flags f
join categories c
on f.category_id = c.id
group by student_id
so now you've got a table of flag values to subtract from the sum of transactions per student
select sum(points), recipient_id from transactions
group by recipient_id
so now we have two tables with positive and negative values by student id (assuming obviously that student id is recipient id)
you want those with transactions but without flags to appear in the result, so outer join.
and number minus null is null so ifnull function on the flags to get 0
select a.student, points - ifnull(penalties, 0) as netPoints
from
(select sum(points) as points, recipient_id as student from transactions
group by student) as a
left outer join
(select sum(weight) as penalties, student_id as student from flags f
join categories c
on f.category_id = c.id
group by student) as b
on
a.student = b.student
so with the name in there it's just
select
concat(firstname, ', ', surname) as name,
ifnull(points,0) as totalPoints,
ifnull(penalties,0) as totalPenalties,
ifnull(points,0) - ifnull(penalties, 0) as netPoints,
ifnull(countFlags, 0)
from
student
left join
(select sum(points) as points, recipient_id as student from transactions
group by student) as a
on student.id = a.student
left join
(select sum(weight) as penalties, count(f.id) as countFlags, student_id as student from flags f
join categories c
on f.category_id = c.id
group by student) as b
on
student.id = b.student
join condition is always from student's id column, which is never null.
there are probably more efficient ways, but who cares?
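The reason for aggregating each side in its own subquery before joining is that joining transactions and flags directly to student pairs every transaction with every flag, which inflates both sums. Here is a rough check of that, on SQLite via Python's sqlite3 module rather than MySQL, using four of the students from the data in the question (tables trimmed to the columns needed; the alias and variable names are made up).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY);
    CREATE TABLE categories (ID INTEGER PRIMARY KEY, Weight INTEGER);
    CREATE TABLE flags (ID INTEGER PRIMARY KEY, Student_ID INTEGER,
                        Category_ID INTEGER);
    CREATE TABLE transactions (Transaction_ID INTEGER PRIMARY KEY,
                               Recipient_ID INTEGER, Points INTEGER);

    INSERT INTO student VALUES (137608), (139041), (139892), (137434);
    INSERT INTO categories VALUES (12, 1), (16, 2), (21, 4), (34, NULL);
    INSERT INTO flags VALUES
        (8843, 137608, 16), (8844, 137608, 16),
        (8846, 139041, 16), (8847, 139041, 12),
        (8848, 139041, 12), (8849, 139041, 34),
        (8850, 139892, 21);
    INSERT INTO transactions VALUES
        (34, 137608, 2), (48, 137608, 5), (53, 137608, 5),
        (36, 139041, 2), (37, 139041, 2), (38, 139041, 2), (39, 139041, 2),
        (40, 137434, 2), (41, 137434, 2), (42, 137434, 2),
        (43, 137434, 2), (44, 137434, 2);
""")

# Aggregate each side first, then join one row per student.
net_points = conn.execute("""
    SELECT s.id,
           IFNULL(a.points, 0)    AS leap,
           IFNULL(b.penalties, 0) AS flag_points,
           IFNULL(a.points, 0) - IFNULL(b.penalties, 0) AS net
    FROM student s
    LEFT JOIN (SELECT Recipient_ID AS student_id, SUM(Points) AS points
               FROM transactions
               GROUP BY Recipient_ID) a ON a.student_id = s.id
    LEFT JOIN (SELECT f.Student_ID AS student_id, SUM(c.Weight) AS penalties
               FROM flags f
               JOIN categories c ON c.ID = f.Category_ID
               GROUP BY f.Student_ID) b ON b.student_id = s.id
    ORDER BY s.id
""").fetchall()

# Naive version: 3 transactions x 2 flags = 6 rows for student 137608,
# so the 12 points and 4 flag points come out as 24 and 12.
naive = conn.execute("""
    SELECT SUM(t.Points), SUM(c.Weight)
    FROM student s
    LEFT JOIN transactions t ON t.Recipient_ID = s.id
    LEFT JOIN flags f ON f.Student_ID = s.id
    LEFT JOIN categories c ON c.ID = f.Category_ID
    WHERE s.id = 137608
""").fetchone()

print(net_points)
print(naive)
```

The aggregated version matches the hand-worked figures in the question for these students (8, 4, -4 and 10). A flag whose category has a NULL weight simply contributes nothing, because SUM skips NULLs.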
Returning to the question (and at the risk of repeating myself!), given the following data set, what would the desired result set look like...
SELECT * FROM flags;
+------+------------+----------+---------------------+-----------+-------------+--------------------------+---------------------+
| ID | Student_ID | Staff_ID | Datetime | Period_ID | Category_ID | Action_Taken_Category_ID | Action_Taken_Status |
+------+------------+----------+---------------------+-----------+-------------+--------------------------+---------------------+
| 8843 | 137608 | 35003 | 2014-03-11 08:31:00 | 8 | 16 | 7 | P |
| 8844 | 137608 | 35003 | 2014-03-11 08:31:00 | 8 | 16 | 7 | P |
| 8845 | 139027 | 35003 | 2014-03-11 08:31:00 | 8 | 16 | 7 | P |
| 8846 | 139041 | 35003 | 2014-03-11 08:31:00 | 8 | 16 | 7 | P |
| 8847 | 139041 | 34961 | 2014-03-11 09:01:02 | 2 | 12 | 26 | P |
| 8848 | 139041 | 34996 | 2014-03-11 09:23:21 | 3 | 12 | 27 | C |
| 8849 | 139041 | 35022 | 2014-03-11 11:07:46 | 4 | 34 | 28 | P |
| 8850 | 139892 | 138439 | 2014-03-11 11:12:47 | 4 | 21 | 7 | C |
| 8851 | 138832 | 138439 | 2014-03-11 11:12:48 | 4 | 21 | 7 | C |
| 8852 | 34533 | 138439 | 2014-03-11 11:12:48 | 4 | 21 | 7 | C |
+------+------------+----------+---------------------+-----------+-------------+--------------------------+---------------------+
SELECT * FROM categories;
+----+------+--------------------------------------+--------+----------+
| ID | Type | Title | Weight | Added_By |
+----+------+--------------------------------------+--------+----------+
| 10 | F | Low level disruption | 1 | NULL |
| 11 | F | Swearing directly at another student | 2 | NULL |
| 12 | F | Late | 1 | NULL |
| 13 | F | Absconded | 3 | NULL |
| 14 | F | Refusal to follow instruction | 3 | NULL |
| 15 | F | Smoking | 2 | NULL |
| 16 | F | No homework | 2 | NULL |
| 17 | F | Disruptive outside classroom | 2 | NULL |
| 18 | F | Eating/drinking in lesson | 1 | NULL |
| 19 | F | Incorrect uniform/equipment | 1 | NULL |
| 20 | F | Phone out in lesson | 3 | NULL |
| 21 | F | Aggressive defiance | 4 | NULL |
| 22 | F | Missed detention | 3 | NULL |
| 23 | F | Inappropriate behaviour/comments | 3 | NULL |
| 32 | F | IT Misuse | NULL | NULL |
| 34 | F | Inappropriate attitude towards staff | NULL | NULL |
| 35 | F | Care & Guidance | NULL | NULL |
+----+------+--------------------------------------+--------+----------+
SELECT * FROM transactions;
+----------------+------------+----------+--------------+--------+-------------+
| Transaction_ID | Datetime | Giver_ID | Recipient_ID | Points | Category_ID |
+----------------+------------+----------+--------------+--------+-------------+
| 34 | 2011-09-07 | 35019 | 137608 | 2 | 1 |
| 35 | 2011-09-07 | 35019 | 139027 | 2 | 1 |
| 36 | 2011-09-07 | 35019 | 139041 | 2 | 1 |
| 37 | 2011-09-07 | 35019 | 139041 | 2 | 1 |
| 38 | 2011-09-07 | 35019 | 139041 | 2 | 1 |
| 39 | 2011-09-07 | 35019 | 139041 | 2 | 1 |
| 40 | 2011-09-07 | 35019 | 137434 | 2 | 1 |
| 41 | 2011-09-07 | 35019 | 137434 | 2 | 1 |
| 42 | 2011-09-07 | 35019 | 137434 | 2 | 1 |
| 43 | 2011-09-07 | 35019 | 137434 | 2 | 1 |
| 44 | 2011-09-07 | 35006 | 137434 | 2 | 1 |
| 45 | 2011-09-07 | 35006 | 90306 | 2 | 1 |
| 46 | 2011-09-07 | 35006 | 90306 | 2 | 1 |
| 47 | 2011-09-07 | 35006 | 90306 | 2 | 1 |
| 48 | 2011-09-07 | 35023 | 137608 | 5 | 2 |
| 49 | 2011-09-07 | 35023 | 139027 | 5 | 2 |
| 50 | 2011-09-07 | 35023 | 139564 | 5 | 2 |
| 51 | 2011-09-07 | 35023 | 139564 | 5 | 2 |
| 52 | 2011-09-07 | 35023 | 139564 | 5 | 2 |
| 53 | 2011-09-07 | 35023 | 137608 | 5 | 3 |
+----------------+------------+----------+--------------+--------+-------------+
SELECT id,UPN,Year_Group,Tutor_Group,SEN_Status,Flags FROM student;
+--------+---------------+------------+-------------+------------+--------+
| id | UPN | Year_Group | Tutor_Group | SEN_Status | Flags |
+--------+---------------+------------+-------------+------------+--------+
| 137608 | A929238400044 | 11 | 11VID | A | |
| 139027 | A929238401045 | 10 | 10KS | | |
| 139041 | A929238402017 | 10 | 10RJ | A | FSM |
| 139892 | A929238403018 | 9 | 9BW | | |
| 139938 | A929238403020 | 9 | 9RH | | |
| 137434 | A929238500027 | 11 | 11VID | | |
| 138832 | A929238502002 | 10 | 10RY | A | FSM,PA |
| 34533 | A929238599028 | 0 | | | PA |
| 139564 | A929241500025 | 12 | | | PA |
| 90306 | A929253100006 | 12 | SLH | A | PA |
+--------+---------------+------------+-------------+------------+--------+
Maybe this will be an easy one for some of you MySQL masters who see this stuff like a level 3 children's book.
I have multiple tables that I'm joining to produce statistical data for a report and I'm getting tripped up at the moment trying to figure it out. It's obviously imperative the figures are correct because it impacts a number of decisions going forward.
Here's the lay of the land (not the full picture, but you'll get the point):
Affiliate Table
+----+-----------+------------+---------------------+
| id | firstname | lastname | created_date |
+----+-----------+------------+---------------------+
| 1 | Mike | Johnson | 2010-11-22 17:44:37 |
| 2 | Trevor | Wilson | 2010-12-23 16:24:24 |
| 3 | Bob | Parker | 2011-11-04 10:33:49 |
+----+-----------+------------+---------------------+
Now our query should only find results for Bob Parker (id 3) so I'll only show example results for Bob.
Affiliate Link Table
+-----+-----------+--------------+-----------+----------+---------------------+
| id | parent_id | affiliate_id | link_type | linkhash | created_date |
+-----+-----------+--------------+-----------+----------+---------------------+
| 21 | NULL | 3 | PRODUCT | fa2e82a7 | 2011-06-15 16:18:37 |
| 27 | NULL | 3 | PRODUCT | 55de2ae7 | 2011-06-23 01:03:00 |
| 28 | NULL | 3 | PRODUCT | 02cae72f | 2011-06-23 01:03:00 |
| 29 | 27 | 3 | PRODUCT | a4dfb2c8 | 2011-06-23 01:03:00 |
| 30 | 28 | 3 | PRODUCT | 72cea1b2 | 2011-06-23 01:03:00 |
| 36 | 21 | 3 | PRODUCT | fa2e82a7 | 2011-06-23 01:07:03 |
| 59 | 21 | 3 | PRODUCT | ec33413f | 2011-11-04 17:49:17 |
| 60 | 27 | 3 | PRODUCT | f701188c | 2011-11-04 17:49:17 |
| 69 | 21 | 3 | PRODUCT | 6dfb89fd | 2011-11-04 17:49:17 |
+-----+-----------+--------------+-----------+----------+---------------------+
Affiliate Stats
+--------+--------------+---------+----------+----------+---------------------+
| id | affiliate_id | link_id | order_id | type | created_date |
+--------+--------------+---------+----------+----------+---------------------+
| 86570 | 3 | 21 | NULL | CLICK | 2013-01-01 00:07:31 |
| 86574 | 3 | 21 | NULL | PAGEVIEW | 2013-01-01 00:08:53 |
| 86579 | 3 | 21 | 411 | SALE | 2013-01-01 00:09:52 |
| 86580 | 3 | 36 | NULL | CLICK | 2013-01-01 00:09:55 |
| 86582 | 3 | 36 | NULL | PAGEVIEW | 2013-01-01 00:09:56 |
| 86583 | 3 | 28 | NULL | CLICK | 2013-01-01 00:11:04 |
| 86584 | 3 | 28 | NULL | PAGEVIEW | 2013-01-01 00:11:04 |
| 86586 | 3 | 30 | NULL | CLICK | 2013-01-01 00:30:18 |
| 86587 | 3 | 30 | NULL | PAGEVIEW | 2013-01-01 00:30:20 |
| 86611 | 3 | 69 | NULL | CLICK | 2013-01-01 00:40:19 |
| 86613 | 3 | 69 | NULL | PAGEVIEW | 2013-01-01 00:40:19 |
| 86619 | 3 | 69 | 413 | SALE | 2013-01-01 00:42:12 |
| 86622 | 3 | 60 | NULL | CLICK | 2013-01-01 00:46:00 |
| 86624 | 3 | 60 | NULL | PAGEVIEW | 2013-01-01 00:46:01 |
| 86641 | 3 | 60 | NULL | PAGEVIEW | 2013-01-01 00:55:58 |
| 86642 | 3 | 30 | 415 | SALE | 2013-01-01 00:56:35 |
| 86643 | 3 | 28 | NULL | PAGEVIEW | 2013-01-01 00:56:43 |
| 86644 | 3 | 60 | 417 | SALE | 2013-01-01 00:56:52 |
+--------+--------------+---------+----------+----------+---------------------+
Orders
+------+--------------+---------+---------------------+
| id | affiliate_id | total | created_date |
+------+--------------+---------+---------------------+
| 411 | 3 | 138.62 | 2013-01-01 00:09:50 |
| 413 | 3 | 312.87 | 2013-01-01 00:09:52 |
| 415 | 3 | 242.59 | 2013-01-01 00:09:55 |
| 417 | 3 | 171.18 | 2013-01-01 00:09:55 |
+------+--------------+---------+---------------------+
Now the results that I need should look like this (only show main/parent link id)
+---------+---------+
| link_id | total |
+---------+---------+
| 21 | 451.49 | <- 1 order from parent (21), 1 from child (69)
| 27 | 171.18 | <- 1 order from child (60)
| 28 | 242.59 | <- 1 order from child (30)
+---------+---------+
I'm not quite sure how to write the query so that I can sum where affiliate_link.id and affiliate_link.parent_id are combined. Is this even possible with a couple of JOINs and GROUPing?
I'm not too sure why you have denormalised affiliate_id (by placing it in each table) and, therefore, whether one can rely on all Stats and Orders that stem from a particular Link to have the same affiliate_id as that Link.
If it's possible, I'd suggest changing the AffiliateLink.parent_id column such that parent records point to themselves (rather than NULL):
UPDATE AffiliateLink SET parent_id = id WHERE parent_id IS NULL
Then it's a simple case of joining and grouping:
SELECT AffiliateLink.parent_id AS link_id,
SUM(Orders.total) AS total
FROM AffiliateLink
JOIN AffiliateStats ON AffiliateStats.link_id = AffiliateLink.id
JOIN Orders ON Orders.id = AffiliateStats.order_id
WHERE AffiliateLink.affiliate_id = 3
GROUP BY AffiliateLink.parent_id
See it on sqlfiddle.
If it's not possible to make the change, you can effectively create the resulting AffiliateLink table using UNION (but beware the performance implications, as MySQL will not be able to use indexes on the result):
(
SELECT parent_id, id, affiliate_id FROM AffiliateLink WHERE parent_id IS NOT NULL
UNION ALL
SELECT id , id, affiliate_id FROM AffiliateLink WHERE parent_id IS NULL
) AS AffiliateLink
See it on sqlfiddle.
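A third option, not part of the answer above, is to leave the data as it is and group on COALESCE(parent_id, id), which maps a parent link to itself and a child link to its parent. The following is a minimal sketch on SQLite via Python's sqlite3 module (not MySQL), using the table names from the answer, the rows from the question that matter for sales, and the made-up alias root_link_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AffiliateLink (id INTEGER PRIMARY KEY, parent_id INTEGER,
                                affiliate_id INTEGER);
    CREATE TABLE AffiliateStats (id INTEGER PRIMARY KEY, affiliate_id INTEGER,
                                 link_id INTEGER, order_id INTEGER, type TEXT);
    CREATE TABLE Orders (id INTEGER PRIMARY KEY, affiliate_id INTEGER,
                         total REAL);

    INSERT INTO AffiliateLink VALUES
        (21, NULL, 3), (27, NULL, 3), (28, NULL, 3), (29, 27, 3), (30, 28, 3),
        (36, 21, 3), (59, 21, 3), (60, 27, 3), (69, 21, 3);
    INSERT INTO AffiliateStats VALUES
        (86570, 3, 21, NULL, 'CLICK'), (86579, 3, 21, 411, 'SALE'),
        (86611, 3, 69, NULL, 'CLICK'), (86619, 3, 69, 413, 'SALE'),
        (86642, 3, 30, 415, 'SALE'),   (86644, 3, 60, 417, 'SALE');
    INSERT INTO Orders VALUES
        (411, 3, 138.62), (413, 3, 312.87), (415, 3, 242.59), (417, 3, 171.18);
""")

# COALESCE picks parent_id when present, otherwise the link's own id.
# Stats rows without an order (clicks, page views) drop out of the inner join.
totals = conn.execute("""
    SELECT COALESCE(l.parent_id, l.id) AS root_link_id,
           ROUND(SUM(o.total), 2)      AS total
    FROM AffiliateLink l
    JOIN AffiliateStats s ON s.link_id = l.id
    JOIN Orders o ON o.id = s.order_id
    WHERE l.affiliate_id = 3
    GROUP BY COALESCE(l.parent_id, l.id)
    ORDER BY root_link_id
""").fetchall()

for root_link_id, total in totals:
    print(root_link_id, total)
```

This only handles one level of nesting, which is all the sample data shows. Grouping on an expression may also prevent index use, much like the UNION approach.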