I have data recorded by year, and I am trying to compare two years of data (the past year and the current year) within a single MySQL query.
Below are my tables
Cost Items
| cid | items |
| 1 | A |
| 2 | B |
Cost
| cid | amount | year |
| 1 | 10 | 1 |
| 1 | 20 | 2 |
| 1 | 30 | 1 |
This is the result I expect when I compare year 1 and year 2. Year 1 is the past year and year 2 is the current year.
Results
items | pastCost | currentCost |
A | 10 | 20 |
A | 30 | 0 |
However, the query below, which is what I used, gives a strange answer.
SELECT
IFNULL(ps.`amount`, '0') as pastCost
IFNULL(cs.`amount`, '0') as currentCost
FROM
`Cost Items` b
LEFT JOIN
`Cost` ps
ON
b.cID=ps.cID
AND
ps.Year = 1
LEFT JOIN
`Cost` cu
ON
b.cID=cu.cID
AND
cu.Year =2
This is the result I get from my query:
items | pastCost | currentCost |
A | 10 | 20 |
A | 30 | 20 |
Please, what am I doing wrong? Thanks for helping.
I'm missing something about your query; the SQL text shown can't produce that result.
There is no source for the items column in the SELECT list, and there is no table aliased as cs. (It looks like the expression in the SELECT list would need to be cu.amount.)
Aside from that, the results being returned look exactly like what we'd expect. Each row returned from year=2 is being matched with each row returned from year=1. If there were three rows for year=1 and two rows for year=2, we'd get six rows back... each row for year=1 "matched" with each row for year=2.
If the (cid, year) tuple were UNIQUE in Cost, then this query would return a result similar to what you expect.
SELECT b.items
, IFNULL(ps.amount, '0') AS pastCost
, IFNULL(cu.amount, '0') AS currentCost
FROM `Cost Items` b
LEFT
JOIN `Cost` ps
ON ps.cid = b.cid
AND ps.Year = 1
LEFT
JOIN `Cost` cu
ON cu.cid = b.cid
AND cu.Year = 2
Since (cid, year) is not unique, you need some additional column to "match" a single row for year=1 with a single row for year=2.
Without some other column in the table, we could use an inline view to generate a value. I can illustrate how we can make MySQL return a resultset like the one you show, one way that could be done, but I don't think this is really the solution to whatever problem you are trying to solve:
SELECT b.items
, IFNULL(MAX(IF(a.year=1,a.amount,NULL)),0) AS pastCost
, IFNULL(MAX(IF(a.year=2,a.amount,NULL)),0) AS currentCost
FROM `Cost Items` b
LEFT
JOIN ( SELECT @rn := IF(c.cid=@p_cid AND c.year=@p_year,@rn+1,1) AS `rn`
, @p_cid := c.cid AS `cid`
, @p_year := c.year AS `year`
, c.amount
FROM (SELECT @p_cid := NULL, @p_year := NULL, @rn := 0) i
JOIN `Cost` c
ON c.year IN (1,2)
ORDER BY c.cid, c.year, c.amount
) a
ON a.cid = b.cid
GROUP
BY b.cid
, a.rn
A query something like that would return a resultset that looks like the one you are expecting. But again, I strongly suspect that this is not the resultset you are really looking for.
EDIT
OP leaves comment with vaguely nebulous report of observed behavior: "the above solution doesnt work"
Well then, let's check it out... create a SQL Fiddle with some tables so we can test the query...
SQL Fiddle here http://sqlfiddle.com/#!9/e3d7e/1
create table `Cost Items` (cid int unsigned, items varchar(5));
insert into `Cost Items` (cid, items) values (1,'A'),(2,'B');
create table `Cost` (cid int unsigned, amount int, year int);
insert into `Cost` (cid, amount, year) VALUES (1,10,1),(1,20,2),(1,30,1);
And when we run the query, we get a syntax error. There's a closing paren missing in the expressions in the SELECT list, which is easy enough to fix.
SELECT b.items
, IFNULL(MAX(IF(a.year=1,a.amount,NULL)),0) AS pastCost
, IFNULL(MAX(IF(a.year=2,a.amount,NULL)),0) AS currentCost
FROM `Cost Items` b
LEFT
JOIN ( SELECT @rn := IF(c.cid=@p_cid AND c.year=@p_year,@rn+1,1) AS `rn`
, @p_cid := c.cid AS `cid`
, @p_year := c.year AS `year`
, c.amount
FROM (SELECT @p_cid := NULL, @p_year := NULL, @rn := 0) i
JOIN `Cost` c
ON c.year IN (1,2)
ORDER BY c.cid, c.year, c.amount
) a
ON a.cid = b.cid
GROUP
BY b.cid
, a.rn
Returns:
items pastCost currentCost
------ -------- -----------
A 10 20
A 30 0
B 0 0
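On MySQL 8.0 or later, the same row pairing can probably be written with the ROW_NUMBER() window function instead of user variables. What follows is only a sketch under stated assumptions: it runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL so that it is self-contained (window functions need SQLite 3.25 or later), the table is named cost_items rather than `Cost Items`, and CASE replaces MySQL's IF().

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; cost_items replaces `Cost Items`.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cost_items (cid INTEGER, items TEXT);
    INSERT INTO cost_items (cid, items) VALUES (1, 'A'), (2, 'B');
    CREATE TABLE cost (cid INTEGER, amount INTEGER, year INTEGER);
    INSERT INTO cost (cid, amount, year) VALUES (1, 10, 1), (1, 20, 2), (1, 30, 1);
""")

query = """
SELECT b.items,
       IFNULL(MAX(CASE WHEN a.year = 1 THEN a.amount END), 0) AS pastCost,
       IFNULL(MAX(CASE WHEN a.year = 2 THEN a.amount END), 0) AS currentCost
FROM cost_items b
LEFT JOIN (
    -- number the rows inside each (cid, year) group: 1, 2, 3, ...
    SELECT c.cid, c.year, c.amount,
           ROW_NUMBER() OVER (PARTITION BY c.cid, c.year ORDER BY c.amount) AS rn
    FROM cost c
    WHERE c.year IN (1, 2)
) a ON a.cid = b.cid
GROUP BY b.cid, b.items, a.rn
ORDER BY b.items, a.rn
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The row number plays the role of the "additional column" mentioned above: the n-th smallest amount of year 1 is paired with the n-th smallest amount of year 2 for the same cid.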
Basically, I need help with my query. I want the rows in the right order: each child must appear under its parent's name, in A-Z order. But when I add a subChild under a child (Split 1), the order seems to be wrong. It should be under Room Rose.
P.S.: A subChild can also have a subChild of its own.
HERE I PROVIDE A DEMO
I would appreciate your help getting this ordered correctly.
SELECT A.venueID
, B.mainVenueID
, A.venueName
FROM tblAdmVenue A
LEFT
JOIN tblAdmVenueLink B
ON A.venueID = B.subVenueID
ORDER
BY COALESCE(B.mainVenueID, A.venueID)
, B.mainVenueID IS NOT NULL
, A.venueID
I want it to return an order something like this.
venueName
--------------
Banquet
Big Room
-Room Daisy
-Room Rose
-Split 1
Hall
-Meeting Room WP
It seems this recursive approach is also not working:
WITH venue_ctg AS (
SELECT A.venueID, A.venueName, B.mainVenueID
FROM tblAdmVenue A LEFT JOIN tblAdmVenueLink B
ON A.venueID = B.subVenueID
WHERE B.mainVenueID IS NULL
UNION ALL
SELECT A.venueID, A.venueName, B.mainVenueID
FROM tblAdmVenue A LEFT JOIN tblAdmVenueLink B
ON A.venueID = B.subVenueID
WHERE B.mainVenueID IS NOT NULL
)
SELECT *
FROM venue_ctg ORDER BY venueName
output given
For your data you can use this:
To display this correctly, you can use a SEPARATOR such as a comma, split the returned data, and check the hierarchy.
-- schema
CREATE TABLE tblAdmVenue (
venueID VARCHAR(225) NOT NULL,
venueName VARCHAR(225) NOT NULL,
PRIMARY KEY(venueID)
);
CREATE TABLE tblAdmVenueLink (
venueLinkID VARCHAR(225) NOT NULL,
mainVenueID VARCHAR(225) NOT NULL,
subVenueID VARCHAR(225) NOT NULL,
PRIMARY KEY(venueLinkID)
-- FOREIGN KEY (DepartmentId) REFERENCES Departments(Id)
);
-- data
INSERT INTO tblAdmVenue (venueID, venueName)
VALUES ('LA43', 'Big Room'), ('LA44', 'Hall'),
('LA45', 'Room Daisy'), ('LA46', 'Room Rose'),
('LA47', 'Banquet'), ('LA48', 'Split 1'),
('LA49', 'Meeting Room WP');
INSERT INTO tblAdmVenueLink (venueLinkID, mainVenueID, subVenueID)
VALUES ('1', 'LA43', 'LA45'), ('2', 'LA43', 'LA46'),
('3', 'LA46', 'LA48'), ('4', 'LA44', 'LA49');
with recursive cte (subVenueID, mainVenueID,level) as (
select subVenueID,
mainVenueID, 1 as level
from tblAdmVenueLink
union
select p.subVenueID,
cte.mainVenueID,
cte.level+1
from tblAdmVenueLink p
inner join cte
on p.mainVenueID = cte.subVenueID
)
select
CONCAT(GROUP_CONCAT(b.venueName ORDER BY level DESC SEPARATOR '-->') ,'-->',a.venueName)
from cte c
LEFT JOIN tblAdmVenue a ON a.venueID = c.subVenueID
LEFT JOIN tblAdmVenue b ON b.venueID = c.mainVenueID
GROUP BY subVenueID;
| CONCAT(GROUP_CONCAT(b.venueName ORDER BY level DESC SEPARATOR '-->') ,'-->',a.venueName) |
| :----------------------------------------------------------------------------------------- |
| Big Room-->Room Daisy |
| Big Room-->Room Rose |
| Big Room-->Room Rose-->Split 1 |
| Hall-->Meeting Room WP |
db<>fiddle here
You want your data ordered in alphabetical order and depth first.
A common solution for this is to traverse the structure from the top element, concatenating the path to each item as you go. You can then directly use the path for ordering.
Here is how to do it in MySQL 8.0 with a recursive query
with recursive cte(venueID, venueName, mainVenueID, path, depth) as (
select v.venueID, v.venueName, cast(null as char(100)), venueName, 0
from tblAdmVenue v
where not exists (select 1 from tblAdmVenueLink l where l.subVenueID = v.venueID)
union all
select v.venueID, v.venueName, c.venueID, concat(c.path, '/', v.venueName), c.depth + 1
from cte c
inner join tblAdmVenueLink l on l.mainVenueID = c.venueID
inner join tblAdmVenue v on v.venueID = l.subVenueID
)
select * from cte order by path
The anchor of the recursive query selects top nodes (ie rows whose ids do not exist in column subVenueID of the link table). Then, the recursive part follows the relations.
As a bonus, I added a depth column that represents the level of each node, starting at 0 for top nodes.
Demo on DB Fiddle:
venueID | venueName | mainVenueID | path | depth
:------ | :-------------- | :---------- | :------------------------- | ----:
LA47 | Banquet | null | Banquet | 0
LA43 | Big Room | null | Big Room | 0
LA45 | Room Daisy | LA43 | Big Room/Room Daisy | 1
LA46 | Room Rose | LA43 | Big Room/Room Rose | 1
LA48 | Split 1 | LA46 | Big Room/Room Rose/Split 1 | 2
LA44 | Hall | null | Hall | 0
LA49 | Meeting Room WP | LA44 | Hall/Meeting Room WP | 1
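The path and depth columns can also be used to build the indented list the question asks for. The sketch below is an illustration under assumptions, not part of the answer above: it uses Python's built-in sqlite3 module as a stand-in for MySQL 8.0, SQLite's `||` operator in place of CONCAT(), a simplified schema, and a hypothetical ten-dash string that caps the indentation at ten levels.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; the schema is simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblAdmVenue (venueID TEXT PRIMARY KEY, venueName TEXT NOT NULL);
    CREATE TABLE tblAdmVenueLink (
        venueLinkID TEXT PRIMARY KEY,
        mainVenueID TEXT NOT NULL,
        subVenueID  TEXT NOT NULL
    );
    INSERT INTO tblAdmVenue VALUES
        ('LA43', 'Big Room'), ('LA44', 'Hall'), ('LA45', 'Room Daisy'),
        ('LA46', 'Room Rose'), ('LA47', 'Banquet'), ('LA48', 'Split 1'),
        ('LA49', 'Meeting Room WP');
    INSERT INTO tblAdmVenueLink VALUES
        ('1', 'LA43', 'LA45'), ('2', 'LA43', 'LA46'),
        ('3', 'LA46', 'LA48'), ('4', 'LA44', 'LA49');
""")

query = """
WITH RECURSIVE cte(venueID, venueName, path, depth) AS (
    -- anchor: venues that never appear as a sub-venue
    SELECT v.venueID, v.venueName, v.venueName, 0
    FROM tblAdmVenue v
    WHERE NOT EXISTS (SELECT 1 FROM tblAdmVenueLink l WHERE l.subVenueID = v.venueID)
    UNION ALL
    -- recursive step: append each child's name to its parent's path
    SELECT v.venueID, v.venueName, c.path || '/' || v.venueName, c.depth + 1
    FROM cte c
    JOIN tblAdmVenueLink l ON l.mainVenueID = c.venueID
    JOIN tblAdmVenue v ON v.venueID = l.subVenueID
)
-- one dash per level of depth, then the venue name
SELECT substr('----------', 1, depth) || venueName AS label
FROM cte
ORDER BY path
"""
labels = [row[0] for row in conn.execute(query)]
print("\n".join(labels))
```

Sorting by path keeps every child directly under its parent because a parent's path is a prefix of its children's paths.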
Use only one table, not two. The first table has all the info needed.
Then start the CTE with the rows WHERE mainVenueID IS NULL, no JOIN needed.
This may be a good tutorial: https://stackoverflow.com/a/18660789/1766831
Its 'forest' is close to what you want.
I suppose you have:
table tblAdmVenue A is the venue list; and
table tblAdmVenueLink B is the tree relation table for parent-child
As for your question on how to get a correct sort order, I think one trick is to concatenate the parent venue names.
with recursive q0(venueID, venueName, mainVenueID, venuePath) as (
-- anchor: venues that are not a sub-venue of anything
select
A.venueID,
A.venueName,
cast(null as char(225)),
cast(A.venueName as char(1000))
from tblAdmVenue A
left join tblAdmVenueLink B on A.venueID = B.subVenueID
where B.mainVenueID is null
union all
-- recursive step: append the child name to the parent's path
select
A.venueID,
A.venueName,
q0.venueID,
concat(q0.venuePath, char(9), A.venueName)
from q0
inner join tblAdmVenueLink B on q0.venueID = B.mainVenueID
inner join tblAdmVenue A on A.venueID = B.subVenueID
)
select venueID, venueName, mainVenueID
from q0
order by venuePath
I am having trouble combining data from multiple tables. I have tried joins and subqueries but to no avail. I basically need to combine 2 queries into one. My tables (simplified):
Stock:
id int(9) PrimaryIndex
lot_number int(4)
description text
reserve int(9)
current_bid int(9)
current_bidder int(6)
Members:
member_id int(11) PrimaryIndex
name varchar(255)
Bids:
id int(9)
lot_id int(9)
bidder_id int(5)
max_bid int(9)
time_of_bid datetime
I'm currently using 2 separate queries which with 1000's of lots, makes it very inefficient. 1st query:
SELECT S.id, S.lot_number, S.description, S.reserve FROM stock S ORDER BY
S.lot_number ASC
The 2nd query within a while loop then gets the bidding info:
SELECT DISTINCT B.bidder_id, B.lot_id, B.max_bid, B.time_of_bid,
M.fname, M.lname FROM bids B, members M WHERE B.lot_id=? AND
B.bidder_id=M.member_id ORDER BY B.max_bid DESC LIMIT 2
Below is what I would like as output from a single query, if possible:
Lot No. | Reserve | Current Bid | 1st Max Bid | 1st Bidder | 2nd Max Bid | 2nd Max Bidder
1 | $100 | $120 | $150 | Steve | $110 | John
2 | $500 | $650 | $900 | Tom | $600 | Paul
I have had partial success with just getting the MAX(B.bid) and then its related details (WHERE S.id=B.id), but I can't get the top 2 bids for each lot.
First assign a row number rn to rows within each group of lot_id in table bids (highest bid gets 1, 2nd highest bid gets 2 and so on). The highest bid and second highest bid will be on two different rows after the LEFT JOIN. Use GROUP BY to merge the two rows into one.
select s.lot_number, s.reserve, s.current_bid,
max( case when rn = 1 then b.max_bid end) as first_max_bid,
max( case when rn = 1 then m.name end) as first_bidder,
max( case when rn = 2 then b.max_bid end) as second_max_bid,
max( case when rn = 2 then m.name end ) as second_bidder
from
stock s
left join
(select * from
(select *,
(@rn := if(@lot_id = lot_id, @rn+1,
if( @lot_id := lot_id, 1, 1))) as rn
from bids cross join
(select @rn := 0, @lot_id := -1) param
order by lot_id, max_bid desc
) t
where rn <= 2) b
on s.lot_number = b.lot_id
left join members m
on b.bidder_id = m.member_id
group by s.lot_number, s.reserve, s.current_bid
order by s.lot_number
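On MySQL 8.0 or later, the row numbering can likely be done with ROW_NUMBER() rather than user variables. The following is a sketch under assumptions: Python's built-in sqlite3 module stands in for MySQL (window functions need SQLite 3.25 or later), the sample rows are invented to match the output the question asks for, and the join on lot_number simply follows the query above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (id INTEGER PRIMARY KEY, lot_number INTEGER,
                        reserve INTEGER, current_bid INTEGER);
    CREATE TABLE members (member_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bids (id INTEGER PRIMARY KEY, lot_id INTEGER,
                       bidder_id INTEGER, max_bid INTEGER);
    INSERT INTO stock VALUES (1, 1, 100, 120), (2, 2, 500, 650), (3, 3, 50, 0);
    INSERT INTO members VALUES (1, 'Steve'), (2, 'John'), (3, 'Tom'), (4, 'Paul');
    INSERT INTO bids VALUES (1, 1, 1, 150), (2, 1, 2, 110), (3, 1, 3, 105),
                            (4, 2, 3, 900), (5, 2, 4, 600);
""")

query = """
SELECT s.lot_number, s.reserve, s.current_bid,
       MAX(CASE WHEN b.rn = 1 THEN b.max_bid END) AS first_max_bid,
       MAX(CASE WHEN b.rn = 1 THEN m.name END)    AS first_bidder,
       MAX(CASE WHEN b.rn = 2 THEN b.max_bid END) AS second_max_bid,
       MAX(CASE WHEN b.rn = 2 THEN m.name END)    AS second_bidder
FROM stock s
LEFT JOIN (
    -- rank the bids of each lot, highest first
    SELECT lot_id, bidder_id, max_bid,
           ROW_NUMBER() OVER (PARTITION BY lot_id ORDER BY max_bid DESC) AS rn
    FROM bids
) b ON b.lot_id = s.lot_number AND b.rn <= 2
LEFT JOIN members m ON m.member_id = b.bidder_id
GROUP BY s.lot_number, s.reserve, s.current_bid
ORDER BY s.lot_number
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Lot 3 has no bids in this sample, so the LEFT JOIN keeps it with NULL in all four bid columns; the third, lower bid on lot 1 is dropped by the rn <= 2 condition.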
I have the following database structure, and I am trying to run a single query that will show classrooms and how many students are part of the classroom, and how many rewards a classroom has allocated out, as well as how many points allocated to a single classroom (based on the classroom_id column).
Using the query at the very bottom I am trying to collect the 'totalPoints' that a classroom has assigned - based on counting the points column in the classroom_redeemed_codes table and return this as a single integer.
For some reason the values are incorrect for the totalPoints - I am doing something wrong but not sure what...
-- UPDATE --
Here is the sqlfiddle:-
http://sqlfiddle.com/#!2/a9f45
My Structure:
CREATE TABLE `organisation_classrooms` (
`classroom_id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) NOT NULL,
`active` tinyint(1) NOT NULL,
`organisation_id` int(11) NOT NULL,
`period` int(1) DEFAULT '0',
`classroom_bg` int(2) DEFAULT '3',
`sortby` varchar(6) NOT NULL DEFAULT 'points',
`sound` int(1) DEFAULT '0',
PRIMARY KEY (`classroom_id`)
);
CREATE TABLE organisation_classrooms_myusers (
`classroom_id` int(11) NOT NULL,
`user_id` bigint(11) unsigned NOT NULL
);
CREATE TABLE `classroom_redeemed_codes` (
`redeemed_code_id` int(11) NOT NULL AUTO_INCREMENT,
`myuser_id` bigint(11) unsigned NOT NULL DEFAULT '0',
`ssuser_id` bigint(11) NOT NULL DEFAULT '0',
`classroom_id` int(11) NOT NULL,
`order_product_id` int(11) NOT NULL DEFAULT '0',
`order_product_images_id` int(11) NOT NULL DEFAULT '0',
`date_redeemed` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`points` int(11) NOT NULL,
`type` int(1) NOT NULL DEFAULT '0',
`notified` int(1) NOT NULL DEFAULT '0',
`inactive` tinyint(3) NOT NULL,
PRIMARY KEY (`redeemed_code_id`)
);
SELECT
t.classroom_id,
title,
COALESCE (
COUNT(DISTINCT r.redeemed_code_id),
0
) AS totalRewards,
COALESCE (
COUNT(DISTINCT ocm.user_id),
0
) AS totalStudents,
COALESCE (sum(r.points), 0) AS totalPoints
FROM
`organisation_classrooms` `t`
LEFT OUTER JOIN classroom_redeemed_codes r ON (
r.classroom_id = t.classroom_id
AND r.inactive = 0
AND (
r.date_redeemed >= 1393286400
OR r.date_redeemed = 0
)
)
LEFT OUTER JOIN organisation_classrooms_myusers ocm ON (
ocm.classroom_id = t.classroom_id
)
WHERE
t.organisation_id =37383
GROUP BY title
ORDER BY t.classroom_id ASC
LIMIT 10
-- EDIT --
OOPS! I hate SQL sometimes... I have made a big mistake, I am trying to count the number of STUDENTS in the classroom_redeemed_codes rather than the organisation_classrooms_myuser table. I'm really sorry I should have picked that up sooner?!
classroom_id | totalUniqueStudents
16 1
17 2
46 1
51 1
52 1
There are 7 rows in the classroom_redeemed_codes table but as classroom_id 46 has two rows although with the same myuser_id (this is the student id) this should appear as one unique student.
Does this make sense? Essentially trying to grab the number of unique students in the classroom_redeemed_codes tables based on the myuser_id column.
e.g a classroom id 46 could have 100 rows in the classroom_redeemed_codes tables, but if it is the same myuser_id for each this should show the totalUniqueStudents count as 1 and not 100.
Let me know if this isn't clear....
-- update --
I have the following query, borrowed from a user below, which seems to work... (my head hurts). I'll accept the answer again. Sorry for the confusion - I think I was just overthinking this somewhat.
select crc.classroom_id,
COUNT(DISTINCT crc.myuser_id) AS users,
COUNT( DISTINCT crc.redeemed_code_id ) AS classRewards,
SUM( crc.points ) as classPoints, t.title
from classroom_redeemed_codes crc
JOIN organisation_classrooms t
ON crc.classroom_id = t.classroom_id
AND t.organisation_id = 37383
where crc.inactive = 0
AND ( crc.date_redeemed >= 1393286400
OR crc.date_redeemed = 0 )
group by crc.classroom_id
I started with a pre-query that aggregates the points per class, then used a LEFT JOIN to it. I am getting more rows in the result set than your sample expected, but I don't have MySQL to test/confirm directly. However, here is a SQLFiddle of your query. Summing the points while the join to the users table produces a Cartesian result is probably what duplicates the points. By pre-querying on the redeemed codes alone, you just grab that value, then join to users.
SELECT
t.classroom_id,
title,
COALESCE ( r.classRewards, 0 ) AS totalRewards,
COALESCE ( r.classPoints, 0) AS totalPoints,
COALESCE ( r.uniqStudents, 0 ) as totalUniqRedeemStudents,
COALESCE ( COUNT(DISTINCT ocm.user_id), 0 ) AS totalStudents
FROM
organisation_classrooms t
LEFT JOIN ( select crc.classroom_id,
COUNT( DISTINCT crc.redeemed_code_id ) AS classRewards,
COUNT( DISTINCT crc.myuser_id ) as uniqStudents,
SUM( crc.points ) as classPoints
from classroom_redeemed_codes crc
JOIN organisation_classrooms t
ON crc.classroom_id = t.classroom_id
AND t.organisation_id = 37383
where crc.inactive = 0
AND ( crc.date_redeemed >= 1393286400
OR crc.date_redeemed = 0 )
group by crc.classroom_id ) r
ON t.classroom_id = r.classroom_id
LEFT OUTER JOIN organisation_classrooms_myusers ocm
ON t.classroom_id = ocm.classroom_id
WHERE
t.organisation_id = 37383
GROUP BY
title
ORDER BY
t.classroom_id ASC
LIMIT 10
You need sum(r.points) and a subquery in the left outer join; see below:
SELECT
t.classroom_id,
title,
COALESCE (
COUNT(DISTINCT r.redeemed_code_id),
0
) AS totalRewards,
COALESCE(sum(r.points),0) AS totalPoints
,COALESCE(sum(T1.cnt),0) as totalStudents
FROM
`organisation_classrooms` `t`
left outer join (select classroom_id, count(user_id) cnt
from organisation_classrooms_myusers
group by classroom_id) T1 on (T1.classroom_id=t.classroom_id)
LEFT OUTER JOIN classroom_redeemed_codes r ON (
r.classroom_id = t.classroom_id
AND r.inactive = 0
AND (
r.date_redeemed >= 1393286400
OR r.date_redeemed = 0
)
)
WHERE
t.organisation_id =37383
GROUP BY title
ORDER BY t.classroom_id ASC
LIMIT 10
I simplified your query; there is no need to use COALESCE together with COUNT() because COUNT() never returns NULL. For SUM() I prefer to use IFNULL() because it is shorter and more readable. The results displayed below contain only the data for classroom_id #16, #17 and #46 for easier comparison with the example provided in the question. The actual result sets are bigger and contain all the classroom_ids present in the tables. However, their presence is not needed to understand how and why it works.
SELECT
t.classroom_id,
t.title,
COUNT(DISTINCT r.redeemed_code_id) AS totalRewards,
COUNT(DISTINCT ocm.user_id) AS totalStudents,
IFNULL(SUM(r.points), 0) AS totalPoints
FROM `organisation_classrooms` t
LEFT JOIN `classroom_redeemed_codes` r
ON r.classroom_id = t.classroom_id
AND r.inactive = 0
AND (r.date_redeemed >= 1393286400 OR r.date_redeemed = 0)
LEFT JOIN `organisation_classrooms_myusers` ocm
ON ocm.classroom_id = t.classroom_id
WHERE t.organisation_id = 37383
GROUP BY t.classroom_id
ORDER BY t.classroom_id ASC
Let's try to split it in pieces and put them together after that. First, let's see what users are selected:
Query #1
SELECT
t.classroom_id,
t.title,
ocm.user_id
FROM `organisation_classrooms` t
LEFT JOIN `organisation_classrooms_myusers` ocm
ON ocm.classroom_id = t.classroom_id
WHERE t.organisation_id = 37383
ORDER BY t.classroom_id ASC
I removed the classroom_redeemed_codes table and its fields, removed GROUP BY and replaced the aggregate function COUNT(ocm.user_id) with ocm.user_id to see which users are selected.
The result shows us this part of the query is correct:
classroom_id | title | user_id
-------------+-------+--------
16 | BLUE | 2
16 | BLUE | 1
17 | GREEN | 508835
17 | GREEN | 508826
46 | PINK | NULL
There are 2 users in classroom #16, another 2 in #17 and none in classroom #46.
Putting back the GROUP BY clause will make it return the correct values (2, 2, 0) in the totalStudents column.
Let's check now the relationship with table classroom_redeemed_codes:
Query #2
SELECT
t.classroom_id,
t.title,
r.redeemed_code_id, r.points
FROM `organisation_classrooms` t
LEFT JOIN `classroom_redeemed_codes` r
ON r.classroom_id = t.classroom_id
AND r.inactive = 0
AND (r.date_redeemed >= 1393286400 OR r.date_redeemed = 0)
WHERE t.organisation_id = 37383
ORDER BY t.classroom_id ASC
The result is:
classroom_id | title | redeemed_code_id | points
-------------+-------+------------------+-------
16 | BLUE | 7 | 50
17 | GREEN | 8 | 25
17 | GREEN | 9 | 75
46 | PINK | 5 | 250
46 | PINK | 6 | 100
Again, grouping by classroom_id will produce (1, 2, 2) in column totalRewards and (50, 100, 350) in column totalPoints which is correct.
The trouble starts when you want to combine these into a single query. No matter what kind of join you use, for the provided input you will get (2*1, 2*2, 1*2) rows for classroom_id having the values 16, 17 and 46 (in this order). The values multiplied in parentheses are the numbers of rows for each classroom_id in the first and second result sets above.
Combined
Let's try the query that selects the rows before grouping them:
SELECT
t.classroom_id,
t.title,
r.redeemed_code_id, ocm.user_id, r.points
FROM `organisation_classrooms` t
LEFT JOIN `classroom_redeemed_codes` r
ON r.classroom_id = t.classroom_id
AND r.inactive = 0
AND (r.date_redeemed >= 1393286400 OR r.date_redeemed = 0)
LEFT JOIN `organisation_classrooms_myusers` ocm
ON ocm.classroom_id = t.classroom_id
WHERE t.organisation_id = 37383
ORDER BY t.classroom_id ASC
It returns this result set:
classroom_id | title | redeemed_code_id | user_id | points
-------------+-------+------------------+---------+-------
16 | BLUE | 7 | 2 | 50
16 | BLUE | 7 | 1 | 50 <- *
-------------+-------+------------------+---------+-------
17 | GREEN | 8 | 508835 | 25
17 | GREEN | 8 | 508826 | 25 <- *
17 | GREEN | 9 | 508835 | 75
17 | GREEN | 9 | 508826 | 75 <- *
-------------+-------+------------------+---------+-------
46 | PINK | 5 | NULL | 250
46 | PINK | 6 | NULL | 100
I added horizontal rules to separate the rows that belong to the same group when we add the GROUP BY clause. This is basically the way a SQL query with GROUP BY is executed, no matter the name of the actual software that implements it.
As you can see, for each classroom, it combines all the redeemed codes associated with the classroom with all the users associated with the classroom. If you add more users and redeemed codes for classrooms #16, #17 and #46 in your tables you will get a much larger result set.
The next step on the execution of a GROUP BY query is to produce a single row from each group you see above. There is no problem with columns classroom_id and title, they contain a single value in each group. For the columns redeemed_code_id and user_id your query counts distinct values and that works fine too. The problem is with the addition of points.
If you just SUM() them, you get a redeemed code added for each user_id in the group. If you use SUM(DISTINCT points) it is also wrong because it will ignore the duplicates even when they are different entries in table classroom_redeemed_codes.
What you want is to add points for DISTINCT redeemed_code_id. I marked on the above result set the rows you don't want.
This is not possible using this query because on calculation of the aggregate values each column is independent of the other. We need a query that selects the desired rows before grouping them.
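The duplication described above can be reproduced on a small scale. The sketch below is illustrative only and makes several assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the hypothetical short table names classrooms, users and codes replace the real ones, and the filters on inactive and date_redeemed are left out. It compares the SUM() over the fanned-out join with a SUM() computed in a derived table before joining, which is one way to select the desired rows before grouping.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table names are shortened stand-ins.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE classrooms (classroom_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE users (classroom_id INTEGER, user_id INTEGER);
    CREATE TABLE codes (redeemed_code_id INTEGER PRIMARY KEY,
                        classroom_id INTEGER, points INTEGER);
    INSERT INTO classrooms VALUES (16, 'BLUE'), (17, 'GREEN'), (46, 'PINK');
    INSERT INTO users VALUES (16, 1), (16, 2), (17, 508826), (17, 508835);
    INSERT INTO codes VALUES (5, 46, 250), (6, 46, 100), (7, 16, 50),
                             (8, 17, 25), (9, 17, 75);
""")

# Joining both child tables first repeats every code once per user.
naive = """
SELECT t.classroom_id, IFNULL(SUM(r.points), 0)
FROM classrooms t
LEFT JOIN codes r ON r.classroom_id = t.classroom_id
LEFT JOIN users ocm ON ocm.classroom_id = t.classroom_id
GROUP BY t.classroom_id
ORDER BY t.classroom_id
"""

# Summing inside a derived table keeps one row per classroom before the join.
preaggregated = """
SELECT t.classroom_id, IFNULL(r.total_points, 0)
FROM classrooms t
LEFT JOIN (SELECT classroom_id, SUM(points) AS total_points
           FROM codes GROUP BY classroom_id) r
       ON r.classroom_id = t.classroom_id
ORDER BY t.classroom_id
"""
inflated = conn.execute(naive).fetchall()
correct = conn.execute(preaggregated).fetchall()
print(inflated)
print(correct)
```

Classrooms 16 and 17 each have two users, so their points are counted twice in the first query; classroom 46 has no users and is unaffected.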
An Idea
We can try to add the missing columns (with NULL values) to the two simple queries above, UNION ALL them then select from this and GROUP BY.
First, let's be sure it selects what we need:
SELECT
t.classroom_id,
t.title,
NULL AS redeemed_code_id, ocm.user_id, NULL AS points
FROM `organisation_classrooms` t
LEFT JOIN `organisation_classrooms_myusers` ocm
ON ocm.classroom_id = t.classroom_id
WHERE t.organisation_id = 37383
UNION ALL
SELECT
t.classroom_id,
t.title,
r.redeemed_code_id, NULL AS user_id, r.points
FROM `organisation_classrooms` t
LEFT JOIN `classroom_redeemed_codes` r
ON r.classroom_id = t.classroom_id
AND r.inactive = 0
AND (r.date_redeemed >= 1393286400 OR r.date_redeemed = 0)
WHERE t.organisation_id = 37383
ORDER BY classroom_id
Attention! The ORDER BY clause applies to the UNIONed result set. If you want to order the rows of each SELECT (it doesn't help because UNION doesn't keep the order) you need to enclose that query in parentheses and put the ORDER BY clauses there.
The result set looks great:
classroom_id | title | redeemed_code_id | user_id | points
-------------+-------+------------------+---------+-------
16 | BLUE | NULL | 1 | NULL
16 | BLUE | NULL | 2 | NULL
16 | BLUE | 7 | NULL | 50
-------------+-------+------------------+---------+-------
17 | GREEN | 8 | NULL | 25
17 | GREEN | 9 | NULL | 75
17 | GREEN | NULL | 508826 | NULL
17 | GREEN | NULL | 508835 | NULL
-------------+-------+------------------+---------+-------
46 | PINK | 5 | NULL | 250
46 | PINK | 6 | NULL | 100
46 | PINK | NULL | NULL | NULL
Now we could put some parentheses around the query above (strip ORDER BY) and use it in another query, grouping the data by classroom_id, counting the users and the redeemed codes and summing their points.
You will get a query that looks awful and, on your current database schema, crawls when your tables have several hundred rows. This is why I will not write it here.
Attention!
Its performance can be improved by adding the missing indexes to your tables, on the fields that appear in the ON, WHERE, ORDER BY and GROUP BY clauses of the query.
It will bring a significant improvement but I won't rely very much on that. For really big tables (hundreds of thousands of rows) it will still crawl.
Another Idea
We can also add GROUP BY on both Query #1 and Query #2 first and UNION ALL them after that:
SELECT
t.classroom_id,
t.title,
NULL AS totalRewards,
COUNT(DISTINCT ocm.user_id) AS totalStudents,
NULL AS totalPoints
FROM `organisation_classrooms` t
LEFT JOIN `organisation_classrooms_myusers` ocm
ON ocm.classroom_id = t.classroom_id
WHERE t.organisation_id = 37383
GROUP BY t.classroom_id
UNION ALL
SELECT
t.classroom_id,
t.title,
COUNT(DISTINCT redeemed_code_id) AS totalRewards,
NULL AS totalStudents,
SUM(points) AS totalPoints
FROM `organisation_classrooms` t
LEFT JOIN `classroom_redeemed_codes` r
ON r.classroom_id = t.classroom_id
AND r.inactive = 0
AND (r.date_redeemed >= 1393286400 OR r.date_redeemed = 0)
WHERE t.organisation_id = 37383
GROUP BY t.classroom_id
ORDER BY classroom_id, totalRewards
This produces a nice result set:
classroom_id | title | totalRewards | totalStudents | totalPoints
-------------+-------+--------------+---------------+-------------
16 | BLUE | NULL | 2 | NULL
16 | BLUE | 1 | NULL | 50
17 | GREEN | NULL | 2 | NULL
17 | GREEN | 2 | NULL | 100
46 | PINK | NULL | 0 | NULL
46 | PINK | 2 | NULL | 350
This query can be embedded in another query that groups by classroom_id and SUM()s the total columns above to get the final result. But again, the final query is big and ugly and it doesn't run very fast for large tables. And again, this is the reason I don't write it here.
Conclusion
It can be done in a single query but it doesn't look good and it doesn't work well on large tables.
Regarding the performance, put EXPLAIN in front of your query then check the values in columns type, key and Extra of the result. See the documentation for explanation of the possible values of these columns, what to try to achieve and what to avoid.
The queries I created for both ideas produce joins of type range or ALL and show Using filesort in column Extra (all of these are slow). Using them as sub-queries in bigger queries will not improve the way they are executed; on the contrary.
I recommend running the individual SELECT queries from the last code example as two separate queries; they will return the odd and the even rows from the above result set. Then combine their results in the client code. It will run faster this way.
I have started learning MySQL and I'm having a problem with JOIN.
I have two tables: purchase and sales
purchase
--------------
p_id date p_cost p_quantity
---------------------------------------
1 2014-03-21 100 5
2 2014-03-21 20 2
sales
--------------
s_id date s_cost s_quantity
---------------------------------------
1 2014-03-21 90 9
2 2014-03-22 20 2
I want these two tables to be joined where purchase.date=sales.date to get one of the following results:
Option 1:
p_id date p_cost p_quantity s_id date s_cost s_quantity
------------------------------------------------------------------------------
1 2014-03-21 100 5 1 2014-03-21 90 9
2 2014-03-21 20 2 NULL NULL NULL NULL
NULL NULL NULL NULL 2 2014-03-22 20 2
Option 2:
p_id date p_cost p_quantity s_id date s_cost s_quantity
------------------------------------------------------------------------------
1 2014-03-21 100 5 NULL NULL NULL NULL
2 2014-03-21 20 2 1 2014-03-21 90 9
NULL NULL NULL NULL 2 2014-03-22 20 2
The main problem lies in the 2nd row of the first result. I don't want the values 2014-03-21, 90, 9 again in row 2... I want NULL instead.
I don't know whether it is possible to do this. I would be grateful if anyone could help me out.
I tried using left join
SELECT *
FROM sales
LEFT JOIN purchase ON sales.date = purchase.date
output:
s_id date s_cost s_quantity p_id date p_cost p_quantity
1 2014-03-21 90 9 1 2014-03-21 100 5
1 2014-03-21 90 9 2 2014-03-21 20 2
2 2014-03-22 20 2 NULL NULL NULL NULL
but I want the first 4 values of the 2nd row to be NULL
Since there are no common table expressions or full outer joins to work with, the query will have some duplication and needs to use a left join unioned with a right join instead:
SELECT p_id, p.date p_date, p_cost, p_quantity,
s_id, s.date s_date, s_cost, s_quantity
FROM (
SELECT *,(SELECT COUNT(*) FROM purchase p1
WHERE p1.date=p.date AND p1.p_id<p.p_id) rn FROM purchase p
) p LEFT JOIN (
SELECT *,(SELECT COUNT(*) FROM sales s1
WHERE s1.date=s.date AND s1.s_id<s.s_id) rn FROM sales s
) s
ON s.date=p.date AND s.rn=p.rn
UNION
SELECT p_id, p.date p_date, p_cost, p_quantity,
s_id, s.date s_date, s_cost, s_quantity
FROM (
SELECT *,(SELECT COUNT(*) FROM purchase p1
WHERE p1.date=p.date AND p1.p_id<p.p_id) rn FROM purchase p
) p RIGHT JOIN (
SELECT *,(SELECT COUNT(*) FROM sales s1
WHERE s1.date=s.date AND s1.s_id<s.s_id) rn FROM sales s
) s
ON s.date=p.date AND s.rn=p.rn
An SQLfiddle to test with.
In a general sense, what you're looking for is called a FULL OUTER JOIN, which is not directly available in MySQL. Instead you only get LEFT JOIN and RIGHT JOIN, which you can UNION together to get essentially the same result. For a very thorough discussion on this subject, see Full Outer Join in MySQL.
If you need help understanding the different ways to JOIN a table, I recommend A Visual Explanation of SQL Joins.
The way this is different from a regular FULL OUTER JOIN is that you're only including any particular row from either table at most once in the JOIN result. The problem being, if you have one purchase record and two sales records on a particular day, which sales record is the purchase record associated with? What is the relationship you're trying to represent between these two tables?
It doesn't sound like there's any particular relationship between purchase and sales records, except that some of them happened to take place on the same day. In which case, you're using the wrong tool for the job. If all you want to do is display these tables side by side and line the rows up by date, you don't need a JOIN at all. Instead, you should SELECT each table separately and do your formatting with some other tool (or manually).
Here's another way to get the same result, but the EXPLAIN for this is horrendous; and performance with large sets is going to be atrocious.
This is essentially two queries UNIONed together. The first query is essentially "purchase LEFT JOIN sales", the second query is essentially "sales ANTI JOIN purchase".
Because there is no foreign key relationship between the two tables, other than rows matching on date, we have to "invent" a key we can join on; we use user variables to assign ascending integer values to each row within a given date, so we can match row 1 from purchase to row 1 from sales, etc.
I wouldn't normally generate this type of result using SQL; it's not a typical JOIN operation, in the sense of how we traditionally join tables.
But, if I had to produce the specified resultset using MySQL, I would do it like this:
SELECT p.p_id
     , p.p_date
     , p.p_cost
     , p.p_quantity
     , s.s_id
     , s.s_date
     , s.s_cost
     , s.s_quantity
  FROM ( SELECT @pl_i := IF(pl.date = @pl_prev_date,@pl_i+1,1) AS i
              , @pl_prev_date := pl.date AS p_date
              , pl.p_id
              , pl.p_cost
              , pl.p_quantity
           FROM purchase pl
           JOIN ( SELECT @pl_i := 0, @pl_prev_date := NULL ) pld
          ORDER BY pl.date, pl.p_id
       ) p
  LEFT
  JOIN ( SELECT @sr_i := IF(sr.date = @sr_prev_date,@sr_i+1,1) AS i
              , @sr_prev_date := sr.date AS s_date
              , sr.s_id
              , sr.s_cost
              , sr.s_quantity
           FROM sales sr
           JOIN ( SELECT @sr_i := 0, @sr_prev_date := NULL ) srd
          ORDER BY sr.date, sr.s_id
       ) s
    ON s.s_date = p.p_date
   AND s.i = p.i
 UNION ALL
SELECT p.p_id
     , p.p_date
     , p.p_cost
     , p.p_quantity
     , s.s_id
     , s.s_date
     , s.s_cost
     , s.s_quantity
  FROM ( SELECT @sl_i := IF(sl.date = @sl_prev_date,@sl_i+1,1) AS i
              , @sl_prev_date := sl.date AS s_date
              , sl.s_id
              , sl.s_cost
              , sl.s_quantity
           FROM sales sl
           JOIN ( SELECT @sl_i := 0, @sl_prev_date := NULL ) sld
          ORDER BY sl.date, sl.s_id
       ) s
  LEFT
  JOIN ( SELECT @pr_i := IF(pr.date = @pr_prev_date,@pr_i+1,1) AS i
              , @pr_prev_date := pr.date AS p_date
              , pr.p_id
              , pr.p_cost
              , pr.p_quantity
           FROM purchase pr
           JOIN ( SELECT @pr_i := 0, @pr_prev_date := NULL ) prd
          ORDER BY pr.date, pr.p_id
       ) p
    ON p.p_date = s.s_date
   AND p.i = s.i
 WHERE p.p_date IS NULL
 ORDER BY COALESCE(p_date,s_date),COALESCE(p_id,s_id)
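To make the "invented key" in the query above easier to follow, here is a small Python sketch of what each inline view computes: a counter that restarts at 1 whenever the date changes. The function name number_within_date and the sample rows are assumptions for illustration only; rows are modelled as (id, date) tuples.

```python
def number_within_date(rows):
    """Mimic the IF(date = prev_date, i + 1, 1) counter over rows
    sorted by (date, id), as each inline view does."""
    numbered = []
    prev_date, i = None, 0
    for row_id, date in sorted(rows, key=lambda r: (r[1], r[0])):
        # Same date as the previous row: keep counting; otherwise restart at 1.
        i = i + 1 if date == prev_date else 1
        prev_date = date
        numbered.append((date, i, row_id))
    return numbered


# Assumption: made-up purchase rows, deliberately out of order.
purchase_keys = number_within_date(
    [(2, "2012-01-01"), (1, "2012-01-01"), (3, "2012-01-02")]
)
print(purchase_keys)
```

The (date, i) pair is what the two sides are then joined on, so row 1 of a date in purchase lines up with row 1 of the same date in sales.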
Consider the following tables in a MySQL database:
entries:
creator_id INT
entry TEXT
is_expired BOOL
other:
creator_id INT
entry TEXT
userdata:
creator_id INT
name VARCHAR
etc...
In entries and other, there can be multiple rows per creator. The userdata table is read-only for me (it lives in another database).
I'd like to achieve the following SELECT result:
+------------+---------+---------+-------+
| creator_id | entries | expired | other |
+------------+---------+---------+-------+
| 10951 | 59 | 55 | 39 |
| 70887 | 41 | 34 | 108 |
| 88309 | 38 | 20 | 102 |
| 94732 | 0 | 0 | 86 |
... where entries is equal to SELECT COUNT(entry) FROM entries GROUP BY creator_id,
expired is equal to SELECT COUNT(entry) FROM entries WHERE is_expired = 0 GROUP BY creator_id and
other is equal to SELECT COUNT(entry) FROM other GROUP BY creator_id.
I need this structure because after doing this SELECT, I need to look for user data in the "userdata" table, which I planned to do with INNER JOIN and select desired columns.
I solved this problem by selecting NULL into each column that does not apply to the given SELECT:
SELECT
creator_id,
COUNT(any_entry) as entries,
COUNT(expired_entry) as expired,
COUNT(other_entry) as other
FROM (
SELECT
creator_id,
entry AS any_entry,
NULL AS expired_entry,
NULL AS other_entry
FROM entries
UNION
SELECT
creator_id,
NULL AS any_entry,
entry AS expired_entry,
NULL AS other_entry
FROM entries
WHERE is_expired = 1
UNION
SELECT
creator_id,
NULL AS any_entry,
NULL AS expired_entry,
entry AS other_entry
FROM other
) AS tTemp
GROUP BY creator_id
ORDER BY
entries DESC,
expired DESC,
other DESC
;
I've left out the INNER JOIN and selecting other columns from userdata table on purpose (my question being about combining 3 SELECTs into 1).
Is my idea valid? = Am I trying to use the right "construction" for this?
Is this kind of SELECT possible without creating an "empty" column (with some kind of JOIN)?
Should I do it "outside the DB": make 3 SELECTs, make some order in it (let's say python lists/dicts) and then do the additional SELECTs for userdata?
Solution for a similar question does not return rows where entries and expired are 0.
Thank you for your time.
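For reference, the "outside the DB" option from the third question could be sketched roughly as follows. It assumes each of the three COUNT queries has already been loaded into a {creator_id: count} dict; the function name merge_counts is hypothetical, and the numbers are the ones from the desired result above.

```python
def merge_counts(entries, expired, other):
    """Combine three {creator_id: count} dicts into rows, defaulting to 0."""
    creator_ids = set(entries) | set(expired) | set(other)
    rows = [
        (cid, entries.get(cid, 0), expired.get(cid, 0), other.get(cid, 0))
        for cid in creator_ids
    ]
    # Same ordering as the query: entries DESC, expired DESC, other DESC.
    rows.sort(key=lambda r: (-r[1], -r[2], -r[3]))
    return rows


merged = merge_counts(
    entries={10951: 59, 70887: 41, 88309: 38},
    expired={10951: 55, 70887: 34, 88309: 20},
    other={10951: 39, 70887: 108, 88309: 102, 94732: 86},
)
for row in merged:
    print(row)
```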
This should work (assuming all creator_ids appear in the userdata table):
SELECT userdata.creator_id,
       COALESCE(entries_count_,0) AS entries_count,
       COALESCE(expired_count_,0) AS expired_count,
       COALESCE(other_count_,0) AS other_count
FROM userdata
LEFT OUTER JOIN
(SELECT creator_id, COUNT(entry) AS entries_count_
FROM entries
GROUP BY creator_id) AS entries_q
ON userdata.creator_id=entries_q.creator_id
LEFT OUTER JOIN
(SELECT creator_id, COUNT(entry) AS expired_count_
FROM entries
WHERE is_expired=0
GROUP BY creator_id) AS expired_q
ON userdata.creator_id=expired_q.creator_id
LEFT OUTER JOIN
(SELECT creator_id, COUNT(entry) AS other_count_
FROM other
GROUP BY creator_id) AS other_q
ON userdata.creator_id=other_q.creator_id;
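The reason for counting inside each subquery before joining is that joining the raw tables first multiplies rows, which inflates the counts. The sketch below shows the difference on an in-memory SQLite database driven from Python; SQLite stands in for MySQL, and the sample rows and the short aliases u, e, o, n are assumptions for the demo.

```python
import sqlite3

# Assumption: tiny made-up data set; creator 1 has 2 entries and 3 "other" rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE userdata (creator_id INTEGER);
    CREATE TABLE entries  (creator_id INTEGER, entry TEXT, is_expired INTEGER);
    CREATE TABLE other    (creator_id INTEGER, entry TEXT);
    INSERT INTO userdata VALUES (1), (2);
    INSERT INTO entries  VALUES (1, 'a', 0), (1, 'b', 1);
    INSERT INTO other    VALUES (1, 'x'), (1, 'y'), (1, 'z'), (2, 'w');
""")

# Joining the raw tables first: 2 x 3 = 6 joined rows for creator 1.
naive = conn.execute("""
    SELECT u.creator_id, COUNT(e.entry), COUNT(o.entry)
      FROM userdata u
      LEFT JOIN entries e ON e.creator_id = u.creator_id
      LEFT JOIN other   o ON o.creator_id = u.creator_id
     GROUP BY u.creator_id
     ORDER BY u.creator_id
""").fetchall()

# Aggregating first: each subquery yields at most one row per creator.
pre_aggregated = conn.execute("""
    SELECT u.creator_id, COALESCE(e.n, 0), COALESCE(o.n, 0)
      FROM userdata u
      LEFT JOIN (SELECT creator_id, COUNT(entry) AS n
                   FROM entries GROUP BY creator_id) e
             ON e.creator_id = u.creator_id
      LEFT JOIN (SELECT creator_id, COUNT(entry) AS n
                   FROM other GROUP BY creator_id) o
             ON o.creator_id = u.creator_id
     ORDER BY u.creator_id
""").fetchall()

print("naive:         ", naive)
print("pre-aggregated:", pre_aggregated)
```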
Basically, what you are doing looks correct to me.
I would rewrite it as follows, though:
SELECT entries.creator_id
, any_entry
, expired_entry
, other_entry
FROM (
SELECT creator_id, COUNT(entry) AS any_entry
FROM entries
GROUP BY creator_id
) entries
LEFT OUTER JOIN (
SELECT creator_id, COUNT(entry) AS expired_entry
FROM entries
WHERE is_expired = 1
GROUP BY creator_id
) expired ON expired.creator_id = entries.creator_id
LEFT OUTER JOIN (
SELECT creator_id, COUNT(entry) AS other_entry
FROM other
GROUP BY creator_id
) other ON other.creator_id = entries.creator_id
How about:
SELECT creator_id,
(SELECT COUNT(*)
FROM entries e
WHERE e.creator_id = main.creator_id AND
e.is_expired = 0) AS entries,
(SELECT COUNT(*)
FROM entries e
WHERE e.creator_id = main.creator_id AND
e.is_expired = 1) as expired,
(SELECT COUNT(*)
FROM other
WHERE other.creator_id = main.creator_id) AS other
FROM entries main
GROUP BY main.creator_id;
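As a possible variation, the two counts that come from entries can be folded into one grouped pass with conditional aggregation instead of two correlated subqueries. The sketch below tries that on an in-memory SQLite database from Python; SQLite stands in for MySQL (both evaluate a comparison such as is_expired = 1 to 0 or 1), and the sample rows and the n_entries / n_expired aliases are assumptions. It follows the question's definition, where entries counts every row, and it does not cover the other table.

```python
import sqlite3

# Assumption: made-up rows; creator 1 has 3 entries (2 expired), creator 2 has 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (creator_id INTEGER, entry TEXT, is_expired INTEGER);
    INSERT INTO entries VALUES
        (1, 'a', 0), (1, 'b', 1), (1, 'c', 1),
        (2, 'd', 0);
""")

# SUM over a 0/1 comparison counts only the rows where the condition holds.
counts = conn.execute("""
    SELECT creator_id,
           COUNT(entry)        AS n_entries,
           SUM(is_expired = 1) AS n_expired
      FROM entries
     GROUP BY creator_id
     ORDER BY creator_id
""").fetchall()

print(counts)
```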